The value of humans in generative AI and priorities in data

By Ariane Bernard

INMA

New York City, Paris


Hi everyone.

I’m going to guess that by the time this e-mail finds you, you have heard that everyone’s favourite large language model, OpenAI’s GPT-3, has now been joined by a more advanced version, GPT-4. “Bigger, Better, Faster, More!” (if you know the reference, hello fellow geriatric Millennial). GPT-4 is already being put through its paces, so see my selection of tasty links in the “FAWWW” section of this newsletter.

But in non-GPT-4 news, this week we’re looking at two different things: The Guardian’s approach to using value created to make prioritisation decisions for data, and Schibsted’s foray into AI-generated summaries and how human-in-the-loop emerged as the preferred way to implement the technology.

And as always, my mailbox is open: ariane.bernard@inma.org. See you in a couple of weeks.

Ariane

Testing “human-in-the-loop” approaches to GPT-3 integration at Schibsted

The good folks at Schibsted’s Futures Lab explored opportunities offered by GPT-3 in the area of summaries, sharing their recent experiment at an online panel hosted by the Associated Press earlier this month.

Yifan Hu, a UX designer at Schibsted in Norway, explained how the team trained the AI with over 1,000 articles in Norwegian and 2,500 articles in Swedish, along with their human-written summaries, from two Schibsted brands.

Once GPT-3 had been fed this training data, the team at Schibsted looked at the quality of what it produced for new articles. For Swedish, the team found, the tool could produce summaries that were “comparable to human-written ones,” Yifan said. This really means the language and “tone of voice” matched the two Schibsted publishers. But, she noted, only “90% of the summaries are factually correct.”

Since 90% correct really means 10% incorrect — and therefore not quite ready for prime time — the team brought this work closer to journalists, integrating it into daily content creation tools. This is the “human-in-the-loop” approach, where a robot provides a service but doesn’t get to directly publish its own work without, if not human intervention, at least some level of human supervision.
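The workflow described here can be sketched as a simple review gate: the model only produces drafts, and nothing is published without an explicit human decision. The sketch below is a hypothetical illustration, not Schibsted’s actual tooling — the `summarise` stub stands in for a real model call, and all names are invented:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Draft:
    article: str
    summary: str
    approved: bool = False

def summarise(article: str) -> str:
    # Placeholder for the model call (e.g. a summarisation
    # prompt sent to GPT-3). Here we just truncate the text.
    return article[:60].strip() + "..."

@dataclass
class ReviewQueue:
    """Human-in-the-loop gate: the AI drafts, a person decides."""
    pending: List[Draft] = field(default_factory=list)
    published: List[Draft] = field(default_factory=list)

    def draft(self, article: str) -> Draft:
        # The AI writes a draft summary; it goes into the queue,
        # not straight to publication.
        d = Draft(article=article, summary=summarise(article))
        self.pending.append(d)
        return d

    def review(self, draft: Draft, approve: bool,
               edited_summary: Optional[str] = None) -> None:
        self.pending.remove(draft)
        if not approve:
            return  # rejected drafts never reach publication
        if edited_summary is not None:
            draft.summary = edited_summary  # human correction wins
        draft.approved = True
        self.published.append(draft)
```

The point of the structure is that a 10% factual-error rate is tolerable, because the erroneous 10% is caught (or corrected) at the `review` step rather than shipped.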

Schibsted’s Swedish organisations used the AI to provide pre-prepped summaries of wire articles for use in live coverage.

Schibsted's Swedish newsrooms are using GPT-3 to pre-prep wire article summaries that are then looked at by a human before going live.

Meanwhile, VG of Norway chose to integrate the AI into the tool supporting its Snapchat production, providing summaries of the body text “to help speed up the process.”

Schibsted's VG is using GPT-3 to write summaries of articles published on Snapchat.

The team at Futures Lab proceeded to conduct a survey of editors using the tool to get their sentiment on the test: “Our biggest concern was that they would be resistant to generative AI entering the workplace because they perceive it as a threat to their roles in the organisation,” Yifan said. “However, after testing out these demos — with them seeing the results and also explaining a little bit about the underlying technology — they started seeing the real use case, and they became more open, and many were eager to adopt these AI tools.” 

Human-in-the-loop means that a lower degree of perfection is required on the AI. 

“While we were testing, we caught some factual errors caused by AI hallucination, and some of the language of course can be improved,” Yifan said. “But this whole process helped all of us to understand the importance of having a human in the loop, considering where the technology is at today. AI is not a replacement. However, our newsroom can leverage it in different ways for efficiency gains.”

Applying product thinking to data projects

In modern product management practices, you go out and investigate your users’ pain points — try to understand the specific circumstances behind them and imagine what remedies could solve them.

Some aspects of the process also involve trying to size the problem (in terms of how many users have this problem, but also in terms of how bad the problem itself is for these users). And this becomes the size of the opportunity — the size of the opportunity being the flip side of the size of the problem.

In data, we often approach projects assuming that, because data is a utility rather than the end goal, this product-management work really happens outside our own department.

If a product team tells you, “Hey, I need data to learn X or to better understand Y” — the issue for the data team becomes how to answer the question. And if the data team has several stakeholders with requests for new work, it has to figure out some way to prioritise this work.

This has long been an issue in our publishing space because the type of work the data team may be asked to do is often quite diverse. There’s CRM-related work for the marketing team. There’s product analytics work for the product team. There’s audience analytics for the newsroom. There’s funnel and conversion analytics for the revenue team.

You get my drift. 

Now, when a team has several very different internal clients like the above, it’s quite complicated to prioritise one type of work over the other. Who says the audience analytics work should be more highly prioritised than the marketing work? Every one of these stakeholders will have deadlines and real reasons for why they want what they want. But unless you have unlimited resources in your team, you most likely can’t do everything all at once.

I was chatting with Mehul Shah, the interim chief data officer of The Guardian, who said he considered one of the most important jobs of the data team is “making sure that whatever insight we are creating is reaching the maximum number of people across the organisation […] which is important for us to create value, because data is not directly the kind of initiative that generates value.”

So this idea — to try and make sure whatever data does get created is used in as many ways as possible — is analogous to the opportunity-sizing exercise of mapping your proposed solution to all the personas and problems your new product could address. While the product may have been created to solve a core problem or intended for a specific core audience, part of the opportunity may actually lie elsewhere — in more tangential markets, solving more tangential problems.

“We have to make sure that we start capturing where data is going to add value. So we started using an approach where we talk to people who have an interest in certain new data or in using it in a new way, and we try to understand: What are they doing today? What will it do for them if we provide XYZ data insight, and what value will it bring to them?” Mehul said.

To be clear, this is hard for the data team because it means collecting information we don’t always collect: formulating and sizing the impact of data in a given application.

In fact, as a utility team, data is often infantilised into not asking these types of impact questions precisely because it, as a team, does not have specific revenue goals to pursue. While it may feel like a relief not to peg every single goal or decision to a specific dollar number, pursuing value is a method that serves many parts of a company: Product teams decide what to build after sizing an opportunity; marketing teams decide where to focus campaign dollars based on expected returns; sales teams decide on pricing tactics based on volume projections, and therefore on revenue projections …

Now, I’m not suggesting the data team should have to assign dollars and cents to every goal. It would be pretty difficult to do this in a number of cases, especially when it comes to building out core capabilities. But with the method Mehul is pursuing, the team at The Guardian is looking at value created — not so much because the team has to report on it, but because it helps them make prioritisation decisions and can also help support funding or investment requests.
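One way to turn that value lens into an actual prioritisation decision is a simple weighted score per request, in the spirit of the RICE scoring model from product management. The weights, field names, and sample requests below are illustrative assumptions on my part, not The Guardian’s actual method:

```python
from dataclasses import dataclass

@dataclass
class DataRequest:
    name: str
    reach: int         # how many people across the organisation use the output
    impact: float      # estimated value per user (e.g. 1 = low, 3 = high)
    confidence: float  # 0..1, how sure we are of the estimates
    effort: float      # person-weeks of data-team work

    @property
    def score(self) -> float:
        # RICE-style score: expected value delivered per unit of effort.
        return (self.reach * self.impact * self.confidence) / self.effort

# Hypothetical stakeholder requests competing for the data team's time.
requests = [
    DataRequest("CRM segments for marketing", reach=12, impact=2.0, confidence=0.8, effort=4),
    DataRequest("Audience analytics for newsroom", reach=80, impact=1.5, confidence=0.7, effort=6),
    DataRequest("Conversion funnel for revenue", reach=6, impact=3.0, confidence=0.9, effort=3),
]

for r in sorted(requests, key=lambda r: r.score, reverse=True):
    print(f"{r.score:6.2f}  {r.name}")
```

With these made-up numbers, the newsroom request wins not because it is intrinsically more important, but because its insight reaches far more people — which is exactly Mehul’s point about maximising how widely a given insight is used.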

“Every time now I have to go out and ask for investment, I can actually go out and say that, by the way, by investing in data initiatives like this, you should get value out of it — whether it’s time efficiency, whether it’s cost efficiency, whether it’s revenue generation,” he said.

Further afield on the wide, wide Web 

  • OpenAI’s GPT-4 came out this week (originally whispered to come out in February, it came out in March — pretty impressive!) with an undisclosed number of parameters, though rumoured to be near a trillion, about five times more than GPT-3. There is no lack of articles taking GPT-4 for a spin, but The New York Times’ article (gift link) is pretty thorough. GPT-4 shows some maturity relative to some of the oddities observed in earlier versions.
  • An interview with Meredith Broussard, data scientist and professor at NYU, in the MIT Tech Review. Her take may seem counter-intuitive: that AI is being applied in too many cases, and too indiscriminately, to be useful. She also shares various examples (in medicine, policing) where the effects of using AI are negative, with long-lasting consequences. In our era, many headlines drum up excitement and futuristic takes for AI. But we have to remember that a lot of that excitement comes from companies that have a stake in attracting investment and/or direct business to themselves. That’s not to say there aren’t many reasons to dig further into AI and look at how to apply it to interesting problems. But this leaves little room for the more sober takes of folks who don’t have anything to sell, and whose perspective may be less indiscriminately enthusiastic.
  • Ilya Sutskever, a co-founder of OpenAI and its chief scientist, spoke about GPT-4 and large language models in a long interview with Eye on AI. I really enjoyed this one because it’s accessible to non-data-scientists and actually contextualises some of the overall approaches to these large language models — in terms of their historical progression and where someone like Sutskever thinks further improvements may lie.

About this newsletter

Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company, Helio.cloud.

This newsletter is part of the INMA Smart Data Initiative. You can e-mail me at Ariane.Bernard@inma.org with thoughts, suggestions, and questions. Also, sign up to our Slack channel.

About Ariane Bernard
