Gen AI might not be as big for news companies as some think

By Michelle Palmer Jones

INMA

Nashville, Tennessee, United States


Saving time means saving money for news publishers, and large language models (LLMs) are a great way for publishers to save time.

Anyone who has used LLMs to complete repetitive tasks like transcribing interviews or writing headlines has certainly seen the benefits of the technology. At a time when growth in the news business as a whole has plateaued, it's encouraging to see technology that helps with day-to-day operations. Despite all the excitement, Ritvvij Parrikh, senior director of product at Times Internet, wants media companies to consider all sides of Gen AI, including some of its shortcomings.

Speaking at the recent Generative AI Town Hall, presented by the INMA Generative AI Initiative, Parrikh said he believes Generative AI is a great piece of technology but falls short from a business and revenue perspective for media companies.

Parrikh told attendees he does not see LLMs as likely to kickstart a new wave of revenue and profitability growth for media companies and laid out four reasons why.

1. LLMs do not provide strategic leverage

Many news publishers depend heavily on traffic from search and social, and that traffic is dropping dramatically. Depending on the publisher, referral traffic from Google has fallen by anywhere from 40% to 90%.

News companies that are looking to diversify their audience sources can go toward one of two extremes. One option is to become a creator on Big Tech platforms.

“That is, acknowledge that their Web sites and apps are essentially brand presence whereas most of the revenue is likely to come from off-product sources like YouTube, Substack, physical offline events, etc.,” Parrikh said.

LLMs are unlikely to improve a company's odds in this scenario: if a publisher leverages LLMs to automate content generation at scale, platforms like YouTube and Substack will likely downrank much of that LLM-generated content.

“Which means despite the investment and despite the increase in quantity of content produced, you might not earn more,” Parrikh said.

And if audiences and algorithms do accept synthetic content, platforms like Facebook will likely start generating their own, cutting publishers out of the equation entirely.

Alternatively, publishers can go down the path like The New York Times and Disney+ and start investing in building owned and operated platforms. This would require getting to a point where audiences start relying on the company’s own apps and Web sites to consume content.

“This is an extremely hard transition,” Parrikh said. “You’re essentially saying [a media company can] transition most of your traffic from logged out to logged in, transition most of your traffic from Google Discover and Search to directly coming to your Web site.”

News companies need a compelling customer value proposition. They need to prove to consumers that having the app on their phone is worth their time.

“Sadly, many publishers lack the people, process, and technology to pull off this transition. It’s a very difficult transition that even Disney+ has struggled to pull off,” Parrikh said.

The New York Times has already been working on this strategy for over a decade, Parrikh said, and LLMs have no role to play here.

2. Scaling LLMs can be risky

Right now, LLMs operate as probability engines: given a word, they predict the next word. They're great for language completion tasks dealing with grammar, style, phrasing, and vocabulary enhancement. However, LLMs cannot yet take over core news tasks without extremely high risk.

In news, journalists often write stories about very specific topics that the LLM was not trained on. Because the model's information on those topics is sparse, hallucination is highly likely: responses that are wrong, nonsensical, or entirely unrelated to the prompt.

“News constantly changes. That information will not be there inside of the LLMs training and hence it’s likely to hallucinate,” Parrikh said.

For example, if a news organisation were to give its archives on a specific event like the Russia-Ukraine war to an LLM without investing in people to train the model in a specific way, the output would change constantly, which is not ideal for a publisher.

“Also if you don’t fine tune, the style in which the content is written eventually by the LLM is extremely robotic and doesn’t adhere to how your brand writes,” Parrikh said.

At this point in time, LLMs are not good at producing or analysing news. 

“So they can take old content and rewrite it, but it’s unlikely that they will be able to find news meaningfully,” Parrikh said.

LLMs also can't connect dots, analyse, or explain why something has happened. For a journalist writing an analysis article or opinion piece, an LLM cannot provide the insight a meaningful story requires. Readers are unlikely to want pieces that lack novelty and insight; they would see that as a lack of value.

LLMs take human-generated content and create synthetic content from it, and each time an LLM is trained on more synthetic content, quality decreases. Unless this changes, Parrikh believes the success of scaling LLMs for use in a major newsroom remains uncertain.

LLMs can be used as a feature inside a publisher’s operating system since they can help editors do certain tasks faster and better.

“But this is not an exponential return that will kickstart a new wave of growth within the industry,” Parrikh said.

3. LLMs can save a publisher costs, but they do that for everyone

When it comes to solving problems that many publishers face, LLMs can only help with cost optimisation. They can’t help with effectiveness or help companies better balance retention vs. monetisation issues. 

While cost savings are great, Parrikh believes the 20%-30% savings LLMs can deliver will simply become the new industry norm, giving no single publisher much of a competitive advantage.

Right now, Bloomberg is the only news media company Parrikh identifies as investing in training its own LLMs, so everyone else is using the same models.

4. What if ChatGPT releases new models? 

Current research on LLMs is showing that increasing the number of parameters used to train foundational models is no longer making them better.

Media companies are waiting for natural language inference as the next step for this technology: the point at which the tech stops merely predicting words and starts inferring what is written in the text and spatially understanding knowledge.

But the tech isn’t there yet.

“Hence you cannot create models at mass from LLMs today,” Parrikh said.

So what do publishers do until that day comes?

“Keep using LLMs, build the skill, build intuition, do not expect it to magically improve the business,” he said.

For publishers that have the budget, he recommends investing in fine-tuning the models and doing supervised task training.

And for those with an even bigger budget?

“Start building the technical know-how to host open source foundational models,” Parrikh said. “These are all important steps towards being ready for when natural language inference comes in.”

