Chatbot, LLMs further data innovation at Schibsted Media

By Veera Törmä

Haaga-Helia University of Applied Sciences

Helsinki, Finland


Reader feedback is key to GenAI progress at Schibsted Media.

“It is one thing to have this idea, but it’s completely another to put it live in real-time and to put it to the test,” Juan Carlos Lopez, director of data and AI at Schibsted Media in Norway, told attendees of INMA’s annual Media Innovation Week in Helsinki this week.

Without customer participation, he added, “we wouldn’t know the right answer.”

Juan Carlos Lopez shared Schibsted's tips for driving innovation within the company.

The power of chat

Schibsted has utilised its proprietary data as a competitive advantage with the Tek Oracle (“Test-Orakelet”) chatbot, which was designed to answer customer questions.

While the Tek Oracle occasionally provided incorrect answers, the overall results were positive. Lopez noted the company could potentially monetise the chatbot, as it delivered strong results in a short amount of time.  

Schibsted uses Slack for work-related tasks while harnessing the power of its own data. However, there is always a human behind the desk, overseeing how things are progressing.

A language all its own 

In collaboration with the Norwegian University of Science and Technology (NTNU) in Trondheim, Schibsted developed its own language model: SchibLM (Schibsted’s language model). It was trained from scratch on Norwegian, Danish, and Swedish text, and was designed to reflect how language is spoken and written at Schibsted.

The goal of the language model is to help journalists work more efficiently and improve the quality of written articles. 

For instance, the model provides recommendations for the most suitable titles. Photo captions, which journalists in busy newsrooms sometimes forget to add, are proposed by SchibLM. Article summaries are generated automatically once the article is complete. According to tests, SchibLM has performed well compared to larger language models. 

“We have transcribed over 12,000 podcasts for our podcast platform PodMe, offering easily accessible text output stored for analysis and potential future modelling,” Lopez said.

Having thousands of hours of spoken Norwegian transcribed into texts is a valuable asset. 

“It’s one thing with written Norwegian, but it’s very different when people are just conversing. They might be using slang words or casual language,” Lopez said, explaining this makes audio transcription more challenging.  

Automatically transcribing audio from various sources is another new area Schibsted is currently exploring: 

“For example, we can have a podcast where politicians are talking about their plans in Norwegian and make statements in the interview. The idea is to combine all these sources and transcribe them. The technology will allow us to automatically identify who is speaking and which topics are being discussed.”

To drive innovations in media, Lopez offered attendees one parting insight: “Making sure data is readily available is how we can democratise the access to data.”

