Bergens Tidende found its voice for text-to-speech audio project
Ideas Blog | 17 June 2025
Text-to-speech is a technology many media companies have embraced in recent years, and Bergens Tidende (BT) has been a forerunner in this field.
Synthetic voices are typically generic and neutral, so we chose to reflect our local identity by cloning a voice with a distinct Bergen dialect. As a regional media house, incorporating this local dimension was crucial to building credibility, recognition, and emotional connection with our listeners.
Our voice cloning started with a simple question: How do we find the perfect voice for Bergen? A voice that truly sounds like the city’s people: natural, relatable, and authentic.
To find it, we hosted an internal audition within the newsroom, with 15 journalists participating in a live reading session. A jury, with input from our partner Beyond Words, ultimately selected BT journalist Eir Stegane as the voice best suited to the project.
This inclusive approach — involving the newsroom in the decision-making — had an unexpected but highly valuable side effect: It sparked internal engagement and a sense of ownership, anchoring the project deeply in the organisation.
Finding the right tone
Once the voice was chosen, we entered the technical phase: the cloning process.
While it has become relatively easy to create a synthetic voice using AI, capturing a regional dialect and making it sound natural, expressive, and human is a different challenge altogether.
Stegane spent hours in the studio, recording no fewer than 4,000 sentences. These recordings gave the AI engine the data it needed to understand and replicate the unique features of her voice, from intonation, rhythm, and melody to the subtle, characteristic nuances of the Bergen dialect.

Refining the voice of BT
Even with thousands of audio samples and advanced technology, the result wasn’t immediately flawless. Since the launch in summer 2024, we’ve been iterating continuously to refine pronunciation and speech quality.
We brought in a linguist to analyse the voice’s performance and help improve its flow, accuracy, and authenticity. Listener feedback has also been a vital part of this process. As the technology evolves, the voice continues to improve.
This reflects BT’s commitment to not just create a digital voice, but to craft “The Voice of Bergen” — an auditory representation of our city and its people.
It is this very linguistic uniqueness — the unmistakable Bergen accent — that sets BT’s project apart from similar initiatives. It resonates with our audience, fostering a stronger emotional connection to the content.
This could be especially valuable in reaching younger audiences and individuals with reading difficulties, many of whom prefer listening to reading. According to Dyslexia Norway, between 5% and 10% of the population lives with a reading disorder, and for them, audio is an essential gateway to information and inclusion.
Another exciting outcome of the Bergen voice is its role as an accelerator for product development. We’re testing automated afternoon podcasts summarising the day’s top stories and integrating them into Spotify-style news playlists with full CarPlay support.
The response from our users has been positive. Listeners seem to appreciate the sense of local belonging that the voice provides, and BT is seeing a steady increase in the number of people who listen to our content.
This shows that the Voice of Bergen is more than a tech novelty; it’s a powerful tool for enhancing the user experience and expanding our reach to new audiences.