In a recent blog post, we shared that we created a cloned voice using machine learning, making it possible for our audience to listen to articles from Aftenposten. Using quantitative data, we learned what people did — and didn’t — like.
In addition to learning from the quantitative data, we interviewed six people who tested our playlists and learned a lot from their feedback. Here, we highlight four findings that determined what we will focus on for audio in 2024.
Too few know about the audio opportunity
One important learning is that what you don’t expect, you do not see.
Many users who are now enthusiastic about text-to-speech and use Aftenposten frequently did not discover the audio opportunity until many months after we launched in March, even though the play button has been visible in most of our written articles from the start.
Users also did not discover playlists through the widget on the front page or in the podcast section of the app. This is probably because listening is not part of their news habits when entering the Aftenposten universe.
To put it another way, the mode you are in also matters.
Most of our users who enter the front page are in a reading mode and might not be in a situation where it is appropriate to listen. Our hypothesis about the importance of being in the mode for reading — and being in a situation where it is possible to listen — was strengthened in one of our first user interviews about playlists. The person had clicked on the playlist widget on the front page, but she intended to read:
Another awareness problem is that many users actively prefer to visit our front page through a browser, while the playlists are only available in the app. In interviews, they asked for the ability to listen to more than one audio article in a row and did not know the app already offered this.
Users like to be surprised
The “automatically play next” functionality allows users to listen continuously without actively choosing what they will listen to next. Instead, the next article is picked from our suggestions. This lets users discover and be exposed to new and unexpected content. It turns out that users see this as a positive experience, and it reminds some of them of listening to the radio.
This quote from one of the playlist testers we interviewed illustrates the point well:
Make an audio-friendly text version
Another key finding is the need for a text version that is fit for purpose, meaning text that makes sense when read aloud.
In several interviews, users said they were confused and annoyed when textual cues like subheads and highlighted quotes were read aloud twice; those elements only make sense in a textual context.
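As an illustration only, here is a minimal Python sketch of how an audio-friendly version might be derived from an article. It assumes the article is available as a list of typed blocks; the `Block` type, the `kind` labels, and the `audio_friendly_text` function are hypothetical names chosen for this example and do not describe our actual pipeline. The idea is to drop subheads and to skip highlighted quotes that already appear in the body text, so nothing is read aloud twice.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str  # hypothetical labels: "paragraph", "subhead", "pullquote"
    text: str


def audio_friendly_text(blocks):
    """Return the text to send to text-to-speech, skipping visual-only elements."""
    # Join all body paragraphs so repeated quotes can be detected.
    body = " ".join(b.text for b in blocks if b.kind == "paragraph")
    kept = []
    for b in blocks:
        if b.kind == "subhead":
            continue  # assumed to be a visual aid only
        if b.kind == "pullquote" and b.text in body:
            continue  # already part of the body, would be read twice
        kept.append(b.text)
    return "\n".join(kept)


article = [
    Block("paragraph", "The mayor said the plan is ready."),
    Block("subhead", "What happens next"),
    Block("pullquote", "the plan is ready"),
    Block("paragraph", "Work starts in May."),
]
print(audio_friendly_text(article))
```

A highlighted quote that does not appear verbatim in the body is kept in this sketch, since dropping it would remove content rather than a duplicate.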
Continue improving the quality of the voice
Even though our Norwegian voice is among the highest-quality voices in Scandinavia, it is far from perfect, especially compared to cloned English voices. We have learned that mispronounced foreign words and incorrect word stress annoy people the most.
What this means for 2024
To summarise, we learned that to get more users to start using text-to-speech in situations where they can’t read, we need to make it easier to discover and use.
We also need to give new and existing users more and better information about all our audio offerings. We will also focus on making an audio-friendly version of the original text and on further improving the quality of the voice. We have already hired a linguist who will make sure we drastically decrease the number of mispronounced words.