How do readers feel when AI-generated voices read the news?
Smart Data Initiative Newsletter Blog | 09 June 2023
Hi everyone.
Lots of conferencing going on for me. Barely recovered from INMA's World Congress of News Media in NYC, then off to speak in Dublin for Google's Publisher Data summit, then again to Dublin, where I'll be on a panel in mid-June for the European Broadcasting Union's News XChange.
So I get to meet many of you in person, which feels very 2019. But somebody asked for my business card just last week, and I had forgotten these were even a thing. And I somehow attended a full day of conferences without a phone charger (big mistake, as I eventually remembered). So, very fun to meet many of you in person, even if, apparently, I am no longer fit for this exercise.
Please, let's meet over Zoom. I have reached the stage where I can only be 0s and 1s. A kind of reverse generative AI, if you think about it.
I will leave you to ponder this dystopian idea.
See you in two weeks,
Signed: ArianeGPT
How synthetic voices affect users’ perception of news credibility
A few weeks ago, I was lucky to attend the Nordic AI in Media Summit in Copenhagen, which was full of interesting case studies from publishers and academics whose work touched various parts of machine learning. One of the cases there addressed the upsides, but also the complexities, of using synthetic voices to read the news.
There is, indeed, a lot of excitement around the growing capabilities of text-to-speech, much of it for good reason. This is a technology that has advanced enormously in just a few years, and not just for the biggest languages.
But a lot of the commentary on the technology focuses on pure capabilities. And, well, you're reading a newsletter that often focuses on capabilities too, so that's in no way an indictment of such a focus. But since many of us have an interest in the final outcome as it applies to our industry (can we use this, specifically, to convey the news?), we have to look beyond whether this text-to-speech AI is serviceable and ask whether it accomplishes that final outcome: doing a good, credible job of being a vector for news distribution. There is a real difference between these two concerns.
Lene Heiselberg, associate professor at the Centre for Journalism of Syddansk Universitet in Denmark, shared her research on how audiences respond to this technology. She carried out semi-structured interviews with frequent radio listeners of different ages and from different regions of Denmark.
Some groups did identify, or suspect, that the news was being read by a synthetic voice, and they were affected, sometimes positively and sometimes negatively, when they detected the presence of the technology. "They thought it sounded like their GPS or like a robot," said Professor Heiselberg. On the other hand, she also found that users could get equally annoyed (though not for the same reason) with the same voice when they thought it was human.
Things were clearer when it came to the factors that affected the credibility of the synthetic voice. Notably, Professor Heiselberg listed some of the same factors as both increasing and decreasing credibility; basically, the eye of the beholder mattered more than the inherent characteristics themselves.
For example, the synthetic voice's lack of emotionality could be perceived as a source of increased objectivity in the reporting, because users felt they couldn't be emotionally manipulated. "That came as a surprise to me," Professor Heiselberg said.
But the lack of emotion could also jar against listeners' expectations of what the tone of a news-reading voice ought to be for certain types of stories, like the weather, deaths, or sports. "When you ask the listeners, they want to feel the enthusiasm in the reporter's voice when their team won," said Professor Heiselberg.
Similarly, the persona the synthetic voice was loaned, and where that voice was understood to come from, could also play in its favour or disfavour.
Finally, the context in which the synthetic voice was introduced (was it disclosed as synthetic?) also affected perceptions of credibility. Here, the results were less ambiguous: disclosure helped make the technology credible, though it could have a knock-on effect of a more social nature, raising questions about the place of AI-driven content creation.
The anthropomorphization of AI, or of robots, is a divisive topic. On the one hand, it helps make these technologies more approachable to us humans. On the other hand, anthropomorphization is a fallacy that exploits the way we've evolved as a species to relate to a category of "things" (other living creatures) to which a piece of software, not being a living creature, does not belong. That's another way of saying it hijacks the regular way you'd assess a non-living thing and takes a more social path instead. A misbehaving robot can go in the trash, but a misbehaving puppy cannot.
Professor Heiselberg noted four ways that users would humanize their synthetic news reader:
- Giving them physical attributes.
- Associating them with real or famous humans.
- Inferring from the voice that it belonged to someone with certain stereotypical characteristics.
- Giving the voices a human backstory.
I'd like to point for a second at some of the text-to-speech AIs many of us have encountered: Alexa and Siri. Both have human-sounding names, which is probably not by accident. Furthermore, they have female names, probably because these are seen as more friendly, collaborative, and benign (qualities that we loan to women), even if these assistants also have alternate male voices available.
And some of these synthetically voiced assistants very much lean into the notion of being friendly and having emotions. I burn with the fury of a thousand suns when Alexa tells me, "I hope you're having a great day," a disingenuous attempt to trigger in me some reciprocal empathy for a hunk of plastic and a computer processor. Now, to be fair, you apparently don't even need a voice to get this outcome: The Internet tells me there's a whole trend of people who think their Roomba robot vacuum cleaners have personalities because, apparently, they are programmed to mimic having one.
This is an example where having personality, or an attempt by the makers of the software to give their product a personality, is seen as desirable, presumably for further entrenching the robot into our lives.
In the context of news, however, credibility, rather than friendliness, is the measure by which we live or die. In this respect, Professor Heiselberg shared some takeaways from her research participants:
- Communicating the presence of an AI-generated voice.
- Considering the types of news content where text-to-speech should be used.
- Being mindful of how credibility is affected by a voice that delivers the news without emotion.
- Being thoughtful about the tone of the voice itself, remembering our tendencies for anthropomorphization.
- Considering how this voice will become part of your brand identity.
I would not be surprised if there were some regional variance in how users perceive these synthetic voices. Just as some societies treat pets as members of their family while other societies don't, how we feel toward robots in our lives probably varies, too.
If you are a news media company making a push for text-to-speech at scale, this type of user testing may highlight qualitative insights you won't readily see in your quantitative user data. Yes, they are just robots, but they are providing information that goes well beyond confirming that you've locked the garage door.
And also, maybe they are not “just robots.”
Further afield on the wide, wide Web
A few good reads from the wider world of data this week:
- Has it been a hot minute since I last wrote the words "Cookie Apocalypse"? I'd hate for you to think the plot has been lost, because it hasn't. In Digiday, a great summary of the publisher side of ad tech, with advanced leaders explaining their current thinking on not jumping on vendor-touted ID schemes. Not everyone can be Schibsted and push their IDs as a way to buy their first-party audiences, but the reluctance of publishers to replace cookies with cookie-like schemes certainly speaks to the growing maturity of the industry.
- Many of you may have heard of this one, but I'd be remiss if I didn't link to the coverage of the open letter signed by several key leaders of AI technology platforms: "Mitigating the risk of extinction from A.I. should be a global priority alongside other societal-scale risks, such as pandemics and nuclear war" (NYT gift link). There are, of course, lots of takes (sober, hot, etc.) on the Internet, but here's a sober counterpoint among many, from Ars Technica: "To be clear, critics like [Dr. Sasha Luccioni, a machine-learning research scientist at Hugging Face] and her colleagues do not think that AI technology is harmless, but instead argue that prioritising hypothetical future threats serves as a diversion from AI harms that exist right now — those that serve up thorny ethical problems that large corporations selling AI tools would rather forget."
About this newsletter
Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company, Helio.cloud.
This newsletter is part of the INMA Smart Data Initiative. You can e-mail me at Ariane.Bernard@inma.org with thoughts, suggestions, and questions. Also, sign up for our Slack channel.