Transparency in AI guidelines isn’t a black-and-white process

By Ariane Bernard

INMA

New York City, Paris

Hi everyone.

This week, I’m headed into big-words territory: Transparency.  

How do you live by this in the age of AI? How could we define it, and where does it leave our organisations when we don’t? Arguably, transparency may be able to borrow from the classic retort given by U.S. Supreme Court Justice Potter Stewart when pressed to further define hard-core pornography: “I know it when I see it.”

But when we write AI guidelines that advocate for transparency to our users in our use of AI, where does this leave our news organisations?

And a reminder, in case e-mail blasts got redirected to the wrong filters: our data master class is coming up in just two weeks, starting October 5. The agenda is here.

Until then, all my best, 

Ariane

On what “transparency” means in the age of generative AI

You have no doubt heard about various news media companies that have started to work internally on establishing guidelines for their use of AI (generative and otherwise) in their organisations. Some of these companies have already published their work, and three academic researchers have done the Lord’s work by analysing and comparing 52 of these policies, covering 12 different countries.

In their work published through Oxford University, Kim Björn Becker of the Frankfurter Allgemeine Zeitung and New Trier University, and Felix Simon and Christopher Crum, both of Oxford, dive deep into some of the topics that are making publishers both excited and anxious when it comes to bringing AI into our organisations.

Obviously, I live for stuff like this: It’s a ton of work, and it allows the less industrious (it’s me, hi) to get a sense of where the comfort zone of news organisations has been crystallising, as well as where they are articulating their no-go zones.

But it’s actually the space in the middle that’s even more interesting: the ambiguous gray areas, because, arguably, this is where humans will also pause and hesitate — even with guidelines in hand.

The authors note there is some amount of homogeneity in these guidelines. And looking at the list of reviewed policies, a majority of which are from the DACH region and the Nordics, this makes sense because these are mature media markets where the “rules of the road” are also somewhat homogeneous. We’re talking about mainly subscription-supported players, so the stakes around the trust that individual users will have in your content are particularly high.

I interviewed Kim Becker and Felix Simon to get their analysis on the work they did. They noted the prevalence of two topics, in particular, in the guidelines they reviewed.

“There are two very prominent categories when it comes to transparency and human supervision,” Becker said. “So many guidelines make statements about being transparent towards the audience about how to disclose the presence of AI — when AI has been used in the journalistic process at some point and, of course, whenever a human is called to supervise what AI has done.” 

Becker is a full-time writer at FAZ but has taken an academic interest in issues surrounding AI in media, so he contributed with a special perspective on the question as a “member of the tribe.”

And it’s worth pausing on transparency ranking so high. 

“What we found in our non-representative samples is basically that 82% of transparency is not explicitly specified in the way that should be communicated,” Simon said. In other words, the principle to espouse is spelled out, but the ways to support it are still left undefined. 

I think this gray area is where news organisations will actually find some of their harder conundrums. Because, well, almost everyone would agree that supporting world peace is a good goal, but the stating of the goal isn’t really what folks are after. It’s the road map to get there that’s missing.

Take, for example, some of the debate around how many disclaimers to attach to work where AI has contributed. This is a question that has come up so many times in the past few months for me (truly, a question for the ages). You’ll find organisations that have put a lot of effort into writing thoughtful disclaimers and have made them visible enough on the page that there could be no argument the organisation is trying to hide them.

But as any product person who has ever dealt with adding terms of service and other mandatory disclaimer language to a digital property knows, heavy-handed disclaimers receive less interaction than short ones. The paradox of the disclaimer is that the more information and transparency you offer, the less users actually learn, because your transparency comes across as a painful ordeal for the user.

So there is a conflict here between wanting to inform our users about how, where, and in which specific areas a news organisation is bringing AI into its product, and the fact that transparency through disclaimers has diminishing returns.

VG in Norway recently joined our Webinar series to chat with us about their work with AI-generated summaries. Vebjørn Nevland, a data scientist at VG, explained they only minimally disclose the presence of AI in the summaries because the presence of a human reviewer changes the role of the AI from creator to supporting tool.

And I agree with the folks at VG here. This is a case where we shouldn’t confuse our goal to receive the trust of our users with the goal to blindly disclaim for the sake of disclaiming. I think we should think differently of transparency in the context of AI depending on whether a human has a chance to review the work versus an application where AI gets to publish unchecked.

And even then, there is a difference between content that’s been generated by an expert system — that is, content automated by a smart system where all the rules and patterns are provided by a human programmer — versus content that’s been generated by a deep-learned AI, whose model of behaviour is by definition a black box to us. 

Content generated by an expert system — like what Tamedia does with its standing articles on all the towns and cities of Switzerland — is never surprising by its very nature. If something is odd, it means the data came over bungled in some way or it means we didn’t test the automation enough.

So a simple line to say this content was automatically generated based on templates is enough to explain that humans aren’t there, day-to-day, to check out the content but that humans have full control and ownership over what gets printed.  

The analogy here is whether you really expect an industrial bread factory to tell you a machine packed the bread in plastic and another machine sliced it. It’s not really material to your decision to buy or eat the bread. And if the product comes in misshapen, it will say more about issues with this factory’s quality control than about the fact that they didn’t tell you a machine packed the product.

VG uses GPT-4 to summarise its articles, with a human editor checking the content before it makes it to production, thus rendering the deep-learned AI tool essentially non-autonomous. But the main reason in this day and age to provide some non-intrusive way for users to learn more about the role of AI in a feature is simple: There is such hype and myth-making around AI that most media companies probably don’t want to get publicly attacked for having “covertly” brought an AI tool into the content creation process.

The fact that a tool is involved in scaling up production feels as benign as our use of an oven to cook food or a coffee grinder to make a cup of coffee less of an ordeal to prepare. For now, we inhabit these gray areas the way our ancestors may have feared the first phones. The AI guidelines of organisations have left transparency to be interpreted, on a daily basis, by their employees. Hopefully, this means we can evolve what exactly this translates to, in practice, as we move further along.

About this newsletter

Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company, Helio.cloud

This newsletter is part of the INMA Smart Data Initiative. You can e-mail me at Ariane.Bernard@inma.org with thoughts, suggestions, and questions. Also, sign up to our Slack channel.
