Record label battles with generative AI carry lessons for news industry

By Ariane Bernard


New York, Paris


A little musical moment for this blog. And this will be a temporary respite from my relentless Swiftie programming because this week, it’s about Drake and The Weeknd.

You see, there is a great lil’ bop “from” the two artists making the rounds on TikTok. Now, I am no music critic, but as the youth say and to use a technical term: It slaps. You could hear a sample of it on this tweet when I wrote this newsletter, but the copyright laws seem to have caught up to it just before it was published:

Between the writing of this newsletter and its publication, copyright laws also removed this platform for the AI generated song.

Where this gets interesting for our purposes: AI generated the song.

UMG, the record label behind Drake and The Weeknd, has not enjoyed the work, to say the least, and is fighting on two fronts: pressing streaming services to block AI makers from scraping its artists’ songs, and getting the AI-generated tunes themselves pulled from distribution (see more in this article from the Financial Times).

It’s been fairly successful on that last goal. In the few days between the release of the song and my submitting this article for editing, the song from TikTok creator Ghostwriter977 was removed from YouTube, Spotify, and the original TikTok. That said, you can still find it if you search for the name of the tune, “Heart on My Sleeve,” because the Internet never forgets.

I have just spent the past three months researching a report into generative AI. The legal and intellectual ramifications of generative AI were among the most interesting, but also most complicated, parts of that research.

In the matter of blocking AI-generated tunes, for example, there is a question of authorship, which is rather complex to dissect: Neither the artists nor the composers of the songs on which the AI trained are the authors of the AI’s synthetic work. And the financial interests who back these artists (UMG in this case) aren’t rights holders to the synthetic tunes either.

Yet there is a question of what I could best call “filiation,” which really is not addressed in our current intellectual property concepts simply because there were no real grounds for it until now.

Until this age of artificial intelligence, someone was an author or they weren’t. If they used non-human tools to aid in creation, those tools and their makers didn’t get credit (musical instrument makers don’t get credit, recording equipment companies don’t get credit, and the AI doesn’t have copyright either). And if someone inspired a piece of work, or even gave the idea for the work, they had no IP claim to it (ideas, famously, cannot be copyrighted or patented).

So this takes us to the question of how the scraped data on which the AI trained can be considered a part of the output — this filiation question. 

Does the training data give its owners a copyright claim on the output? That’s about the only place where there may be some actual enforceable claims. Lawsuits like the one Getty Images brought against Stability AI, the maker of Stable Diffusion, for training on its licensed content argue that AI training goes beyond fair use, even though there is a degree of transformation involved in the synthetic output.

Much of the output from AI-created sound does not fall under current copyright laws.

The rest of the output, like the voice being used, is not copyrightable in the current state of the law. There are some provisions under which the law blocks the use of a well-known voice generated by AI, but they turn on how the voice is used rather than on the voice itself being copyrightable.

In 2020, Jay-Z tried to block deepfakes mimicking his voice but found he couldn’t: YouTube reinstated the videos he tried to get banned.

And lest you think the problem would go away for UMG if it succeeded in preventing AI makers from scraping its artists’ output, that would probably only slow down the AI’s learning. There are plenty of places for an AI to “learn” the voice of Drake that wouldn’t require using licensed material: Drake gives interviews, and he speaks on his social media. The AI can learn hip hop from lots of different non-licensable places and can “learn” Drake’s voice from lots of different places where UMG has no copyright to defend.

“ChatGPT and similar tools commit a highly sophisticated form of plagiarism,” said Jenna Burrell, the director of research at Data & Society, in an op-ed in Tech Policy Press. “The bigger concern is how ChatGPT concentrates wealth for its owners off of copyrighted work. It’s not clear if the current state of copyright law is up to the challenge of tools like it, which treat the Internet as a free source of training data.”

In the news media, we, too, have strong authors whose distinctive voices are part of the final product we put out. And voice may be “vocal,” but even writing style can be distinctive enough. As news organisations are often known for distinctive signatures, long-time readers can often recognise their creative styles. 

As for audio, there are plenty of synthetic voices available to get your content out in automated ways, but audio branding is, of course, a very real thing, and not just the jingles in front of audio programming.

As an example from the U.S., certain NPR programmes are instantly recognisable to frequent listeners just by the type of production they receive (Radiolab, my beloved). Voices, of course, are the strongest branded signature for the content itself in an audio-only environment.

Aftenposten in Norway trained an AI on the voice of one of its podcast hosts, Anne Lindholm, so every article became available as audio in a voice that listeners already associated with its brand and products. They could have used an off-the-shelf synthetic voice and called it a day, but the extra trouble was worth it.

Many news brands, like NPR's Radiolab, are instantly recognisable by frequent listeners, meaning their audio is part of the brand itself.

Even without generative AI, delivery style or written style could be considered a kind of brand asset. In 2015, BuzzFeed created a Tom Friedman quote generator modelled on The New York Times columnist (the Twitter bot is still there, though it hasn’t tweeted since 2016). This one is clearly parody, so it is protected speech under the First Amendment and unlikely to devalue the brand asset of The New York Times.

But if you are UMG and you signed Drake for what appears to be a record amount of cash, the synthetic tunes are potentially devaluing the asset you fought pretty hard to get. Flood the market with enough synthetic Drake songs … are the real ones worth less?

Not to mention that in the economy of streaming services, potential earnings are a zero-sum game: a fixed amount of royalty money per subscriber is split in proportion to streaming time. In this respect, Synthetic Drake and Real Drake are fighting equally for streaming minutes.

What should be interesting for the news media is looking at how the broader world of content creators and their rights holders — whether that’s UMG or Hollywood studios — take on AI, because their assets usually have more individual longevity in the market than everyday news has.

In news, the totality of our catalogs (our archive) has value, but very few assets have long-standing value in individual distribution. Some of your articles continue to do well in search for years, but most articles are a flash in the pan. We therefore tend to think of the value of our archive mostly through the prism of global B2B licensing deals rather than end-user single-item distribution.

The assets of a movie studio or record label’s catalog continue to have individual distribution value for much longer than news does, so there are really two battles ahead for these media rights holders: picking a fight with AI companies on the matter of the training data and with distributors of the synthetic material on the matter of infringement.

But the news media could, in the end, also face a similar challenge if an AI decided to create content in the “tone and voice” of a well-known publication — and do so day in and day out. At the moment, the topic of scraping is the one that occupies the news media more so than the distribution angle, but the fight of record labels may tell us something about our future, too.


