Copyright fight regarding generative AI in music is a good test for media
Smart Data Initiative Newsletter Blog | 04 May 2023
Hi everyone.
This week, we’re picking up a few loose threads on the generative AI and audio copyright content from my previous newsletter. I believe this copyright fight will turn out to be a meaningful test for what will shake out between training data rights owners and AI technology companies. And, well, there’s always money in the banana stand.
And on this, reminding you that the INMA World News Congress is at the end of the month in New York. Drop me a note if you’ll be there so we can meet!
Ariane
The battle lines in Generative AI tech vs the recording industry are firming up
Not to toot my own horn (but yes, absolutely, to toot my own horn, since you read it here before you read it in The New York Times), but the spotlight on a rather great bop “from” the artists Drake and The Weeknd is now making ripples all the way into important places: the earnings calls (of Spotify, SiriusXM, and Universal Music Group), according to a story from Bloomberg.
Hollywood studios and the recording industry quietly do battle against copyright infringements of all sorts all around the Internet, and we don’t hear about it unless a new Napster-like entrant shakes things up. This is so routine, in fact, that one of YouTube’s great early triumphs was finding a way to scale up rights claims so that YouTube could keep growing its catalog and traffic while making an ally of rights holders, even as their content was being uploaded by copyright infringers.
This system is called Content ID, and it is a powerful display of technology turning around something that should have dragged young YouTube straight into court and instead turned it into a money-making machine for a creative industry that could have instead wanted it dead.
Movies and audio have, of course, more “personality” than text. The incarnation of content with a face or a voice provides more of a distinctive quality right out of the gate than words on screens. This is why the value of a Drake is so specific. This value isn’t just loaned to another hip hop artist, whereas, arguably, many articles from a specific publication have certain shared characteristics of personality and aren’t so distinctive. And, also arguably, you can be equally satisfied reading an article about a given topic from a number of publications of similar caliber.
This will probably weaken the ability of copyright holders of text-based content to negotiate with generative AI tech companies that train on content because if Publisher A makes certain demands (or asks to be removed) before its content can be used to train a large language model, Publisher B may just have equivalent training data to provide.
But Hollywood and the recording industry have built-in distinctiveness in their creative assets, which both increases the value of these assets and gives them a position of strength from which to defend them. “You can’t train on my Star Wars movies” will, de facto, make it very difficult to have anything Star Wars-related to train on. All of it is copyrighted, and there’s only one rights holder. And if there’s some Star Wars content that makes it out of a generative AI system, you don’t have to be an attorney (which, if you needed to be reminded, I am not) to imagine that this synthetic content is going to court.
Universal Music Group’s CEO Lucian Grainge noted during his company’s earnings call last week (as reported by Bloomberg) that platforms distributing AI-generated music will have an “additional responsibility” to identify infringing content or content trained on artists’ voices. But, and this is key, Grainge also noted that UMG would be open to licensing its catalog to businesses that are “legitimate” and “supportive,” according to Bloomberg. (There is another great story reporting on this call from Variety if you want to go further.)
Which is why I took note of this coverage of generative AI becoming a topic of concern at earnings calls. Forget the writers’ room or the recording studio: The most important room in Hollywood will always be the box office and the earnings call. If there is a perception that generative AI may depreciate the earning potential of future creative work, and depreciate the value of the catalogs of studios and record labels, the threat just got serious.
Netflix, Spotify, Hulu, et al. were only allowed to live so as to crowd out Napster and BitTorrent (and it generally worked; what most users wanted wasn’t so much to cheat on payments as to have an easy, all-you-can-eat buffet of good content). YouTube’s Content ID is a peace agreement that turned infringement into a method for automated revenue sharing. Essentially, tech companies and creative rights holding companies designed revenue sharing models they could live with and supported this with technology. The sequence of events is important here: The agreement on revenue sharing had to be there for the tech platforms to live.
So we now know something: The motivation to find a way forward is present in the highest rooms. And this is good for us in the news industry because whichever methods are designed to handle that licensing and revenue sharing may very well be blueprints for us, too.
Now we’re not a perfect copy there. Because we do news rather than entirely arbitrary creative work, not everything that our companies build is distinctive enough to be a highly defensible asset. When we cover straight news from a news conference that’s also piped live on the Internet, this is a significantly more complicated asset to defend than a thoughtful analysis built from several interviews conducted by your reporter (to be clear, from an IP perspective, any creative work is protectable — the issue here is rather enforcement).
For example, a problem that may end up being specific to news content is the creation of very generic copycat content sites. Just this week, the organisation NewsGuard, which monitors misinformation online, reported it had identified 49 sites that seem to be generating “basic” news content across a few common languages. These sites, which seem to mainly have the goal to monetise via a high load of programmatic ads, are reminiscent of the content farms of yore: unlikely to rise to organic prominence but likely to be pests if they figure out how to arbitrage their cost-to-acquire clicks against their ad revenue per page.
The real stakes are in large-scale agreements with the bigger AIs training on licensed content. So it will be interesting to see just how conversations between larger parties (the creative industries, the larger technological platforms) can evolve, whether on a single-item or a catalog-wide licensing basis. The heart of the matter is likely, in the end, the devising of a general model for revenue sharing rather than the specific licensing of properties.
Hot off the presses: My new report on generative AI
My latest INMA report, News Media at the Dawn of Generative AI, dropped a couple of weeks ago. It’s free to INMA members, so I hope you’ll download it and feel ready to take on the coming wave of technological changes.
Further afield on the wide, wide Web
A few good reads from the wider world of data this week:
“Dr. [Geoffrey] Hinton said he has quit his job at Google, where he has worked for more than a decade and became one of the most respected voices in the field, so he can freely speak out about the risks of AI. A part of him, he said, now regrets his life’s work.”
So The New York Times introduces the gravity of an important moment in this age where a fledgling generative AI is taking flight. The New York Times interviews Dr. Hinton in this piece (gift link), which well deserves your attention.
And some more lighthearted fare:
- How GPT learns, with some modelling on Jane Austen or Harry Potter. Watch it learn with 500 rounds of training or 30,000 (and marvel? Or be frightened … your pick). (A NYT gift link.)
- “‘Total Crap’ written entirely by AI,” a funny article from the good people of McSweeney’s (which, yes, is humour). If this paragraph stays in print, remember that there’s an extra pin in the Ariane-shaped voodoo doll that INMA’s editor certainly keeps in her desk to make me pay for using a dirty word.
- Another eyebrow-raising headline: “First real-world study showed generative AI boosted worker productivity by 14%” reads like something that probably needs a few caveats: such a precise number for something that can only be highly dependent on the sample and even on how productivity is defined. But, the odd precision of this number notwithstanding, Bloomberg looked at a recent study from Stanford and MIT, and we can all dissect how we feel about the perfection of its findings.
- Extra credit opportunity with this one: the white paper on AI published by the Liberal Democratic Party of Japan. I’m actually linking to a LinkedIn thread here because the comments are interesting, too. In a nutshell (it’s 26 pages, so I’ll give you the TL;DR): The paper is part industrial policy and part call for regulation, but without the bent you often see in regulation literature where a not-so-hidden angle is to affect industrial competition. It is an incredibly digestible read. It really is worth reading if you want a broad overview of some areas of concern and interest for generative AI at the scale of a country. Many of these concerns (literacy, abuse, governance) are as true for a national government as they are for news media.
- And finally, and I’m really not sure why I am sharing this but … Italian Vogue generated its first cover using AI, and Futurism shared it with this headline: “Vogue Just Used AI-Generated Imagery in a Nightmare Fuel Cover Shoot.” I will therefore not be providing my own commentary.
About this newsletter
Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company, Helio.cloud.
This newsletter is part of the INMA Smart Data Initiative. You can e-mail me at Ariane.Bernard@inma.org with thoughts, suggestions, and questions. Also, sign up to our Slack channel.