Input- and usage-based models offer different payment strategies for AI contributions

By Annelies Jansen

ProRata.ai

Pasadena, California, USA


One of the most complex questions publishers face when considering partnerships with AI companies is how their work should be valued and compensated.

Generative AI systems are powered by editorial content: articles, investigations, explainers, and opinion pieces that inform and shape public discourse. Yet, the economic relationship between those creating the content and those building the models remains uneven.

Generative AI systems are powered by editorial content, and news companies should rightly seek compensation for their contribution to these systems.

As publishers seek to negotiate licensing agreements and test new frameworks, two main compensation models have emerged: input-based and usage-based. Each of these has its own strengths and limitations.

Why input-based models fall short

Many of the early AI licensing deals were fragmented, one-off arrangements. They were privately negotiated, often with large or well-known publishers that have the leverage to command attention and the resources to bring legal action. Smaller and regional outlets were left out, and the ecosystem risks consolidating around a handful of players per market.

More recently, input-based compensation models have shifted toward “pay-per-crawl” deals — either directly negotiated or through emerging marketplaces such as TollBit and Cloudflare. Under these deals, publishers are paid upfront for access to their archives and ongoing crawls, granting AI companies permission to use that material to train or power their generative answers.

A major challenge is that publishers and AI companies assess the value of content very differently. Factors such as originality, recency, and the reputation of the publishing brand all influence value, but the two sides weigh them in starkly different ways, which prevents meaningful progress toward an input-based market.

While news publishers should absolutely negotiate to maximise immediate returns, input-only models are unlikely to deliver consistent, predictable, or scalable revenue.

Usage-based compensation: tying compensation to value delivered

A more sustainable long-term model lies in usage-based compensation: paying content owners when their content is used to generate answers. Just as in Hollywood or the music business, or indeed any market built on royalties, this aligns compensation with value creation.

Two main frameworks are emerging in this space: citation-based attribution and claim-based attribution.

Citation-based attribution

Citation-based attribution is calculated by counting how often content is cited or linked within an AI-generated output.

In this approach, answers are produced from vast, unlabelled, and often scrambled training data. Citations are appended afterward, either by heuristic rankers (essentially educated guesses) or retrieved from external search results.

While this marks a step toward greater transparency, it remains an inherently flawed system. Inaccuracy is built in, and the approach may introduce bias toward more recognised brands, which are more likely to be cited or hallucinated. Ultimately, it provides only a superficial understanding of how much each source contributed to an answer.

A variant of this model compensates every content owner whose material is processed during the creation of an answer, even if it isn’t included in the final response. While more inclusive, this approach does not equitably reflect each source’s contribution and cannot account for facts generated solely by the large language model (LLM).
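To make the counting concrete, here is a minimal Python sketch of how a citation-based payout might be computed. It is an illustration only: the function name, the publisher names, and the idea of splitting a fixed payment pool in proportion to citation counts are assumptions made for this example, not a description of any particular vendor's system.

```python
from collections import Counter


def citation_shares(answers, pool):
    """Split a payment pool in proportion to citation counts.

    answers: list of lists; each inner list names the publishers
             cited in one AI-generated answer (hypothetical data).
    pool:    total amount to distribute (hypothetical figure).
    """
    # Count every citation across all answers.
    counts = Counter(pub for cited in answers for pub in cited)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Each publisher's share is its fraction of all citations.
    return {pub: pool * n / total for pub, n in counts.items()}


# Three answers with made-up publishers.
answers = [
    ["Daily A", "Weekly B"],
    ["Daily A"],
    ["Daily A", "Regional C"],
]
shares = citation_shares(answers, pool=100.0)
print(shares)
```

Note what the sketch cannot see: a citation counts the same whether the source supplied most of the answer or a single detail, and a wrongly appended citation is paid like a correct one.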

Claim-based attribution

Claim-based attribution breaks down AI-generated outputs into atomic “claims” (the smallest factual or contextual statements) and maps each one back to indexed source material.

This system offers a high degree of transparency and more accurately reflects each source’s true contribution to the answer, enabling fair compensation. Because this approach relies on indexed content, it narrows the scope of material that can be used to generate responses but delivers greater answer accuracy and trustworthiness.
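A comparable sketch for claim-based attribution is shown below. It assumes the hard part, breaking an answer into claims and matching each one to indexed sources, has already been done, and it only illustrates the bookkeeping that follows. Weighting every claim equally and splitting a claim's credit evenly among its sources are assumptions chosen for simplicity; the function name and sample data are likewise hypothetical.

```python
def claim_shares(claims, pool):
    """Split a payment pool according to claim-level credit.

    claims: list of dicts, each with a "text" and a "sources" list
            naming the publishers whose indexed content supports the
            claim (an empty list means no indexed source was found).
    pool:   total amount to distribute (hypothetical figure).
    """
    credit = {}
    for claim in claims:
        sources = claim["sources"]
        if not sources:
            # Claim could not be traced to indexed content: no credit.
            continue
        # Assumption: one unit of credit per claim, shared equally.
        for pub in sources:
            credit[pub] = credit.get(pub, 0.0) + 1.0 / len(sources)
    total = sum(credit.values())
    if total == 0:
        return {}
    return {pub: pool * c / total for pub, c in credit.items()}


# One answer broken into five made-up claims.
claims = [
    {"text": "claim 1", "sources": ["Daily A"]},
    {"text": "claim 2", "sources": ["Regional C"]},
    {"text": "claim 3", "sources": ["Regional C"]},
    {"text": "claim 4", "sources": ["Daily A", "Regional C"]},
    {"text": "claim 5", "sources": []},  # not supported by any indexed source
]
payout = claim_shares(claims, pool=100.0)
print(payout)
```

Here the smaller outlet earns the larger share because it supports more of the answer, and the unsupported claim is visible as such rather than being credited to whichever brand is most likely to be cited.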

The road to equitable compensation

As generative AI reshapes how people consume and engage with information, the value chain of journalism must evolve in parallel.

  • Input-based models offer content owners a one-off, up-front fee for content, but they struggle with content valuation and scalability. They also lack incentives to sustain a diverse ecosystem generating valuable content.
  • Usage-based approaches tie compensation to how often content is used in AI answers and the value it generates. The two main approaches are citation-based and claim-based attribution.
  • Claim-based attribution offers the greatest accuracy, supports a diverse ecosystem, and provides users with the most trustworthy answers, but content must first be indexed.
  • Citation-based attribution does not require indexing, but it sacrifices answer veracity, introduces biases, and offers a less accurate picture of content contribution.

News publishers should absolutely continue to strike deals with AI companies and make the most of today’s input-based opportunities. However, for the industry to truly thrive, we need to look beyond short-term gains and build a fair, sustainable ecosystem that rewards those who create the most value, empowers diverse voices, and ensures journalism continues to inform, inspire, and endure for generations to come.

