Language model optimisation is journalism’s next big shift

By Dr. Dietmar Schantin

IFMS Media Ltd

London, United Kingdom


We’re witnessing a fundamental shift in how information reaches people.

In the Internet Age, the model was passive presentation: Search engines displayed ranked links, news sites offered article lists, social platforms showed feeds of options. Users chose what to read.

In the Intelligence Age, information delivery is becoming active synthesis: AI companions generate answers, compile briefings, and provide guidance drawn from multiple sources. Audiences increasingly rely on these AI companions and what I’ve called the “AI buddy economy” is emerging.

Journalism must be tailored for the Intelligence Age: it must suit readers’ needs while remaining capable of being synthesised into answers by AI tools.

What makes this shift so consequential, and the implications for journalism so significant, is how these AI companions function.

They don’t work like search engines or social feeds. When someone asks ChatGPT “What happened at yesterday’s city council meeting?” the system doesn’t display links. It synthesises an answer from whatever sources it can access, weaving them into coherent prose.

This shift from data retrieval to data generation — from passive presentation to active synthesis — represents a categorical change in how people access information.

For journalism, the implications are profound. Search engines made us discoverable; we competed for ranking position. Social platforms made us shareable; we competed for viral reach. AI companions make us citable, but only if they can accurately extract and reconstruct what we’ve reported.

This isn’t about visibility anymore. It’s about whether our journalism exists within the knowledge base these systems draw upon, and whether they represent it faithfully when they do. Our journalism must be fit for the Intelligence Age — fit for readers and the AI buddies that will synthesise answers to reader queries.

Fitness for purpose requires a new discipline: Call it language model optimisation or LMO.

From external to internal: a strategic inflection point

For 25 years, news publishers optimised content to make other companies successful.

When search engines emerged in the late 1990s, search engine optimisation (SEO) made journalism discoverable. News publishers competed for visibility on Yahoo, AltaVista, Lycos, Excite, and Ask Jeeves — and ultimately Google, which came to dominate the search landscape.

We learned to front-load keywords in headlines, structure articles with H1 and H2 tags, and optimise meta descriptions. SEO specialists joined newsrooms. Headlines were rewritten. Content management systems (CMS) were restructured.

We measured success by search rankings; AltaVista alone processed 80 million queries daily at its peak. Being invisible there meant being invisible to audiences. By the mid-2000s, Google’s algorithm had become the arbiter of discoverability, and we adapted our journalism accordingly.

In the 2010s, social media optimisation (SMO) aligned our stories with Facebook, Twitter, and Instagram. We A/B tested headlines for click-through rates. We embedded share buttons, crafted tweet-length summaries, and timed publication to match platform algorithms.

Social media editors became a distinct newsroom role. Instagram’s filters, Twitter’s hashtags, and Facebook’s algorithm shaped our editorial decisions as much as news judgment.

Both SEO and SMO were outward-facing disciplines. We made our content compatible with someone else’s systems. We optimised for their platforms, fed their algorithms, and made them more valuable.

The unspoken deal was simple: Make your content work with our systems and we’ll send traffic.

The outcome is familiar: Platforms thrived. News publishers became dependent. Revenue stayed elsewhere.

Now, in 2025, language model optimisation represents journalism’s third great optimisation challenge — but with a fundamental difference. LMO is also about internal optimisation.

We’re optimising to make the most of our own data set so the conversational interfaces, research tools, personalised briefings, and archive synthesis we offer our readers on our own platform are as rich as possible. The optimisation serves our products, not solely someone else’s.

This distinction matters strategically. SEO and SMO made platforms successful while publishers remained dependent. LMO makes our own systems work better. It’s infrastructure for journalism’s future, not another form of platform submission.

Why this optimisation is fundamentally different

SEO was deterministic. Google’s algorithm looked for signals like title tags, header hierarchy, keyword density, and backlink profiles. Match the pattern, improve your ranking.

LMO deals with probabilistic systems. Language models don’t match keywords. They calculate probabilities across vast neural networks trained on billions of text sequences. When you write “Fed” instead of “Federal Reserve,” you’re not failing to match a keyword — you’re introducing ambiguity that increases the probability of errors in how the model interprets, connects, and reproduces your reporting.

This difference has practical implications for how journalism must adapt. With SEO, we could optimise after publication — add keywords, adjust tags, build links. With LMO, we’re shaping probability spaces. Clarity must be built in. Ambiguity reduces retrieval accuracy. Inconsistency fragments understanding across your archive.

The competitive threat: why structure beats quality

Recent analyses reveal a troubling pattern in AI citation behaviour. When systems like ChatGPT, Claude, Gemini, and Perplexity cite sources, they favour platforms like Reddit (40.1% citation frequency) and Wikipedia (26.3%). Traditional journalism rarely appears.

A 2025 study tracking over 5.7 million AI citations found structured platforms and user-generated content dominate — not because they’re more accurate, but because their data is organised for machine parsing.

This isn’t a quality problem. Professional journalism — more authoritative, more verified, more contextual — loses out because it’s written for human nuance, not machine precision. As AI companions become primary information interfaces for millions of users, publishers who don’t address this compatibility gap risk becoming invisible.

The pattern mirrors what happened with SEO. Early search results favoured sites that understood HTML structure over sites with superior content. Quality eventually mattered, but only after structure made content discoverable. With LMO, we face the same dynamic: Quality journalism that AI systems can’t reliably parse gets bypassed for inferior content they can process.

The dual-text solution: publishing for humans, optimising for machines

Here’s the critical distinction between LMO and past optimisation regimes: The LMO-optimised version never gets published.

Readers continue receiving journalism written for human consumption: nuanced, narrative-driven, stylistically rich. The LMO version exists as internal parallel text that feeds your AI systems. When users ask your AI assistant a question, the system retrieves from the machine-optimised version, ensuring accuracy and completeness.

This dual-text approach means editorial quality is never compromised. Journalists write as they always have. The LMO layer is created downstream through AI-assisted workflows with human verification — quality control, not authorship.

Two versions of the same news report illustrate the distinction:

  • Human-facing version (published): “Fed Chair Jerome H. Powell announced today that the Federal Reserve will continue to hold interest rates steady. The Fed kept its short-term benchmark rate unchanged at 4.25 to 4.5 percent, amid uncertainty over administration policies that could drag on the economy. Fed governors Christopher Waller and Michelle Bowman dissented, preferring immediate cuts.”
  • Internal LMO version (for AI retrieval): “On 30 July 2025 in Washington, D.C., Federal Reserve Chair Jerome H. Powell announced that the Federal Reserve will maintain its short-term benchmark interest rate at 4.25 to 4.5 percent. Two members of the Federal Open Market Committee dissented from the decision. Governors Christopher Waller and Michelle Bowman preferred an immediate rate cut — reflecting growing concern over economic uncertainty tied to government policy decisions.”

The second version adds explicit entities, absolute dates, and standalone paragraphs. These elements help AI systems identify who did what, when, and where. This clarity improves retrieval accuracy and ensures our journalism is represented faithfully in AI responses.

Readers never see this version. It’s internal infrastructure that makes our AI products work better.
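One way to picture the dual-text approach is as a CMS record that carries both versions side by side. The sketch below is a minimal illustration, not any publisher’s actual schema; the `Article` class, field names, and sample text are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Article:
    """One story stored in two parallel forms (hypothetical CMS record)."""
    slug: str
    human_text: str                # published, narrative version for readers
    lmo_text: str = ""             # internal, machine-optimised version for retrieval
    entities: list = field(default_factory=list)  # explicit entities for linking

story = Article(
    slug="fed-rates-2025-07-30",
    human_text="Fed Chair Jerome H. Powell announced today that rates hold steady.",
)

# Downstream, an AI-assisted step drafts the LMO layer; editors verify it.
story.lmo_text = (
    "On 30 July 2025, Federal Reserve Chair Jerome H. Powell announced that "
    "the Federal Reserve will maintain its benchmark rate at 4.25 to 4.5 percent."
)
story.entities = ["Jerome H. Powell", "Federal Reserve", "2025-07-30"]
```

Only `human_text` ever reaches readers; the retrieval layer reads from `lmo_text` and `entities`.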

The style guide as critical infrastructure

The traditional newsroom style guide plays an important role in the workflow.

If your house style specifies “Fed” as an abbreviation for Federal Reserve and your LMO system has ingested that style guide, generating “Federal Reserve” in the optimised version becomes straightforward. The system knows the abbreviation, understands the full form, and can standardise accordingly. Without maintained, followed style guides, you’re relying on probability — and probability introduces errors.

The practical workflow implication is that style guide adherence directly affects LMO quality. Organisations with rigorous, updated, systematically followed style guides can automate LMO generation more reliably. Those without face either higher error rates or more intensive manual verification.

This transforms style guides from editorial nicety to infrastructure. They’re no longer just about consistency for readers. They’re about reducing probability errors in how AI systems process your archive.
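A machine-ingested style guide can be as simple as a lookup table from house abbreviations to canonical full forms. The sketch below shows the idea under stated assumptions: the `STYLE_GUIDE` entries and the `canonicalise` helper are illustrative, not a real newsroom system.

```python
import re

# Hypothetical house-style table: abbreviation -> canonical full form.
STYLE_GUIDE = {
    "Fed": "Federal Reserve",
    "FOMC": "Federal Open Market Committee",
    "ECB": "European Central Bank",
}

def canonicalise(text: str) -> str:
    """Expand house-style abbreviations to their unambiguous full forms."""
    for abbrev, full in STYLE_GUIDE.items():
        # Word-boundary match avoids touching substrings of other words.
        text = re.sub(rf"\b{re.escape(abbrev)}\b", full, text)
    return text

print(canonicalise("The Fed kept rates steady; two FOMC members dissented."))
```

Because the mapping is explicit rather than learned, the standardisation step is deterministic — exactly the property that makes a maintained style guide more reliable than probability.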

The retrieval-augmented generation architecture: turning archives into living knowledge bases

An increasing number of news publishers are building systems that turn archives into conversational interfaces using technology called retrieval-augmented generation (RAG). Ask The Post at The Washington Post, Hej Aftonbladet in Sweden, and Hey BILD in Germany are systems that ground AI responses in verified journalism rather than allowing open-ended generation.

Unlike traditional search, which points users to information, these systems synthesise content into contextual guidance. When a reader asks about yesterday’s city council meeting, the system searches the archive, extracts key decisions and votes, and generates a summary complete with context from previous meetings and background on the issues.

But most news publishers feed these systems with content written solely for human readers. The result is inconsistent retrieval quality, incomplete context, and occasional inaccuracy. LMO addresses this by creating machine-readable versions optimised for reliable extraction and accurate synthesis.
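The retrieve-then-synthesise loop can be sketched in a few lines. This is a deliberately minimal illustration: the keyword-overlap scorer stands in for the vector search a production RAG system would use, and `ARCHIVE`, `retrieve`, and `build_prompt` are all hypothetical names.

```python
# A toy archive of standalone, entity-explicit passages (illustrative only).
ARCHIVE = [
    "On 12 March 2025 the city council voted 6-3 to approve the downtown rezoning plan.",
    "The budget committee met on 10 March 2025 to review transit funding options.",
    "On 11 March 2025 the school board postponed its vote on the new curriculum.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by shared query words (a stand-in for vector search)."""
    terms = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(terms & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Ground the model's answer in retrieved passages, not open generation."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("What did the city council decide about rezoning?",
                      retrieve("city council rezoning vote", ARCHIVE))
```

The grounding instruction in `build_prompt` is what separates a RAG interface from open-ended generation: the model is constrained to the verified journalism the retriever supplies.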

The implementation reality is that LMO-optimised versions need not be created manually. AI systems can generate machine-ready variants from published articles, with editorial staff verifying accuracy rather than rewriting from scratch. Language models can create formats optimised for their own processing.

This is quality control, not authorship — a manageable addition to existing workflows that provides substantial competitive advantage.

Meta-data as retrieval infrastructure

Optimising text is only part of LMO. Structured meta-data plays an equally important role in how AI systems retrieve and prioritise content.

Traditional newsroom meta-data includes author, publication date, and topic. For AI, we can extend this by adding signals that express recency, relevance, and utility. These indicators help AI systems prioritise current, actionable content, improving both retrieval quality and user satisfaction.

This meta-data can be generated through AI-assisted workflows with human verification. It’s not a burdensome manual process; it’s supervised automation that scales.
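An extended metadata record might look like the following sketch. The field names and values are hypothetical; the point is that traditional bylines and dates sit alongside machine signals for recency and utility that a retrieval layer can rank on.

```python
from datetime import date

# Hypothetical extended metadata record for one article.
article_metadata = {
    # Traditional newsroom fields
    "author": "Jane Doe",                      # illustrative byline
    "published": date(2025, 7, 30).isoformat(),
    "topic": "monetary policy",
    # Extended signals for AI retrieval and ranking
    "entities": ["Federal Reserve", "Jerome H. Powell"],
    "recency": {"event_date": "2025-07-30", "supersedes": None},
    "utility": {"content_type": "news_report", "actionable": False},
}
```

Each added field gives the retrieval layer something concrete to prioritise on — event dates for recency, entities for relevance, content type for utility.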

What this means for competitive positioning

Every news publisher building conversational AI interfaces — whether customer-facing assistants or internal research tools — depends on retrieval quality. Content that’s ambiguous, lacking clear entities, or narratively complex produces less precise results. Clear, semantically consistent writing lets AI systems link related facts and concepts more reliably.

LMO provides competitive advantage in three domains:

  • Your own AI products work better. Conversational interfaces, archive search, and personalised briefings all depend on how well AI systems can extract and synthesise your journalism. LMO improves this directly.
  • External models cite you more reliably. When ChatGPT or Claude search for authoritative information, structured content gets retrieved and cited more frequently than unstructured equivalents. LMO improves discoverability in the AI era.
  • Workflow efficiency increases. Reporters using AI research tools, editors using AI fact-checking systems, and audience teams using AI personalisation all benefit from journalism that’s structured for machine processing.

The choice facing publishers is straightforward: Optimise journalism for the AI era on our terms, or watch platforms with inferior content but superior structure capture the value we create.
