Newsrooms are leveraging AI to improve investigative reporting

By Paula Felps

INMA

United States

For many news organisations, the past few years have been a crash course in using AI tools wisely and effectively. These days, more companies have learned not just how to use the emerging tools but also how to build their own.

This week’s INMA Webinar, AI to power investigations and impact, looked behind the scenes at two news media companies to see how they’re using AI tools to work smarter, faster, and more efficiently.

Before the presentations, Sonali Verma, lead of the INMA Newsroom Innovation Initiative, polled attendees to learn how they are using AI in investigative work. The most common use, according to member responses, is summarising documents or transcripts, with 45% of participants indicating this was their primary use.

Among INMA members, summarising documents or transcripts is the most common use case for AI in investigative work.

That ability to process overwhelming volumes of information is key for newsrooms, and both presenters made compelling cases for how it has unlocked stories that were buried in pages of data points.

“A working tool”

At Kompas Daily in Jakarta, Indonesia, AI has become a crucial component in coverage, said Ratna Sri Widyastuti of the collaborative content desk. She noted that while conversations about AI often vary between excitement and fear, her team approaches it with a simpler, more functional mindset.

“It’s not a magic, it’s not a dread, but AI is a working tool for us, for the journalists,” she explained. And that tool “can save our time, organise complexity, and then help us as journalists focus on what human do best: asking question, verifying facts, and also telling meaningful stories.”

Indonesia is overflowing with data from court rulings, government datasets, financial disclosures, public records, and millions of online conversations every day, she said. “Information is not the problem,” she said. “Our problem is how to process it fast enough.”

With smaller teams, tighter deadlines, and rising expectations from both leadership and audiences, journalists are now expected to deliver speed, depth, and accuracy simultaneously.

“This is very difficult,” she said. “Sometimes, the story exists, but it’s buried inside thousands of pages or rows of data.” AI’s real opportunity, then, is its ability to help journalists “find the signal inside the noise.”

Her team uses AI in four primary ways: faster document review, organising large information sets, early‑stage research support, and workflow efficiency.

None of these involves writing final stories. Instead, AI accelerates the early steps, allowing journalists to spend more time on analysis, context, and verification.

“Human editors always make the final decision,” she said. “AI helps us move faster, but it does not replace newsroom judgment.”

At Kompas Daily, AI is a useful tool, but the newsroom recognises its limitations as well. Journalists verify information, and human judgment remains at the centre of its use.

A three-stage process

Widyastuti outlined a three‑stage workflow that has become standard practice. The input stage uses AI for transcription, translation, and summarisation — tasks she often handles with Google NotebookLM because it is free and easy to use.

Next, the processing stage is where AI becomes most valuable: cleaning messy datasets, standardising names and formats, extracting variables, classifying information, and detecting anomalies or trends. For data journalists, she said, this stage “matters a lot.”
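
To make the "standardising names and formats" step concrete, here is a minimal Python sketch. It is not Kompas Daily's pipeline, which the webinar did not describe in code; the region values and the `normalise_name` helper are hypothetical, chosen only to show how spelling variants of one place can be collapsed before counting.

```python
import re
import unicodedata
from collections import Counter


def normalise_name(raw: str) -> str:
    """Collapse case, spacing, and punctuation variants of a name (illustrative helper)."""
    text = unicodedata.normalize("NFKC", raw)   # unify Unicode variants of the same character
    text = re.sub(r"[^\w\s]", " ", text)        # treat punctuation as a separator
    text = re.sub(r"\s+", " ", text).strip()    # collapse repeated whitespace
    return text.title()


# Hypothetical region values as they might appear across inconsistent documents
raw_regions = ["  jawa barat", "JAWA BARAT.", "Jawa  Barat", "Jawa Timur", "jawa-timur"]

counts = Counter(normalise_name(r) for r in raw_regions)
print(counts)
```

Without a step like this, the five raw values would be counted as five different regions; after it, they fall into two groups that a journalist can then check by hand.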

The final stage, editorial, is fully human-led, with journalists verifying findings, editors setting priorities, and writers providing context and nuance.

“A story is not only data,” she reminded the audience. “A story lives with judgment, ethics, and public interest.”

Her case studies illustrated the impact. In one project, her team analysed 1,113 murder court rulings from Indonesia’s Supreme Court. Accessing the documents was easy; processing them was not. Each ruling was long, inconsistent, and often involved multiple perpetrators or victims. A manual review proved nearly impossible.

Using AI, the team summarised each ruling and extracted key fields — region, sentence length, demographics, case type — and turned an unmanageable project into a workable one. Once structured, patterns emerged: sentencing disparities, regional differences, repeated motives, demographic trends, and outlier cases.

“AI did not produce the journalism,” Widyastuti said. “AI helped us create the raw material for the journalism.”

Kompas Daily used AI to find patterns in court rulings that showed disparities in sentences and also uncovered new story angles.

A similar approach was used to review 500 corruption rulings involving village development funds. Again, AI summarised, categorised, and extracted information. This allowed journalists to drill down into the data and understand the impact of what was uncovered.

But AI is also useful beyond investigative reporting, Widyastuti said. “We have used it better to understand audience interest,” she noted, introducing a study on how the newsroom used AI to analyse Netflix Top 10 content in Indonesia, identifying genres, themes, and even whether films had happy, sad, or ambiguous endings.

The final project she shared sorted through more than 10,000 tweets about loneliness.

“We made a loneliness index, and we also used social media,” she said. Reviewing those 10,000 tweets manually would have been “slow and subjective,” but AI helped classify emerging themes before journalists took the data and added context about societal and mental‑health implications.

With help from AI, journalists were able to see how widespread the loneliness problem is in their region and develop mental health-related content.

“Again, AI helps identify the signal, and the journalists build the meaning,” she said.

For all AI does, Widyastuti was also clear about its limitations: It struggles with very long documents, can miss nuance, may hallucinate, and cannot verify truth or understand culture the way humans can.

“Technology alone is never enough,” she said, noting that journalism needs human judgment, human failures, and human responsibilities.

“The future is not AI versus journalists. The future is journalists who use AI well.”

AI as a support tool

At Reuters, AI is being integrated into investigative and data‑driven journalism. Allison Martell, enterprise project manager and editor/Artificial Intelligence, shared how they are leveraging the new tools.

“When it became clear about two years ago that the large language models might be important, my editors asked … to figure out what sorts of AI tools we could be using to deliver better investigative stories,” she said.

“We are not here to find or develop tools with superhuman abilities; I think I’m here to find and develop tools that give our reporters superhuman abilities.”

Martell views it as simply the next evolution of data journalism: Just as spreadsheets, statistical models, polls, and satellite imagery expanded what journalists could understand about the world, AI now extends their ability to process information, search vast archives, write and review code, and uncover patterns that would otherwise remain invisible.

Newer AI tools, such as LLMs, can help process data “in ways that were nearly impossible before,” and AI has become invaluable for learning technical skills, especially coding.

When she shifted from R to Python two years ago, a move driven by the explosion of AI libraries in Python, chatbots made the transition dramatically easier. They synthesised answers from across the Internet, explained unfamiliar syntax, and even helped her troubleshoot. But she also cautioned about the use of coding agents like Claude Code and Cursor, which she said introduce new risks.

“A lot of people are experimenting with coding agents now ... and I feel actually a little bit worried about that. I want to talk about how we do it and how we try to do it responsibly and what has been helpful to us.”

AI in action

Martell walked attendees through several investigative projects where AI played a crucial role. One involved thousands of photographs of documents from the Assad regime in Syria — many handwritten, multilingual, and poorly suited for standard OCR tools. After extensive experimentation, her team used Google’s DocumentAI for OCR and translation, then loaded the results into ICIJ’s DataShare for search and review. AI coding assistance was essential for building the pipelines that processed these documents, organized them, and made them searchable for reporters around the world.

Another example came from her reporting on China’s so‑called “shadow navy” — civilian vessels that participate in military exercises. Initially, she trained an image‑classification model to detect when ships deviated from commercial routes, but ultimately found that a custom dashboard was more useful. Built entirely with AI‑generated code, the dashboard displayed ship locations, historical exercise sites, and real‑time movements. When Martell went on vacation just as the story heated up, her reporting partner was able to use the dashboard to task satellites and capture images revealing a new offloading strategy relevant to a potential future invasion of Taiwan.

Reuters used ship-tracking data and satellite images to monitor the role civilian vessels played in maritime exercises.

Martell also discussed how Reuters uses AI for search and summarisation, particularly through tools like Thomson Reuters’ CoCounsel. Although that tool was developed for the legal profession, Martell said it’s also effective for reviewing and summarising “non-legal things.”  In one Syria-related investigation, CoCounsel helped identify relevant sections within thousands of pages of documents, which a reporter then read in the original language. But she cautioned that journalists must understand how these tools work — especially retrieval‑augmented generation (RAG) systems, which rely on semantic search. Because semantic search ranks everything and has no natural stopping point, she warned that reporters must avoid asking questions these tools cannot answer.

“It’s all about understanding the tool,” she said.
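
The point about semantic search having "no natural stopping point" can be illustrated with a toy Python sketch. This is not how CoCounsel or any production RAG system is built: word-count vectors stand in for real embeddings, and the passages, query, and 0.5 cut-off are hypothetical. What it shows is that a top-k ranking always returns k passages, even when nothing in the collection answers the question.

```python
import math
from collections import Counter


def vectorise(text):
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query, passages, k=2):
    """Rank every passage against the query and return the k best, however weak."""
    q = vectorise(query)
    scored = [(cosine(q, vectorise(p)), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hypothetical passages; none of them answers the question below
passages = [
    "prison temperature logs for july",
    "ship movements near the exercise site",
    "court ruling on village funds",
]

results = top_k("who approved the budget", passages)
for score, passage in results:
    print(f"{score:.2f}  {passage}")

# One possible guard (an assumption, not a documented feature of any tool):
# discard passages below a minimum similarity before they are summarised.
confident = [(s, p) for s, p in results if s >= 0.5]
print("passages above threshold:", len(confident))
```

The ranking still hands back two passages, the best of which matches only on the word "the". A reporter who does not look at the scores, or at the passages themselves, could mistake that for an answer.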

Understanding handwritten data

Reuters has also begun using AI to unlock information trapped in handwritten documents, something that was previously impossible to process at scale. Martell pointed to a recent investigation using handwritten temperature logs from California prisons, in which she drafted the initial Gemini prompt that extracted the numerical data, and the team then built a full analysis from there.

Reuters used AI to extract data from handwritten documents in an investigation into temperatures at California prisons.

Because the story relied on quantitative findings, they developed a rigorous verification process: calculating confidence intervals, measuring the size of extraction errors, and determining whether those errors would meaningfully affect the conclusions. In this case, the discrepancies amounted to roughly three and a half degrees — significant enough to disclose, but not enough to change the story’s outcome.
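
A verification step of this kind might look something like the Python sketch below. It is an illustration under stated assumptions, not Reuters' actual method: the audit sample is invented, and the choice of a Wilson score interval for the error rate is one common option rather than anything described in the webinar. The idea is to hand-check a sample of extracted values, measure how large the errors are, and estimate how often they occur.

```python
import math
import statistics

# Hypothetical audit sample: (value read by a human, value extracted by the model)
audit = [(78, 78), (81, 81), (95, 85), (88, 88), (90, 90), (102, 102), (79, 74), (84, 84)]

# Size of each extraction error, in the same units as the data
errors = [abs(human - model) for human, model in audit]
mismatches = sum(1 for e in errors if e > 0)
mean_abs_error = statistics.mean(errors)


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


low, high = wilson_interval(mismatches, len(audit))
print(f"mismatched readings: {mismatches} of {len(audit)}")
print(f"mean absolute error: {mean_abs_error}")
print(f"error rate, 95% interval: {low:.2f} to {high:.2f}")
```

With a sample this small the interval is wide, which is itself useful information: it tells the team how many more readings they would need to check before quoting an accuracy rate.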

Martell underscored the importance of verification when using AI. Whether reviewing code, dashboards, document summaries, or text classifications, she said journalists must understand the underlying methods, test outputs systematically, and disclose accuracy rates when appropriate. “The model’s going to give you its interpretation of what the text means,” she said. “You still need to do the reporting.” 

