BBC News has recently upped its game in data journalism with a new award-winning team of journalists, data scientists, designers, and developers, all working together to find stories in big data sets and tell them in a variety of visually rich ways. The team is also dipping its toes into the world of machine learning and Artificial Intelligence (AI).

In an exclusive Webinar for INMA members on September 25, Amanda Farnsworth, head of visual and data journalism, and John Walton, data journalism editor, shared details on exactly how the BBC is doing this.

Farnsworth and Walton began the Webinar by explaining what the BBC visual journalism data team does.

“If you’re not able to have the expertise to work with numbers, then you’re going to miss a lot of opportunities, and you’re going to miss opportunities to explain the world to your audiences,” Walton said. “So, very simply, I think any large news media organisation needs somebody who can work with numbers — hopefully more than one person, hopefully a few people who can work together. So that’s the reason for our existence.”

They summed up the team’s remit as:

“We find original news in data. We produce data-driven interactives and we offer data journalism training and advice. We check colleagues’ figures and data stories.”

Farnsworth jumped in to share the newest thing that the team is starting to experiment with: machine learning and Artificial Intelligence.

Next, they listed the skills found across the data team, which has about 20 members:

  • Journalists who write.
  • Journalists who code.
  • Statisticians.
  • Data scientists.
  • Designers.
  • Developers.
  • Trainers.

The team also does a little broadcasting and visits universities for outreach and open source initiatives. “There’s a lot more interest at universities and in academia for telling stories with data than there was a few years ago, and it seems to be increasing all the time,” Walton said.

Case study: A month in Afghanistan

Farnsworth and Walton shared an example of a piece of reporting that came from the BBC Monitoring team, which monitors breaking news from around the world.

“Somebody from monitoring came to us with the idea that we should record all the incidents that cause injury and death in Afghanistan for one month as we approached the likely signing of some kind of peace deal between the Taliban and its foes,” Farnsworth explained. “Now obviously, we’re all in London. We weren’t on the ground collecting these figures, but we used our data sources — the U.N. and various other data sources. Meanwhile, our colleagues at the overseas bureau based in Kabul, they actually did go out on the ground and did some old-fashioned, shoe leather journalism.”

These three teams brought their expertise together to do this project, which was an original piece of journalism that Farnsworth said possibly only the BBC could have done.

The BBC news coverage of "A month in Afghanistan" used collaboration between the data team, journalists, and on-the-ground reporters.

Walton explained that the data team’s role in the project was to spearhead the data gathering. The team organised a spreadsheet to determine and organise the various data that was important to the reported piece.

“We could talk about the number of incidents, the number of people affected, what kind of people they might be and how they’ve been killed, what kind of attack or a bombing, however it happened,” Walton said. “We structure the data in a way that we can then analyse.”

If other news media publishers are thinking about doing such stories in their organisations, Walton said, it’s really important that team members make themselves available to colleagues so expertise can be shared right at the beginning of a story.

“If our colleagues in Kabul had gone out and collected the data before talking to us, it may have been that they structured things in such a way that we couldn’t work with it or we couldn’t analyse it in a consistent way,” Walton said. “And that’s really important for projects like this.”
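Walton's point about agreeing a structure before collection starts can be illustrated with a minimal sketch. This is not the BBC's actual spreadsheet: the `Incident` fields, the attack-type labels, and the sample rows are all hypothetical, chosen only to show how a fixed set of columns lets every record be counted the same way.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """One row of a hypothetical incident spreadsheet (assumed columns)."""
    day: int           # day of the month being monitored
    location: str      # placeholder area name
    attack_type: str   # drawn from an agreed vocabulary, e.g. "bombing"
    killed: int
    injured: int


def casualties_by_attack_type(incidents):
    """Sum the people killed and injured for each attack type."""
    totals = Counter()
    for incident in incidents:
        totals[incident.attack_type] += incident.killed + incident.injured
    return dict(totals)


# Invented rows, for illustration only.
sample = [
    Incident(day=1, location="Area A", attack_type="bombing", killed=3, injured=10),
    Incident(day=2, location="Area B", attack_type="air strike", killed=2, injured=0),
    Incident(day=3, location="Area A", attack_type="bombing", killed=1, injured=4),
]

print(casualties_by_attack_type(sample))
```

Because every row uses the same fields and the same labels, the totals can be grouped by attack type, location, or day without cleaning up the data after the fact.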

Case study: The gender pay gap in the UK

Another story from the data journalism team dealt with the gender pay gap. Companies in the UK with more than 250 employees are required to release their information on gender wage differences. With data from around 10,000 such companies, the team analysed the numbers for the reporting.

The gender pay gap was a piece of BBC reporting that heavily utilised data from the data journalism team.

“We’re trying to hold the reader’s hand through these different steps,” Walton explained about how the team translates the data into a story. “Because the gender pay gap is quite a complicated thing to understand. Every time we report on the gender pay gap, people need to understand if we’re talking about a mean, or a median, or the differences between those things and how individual companies compare to the average, and what’s a fair comparison.”

Using this technique, the reader can be led through the issue in a way that makes sense and is easy to understand, even though it entails a lot of complex data. They can even look up the pay gap data at their own workplaces.
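The difference between a mean and a median pay gap is easier to see with numbers. The sketch below assumes the gap is expressed as a percentage of men's average hourly pay; the pay figures are invented and the `pay_gap_percent` function is hypothetical, not the BBC's code.

```python
from statistics import mean, median


def pay_gap_percent(men_hourly, women_hourly, average=median):
    """Gap between men's and women's average pay, as a % of men's pay."""
    men_avg = average(men_hourly)
    women_avg = average(women_hourly)
    return (men_avg - women_avg) / men_avg * 100


# Invented hourly pay for a hypothetical company: identical pay except
# for one very highly paid man.
men = [10.0, 12.0, 14.0, 16.0, 48.0]
women = [10.0, 12.0, 14.0, 16.0, 18.0]

median_gap = pay_gap_percent(men, women, average=median)
mean_gap = pay_gap_percent(men, women, average=mean)

print(f"median gap: {median_gap:.1f}%")
print(f"mean gap: {mean_gap:.1f}%")
```

In this made-up company the median gap is zero while the mean gap is 30%, because a single high earner pulls the mean up without moving the middle of the distribution. That is why a report needs to say which measure it is using.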

Case study: Election results coverage

Elections are another area in which data journalism is highly effective, using maps and charts to deliver information about the results to readers. These are particularly useful for showing readers what has happened in their local area.

“In every election we try and make sure people know what happened with their own vote, what happened to the candidate they picked in the area they live in,” Walton explained. Results can be searched by post code, for example, which has proven to be a popular way for readers to find their election results.

For election results, data journalism enables readers to find out how people voted by post code.

Other examples of BBC News data journalism

Walton presented a few other reported pieces to demonstrate the range of stories that the data journalism team is involved in.

“We’ve got stories on hospital waits, on rent and young people, and we’ve got a story about language learning, an education story,” he said. “A data journalist can be required to work across a lot of different topics. It means there’s a lot of variety involved, but it also means you need to partner your expertise in data journalism along with some subject matter expertise.”

The data team draws on this expertise from BBC colleagues in areas such as health, education, and business; but also they will go to experts outside the BBC if they need specialist knowledge.

“We always try to make sure that we combine our ability to work with data, with someone else’s knowledge of the subject,” Walton said.

Graphics made with scripts

“If you’re the only data journalist on a team, you might need to work with templates or graphics,” Walton said.

What the BBC has started to do is use its open-source data so journalists can easily create their own graphics using scripts. All of these graphics are produced by journalists, end-to-end within the BBC data system.

“There’s a Medium post explaining how you can use some of this work that we’ve done with open source,” Walton said.

These are examples of some of the various graphics that are created with data scripts.

The graphics are also used for television, Farnsworth added. “We’re making simple but clear graphics for our news channel that can be used on touch screens, and they can also be used just ‘out of the box’ on screen.”

Making new data sets for stories

The data team also creates brand new data for reporting. An example of this is when colleagues at Radio 1 asked the data team to find out where the best place to be young in Great Britain is. “To do this, we produced our own sets of data. We took about 11 different data sets and brought them together to make an index,” Walton said.

The team used criteria such as 4G reception, transport, sports facilities, places to go out, mental health care, employment opportunities, and rental prices to rank the best places for young people ages 18-24 to live.

“By doing that, we used data to make our own stories — an original piece of work, which produced a map that people could search and have a look at how their own area rated on those different metrics,” Walton said.
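The webinar did not describe how the index was calculated. One common approach, sketched here purely as an assumption, is to rescale each metric to a 0 to 1 range, flip the metrics where a lower value is better (such as rent), and average the results. The area names, metric names, and values are all invented.

```python
def min_max_scale(values, higher_is_better=True):
    """Rescale values to the 0-1 range; flip when lower raw values are better."""
    low, high = min(values), max(values)
    if high == low:
        # Every area has the same value, so the metric cannot separate them.
        return [0.5 for _ in values]
    scaled = [(v - low) / (high - low) for v in values]
    return scaled if higher_is_better else [1 - s for s in scaled]


def composite_index(areas, metrics):
    """Average the scaled metrics for each area into a single score.

    areas:   dict of area name -> dict of metric name -> raw value
    metrics: dict of metric name -> True if higher raw values are better
    """
    names = list(areas)
    scores = {name: 0.0 for name in names}
    for metric, higher_is_better in metrics.items():
        raw = [areas[name][metric] for name in names]
        for name, scaled in zip(names, min_max_scale(raw, higher_is_better)):
            scores[name] += scaled / len(metrics)
    return scores


# Invented figures for three hypothetical areas and two of the criteria.
areas = {
    "Area A": {"4g_coverage": 90, "monthly_rent": 900},
    "Area B": {"4g_coverage": 70, "monthly_rent": 600},
    "Area C": {"4g_coverage": 85, "monthly_rent": 500},
}
metrics = {"4g_coverage": True, "monthly_rent": False}

scores = composite_index(areas, metrics)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Equal weighting is itself an editorial choice. Weighting some criteria more heavily would change the ranking, which is one reason to test the criteria with outside experts.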

The data also provided great talking points for broadcast TV and radio.

Farnsworth added that the choice of criteria came after a lot of discussion amongst the team. “We brought in our data scientist, and we tested our criteria with some external sociologists and people who have done a lot of studies about young people’s lives.”

This provided a quality check that the criteria made sense, and readers wouldn’t feel that the rankings were arbitrary or ridiculous.

Parliament votes

Going back to politics, the BBC data team also applied its processes to the high number of votes in the British Parliament, particularly those surrounding Brexit.

The team took data on how each Member of Parliament (MP) voted on each issue and used that to create an interactive feature in which a reader could easily find out how their individual MP voted. This information was fed into the data system as quickly as 15 to 20 minutes after a vote, and the team was able to have the resulting piece available to readers within the hour.

This has proven to be extremely popular with the BBC audience. Just a year or two ago, it would have taken a day or, at minimum, half a day to turn this type of content around for the newsroom. “But now, we can turn it around in a couple of hours,” Walton said. “We’ve really had to respond to the increase in knife-edge votes.”

The team is also able to create fast graphics on these votes so they can be published within minutes, as quickly as the text journalists can turn around their content. This is all done through scripting prepared in advance.

Templates also work very well for graphics that are used in TV reporting, particularly on issues such as inflation rates, which are issued at regular intervals. It’s a great use of resources, Walton said, to be able to help colleagues in television, radio, or online using data and scripts that the team already has at their disposal.

Best bang for the buck

As these examples have shown, the BBC data team is committed to getting the “best bang for the buck” out of every resource.

BBC uses data journalism for reports such as the NHS Tracker on UK hospitals and Police Under Pressure.

At the left of this slide, Webinar attendees could see one of the team’s longest-running data projects in action: the BBC NHS Tracker. This has run for two years, tracking performance data for every hospital in the UK.

“We were able to explore those data sets and find stories in them,” Farnsworth said. One of the most interesting stories to come from this data was that there was only one hospital in all of England that was meeting all of its targets.

“We were then able to dispatch a broadcast team to go and investigate that hospital and find out why,” she said. “The answer, interestingly, was not that it was better funded or had more staff. It was actually a matter of the woman who was heading the hospital up, who was just kind of a visionary, inspirational person who had led the whole hospital to be much improved.”

Another example of a long-form data story was the “Police Under Pressure” documentary piece done by Panorama. The team took data from each police force and looked at how many crimes took place and how many led to actual charges — and how that was changing in different police forces across the country.
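The comparison described here comes down to a simple ratio tracked over time. The sketch below is an illustration only: the force names and figures are invented, and the function names are hypothetical rather than taken from the Panorama analysis.

```python
def charge_rate(crimes_recorded, charges_brought):
    """Share of recorded crimes that led to a charge, as a percentage."""
    if crimes_recorded == 0:
        raise ValueError("no recorded crimes, so the rate is undefined")
    return charges_brought / crimes_recorded * 100


def charge_rate_change(earlier, later):
    """Change in charge rate between two periods, in percentage points.

    Each argument is a (crimes_recorded, charges_brought) pair.
    """
    return charge_rate(*later) - charge_rate(*earlier)


# Invented figures for two hypothetical police forces:
# (crimes recorded, charges brought) in an earlier and a later period.
forces = {
    "Force A": ((1000, 150), (1200, 120)),
    "Force B": ((800, 80), (800, 96)),
}

for name, (earlier, later) in forces.items():
    change = charge_rate_change(earlier, later)
    print(f"{name}: {change:+.1f} percentage points")
```

Reporting the change in percentage points, rather than as a percentage of the earlier rate, avoids a common source of confusion when comparing forces.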

A similar data story was done looking at housing sales in England and Wales since the 2007 crash.

“By scripting our data journalism, we’re able to work with really large data sets,” Walton said. “I think that sort of speaks to the growing specialisation of data journalism.”

He added that there are two strands to data journalism. One is democratisation, with tools and open-source resources available to everyone. The other is the specialism that is required for certain stories, which wasn’t available even a few years ago.

Collaboration

To make this data journalism work, the team collaborates with many other teams across the BBC including:

  • All main topic teams and correspondents.
  • Newsbeat.
  • BBC Three.
  • Panorama.
  • World Service & Language Services.
  • Today.
  • BBC Nations and Regions.

They also work with teams outside news, including Sport, BBC Factual, universities, the ONS and NGOs.

Q&A:

INMA: Can you tell us a little more about the machine learning and AI that your team is starting to work with?

Farnsworth and Walton: I think there are two distinct things. The first is called robo-journalism — how to use Artificial Intelligence and machine learning to generate reliable journalistic text and stories. We partner with a department at the BBC to do that, particularly with things like sports events or share prices, because those are very factual. It doesn’t really take a lot of journalism to report the results of a football match. Election results would be another area where we are dipping our toes in it, like our elections in May that we shared to our local Web sites in the UK.

The second area, and more interesting for me, is to do with the increasing number of algorithms that are being used in public services. In Britain some police forces, some social services, and some councils are beginning to use algorithms to decide where to spend sparse resources, such as on areas in which crime is most likely to be committed. So what we wanted to do was equip ourselves with the tools to be able to hold those algorithms to account, so that when the time comes and we want to investigate (a topic), we can hold to account not just the people involved, but also the algorithm and any internal bias it has.

INMA: Who primarily pitches a data story?

Farnsworth and Walton: They can come from someone in our team or a specialty section. It’s kind of a two-way thing. The door’s open from our point of view for people to come and pitch ideas. Quite often they’re from specialty journalists, so they have hunches about something that’s happening.

INMA: How is Artificial Intelligence leveraged to combat fake news?

Farnsworth and Walton: I don’t think we’re leveraging it to combat fake news at the moment. I imagine that all the major companies will be looking at AI with their own sites. But within our own team we are not doing that specifically.

INMA: What are some other examples of machine learning and AI stories that your organisation has published?

Farnsworth and Walton: Some journalists have taken the health data we talked about earlier and written a couple of paragraphs about each hospital in an automated way, which was then edited by humans.

INMA: To what extent does your team act as a service unit to the rest of the newsroom? How do you balance the needs of your team versus other teams?

Farnsworth and Walton: Yes, getting that balance is a bit tricky. For instance, we’re heading into elections in the next few months, so about three months ago we started to dedicate a huge amount of our resources to that. At the moment, well over half of our resources are devoted to election content. It’s a very sort of communal, BBC-wide effort at the moment. I think it’s fair to say that we’re always being asked for more than we have the resources to do; you can to some degree be victims of your own success. I think we tend to prioritise things that will work as broadly as possible. Some projects are very high profile, and therefore it’s a priority for us to contribute a good piece of data work to be part of that coverage.