Larry Birnbaum likes to tell stories from data. “It is clear that there is a lot of data and a lot of stories in that data, but it is not easy to find them,” he told delegates at INMA’s Big Data for Media conference at Google London on Friday. “I tend to start with a question.”
In his talk, subtitled “Finding and Telling Stories at Internet Scale,” Birnbaum outlined a number of free tools that exploit free data. They were built by journalism and computer science students at Northwestern University, where he is a professor of computer science and of journalism.
The first, TweetCast Your Vote, is an engagement tool that predicts how a user will vote based on their tweets. Type in any name and it will search for keywords used in that person’s tweets.
LocalRx (local recommendations) is another tool, also based on tweets, which currently only works in the United States and in Barcelona. “It can tell you what people who patronise your business are tweeting about, which helps business owners to build a profile of their customers,” Birnbaum said.
Another, NewsRx (news recommendations), aggregates news stories based on tweets. For example, if a user regularly tweets about food, news stories about food or restaurants could be recommended.
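The kind of matching described for NewsRx can be illustrated with a rough sketch. This is not the students' code: the keyword table, function names and sample data below are all hypothetical, and it assumes stories arrive already labelled with a topic.

```python
from collections import Counter

# Hypothetical topic vocabulary; a real system would need something far richer.
TOPIC_KEYWORDS = {
    "food": {"restaurant", "dinner", "recipe"},
    "sports": {"match", "goal", "team"},
}

def topic_profile(tweets, topic_keywords):
    """Count how many tweets touch each topic (naive whole-word match)."""
    counts = Counter()
    for tweet in tweets:
        words = set(tweet.lower().split())
        for topic, keywords in topic_keywords.items():
            if words & keywords:
                counts[topic] += 1
    return counts

def recommend(stories, profile):
    """Return headlines whose topic the user tweets about, most-tweeted topic first."""
    matched = [story for story in stories if profile[story[1]] > 0]
    ranked = sorted(matched, key=lambda story: -profile[story[1]])
    return [headline for headline, _topic in ranked]

tweets = [
    "Great dinner at the new restaurant",
    "Trying a pasta recipe tonight",
    "What a goal",
]
stories = [
    ("Cup final preview", "sports"),
    ("Ten new restaurants to try", "food"),
    ("Budget vote delayed", "politics"),
]
profile = topic_profile(tweets, TOPIC_KEYWORDS)
recommended = recommend(stories, profile)
```

Here a user who tweets mostly about food sees the restaurant story first, and the politics story is not recommended at all.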
Birnbaum pointed out that these tools allow access to free data because they are based on Twitter, an open platform. “If you are the New York Times or the London Times, you have a lot of access to data about your readers. Small publications do not have a lot of data about their readers,” he said. “By using these tools, you can find out about your readers. All you need is their Twitter handles.”
Birnbaum said these mechanisms were relatively simple to build. “It only took 50 lines of code to build them. The first version of NewsRx was built by four undergrad students in ten weeks.
“Another system we built is Local Angle to help locally relevant stories surface in national news. It goes through a news feed, pulls out names from the feed, then finds locations associated with the people mentioned, and it sorts stories into local areas,” he said. Birnbaum described other tools, including Buzz Lite and Quill Connect, the latter of which also analyses Twitter behaviour and tells users how to change it to build networks.
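The Local Angle steps Birnbaum described (pull names from a feed, look up locations for those people, sort stories by area) can be sketched roughly as follows. This is an illustration only, not the Northwestern system: the name-to-location table, the function names and the sample headlines are hypothetical, and names are found by simple substring matching rather than real entity extraction.

```python
from collections import defaultdict

# Hypothetical lookup of people to home areas; a real system would need
# a much larger knowledge source.
PERSON_LOCATIONS = {
    "Jane Doe": "Chicago",
    "John Roe": "Evanston",
}

def find_names(text, known_names):
    """Return the known names that appear in the text (naive substring match)."""
    return [name for name in known_names if name in text]

def sort_by_local_area(stories, person_locations):
    """Group stories under each location linked to a person they mention."""
    by_area = defaultdict(list)
    for story in stories:
        for name in find_names(story, person_locations):
            area = person_locations[name]
            if story not in by_area[area]:
                by_area[area].append(story)
    return dict(by_area)

feed = [
    "Jane Doe named to national science board",
    "Senate hearing features testimony from John Roe and Jane Doe",
    "Markets close higher",
]
local = sort_by_local_area(feed, PERSON_LOCATIONS)
```

In this toy feed, both stories mentioning Jane Doe land under Chicago, the Senate story also lands under Evanston, and the markets story, which names no one in the table, is not assigned to any area.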
Birnbaum also described his work for Narrative Science, a U.S. technology company that builds algorithms to write stories from data. Most algorithm-written stories are currently sports reports and financial news.
“The critical thing is working out what to say and how to say it. We use a language generator,” he said. “Clichés are great because people get them. Clichés are good.”
He said automated journalism is unlikely to take the place of human journalism: “A lot of the work the company is doing is not traditional journalism.”
But he acknowledged that there are still uncertainties around automated journalism. In answer to a question about legal problems that might arise when an algorithm describes a company as “performing badly,” Birnbaum agreed it was an unresolved issue.