Data consumption is forecast to grow rapidly because data processing technology is getting cheaper, Clancy Childs, chief data officer at Dow Jones, told an audience of more than 200 media executives attending the Big Data for Media Week conference in London on Thursday.

“Data is growing faster than ever before. In the last two years, there has been nine times more information created than in the entire history of humanity,” Childs said. 

Clancy Childs, chief data officer at Dow Jones, talks about the proliferation of available data.

Forecasters predict that data market revenue will reach US$132 billion by 2020. This is because “companies recognise the importance of data,” which acts as a way to “future-proof” businesses, Childs said. “There is a big opportunity to sell data and news.”

Childs went on to discuss the vast opportunity data mining offers, giving the audience insight into how data has revolutionised the newsroom and the wider context of Big Data. 

Using crude oil as an analogy for the data mining process, Childs explained: “What we see here is a lot of companies want raw access to data, for instance unstructured news content. We take some of that raw data ourselves and content that our journalists create. We mine through a lot of that content. We identify data that is not meant for human consumption.

“You then jump into the analysis and research, to be able to query a dataset for future analysis, which has been Dow Jones’ bread and butter for a long time.”

What excites Childs most is the final, front-line stage of the data process, he said: “If you are using Google now, news or information before you even asked for that is where there are interesting things happening.”

So what is driving demand for data? 

The rapid rise in data demand has been seen across industries, from financial services and health care to newsrooms, Childs said: “Every hedge fund is asking for data and they are very willing to pay for unique data sets.”

The newsroom is increasingly becoming a major consumer of data, reflecting on which stories may attract attention again and whether data can be used to predict the next big story.

Childs summarised newsroom data use as “being able to draw connection in a large archive to better understand what that article about that topic might be today.”

However, across the board, the most prominent form of data mining is centred on sentiment analysis. This is because positive or negative indicators are crucial for gauging impact, response, and engagement, Childs said.

Moreover, data can be used to “figure out within milliseconds whether, for example, an article about a company is positive or negative that may impact trade,” he said. 

The vast realm of opportunities data mining can provide seems endless, Childs said. Data can aggregate multiple news sources to help look for the next big story, predict epidemics, mass tag content, and even automate content.

Childs argued that in some cases, “the value of content to some of these machine-readable use-cases can actually be much greater than the value to a single human subscriber.”

However, Childs reiterated that Dow Jones “comes from a publisher DNA” and “we do this with full respect with what the content publisher and the end use.”