Why a data puddle might be better than a data lake

By Greg Piechota


Oxford, United Kingdom


“Nobody has patience for data transformation projects that start with building the pipes and make everybody wait years for the value. Strike vertically: Find a problem and fix the data that solves this particular problem,” says the author of The Chief Data Officer’s Playbook.

Caroline Carruthers was one of the first women to take on the role of chief data officer in the U.K. public sector, at Network Rail. An author of several best-selling books, she runs the data consultancy Carruthers and Jackson. We sat down on a video call recently to discuss the value of data, the best approach to tying data to business objectives, and the role of a CDO. Here are my questions and her answers:

Do all news organisations need a chief data officer?

You need a chief data officer when you are ready to treat data as an asset. The fact that many organisations do have one is not enough. Smaller organisations may do without a CDO. Surely, somebody at the board level has to have a focus on data. It’s an asset like talent or technology.

What’s the job of a CDO about?

Primarily, about making sure data is treated like an asset.

Having talked to executives, I think they sometimes fall into a trap: They have all this volume of data, they put everything into a lake, but they are not sure what to do with it.  

Data should underpin the organisation’s vision. Data strategy cannot be about hoarding data for the sake of it. It’s about thinking of the purpose, or the outcome of using data.

Think of the CDO as a constructive, challenging friend to others in the organisation: a detective, like Sherlock Holmes, asking questions and helping to find answers.

In all the talk about data science, we often miss what “science” really means: you propose a theory, you experiment to validate it, and this is how you learn. The CDO helps keep the wheel turning.

Where to start? Building the pipelines? Getting data specialists on board? This is what the world’s leading academics urge publishers to do.

I respectfully disagree. I am not from academia. I have been a CDO in the real world, and I know that nobody has patience today for data transformation projects that start with building the pipes and make everybody wait years for the value.

A new CDO should start with building relationships across the organisation, listening to others: What are their challenges? What keeps them awake at night?

There are two sides to the CDO’s job: risk-averse and value-add. We tend to focus on the latter: how we can create better products and services. There is, though, a lot of value in the former: in no longer wasting money on collecting data that is not useful, and in making sure the right information gets into the hands of the right people so they make better decisions.

The success of the CDO doesn’t really depend on the data team. Ten data scientists sitting in an ivory tower can achieve only so much. Enabling everybody to understand the data and make better decisions will be more impactful.

The CDO should then focus on driving data literacy, engaging everybody, and improving their decision-making.

And instead of a data lake, I would much rather start with a data puddle. 

A puddle?

Rather than trying to fix everything, strike vertically: find a problem and fix the data that solves this particular problem. Set up data governance, put people in place, get the necessary tech. Solving one problem after another, you will build your base eventually, but people will see the return on investment more quickly.

The Economist famously wrote: “Data is the new oil.” Is it? Is it true that the more data we have, the richer we are?

I hate this phrase. No, data is more like dirt. You can turn it into valuable things, but it requires effort and most of the time it is just messy.

That’s why, if I have to choose between a puddle of data that I am sure is accurate and useful and a lake contaminated with inaccurate data, I choose the puddle.

How do you measure success in data transformation?

I’d measure it as progress, for example, asking on an annual basis whether we are improving our decision-making and how.

To measure the value of the data we have collected, I would ask: How unique and powerful is this data to you? How hard would it be to replicate it? How much time and effort did it take to acquire it? And if you lost it, would you bother to get it again?

Recommended reading: Caroline Carruthers, Peter Jackson, The Chief Data Officer’s Playbook, Facet Publishing 2021.

