Punching through the Big Data hype

By Xavier van Leeuwe and Matthijs van de Peppel

NRC Media

Amsterdam, The Netherlands


In the past few years, there has been a lot of talk about Big Data. At almost every conference, in many boardrooms, and throughout marketing departments, the idea arose that the Holy Grail was knowing everything about everybody at every moment.

We were no exception. Developing holistic user profiles was on top of our priority list.

Then, one weekend, Matthijs visited his parents. Now and then, in their little village in the Dutch countryside, they made a brave attempt to understand how he and his brothers fill their working days. So Matthijs tried to explain his new project to collect as much individual data from subscribers and Web site visitors as possible.

He explained how we wanted to know much more than just how to contact them and where to deliver the newspaper. What do they read? What is their income, educational level, and birthday? How many kids do they have, what are their interests, and what computer brand do they use? We wanted all the personal information we could get our hands on to build profiles of them — everything from everybody, really.

At one point, his mother shyly asked, “Maybe I just don’t understand, but why does my newspaper have to know all that stuff from me? I don’t really like you to know everything from me. It feels a little intrusive.”

That simple question put our feet firmly back on the ground.

The power of Big Data

So, what is Big Data? A common way to define the border between regular data and Big Data is with the three Vs:

  • Volume: The amount of data is large.
  • Variety: The data is often not clean and tabular but messy, like text or images.
  • Velocity: New data arrives continuously.

In the publishing world, we usually speak about Big Data when combining online behaviour (like clicks, scrolls, and reading behaviour) and offline data (like names, addresses, bought products, and demographics). Every company has the opportunity to collect huge amounts of data from its customers.

That feeds the imagination, but what can you actually do with it? Here are some examples:

• Personalise products: By knowing your individual customers’ interests, habits, and personal situations, you can match your digital products to their needs. Not everybody has to see the same home page on your Web site, for example. Sports fans could get a different landing page than politics aficionados.

• Target ads: Yes, Facebook has finally entered our story. It is earning large amounts of money with targeted advertising. Facebook probably knows more about its customers than any other company in the history of mankind, because that’s what it’s primarily used for: sharing as much as possible about yourself.

Facebook gives advertisers the opportunity to target campaigns at almost every imaginable segment of demographics and interests. Advertisers are investing big sums to reach exactly the audience they want to reach. Almost any traditional publisher dreams of this money.

• Data mining: One of the most intriguing possibilities of Big Data is the chance to find hidden gems in your data. It goes like this: You store all the data you can get and drag it into the Amazon cloud, even before you have the slightest idea what you will do with it. The magic comes when you start data mining.

Algorithms can find patterns in the data you never could have foreseen. This kind of analysis is used, for example, to find correlations between DNA strings and the rate of success of certain medicines. It can also be used, for example, to find drivers for churn (stops).

• Predictive modeling and machine learning: If your algorithms find correlations between certain variables in data, the next step is to make predictions. For example, knowing what series will be popular is a great feat by Netflix’s data team. Pharmacists understand what medicine will have the best chances of success on a particular patient based on his or her DNA. Telecom companies better understand which subscribers are likely to churn.

These predictions can be automatically generated, and the predictive model can improve itself via machine learning.

Use Big Data with common sense

There are all these state-of-the-art applications of Big Data, and storing data gets cheaper and easier every day. This combination shapes opportunities for new and existing companies. Consultancies pop up like fair maids in spring, software is developed at a rapid speed, and no week goes by without an e-mail hitting our inboxes from some company with a state-of-the-art Big Data solution. There is a new, pretty girl in town, and she keeps asking for a date. That’s hard to resist.

But the thing is, this girl is not right for everybody. Big Data is not equally relevant to every company, and building up holistic user profiles is probably not the thing to start with if you are turning into a data-driven company.

Before you start investing in Hadoop clusters, cloud storage, machine-learning algorithms, and a lot of expensive and scarce manpower, try to bring common sense into the equation and ask yourself if these investments will pay off at some point in the future.

Are your customers interested in a personalised product? Are your advertisers interested in hyper-segmentation? What are the chances you will find patterns in your data that will earn you serious money?

And last, but definitely not least, what will the people with whom you have relationships think of you when you tell them what kind of data you are collecting about them? How do you feel about that?

If you have a solid business case for your Big Data project, the next question should be, what data is really needed here? Is it important to know “everything from everybody,” or will a couple of data points also do the job?

Find the million-dollar business case

We were facing these questions at NRC. The first step we took was visiting companies we had befriended in other industries that were far ahead of us on this path to find out if their extensive customer profiles brought golden insights.

We reached out and received a lot of enthusiasm. Many companies were willing to share their experiences and show their systems. We saw amazing Hadoop installations, many terabytes of personal data, and many analysts, data scientists, and business developers who had invested serious time and money in Big Data.

We asked only one question: Can you tell us your million-dollar business case with Big Data? Because, after these investments, there has to be at least one application of Big Data that had a big payoff, right?

Despite the transparency, success cases were quite scarce, and the connection between those successes and the Big Data projects was weak. For example, there was a case involving a home decoration store. It had an impressive loyalty programme that brought it huge amounts of data about its customers and their purchases.

What you buy says a lot about your stage in life and about how you live, so there are endless opportunities for analysis of that data. But it was not possible to attribute a clear profit to having all this data about customers.

When we persisted, there turned out to be one particular thing that really paid off, and it was a lot simpler than we expected. The store sent an e-mail asking customers what they were planning to renovate in the next couple of months. When customers shared their plans (for example, to rebuild the bathroom), the company sent offers on bathroom tiles, showers, and taps. Response rates went through the roof.

That was the million-dollar business case. But it was not very complicated. It was one question and one answer — not what you would call Big Data. The data volume is small, the data has little variety, and it does not come in continuously; it is just one data point collected by one e-mail.

None of the firms at the forefront of data usage we met were able to share examples that showed the holistic profile of their customers really paid off.

“But,” one analyst told us, “it’s a great toy!”

Don’t do Big Data for the sake of Big Data

Maybe that’s the risk of the Big Data hype. Most people in the boardroom don’t really know exactly how these data projects work, but almost all agree they should “do something with Big Data,” because that’s what they hear at conferences and read in management books.

For those who do know how Big Data works, this new era is a dream come true. There is money, fancy software, support from top management, and more data than ever. It’s like a planet-sized playground where you can play for ages. So if we lose common sense about data, chances are companies are “doing Big Data” not for the business, but just for the sake of Big Data.

As we wrote in an earlier INMA blog post, it helps when businesspeople lead the data teams. They tend to make business cases and evaluate beforehand whether the investment in time and money is justified by the predicted earnings in money or improved customer experience.

If that’s the case, go for it, keeping in mind you have to start with three things:

  1. Comply with privacy legislation.
  2. Keep the relationship with your customers in mind.
  3. Be sure you understand your basic business processes before you start a Big Data project.

Privacy legislation

There are big differences in privacy laws around the world. For example, people in the United States seem to have less concern about lost personal privacy than Europeans. North Americans may more readily accept that data capture occurs and companies use information on demographics and behaviours for a variety of purposes.

There are Americans who actively try to thwart data collection attempts, but they are a very small share of the market. Americans may trust companies with their data to a greater degree than they trust their government.

Europeans appear to trust their governments with their data but cast a wary eye on private sector data capture. Privacy protection is a hot topic, legislation is strict, and regulatory agencies are active and powerful. Building a profile of customers without telling them exactly what information you store and what you are going to do with it is forbidden by the European Union.

People in different parts of the world have different concerns regarding privacy.

Be careful with relationships

In addition to the question of whether a Big Data project will bring value to your business, there is also an ethical question to ask: To what extent do you store, connect, and analyse the personal data of your customers? If you are in the relationship business, you may want to be careful with the collection of knowledge about your partners.

If you decide you want to know “everything from everybody,” ask yourself how you would feel if you found out your wife or husband wanted to know everything from you at every moment. What if your partner tracked your phone, analysed your spending behaviour, and read your e-mails without letting you know? It would cause serious doubts about the trust and equality in the relationship.

At NRC, we have become very cautious in what we do with the personal data of our customers and Web site visitors. We always ask ourselves two questions before we start any Big Data project: Is there a positive business case? And, can we explain to our mothers how we use this data?

Another option for the future is to put the customer in the driver’s seat. Tools will emerge that let customers control the flow and use of their personal data and dictate their own terms of service.

Start with the basics

Just a couple of years ago, we were in a room with three people (an analyst, a marketer, and a data warehouse developer), discussing a question that looked very simple at first sight: “What is a subscription?”

We had three different answers. The marketer was counting in terms of the audit bureau standard (Saturday-only delivery is divided by six). The data warehouse developer counted the pieces of the package in the subscription system (digital as one, the newspaper as one). And the analyst wanted to count the bundles regardless of frequency in delivery and number of pieces in the system.
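To make the disagreement concrete, here is a minimal Python sketch of the three counting rules applied to the same records. It is an illustration only, not NRC's actual system: the `Subscription` record, its field names, and the sample data are hypothetical. It also assumes that "divided by six" means weighting print delivery days against a six-day week, and that the digital part does not add to the audit count.

```python
from dataclasses import dataclass
from fractions import Fraction

# Assumption: a full print subscription means six delivery days per week.
PRINT_DAYS_PER_WEEK = 6


@dataclass(frozen=True)
class Subscription:
    """Hypothetical record; field names are illustrative only."""
    customer_id: int
    print_days: int     # print delivery days per week, 0 if no print
    has_digital: bool


def audit_count(subs):
    """Marketer's view: weight print by delivery days out of six,
    so a Saturday-only subscription counts as 1/6."""
    return sum(
        (Fraction(s.print_days, PRINT_DAYS_PER_WEEK) for s in subs),
        Fraction(0),
    )


def component_count(subs):
    """Developer's view: count the pieces of each package
    (digital as one, the newspaper as one)."""
    return sum(
        (1 if s.print_days > 0 else 0) + (1 if s.has_digital else 0)
        for s in subs
    )


def bundle_count(subs):
    """Analyst's view: one per bundle, regardless of delivery
    frequency or number of pieces."""
    return len(subs)


# Made-up sample: three customers, three bundles.
SAMPLE = [
    Subscription(1, print_days=6, has_digital=True),   # six-day print + digital
    Subscription(2, print_days=1, has_digital=True),   # Saturday-only + digital
    Subscription(3, print_days=1, has_digital=False),  # Saturday-only print
]

print(audit_count(SAMPLE), component_count(SAMPLE), bundle_count(SAMPLE))
```

On this made-up sample the three rules return 4/3, 5, and 3 for the same three customers, which is the kind of gap that has to be settled before any report can be trusted.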

Not being able to agree about this single question was very confounding. We had to nail down these very basic definitions of our business for the brand-new data warehouse.

We found there is a lot of work to do with relatively small amounts of data already available in the current systems in order to understand the day-to-day business. It turned out there were a lot more basics than we thought, and we had to define many metrics before we could start to report and analyse them:

What is a new subscriber? When is an ex-subscriber a prospect? When is a new subscription a switcher? What is the revenue per subscription? What are the variable costs? How many paying customers do we have, and at what price?
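Questions like these only become reportable once each answer is written down as an explicit rule. The sketch below shows one possible shape for such a rule in Python. Everything in it is an assumption for illustration: the 30-day window, the label names, and the function `classify_start` are hypothetical and are not the definitions NRC settled on.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical threshold: a start within 30 days of the previous
# subscription's end is treated as a switch, not a new relationship.
SWITCH_WINDOW = timedelta(days=30)


def classify_start(start: date, previous_end: Optional[date]) -> str:
    """Label a starting subscription under the assumed rules above."""
    if previous_end is None:
        # No earlier subscription on record for this customer.
        return "new"
    if start - previous_end <= SWITCH_WINDOW:
        # Covers short gaps and overlaps (previous_end after start).
        return "switcher"
    # The customer left long enough ago to count as won back.
    return "returning"


print(classify_start(date(2016, 3, 1), None))
print(classify_start(date(2016, 3, 1), date(2016, 2, 20)))
print(classify_start(date(2016, 3, 1), date(2015, 3, 1)))
```

The value lies less in the code than in the fact that the threshold is a single named constant that marketing, analysts, and developers all have to agree on.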

We basically didn’t understand what was really happening with the 265,000 relationships we had. That’s strange, because at the same time, we were telling each other those relationships were the one thing bringing value to the company.

As a result, we chose to invest time and money in developing definitions, data preparation in our data warehouse, and reports on the basics in our business intelligence tool. These are not the sexiest subjects for data analysts and scientists. They’re not Big Data toys. It’s not Hadoop; it’s just counting 265,000 records and their evolution.

But at the same time, this is exactly what we did. And it paid off.

We started to see how many core relationships we had as well as what kind of promotions and products were driving the relationships and which ones were bad for relationships. We understood what kind of service kept the relationships alive.

After a while, we went from reporting the past to forecasting the future and started to understand the buttons we could push to improve results. More than a year after that confounding meeting where we were not able to count our subscribers, the improvements started to pay off, and we started to grow in relationships and in our financial bottom line.

Start at the beginning when devising a Big Data strategy.

It is advisable to start with the basics. Maybe this doesn’t sound very new or revolutionary. When you build a house, you start with the foundation — not with a carport. For some reason, that’s not how it usually goes in the field of data science. Many analysts will choose the sexy Big Data projects and skip the basics. It’s like asking a mechanic if he wants to change the tires of a car or tune the engine.
