How media companies can get into predictive analytics

By Ariane Bernard


New York City, Paris


Hi everyone.

I’m just back from INMA’s Media Innovation Week conference in Copenhagen — and it was my first in-person conference since 2019. Wow. A bit like skiing, you don’t forget these muscles. But also, apparently, I forgot that it was good form to actually wear your badge throughout.

On the more productive side: It was a great opportunity to connect with some of the INMA members at the conference, and this week’s newsletter takes some inspiration from our conversations.

Thanks for the great conversations, folks!


So you want to … get into predictive analytics

INMA had set up a few “Ask Me Anything” sessions for me and a few great publishers in attendance in Copenhagen, and this actually had an unexpected benefit for me. While these were pretty different profiles of publishers, some topics of interest reliably cropped up. A kind of Venn diagram of topics of interest, if you will. 

The topic that surprised me the most for appearing often — not because it’s wild but rather because it’s specific — is predictive analytics.

Media publishers at Media Innovation Week in Copenhagen were most curious about predictive analytics.

And, in these conversations, one thing was clear: This means very different things to different people. So, just so we are talking about the same thing here: Predictive analytics is an area of analytics that produces not just insights from the data at your disposal but also models this data (which, by definition, describes the past) to make, errr, predictions about the future.

So when we talk about predictive analytics, we really talk about machine learning applied to analytics.
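To make that definition a little more concrete, here is a minimal sketch of "modeling the past to predict the future." Everything in it is an assumption for illustration: the two features (articles read and days since last visit), the eight invented rows of history, and the choice of a hand-rolled logistic regression, which is only one of many techniques a data team might reach for.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=2000, lr=0.1):
    """Fit weights with plain batch gradient descent on log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_churn(weights, bias, x) -> float:
    """Probability (0 to 1) that a user with features x will churn."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Invented history (the past): each row is
# (articles read last month / 10, days since last visit / 10).
history = [(3.0, 0.1), (2.5, 0.2), (2.0, 0.3), (1.8, 0.2),
           (0.3, 2.0), (0.1, 2.5), (0.5, 1.5), (0.2, 3.0)]
churned = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = the user went on to cancel

weights, bias = train_logistic(history, churned)

# Predictions (the future) for two users the model has never seen.
engaged = predict_churn(weights, bias, (2.8, 0.1))
lapsing = predict_churn(weights, bias, (0.2, 2.2))
print(f"engaged reader: {engaged:.2f}, lapsing reader: {lapsing:.2f}")
```

The descriptive part is the `history` table; the predictive part is the score for a reader who isn't in it.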

What do you need to get started?

  • Good data: clean, accessible, consistent. You know, good data.

  • A good problem (in the mathematical sense; your topic of interest may not be a “problem” but could also be about recognising an opportunity). A good problem is one that’s well defined: defined in goals, but also well understood in terms of how it presents itself from a data perspective.

  • The ability to train models and do so with enough speed of execution that the problem you are trying to catch is identified at the earliest possible moment. It’s not much of a useful prediction if the fire alarm anticipates the fire only about one minute before it breaks out, even if, technically, the alarm did predict a fire.

I reread myself here and I think: INMA’s editor is going to want to delete a few instances of the word “good,” which I repeat a good too many times here. But besides lazy writing, the reason I am repeating “good” is that as you set your sights on this next stage of your analytics journey (from descriptive analytics to prescriptive), the quality of what you can do will markedly suffer from any one of these three things being just OK.

Have messy, inconsistent data? You won’t be able to train on it.

Pick an ill-defined problem to work on? You won’t be able to reliably model it.

Have only so-so technical resources to process your data and train? You won’t make it to the finish line.

I don’t mean to sound discouraging, but this is all to say that as you advance in your analytics journey, the requirements become more rigid. And when you consider all of this takes time and investment, there is certainly good reason to wait until there is maturity in the data team, the data itself, and the infrastructure to get going on this prescriptive journey.

Identifying good problems to work on

Good problems to consider for your prescriptive analytics programme are those that have these characteristics:

  • A late-stage funnel problem, whose attached data is unambiguous. For example, a good candidate problem is one where you will be observing conversions (which are, in analytics terms, “goals”) rather than a problem where you observe engagement. The reason these make for better problems to sink your teeth into isn’t actually because conversions are tied to revenue and engagement is soft n’ squishy. It’s because, generally, engagement is triggered in very high volume by many different levers in your product experience, which makes the attached data far more ambiguous.

  • A problem where the data is mostly from logged-in users. This is somewhat connected to “good, clean data.” It’s also because when you look at running models, you want to be able to run A/B tests where you have good control of who gets into the test, who is in your control pool, and possibly follow these tests over a significant period of time. It’s difficult to do this with logged-out users.
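Why do logged-in users make A/B tests easier to control? One reason is that a stable user ID lets you assign each person to the same group every time they show up. A minimal sketch of that idea, assuming a hash-based split (the experiment name, the user IDs, and the `assign_group` helper are all hypothetical):

```python
import hashlib


def assign_group(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically place a logged-in user in 'treatment' or 'control'.

    The same user and experiment always give the same answer, so the
    groups stay stable for as long as the test runs.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # a value in [0, 1)
    return "treatment" if bucket < treatment_share else "control"


users = [f"user-{n}" for n in range(1000)]
groups = {u: assign_group(u, "churn-email-test") for u in users}
treated = sum(1 for g in groups.values() if g == "treatment")
print(f"{treated} of {len(users)} users in treatment")
```

With a logged-out visitor, there is no durable `user_id` to hash, which is exactly why the same person can drift between groups across devices and sessions.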

Not all problems are equal where prescriptive analytics are concerned.

Does this mean you could never endeavor to get into prescriptive analytics for top-of-the-funnel problems? Sure you could. You could, for example, look at how coming from Google predicts certain kinds of sessions or engagement patterns, then try to predict whether this user looks like they could become a loyal user or not, and what seems to increase the likelihood of such an outcome.

But everything else being equal, I wouldn’t choose this kind of problem for my early forays into this space. Instead, I’ll look to a great first candidate: Predicting the likelihood to churn.

This topic is a late-stage funnel problem and, by definition, involves only logged-in users. If you want to get inspired, INMA’s archives have examples.

Training, modeling

I don’t like futurists because there’s little accountability built into the job. “In 50 years, we will be able to watch Netflix from the inside of our eyelids” — that futurist on stage didn’t stake her speaking fee against this one, so, you know, cool stuff. 

But for our training models, there’s a bit more accountability: observable real life. 

So if we take the example of a predictive model for churn, part of evaluating the model is checking whether a group originally predicted to be likely to churn did in fact churn in real life. There is some probability work that goes into this (confidence intervals). But in general, with a problem whose data set is clean and known and users who are logged in, we should be in a good place to assess the quality of the model.

In general, this points to a very important part of choosing a good candidate problem for your predictive analytics work: the ability to evaluate the model against observable data gathered over time.
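Here is a rough sketch of what that check against real life can look like, with the confidence intervals mentioned above. The counts are invented, and the Wilson score interval is just one common way to put a range around an observed rate; your data team may well prefer another.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Invented outcomes, observed some months after the model scored each user.
flagged_total, flagged_churned = 400, 132        # predicted "likely to churn"
unflagged_total, unflagged_churned = 3600, 180   # everyone else

flag_low, flag_high = wilson_interval(flagged_churned, flagged_total)
rest_low, rest_high = wilson_interval(unflagged_churned, unflagged_total)

print(f"flagged users churned at {flag_low:.1%} to {flag_high:.1%}")
print(f"other users churned at   {rest_low:.1%} to {rest_high:.1%}")

# If the two ranges don't overlap, the model's flag is separating
# users who really did behave differently.
intervals_separate = flag_low > rest_high
print("model separates the groups:", intervals_separate)
```

The point is less the formula than the habit: score users, wait, then compare what the model said with what people actually did.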

Your models are making a prediction and you have a next-best action — now what?

I am patting myself on the back with this one because I get to trot out my favourite topic of all: Personalisation. Yes, this entire newsletter was a setup, all along, to talk about personalisation. Mwahahaha.

This question came up last week in my chat with publishers: Once you have modeled a next best action, what do you do? 

At this point, we no longer have an analytics problem and we no longer have a data science modeling problem. Instead, we’re on the delivery side of things: varying {something} for a user whose activity (or lack thereof) makes them a candidate for some intervention relative to the one-size-fits-all experience you already have in place.

Predicting the likelihood of churn is a smart use of predictive analytics and leads to personalisation.

If we’re talking about churn and your model is identifying a group of users who are not reading enough and now are in the “danger zone” of your model, the next best action may be to try and put more articles in front of them.

Doing this, of course, is the territory of personalisation. I won’t go too nuts in the details of this (I won’t go too nuts yet ... but just you wait, because I have a whole report on the topic coming out in a few weeks).

But I want to underline something here: Not all personalisation has to come from your Web frontend or your app. In fact, one of the more technically friendly places to deliver some personalisation is e-mail, leaning into your CRM and ESP to deliver some tailored e-mails to a flagging user (with more to read, with custom offers, etc.). By its nature, a CRM is oriented toward personalisation. So anything that can use the pipes of your CRM can be useful to handle the personalised delivery of the next best action.
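As a sketch of how light that hand-off can be: the model's output becomes a short list of users and the e-mail each should get, which a CRM or ESP can then pick up. The threshold, the campaign names, and the export fields below are all made up for illustration, and no real CRM API is being called here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredUser:
    user_id: str
    churn_probability: float        # output of the churn model, 0.0 to 1.0
    articles_read_last_30_days: int


def next_best_action(user: ScoredUser, danger_zone: float = 0.6) -> Optional[str]:
    """Return a hypothetical e-mail campaign name, or None to leave the user alone."""
    if user.churn_probability < danger_zone:
        return None
    if user.articles_read_last_30_days < 5:
        return "more_to_read_digest"      # not reading enough: send more articles
    return "custom_retention_offer"       # reading, but still at risk: try an offer


def build_crm_export(users):
    """Rows a CRM or ESP import could consume; the field names are invented."""
    rows = []
    for user in users:
        action = next_best_action(user)
        if action is not None:
            rows.append({"user_id": user.user_id, "campaign": action})
    return rows


scored = [
    ScoredUser("lee", 0.82, 2),
    ScoredUser("sam", 0.15, 40),
    ScoredUser("ana", 0.71, 12),
]
export_rows = build_crm_export(scored)
print(export_rows)
```

Everything past `build_crm_export` is the CRM's job, which is rather the point of using pipes you already have.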

So that’s the last mile of your predictive analytics journey: closing the prescription with an action loop.

In doing so, we actually head into prescriptive analytics (next best action), and automating the delivery of this prescription isn’t analytics at all. The border between predictive analytics and prescriptive analytics is often pretty fine because making predictions will naturally identify the variables that are significant in the model.

You could, of course, stop at making predictions (“Lee is likely to churn”). But in most cases, you’ll want to act on this. 

Which is why, on this journey to predictive analytics, you will, sooner or later, encounter personalisation. And on this topic, and to borrow the words of the Terminator: “I’ll be back.”


About this newsletter

Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company,

This newsletter is part of the INMA Smart Data Initiative. You can e-mail me at with thoughts, suggestions, and questions. Also, sign up to our Slack channel.

