With the help of machine learning, local and regional publisher Mittmedia has looked into why its subscribers have churned. The company got a comprehensive picture of the causes using different data points such as age, gender, online behaviour, and payment history. Some of the results were quite surprising.

INMA members sat in on a live Webinar Wednesday to learn about Mittmedia’s findings as presented by Head of Digital Development Robin Govik and Product Manager for Data Michelle Ludovici.

Based in Sweden and part of the Bonnier News group, Mittmedia has:

  • 280,000 subscribers (33% digital only).
  • 28 local newspapers.
  • 20 local news brands.
  • 10-plus free newspapers.
  • 40-plus newsroom locations.

Mittmedia presented insights it learned from its churn models in an INMA Webinar.

The problem

Govik began by saying the trends Mittmedia identified were clear. While there has been a steady decline in the numbers of print subscribers since 2015, the numbers of digital-only subscribers have been increasing — they now total more than 90,000.

“The biggest threat to that nice increase is churn,” Govik said. “From 2018 to 2019, Mittmedia managed to decrease the churn rate from 17% to 11.5%. But then it stagnated. It went up and then down again.”

While digital subscriptions have been increasing steadily, Mittmedia decided to address the problem of churn.

In summer 2019, the team felt it was time to take a close look at the reasons for reader churn, and for these fluctuations, so it could combat churn effectively. They asked questions such as: Were their offerings good enough? Were there signs of churn that they could have acted upon?

“The foundation for our investigation [into churn] was our first-party data,” Govik explained. The team used its data platform, Soldr, so named because its key objective is to solder different products and datasets together. The system provides three distinct datasets: interaction data, content data, and user data.

“From the perspective of the end user, Soldr is the driving factor behind the majority of content being exposed, regardless of what product or device the user is on,” Govik said.

The Mittmedia data platform Soldr takes data from multiple platforms and devices to analyse.

Data insights

Now it was time to look at what the data was telling the Mittmedia team about its churn patterns. Ludovici took over here, sharing the team’s definition of churn: A user has churned at time of cancellation or last payment. A user can also churn multiple times.

A definition is necessary because it might not be as obvious as it seems, and different companies have varied definitions of what constitutes churn.
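
To make that definition concrete, here is a minimal Python sketch of how it could be applied to subscription records. It is not Mittmedia's code: the `SubscriptionPeriod` record and its field names are assumptions made for illustration. Each ended subscription period produces one churn event, dated at the cancellation if there was one and at the last payment otherwise, so a returning user can churn more than once.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class SubscriptionPeriod:
    # Hypothetical record layout; field names are assumptions.
    user_id: str
    start: date
    cancelled_on: Optional[date]  # explicit cancellation date, if any
    last_payment: Optional[date]  # date of the last successful payment
    active: bool                  # True if the period is still running


def churn_events(periods: List[SubscriptionPeriod]) -> List[Tuple[str, date]]:
    """Return one (user_id, churn_date) pair per ended subscription period."""
    events = []
    for p in periods:
        if p.active:
            continue  # still subscribed, so no churn event for this period
        # Churn is dated at cancellation, or at the last payment otherwise.
        churn_date = p.cancelled_on or p.last_payment
        if churn_date is not None:
            events.append((p.user_id, churn_date))
    return sorted(events)
```

Because events are produced per period rather than per user, a subscriber who leaves, comes back, and leaves again appears twice in the output.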

From there, several steps were taken to gain valuable data insight:

  • Identify and clean the dataset: Identify the outliers and make sure the data is correct.
  • Factor analysis: Find the parameters that correlate with churn.

Once this was done, the team created three distinct models:

  • Survival/hazard statistical model, which is the simplest model.
  • Machine learning model one, GBM (gradient boosting machine), which is the simpler of the two and uses little data to compute.
  • Machine learning model two, Neural Network, which is a more complex, deep learning model that uses more data and is more accurate.
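
The Webinar did not specify which survival model Mittmedia used. As a rough illustration of what the simplest kind of survival analysis looks like, here is a minimal Python sketch of a Kaplan-Meier estimator, a common non-parametric technique chosen here as an assumption. The function and argument names are hypothetical. It estimates the share of subscribers still subscribed after a given number of days, treating users who had not churned by the end of the observation window as censored.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[int],
                 churned: Sequence[bool]) -> List[Tuple[int, float]]:
    """Return [(t, S(t))] for each time t at which at least one churn occurred.

    durations: days each subscriber was observed
    churned:   True if the subscriber churned at that time, False if they
               were still subscribed when observation ended (censored)
    """
    events = Counter(d for d, c in zip(durations, churned) if c)
    exits = Counter(durations)  # everyone leaves the risk set at their duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if events[t]:
            # Multiply by the share of at-risk users who did not churn at t.
            survival *= 1 - events[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve
```

A curve like this shows where retention drops fastest over a subscription's lifetime, but unlike the two machine learning models it does not use behavioural features.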

All three models can be used for current situation analysis and predictions, Ludovici said. The dataset covered more than 37,000 customers who had been subscribers between April 12 and August 12, 2019, and the analysis looked at their activities and behaviour in the month before cancellation.

“We have not taken into account the first 30 days of subscription, and that is because we have many campaigns,” Ludovici added. These campaigns are often one-month free trial periods and generate substantial cancellations at the end of the period.

The Mittmedia data models looked at four different sets of data.

The models looked at subscription data, user data (such as gender and age), behavioural data (such as what is clicked and read), and data on app versus Web site use. Behaviour that correlated with a high propensity to churn included:

  • Users who received a lot of notifications.
  • Users with many past subscriptions.
  • Users with payment errors.

A lower risk to churn correlated with:

  • Users who read more articles.
  • Users who subscribed to the newspaper in the past.
  • Users who open more images.
  • Users who subscribed via a campaign or the app.

Ludovici shared a graph illustrating the effect of the number of push notifications sent to users. “If we send up to 20 notifications, that’s OK. But if we send more than 20, there’s a higher risk of churn.”

Govik jumped in to say that this finding, in particular, really surprised the Mittmedia team: “We still don’t know the reason behind this, we only know the correlation,” he said. The team is using the findings to research what those reasons might be as it formulates its strategy for dealing with churn.

“This doesn’t mean that push notifications are bad,” Ludovici added. “But we need to review how we do pushes, and how many pushes we send out, because it’s very strongly correlated to churn.”
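
The kind of comparison behind the push notification finding can be sketched in a few lines of Python. This is an illustration only, with hypothetical function names and no real data: it splits users at a threshold (20 notifications, per Ludovici) and compares the churn rate on either side. As Govik stressed, a gap between the two rates shows a correlation, not a cause.

```python
import math
from typing import Iterable, List, Tuple


def churn_rate(flags: List[bool]) -> float:
    """Share of users who churned; NaN when the group is empty."""
    return sum(flags) / len(flags) if flags else math.nan


def split_churn_rate(users: Iterable[Tuple[int, bool]],
                     threshold: int = 20) -> Tuple[float, float]:
    """users: (notifications_received, churned) pairs.

    Returns (churn rate at or below the threshold, churn rate above it).
    """
    users = list(users)
    low = [churned for count, churned in users if count <= threshold]
    high = [churned for count, churned in users if count > threshold]
    return churn_rate(low), churn_rate(high)
```

The same split works for any count-based signal, such as articles read per month, by passing a different threshold.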

The findings also revealed Mittmedia’s female users are 10% more likely to churn than male users. When the team used this data to dig deeper, they realised they publish many more articles and images about men than about women.

“Now we are adjusting a bit to that and trying to write more articles about women and having more pictures with women in them,” Ludovici said.

When it comes to age, most of the churn comes from users between 25 and 45 years old. This was especially distressing to learn because this is the exact age group that is a target audience for the publisher. Yet they found that the slightly younger age range of 20-40 is where their highest potential is.

Surprisingly, Mittmedia found churn to be highest in its target age range of 25-45, leading the company to focus more on 20-40 year olds.

The number of articles read in the app also correlates with churn, the data showed. Users who read more articles in the app have significantly longer subscriptions. The mean number of app reads in one month is 60, and the median is 36. Mittmedia found the magic number is 80: churn is much higher for users who read fewer than 80 articles in a month and much lower for those who read more than 80.

“The same goes for images that are opened,” Ludovici said. “Users who open more images, it doesn’t matter if it’s in the articles or image scrolls, they are less likely to churn.” Three thousand of the churned users in the data set opened more than 10 images, while more than 9,500 of them opened between one and 10 images.

The critical first 100 days

One major finding from all this research was how critical the first 100 days are for retaining customers. “In the first 100 days, this is where you want to have efforts to keep the customer,” Ludovici said.

When it comes to which groups to target with anti-churn efforts, the group who are 60%-80% likely to churn is where Mittmedia is focusing its efforts. “The ones under 50%, those are the people very likely to stay with us.”

Mittmedia decided to focus its anti-churn efforts on users who are 60%-80% likely to churn.

Those in the 60%-80% range haven’t made up their minds yet and could be swayed with efforts to retain them. For those with a 90% or higher likelihood of churning, Ludovici said perhaps it’s not worth the retention efforts, which are better spent on those more likely to be retained.
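
As a minimal sketch of how such a targeting rule might look in code, the Python function below maps a model's predicted churn probability to the groups Ludovici described. The function name and labels are hypothetical. The Webinar did not say how users between 50% and 60%, or between 80% and 90%, are treated, so the sketch leaves those ranges unassigned rather than guessing.

```python
def retention_band(p_churn: float) -> str:
    """Map a predicted churn probability (0.0 to 1.0) to a targeting band."""
    if not 0.0 <= p_churn <= 1.0:
        raise ValueError("p_churn must be between 0 and 1")
    if p_churn >= 0.9:
        return "low priority"    # retention effort may not be worth it
    if 0.6 <= p_churn <= 0.8:
        return "target"          # undecided users, focus of anti-churn efforts
    if p_churn < 0.5:
        return "likely to stay"
    return "unspecified"         # ranges the Webinar did not cover
```

For example, a user scored at 0.7 lands in the "target" band, while one scored at 0.95 is "low priority".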

Taking action

So, what was Mittmedia to do with all these insights? The team made some concept slides from the data, of high- and low-risk groups across its different brands.

“An editorial person could take this as a measure to counteract [churn],” Ludovici said. “We could use this in personalisation to show a different mix of articles — gender based, for example.”

Concept slides were created from the user data for strategic planning in combating churn.

Other concept slides included those that addressed the number of push notifications and other behaviours — all to provide action plans to counteract the churn predictors.

Ludovici addressed the question of why her team made three data models. It was a lot more work, she admitted, but the team wanted to verify the results and see whether different models produced contradictory findings. She noted that if they do more analysis or go deeper in the future, they will use the Neural Network model because it is more comprehensive and gives more complex data.

Further opportunities

Next steps could include:

  • Segmenting users (such as churn reasons and risk levels per title).
  • Getting more context through interviews.
  • Looking at campaign channel data.
  • Adding more features such as bank ID, cross sales statistics, etc.

At the end of the day, Mittmedia came up with a list of general recommended actions, based on its data, for a churn reduction strategy:

  • Implement easy anti-churn actions for high-risk customers.
  • Hypothesis test anti-churn actions on at-risk user segments.
  • Analyse the types of pushes for risk versus non-risk groups.
  • Investigate potential for linking several users to one subscription.
  • Investigate payment errors.
  • Analyse interactions, such as users that turn on the sports content filter, for risk versus non-risk groups.

“Everyone can act on this,” Ludovici said, “and they can also sync with each other to not counteract each other’s efforts.”

Govik added: “Hopefully we can use this analysis and integrate it into different kinds of work flow.” The anti-churn actions could be automated to activate specific responses to risk behaviour. “Now, we have used the churn model for insights and actions.”

Q&A

INMA: How do you relay this information throughout your organisation?

Mittmedia: I think there is some cultural challenge between the data team on the one side and marketing and sales people on the other side. We have tried with our best efforts to make them work together so that we don’t just hand over the results. We think it’s very important that they also participate in defining the questions and defining the challenge. We gave them the opportunity to be involved from the very beginning.

INMA: Regarding actionability, did you see a factor such as the number of articles read as a predictor of churn? Do you try to determine causality?

Mittmedia: We tried to define causality, of course, but it’s a very tricky question because there are so many parameters of a user. One way is to do a hypothesis test, for example with pushes and with content geared more toward women. You need to have very defined conditions for that as well. That’s a very controlled test. Maybe we can see the results very soon. So we do try to determine causality in the areas where we can do that.

INMA: Have you considered recency, frequency, and volume while building churn models?

Mittmedia: Yes, actually. Frequency of visits to our products is a parameter that goes into the model. We didn’t have time to consider all of those because you have to choose one and start somewhere.

INMA: Could you elaborate what you mean by payment errors?

Mittmedia: Credit card errors, where the transaction hasn’t gone through correctly and so on. Involuntary churn.

INMA: Do you segment your subscribers, and do you separate the models for different segments? Do you find some subgroups respond differently to pushes?

Mittmedia: In general we do have segments we use, for example, for personalisation, but we don’t do it for the churn models yet. That’s on the horizon.

INMA: How did you augment your data set to include all the user demographic data?

Mittmedia: Sweden is a very transparent society, so we actually get the personal identity numbers (social security numbers) from the people. And from this we can derive the gender and age, as well as offline data such as address, which we get from the Swedish government system.