In today’s competitive and crowded news market, finding and retaining loyal readers is an essential component of sustainability for news organisations. After all, with so many competing ways to access news, reader loyalty is hard to come by.

The data team at the South China Morning Post (SCMP) recently set out to understand how readers develop loyalty to specific news outlets and how to nurture that loyalty.

In January 2019, the team began to build an algorithm using machine learning to predict reader loyalty. We’ve named our predictive engine Bluefin because:

  1. Bluefin tuna return to their birthplace to spawn, just like we hope our loyal users will return to find informative, meaningful content.
  2. Just as the bluefin tuna are critically endangered, loyal readers are increasingly rare. Both need protection and cultivation to return to healthy population levels.

While the ability to predict reader loyalty has many applications, we were interested in using the prediction to optimise our marketing campaigns. We used A/B testing with a control group to test whether our predictive engine could improve SCMP’s marketing strategy. By focusing on the highest-potential readers, we increased engagement with our marketing campaigns and improved their cost efficiency.

We defined a loyal reader as a multi-session user who returned to the site with a pre-specified frequency and recency. Of the 40+ variables in the model, several stood out as top indicators of loyalty:

  • Percentage of pageviews in each section.
  • Time on page.
  • Duration between the last two visits.
  • Percentage of sessions on various platforms.
  • Percentage of sessions by source and medium.
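As a rough illustration of the loyalty definition and two of the indicators above, the sketch below labels a user from their session dates and derives per-section pageview shares and the gap between the last two visits. The threshold values and all function names are hypothetical; the actual frequency and recency cut-offs used in Bluefin are not stated here.

```python
from collections import Counter
from datetime import date

# Hypothetical thresholds, chosen only for illustration.
MIN_SESSIONS = 3                 # frequency: at least this many sessions
MAX_DAYS_SINCE_LAST_VISIT = 14   # recency: last visit within this many days


def is_loyal(session_dates, as_of,
             min_sessions=MIN_SESSIONS,
             max_days_since_last=MAX_DAYS_SINCE_LAST_VISIT):
    """Label a user loyal if they visited often enough and recently enough."""
    if len(session_dates) < min_sessions:
        return False
    days_since_last = (as_of - max(session_dates)).days
    return days_since_last <= max_days_since_last


def section_share(pageview_sections):
    """Return the fraction of a user's pageviews that fall in each section."""
    counts = Counter(pageview_sections)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {section: n / total for section, n in counts.items()}


def days_between_last_two_visits(session_dates):
    """Return the gap in days between the two most recent visits, or None."""
    if len(session_dates) < 2:
        return None
    ordered = sorted(session_dates)
    return (ordered[-1] - ordered[-2]).days
```

The same pattern of counting, then dividing by the total, would apply to the platform and source/medium percentages.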

Using data from the period between June 2018 and November 2018, we built a workflow for the algorithm, which began with data extraction.

SCMP's workflow for the algorithm is created from a year's worth of data.

The scoring engine measured two things: recall, the proportion of actual loyal users that our algorithm correctly predicted, and precision (loyal readers correctly predicted as loyal readers/total predicted loyal readers). Each month, we feed the scoring engine’s results back into the data engine to improve the model. This allows the algorithm to learn from the latest data and incorporate any new relevant variables.
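The two measures described above correspond to what are commonly called recall and precision. A minimal sketch of how they could be computed from sets of user IDs follows; the function name and the ID-set representation are assumptions for illustration, not a description of the scoring engine itself.

```python
def precision_recall(actual_loyal, predicted_loyal):
    """Compute (precision, recall) from two collections of user IDs.

    precision = correctly predicted loyal / total predicted loyal
    recall    = correctly predicted loyal / total actually loyal
    """
    actual = set(actual_loyal)
    predicted = set(predicted_loyal)
    true_positives = len(actual & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall
```

For example, if four users turned out to be loyal and the model flagged three users, two of them correctly, precision is 2/3 and recall is 2/4.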

Recognising that SCMP readers’ consumption patterns vary by region, we ran the model separately for the United States and Asia regions. We also used multi-month historical data to perform a cross-time validation on the model’s prediction.
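One common way to set up a cross-time validation is to train on earlier months and test on the month that follows, moving the cut-off forward each round. The sketch below generates such splits; the `min_train` parameter, the `YYYY-MM` month labels and the function name are assumptions, and the exact scheme the team used is not described here.

```python
def cross_time_splits(months, min_train=3):
    """Yield (train_months, test_month) pairs in chronological order.

    Training months always precede the test month, so the model is never
    evaluated on data older than what it was trained on.
    """
    ordered = sorted(months)  # "YYYY-MM" strings sort chronologically
    for i in range(min_train, len(ordered)):
        yield ordered[:i], ordered[i]


if __name__ == "__main__":
    months = ["2018-06", "2018-07", "2018-08", "2018-09", "2018-10", "2018-11"]
    # Run the same splits once per region, since the models are separate.
    for region in ("United States", "Asia"):
        for train, test in cross_time_splits(months):
            print(region, "train:", train[0], "to", train[-1], "test:", test)
```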

In our multivariate A/B test campaigns in the United States and Asia, we found that using the algorithm’s predictions increased engagement by 58%-78% and cost effectiveness by 36%-52%. This demonstrated that predictive algorithms such as Bluefin can optimise marketing campaigns effectively and efficiently. We are now starting to use Bluefin to identify new audiences that share characteristics with existing high-potential loyal users.
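Percentage improvements like these are typically expressed as the relative lift of the test group over the control group. A minimal sketch of that arithmetic is below; the function name is hypothetical and the rates in the usage note are made-up inputs, not SCMP campaign figures.

```python
def relative_lift(treatment_rate, control_rate):
    """Return the relative improvement of the treatment over the control.

    A result of 0.5 means the treatment rate is 50% higher than the control.
    """
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (treatment_rate - control_rate) / control_rate
```

With an illustrative control engagement rate of 4% and a targeted-group rate of 6%, `relative_lift(0.06, 0.04)` gives a lift of about 0.5, or 50%.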

Identifying potential loyal readers is beneficial not only for maximising marketing budgets, but also for creating personalised reader experiences. Predictive algorithms such as Bluefin can be used to engage loyal readers in more meaningful ways and minimise reader churn. They also offer exciting possibilities for tailoring that engagement to each user’s preferences and potential.

The success of Bluefin provided extremely valuable insight into SCMP’s readers. We trust it will continue to be a powerful and dynamic tool going forward.