This week, we continue our topic of personalisation, which we began to look at in the previous newsletter: contrasting the impact and options of personalising content using unsupervised learning versus supervised learning. Personalisation will be looked at in depth with four fantastic panelists in the first module of our March master class series, which begins tomorrow, March 10 (even if you register late, you’ll have access to all our presentations on replay).
As always … I’d love to hear from you. E-mail me at email@example.com to say hi and share how the Smart Data Initiative could be useful to you: from programming to speakers, to topics that are on your mind that you’d like us to take on.
Unsupervised learning and personalisation: useful but not for all
The contention that personalisation can distract from the editorial mission rests on one large assumption: that a personalisation algorithm can only consider all links “equally.” That is, a website with one million articles will treat all one million of those content links as equal to start.
And if there is one thing a newsroom will agree on, it’s this: Not all articles are equal at all.
Now, this assumption — that personalisation of the content selection will treat each article as an equal — is correct for unsupervised recommendation. (“Unsupervised,” in the context of machine learning, means that algorithms are working from untagged data.)
Say the recommendation engine is given only articles on the one hand, and user behaviour toward these links on the other. The recommendation engine, no matter what algorithm is used, is never going to understand that some articles are “not equal” (more editorially meaningful) when looking just at the article side of things — simply because there are no parameters attached to the articles to differentiate them in the first place.
The algorithm will be able to act on the articles based on factors derived from user behaviour, like “highly shared” or “good scroll depth.” But it doesn’t recognise importance, just popularity. So the article about Beyoncé is likely to rank above the deeply reported investigation into corruption at some institution.
Unsupervised learning can still produce some very good personalised feeds. The New York Times found several years ago that the Latent Dirichlet Allocation topic-extraction algorithm was more successful as a recommendation method than others — meaning that letting recommendation run on a clustering of articles by topic did correspond to an increase in reading.
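To make that concrete, here is a minimal sketch of topic-based recommendation with LDA, using scikit-learn. The toy corpus, the choice of two topics, and the “recommend from the same dominant topic” rule are all illustrative assumptions, not the Times’ actual pipeline.

```python
# Sketch: clustering articles by topic with Latent Dirichlet Allocation
# (LDA), then recommending from within the reader's dominant topic.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

articles = [
    "the election results and the new parliament",
    "the band announced a world tour and a new album",
    "parliament debates the new election law",
    "the singer released an album of live tour recordings",
]

# Bag-of-words representation of each article (no human tags involved)
counts = CountVectorizer().fit_transform(articles)

# Fit LDA with two topics (roughly politics vs. music in this toy corpus)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_mix = lda.fit_transform(counts)  # one topic distribution per article

# Recommend: for a reader who just read article 0, surface other articles
# whose dominant topic matches article 0's dominant topic.
dominant = topic_mix.argmax(axis=1)
recs = [i for i, t in enumerate(dominant) if t == dominant[0] and i != 0]
print(recs)
```

Note that nothing here encodes editorial importance — the clusters come purely from word co-occurrence, which is exactly the limitation the surrounding text describes.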
Unsupervised learning will eventually produce winners and losers, but they won’t be optimised to quality because quality is neither a provided input nor a measurable outcome. Clicks, shares, engagement minutes, subscriptions — these are, in analytics terms, “events.” And as events, they are trackable. Since they are trackable, they feed back into the recommendation engine as the success metric of the recommendation.
But quality is not a measurable metric. The model will only correlate with quality by chance (if it does), not by design.
It follows that if a personalisation strategy wants to use some differentiating factor of article quality (which is another way of saying “not all articles are equal”), this information can only come from a tag — a tag created by the humans with the most competence in assessing article quality. The approach also works with a score given by humans: a score allows sorting and the subsetting of articles into different value groups. This is an approach NZZ once took with its personalised news app.
Supervised learning of recommendation algorithms
Which leads us to supervised learning — for those areas of a publisher’s site where personalisation could be an option, as long as the recommendation algorithms behind the personalisation are trained on data that reflects how a human would judge every piece.
This additional structured data (tags, scores) opens up the option of so-called supervised learning. The recommendation algorithm can now train on an enriched data set that, specifically, includes human judgment of the article. The outcome of such a personalisation will feel much more like the “perfect outcome” of personalisation: the impression that humans actually produced every single feed for each individual user.
Human-like recommendations aren’t as far-fetched as they may sound: The New York Times has a project in the works where editors are in the loop of the recommender system to better inform and corral it.
An Indian publisher was recently telling me about their current large effort to cohortise users, in part to support their personalisation effort. To be clear, this is hugely important because it’s a dimension of personalisation that can support efforts well beyond content recommendation. Content tagging will, obviously, readily be useful for content recommendation and, to a lesser extent, for advertising on the page. But it cannot readily inform customer journeys, whereas audience cohorts can.
Audience cohorts also give you trends — and there’s an algorithm you have no doubt encountered that’s based on capturing these audience trends, which is Netflix’s recommendation engine (this excellent talk from 2018 explains). But audience cohorts are orthogonal to article quality — that stuff the newsroom worries about when they hear a recommendation engine is moving in where manual curation used to live. And it will not have escaped you that, crucially, Netflix’s recommendations make no attempt to rank content on your screen on the basis of editorial quality. At Netflix, the worst and the best movies compete on an equal footing regardless of quality: what matters is the likelihood you’ll want to watch them.
For personalisation to be informed by a human-like understanding of quality, the algorithms used to produce these recommendations must be trained on data that includes quality as one of the training factors. If there are three bins of articles for the site — one with “super important articles” and two other bins for “cool stuff” and “less essential” — you can write rules stating that certain areas of the site will use a personalisation rule that dips only into the “super important articles” bin.
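The binning rule above can be sketched in a few lines. The bin names mirror the example in the text; the articles, scores, and function name are invented for illustration.

```python
# Sketch: rule-based selection over editor-binned articles.
# Editors assign each article to a bin; engagement-style scores then
# rank candidates *within* the allowed bin(s) for a given slot.
articles = [
    {"id": 1, "bin": "super important", "score": 0.91},
    {"id": 2, "bin": "cool stuff", "score": 0.98},
    {"id": 3, "bin": "super important", "score": 0.75},
    {"id": 4, "bin": "less essential", "score": 0.99},
]

def top_slot_candidates(articles, allowed_bins=("super important",)):
    """Only dip into the allowed bins for this area of the site,
    then let popularity-style scores rank within the pool."""
    pool = [a for a in articles if a["bin"] in allowed_bins]
    return sorted(pool, key=lambda a: a["score"], reverse=True)

print([a["id"] for a in top_slot_candidates(articles)])  # → [1, 3]
```

Note that the highest-scoring articles overall (ids 2 and 4) never surface in this slot: the editorial bin acts as a hard filter before any engagement signal is consulted.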
Within these sets of binned content, you don’t necessarily have to write rule-based personalisation; the same sort of unsupervised learning you might use to power a widget far down the page can be used. NPR took a version of this approach with its NPR One curated story app a few years ago, and the way human editors tipped the scales was quite straightforward — an “editorially conscious algorithm,” as it was called then.
In 2020, CNN bought a company called Canopy that was working on a personalisation tool leveraging human inputs to be integrated into its wider product offering.
It is that approach — where the newsroom has essentially infused its reading of quality into the set of articles the recommender is working with — that will enable personalisation outcomes to align with the newsroom’s appreciation of quality.
Further afield on the wide, wide Web
One good read from the wider world of data. This week: a view of human-algorithm feedback loops, from a social science angle, by two Cornell University researchers. It’s … dense. But it’s an amazing jumping-off point into all the ways adaptive algorithms need checks and balances lest they mirror our less admirable human traits. (Social Science Research Council)
Dates to remember
The Smart Data Initiative’s first master class for 2022, “Transforming What We Build Using Data,” starts tomorrow (March 10). See you there!
Meet the community
For each installment of this newsletter, I am hoping to introduce one member of the community in this space. Can we introduce you? A few questions here to get to know you better. Thanks!
About this newsletter
Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company, Helio.cloud.