Sydney Morning Herald uses systematic experiments to enhance product, services
Product and Tech Blog | 20 May 2019
“If a tree falls in a forest and no one is around to hear it, does it make a sound?”
We’ve all seen versions of that timeless metaphor used to convey a simple message that has nothing to do with trees, forests, or sounds. I’m no exception, so please forgive my use of it to explain what I’ve been doing at The Sydney Morning Herald for the past six months: digital experimentation.
As experimentation manager at Nine, my role is to use experiments to validate assumptions we make about our users — essentially, eliminate as much of the guesswork as possible and instead base decisions on real-world data. Ultimately this should optimise how we present our journalism and our subscription offering more broadly, while saving time and resources on pursuing ideas based on false assumptions.
If we want to make a change to our Web site — whether it’s a new layout for the homepage, a new marketing offer on our paywall, or a new concept on our article page — it’s my role to test this change on a small portion of our audience, observe the impact this has on user behaviour, and then recommend whether that change should be implemented more widely, tweaked, tested again, or abandoned entirely.
My chief concern when I started in the role was not how to run experiments, but how to ensure the lessons of those experiments led to meaningful improvements for the publication.
To be blunt, before I even thought about that, my main thoughts were actually, “Will anyone even know what I’m doing, and, if they do, will they care?”
It was for this not-so-unselfish reason that we formalised a process for the creation, prioritisation, implementation, and reporting of experiments on our Web sites.
Using a combination of Google Docs, Slack, Confluence, and JIRA, we’ve ensured all relevant areas of our digital business — from editorial to product, technology, and marketing — know that an experiment is taking place and exactly where to find the results.
All stakeholders — but specifically those in product management, design, and development — are encouraged to contribute ideas to our experiments backlog, where every item has a clearly stated hypothesis, metrics, goals, and assumptions.
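To make that concrete, here’s a minimal sketch of how a backlog entry could be represented; the field names and example values below are illustrative, not our actual template:

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentIdea:
    """One entry in the experiments backlog (illustrative fields only)."""
    title: str
    hypothesis: str       # the change we believe will help, and why
    primary_metric: str   # the metric that decides success
    strategic_goal: str   # the business target the experiment supports
    assumptions: list[str] = field(default_factory=list)


idea = ExperimentIdea(
    title="Homepage component position",
    hypothesis="Moving the component two-thirds up the page will lift article click-throughs",
    primary_metric="Homepage-to-article click-through rate",
    strategic_goal="Grow engagement with our journalism",
    assumptions=["Most users scroll at least two-thirds of the page"],
)
```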
Crucially, every experiment must directly relate to one of the publication’s strategic targets, ensuring it has a clear link to a practical business goal rather than simply being someone’s pet project with an aim of “let’s just see what happens if we do this.”
Every experiment in the backlog is then assigned a dollar value to assist in the prioritisation process, using a formula similar to the one outlined by The Wall Street Journal’s Peter Gray in his INMA blog post last year.
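I won’t reproduce that formula here, but to give a flavour of the kind of calculation involved, here’s a simplified, hypothetical expected-value sketch; it is not Peter Gray’s formula and the numbers are not ours:

```python
def expected_value_aud(audience_reach: int,
                       baseline_rate: float,
                       assumed_relative_lift: float,
                       value_per_conversion: float,
                       confidence: float) -> float:
    """Hypothetical expected dollar value of an experiment idea.

    audience_reach         users exposed to the change over a year
    baseline_rate          current conversion rate for the target action
    assumed_relative_lift  relative improvement we think the change could produce
    value_per_conversion   dollars each extra conversion is worth
    confidence             how confident we are the assumption holds (0-1)
    """
    extra_conversions = audience_reach * baseline_rate * assumed_relative_lift
    return extra_conversions * value_per_conversion * confidence


# e.g. a paywall tweak reaching 2M users a year, 1% baseline conversion,
# a hoped-for 5% relative lift worth $100 per conversion, 30% confidence:
print(round(expected_value_aud(2_000_000, 0.01, 0.05, 100, 0.3)))  # 30000
```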
Prioritisation sessions are held monthly, with representatives from our product, technology, data, editorial, and subscriptions and growth teams coming to a collective decision on which experiments should be run in the upcoming period.
The experiment is then implemented — like many publishers, our tool of choice is Optimizely — often with the assistance of an engineer within the relevant tech squad.
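Optimizely handles variant assignment for us, but as a rough, hypothetical illustration of how users can be split so that only a small portion of the audience sees a change, a deterministic bucketing function might look like this (this is not Optimizely’s actual mechanism):

```python
import hashlib


def assign_variant(user_id: str, experiment: str, traffic_share: float = 0.1) -> str:
    """Deterministically bucket a user into the test or leave them out.

    Only `traffic_share` of users see the experiment at all; the rest keep
    the existing experience. Hashing (experiment, user) keeps the split
    stable across visits.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # roughly uniform in [0, 1]
    if bucket >= traffic_share:
        return "excluded"
    return "treatment" if bucket < traffic_share / 2 else "control"


print(assign_variant("user-12345", "homepage-component-position"))
```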
We generally run each experiment for two weeks, which gives us enough time to build up a statistically significant sample while also countering some of the volatility of the daily or weekly news cycle.
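To show why a window of that size matters, here’s a simplified sample-size calculation for a two-sided, two-proportion test; the baseline rate and target lift below are purely illustrative:

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline_rate: float,
                            min_relative_lift: float,
                            alpha: float = 0.05,
                            power: float = 0.8) -> int:
    """Approximate users needed per variant to detect a relative lift in a
    conversion rate with a two-sided, two-proportion z-test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    pooled = (p1 + p2) / 2
    n = ((z_alpha * (2 * pooled * (1 - pooled)) ** 0.5
          + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
         / (p2 - p1) ** 2)
    return ceil(n)


# e.g. detecting a 10% relative lift on a 5% click-through rate needs
# roughly 31,000 users in each variant:
print(sample_size_per_variant(0.05, 0.10))
```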
When the experiment concludes, the results are synthesised and presented back to the key stakeholders who came up with the experiment idea, and then distributed to the wider business.
So, going back to our shaky metaphor, the tree has fallen and the sound has been heard, but what are we going to do about it?
This should be the question hanging over every experiment, and all data work more broadly. No matter how pretty the graphs are, data is only powerful if it’s used to inform decisions.
So far, we’ve had a few clear examples of experiments influencing decisions after using the process described above:
- A component on the homepage was going to be moved from the bottom of the page to two-thirds up, until an experiment showed this would actually lead to a decrease in overall click-throughs from the homepage to articles.
- In another example, we showed that by simply changing the font in one of our homepage sections from italics to regular, we could increase the click-through rate by up to 16%. After making the change, the click-through rate has actually gone up by more than 30%.
Yet for other experiments, the results have proven less conclusive.
- A test of a new concept within our article pages showed there was strong interest among some user groups, but certainly not from all of them. The task now is to better identify those users and think of ways we can target them specifically with this new experience.
- The results of an experiment we ran on our paywall, in which we showed some users a sale price and some users a regular price, actually indicated that in some cases, users preferred the regular price, while in others, they preferred the sale price. Go figure!
This is why we’re constantly looking to refine the hypotheses, goals, and metrics we’re using to determine the success of our experiments.
The clearer we are in articulating these at the start of an experiment, the greater our chances of attaining powerful data learnings at the end of it. And ultimately, the larger the impact that data will have on influencing decisions across the business.
Now if you’ll excuse me, I’m off to fell a few more trees.