Logistically, where does data governance fit in your media company?
Smart Data Initiative Newsletter Blog | 02 June 2022
In my last newsletter, we kicked off a lil’ dive into data governance — that complex beast. We looked at how governance touched on many preoccupations in your wider organisation — data design and compliance among them.
This week, we’re looking at two specific angles on the topic:
How and when to evaluate evolving your governance.
The proposition (which isn’t prevalent on our shores) of data governance as a function that stands more separate from everyday data operations.
We’ll look at data stewardship and data design next. I know these topics can be sensitive because creating data that is equally geared to very different functions in the organisation is not exactly easy. Is this something you’d want to tell me about at your organisation? Have a brilliant idea for how things should be done? Reach out at ariane.bernard@inma.org.
For now, let’s get on with this week’s dive.
All my best, Ariane
Data governance: when and how to decide where to place the function in your data organisation
Earlier this year, I talked to Gilad Lotan, BuzzFeed’s VP: “The governance piece — and setting up transforming an organisation’s data operation — is definitely an area I think about a lot, and I know many folks are struggling. It’s not as flashy as talking about our latest and greatest machine learning algorithm, but in a way it’s much more important.”
Gilad shared some of his organisation’s recent technical and organisational overhaul with INMA’s spring Smart Data master class series, and his remark highlights the layered connection between data governance and the more practical matters of how our data stack works.
As we rethink or expand our capabilities, we also reassess (or entirely discover) new problems that tie with our soon-to-be-expanded capabilities. More data flows. More data points. More data-powered systems. To rearrange a quote from the philosopher Biggie Smalls: “Mo Data Mo Problem.”
So perhaps this problem is on your mind as a growing concern as your data organisation matures and things only seem to get more complex. And to be sure, if you are realising you are having growing compliance problems, these can be quite symptomatic of governance issues.
A classic symptom shows up when your organisation receives Right to Be Forgotten requests from EU data subjects (or California data subjects, where the law is very similar), yet the same individuals have to make repeated requests, for example for e-mails to stop coming to them. This means either that broken data flows have split user data across systems that don’t talk to each other, or that your team doesn’t have a clear view of the various paths the data takes. Either way, these Right to Be Forgotten requests aren’t executed fully when they come through.
Not only may you be getting a warning from a country’s data regulator (and don’t let that happen too often lest it become a full-on investigation and later a fine), but you are also expending valuable internal resources chasing data across several broken-up tasks when these requests come in.
A reorg or a big engineering overhaul makes an interesting juncture to rethink your data governance and frontload it: As you reorganise or rebuild large data systems, you’re essentially auditing your own data organisation. You’re laying out a clean future picture of your data flows and the various systems that each contribute data but also use it. You may just be at a time where you have your clearest picture of your organisation and how what it does touches everything else.
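To make the failure mode above concrete, here is a minimal Python sketch of an erasure request being fanned out across systems. Everything in it is an assumption for illustration: the `DataSystem` class, the system names, and the idea of a central inventory are hypothetical, not a description of any particular publisher’s stack.

```python
from dataclasses import dataclass, field


@dataclass
class DataSystem:
    """A hypothetical system holding user data (CRM, e-mail tool, analytics)."""
    name: str
    records: dict = field(default_factory=dict)  # user_id -> stored data

    def erase(self, user_id: str) -> bool:
        """Delete the user's data; return True if the user is gone afterwards."""
        self.records.pop(user_id, None)
        return user_id not in self.records


def process_erasure_request(user_id: str, inventory: list[DataSystem]) -> dict[str, bool]:
    """Fan one erasure request out to every system listed in the inventory."""
    return {system.name: system.erase(user_id) for system in inventory}


def systems_still_holding(user_id: str, all_systems: list[DataSystem]) -> list[str]:
    """Audit step: name every system that still stores the user."""
    return [s.name for s in all_systems if user_id in s.records]


# Hypothetical example: the inventory forgot the newsletter tool.
crm = DataSystem("crm", {"u1": {"email": "reader@example.com"}})
newsletter = DataSystem("newsletter", {"u1": {"email": "reader@example.com"}})

process_erasure_request("u1", inventory=[crm])
leftovers = systems_still_holding("u1", [crm, newsletter])  # the newsletter tool still has the user
```

The point of the sketch is that the request is only as complete as the inventory behind it: a system missing from the list keeps e-mailing the reader, who then has to complain again.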
Karine Serfaty, chief data officer of The Economist (and one of the Smart Data Initiative’s advisory board members), also used the occasion of a reorg and technical upgrade to examine where the data governance functions would exist. At her organisation, data governance is blended in with other daily concerns like data architecture. It’s a close-to-base function, deeply built into operations.
This is a model that, based on my own sampling of data organisation leaders (definitely not an exhaustive or even representative list, to be sure), appears to be more common — the governance function being deeply woven with operations.
The other factor that will impede or enable governance is when it is brought in as you build out your data organisation and the data itself. There seems to be agreement that, ideally, you start with governance as early as you are able to.
“I had the pleasure here to start from scratch,” said John Souleles, the chief data officer at Torstar in Canada and also an advisory board member of the Smart Data Initiative. “And I knew from the beginning when you’re building out the data warehouse … a lot of people think of data governance [is] at the end. We put it in at the beginning. In other companies, where you have legacy processes, usually governance is done at the end, because it’s a mishmash of legacy databases that you’ve got to try to figure out. But if you’re starting from scratch, it was easier to do, and we put governance and security at the beginning.”
As John noted, it may be “easier” but it’s certainly not “easy.” That is not so much because data systems are complex, but rather because organisations are complex, and ultimately governance is about organising humans around rules they may or may not be interested in following.
Still, what John notes here is that there is something more organic to thinking of governance earlier rather than trying to retro-engineer it on existing frameworks. And as your organisation reorganises teams or systems, frontloading governance into these projects can be a viable way to attack the governance beast, piece by piece.
Data governance: the case for a stand-alone function
In media, data governance tends not to be a stand-alone function — at least at the present moment. It tends to be handled by a mix of IT folks (integrity of system access), some engineering stakeholders with good vision over your platform architecture, and key data stakeholders who are closer to data design (and so understand, roughly speaking, what data flows are in service of what piece of your data-driven product).
In organisational terms, this means data governance is often handled by folks who combine both strategic and operational responsibilities.
As someone who comes from product, I have a favourable bias toward this type of scenario. As many a product person will tell you, there’s nothing more frustrating than being handed a mandate from the ivory-tower strategy team — which entirely ignores how this company actually works or how systems actually get built.
But as data organisations mature and grow, the usefulness, and perhaps even the necessity, of a more (if not fully) dedicated data governance function grows.
News UK, for example, expanded its data governance function. “We have built a bigger team for governance, giving them more influence and more power,” said Pedro Cosa, News UK’s data general manager, when he presented at the Smart Data Initiative module of INMA’s World Congress of News Media earlier this month. “Governance was sitting somewhere as an unknown team, unknown function. We definitely wanted to elevate data governance as something that can unlock a lot of value for us.”
I found this remark particularly interesting because, dear reader, I had never thought of governance as a place to create strength for the data operation. My thinking of governance was that it was a place to create clarity, order, enshrine non-arbitrary rules for decision making.
I thought of governance the way I think of good management in general: If a group of individuals had perfect vision of the totality of your systems, they’d naturally be able to bring more order to it. There would be fewer aberrations, repetitions, dead branches. Better documentation. All good and desirable things, certainly. But I hadn’t particularly considered the angle of governance needing to exist from a place of strength.
But strength, of course (she says now), is what it’s about. Because something that is easily overlooked when strategy and operations are separate is that strategy is inherently weak and operation is inherently powerful.
Put another way: You can spell out policies and goals, but whoever is the doer still gets the final say. When the ivory-tower strategy team comes down to deliver an idealistic mandate for the product/engineering team to execute, they forget that while the product team may seem like they have been steamrolled, they will ultimately do what they want.
So how do you give strength to governance?
One avenue is creating it as a stand-alone function, like Pedro Cosa mentioned. It means that you’re not so much splitting the focus of folks sitting astride operational and strategic purviews. And this model, while less common in media, has been prevalent in other “hard” industries for a while.
A couple of weeks ago, in a meeting with the Smart Data Initiative advisory board, Torstar’s Souleles mentioned his own experience working at Bell Canada, the telecom company. “We had a full-time team focused on that because in a big organisation with 55,000 employees and …. millions and hundreds of metrics, you needed that functionality. It wasn’t a big team ... maybe two or three people,” but their goal was to make sure there was alignment on the definition and usage of data points even before this data was created.
Pedro’s own goal with governance was preventing what he called the “mushrooms” of data that would crop up with repeating or overlapping data sets and tools. But what John identified as a key preoccupation was data design — the creation of metrics or data items that would be coherent across the company.
The presence of a dedicated function would mean that the single focus of a group of collaborators is to create a coherent, strong foundation for the data organisation to work — and for the organisation to lean on data that is coherent throughout.
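The alignment on definitions that John describes can be pictured as a small shared registry that teams consult before creating a new metric. The Python sketch below is a hypothetical illustration only: `MetricRegistry`, the metric name, and its definitions are invented for the example and do not describe how Bell Canada, Torstar, or News UK actually work.

```python
class MetricRegistry:
    """Hypothetical single source of truth for metric definitions."""

    def __init__(self) -> None:
        self._definitions: dict[str, str] = {}

    def register(self, name: str, definition: str) -> None:
        """Record a definition; refuse a second, conflicting one."""
        key = name.strip().lower()  # treat "Active Subscriber" and "active subscriber" as one metric
        existing = self._definitions.get(key)
        if existing is not None and existing != definition:
            raise ValueError(f"'{key}' is already defined as: {existing}")
        self._definitions[key] = definition

    def lookup(self, name: str) -> str:
        """Return the agreed definition for a metric name."""
        return self._definitions[name.strip().lower()]


# Hypothetical usage: one team defines the metric, another tries to redefine it.
registry = MetricRegistry()
registry.register("Active Subscriber", "Paid account with a visit in the last 30 days")

try:
    registry.register("active subscriber", "Any paid account")
except ValueError as conflict:
    print(f"Needs a governance decision: {conflict}")
```

The design choice worth noticing is that the conflict surfaces before the second data set exists, which is the moment a dedicated governance team is best placed to arbitrate.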
I still think a purely strategic mandate has the inherent fragility of being subject to operational “interpretation” (whether willfully different or just not being executed quite like what was put on paper). But putting governance as its own function does address one fragility of such a purview: When governance is wrapped in the jobs of collaborators with other responsibilities, these collaborators just don’t have singularity of focus. By definition. And everybody looks for ways to bend the rules and for an easier path here or there.
Our messy data is not usually made messy by any one bad action or thoughtless application. It’s a collection of small adjustments and corners cut — of rules we ignored “this one time.”
So this place of strength that Pedro was speaking of is actually a place of focus. Governance made stronger because it is, in fact, the only preoccupation of a group of people rather than a general responsibility spread across various functions whose purviews, on paper, aren’t strictly governance.
Further afield on the wide, wide Web
“The introduction of AI in the news risks shifting even more control to and increasing the news industry’s dependence on platform companies. While platform companies’ power over news organisations has to date mainly flown from their control over the channels of distribution, AI potentially allows them to extend this control to the means of production as the technology increasingly permeates all stages of the news-making process.”
This is the central thesis of Uneasy Bedfellows: AI in the News, Platform Companies and the Issue of Journalistic Autonomy by Felix Simon, a researcher at Oxford University. In it, Simon posits that there is an opportunity but also potentially intractable future dependencies (vendor lock-in being one of them) in the “easy AI in a box,” where companies are leveraging the Googles or AWSs of the world.
As with a lot of things, your own read on this thesis will hinge on where you consider the platform vs. media company relationship falls on the symbiotic axis from parasitic to mutualistic.
Since I know this isn’t exactly light reading, I’ll also share this far lighter read (abstract from Synced) on an AI system at Berkeley that won a crossword puzzle tournament. Some people enjoy pets dressed as humans; I like computers dressed as humans.
And also
Jodie Hopperton, the lead for the Product Initiative at INMA, convened a small panel to talk about personalisation at the final session of the INMA World Congress last week. Greg Piechota, the lead for the INMA Readers First Initiative, and I shared some of what we saw as some of the strategic considerations for publishers who are thinking of starting or ramping up their efforts there.
If you haven’t caught up with this, you can catch the replay here.
This is my segue to say that I am preparing a report on personalisation, so please do reach out to tell me about what your company is working on in this area!
About this newsletter
Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company, Helio.cloud.
This newsletter is a public face of the INMA Smart Data Initiative. You can e-mail me at Ariane.Bernard@inma.org with thoughts, suggestions, and questions. Also, sign up to our Slack channel.