Governance is central to a media company’s data strategy
Smart Data Initiative Newsletter Blog | 19 May 2022
Hi everyone.
Last week we wrapped up the Smart Data Initiative module of the INMA World Congress. If you haven’t caught up with the recording, it’s available to you as an attendee.
We looked at the maturing data organisation, and this was my starting point for our next topical dive: governance. The mature data organisation values and strengthens its governance practice as it continues to expand its collection and usage of data, both to maximise the value of this data and to minimise the risk around it.
If these are topics you want to chat about, we’ve got a few weeks looking at various angles, so I’d love nothing more than to start a conversation: Reach out at ariane.bernard@inma.org.
And now, let’s get on with it …
All my best, Ariane
If data is a wheel with many spokes, governance is the hub
For the next few weeks, we’re going to dig into the topic of data governance and its related preoccupations: data stewardship and data design.
And at this point, 50% of you have already decided to move on to your next inbox item. But this topic is more likely to be connected to questions you’ve had before than you may realise. Keep reading if you’ve ever asked yourself one (or several) of these questions:
Could I access some sweet user data on {something}?
Why does it seem like two different parts of the data for {something} are describing the same thing? Or another version of this: Wait, I thought {this thing} was called {something else}.
Why does {Team A} have a data definition for {name} while {Team B} seems to use another definition?
How could I get someone to add some dimension to our data, because I’d like for us to collect {new information}?
All these questions connect with data governance, stewardship, and design.
And the difference between a young data organisation and a more established one is whether they have enshrined some best practices about these matters and have a process for how to move through these questions — a repeatable, accountable, documented process.
To my left: The young data organisation where, in service of a faster time-to-market, new data points are added as needed wherever they seem to best fit; and anyone with a good presenting reason can get whatever data they need (provided, of course, they know who to ask or how to get the data themselves).
To my right: The mature data organisation, which once enjoyed that more free-flowing way of operating but realised it was a liability in terms of data privacy, data quality, and scalability. And so, eventually, it got around to formalising its governance.
Data governance matters if:
1. You want to control the protocol and verification for getting access to data, so that only users or services that need certain sensitive data have access to it, and to enshrine objective criteria for getting access. In a media organisation, sensitive data can include things beyond just payment or user info: Someone’s reading history on your site is more sensitive than you may appreciate.
In fact, in the U.S. there has long been some legal protection for this kind of data being accessed: The 1988 Video Privacy Protection Act prevents the release of a patron’s video club rental records to just anyone who asks. Consider it an early example of data governance for media!
(Sidebar: This law still does get used in our digital media world. Some interesting modern-day cases rested on it.)
2. You want a pathway to stop the flow of data to users or services that should no longer have access to it. In media, our technology stacks tend to bring in various levels of vendor tech, and, particularly in advertising, there are third-party exchanges happening all over the place. So we can’t just open a data pipe to a system without a way to close it.
3. You want a way to audit who is using what data. In media, this one is particularly important in case of breaches, since one aspect of laws like GDPR is that the breached organisation has a requirement to inform the specific affected users (so, audit needs right there) and to convey what data was stolen.
But generally speaking, and even beyond the compliance need that rests on the ability to audit data systems and pipelines, there is also an efficiency issue: It is, alas, rather easy for a data-requesting system to make a lot of greedy requests of a data-giving system. The flow of data can quickly balloon and either incur unreasonable costs (all that flow of data usually racks up a bill) or put stress on those systems. So it’s in the organisation’s interest to have ways to observe data flows, so it can trim them down to “just” what is needed.
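To make the three governance needs above a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `AccessRegistry` class, the consumer names, and the dataset name are invented for illustration), and a real setup would sit on your warehouse's or platform's own permission system rather than a toy class. It only shows the shape of the idea: grants carry a stated purpose, grants can be revoked, and every read attempt leaves a trace.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessRegistry:
    """Toy registry: who may read which dataset, plus a log of every attempt."""
    grants: dict = field(default_factory=dict)     # (consumer, dataset) -> stated purpose
    audit_log: list = field(default_factory=list)  # one entry per read attempt

    def grant(self, consumer, dataset, purpose):
        # Need 1: access is granted explicitly, with a recorded reason.
        self.grants[(consumer, dataset)] = purpose

    def revoke(self, consumer, dataset):
        # Need 2: every grant can be ended.
        self.grants.pop((consumer, dataset), None)

    def read(self, consumer, dataset):
        # Need 3: every attempt is logged, whether it was allowed or not.
        allowed = (consumer, dataset) in self.grants
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "consumer": consumer,
            "dataset": dataset,
            "allowed": allowed,
        })
        return allowed

    def who_touched(self, dataset):
        # After a breach: which consumers successfully read this dataset?
        return sorted({entry["consumer"] for entry in self.audit_log
                       if entry["dataset"] == dataset and entry["allowed"]})


registry = AccessRegistry()
registry.grant("ad_exchange", "reading_history", purpose="frequency capping")
registry.read("ad_exchange", "reading_history")      # allowed
registry.revoke("ad_exchange", "reading_history")
registry.read("ad_exchange", "reading_history")      # refused after the revoke
registry.read("newsletter_tool", "reading_history")  # refused, never granted
```

Calling `registry.who_touched("reading_history")` then lists only the consumers whose reads were actually allowed, which is the kind of question an audit has to answer. The same log is also where you would start looking for the greedy requesters.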
Data stewardship matters if:
1. You want to make sure the data you are creating is coherent even as it evolves.
2. You want the data to serve stakeholders across a variety of organisational goals, even if it is primarily created or collected by a more limited part of the organisation. For example, subscription data may be created and collected by a team close to subscription engineering, but many teams across the organisation will be using this data. So the way we create this data is of interest well beyond the team that creates it.
Data design matters if:
1. You want to make sure we agree on the overall principles for how we describe things in our data: how granular we are, whether we take object-driven approaches or action-driven approaches.
2. You want to standardise what we call things, so that what one part of our data calls “an active user” has the same definition as an active user in other areas.
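As a small illustration of that second point, here is a sketch of what one shared definition could look like in Python. The 30-day window, the function name, and the sample users are all assumptions made up for the example, not a recommendation. The point is only that the definition lives in one place and every team imports it instead of hard-coding its own.

```python
from datetime import date, timedelta

# Hypothetical shared definition: one place, one owner, one meaning.
ACTIVE_WINDOW_DAYS = 30


def is_active_user(last_visit: date, today: date) -> bool:
    """A user is 'active' if their last visit falls within the shared window."""
    return timedelta(0) <= today - last_visit <= timedelta(days=ACTIVE_WINDOW_DAYS)


# Subscriptions and advertising both call the same function,
# so their "active user" counts cannot quietly drift apart.
today = date(2022, 5, 19)
last_visits = {"ana": date(2022, 5, 1), "ben": date(2022, 3, 2)}
active = [name for name, seen in last_visits.items() if is_active_user(seen, today)]
```

If the organisation later decides that "active" should mean 7 days rather than 30, the change happens once, through whatever process your data design practice sets for it.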
Compliance presupposes governance
Data governance will matter even if you don’t have it formalised yet. It is not an optional preoccupation, especially if you care about compliance, since it’s a building block of good privacy practice. There is not going to be privacy compliance around your sensitive data unless you’re able to account for who (or what) is accessing what data, why they are doing this, and how you can revoke these privileges. To borrow that old slogan from late-night newscasts in America: “It’s 10 pm, do you know where your data is?”
Consider these findings from the Centre for Data Ethics and Innovation, a UK government body, which are cited in a recent report from the Ada Lovelace Institute: In the UK, only “31% of people agree that ‘the digital sector is regulated enough to protect my interests,’ compared with 30% who disagree.” The CDEI survey also notes that “few people express confidence that there are protections in place around digital technologies.”
This is important to keep in mind because there are two angles to good data governance:
How it protects us when or if we are ever the subject of a complaint from a country’s data protection government body. Your data governance is an important part of proving you have reliable, repeatable processes for the handling of data in your organisation.
How we can represent to our users the care we will take to ensure their data is safe, judiciously handled, and protected, including within our own organisation.
When we ask ourselves “how could we get more users to give us data about themselves so we can improve something for them?” we have to remember this level of distrust. Can we reassure them in these moments by being able to communicate how, exactly, we use our data and who, exactly, has access to it by what rules?
If you think this risk of distrust is not as serious as it sounds, here is another statistic for you: The current rate of opt-in for tracking since Apple released its new App Tracking Transparency update is hovering around 25%, per the analytics provider Flurry. Put another way, a full 75% of users basically do not trust us enough to believe that what they stand to gain from allowing us to track them is worth the risk to their privacy.
But now, what does data governance look like for a smaller organisation?
Perhaps not so nailed down. And, to be clear, the fact that you don’t have formalised data governance doesn’t mean that you don’t have data governance at all. In fact, a small organisation or a young data organisation is probably doing some governance quite effectively, even if it isn’t enshrined in deep documentation.
The fact that the organisation is smaller can, in fact, give you the advantage of having only a few stakeholders as the gatekeepers to your data. And where the gatekeepers are few, they tend to have a good overall view of how things are run. (You could tell someone how everything works in your house. There is probably some housemaster who is in charge of Buckingham Palace. Do you think they are able to give precise directions to a specific type of fork? Or to the electric outlet nearest a particular window?)
Which is a paradox of good data governance: The best data governance is done with individuals who have a long vision for how the organisation runs and how your product is built. But the bigger the organisation, the less likely this vision is perfectly known to any one person. So while a larger organisation has more at stake in organising great governance, it certainly is harder to set up as you grow.
In our next newsletter, we’re going to look at a few approaches to set things up.
Further afield on the wide, wide Web
You know that Internet thing where you read a thing, and then you click another thing, and then another, and before you know it you’ve been having fun on the Internet for an hour but also you’re no closer to getting on with your day?
I guess I just described an Internet rabbit hole.
But now, I recently went down such a rabbit hole and the subject was habits, and then the subject was math somehow. And now I can say I didn’t just click around the Internet for an hour, but instead will call it “did some useful reading for the readers of my INMA newsletter.” Not a rabbit hole! Legit work business stuff!
In my previous newsletter, I mentioned my appreciation for the work of Cassie Kozyrkov, the chief decision scientist at Google. Decisions are a big part of algorithms (they are the outcome), and it turns out that in our own decision-making processes, we run algorithms, too. We decide when we stop our exploration and make a decision. It turns out that there is a final (?) mathematical answer to the question of when to stop exploring: Mathematicians say that we should ditch the first 37% of options we find on any question, and then settle for the next option that beats all of those first 37%. This great article from Big Think gives you the highlights of this research (and a link to the full not-a-rabbit-hole, should you decide to go there).
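If you would rather see the 37% rule than take it on faith, here is a small Python sketch. It assumes the classic textbook setup, which may be stricter than real life: options arrive one at a time in random order, each can be compared with those already seen, and a rejected option is gone for good. The function names are my own, and the cutoff uses 1/e (roughly 0.37), which is where the 37% figure comes from.

```python
import math
import random


def pick_with_37_rule(options):
    """Return the index chosen by the look-then-leap rule.

    Skip the first ~37% (1/e) of options, then take the first later option
    that beats everything seen in that first phase. If none does, we are
    stuck with the last option.
    """
    n = len(options)
    cutoff = int(n / math.e)  # size of the "just looking" phase
    best_seen = max(options[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if options[i] > best_seen:
            return i
    return n - 1


def success_rate(n=100, trials=20_000, seed=0):
    """Estimate how often the rule lands on the single best option."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        options = rng.sample(range(n), n)  # random ordering of n distinct scores
        if options[pick_with_37_rule(options)] == n - 1:
            wins += 1
    return wins / trials
```

Running `success_rate()` should land somewhere near 0.37: under these assumptions, the rule finds the very best option a bit more than a third of the time, which is the best any stopping rule can do in this setup.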
Date for the diary: Yesterday, but it’s not too late
We did a members-only Webinar yesterday. I hope you caught it. If not, as an INMA member, it’s free to watch here. If you’re in need of sharing with like-minded folks, we talked about legacy (legacy data, legacy practices, legacy mindset), with Jensen Boey, a head of engineering at SPH Media in Singapore, and Giulianna Carranza, the chief data officer at Grupo El Comercio in Peru.
About this newsletter
Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company, Helio.cloud.
This newsletter is a public face of the INMA Smart Data Initiative. You can e-mail me at Ariane.Bernard@inma.org with thoughts, suggestions, and questions. Also, sign up to our Slack channel.