It is summer. My American friends broke out their barbecue for the Fourth of July. My French friends kicked off a successful airport strike that stranded a ton of luggage. It is summer. All the rituals are there.
Which is why, in a bit of counter-programming, this week’s newsletter doesn’t deal with anything light or fresh but, instead, with the continued confusion over the legality of Google Analytics in the context of European data privacy legislation.
But, since it’s summer, I am hoping those of you who are not yet at the beach may have a little more flexibility in your schedules and perhaps time to chat with me? Tell me what’s on your mind or what your team is working on. My inbox is at firstname.lastname@example.org.
All my best, Ariane
No, Google Analytics didn’t get banned in [insert Euro country]
The latest such headline comes to us from Italy and exists in two flavours: “Italian data authorities declare Google Analytics illegal” and “Italian data authorities warn over Google Analytics.” Besides the fact that these are very different headlines yet refer to the same recent legal decision, only one of these headlines is correct. The second one: “Warns over.”
The more sensationalist headlines will also remark that other European countries (France, Austria) have found some aspects of Google Analytics, namely data transfer from Europe to the U.S., to be illegal. Again, depending on the clickbait level of the headline, you’d read something between “is illegal” and “warns over.”
So first, the extremely abridged summary:
A 2020 European judicial decision called “Schrems II” (after the activist who brought the case, Max Schrems) struck down Privacy Shield, the agreement between the EU and the U.S. that allowed transatlantic data transfers while guaranteeing that the standards of the GDPR, the EU’s data privacy legislation, were upheld.
Schrems II opened the door to various legal actions brought by activist groups, usually against large targets like Google Analytics. GA is compliant within the letter of Privacy Shield. But, since Privacy Shield has been struck down, GA is no longer compliant with GDPR unless a site using it also configures its installation with tighter settings that deliberately do not collect various pieces of information (making the GA information less useful to the publisher but allowing for better compliance, since there is more anonymity for the end user).
Put another way, GA “out of the box” is likely not compliant, but a properly managed GA is. (I have to disclaim, again, that I am taking huge shortcuts in setting all of this up, so take this description as your CliffsNotes here.)
So now, today, and these headlines. At the moment, there is only one possible outcome when this type of case is brought before a European data agency, which is to rule that using GA is likely not compliant (this is why “warned over” is the correct headline, and “is illegal” is not). That is, GA, deployed in its “default” state, is not compliant. This is because the agreement that was giving it its compliance umbrella, Privacy Shield, is no longer in place. There is therefore nothing surprising about these activist cases reaching the conclusion that they do.
But, and this is the important thing: We’re talking about organisations using GA without some of the more advanced features that put GA in line with current European data requirements.
One thing you’ll notice is that these cases are often brought against smaller Web sites — presumably with fairly plain-vanilla installations of Google Analytics. My perspective is this is very deliberate on the part of the folks bringing forth these cases.
The goal here isn’t really to disrupt Web site publishers. And if activist cases were being brought against larger publisher targets, the case actually wouldn’t be as useful. A larger target, risking heavier fines, would work hard at rejiggering (overwhelming use of technical terms here, hope you appreciate it) their install of Google Analytics to be compliant in the current state where we do not have a Privacy Shield agreement.
The publisher would certainly be unhappy since their data would essentially be amputated, but, well, they’d do it.
If you want to dig deeper: The path to making GA compliant with GDPR requires deliberate steps, both in how consent is collected and in how, and what, data is collected in the first place (anonymised on entry, essentially). This post explains this very well. And, if you want to dig into the more deeply legal angles, this Q&A from the law firm Hogan Lovells is also super helpful.
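For the engineers in the audience, here is a minimal sketch of what “consent first, anonymised on entry” could look like in a collection pipeline. It is not how Google Analytics is implemented and it is not legal advice. The function names (`anonymise_ip`, `build_event`), the event shape, and the choice of how many address bits to keep are all hypothetical, picked for illustration only.

```python
import ipaddress

# Assumption: how many leading bits of an address to keep before storage.
# These values are illustrative, not a legal or vendor standard.
IPV4_KEEP_BITS = 24   # drops the last octet of an IPv4 address
IPV6_KEEP_BITS = 48   # drops the last 80 bits of an IPv6 address


def anonymise_ip(raw_ip: str) -> str:
    """Zero out the trailing bits of an IP address before it is stored."""
    addr = ipaddress.ip_address(raw_ip)
    keep = IPV4_KEEP_BITS if addr.version == 4 else IPV6_KEEP_BITS
    # strict=False lets ip_network mask the host bits instead of raising.
    network = ipaddress.ip_network(f"{addr}/{keep}", strict=False)
    return str(network.network_address)


def build_event(raw_ip: str, page: str, consent_given: bool):
    """Return a storable event, or None when the visitor has not consented."""
    if not consent_given:
        # No consent: nothing is collected at all.
        return None
    # The full IP never leaves this function; only the truncated form does.
    return {"ip": anonymise_ip(raw_ip), "page": page}


if __name__ == "__main__":
    print(build_event("203.0.113.77", "/news", consent_given=True))
    print(build_event("203.0.113.77", "/news", consent_given=False))
```

The point of the sketch is the order of operations: the consent check comes before anything is recorded, and the truncation happens before the data is stored or sent anywhere, which is also why the resulting data is less precise for the publisher.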
Privacy Shield II, Google Analytics (and the others): Where are we headed?
In March this year, the U.S. and the EU conveyed that they had landed on the principles of a Privacy Shield replacement, the details of which are still being hammered out. It’s taking a while (probably until the end of the year). Expect that any self-respecting privacy activist group will of course give a good kick to the tires of whatever agreement is proposed, and so, if you were a U.S. or EU legislator, you’d make sure to leave no stone unturned on this one.
But still, there are two scenarios:
We get Privacy Shield II and activists find angles to bring to court. It will still take a few years between PS II and a new court case leading to invalidating it, so another few years with a Privacy Shield agreement. And then rinse, repeat.
We get Privacy Shield II, and it’s the world’s most perfect and final legislation. No one challenges it. It is accepted as the final law of the land for years to come.
This second scenario is not the most likely one. Beyond the highly political nature of what the various parties involved may consider to be privacy, there are two reasons: legislators have not historically had a great command of the (admittedly complex) technical ramifications of what these laws actually govern, and gray areas are bound to keep appearing as our technical tools continue to evolve.
So my betting dollar would be on the first scenario, which means: We should get used to reading fuzzy headlines that call this or that technical product or platform “illegal” and learn not to immediately worry that we have to replatform the technology in question.
You’re a publisher, and the headlines have you worried
… or maybe your head of legal would like “a chat” about Google Analytics. Last week, I asked a few publishers on both sides of the Atlantic how they were handling the recent news and how it was affecting their long-term strategy.
For a premium British publisher with a worldwide readership: “It’s on our radar for sure.” Their original assessment was that they were safe. Still, this publisher notes, “it’s a flag and we need to dig deeper in general in how we approach these issues from a regional perspective, not just the EU.”
A top American publisher said this has only solidified their drive to keep building their own tools, though they do rely on Google Analytics and remain cautious.
Still, no one is jumping ship on this one, because there is another angle to all this: the technology itself.
Concurrently with all of this, Google may also take the bull by the horns and figure out some ways to keep all EU data in Europe. There are also possible approaches where encryption keys held only by the publisher may make some of the issues moot, though not everything can be encrypted within an application, so this is not a panacea.
As the British publisher noted, “We can make a whole bunch of changes, but it’s likely Google will make changes and shore things up faster than we can make our own changes. And then what’s the point?”
And that’s actually where I would make my biggest bet (a drink at the bar): There is just too much riding on a good solution being designed for one not to be found.
Google built self-driving cars, and data transfer seems like a significantly easier issue. Not an easy issue, but an easier issue. Do I think Google can figure out some way to build a privacy-first Google Analytics? I absolutely do. This may require a significant overhaul of how GA is engineered (Google, if you’re reading this, I am writing this from the peanut gallery and have no idea how you’d do this). All of this is, really, economics and incentives. There is enough size and value at stake for Google to solve this market issue, even at a large cost. Europe is a large market, and enough countries are in the process of adopting their own flavour of data privacy laws that digging into this problem is worth it.
The biggest reason I’d put my drink bet on this isn’t even because of the market size argument. It’s that bad press has a cost too, and that cost reflects on all of Google’s products and image — not just Google Analytics.
Right now, the approaches proposed by Google require significant expertise, and an awareness of the gotchas, on the part of a publisher to know just how to configure their Google Analytics install to be in compliance. This is in part why cases are being brought (successfully) by activists (the other reason is the lack of a legislative framework to afford more flexibility).
The confusion caused by these headlines, the headaches for data protection officers, and the general complexity for analytics and data engineering teams do not add up to a sustainable place for Google to be. As any PR person would confirm: Even if your product does not in fact kill baby seals, you don’t want to see headlines that suggest that it might.
Further afield on the wide, wide Web
- Barr Moses, the CEO of a data engineering start-up, shared a playlist that will appeal to the data engineers among us.
- And a high-level question for data science: Is deep learning going to replace (traditional) machine learning (read: machine learning working from untagged data vs. tagged and cleaned data)? I feel like the answer should be that we should leave an AI to crunch through that question and tell us, but, otherwise, the article from Open Data Science offers a perspective on the matter.
On deck for this summer
I’ll be looking at open-source options for analytics (consider it part II of this column); natural language processing (should we translate all the articles?); and responsible AI and data science ethics. Write to me if you’d like to chat about these topics as I dig into them further: email@example.com.
About this newsletter
Today’s newsletter is written by Ariane Bernard, a Paris- and New York-based consultant who focuses on publishing utilities and data products, and is the CEO of a young incubated company, Helio.cloud.