Media companies should consider a self-hosted analytics solution

By Ariane Bernard


New York, Paris


I’m not suggesting removing your current provider if you’re using a SaaS solution like Google Analytics (my previous newsletter explains why I believe our current legal pickle will improve). But I am suggesting you add a brick to your stack so you can collect additional analytics — particularly from folks who do not accept your cookies. This also divides and conquers what you may do with your primary analytics versus your secondary set.

A self-hosted analytics solution may be a smart answer for media companies.

Two-pronged strategy

This is a strategy that does two things: 

  • It gives you a way to pick up basic analytics for the totality of your visitors (the cookie-havings and the not-cookie-havings).

  • It allows you to have a kind of “always there” batch. You don’t need to configure your self-hosted analytics with every kind of event you may already be tracking with your primary analytics suite (in fact, this has downsides, as we will see later).

I’m anonymising many publishers here because the spotlight is a bit harsh as you will see. But let’s say a hypothetical publisher (not you, of course) doesn’t have a CMP where they should have one, is dropping cookies alongside their consent banner, or is dropping their big analytics cookie before the cookie banner.  

“Legitimate interest per GDPR,” says the hypothetical publisher. 

At a high level, legitimate interest is meant to cover things like audience measurement for a publisher. But solutions like Google Analytics straddle a range of needs that goes well beyond audience measurement. And jurisprudence (from Germany in 2019) has established that GA needs to be part of a tracked consent strategy.

In addition, you would have to give up a lot of the features of your GA to pass off GA under the banner of legitimate interest.

Sidebar: You got cheated because I was shooting to give you something far more interesting than this general “give up lots of features.” But instead, I will share what I tried to do and what I didn’t succeed at doing, but it’s actually illustrative of the issue at hand. I originally went down a research journey to build a handy little table of what in GA would be “pure audience measurements” versus “marketing-y and requiring consent.” I read so much conflicting information that, dear reader, I’ll turn you back to your legal department instead. It’s a mess. But that’s the thing: The mess is why you don’t chance it. Put that GA cookie behind your CMP. 

Enter your self-hosted analytics tool, with a privacy-by-design bent. This one will be configured to take only basic analytics, the kind that would pass legitimate interest (as in, inherently wouldn’t allow you to fingerprint a user in any way). This cookie can load very high up, pre-consent.
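To make “only basic analytics” a little more concrete, here is a minimal sketch of what a self-hosted collector could keep from each hit. Everything in it is an assumption for illustration: the field names, the `minimal_pageview` and `coarse_device` helpers, and the choice of what counts as coarse enough. Whether a given set of fields actually passes legitimate interest is a question for your legal department, not for this snippet.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit


def coarse_device(user_agent: str) -> str:
    """Bucket a user agent into a broad device class (hypothetical buckets)."""
    ua = user_agent.lower()
    if "ipad" in ua or "tablet" in ua:
        return "tablet"
    if "mobi" in ua:
        return "mobile"
    return "desktop"


def minimal_pageview(raw: dict) -> dict:
    """Reduce a raw hit to a few coarse audience fields.

    Anything not explicitly copied here (IP address, user ID, full
    user agent, query parameters) is simply never stored.
    """
    page = urlsplit(raw.get("url", ""))
    referrer = urlsplit(raw.get("referrer", ""))
    ts = datetime.fromtimestamp(raw["timestamp"], tz=timezone.utc)
    return {
        "path": page.path or "/",                  # query string and fragment dropped
        "referrer_host": referrer.hostname or "",  # host only, never the full URL
        "device_class": coarse_device(raw.get("user_agent", "")),
        "hour": ts.strftime("%Y-%m-%dT%H:00Z"),    # timestamp truncated to the hour
    }
```

The design choice is an allowlist rather than a blocklist: the function names the fields it keeps, so a new identifier showing up in the raw hit is dropped by default instead of leaking through.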

Why you should consider a secondary analytics cookie

Upsides from a data building perspective

Besides giving you a good “whole audience” basic cookie and allowing you to keep your main analytics solution as full-featured as you want (respectfully behind your CMP, naturally, and taking in any current caveats introduced by our lack of Privacy Shield legislation), you also get these other benefits from this approach:

  • It allows you to create the minimal year of double data that you would want to have if you ever had to switch analytics tools. Basically, whether you think you may switch in a year, two years or five years, you’re beginning to build that doubled-up data. I wouldn’t suggest that you go about creating double data for no reason (there are costs as we will see). But since you are adding the privacy-first audience cookie, consider that you get a data set that you may also lean on should your strategy really change in the future.

  • Along the same lines, this secondary audience cookie gives you a path to do a kind of slow replatforming, if this were a decision you wanted to explore. As you overhaul parts of your property for reasons unrelated to your analytics strategy, and have to retag these pages for analytics as you do so, this is when you may want to onboard the double event reporting across your primary and secondary self-hosted analytics. It’s replatforming, but from a roadmapping perspective, it’s part of your other projects. It may take a few years to have any form of critical mass of useful analytics from the second self-hosted cookie, but you will have incurred minimal build cost for this.

Upsides from an organisation perspective

  • Your data science, marketing, and product teams get to learn how to use your new self-hosted platform — and what can be done with it — at a far more reasonable pace than if you were setting out a big replatforming project. If there are areas where your main platform and your new platform don’t have the same affordances, you’ll discover this organically. But it won’t be a five-alarm fire since you’re not leaning on this data at this point.

  • You also have some peace of mind that you can pull the fire alarm on your main analytics tool should some new legal challenge emerge. Your secondary cookie data wouldn’t be on par with your main analytics platform, but it can tide you over and give you a little bit of flexibility if you find that a particular country is mounting a legal challenge you couldn’t contend with quickly enough. Let’s hope things don’t come to that though.


Obviously, if this approach were only upsides, this newsletter would just be a paragraph long. So here are some downsides:

  • Loading another analytics tracker into your property would affect page speed.

  • The cost of running this additional tool. 

On the first, analytics tools aren’t usually the most offending scripts you load on a page. That title reliably gets awarded to the third-party scripts of your advertising (I will never not take that shot). Sure, the whole concept of optimising your site for speed is to take a harsh look at anything that’s not truly delivering a lot of value and take it out. The weight creep of trackers is a story of incremental lenience, more often than true mismanagement. Except bad ads. Have I mentioned the ads?

But this is why the secondary audience cookie shouldn’t try to be as full featured as your primary audience cookie. Don’t track every event. Take just what you need for that back-up. Over time, and as you expand the double tagging, the loading costs of your secondary analytics will increase. And it will be worth watching how these incremental events are affecting the script weight. 
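As a rough illustration of “don’t track every event,” here is a sketch of how double tagging could be routed. The event names, the `SECONDARY_EVENTS` allowlist, and the `destinations` function are all hypothetical. The assumption, following the setup described earlier, is that the primary tool only ever receives events once the visitor has consented, while the secondary tool receives a short allowlist of events for everyone.

```python
from typing import List

# Hypothetical allowlist: the few events the secondary tracker accepts.
# Grow it deliberately as pages get retagged, not by default.
SECONDARY_EVENTS = frozenset({"pageview", "article_read"})


def destinations(event_name: str, has_consent: bool) -> List[str]:
    """Return which analytics tools should receive this event."""
    targets = []
    if has_consent:
        # Full-featured primary analytics stays behind the CMP.
        targets.append("primary")
    if event_name in SECONDARY_EVENTS:
        # Minimal back-up analytics: allowlisted events only.
        targets.append("secondary")
    return targets
```

With this shape, an event like a hypothetical `video_quartile` goes to the primary tool for consenting visitors and nowhere else, so the secondary script never has to carry the code for it.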

In the scenario where the secondary audience cookie eventually subsumes your primary platform, things change, of course. Then you’d be increasing the load of the secondary tracker but with the goal to eventually get rid of the first. In other words, if you’re headed to “perfect double-tagging” territory, you’ll take on more weight temporarily. But you may actually soon be in a single-tag universe again.

The second downside is beyond contest. If you add a data pipeline on infrastructure you own, you’re going to pay for it. Consider that it’s probably quite a bit less than if you wanted to pay for unsampled analytics from a vendor-hosted solution, but still.

There, the path to having a realistic assessment for the operation is to drop this cookie on a sample of your pages for a period of time, with super minimal work to actually calibrate and check the quality of the data. Basically, run the secondary cookie programme just to have a sense of what your bill could be. 

There are ways to run the numbers from a hypothetical perspective, but running an experiment may be even faster and will be more accurate.
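Once the experiment has run, turning the sample bill into a full-site estimate is simple arithmetic. The sketch below assumes that the variable cost scales linearly with traffic, which may not hold for your infrastructure (tiered pricing, storage growth, and so on). The function name and its parameters are mine, for illustration only.

```python
def projected_monthly_cost(
    observed_cost: float,
    sample_fraction: float,
    days_observed: int,
    fixed_monthly_cost: float = 0.0,
    days_in_month: int = 30,
) -> float:
    """Scale the bill from a sampled trial up to a full month on all pages.

    observed_cost: variable cost billed during the trial
    sample_fraction: share of traffic that carried the cookie (0 to 1)
    fixed_monthly_cost: costs that do not grow with traffic (assumption)
    """
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    # Linear extrapolation: full-traffic cost per day, then per month.
    daily_cost_at_full_traffic = observed_cost / sample_fraction / days_observed
    return fixed_monthly_cost + daily_cost_at_full_traffic * days_in_month
```

With made-up numbers, a trial on 5% of traffic that cost 42 over 14 days projects to roughly 1,800 per month at full traffic, before any fixed costs.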

If you’d like to subscribe to my bi-weekly newsletter, INMA members can do so here.

