Consolidating ad customer data in 4 essential steps
Big Data For News Publishers | 21 September 2017
Last month I implored you to take on the project of building out the ever-elusive complete customer view.
I get it. The full, single view of the customer is hard to create. I worked with a news media company that had more than 100 different advertising accounts for a single customer! On average, it had three accounts for any business big enough to run preprints. And, on top of that, every time an account representative turned over, the new rep set up new accounts for the business.
Why? To keep commissions calculations easy. It totally messed up account history tracking, but in the case of the representative, commissions were more important.

Bad habits are born from a poorly executed solution.
Data analysis is difficult on good days. But those of us in the data field must get a seat at the table when data is involved and put our proverbial feet down when ill-formed point solutions will create major unintended consequences.
A point in case: I just finished another project where there were 28 different ways to spell the name of the national bookstore chain, Barnes & Noble. Well, we figured that one out, and will implement a fix-and-lockdown feature to prevent future Barnes & Noble variants from spawning.
We all have these data issues. We always will. Our data management response is going to enable our individual companies with forward-looking planning, or we will let the status quo continue.
I am reminded of a recent TV commercial where the IT guy can’t get anything done because of legacy systems. Well, data problems aren’t always something a hardware or software can fix. The errors are usually in the data management! Data folks, stop giving the “I’d love to, but … “ speech. Fix it!
Let’s move on to building a single customer view solution. Keep in mind your solution must be repeatable, and generally that means software-driven. You can do the manual work once; the trick is to productionise the solution. You don’t want to have to spend an hour manually consolidating 28 Barnes & Noble accounts every time you want to refresh the data. Sometimes you must build bridging tables to retain the fixes even though the best solution is to fix the error at the source.
Trust me, complaining to the advertising management team about the 28 different Barnes & Noble variations isn’t going to get it to drop everything and make the fix. So, solve it yourself by building the solution whether that is with a consolidation xref table or another solution (see below).
Just remember, there is probably a “good” (yeah, right!) reason there are 28 account numbers, and fixing them for your data analysis will break something else. Logic does not win here.
Enough of the preamble. Let’s get started.
Break the data problem into bite-sized chunks and solve them.
1. Define a goal. The boss says, “I want to see all the advertising revenue each company spends with me, whether in print, digital, special sections, or employment. Show me the total! Compare what I get to the market spending in each category and sales territory. You know, do your usual magic.”
There are lots of layers in the request. Peel them apart. With an end-game in mind you start to see where the data hooks must be placed and then connected.
For revenue questions, it will involve pulling from the data the who, what, when where, how much, trends, and projections. This request has a subtle focus on “where” things are happening — several geographic analysis points and a category type of “where” as well.
The first instinct is just to run a revenue-by-advertiser report. Wrong. Doing so will show the 17 versions of the furniture business and three versions of Starbucks, or whatever the case might be.
So, back up and think about all the data elements you will want and where they all reside. Hopefully all are in one system, though realistically they are not. It seems to be there are always four systems: retail order taking/billing, classified order taking/billing, CRM, and circulation. Some operations have separate systems for order taking and billing.
Regardless of the number, you have to find them all and start the consolidations and data standardisation (this includes things like 6-col retail, 9-col classified adjustment factors, etc.).
Once the data locations are identified, collecting the information needed begins. Beyond the obvious business names, address information, and revenue, I would also bring along these elements to do the deeper dives:
- SIC (NAICS) code penetration (quantity and dollars) at eight-digit, six-digit, four-digit, and two-digit levels.
- ZIP code and +4.
- Type of advertising (ROP, special section, preprint, etc.).
- Product bought.
- Ad sizes.
- Ad cost.
- Ad run dates (seasonality analysis).
This set of elements gives you some of the data points for slicing and dicing (aka pattern finding).
Now that data elements and repositories are in hand ...
2. Turn your focus to the actual consolidation of the advertiser’s business information into a single view. Even if you have a single system, you are still going to have to deal with the possibility of bringing 28 Barnes & Noble together.
For those cases where multiple systems are involved and you don’t have a marketing database integrator, you will have a few tricky decisions to make to start the data merge/match process.
The first is to define what a match is:
- Address match alone.
- Name and address.
- Name, phone, and address.
If you go with name/phone/address matching, you will quickly learn this is not as easy as it sounds. Never!
Company X’s employment ads are connected to the owners’ credit card billing address. Its display ad billing is to a local ad agency. It once ran a print-and-deliver, and that (its third account) is using one of Company X’s four branch offices as the address.
And, finally, for whatever reason, it has a second display ad account, but it is missing the sub-unit number. And all will have different phone numbers. One advertiser, four accounts and four phone numbers. How do you bring them together?
I suggest running the run through a postal NCOA and CAS standardisation process. This isn’t the final step, but it will solve many of the issues — after you learn to read the NCOA and CAS return code values.
This step alone is why news media marketing services companies like LEAP and BlueVenn (to name just two) exist. They’ve nailed this merging and matching. If you don’t work with them, or other similar firms, both InfoGroup and Dun & Bradstreet offer matching services.
Finally, there is the do-it-yourself approach. It can be done, but it is complex and difficult to productionise.
3. Once your internal data is consolidated, it is time to bring in the external data and get the SIC/NAICS coding consistently recorded. This becomes critical when doing a customer, prospect, penetration, or market potential analysis. If you use SIC code, remember the “official” SIC code is a four-digit code.
When you talk to vendors about appending an SIC code, the top vendors have “extended” with their own proprietary values or categorization values and created their own six-digit and eight-digit code values. The values are not the same between the top companies so use caution when analysing (and assigning) the values. Pick a vendor, or assign the four-digit values yourself, but be consistent in sourcing the codes.
I use the OSHA site to get the codes even though the OMB/SEC is the official agency behind it. NAICS codes are replacing SIC codes, and yet almost 20 years after the introduction of NAICS we still use SIC.
4. At this point you’ve finally brought the records together — a single record for each business with about 15 data element columns for analysis. Double check the work. Sort the data backward and forward across every column to see if the data groupings and company names are sorting in the order expected. (i.e. no US$200 account for the largest business in town that didn’t get consolidated correctly into the large “master” account record).
I also suspect that, to get to this point, another month of processing has passed, so it is time to update the records!