The perils of making linear projections

By Sebo Banerjee

HT Media Ltd.

New Delhi, Delhi, India


I often wonder if an observation I made in the past was right.

I knew a publication company at a time when it was making its first strides toward the digital pasture. The company frequently changed its digital business head. Every new person, inevitably prouder, more pampered, and more “entitled” than the last, would reject all plans made by the predecessor and make a new “five-year plan” that would fascinate the CEO even more.

Projecting the future requires taking several unexpected factors into account.

Yes, you read it right, a new five-year plan almost every two years! And all those plans, with a few exceptions, would have projections and targets moving up in straight lines — as if they had a secret potion to survive the great uncertainties of life and business together.

Try rolling a wheel on an uneven surface. Would it go along a straight line? Roll a die. What is the probability you’ll get a six every time? Or, take a chance and try to predict your spouse’s mood in a straight line.

And think about the digital content business, where consumption and monetisation standards change at the drop of a hat. There’s no standard at all. It’s as if most traditional media content businesses are trying to share a small, cramped compartment on a wobbly train speeding through a series of disruptions.

Try imagining a linear prediction in such a setting, where revenues increase independent of grounded reality, in a rhythm that only pleases one person’s projected aesthetics! And imagine the havoc it would create in the lives of people in associated departments whose inputs were hardly asked for.

It’s also true that such projections would do just fine with a shot of scientific and practical logic. In most cases, targets are set by assuming an overall growth rate for the business over various periods. That’s the quickest method and also the least reliable, since it carries a good volume of “guesstimation” with it.

Also, business volume cannot always be predicted from metrics like the number of new users acquired or how much page engagement was generated. It’s not that easy. Several other factors, many of which may sound too trivial to even be considered, affect the patterns of growth (or lack of it).

In the last few regression equation building efforts I’ve been a part of, factors like the following assumed importance:

  • Weather: The onset of summer (in north India) impacts news consumption negatively. It’s a very dull phase in terms of political activities. Soaring temperatures and humidity seem to get the better of the politicians; we don’t see many rallies or conferences capable of assuming national level importance.

    This is also the school holiday season, when people rush off on vacations with family: parents, kids, and journalists alike. News operations are managed by a skeleton staff, and the news is read by fewer readers.
  • Stability and efficiency of the ruling party: The higher the stability, the worse the traffic volumes.
  • Technology or server maintenance/change schedule: If an efficient team is in place to handle the maintenance, we can give it a very low weight or can even ignore it. However, inefficient handling may result in major search traffic loss. Chances of that should be noted. 

Other factors of this kind, relevant to a specific operation, can be added. That takes us to the next question: How would you predict the extent of their impact? We can’t just guess.

The answer required some work. We laid out the traffic and engagement pattern for our products for the trailing 365 days and started annotating each variation point with possible reasons in a nice columnar data set ready to be queried by SQL-type commands.
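To make the idea of an annotated, queryable timeline concrete, here is a minimal sketch using an in-memory SQLite table. The table name, column names, dates, and page-view figures are all invented for illustration; the article does not describe the real schema or data.

```python
import sqlite3

# Hypothetical schema and sample rows (not the author's actual data set).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE daily_traffic (
        day        TEXT PRIMARY KEY,  -- ISO date
        page_views INTEGER NOT NULL,
        annotation TEXT               -- suspected reason for a variation, NULL if none
    )
    """
)
rows = [
    ("2017-05-01", 100000, None),
    ("2017-05-02", 102000, None),
    ("2017-05-03", 61000, "server_maintenance"),
    ("2017-05-04", 99000, None),
    ("2017-05-20", 83000, "summer_holidays"),
    ("2017-05-21", 81000, "summer_holidays"),
]
conn.executemany("INSERT INTO daily_traffic VALUES (?, ?, ?)", rows)


def average_by_annotation(connection):
    """Return {annotation: mean page views}; unannotated days count as 'baseline'."""
    cursor = connection.execute(
        """
        SELECT COALESCE(annotation, 'baseline') AS reason, AVG(page_views)
        FROM daily_traffic
        GROUP BY reason
        """
    )
    return dict(cursor.fetchall())


averages = average_by_annotation(conn)
for reason, mean_views in sorted(averages.items()):
    print(f"{reason}: {mean_views:.0f}")
```

Comparing each annotated group against the baseline gives a first, rough estimate of how much a factor moves traffic, which is a better starting point than a guess.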

I’ll briefly explain how we approached the task of making Web site performance projections and then, with their help, broke down the monthly targets into daily targets spread along a continuous timeline. The objective was to create an evaluation platform that tells us whether the targets already set are pragmatic and where they need revision.

Here’s a quick look at the three-step approach we took.

First, we tried to describe our existing data pattern through a regression analysis and fit an equation to the data.
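The article does not give the regression itself, so the following is only a minimal sketch of the simplest case: an ordinary least squares fit of a straight trend line to daily traffic. The function names and the sample figures are assumptions made for illustration; a real model would include more explanatory variables.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total


# Made-up daily page views (in thousands) for ten consecutive days.
days = list(range(10))
page_views = [100, 103, 103, 107, 108, 109, 113, 113, 117, 118]

intercept, slope = fit_line(days, page_views)
fit_quality = r_squared(days, page_views, intercept, slope)
print(f"views = {intercept:.1f} + {slope:.2f} * day (R^2 = {fit_quality:.3f})")
```

A low R² on real data is the signal that a straight line alone is not enough, which is where the annotated factors come in.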

Second, with the help of the existing equation, we created the projection equation involving an exponential triple smoothing (ETS) technique. The projection equation would also — yes, you guessed it right — directly draw from the annotation timeline we created first to represent seasonality.

That part was complicated; several independent variables had to be switched on or off depending on time, and we used a method akin to using “dummy variables” of Boolean nature. I’ll skip the mathematical details, as this is already becoming intimidating. If you need more information, you can write to me.
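For readers who do want a feel for the mechanics, here is a rough sketch. It assumes ETS means triple exponential smoothing in the additive Holt-Winters sense, and it applies the Boolean switches as a simple adjustment on top of the smoothed forecast. Both choices are assumptions, as are the function names, the smoothing parameters, and the effect sizes; this is not the equation we actually built.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing; returns `horizon` forecasts."""
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initial estimates taken from the first two seasons.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    seasonals = [series[i] - level for i in range(m)]

    for t, value in enumerate(series):
        season = seasonals[t % m]
        previous_level = level
        level = alpha * (value - season) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (value - level) + (1 - gamma) * season

    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


def apply_switches(baseline, switches, effects):
    """Add each factor's effect on the days where its 0/1 switch is on."""
    adjusted = []
    for day, base in enumerate(baseline):
        delta = sum(effects[name] * flags[day] for name, flags in switches.items())
        adjusted.append(base + delta)
    return adjusted


# Made-up history with a repeating three-day pattern and no trend.
history = [10, 20, 30] * 4
baseline = holt_winters_additive(
    history, season_length=3, alpha=0.5, beta=0.5, gamma=0.5, horizon=3
)

# Hypothetical annotation: holidays are "on" for the last two forecast days.
switches = {"summer_holidays": [0, 1, 1]}
effects = {"summer_holidays": -5.0}  # assumed impact, not a measured one
projection = apply_switches(baseline, switches, effects)
print(projection)
```

For daily traffic a `season_length` of 7 would be the natural choice, and the effect sizes would come from the annotated history rather than being typed in by hand.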

Let me tell you about an intriguing challenge the Web site traffic data threw at us in this phase. We all know users consume Web site and app content differently on the weekends, starting from Friday afternoon. Interestingly, the equation we had already derived simply failed to capture that behaviour, since it was dominated by weekday data.

You may ask, why wouldn’t you look at a larger data set to even out the weekend behaviour? That would have been a good workaround had the Internet content business not been such a fast-changing one. So many things change so fast here that we often wonder if data that is even six months old is still worth looking at. We needed a different approach.

So, we developed two equations: one for the weekdays and the other for the weekends. Then we wrote an algorithm that stitches the two together, firing only the appropriate one depending on the type of day, all along the projection.
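The stitching logic can be sketched in a few lines. The two equations below are placeholders with invented coefficients, and the sketch treats only Saturday and Sunday as the weekend, which simplifies the Friday-afternoon effect mentioned earlier.

```python
from datetime import date, timedelta


def weekday_equation(day_index):
    """Placeholder weekday model with hypothetical coefficients."""
    return 100_000 + 50 * day_index


def weekend_equation(day_index):
    """Placeholder weekend model with hypothetical coefficients."""
    return 70_000 + 30 * day_index


def is_weekend(day):
    """Assumption: weekend means Saturday (5) or Sunday (6)."""
    return day.weekday() >= 5


def project(start, num_days):
    """Walk the timeline and fire only the equation that matches each day."""
    projection = []
    for offset in range(num_days):
        day = start + timedelta(days=offset)
        equation = weekend_equation if is_weekend(day) else weekday_equation
        projection.append((day, equation(offset)))
    return projection


# 1 January 2024 was a Monday, so the last two days of this week are the weekend.
projection = project(date(2024, 1, 1), 7)
for day, views in projection:
    print(day.isoformat(), views)
```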

Third, as the projection had now shown us a very realistic pattern of traffic behaviour for the next 12 months, we comfortably used the pattern as a model to break the month’s target into daily targets.
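One plausible way to do this breakdown, shown here as a sketch rather than our exact procedure, is to give each day a share of the monthly target equal to its share of the projected traffic. The function names and figures are made up for illustration.

```python
def split_monthly_target(monthly_target, projected_daily_traffic):
    """Allocate the monthly target to days in proportion to projected traffic."""
    total = sum(projected_daily_traffic)
    if total <= 0:
        raise ValueError("projected traffic must sum to a positive number")
    return [monthly_target * value / total for value in projected_daily_traffic]


def required_uplift(monthly_target, projected_daily_traffic):
    """How far the target sits above (or below) the projection, as a fraction."""
    return monthly_target / sum(projected_daily_traffic) - 1


# Made-up four-day "month": two weekdays and two quieter weekend days.
projected = [100, 100, 50, 50]
target = 330

daily_targets = split_monthly_target(target, projected)
uplift = required_uplift(target, projected)
print(daily_targets, f"uplift needed: {uplift:.0%}")
```

A large uplift is the warning sign: the target asks for far more than the projection says the business is likely to deliver.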

And, bingo! It accurately showed us if the set target was practical.

The plot, when laid out for the entire financial year, gave us a nice view of the tryst between projection and targets. The same exercise can be repeated for revenues where traffic has a direct bearing on the money earned. For revenue that doesn’t depend on traffic, we can create a separate ETS equation based solely on traffic-independent revenue and calculate the output as a blend of the two.
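How the two are blended will depend on the business. The sketch below assumes the simplest case: traffic-linked revenue is proportional to page views at a fixed rate per thousand views, and the blend is a plain sum with the traffic-independent projection. The rate, the names, and the numbers are all hypothetical.

```python
def blended_revenue(traffic_projection, revenue_per_thousand_views,
                    independent_revenue_projection):
    """Sum traffic-linked and traffic-independent revenue, day by day."""
    if len(traffic_projection) != len(independent_revenue_projection):
        raise ValueError("both projections must cover the same days")
    return [
        views / 1000 * revenue_per_thousand_views + independent
        for views, independent in zip(traffic_projection,
                                      independent_revenue_projection)
    ]


# Made-up two-day example.
traffic = [1000, 2000]           # projected page views
independent = [10.0, 10.0]       # projected revenue that ignores traffic
revenue = blended_revenue(traffic, 5.0, independent)
print(revenue)
```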

Are you asking now if all that hard work finally helped us establish a more rationalised target?

No, please don’t ask that question, for reasons you know already since you asked it in the first place!

