‘Always-on’ Analytics: The Pros and Cons of Automation in Marketing Mix Modelling
September 6, 2023
Welcome to the Wheelhouse, a series of blogs from Ebiquity’s Marketing Effectiveness team.
In this fourth edition of The Wheelhouse, Principal Consultant George Wood considers the benefits and drawbacks of running automated, ‘always-on’ analytics for marketing effectiveness in marketing mix modelling.
Econometric modelling for marketing mix modelling (MMM) has come a long way. When I started in the industry some years ago, an analyst could only reliably isolate the effects of a limited number of channels in a model. Yet there was always a desire for bigger, faster, and more detailed results.
A brave new world?
Over the past few years, we’ve seen a big increase in automation modelling platforms that employ machine learning and AI, bringing with them the promise of faster turnaround times for MMM projects. This has been made possible by advances in computational power, and made all the more desirable by the (still impending) ‘cookie apocalypse’, which makes real-time, multi-touch attribution reporting much more challenging.
These quick-turnaround solutions have dangled a tantalising possibility in front of marketing leaders: “What if we could deploy these to get our marketing mix modelling results quicker, tell whether a campaign is working earlier, and adjust strategy before we’ve spent an entire quarter running a creative or channel mix that doesn’t work?”
Traditional MMM vendors have usually resisted this kind of always-on updating. The reason is simple: it takes a long time to verify that data is correct and to gather all of the information that could have affected a brand’s KPIs.
Real world problems
Data and model validation have traditionally required some ‘grappling time’. This is time that a human being spends troubleshooting problems that do not immediately present themselves. Grappling time is how you discover that the agency has put the wrong cost definition into its monthly feed; it’s how you find that there has been a definition change in the syndicated market data. An automated platform usually cannot even detect – let alone act on – these issues, simply because they fall outside the boundaries of what it has been programmed to expect.
Let’s work through a hypothetical (but all-too-real) example.
- Brand A provides all of its media, pricing, PR, and other data.
- Its econometrics provider creates a model with the data available and notices an uptick in sales. This coincides with a small investment in a rarely-used social media channel.
- The statistical model, with the information it has, assigns a lot of credit to this small spend in social media.
- The analyst, noticing this spike in sales, goes back to their client at Brand A and asks whether anything was happening at that point in time that the model does not capture.
- The marketer at Brand A responds: “We held a sales event and sold lots of our product at a heavily discounted rate. This wouldn’t have been captured in our pricing data, as this offer was not available to regular customers.”
- The analyst adjusts their model to incorporate this event. Credit now goes to the event rather than to the small social media channel, a result that is more realistic and closer to the truth.
With quick-turnaround automation models, this extra investigative step is missed out. Such a model will use incomplete information – the only information it has to make the decision – and assign credit, incorrectly, to the social media channel. This may lead to incorrect recommendations, such as upweighting the budget on this social media channel, when in fact sales would benefit more from investment in other areas.
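To make this concrete, here is a minimal, hypothetical sketch in Python of the omitted-variable problem described above. The channels, figures and the simple linear model are all invented for illustration and do not reflect how any particular MMM platform works; the point is simply that a small, coincidental spend soaks up credit for an unrecorded event until that event is added as a variable.

```python
# A hypothetical, simplified illustration of the scenario above.
# All numbers and channels are invented; 'sales_event' stands in for the
# unrecorded discount event that the analyst only learns about by asking.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
weeks = 104

tv_spend = rng.uniform(50, 150, weeks)        # a regular, always-on channel
social_spend = np.zeros(weeks)
social_spend[55:66] = rng.uniform(3, 8, 11)   # a brief, tiny burst on a rarely-used channel

sales_event = np.zeros(weeks)
sales_event[60] = 1                           # the unrecorded discount event, mid-burst

# 'True' sales: baseline + TV effect + a big lift from the event, and nothing from social
sales = 1000 + 2.0 * tv_spend + 400 * sales_event + rng.normal(0, 20, weeks)

# Model 1: only sees the media data (the automated, quick-turnaround view)
X_media_only = np.column_stack([tv_spend, social_spend])
m1 = LinearRegression().fit(X_media_only, sales)
print("Social coefficient, event unknown:", round(m1.coef_[1], 2))   # inflated and spurious

# Model 2: the analyst adds the sales-event flag after speaking to the client
X_with_event = np.column_stack([tv_spend, social_spend, sales_event])
m2 = LinearRegression().fit(X_with_event, sales)
print("Social coefficient, event included:", round(m2.coef_[1], 2))  # collapses towards zero
```

The same pattern holds with more sophisticated specifications: whatever the model family, a driver that is missing from the data has to be explained by something that is present.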
Is there a solution to speed up this modelling process?
Using some automation modelling platforms, we can constrain the model by giving it more initial information, enabling results to align better with prior beliefs. In the example above, our plucky social media channel has an assumed belief attached to it – that it probably performs similarly to other social media channels used in the past. This means that the model can’t deviate far from that assumed belief, even though the sales spike was actually correlated with the sales event. So, we get a result that is much more in line with historic expectations, despite not having had the time to account for the sales event in our model.
Great, right? Well, even if you constrain the social media channel so that its movement in performance is limited, it will still show a small uplift in performance caused by this unrelated sales event. Perhaps not as large as before we inserted our assumed beliefs, but an uplift nevertheless. And we know, with the benefit of hypothetical hindsight, that this uplift has nothing to do with a tiny spend on our social media channel.
Equally, the opposite can happen, where we do have a rise in sales that was genuinely caused by our social media channel. But because of the constraints we have used, we do not assign all of the credit where it is genuinely deserved.
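For readers who like to see the mechanics, here is a hypothetical sketch of what ‘constraining with an assumed belief’ can look like. It uses a simple penalised regression, equivalent to placing a Gaussian prior on each coefficient; the data, prior values and penalty strengths are invented for illustration and are not taken from any real platform.

```python
# A hypothetical sketch of constraining coefficients towards prior beliefs.
# Minimising ||y - Xb||^2 + sum_j s_j * (b_j - m_j)^2 corresponds to a MAP estimate
# under independent Gaussian priors centred at m_j; larger s_j means a tighter constraint.
import numpy as np

def fit_with_priors(X, y, prior_mean, prior_strength):
    S = np.diag(prior_strength)                                  # per-coefficient penalty
    return np.linalg.solve(X.T @ X + S, X.T @ y + S @ prior_mean)

# Re-create the simulated data from the previous sketch
rng = np.random.default_rng(42)
weeks = 104
tv_spend = rng.uniform(50, 150, weeks)
social_spend = np.zeros(weeks)
social_spend[55:66] = rng.uniform(3, 8, 11)
sales_event = np.zeros(weeks)
sales_event[60] = 1                                              # still missing from the model
sales = 1000 + 2.0 * tv_spend + 400 * sales_event + rng.normal(0, 20, weeks)

X = np.column_stack([np.ones(weeks), tv_spend, social_spend])    # intercept, TV, social
prior_mean = np.array([0.0, 2.0, 1.0])         # e.g. 'social usually returns ~1 per unit spend'
prior_strength = np.array([0.0, 10.0, 500.0])  # intercept free, social tightly constrained

b = fit_with_priors(X, sales, prior_mean, prior_strength)
print("Constrained social coefficient:", round(b[2], 2))
# Still sits above the prior of 1.0, because the unrecorded event has to be absorbed
# somewhere; equally, a genuine social-driven spike would be partly suppressed.
```

Tuning those penalty strengths is exactly the judgement call described above: too loose and the spurious uplift survives largely intact; too tight and a genuine change in performance is suppressed.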
The statistician’s bane: spurious correlation
This is just one example of spurious correlation and how it can ruin a model. A spurious correlation is an apparent relationship between two or more variables that are associated but not, in fact, causally connected. It’s spurious because it actually reflects chance or coincidence or, more likely, it’s the result of a hidden, third factor.
Let’s be clear – this could happen to any method of modelling where data is not properly investigated. Additionally, no model is perfect, and as the statistician George Box said, “All models are wrong, but some are useful”. Analysts will always be missing some data or information about what has driven sales. However, with thorough investigation, analysts can mitigate these missing factors as much as possible, up to the point where their impact on results is minimal and insights gained from modelling lead to correct recommendations.
So is that a “No!” to automation?
Far from it. I am a fan of developing marketing automation to aid speedy project delivery. But analysts working in MMM need to have clear answers to questions such as:
What assumed information is being used in models?
Are assumed beliefs constraining results too much?
How is information being shared across the variables that explain a KPI? (Essentially, which factors’ performance is interlinked, and how does that affect results?)
Where automation can help
There are five principal domains where automation most definitely can help in marketing econometrics.
- Data processing – templates can be populated with data and then ingested by processing scripts that run error checks, detect anomalies, and convert the data into a format usable for modelling.
- Building basic models – setting up models to a level that explains a large amount of the variation in a brand’s KPIs. However, letting a machine do all of the thinking is risky. Assumed beliefs must be kept up to date and be wide enough to accommodate new truths. A human checking a model – both its inputs and outputs – is a good way to minimise spurious correlation and to encourage the good practice of researching further into what is causing sales trends.
- Parameter testing – once basic models are built to a satisfactory level, AI-based solutions can be designed to test for factors such as the rates of decay and diminishing returns of investment (see the sketch after this list).
- Dashboarding and results – model outputs can be exported in a standardised manner, which enables an automated upload to a results portal or an optimiser tool, say. This takes little time to set up and, once models are built, enables KPIs – from return on investment to cost per acquisition – to be calculated almost instantaneously. This quickly turns an econometric model into useful insights and recommendations.
- Elsewhere – automation is being used more and more within lots of different areas of analytical project work. Implementing it – with the necessary caution, checks and balances – is essential to keep pace with the rapidly changing world of analytics.
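As an illustration of the parameter-testing point above, here is a hypothetical sketch of an automated search over adstock decay rates and diminishing-returns exponents for a single channel. The data, grids and functional forms are simulated purely for illustration.

```python
# A hypothetical sketch of automated parameter testing for one channel: search a grid
# of adstock decay rates and saturation exponents, and keep the pair that best explains
# the KPI. The data and the 'true' parameters are simulated purely for illustration.
import numpy as np
from itertools import product

def adstock(spend, decay):
    """Carry a share of each week's media pressure over into the following weeks."""
    out = np.zeros_like(spend)
    carry = 0.0
    for t, s in enumerate(spend):
        carry = s + decay * carry
        out[t] = carry
    return out

rng = np.random.default_rng(7)
weeks = 104
spend = rng.uniform(0, 100, weeks)

# Simulated KPI: 60% weekly carry-over and square-root diminishing returns
sales = 500 + 3.0 * adstock(spend, 0.6) ** 0.5 + rng.normal(0, 5, weeks)

best = None
for decay, power in product(np.arange(0.0, 0.95, 0.05), (0.4, 0.5, 0.6, 0.7, 0.8)):
    x = adstock(spend, decay) ** power
    X = np.column_stack([np.ones(weeks), x])
    beta = np.linalg.lstsq(X, sales, rcond=None)[0]
    sse = np.sum((sales - X @ beta) ** 2)
    if best is None or sse < best[0]:
        best = (sse, decay, power)

print(f"Best-fitting decay ≈ {best[1]:.2f}, saturation power ≈ {best[2]:.2f}")
```

In production this kind of search would cover many channels and a finer grid, but the principle is the same: let the machine grind through the combinations, then have an analyst sanity-check the winners.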
Summing up: there is no universal answer
MMM in some ways is a victim of its own success. Clients demanding “Bigger! Better! Faster!” is, on one level, an endorsement of how valuable it can be. But equally, it can be done badly using any methodology, and this can lead to harmful recommendations.
To conclude, it’s fair to say that an automation model created almost entirely by a machine in a short amount of time may provide a brilliant estimation of results. It may also be a poor one, based on heavily biased assumed views and skewed by incorrect or missing data. It’s probably impossible to know which is the case until and unless all prior beliefs and data have been thoroughly reviewed. That’s why our recommendation is that brand custodians should: (i) treat automation models with caution, (ii) recognise that a quick-turnaround result is probably more of an initial estimate, and (iii) not implement drastic changes to marketing investment off the back of this initial read.
So, let us be open minded to some forms of automation, with all of the necessary caution of any good analyst worth their salt. And at times with MMM delivery – when one feels that accuracy is at stake – be willing to take a step back by remembering the maxim: more haste, less speed.