Bayesian Marketing Mix Model

Growth & Marketing Analytics · Python · PyMC · Bayesian Inference

Problem

The marketing team was spending across six channels — paid search, display, affiliates, email, push, and TV — with no reliable way to measure the true contribution of each. Last-click attribution (the default in their analytics stack) was heavily crediting paid search and undervaluing brand campaigns and TV, leading to systematic over-investment in lower-funnel spend and neglect of awareness channels.

The goal was to build a marketing mix model (MMM) that could attribute revenue to each channel accounting for carryover effects, saturation, and external factors like seasonality and competitor activity.

Why Bayesian MMM?

Traditional frequentist MMM (OLS regression) is fast but brittle with sparse data and correlated spend channels. A Bayesian approach lets us encode prior knowledge (e.g., "TV has a longer carryover than paid search") and get uncertainty estimates on every coefficient — which matter when recommending budget reallocation to a CFO.

Adstock Transformation

Marketing spend doesn't immediately convert — it has a carryover effect that decays over time. We model this with a geometric adstock transformation:

import numpy as np

def adstock(spend: np.ndarray, decay: float) -> np.ndarray:
    """Geometric adstock: a[t] = spend[t] + decay * a[t-1], with 0 <= decay < 1."""
    if not 0.0 <= decay < 1.0:
        raise ValueError("decay must be in [0, 1)")
    adstocked = np.zeros(len(spend), dtype=float)
    carryover = 0.0  # adstock carried in from the previous period
    for t, s in enumerate(spend):
        carryover = s + decay * carryover
        adstocked[t] = carryover
    return adstocked

# TV has a longer carryover; paid search decays quickly
tv_adstock   = adstock(tv_spend,   decay=0.6)
sem_adstock  = adstock(sem_spend,  decay=0.2)
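To build intuition for the decay parameter, the sketch below (not part of the production model) traces a single burst of spend through the same recurrence and converts a decay rate into a half-life. The helper name half_life and the spend figures are illustrative assumptions; periods are taken to be weeks, matching the weekly seasonality index used later.

```python
import numpy as np

def adstock(spend, decay):
    """Geometric adstock: a[t] = spend[t] + decay * a[t-1]."""
    adstocked = np.zeros(len(spend), dtype=float)
    carryover = 0.0
    for t, s in enumerate(spend):
        carryover = s + decay * carryover
        adstocked[t] = carryover
    return adstocked

def half_life(decay):
    """Periods until a burst's carryover halves (hypothetical helper)."""
    return np.log(0.5) / np.log(decay)

# One burst of 100 units of spend, then nothing (made-up numbers)
impulse = np.array([100.0, 0.0, 0.0, 0.0])
impulse_response = adstock(impulse, decay=0.5)   # 100, 50, 25, 12.5

# Constant spend converges towards spend / (1 - decay)
steady = adstock(np.full(200, 100.0), decay=0.6)
steady_limit = 100.0 / (1.0 - 0.6)

tv_half_life = half_life(0.6)    # roughly 1.4 periods
sem_half_life = half_life(0.2)   # roughly 0.4 periods
```

The steady-state limit is worth noting: under constant spend the adstocked series is simply spend scaled by 1 / (1 - decay), so a slow-decaying channel such as TV accumulates a much larger effective exposure than its raw weekly spend suggests.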

Saturation (Diminishing Returns)

Doubling spend doesn't double response: each channel has a saturation curve. We model this with a Hill transformation, which is S-shaped when n > 1 (response accelerates at low spend, then flattens as the channel saturates) and purely concave when n ≤ 1:

def hill_transform(x: np.ndarray, K: float, n: float) -> np.ndarray:
    """Hill function: K = half-saturation point, n = steepness."""
    return x**n / (K**n + x**n)

# Each channel gets its own K and n (learned from data; shown here as fixed values)
tv_saturated  = hill_transform(tv_adstock,  K=500_000, n=2.5)
sem_saturated = hill_transform(sem_adstock, K=80_000,  n=1.8)
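Two properties of the Hill curve are useful when reading the fitted parameters: the response is exactly 0.5 at x = K, and for n > 1 the marginal response peaks at an inflection point below K. The sketch below checks both using the paid search values from the snippet above; hill_slope and hill_inflection are hypothetical helper names introduced here for illustration.

```python
import numpy as np

def hill_transform(x, K, n):
    """Hill function: K = half-saturation point, n = steepness."""
    return x**n / (K**n + x**n)

def hill_slope(x, K, n):
    """Analytic derivative of the Hill function (marginal response per unit of x)."""
    return n * K**n * x**(n - 1) / (K**n + x**n) ** 2

def hill_inflection(K, n):
    """Input level where the marginal response peaks; only defined for n > 1."""
    return K * ((n - 1) / (n + 1)) ** (1 / n)

K_sem, n_sem = 80_000.0, 1.8

# At x = K the channel is at half of its maximum response
response_at_K = hill_transform(K_sem, K_sem, n_sem)

# Beyond this level every extra unit of adstocked spend buys less than the one before
x_peak = hill_inflection(K_sem, n_sem)
slope_below = hill_slope(0.8 * x_peak, K_sem, n_sem)
slope_peak = hill_slope(x_peak, K_sem, n_sem)
slope_above = hill_slope(1.2 * x_peak, K_sem, n_sem)
```

A channel whose typical adstocked spend sits well above x_peak is in the diminishing-returns region of its curve, which is the sense in which a channel can be described as operating past the inflection point.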

Bayesian Model with PyMC

We used PyMC to estimate channel coefficients with regularising priors. The key advantage: each coefficient comes with a posterior distribution, giving us credible intervals for attribution and budget recommendations.

import pymc as pm

with pm.Model() as mmm:
    # Priors for channel coefficients (HalfNormal keeps them non-negative);
    # three of the six channels are shown
    beta_tv    = pm.HalfNormal("beta_tv",    sigma=1)
    beta_sem   = pm.HalfNormal("beta_sem",   sigma=1)
    beta_email = pm.HalfNormal("beta_email", sigma=0.5)

    # Seasonality component
    seasonality = pm.Normal("seasonality", mu=0, sigma=0.2, shape=52)

    # Base revenue (intercept)
    intercept = pm.Normal("intercept", mu=log_revenue.mean(), sigma=1)

    # Linear predictor
    mu = (intercept
          + beta_tv    * tv_saturated
          + beta_sem   * sem_saturated
          + beta_email * email_saturated
          + seasonality[week_idx])

    # Likelihood
    sigma = pm.HalfNormal("sigma", sigma=0.1)
    revenue = pm.Normal("revenue", mu=mu, sigma=sigma, observed=log_revenue)

    trace = pm.sample(2000, tune=1000, target_accept=0.9)
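Once sampling finishes, the posterior draws can be summarised into intervals and comparative probabilities. The sketch below uses synthetic draws as a stand-in so that it runs on its own; the numbers are made up and are not results from this model. With a real trace, and assuming pm.sample returned an ArviZ InferenceData object (its default in recent PyMC versions), the draws would instead come from something like trace.posterior["beta_tv"].values.ravel(). The function name equal_tailed_interval and the 94% level are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-ins for posterior draws (NOT results from the model above)
beta_tv_draws = np.abs(rng.normal(loc=0.8, scale=0.15, size=8000))
beta_sem_draws = np.abs(rng.normal(loc=0.5, scale=0.10, size=8000))

def equal_tailed_interval(draws, prob=0.94):
    """Central credible interval containing `prob` of the posterior mass."""
    tail = (1.0 - prob) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return float(lower), float(upper)

tv_interval = equal_tailed_interval(beta_tv_draws)
sem_interval = equal_tailed_interval(beta_sem_draws)

# Posterior probability that the TV coefficient exceeds the paid search one
prob_tv_larger = float(np.mean(beta_tv_draws > beta_sem_draws))
```

Statements of the form "there is an X% posterior probability that channel A's coefficient exceeds channel B's" are often easier for budget owners to act on than a table of point estimates.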

Budget Optimisation

With posterior distributions on each channel's ROI and its saturation curve, we ran a constrained optimisation to find the spend allocation that maximises expected revenue under a fixed total budget. It recommended three shifts:

Reduce paid search spend by 18% (heavily saturated — operating past the inflection point).
Increase TV spend by 12% (below saturation, high carryover).
Increase affiliate spend by 22% (strongest marginal ROI at current spend levels).
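The optimisation step can be sketched roughly as follows. This is a simplified illustration, not the implementation used in the project: it optimises point estimates rather than the full posterior, replaces the adstock recurrence with its steady-state value spend / (1 - decay), and works in budget shares so the solver sees well-scaled variables. All coefficient values, the affiliate parameters, the weekly budget, and the 5% to 90% bounds are hypothetical. It relies on scipy.optimize.minimize with the SLSQP method.

```python
import numpy as np
from scipy.optimize import minimize

def hill(x, K, n):
    return x**n / (K**n + x**n)

# Hypothetical parameters for illustration only (order: tv, sem, affiliates)
beta = np.array([0.9, 0.6, 0.4])                  # channel coefficients
K = np.array([500_000.0, 80_000.0, 60_000.0])     # half-saturation points
n = np.array([2.5, 1.8, 1.5])                     # steepness
decay = np.array([0.6, 0.2, 0.3])                 # adstock decay rates
total_budget = 400_000.0                          # assumed weekly budget

def contribution(shares):
    """Modelled log-revenue contribution for a given split of the budget."""
    spend = shares * total_budget
    steady_adstock = spend / (1.0 - decay)        # limit under constant spend
    return float(np.sum(beta * hill(steady_adstock, K, n)))

x0 = np.full(3, 1.0 / 3.0)                        # start from an equal split
result = minimize(
    lambda s: -contribution(s),                   # minimise the negative
    x0,
    method="SLSQP",
    bounds=[(0.05, 0.90)] * 3,                    # keep every channel funded
    constraints=[{"type": "eq", "fun": lambda s: np.sum(s) - 1.0}],
)
best_shares = result.x
best_spend = best_shares * total_budget
```

Because the response is monotone in log revenue, maximising the summed contribution also maximises modelled revenue at these point estimates. A fuller treatment would repeat the optimisation across posterior draws to see how stable the recommended allocation is.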

Results

After reallocation based on model recommendations (validated with a phased geo-test):

31% improvement in blended marketing efficiency (revenue per ₹ of spend).
₹3 crore reallocated from over-invested paid search to TV and affiliates without reducing total revenue.
Reduced dependence on last-click attribution across the team — stakeholders now cite MMM numbers in quarterly planning.

Key Learnings

Priors are a feature, not a crutch. Encoding domain knowledge (TV decays slowly) stabilises the model where data is sparse.
Uncertainty quantification is essential for budget decisions. A point estimate without a credible interval is just a guess.
Always validate MMM recommendations with geo-tests before a full reallocation. Models can be wrong; markets will tell you.