Bayesian Marketing Mix Model

Growth & Marketing Analytics  ·  Python · PyMC · Bayesian Inference


Problem

The marketing team was spending across six channels — paid search, display, affiliates, email, push, and TV — with no reliable way to measure the true contribution of each. Last-click attribution (the default in their analytics stack) was heavily crediting paid search and undervaluing brand campaigns and TV, leading to systematic over-investment in lower-funnel spend and neglect of awareness channels.

The goal was to build a marketing mix model (MMM) that could attribute revenue to each channel accounting for carryover effects, saturation, and external factors like seasonality and competitor activity.


Why Bayesian MMM?

Traditional frequentist MMM (OLS regression) is fast but brittle with sparse data and correlated spend channels. A Bayesian approach lets us encode prior knowledge (e.g., "TV has a longer carryover than paid search") and get uncertainty estimates on every coefficient — which matter when recommending budget reallocation to a CFO.


Adstock Transformation

Marketing spend doesn't immediately convert — it has a carryover effect that decays over time. We model this with a geometric adstock transformation:

import numpy as np

def adstock(spend: np.ndarray, decay: float) -> np.ndarray:
    """Geometric adstock with decay rate between 0 and 1."""
    adstocked = np.zeros_like(spend, dtype=float)
    adstocked[0] = spend[0]
    for t in range(1, len(spend)):
        adstocked[t] = spend[t] + decay * adstocked[t - 1]
    return adstocked

# tv_spend and sem_spend are 1-D arrays of weekly spend, one value per week.
# TV has a longer carryover; paid search decays quickly
tv_adstock  = adstock(tv_spend,  decay=0.6)
sem_adstock = adstock(sem_spend, decay=0.2)
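
To make the decay concrete, the sketch below pushes a single burst of spend through the same geometric recursion. The spend values and the decay of 0.5 are illustrative assumptions, not figures from the project.

```python
import numpy as np

def adstock(spend: np.ndarray, decay: float) -> np.ndarray:
    """Geometric adstock: a[t] = spend[t] + decay * a[t-1]."""
    adstocked = np.zeros_like(spend, dtype=float)
    adstocked[0] = spend[0]
    for t in range(1, len(spend)):
        adstocked[t] = spend[t] + decay * adstocked[t - 1]
    return adstocked

# Hypothetical input: one burst of 100 in week 0, then no spend at all
impulse = np.array([100.0, 0.0, 0.0, 0.0])
carryover = adstock(impulse, decay=0.5)
print(carryover)  # values: 100, 50, 25, 12.5 (each week keeps half of the last)

# Over an infinite horizon the carryover sums to a geometric series, so one
# unit of spend is worth 1 / (1 - decay) units of cumulative adstock.
long_run_multiplier = 1 / (1 - 0.5)
print(long_run_multiplier)  # 2.0
```

Read this way, a decay of 0.6 for TV implies a long-run multiplier of 2.5, against 1.25 for paid search at 0.2.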

Saturation (Diminishing Returns)

Doubling spend doesn't double response: each channel has a saturation curve. We model this with a Hill transformation, which maps adstocked spend to a value between 0 and 1 and is S-shaped when n > 1, so returns diminish once spend moves past the half-saturation point K:

def hill_transform(x: np.ndarray, K: float, n: float) -> np.ndarray:
    """Hill function: K = half-saturation point, n = steepness."""
    return x**n / (K**n + x**n)

# Each channel gets its own K and n, learned from data (point values shown here)
tv_saturated  = hill_transform(tv_adstock,  K=500_000, n=2.5)
sem_saturated = hill_transform(sem_adstock, K=80_000,  n=1.8)
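
A quick numerical check helps build intuition for the two Hill parameters. The sketch below uses made-up values of K and n (assumptions for illustration only) to confirm that the response is one half at x = K, and that a further doubling of spend buys less once the channel is past that point.

```python
import numpy as np

def hill_transform(x: np.ndarray, K: float, n: float) -> np.ndarray:
    """Hill function: K = half-saturation point, n = steepness."""
    return x**n / (K**n + x**n)

# Hypothetical parameters, chosen so the arithmetic is easy to follow
K, n = 100_000.0, 2.0
spend = np.array([50_000.0, 100_000.0, 200_000.0, 400_000.0])

response = hill_transform(spend, K, n)
print(response)   # 0.2 at K/2, 0.5 at K, 0.8 at 2K, about 0.94 at 4K

# Incremental response gained from each successive doubling of spend
gains = np.diff(response)
print(gains)      # the last doubling (2K to 4K) adds far less than K to 2K
```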

Bayesian Model with PyMC

We used PyMC to estimate channel coefficients with regularising priors. The key advantage: each coefficient comes with a posterior distribution, giving us credible intervals for attribution and budget recommendations.

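import pymc as pm

# Inputs prepared upstream: log_revenue (log of weekly revenue), week_idx
# (integer week-of-year index, 0-51), and the adstocked, saturated series
# for each channel (tv_saturated, sem_saturated, email_saturated).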

with pm.Model() as mmm:
    # Priors for channel coefficients (must be positive)
    beta_tv    = pm.HalfNormal("beta_tv",    sigma=1)
    beta_sem   = pm.HalfNormal("beta_sem",   sigma=1)
    beta_email = pm.HalfNormal("beta_email", sigma=0.5)

    # Seasonality component
    seasonality = pm.Normal("seasonality", mu=0, sigma=0.2, shape=52)

    # Base revenue (intercept)
    intercept = pm.Normal("intercept", mu=log_revenue.mean(), sigma=1)

    # Linear predictor
    mu = (intercept
          + beta_tv    * tv_saturated
          + beta_sem   * sem_saturated
          + beta_email * email_saturated
          + seasonality[week_idx])

    # Likelihood
    sigma = pm.HalfNormal("sigma", sigma=0.1)
    revenue = pm.Normal("revenue", mu=mu, sigma=sigma, observed=log_revenue)

    trace = pm.sample(2000, tune=1000, target_accept=0.9)
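
Once sampling finishes, each coefficient can be summarised by an interval taken directly from its posterior draws. The sketch below uses synthetic draws as a stand-in (an assumption, so the example runs without the fitted model); with a real fit the draws could be pulled from the trace, for example with `trace.posterior["beta_tv"].values.ravel()`. The 94% level is also just an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for posterior draws of beta_tv. These are synthetic numbers,
# not output from the model above.
beta_tv_draws = np.abs(rng.normal(loc=0.4, scale=0.1, size=8000))

# Equal-tailed 94% credible interval: the 3rd and 97th percentiles
lower, upper = np.percentile(beta_tv_draws, [3, 97])
posterior_mean = beta_tv_draws.mean()

print(f"beta_tv: mean={posterior_mean:.3f}, 94% interval=({lower:.3f}, {upper:.3f})")
```

Unlike a frequentist confidence interval, this range can be read directly as "given the model and priors, the coefficient lies in here with 94% probability", which is the statement a budget owner usually wants.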

Budget Optimisation

With posterior distributions on each channel's ROI and its saturation curve, we ran a constrained optimisation to find the spend allocation that maximises expected revenue given a fixed total budget:
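
A minimal sketch of such an optimisation follows. It is not the project's actual optimiser: the coefficients, Hill parameters, and budget are hypothetical point values, SciPy's SLSQP solver is an assumed choice, and spend is treated as a steady-state level, so adstock is ignored. Because the model is fit on log revenue and the exponential is monotonic, maximising the summed channel terms of the linear predictor also maximises predicted revenue at those point values.

```python
import numpy as np
from scipy.optimize import minimize

def hill_transform(x, K, n):
    """Hill saturation curve: K = half-saturation point, n = steepness."""
    return x**n / (K**n + x**n)

# Hypothetical point estimates for three channels (illustrative numbers only)
beta = np.array([0.30, 0.20, 0.10])             # channel coefficients
K = np.array([500_000.0, 80_000.0, 20_000.0])   # half-saturation points
n = np.array([2.5, 1.8, 1.5])                   # steepness
total_budget = 600_000.0

def response(shares):
    """Summed channel terms of the linear predictor for a given budget split."""
    spend = np.clip(shares, 0.0, None) * total_budget
    return float(np.sum(beta * hill_transform(spend, K, n)))

# Optimise over budget shares rather than raw spend to keep the problem well scaled
equal_split = np.full(3, 1 / 3)
result = minimize(
    lambda s: -response(s),          # minimise the negative to maximise response
    x0=equal_split,
    method="SLSQP",
    bounds=[(0.0, 1.0)] * 3,         # no negative spend
    constraints=[{"type": "eq", "fun": lambda s: np.sum(s) - 1.0}],  # spend it all
)

optimal_spend = result.x * total_budget
print(optimal_spend.round(0), response(result.x), response(equal_split))
```

Hill curves with n > 1 are not concave, so a local solver like this can settle on a local optimum; trying several starting splits is a cheap safeguard. To carry posterior uncertainty through, the same optimisation could be repeated for many posterior draws of the parameters and the resulting allocations compared.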


Results

After reallocation based on model recommendations (validated with a phased geo-test):


Key Learnings