Multi-Touch Attribution: Moving Beyond Last Click

Published on: July 14, 2025


Last-click attribution is the analytics equivalent of giving the goal scorer all the credit and ignoring the build-up play. In most real customer journeys, users touch five to twelve channels before converting — and the channel that happens to be last is rarely the one that did the most work. Multi-touch attribution (MTA) tries to distribute credit more fairly across the full journey.


Why Last-Click Fails

Consider a user who sees a brand video on YouTube, clicks a display ad three days later, searches for your brand name on Google and clicks a paid search ad, and finally converts. Last-click gives 100% of the credit to paid brand search — the easiest, cheapest touch that required the least work. The YouTube campaign that introduced the brand gets nothing.

Over time, this creates a feedback loop: brand campaigns are defunded because they show zero attributed revenue, awareness drops, and eventually paid search becomes less effective because there's no brand to search for. This is the death spiral of last-click-optimised budgets.


Rule-Based Models

Rule-based models are transparent and easy to implement. The trade-off is that the rules are arbitrary:
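
To make that arbitrariness concrete, here is a minimal sketch of four commonly used rules applied to the YouTube, display, paid-search journey described earlier. The function name `attribute`, the model labels, and the 40/20/40 position-based split are illustrative assumptions, not a standard API.

```python
def attribute(journey, model="linear"):
    """Split one conversion's credit across the channels in `journey`.

    journey: ordered list of channel names, first touch first.
    model: 'last_click', 'first_click', 'linear', or 'position_based'.
    Returns {channel: share}; the shares sum to 1.0.
    """
    n = len(journey)
    if n == 0:
        return {}
    if model == "last_click":
        weights = [0.0] * (n - 1) + [1.0]
    elif model == "first_click":
        weights = [1.0] + [0.0] * (n - 1)
    elif model == "linear":
        weights = [1.0 / n] * n
    elif model == "position_based":
        # Assumed 40/20/40 split: 40% first, 40% last, 20% shared by the middle.
        if n == 1:
            weights = [1.0]
        elif n == 2:
            weights = [0.5, 0.5]
        else:
            weights = [0.4] + [0.2 / (n - 2)] * (n - 2) + [0.4]
    else:
        raise ValueError(f"unknown model: {model}")

    # Sum weights per channel so repeated touches accumulate credit.
    credit = {}
    for channel, weight in zip(journey, weights):
        credit[channel] = credit.get(channel, 0.0) + weight
    return credit


journey = ["youtube", "display", "paid_brand_search"]
for model in ("last_click", "first_click", "linear", "position_based"):
    print(model, attribute(journey, model))
```

The same journey yields four different answers depending on the rule, and nothing in the data says which one is right.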


Data-Driven Attribution: Markov Chains

Data-driven MTA doesn't assume which touches matter — it learns from your actual conversion data. The Markov chain approach models the customer journey as a sequence of states (channels), estimates transition probabilities between states, and computes each channel's "removal effect" — how much conversion probability drops if that channel is removed from all paths.

from collections import defaultdict

def build_transition_matrix(paths, conversions):
    """
    paths: list of lists, e.g. [['paid_search','email','direct'],...]
    conversions: list of 0/1 matching each path
    """
    transitions = defaultdict(lambda: defaultdict(int))

    for path, converted in zip(paths, conversions):
        journey = ['start'] + path + (['conversion'] if converted else ['null'])
        for i in range(len(journey) - 1):
            transitions[journey[i]][journey[i + 1]] += 1

    # Normalise to probabilities
    matrix = {}
    for state, nexts in transitions.items():
        total = sum(nexts.values())
        matrix[state] = {k: v / total for k, v in nexts.items()}

    return matrix

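def conversion_probability(matrix, removed=None, max_iter=1000, tol=1e-12):
    """P(reaching 'conversion' from 'start'), optionally without one channel."""
    # 'conversion' is absorbing with value 1; 'null' and unseen states count as 0.
    prob = {'conversion': 1.0}
    for _ in range(max_iter):
        delta = 0.0
        for state, nexts in matrix.items():
            if state == removed:
                continue
            # Transitions into the removed channel are dropped, so that
            # probability mass is lost (equivalent to routing it to 'null').
            new = sum(p * prob.get(t, 0.0) for t, p in nexts.items() if t != removed)
            delta = max(delta, abs(new - prob.get(state, 0.0)))
            prob[state] = new
        if delta < tol:
            break
    return prob.get('start', 0.0)

def removal_effect(matrix, channel):
    """Share of conversion probability lost when 'channel' is removed."""
    # Simplified: iterates to convergence instead of solving the matrix form.
    baseline = conversion_probability(matrix)
    if baseline == 0:
        return 0.0
    return 1 - conversion_probability(matrix, removed=channel) / baseline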
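
Removal effects are not credit on their own: they generally do not sum to 1 across channels. A common final step is to normalise them into shares and spread the observed conversions accordingly. The sketch below assumes the removal effects have already been computed; the function name and the example numbers are hypothetical.

```python
def attribution_from_removal_effects(removal_effects, total_conversions):
    """Scale removal effects so they sum to 1, then split conversions by share.

    removal_effects: {channel: effect}, each effect in [0, 1].
    total_conversions: number of conversions to distribute.
    """
    total_effect = sum(removal_effects.values())
    if total_effect == 0:
        # No channel changes the outcome, so there is nothing to attribute.
        return {channel: 0.0 for channel in removal_effects}
    return {
        channel: total_conversions * effect / total_effect
        for channel, effect in removal_effects.items()
    }


# Hypothetical removal effects, for illustration only.
effects = {"paid_search": 0.6, "email": 0.3, "display": 0.1}
credit = attribution_from_removal_effects(effects, total_conversions=200)
print(credit)
```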

Shapley Value Attribution

An alternative data-driven approach uses Shapley values from cooperative game theory. Each channel is a "player" and the conversion is the "prize." The Shapley value computes each channel's fair share of credit by averaging its marginal contribution across all possible orderings in which the channels could be added. It's theoretically sound but computationally expensive for journeys with many channels, because the number of channel combinations to evaluate grows exponentially with the number of channels involved.
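
As a rough illustration, the exact calculation can be sketched for a handful of channels. This assumes you can assign a value to any coalition of channels; here a coalition is valued as the conversions from journeys whose channels all fall inside it, which is one common simplification rather than the only option. The data and the names `shapley_values` and `coalition_value` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(channels, value):
    """Exact Shapley values for a small list of channels.

    channels: list of channel names (the "players").
    value: function mapping a frozenset of channels to that coalition's worth.
    """
    n = len(channels)
    result = {}
    for channel in channels:
        others = [c for c in channels if c != channel]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                marginal = value(coalition | {channel}) - value(coalition)
                total += weight * marginal
        result[channel] = total
    return result


# Hypothetical data: conversions observed per distinct set of channels touched.
conversions_by_set = {
    frozenset({"youtube"}): 10,
    frozenset({"paid_search"}): 30,
    frozenset({"youtube", "paid_search"}): 60,
}


def coalition_value(coalition):
    # Assumption: a coalition is worth the conversions from every journey
    # whose channel set is contained in it.
    return sum(v for s, v in conversions_by_set.items() if s <= coalition)


shares = shapley_values(["youtube", "paid_search"], coalition_value)
print(shares)  # {'youtube': 40.0, 'paid_search': 60.0}
```

The shares add up to the 100 total conversions, and YouTube earns credit for the journeys it appears in alongside paid search. The nested loop over subsets is where the exponential cost comes from.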

In practice: use Markov chains for speed and interpretability at scale; use Shapley for high-value B2B journeys where precision matters more than throughput.


Practical Constraints

MTA has real limitations to be honest about:


The Right Stack

In practice, mature analytics teams use all three in combination:

Each addresses a different question. The mistake is using any one of them as the single source of truth.


Conclusion

Multi-touch attribution is better than last-click in almost every situation, but it's not a solved problem. The transition from last-click to a more sophisticated model is less about picking the perfect algorithm and more about building the data infrastructure (unified IDs, clean journey tables) and the organisational trust to act on numbers that don't always flatter the last channel in the funnel.