User Churn Prediction Model

Fintech  ·  Python · XGBoost · SHAP · SQL


Problem

At a high-growth fintech platform, we were losing roughly 18% of activated users within the first 90 days. Retention campaigns existed but were untargeted — every churned user got the same re-engagement email, regardless of why they left. The result: low response rates, high unsubscribe rates, and wasted campaign budget.

The goal was to build a model that could identify users at risk of churning before they churned, score them by risk level, and enable the CRM team to run differentiated interventions.


Defining Churn

The first challenge was definitional. For a payments platform, "churn" is non-contractual — users don't cancel; they just stop transacting. We defined churn as: no transaction in 60 days for users who had at least one transaction in their first 30 days of activation. This gave us a clean binary label and excluded new users still in the activation funnel.
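That labelling rule can be sketched in a few lines of pandas. This is a minimal illustration, not the production job — the function name and column names (`user_id`, `txn_date`, `activation_date`) are assumptions:

```python
import pandas as pd

def label_churn(df_txns: pd.DataFrame, reference_date: pd.Timestamp) -> pd.DataFrame:
    """Churn = no transaction in the 60 days before reference_date,
    restricted to users with >= 1 transaction in their first 30 days
    after activation (everyone else is still in the funnel)."""
    # Users who transacted within 30 days of activation
    in_first_30d = df_txns['txn_date'] <= df_txns['activation_date'] + pd.Timedelta(days=30)
    activated_users = df_txns.loc[in_first_30d, 'user_id'].unique()

    # Most recent transaction per user
    last_txn = df_txns.groupby('user_id')['txn_date'].max()

    labels = last_txn.loc[last_txn.index.isin(activated_users)].to_frame('last_txn')
    labels['churned'] = (reference_date - labels['last_txn']).dt.days >= 60
    return labels[['churned']].reset_index()
```

Users whose first transaction falls outside their first 30 days never receive a label, which keeps the activation funnel out of the training set.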


Feature Engineering

We built a feature set around three themes — recency and frequency of activity, transaction value and quality, and breadth of product usage — all computed on a rolling 30-day lookback window per user:

import pandas as pd
from datetime import timedelta

def build_features(df_events, reference_date, lookback_days=30):
    """Aggregate per-user behavioural features over a rolling lookback window."""
    cutoff = reference_date - timedelta(days=lookback_days)
    window = df_events[df_events['event_date'] >= cutoff].copy()

    features = (
        window.groupby('user_id')
        .agg(
            txn_count       = ('txn_id',       'count'),    # transaction frequency
            active_days     = ('event_date',   'nunique'),  # distinct days with activity
            avg_txn_value   = ('amount',       'mean'),
            success_rate    = ('is_success',   'mean'),     # share of successful txns
            products_used   = ('product_code', 'nunique'),  # breadth of product usage
            days_since_last = ('event_date',   lambda x: (reference_date - x.max()).days),  # recency
        )
        .reset_index()
    )
    return features

Model

We trained a gradient-boosted classifier (XGBoost) with 5-fold stratified cross-validation. Hyperparameters were tuned with Optuna. Class imbalance (~80:20 non-churn:churn) was handled via scale_pos_weight. We prioritised recall in the top two deciles — we'd rather flag too many at-risk users than miss them.
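Because we optimised for recall in the top risk deciles rather than overall accuracy, the headline evaluation metric can be sketched as below. The function name is ours, and this is a simplified version of the decile analysis:

```python
import numpy as np

def top_decile_recall(y_true, y_score, n_deciles=2):
    """Fraction of all actual churners captured in the top-n risk
    deciles of the score distribution."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    k = int(len(y_score) * n_deciles / 10)    # users in the top-n deciles
    top_idx = np.argsort(y_score)[::-1][:k]   # highest-risk users first
    return y_true[top_idx].sum() / y_true.sum()

# Imbalance handling: with an ~80:20 split, scale_pos_weight is set to
# roughly n_negative / n_positive ≈ 4 in the XGBoost config.
```

A model can score well on AUC while burying churners in the middle deciles; this metric directly measures what the CRM team consumes — the users at the top of the weekly list.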

Final model metrics on holdout:


Interpretability with SHAP

SHAP (SHapley Additive exPlanations) was used to make the model actionable for the CRM team. Instead of a black-box score, each user got a risk score plus the top 3 drivers of their score. This told the CRM team why a user was at risk — enabling personalised messaging.

import shap

explainer   = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_holdout)   # one row of contributions per user

# Top-n features by absolute SHAP contribution for a single user's row
def top_shap_drivers(shap_row, feature_names, n=3):
    sorted_idx = abs(shap_row).argsort()[::-1][:n]
    return [(feature_names[i], round(float(shap_row[i]), 3)) for i in sorted_idx]

df_holdout = X_holdout.copy()   # scored holdout frame passed to the CRM team
df_holdout['shap_drivers'] = [
    top_shap_drivers(shap_values[i], list(X_holdout.columns))
    for i in range(len(X_holdout))
]

Deployment & Impact

The model was deployed as a weekly batch job. Every Monday, users were scored, and the top two risk deciles were passed to the CRM platform with their SHAP-driven personalisation tags, which fed three differentiated CRM interventions.
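The decile cut in the weekly batch can be sketched as follows. This is illustrative only — the function name is ours, and the production job also attaches the SHAP tags:

```python
import pandas as pd

def select_at_risk(scores: pd.Series, n_deciles: int = 2) -> pd.Index:
    """Return the user_ids (the Series index) falling in the top-n
    risk deciles of this week's scores."""
    # Rank first so ties don't collapse decile boundaries; bin 0 = lowest risk
    deciles = pd.qcut(scores.rank(method='first'), 10, labels=False)
    return scores[deciles >= 10 - n_deciles].index
```

Ranking before `qcut` keeps decile sizes stable even when many users share the same score, which matters for keeping the weekly CRM list a predictable size.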

After three months of running the model-driven campaigns against the old blanket approach, the targeted cohort showed a 22% reduction in 90-day churn, translating to approximately ₹8 crore in retained annual revenue.


Key Learnings