Customer loyalty programs are integral to many business strategies aimed at increasing repeat purchases and customer lifetime value. However, measuring their impact on customer spending is challenging due to confounding factors, such as pre-existing customer behaviors or demographic characteristics that may influence program enrollment and spending patterns. This blog presents a causal modeling approach using Propensity Score Matching (PSM) to estimate the effect of loyalty programs on customer spending. We will also provide a detailed technical explanation of the PSM algorithm, end-to-end implementation using Azure services, and a system architecture diagram for scalability in the cloud.
Proposed Solution
We propose using Propensity Score Matching (PSM), a causal inference technique that balances the treatment group (loyalty program members) and the control group (non-members) on observed characteristics so that the two groups are comparable.
The technical solution includes:
- Building a Propensity Score model using logistic regression to estimate the probability of joining the loyalty program.
- Matching customers from treatment and control groups based on their propensity scores.
- Analyzing post-matching outcomes to measure the impact on customer spending.
Causal Modeling for Measuring the Impact of a Customer Loyalty Program
Causal modeling techniques allow us to isolate the effect of loyalty programs from other factors influencing spending. PSM focuses on balancing the treatment group (joined the customer loyalty program) and the control group (did not join the customer loyalty program) by matching customers with similar characteristics.
Propensity Score Matching Algorithm Logic
Propensity Score Matching can be broken down into four key steps:
Estimating Propensity Scores
Use logistic regression to model the probability of joining the loyalty program. Let X_i be the set of covariates (age, income, previous spending) for customer i. The model estimates:
p(X_i) = Pr(T_i = 1 | X_i)
Here p(X_i) is the propensity score, with T_i = 1 if the customer joins the program and T_i = 0 otherwise.
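To make the estimation step concrete, here is a minimal sketch of a logistic regression propensity model fitted by batch gradient descent. It uses only the Python standard library so that it stays self-contained; in practice a library implementation such as scikit-learn's LogisticRegression would be the more usual choice. The function names, the learning rate, the epoch count, and the toy data (a single standardized covariate) are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_propensity_model(X, T, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    X: list of covariate rows (assumed already standardized).
    T: list of 0/1 treatment indicators (1 = joined the program).
    Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, t in zip(X, T):
            # Prediction error for this customer: p(X_i) - T_i.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - t
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def propensity_scores(X, w, b):
    # p(X_i) = Pr(T_i = 1 | X_i) for every customer.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) for x in X]


# Hypothetical toy data: one standardized covariate per customer.
X = [[-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5]]
T = [0, 0, 1, 0, 1, 1]

w, b = fit_propensity_model(X, T)
scores = propensity_scores(X, w, b)
```

Each entry of `scores` is the estimated probability that the corresponding customer joins the program, which is the quantity used for matching in the next step.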
Matching
After calculating propensity scores for each customer, we match treated customers (those who joined the program) with untreated customers (those who did not) based on the nearest propensity scores using one of the following methods:
- Nearest Neighbor Matching: Match each treated customer with the closest untreated customer.
- Caliper Matching: Match treated and untreated customers within a predefined range of propensity scores.
- Kernel Matching: Assign weights to control customers based on the similarity of their propensity scores to treated customers.
Balance Checking
Once matching is complete, check the covariate balance between the treatment and control groups to confirm that matching has balanced the two groups on all observed characteristics.
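One common way to check balance is the standardized mean difference (SMD) of each covariate between the matched groups. The sketch below assumes the pooled-standard-deviation form of the SMD and uses a threshold of 0.1, which is a widely used rule of thumb rather than something prescribed by this post. The function name and the age values are hypothetical.

```python
import math
import statistics


def standardized_mean_difference(treated_values, control_values):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    mean_t = statistics.mean(treated_values)
    mean_c = statistics.mean(control_values)
    var_t = statistics.variance(treated_values)  # sample variance
    var_c = statistics.variance(control_values)
    pooled_sd = math.sqrt((var_t + var_c) / 2)
    return (mean_t - mean_c) / pooled_sd


# Hypothetical ages of matched treated and control customers.
treated_age = [30, 25, 41, 36]
control_age = [32, 27, 40, 35]

smd_age = standardized_mean_difference(treated_age, control_age)
# Rule-of-thumb check (assumption): |SMD| below 0.1 suggests adequate balance.
age_is_balanced = abs(smd_age) < 0.1
```

The same function would be applied to every covariate used in the propensity model; a covariate that fails the check is a signal to revisit the model or the matching settings.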
Outcome Estimation
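The Average Treatment Effect on the Treated (ATT) is computed by comparing the post-program spending of the matched treated and control groups:
ATT = (1/N_T) Σ_{i ∈ T} (Y_i − Y_{j(i)})
where N_T is the number of matched treated customers, Y_i is the post-program spending of treated customer i, and Y_{j(i)} is the post-program spending of the control customer matched to i.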
End-to-End Implementation of PSM in Measuring the Impact of a Customer Loyalty Program
Step 1: Data Preparation
Load customer data into Azure Data Lake. Data includes demographics (age, income, etc.), prior spending behavior, and loyalty program participation.
Step 2: Propensity Score Estimation
Implement the logistic regression model using Azure Machine Learning. Use features like customer demographics and spending history to estimate propensity scores for each customer.
Step 3: Matching
Use Azure Databricks for distributed computation of the matching algorithm. Perform Nearest Neighbor or Caliper Matching based on the calculated propensity scores.
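The matching logic itself can be sketched on a single machine before it is scaled out. The example below is a plain Python illustration, not Databricks code: it performs greedy one-to-one nearest neighbor matching without replacement, with an optional caliper. The function name, the greedy ordering, and the choice to match without replacement are assumptions; the propensity scores are the ones used in the worked example later in this post.

```python
def match_nearest_neighbor(treated, controls, caliper=None):
    """Greedy 1:1 nearest neighbor matching on propensity scores.

    treated, controls: dicts mapping customer ID -> propensity score.
    caliper: optional maximum allowed score distance for a valid match.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet used in a match
    pairs = []
    # Process treated customers from highest to lowest score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1], reverse=True):
        if not available:
            break
        # Closest remaining control by absolute score distance.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no control within the caliper; leave unmatched
        pairs.append((t_id, c_id))
        del available[c_id]  # match without replacement
    return pairs


treated_scores = {1: 0.72, 3: 0.34}
control_scores = {2: 0.71, 4: 0.35}

pairs = match_nearest_neighbor(treated_scores, control_scores, caliper=0.05)
print(pairs)  # [(1, 2), (3, 4)]
```

A tighter caliper trades sample size for match quality: treated customers with no control inside the caliper are dropped from the analysis rather than being matched poorly.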
Step 4: Covariate Balance Checking
After matching, run balance checks to ensure that covariates between treated and control groups are statistically similar using standardized mean differences (SMDs).
Step 5: Outcome Analysis
Compute the ATT by comparing the difference in average spending between treated and matched control groups using Azure Databricks.
Explaining PSM with an Example
Scenario:
| Customer ID | Age | Income | Past Spending | Joined Loyalty (Treated) | Propensity Score |
| --- | --- | --- | --- | --- | --- |
| 1 | 30 | 50,000 | 500 | 1 | 0.72 |
| 2 | 45 | 60,000 | 700 | 0 | 0.71 |
| 3 | 25 | 40,000 | 200 | 1 | 0.34 |
| 4 | 35 | 55,000 | 450 | 0 | 0.35 |
We match Customer 1 (treatment group) with Customer 2 (control group) because their propensity scores are the closest (0.72 vs. 0.71). Similarly, we match Customer 3 (treatment group) with Customer 4 (control group), whose scores are 0.34 and 0.35.
Example Outcome
Customer 1: Spending after joining the loyalty program = $1000. The matched control customer spent $800.
Customer 3: Spending after joining the loyalty program = $600. The matched control customer spent $550.
ATT = (1/2) × [(1000 − 800) + (600 − 550)] = 125
So, the Average Treatment Effect on the Treated (ATT) is $125. This suggests that, on average, customers who joined the loyalty program spent $125 more than similar customers who didn’t join.
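The same arithmetic can be reproduced in a few lines of Python. This is a small sketch using the matched pairs and spending figures from the example above; the function name and the dictionary layout are assumptions made for illustration.

```python
def average_treatment_effect_on_treated(pairs, spending):
    """Mean difference in post-program spending across matched pairs.

    pairs: list of (treated_id, control_id) tuples.
    spending: dict mapping customer ID -> post-program spending.
    """
    diffs = [spending[t_id] - spending[c_id] for t_id, c_id in pairs]
    return sum(diffs) / len(diffs)


# Matched pairs and post-program spending from the example.
pairs = [(1, 2), (3, 4)]
spending = {1: 1000, 2: 800, 3: 600, 4: 550}

att = average_treatment_effect_on_treated(pairs, spending)
print(att)  # 125.0
```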
Conclusion and Future Work
This blog post provided a technical walkthrough of measuring the impact of a customer loyalty program on customer spending using Propensity Score Matching. By leveraging causal modeling, we controlled for selection bias arising from observed confounders in observational data.
Future Work
- Heterogeneous Treatment Effects: Explore how the impact of the loyalty program varies across different customer segments.
- Longitudinal Analysis: Expand the model to capture the long-term effect of the loyalty program on customer spending using time-series data.
- Additional Azure Services: Incorporate Azure Synapse Analytics for even larger-scale data processing and advanced analytics.
Author: Johny Jose, Manager, Data Science