Defining Metric Cohort Analysis
- There are many different methods for analyzing customer cohorts for churn
- Generally, we should create cohorts for customers who we think have different churn rates
- The most basic method involves considering people who often use a product as less likely to churn
- As an example, suppose active customers churn less often compared to inactive customers
- Then, a group of active customers should have a lower churn rate than a group of inactive customers
- As a result, we should create cohorts based on a customer's level of activity
Illustrating Steps for Cohort Analysis
-
Read in data including the following columns:
- Customer ID
- Binary churn column
- Date
- Cohort column (e.g. number of clicks)
-
Segment cohorts by cohort column
- Usually, we group customers into cohorts so each cohort contains 10% of the total sum of the cohort column
- E.g. each customer is assigned to a cohort where each cohort contains 10% of the total number of clicks
- Calculate averages of cohort column for each cohort (or segment)
- Plot the churn rates versus the corresponding average of the cohort column for each cohort
Illustrating Code for Cohort Analysis
# Step 1: Read in data
df = pd.read_csv('data.csv')
# Step 2: Group customers into cohorts
groups = pd.qcut(df['cohort_column'], 10)
# Step 3: Calculate averages for plot
means = df.groupby(groups)['cohort_column'].mean()
churns = df.groupby(groups)['is_churn'].mean()
# Step 4: Plot churn rates vs averages
plt = pd.DataFrame({x: means.values, y: churns.values})
plt.plot.line()
Summarizing Customer Behavior
-
Typically, we'll need to impute missing values and aggregate non-zero values for each column
- We may also want to remove columns with 90% missing values
-
We can do this by calling the
DataFrame.describe
function in pandas- We can determine if columns are skewed by calling
DataFrame.skew
- We can determine if columns are skewed by calling
- This will help in profiling and understandomg the distributions of each column
-
In other words, we'll need to run quick profiles including the following aggregates for each column:
- Counts
- Null values
- Non-zero values
- Skewness of columns
- Means
- Standard deviation
- Minimum values
- Maximum values
Scoring Metrics for Segmentation
- Typically, we'll want to standardize each column before clustering
-
In particular, we'll use a standard scaling formula
- To do this, we can call the
sklearn.preprocessing.StandardScaler
function
- To do this, we can call the
- However, some columns may be extremely skewed
- In order to reduce the skewness of a column, we can first transform those skewed columns
- Specifially, we'll take the logarithm of the column
-
The following steps are done for skewed columns:
- Gather feature
- Take the log of feature
- Standardize features by removing the mean and scaling to unit variance
- Taking the log will result in more evenly distributed standardized features
- We can skip step 2 for unskewed features
Segmenting Customers by using Cohort Analysis
- Cohorts should be created for customers with different churn rates
-
The following features are some of the most common features used for separating segments:
- Shopping frequency
- Engagement frequency
- Seasonality
- Purchases or engagement around promotional events
-
As a result, each segment should have a different churn rate
-
Because, assigning a segment with an arbitrary or fixed churn rate (used across all segments) may either:
- Inaccurately label certain customers as churned
- Or may miss certain churned customers
-
For example, suppose a segment contains customers consistently purchasing during a yearly promo
-
If we naively define a churn window of year, we may mistakenly treat this customer as churned
- By having a better understanding of this customer, we can apply a more specialized targeting strategy for this segment of customers (e.g. increasing frequency promos)
-
-
However, if we naively increase the churn window to years for all customers, then we may mistakenly label certain customers as non-churned when they actually have churned
- E.g. if a customer who frequently purchases every week stops purchasing all of a sudden, we wouldn't label this customer as churned for another year or so
- This would be a big mistake, since this segment of customers would be most likely high value customers
-
- Then, we can assign different targeting strategies for each segment
-
Each targeting strategy should achieve any of the following goals:
- Improving shopping (or engagement) frequency
- Improving size of orders
- Reducing churn rate
Describing Clustering Methods
- Essentially, clustering methods group customers based on features that're correlated with each other
-
Roughly, there are popular clustering methods:
- Agglomerative
- Density-based
- Centroid-based
-
Agglomerative clustering combines similar customers in a bottom-up manner
- First, groups are formed starting from just two similar customers
- More elements are added to form larger groups of similar items as the algorithm advances
- Density-based clustering combines similar customers that are close to each other
- Centroid-based clustering combines similar customers together who are closest to a centroid
Standardizing Features with Ratios
- Typically, we'll want to transform features before clustering as ratios
-
For example, we may not only want to include a number of email opens feature for clustering
- This is because certain customers might be sent more emails than others
- In this case, we may want to only cluster on a feature repesenting a ratio of number of email opens relative to number of email sends
- Also, we may want to create a ratio of number of email opens relative to the spend for each customer