Data Science

Clustering refers to the process of grouping similar observations together
- Roughly, we can think of clustering as grouping observations that are close to each other under some similarity measure
Typically, clustering is used for exploratory data analysis
One common use-case is customer segmentation
- Similar customer profiles are identified and grouped together into clusters (or segments)
Customers are grouped into clusters based on a similarity metric that accounts for:
- Demographic attributes
- Behavioral attributes
- Financial attributes
Each segment has a centroid representing its geometric center
Roughly, we can think of the centroid as the average profile of the customers in that segment
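
A minimal sketch of the centroid as an average profile. The segment below is made up for illustration: four hypothetical customers described by age and annual income.

```python
import numpy as np

# Hypothetical customer profiles in one segment: [age, annual income in $1000s]
segment = np.array([
    [52, 110],
    [58, 125],
    [61, 140],
    [49, 105],
])

# The centroid is the per-attribute mean of the segment's members
centroid = segment.mean(axis=0)
print(centroid)  # [ 55. 120.]
```

Here the centroid reads as "a 55-year-old customer earning about $120k", even though no single customer has exactly that profile.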

For example, one segment may represent power shoppers, who are older and have higher incomes
Whereas another segment may represent high engagers, who are young and typically use social media
As a result, segmentation allows us to do the following:
- Understand who our customers are
- Summarize those established customer profiles
- Create segments that can be used as a feature for other predictive models
  - E.g. creating labels representing power shoppers, app engagers, brand loyalists, etc.
- Generalize sparse data
  - E.g. using topic modeling to produce topics from words in articles
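
One way to use segments as a feature for other models is sketched below. It assumes scikit-learn's KMeans and synthetic data. The two customer groups, their attributes, and the variable names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical customers: [age, monthly app sessions]
older_shoppers = rng.normal(loc=[60, 5], scale=[3, 1], size=(50, 2))
young_engagers = rng.normal(loc=[25, 40], scale=[3, 4], size=(50, 2))
X = np.vstack([older_shoppers, young_engagers])

# Assign each customer to one of two segments
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
segment = kmeans.fit_predict(X)

# One-hot encode the segment label and append it to the original features,
# so a downstream predictive model can use segment membership as an input
segment_onehot = np.eye(2)[segment]
X_with_segment = np.hstack([X, segment_onehot])
print(X_with_segment.shape)  # (100, 4)
```

The one-hot encoding is a design choice: segment ids are arbitrary, so feeding them in as a single numeric column would imply an ordering that does not exist.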

Successful segmentation projects require the following:
- Good understanding of customers and expectations
- Thorough experimentation
- Executable marketing strategies
Segmentation projects involve the following challenges:
- Thorough, unsupervised experimentation
- The optimization objective is ambiguous
  - There are no ground-truth labels to score against, unlike in classification models
  - Results depend on the chosen clustering method and parameters
- Difficult to interpret why certain customers are included in one cluster over another cluster
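
To illustrate how results depend on the chosen parameters, the sketch below compares several cluster counts using the silhouette score. The silhouette score is one common heuristic and is an assumption here, not something these notes prescribe. The data is synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Synthetic data: three well-separated blobs
centers = np.array([[0, 0], [10, 0], [5, 9]])
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(60, 2)) for c in centers])

# Fit K-means for several values of k and score each clustering
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The k with the highest silhouette score is a candidate, not a final answer
best_k = max(scores, key=scores.get)
print(best_k)
```

On real customer data the scores are rarely this clear-cut, which is why the choice usually also involves business judgment.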

K-means algorithm:
- Better for roughly spherical clusters
- Very efficient
- Must specify the number of clusters in advance
- Each observation gets assigned to a cluster
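
A minimal NumPy sketch of Lloyd's algorithm, the standard iteration behind K-means: assign each observation to its nearest centroid, then move each centroid to the mean of its members. The function name `kmeans` and the synthetic data are hypothetical. A production library such as scikit-learn adds better initialization and multiple restarts.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Simple Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct observations chosen at random
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every observation to every centroid, shape (n, k)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        # Every observation is assigned to its nearest centroid
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if empty)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return labels, centroids

# Synthetic data: two well-separated spherical blobs
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=[0, 0], scale=1.0, size=(50, 2)),
    rng.normal(loc=[20, 20], scale=1.0, size=(50, 2)),
])
labels, centroids = kmeans(X, k=2)
print(centroids.shape)  # (2, 2)
```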
DBSCAN:
- Better for clusters that form dense regions, including irregular shapes
- Not very efficient
- Can't specify number of clusters (it follows from the density parameters)
- Not every observation is assigned to a cluster (points in sparse regions are treated as noise)
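
A small sketch of the DBSCAN behavior described above, assuming scikit-learn's implementation. The data is synthetic, and the `eps` and `min_samples` values are chosen only to suit it. scikit-learn marks unassigned (noise) observations with the label -1.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# Two dense groups plus two isolated points far from both
dense_a = rng.normal(loc=[0, 0], scale=0.3, size=(50, 2))
dense_b = rng.normal(loc=[5, 5], scale=0.3, size=(50, 2))
outliers = np.array([[2.5, 2.5], [-4.0, 6.0]])
X = np.vstack([dense_a, dense_b, outliers])

# No cluster count is given; it emerges from eps and min_samples
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

n_clusters = len(set(labels.tolist()) - {-1})  # -1 is the noise label
n_noise = int(np.sum(labels == -1))
print(n_clusters, n_noise)
```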
Gaussian Mixture Modeling:
- Better for elliptical clusters
- Less efficient
- Can specify number of clusters
- Each observation gets assigned to a cluster, along with a membership probability for every cluster
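
A sketch of Gaussian mixture clustering, assuming scikit-learn's GaussianMixture and synthetic elongated clusters. It shows both the hard assignment and the per-cluster membership probabilities.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Two elongated (elliptical) clusters: wide along x, narrow along y
cov = [[4.0, 0.0], [0.0, 0.2]]
a = rng.multivariate_normal([0, 0], cov, size=100)
b = rng.multivariate_normal([0, 4], cov, size=100)
X = np.vstack([a, b])

# "full" covariance lets each component take its own elliptical shape
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(X)

hard = gmm.predict(X)        # one cluster label per observation
soft = gmm.predict_proba(X)  # membership probability for each cluster
print(soft.shape)  # (200, 2)
```

The hard label is simply the component with the highest membership probability, so the soft output carries strictly more information.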
Hierarchical:
- Better for data with nested (tree-like) cluster structure
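- Less efficient on large data (standard agglomerative methods scale at least quadratically with the number of observations)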
- Can specify number of clusters
- Each observation gets assigned to a cluster
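
A sketch of hierarchical (agglomerative) clustering, assuming SciPy's `linkage` and `fcluster` with Ward linkage on synthetic data. The same tree is cut at two different cluster counts to show that the clusterings are nested.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Two nearby groups and one distant group
centers = np.array([[0, 0], [3, 0], [20, 20]])
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 2)) for c in centers])

# Build the full merge tree once; each row of Z records one merge
Z = linkage(X, method="ward")

# Cut the same tree at different levels to get 3 or 2 clusters
labels_3 = fcluster(Z, t=3, criterion="maxclust")
labels_2 = fcluster(Z, t=2, criterion="maxclust")
print(len(set(labels_3.tolist())), len(set(labels_2.tolist())))  # 3 2
```

Because both cuts come from one tree, each of the three clusters sits entirely inside one of the two larger clusters.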

RFM Segmentation

Propensity Modeling