Coefficient of Variation (CV) - Understanding volatility in customer behavior
Published:
Before applying any forecasting method, I need to understand customers’ purchase patterns. I learned that the “coefficient of variation” (CV) is a vital metric in my toolbox for producing robust forecasts.
What is Coefficient of Variation (CV)?
The calculation is “Standard Deviation / Mean”. In layman’s terms, it tells us how volatile the data is relative to its average.
import numpy as np
# Customer A - Stable
sales_A = [100, 105, 98, 102, 100, 103]
# np.std defaults to the population standard deviation (ddof=0)
cv_A = np.std(sales_A) / np.mean(sales_A)
# cv_A ≈ 0.02 (very low - predictable!)
# Customer B - Volatile
sales_B = [0, 0, 500, 0, 0, 300, 0, 0]
cv_B = np.std(sales_B) / np.mean(sales_B)
# cv_B ≈ 1.8 (very high - erratic!)
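The two calculations above can be wrapped in a small helper. This is only a sketch: the function name `coefficient_of_variation`, the `ddof` parameter, and the choice to return NaN when the mean is zero are my assumptions for illustration, not part of the original scripts.

```python
import numpy as np

def coefficient_of_variation(values, ddof=0):
    """Return std / mean, or NaN when it cannot be computed.

    ddof=0 uses the population standard deviation (NumPy's default);
    ddof=1 uses the sample standard deviation.
    """
    arr = np.asarray(values, dtype=float)
    # Not enough observations for the requested ddof (also covers empty input)
    if arr.size <= ddof:
        return float("nan")
    mean = arr.mean()
    # Assumption: a zero mean makes the ratio undefined, so return NaN
    if mean == 0:
        return float("nan")
    return float(arr.std(ddof=ddof) / mean)

sales_A = [100, 105, 98, 102, 100, 103]
sales_B = [0, 0, 500, 0, 0, 300, 0, 0]

print(round(coefficient_of_variation(sales_A), 2))  # 0.02
print(round(coefficient_of_variation(sales_B), 2))  # 1.8
print(coefficient_of_variation([0, 0, 0]))          # nan
```

Returning NaN keeps customers with no sales from raising a division warning, and lets the caller decide how to treat them.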
In my scripts, I use CV along with other metrics, such as recent activity % and revenue, to separate predictable customers from unpredictable ones.
Using CV for Customer Classification
"""
From config.py
If a customer's CV is below 2.0, the customer is assigned to the forecasting group, which represents predictable, regular purchasers; otherwise, the customer is assigned to the allocation group, which covers irregular-volume customers with sporadic activity.
"""
cv_threshold: float = 2.0
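To make the rule concrete, here is a minimal sketch of how the threshold could be applied to a few customers. The function `assign_group`, the hypothetical customer C, and the choice to send customers with no sales to the allocation group are my assumptions for illustration; the real scripts also look at other metrics.

```python
import numpy as np

CV_THRESHOLD = 2.0  # mirrors cv_threshold from config.py

def assign_group(sales, cv_threshold=CV_THRESHOLD):
    """Return 'forecasting' when CV < cv_threshold, otherwise 'allocation'."""
    arr = np.asarray(sales, dtype=float)
    mean = arr.mean() if arr.size else 0.0
    if mean == 0:
        # Assumption: no sales history is treated as irregular
        return "allocation"
    cv = arr.std() / mean
    return "forecasting" if cv < cv_threshold else "allocation"

customers = {
    "A": [100, 105, 98, 102, 100, 103],
    "B": [0, 0, 500, 0, 0, 300, 0, 0],
    "C": [0, 0, 0, 0, 0, 0, 0, 900],  # hypothetical one-off buyer
}

groups = {name: assign_group(sales) for name, sales in customers.items()}
print(groups)
# {'A': 'forecasting', 'B': 'forecasting', 'C': 'allocation'}
```

Note that customer B from the earlier example (CV ≈ 1.8) still falls under the 2.0 threshold, so on CV alone it lands in the forecasting group. That is one reason the threshold works better in combination with other metrics.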
However, CV isn’t perfect: it can be misleading when a series contains lots of zeros, because the zeros pull the mean down and inflate the ratio. That is why I use it alongside other metrics for robust classification. The calculation is simple but powerful. The conclusion: before you forecast, segment your data by CV. You’ll save hours of debugging why your model doesn’t work.
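As a quick illustration of the zero caveat, consider a made-up customer who orders exactly 100 units every fourth period. The purchases are perfectly regular, yet the zeros push CV close to the 2.0 threshold. The series and variable names below are hypothetical, and comparing against the CV of non-zero periods only is just one possible cross-check, not necessarily what my scripts do.

```python
import numpy as np

# Hypothetical customer: exactly 100 units every fourth period, zero otherwise
quarterly = np.array([100, 0, 0, 0] * 6, dtype=float)

# CV over all periods, zeros included
cv_all = float(quarterly.std() / quarterly.mean())

# CV over the periods that actually had a purchase
nonzero = quarterly[quarterly > 0]
cv_nonzero = float(nonzero.std() / nonzero.mean())

# Share of periods with no purchase at all
zero_share = float((quarterly == 0).mean())

print(round(cv_all, 2))      # 1.73
print(round(cv_nonzero, 2))  # 0.0
print(zero_share)            # 0.75
```

Looking at the share of zero periods next to CV helps tell "regular but infrequent" apart from "genuinely erratic".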
