Coefficient of Variation (CV) - Understanding volatility in customer behavior

Before applying any forecasting method, I need to understand each customer’s purchase pattern. I have found that the coefficient of variation (CV) is a vital metric in my toolbox for producing robust forecasts.

What is Coefficient of Variation (CV)?

The calculation is standard deviation divided by mean (CV = standard deviation / mean). In layman’s terms, it tells us how volatile the data is relative to its average.

import numpy as np

# Customer A - stable
sales_A = [100, 105, 98, 102, 100, 103]
# np.std defaults to the population standard deviation (ddof=0)
cv_A = np.std(sales_A) / np.mean(sales_A)
# cv_A ≈ 0.02 (very low - predictable!)

# Customer B - volatile
sales_B = [0, 0, 500, 0, 0, 300, 0, 0]
cv_B = np.std(sales_B) / np.mean(sales_B)
# cv_B ≈ 1.8 (very high - erratic!)
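
The two calculations above divide by the mean directly, which fails for a customer with no sales at all. A minimal sketch of a reusable helper follows. The function name `coefficient_of_variation` and the choice to return NaN for a zero mean are my assumptions for illustration, not something taken from the original scripts.

```python
import math

import numpy as np


def coefficient_of_variation(values, ddof=0):
    """Return std / mean, or NaN when CV is undefined (hypothetical helper).

    ddof=0 matches the np.std default (population standard deviation);
    pass ddof=1 for the sample standard deviation.
    """
    arr = np.asarray(values, dtype=float)
    if arr.size == 0:
        return float("nan")  # no observations: nothing to measure
    mean = arr.mean()
    if mean == 0:
        return float("nan")  # assumption: report NaN instead of dividing by zero
    return float(arr.std(ddof=ddof) / mean)


cv_stable = coefficient_of_variation([100, 105, 98, 102, 100, 103])
cv_volatile = coefficient_of_variation([0, 0, 500, 0, 0, 300, 0, 0])
cv_inactive = coefficient_of_variation([0, 0, 0, 0])

print(round(cv_stable, 2), round(cv_volatile, 2), math.isnan(cv_inactive))
```

Returning NaN keeps inactive customers visible, so a later step can decide explicitly how to treat them.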

In my scripts, I use CV alongside other metrics, such as recent activity % and revenue, to separate predictable customers from unpredictable ones.

Using CV for Customer Classification

"""
From config.py

Customers with cv < 2.0 are assigned to the forecasting group: predictable customers who purchase regularly.
All other customers are assigned to the allocation group: irregular-volume customers with sporadic activity.
"""
cv_threshold: float = 2.0
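
A rough sketch of how this threshold could be applied is shown below. It uses only the CV rule from the docstring; the other metrics (recent activity %, revenue) are left out. The names `assign_group`, `history`, and `customer_C` are hypothetical, and sending customers with no sales to the allocation group is my assumption.

```python
import numpy as np

CV_THRESHOLD = 2.0  # same value as cv_threshold in the config above


def assign_group(sales, cv_threshold=CV_THRESHOLD):
    """Return 'forecasting' or 'allocation' from CV alone (hypothetical helper)."""
    arr = np.asarray(sales, dtype=float)
    mean = arr.mean() if arr.size else 0.0
    if mean == 0:
        # Assumption: CV is undefined without sales, so treat the customer as irregular.
        return "allocation"
    cv = arr.std() / mean
    return "forecasting" if cv < cv_threshold else "allocation"


# Illustrative data: A and B reuse the earlier examples, C is made up.
history = {
    "customer_A": [100, 105, 98, 102, 100, 103],
    "customer_B": [0, 0, 500, 0, 0, 300, 0, 0],
    "customer_C": [0, 0, 0, 0, 0, 0, 0, 900],
}

groups = {name: assign_group(sales) for name, sales in history.items()}
print(groups)
```

Note that customer B, with a CV of about 1.8, still falls under the 2.0 threshold and lands in the forecasting group. That is one reason a CV rule alone may not be enough.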

However, CV isn’t perfect: it can be misleading when a series contains many zeros, because the zeros pull the mean toward zero and inflate the ratio. That is why I use it together with other metrics for robust classification.

The calculation is simple but powerful. The conclusion: before you forecast, segment your customers by CV. You’ll save hours of debugging why your model doesn’t work.
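
To make the zero caveat concrete, here is a small sketch using customer B’s series. It compares CV over all periods with CV over non-zero periods only, and reports the share of zero periods as a companion number. Splitting the series this way is my own illustration, not the method used in the original scripts.

```python
import numpy as np

sales = np.array([0, 0, 500, 0, 0, 300, 0, 0], dtype=float)

# CV over every period: the zeros drag the mean down and inflate the ratio.
cv_all = sales.std() / sales.mean()

# CV over periods with a purchase: measures how much order sizes vary.
nonzero = sales[sales > 0]
cv_nonzero = nonzero.std() / nonzero.mean()

# Share of periods without any purchase: measures how sporadic the customer is.
zero_share = float(np.mean(sales == 0))

print(round(cv_all, 2), round(cv_nonzero, 2), zero_share)
```

Here the order sizes themselves vary only moderately, while most of the overall CV comes from the customer being inactive in three out of four periods.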