stability
Population Stability Index (PSI) and Event Stability Index (ESI) for monitoring feature and target drift over time.
PSI
Bases: BaseEstimator
Population Stability Index.
Measures distributional shift between a reference dataset and a monitoring
dataset. Fit on the reference, then call `score` on any subsequent snapshot.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n_quantile_bins` | `int` | Number of quantile bins for numeric features. | `10` |
| `missing_value` | `float` | Frequency floor applied to empty bins to avoid log(0). | `0.0001` |
Attributes:
| Name | Type | Description |
|---|---|---|
| `bin_breaks_` | | Quantile cut points fitted on the reference (numeric only). |
| `ref_dist_` | | Reference frequency distribution as a DataFrame. |
Source code in datasci_toolkit/stability.py
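To make the description above concrete, here is a minimal, dependency-free sketch of the standard PSI formula, the sum over bins of (monitoring − reference) · ln(monitoring / reference), on quantile bins cut from the reference. It is not the `PSI` class itself: the function names are hypothetical, plain lists stand in for DataFrame columns, and the binning details (nearest-rank cut points, right-open bins) are assumptions. Only the parameter names `n_quantile_bins` and `missing_value` are taken from the table above.

```python
import math
from bisect import bisect_right


def quantile_breaks(values, n_bins):
    """Interior cut points at equally spaced quantiles of the reference."""
    ordered = sorted(values)
    n = len(ordered)
    # Nearest-rank cut points; duplicates are dropped so bins stay distinct.
    cuts = {ordered[min(n - 1, (i * n) // n_bins)] for i in range(1, n_bins)}
    return sorted(cuts)


def bin_frequencies(values, breaks, floor):
    """Share of values per bin; empty bins are floored to avoid log(0)."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[bisect_right(breaks, v)] += 1
    total = len(values)
    return [max(c / total, floor) for c in counts]


def psi(reference, monitoring, n_quantile_bins=10, missing_value=0.0001):
    """PSI of `monitoring` against `reference` on reference-derived bins."""
    breaks = quantile_breaks(reference, n_quantile_bins)
    ref = bin_frequencies(reference, breaks, missing_value)
    mon = bin_frequencies(monitoring, breaks, missing_value)
    return sum((m - r) * math.log(m / r) for r, m in zip(ref, mon))


reference = list(range(100))
shifted = [v + 50 for v in reference]  # same shape, moved to the right

psi_same = psi(reference, reference)   # identical distributions
psi_shifted = psi(reference, shifted)  # several reference bins become empty
```

Because the cut points come only from the reference, the same bins can be reused for every later snapshot, which is what the fit-once, score-many workflow relies on.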
ESI
Event Stability Index.
Measures rank stability of a model score across time periods. Returns two variants: V1 (rank-correlation based) and V2 (event-rate-ratio based).
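This page gives no formulas for either ESI variant, so the following only illustrates the rank-correlation idea behind V1 and should not be read as the toolkit's definition. It assumes event rates have already been computed per score band for two periods and compares their ordering with a Spearman correlation (Pearson correlation of average ranks). All function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_stability(event_rates_a, event_rates_b):
    """Spearman correlation between per-band event rates of two periods."""
    return pearson(average_ranks(event_rates_a), average_ranks(event_rates_b))


# Hypothetical event rates per score band (lowest to highest risk band).
period_1 = [0.01, 0.03, 0.08, 0.20]
period_2 = [0.02, 0.04, 0.07, 0.25]  # levels changed, ordering preserved
period_3 = [0.25, 0.07, 0.04, 0.02]  # ordering fully reversed

stable = rank_stability(period_1, period_2)
inverted = rank_stability(period_1, period_3)
```

A value near 1 means the bands keep their risk ordering from one period to the next even when the absolute event rates move; a value near -1 means the ordering has flipped.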
StabilityMonitor
StabilityMonitor(
features: list,
n_quantile_bins: int = 10,
missing_value: float = 0.0001,
col_weight: str | None = None,
)
Bases: BaseEstimator
Monitors PSI for a set of features over time.
Fits one PSI instance per feature on a reference DataFrame and exposes
three scoring modes: against a fixed reference, consecutive period pairs,
or arbitrary boolean masks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `features` | `list` | Column names to monitor. | required |
| `n_quantile_bins` | `int` | Quantile bins for numeric features (passed to `PSI`). | `10` |
| `missing_value` | `float` | Frequency floor for empty bins (passed to `PSI`). | `0.0001` |
| `col_weight` | `str \| None` | Optional weight column in the input DataFrame. | `None` |
Attributes:
| Name | Type | Description |
|---|---|---|
| `psis_` | | Dict mapping feature name to fitted `PSI` instance. |
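The three scoring modes described above differ only in which two slices of data are compared. The sketch below shows that idea for a single feature, using plain Python and a compact PSI helper. It does not use `StabilityMonitor` or its method names, which this page does not list; the data, the helper functions, and the four-bin setting are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def psi(reference, monitoring, n_bins=4, floor=0.0001):
    """Compact PSI on quantile bins cut from `reference` (illustrative only)."""
    ordered = sorted(reference)
    breaks = sorted({ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)})

    def freqs(values):
        counts = [0] * (len(breaks) + 1)
        for v in values:
            counts[bisect_right(breaks, v)] += 1
        return [max(c / len(values), floor) for c in counts]

    ref, mon = freqs(reference), freqs(monitoring)
    return sum((m - r) * math.log(m / r) for r, m in zip(ref, mon))


# Hypothetical data: (period, feature value) rows that drift upward over time.
rows = [("2024-01", v) for v in range(0, 40)]
rows += [("2024-02", v) for v in range(5, 45)]
rows += [("2024-03", v) for v in range(20, 60)]


def by_period(period):
    return [v for p, v in rows if p == period]


def values_where(mask):
    return [v for keep, (_, v) in zip(mask, rows) if keep]


periods = sorted({p for p, _ in rows})

# Mode 1: every later period against one fixed reference period.
reference = by_period(periods[0])
vs_reference = {p: psi(reference, by_period(p)) for p in periods[1:]}

# Mode 2: each period against the one immediately before it.
consecutive = {
    (a, b): psi(by_period(a), by_period(b)) for a, b in zip(periods, periods[1:])
}

# Mode 3: two arbitrary boolean masks over the rows.
mask_ref = [v < 30 for _, v in rows]
mask_mon = [v >= 30 for _, v in rows]
masked = psi(values_where(mask_ref), values_where(mask_mon))
```

A fixed reference accumulates drift, so slow shifts eventually show up as large values, while consecutive pairs highlight the period in which a change happened. Masks cover comparisons that are not time-based, such as one segment against another.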
plot_psi_comparison
plot_psi_comparison(
months: list,
psi_values: list,
labels: list,
title: str = "PSI",
size: tuple = (12, 8),
output_folder: str | None = None,
show: bool = True,
) -> None
psi_hist
psi_hist(
data: DataFrame,
scores: list,
months: list,
month_col: str,
pivot: int = 0,
score_names: list | None = None,
title: str = "PSI",
bins: int = 10,
output_folder: str | None = None,
show: bool = True,
) -> None