model_selection
Gini-based stepwise logistic regression for scorecard-style feature selection.
AUCStepwiseLogit
AUCStepwiseLogit(
    initial_predictors: list[str] | None = None,
    all_predictors: list[str] | None = None,
    selection_method: str = "stepwise",
    max_iter: int = 1000,
    min_increase: float = 0.005,
    max_decrease: float = 0.0025,
    max_predictors: int = 0,
    max_correlation: float = 1.0,
    enforce_coef_sign: bool = False,
    penalty: str = "l2",
    C: float = 1000.0,
    correlation_sample: int = 10000,
    use_cv: bool = False,
    cv_folds: int = 5,
    cv_seed: int = 42,
    cv_stratify: bool = True,
)
Bases: BaseEstimator
Gini-based stepwise logistic regression.
Selects features by Gini improvement rather than p-values, with optional correlation filtering, sign enforcement, and cross-validated scoring.
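For orientation, the Gini used in scorecard work is conventionally a rescaled ROC AUC, `Gini = 2 * AUC - 1`. The sketch below computes it directly from pairwise comparisons. The helper name `gini` is an illustration only and is not taken from this module; how the class computes the metric internally is not shown here.

```python
def gini(y_true, y_score):
    """Gini coefficient of a binary classifier's scores (Gini = 2 * AUC - 1)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present to compute Gini.")
    # AUC is the share of (positive, negative) pairs ranked correctly;
    # tied pairs count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return 2.0 * auc - 1.0


y = [0, 0, 1, 1]
perfect = gini(y, [0.1, 0.2, 0.8, 0.9])  # every positive outranks every negative
mixed = gini(y, [0.1, 0.8, 0.2, 0.9])    # one of four pairs is misordered
```

The pairwise loop is quadratic in the sample size and is meant for illustration; on real data, `sklearn.metrics.roc_auc_score` gives the same AUC far more efficiently.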
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `initial_predictors` | `list[str] \| None` | Features forced into the model at the start. | `None` |
| `all_predictors` | `list[str] \| None` | Candidate pool (defaults to all columns in | `None` |
| `selection_method` | `str` | | `'stepwise'` |
| `max_iter` | `int` | Maximum number of add/remove steps. | `1000` |
| `min_increase` | `float` | Minimum Gini gain required to add a feature. | `0.005` |
| `max_decrease` | `float` | Maximum Gini drop allowed before removing a feature. | `0.0025` |
| `max_predictors` | `int` | Hard cap on model size (0 = unlimited). | `0` |
| `max_correlation` | `float` | Reject candidates correlated above this with any already-selected feature. | `1.0` |
| `enforce_coef_sign` | `bool` | Reject features that flip a coefficient sign. | `False` |
| `penalty` | `str` | Regularisation type passed to | `'l2'` |
| `C` | `float` | Regularisation strength. | `1000.0` |
| `correlation_sample` | `int` | Max rows used for the correlation check. | `10000` |
| `use_cv` | `bool` | Score via k-fold CV instead of a held-out validation set. | `False` |
| `cv_folds` | `int` | Number of CV folds. | `5` |
| `cv_seed` | `int` | Random seed for CV splits. | `42` |
| `cv_stratify` | `bool` | Use stratified folds. | `True` |
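
To make the interplay of `min_increase`, `max_correlation`, and `max_predictors` concrete, here is a minimal sketch of a single forward (add) step. It is not the class's actual implementation: `forward_step`, the `score` and `corr` callbacks, and the toy numbers are all hypothetical, and the exact comparison rules (for example `>=` versus `>`, or treating an empty model as Gini 0) are assumptions.

```python
from __future__ import annotations

from typing import Callable, Sequence


def forward_step(
    selected: list[str],
    candidates: Sequence[str],
    score: Callable[[list[str]], float],  # Gini of a model on these features
    corr: Callable[[str, str], float],    # pairwise feature correlation
    min_increase: float = 0.005,
    max_correlation: float = 1.0,
    max_predictors: int = 0,
) -> str | None:
    """Return the candidate with the largest qualifying Gini gain, or None."""
    # Assumption: 0 means "no cap", as in the parameter table.
    if max_predictors > 0 and len(selected) >= max_predictors:
        return None
    # Assumption: an empty model ranks at random, i.e. Gini 0.
    base = score(selected) if selected else 0.0
    best_name, best_gain = None, None
    for name in candidates:
        if name in selected:
            continue
        # Skip candidates too correlated with any already-selected feature.
        if any(abs(corr(name, s)) > max_correlation for s in selected):
            continue
        gain = score(selected + [name]) - base
        if gain >= min_increase and (best_gain is None or gain > best_gain):
            best_name, best_gain = name, gain
    return best_name


# Made-up Gini values and correlations, for illustration only.
toy_gini = {
    ("age",): 0.30,
    ("income",): 0.35,
    ("income_log",): 0.34,
    ("income", "age"): 0.42,
    ("income", "income_log"): 0.37,
}
toy_corr = {frozenset(("income", "income_log")): 0.95,
            frozenset(("income", "age")): 0.20}


def score(features: list[str]) -> float:
    return toy_gini[tuple(features)]


def corr(a: str, b: str) -> float:
    return toy_corr.get(frozenset((a, b)), 0.0)


pool = ["age", "income", "income_log"]
first = forward_step([], pool, score, corr, max_correlation=0.8)
second = forward_step([first], pool, score, corr, max_correlation=0.8)
capped = forward_step([first, second], pool, score, corr, max_predictors=2)
```

In this toy run `income` is added first because it has the largest standalone Gini, `income_log` is then filtered out by the correlation threshold, and `age` is added second. With `max_predictors=2` the third call returns `None`. A full stepwise procedure would alternate such add steps with remove steps governed by `max_decrease`.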
Attributes:
| Name | Type | Description |
|---|---|---|
| `predictors_` | | Ordered list of selected feature names. |
| `coef_` | | Coefficients for selected features. |
| `intercept_` | | Model intercept. |
| `progress_` | | DataFrame logging each add/remove step with Gini deltas. |
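
As a rough illustration of how the fitted attributes relate to a score, the sketch below applies the standard logistic formula to one row. It assumes `coef_` is a flat sequence aligned with `predictors_` and `intercept_` is a scalar, which the table above does not confirm. The function `score_row` and all numeric values are hypothetical.

```python
import math


def score_row(row, predictors, coef, intercept):
    """Probability from a logistic model: sigmoid(intercept + sum(coef * x))."""
    z = intercept + sum(c * row[name] for name, c in zip(predictors, coef))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical stand-ins for predictors_, coef_ and intercept_.
predictors = ["age", "income"]
coef = [0.8, -1.2]
intercept = -0.5

p = score_row({"age": 1.0, "income": 0.5}, predictors, coef, intercept)
```

Only the features listed in `predictors` enter the sum, so any other columns present in `row` are ignored.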