bin_editor

Headless state machine for editing bin boundaries, with an optional anywidget-based UI.

BinEditor

BinEditor(
    bin_specs: dict[str, dict[str, Any]],
    features: DataFrame,
    target: Series,
    time_periods: Series | None = None,
    weights: Series | None = None,
    stability_threshold: float = 0.1,
)

Headless state machine for editing bin boundaries.

Works identically in plain Python scripts, notebooks, and agents. All edits are logged per feature with undo support. Call `accept()` to export the final bin specs dict for use with `WOETransformer`.
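
The per-feature edit log with undo can be pictured with a minimal sketch. This is not the library's implementation: `SplitLog`, `add_split`, `remove_split`, and `undo` are hypothetical names chosen for illustration. Only the idea of one history list of `(action, payload)` tuples per feature is taken from the constructor source.

```python
from __future__ import annotations

from bisect import insort
from typing import Any


class SplitLog:
    """Hypothetical per-feature split editor with an undo log (illustration only)."""

    def __init__(self, splits: dict[str, list[float]]) -> None:
        # Current interior split points, kept sorted, one list per feature.
        self.splits = {feat: sorted(s) for feat, s in splits.items()}
        # One history list per feature; each entry is (action, payload).
        self.history: dict[str, list[tuple[str, Any]]] = {feat: [] for feat in splits}

    def add_split(self, feat: str, value: float) -> None:
        if value in self.splits[feat]:
            return  # already present: nothing to change and nothing to log
        insort(self.splits[feat], value)
        self.history[feat].append(("add", value))

    def remove_split(self, feat: str, value: float) -> None:
        self.splits[feat].remove(value)  # raises ValueError if the split is absent
        self.history[feat].append(("remove", value))

    def undo(self, feat: str) -> bool:
        """Revert the most recent edit for one feature; return False if there is none."""
        if not self.history[feat]:
            return False
        action, value = self.history[feat].pop()
        if action == "add":
            self.splits[feat].remove(value)
        else:
            insort(self.splits[feat], value)
        return True


log = SplitLog({"age": [30.0, 50.0], "income": [1000.0]})
log.add_split("age", 40.0)
log.remove_split("age", 30.0)
log.undo("age")  # puts 30.0 back
log.undo("age")  # takes 40.0 out again
```

Because each feature has its own history, undoing edits on `age` leaves `income` untouched.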

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `bin_specs` | `dict[str, dict[str, Any]]` | Initial bin specifications: a dict taken from `StabilityGrouping.bin_specs_` or built manually. | required |
| `features` | `DataFrame` | Feature DataFrame with one column per feature in `bin_specs`. Features listed in `bin_specs` but missing from this frame are skipped. | required |
| `target` | `Series` | Binary target series (0/1 or float). | required |
| `time_periods` | `Series \| None` | Optional series of time-period labels, used to compute temporal stability metrics. | `None` |
| `weights` | `Series \| None` | Optional sample weight series. When omitted, every row gets weight 1. | `None` |
| `stability_threshold` | `float` | RSI threshold used to flag unstable bins in the feature state. It does not block edits. | `0.1` |

Note

All state is accessible via `state(feat)`, which returns a `FeatureState` dataclass with attributes `bins`, `n_bins`, `counts`, `event_rates`, `woe`, `iv`, `dtype`, `groups`, and `temporal`.
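
For orientation, the per-bin attributes `counts`, `event_rates`, `woe`, and `iv` are usually related as in the sketch below. It assumes one common convention: WOE is the log of the bin's share of non-events over its share of events, and IV is the sum over bins of the share difference times WOE. The library's sign convention, smoothing, and handling of weights may differ, and `bin_summary` is a hypothetical helper, not part of the API.

```python
from __future__ import annotations

import math


def bin_summary(events: list[float], non_events: list[float]) -> dict[str, list[float] | float]:
    """Per-bin counts, event rates, and WOE, plus total IV (illustrative convention).

    Assumes every bin holds at least one event and one non-event; production
    code usually adds smoothing to avoid log(0).
    """
    total_events = sum(events)
    total_non_events = sum(non_events)
    counts = [e + n for e, n in zip(events, non_events)]
    event_rates = [e / c for e, c in zip(events, counts)]
    woe: list[float] = []
    iv = 0.0
    for e, n in zip(events, non_events):
        share_events = e / total_events
        share_non_events = n / total_non_events
        w = math.log(share_non_events / share_events)
        woe.append(w)
        iv += (share_non_events - share_events) * w
    return {"counts": counts, "event_rates": event_rates, "woe": woe, "iv": iv}


summary = bin_summary(events=[10, 30, 60], non_events=[90, 70, 40])
```

Under this convention the first bin, which has the lowest event rate, gets a positive WOE and the last bin a negative one.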

Source code in datasci_toolkit/bin_editor.py

def __init__(
    self,
    bin_specs: dict[str, dict[str, Any]],
    features: pl.DataFrame,
    target: pl.Series,
    time_periods: pl.Series | None = None,
    weights: pl.Series | None = None,
    stability_threshold: float = 0.1,
) -> None:
    self._targets = target.cast(pl.Float64).to_numpy()
    self._weights = weights.cast(pl.Float64).to_numpy() if weights is not None else np.ones(len(self._targets))
    self._time: np.ndarray | None = time_periods.to_numpy() if time_periods is not None else None
    self._threshold = stability_threshold
    self._x: dict[str, np.ndarray] = {}
    self._splits: dict[str, list[float]] = {}
    self._cat_bins: dict[str, dict[str, int]] = {}
    self._history: dict[str, list[tuple[str, Any]]] = {}
    self._orig: dict[str, dict[str, Any]] = {}

    for feat, spec in bin_specs.items():
        if feat not in features.columns:
            continue
        self._orig[feat] = spec
        self._history[feat] = []
        if spec["dtype"] == FeatureDtype.NUMERIC:
            self._x[feat] = features[feat].cast(pl.Float64).to_numpy()
            self._splits[feat] = [float(s) for s in spec["bins"][1:-1] if np.isfinite(s)]
        else:
            self._x[feat] = features[feat].cast(pl.Utf8).to_numpy().astype(str)
            self._cat_bins[feat] = {str(k): int(v) for k, v in spec["bins"].items()}
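
To make the expected shape of `bin_specs` concrete, the sketch below mirrors the two branches of the constructor using only the standard library. The string `"numeric"` is a stand-in for `FeatureDtype.NUMERIC`, whose actual value is not shown here, and `parse_specs` is a hypothetical helper rather than part of the library.

```python
from __future__ import annotations

import math
from typing import Any

NUMERIC = "numeric"  # assumption: stand-in for FeatureDtype.NUMERIC


def parse_specs(
    bin_specs: dict[str, dict[str, Any]], columns: list[str]
) -> tuple[dict[str, list[float]], dict[str, dict[str, int]]]:
    """Separate specs into numeric interior splits and categorical bin maps."""
    splits: dict[str, list[float]] = {}
    cat_bins: dict[str, dict[str, int]] = {}
    for feat, spec in bin_specs.items():
        if feat not in columns:
            continue  # specs without a matching column are silently skipped
        if spec["dtype"] == NUMERIC:
            # Drop the two outer edges, then any non-finite interior value.
            splits[feat] = [float(s) for s in spec["bins"][1:-1] if math.isfinite(s)]
        else:
            # Category label -> integer bin index, with keys coerced to str.
            cat_bins[feat] = {str(k): int(v) for k, v in spec["bins"].items()}
    return splits, cat_bins


specs = {
    "age": {"dtype": NUMERIC, "bins": [-math.inf, 30, 50, math.inf]},
    "region": {"dtype": "categorical", "bins": {"north": 0, "south": 0, "west": 1}},
    "unused": {"dtype": NUMERIC, "bins": [-math.inf, 1.0, math.inf]},
}
splits, cat_bins = parse_specs(specs, columns=["age", "region"])
```

Here `age` contributes the interior splits 30 and 50, `region` maps three labels onto two bins, and `unused` is dropped because no column carries that name.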

BinEditorWidget

BinEditorWidget(*args: Any, **kwargs: Any)

Optional anywidget-based UI. The source shown below appears to be the fallback definition used when the optional packages are missing: its constructor raises `ImportError`, stating that anywidget and matplotlib are required.

Source code in datasci_toolkit/bin_editor_widget.py

def __init__(self, *args: Any, **kwargs: Any) -> None:
    raise ImportError("anywidget and matplotlib are required for BinEditorWidget")