
XYWeightedRegressor

mcup.xy_weighted.XYWeightedRegressor

Bases: BaseRegressor

Regression estimator for data where both x and y have measurement errors.

Uses iteratively reweighted least squares (IRLS) to combine x and y variances via error propagation: σ_combined² = σ_y² + (∂f/∂x)² σ_x². Faster than DemingRegressor and well-suited to mildly nonlinear models.

Supports two solvers selected via the method argument:

  • "analytical" — IRLS with (J^T W J)^{-1} covariance (fast).
  • "mc" — Monte Carlo sampling with Welford online covariance (robust for nonlinear models).
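The error-propagation step above can be sketched with a finite-difference gradient. The `combined_weights` helper below is an illustrative stand-in, not the library's actual internal implementation:

```python
import numpy as np

def combined_weights(func, X, params, x_err, y_err, h=1e-6):
    """IRLS weights 1 / sigma_combined^2, with df/dx from central differences.

    Illustrative sketch only; the library's internal helper may differ.
    """
    X = np.asarray(X, dtype=float)
    grads = np.empty_like(X)
    for i, x in enumerate(X):
        # Central-difference estimate of df/dx at each data point.
        grads[i] = (func(x + h, params) - func(x - h, params)) / (2 * h)
    # sigma_combined^2 = sigma_y^2 + (df/dx)^2 * sigma_x^2
    var = np.asarray(y_err, dtype=float) ** 2 + grads**2 * np.asarray(x_err, dtype=float) ** 2
    return 1.0 / var

# Linear model y = a*x + b, so df/dx = a everywhere:
f = lambda x, p: p[0] * x + p[1]
w = combined_weights(
    f, [0.0, 1.0, 2.0], np.array([2.0, 0.5]),
    x_err=np.array([0.1, 0.1, 0.1]), y_err=np.array([0.2, 0.2, 0.2]),
)
# Each point: var = 0.2**2 + 2**2 * 0.1**2 = 0.08, so weight = 12.5
```

With a = 2, every point gets variance 0.04 + 4 × 0.01 = 0.08 and hence weight 12.5; larger slopes inflate the x-error contribution and down-weight those points.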

Parameters:

  • func (Callable, required): Model function with signature func(x, params) -> float, where x is a scalar for 1-D input or a length-k array for k-dimensional input.
  • method (str, default "mc"): Solver to use, either "analytical" or "mc".
  • n_iter (int, default 10000): Maximum number of Monte Carlo iterations.
  • rtol (Optional[float], default None): Relative tolerance for MC convergence stopping; disabled when None.
  • atol (Optional[float], default None): Absolute tolerance for MC convergence stopping; disabled when None.
  • optimizer (str, default "Nelder-Mead"): SciPy optimizer name used for parameter fitting.

Attributes:

  • params_: Fitted parameter array.
  • params_std_: Standard deviations of fitted parameters.
  • covariance_: Full parameter covariance matrix.
  • n_iter_: Actual number of MC iterations run (MC method only).

Notes

X may be shape (n,) for scalar input or (n, k) for k-dimensional input per data point. x_err must match X.shape exactly; set a column to zero for any feature that is measured without uncertainty (e.g. a controlled variable). Zero-error features contribute nothing to the combined variance and receive no noise during Monte Carlo sampling.
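For k-dimensional input the propagation sums per-feature contributions, σ_combined² = σ_y² + Σ_j (∂f/∂x_j)² σ_{x_j}², so a zero-error column drops out exactly. A minimal numpy sketch (with hypothetical helper name and finite-difference gradients, not the library's internals):

```python
import numpy as np

def combined_variance_kd(func, x, params, x_err, y_err, h=1e-6):
    """sigma_y^2 + sum_j (df/dx_j)^2 * sigma_{x_j}^2 for one k-dim point.

    A zero entry in x_err removes that feature's contribution entirely.
    """
    x = np.asarray(x, dtype=float)
    grad = np.empty_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        # Central-difference partial derivative along feature j.
        grad[j] = (func(x + e, params) - func(x - e, params)) / (2 * h)
    return y_err**2 + np.sum(grad**2 * np.asarray(x_err, dtype=float) ** 2)

# f(x) = p0*x0 + p1*x1, where x1 is a controlled variable with zero error:
f = lambda x, p: p[0] * x[0] + p[1] * x[1]
v = combined_variance_kd(
    f, [1.0, 5.0], np.array([3.0, 10.0]),
    x_err=np.array([0.1, 0.0]), y_err=0.2,
)
# v = 0.2**2 + 3**2 * 0.1**2 + 10**2 * 0.0 = 0.13
```

The large partial derivative ∂f/∂x₁ = 10 is irrelevant because its σ is zero, which is exactly why controlled variables should get a zero column in x_err.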

Source code in mcup/xy_weighted.py
class XYWeightedRegressor(BaseRegressor):
    """Regression estimator for data where both x and y have measurement errors.

    Uses iteratively reweighted least squares (IRLS) to combine x and y variances
    via error propagation: ``σ_combined² = σ_y² + (∂f/∂x)² σ_x²``. Faster than
    ``DemingRegressor`` and well-suited to mildly nonlinear models.

    Supports two solvers selected via the ``method`` argument:

    - ``"analytical"`` — IRLS with ``(J^T W J)^{-1}`` covariance (fast).
    - ``"mc"`` — Monte Carlo sampling with Welford online covariance (robust for nonlinear models).

    Parameters:
        func: Model function with signature ``func(x, params) -> float``, where
            ``x`` is a scalar for 1-D input or a length-k array for k-dimensional
            input.
        method: Solver to use, either ``"analytical"`` or ``"mc"``. Default ``"mc"``.
        n_iter: Maximum number of Monte Carlo iterations. Default ``10_000``.
        rtol: Relative tolerance for MC convergence stopping. Default ``None`` (disabled).
        atol: Absolute tolerance for MC convergence stopping. Default ``None`` (disabled).
        optimizer: SciPy optimizer name used for parameter fitting. Default ``"Nelder-Mead"``.

    Attributes:
        params_: Fitted parameter array.
        params_std_: Standard deviations of fitted parameters.
        covariance_: Full parameter covariance matrix.
        n_iter_: Actual number of MC iterations run (MC method only).

    Notes:
        ``X`` may be shape ``(n,)`` for scalar input or ``(n, k)`` for k-dimensional
        input per data point.  ``x_err`` must match ``X.shape`` exactly; set a column
        to zero for any feature that is measured without uncertainty (e.g. a controlled
        variable).  Zero-error features contribute nothing to the combined variance and
        receive no noise during Monte Carlo sampling.
    """

    def fit(  # type: ignore[override]
        self,
        X: np.ndarray,
        y: np.ndarray,
        x_err: np.ndarray,
        y_err: np.ndarray,
        p0: np.ndarray,
        n_irls: int = 10,
    ) -> "XYWeightedRegressor":
        X, y, y_err, x_err = self._validate_inputs(X, y, y_err, x_err)  # type: ignore[misc]
        p0 = np.asarray(p0, dtype=float)

        if self.method == "analytical":
            if n_irls < 1:
                raise ValueError("n_irls must be >= 1 for the analytical solver")
            params = p0.copy()
            for _ in range(n_irls):
                weights = _combined_weights(self.func, X, params, x_err, y_err)
                params, cov = analytical_solve(self.func, X, y, weights, params, self.optimizer)
            self.params_ = params
            self.covariance_ = cov
            self.params_std_ = np.sqrt(np.diag(cov))
        else:

            def cost_fn_builder(
                x_s: np.ndarray, y_s: np.ndarray, params_est: np.ndarray
            ) -> object:
                weights = _combined_weights(self.func, x_s, params_est, x_err, y_err)

                def cost(params: np.ndarray) -> float:
                    r = np.array([y_s[i] - self.func(x_s[i], params) for i in range(len(y_s))])
                    return float(np.dot(r**2, weights))

                return cost

            mean, cov, n = mc_solve(
                cost_fn_builder,
                X,
                y,
                x_err,
                y_err,
                p0,
                self.n_iter,
                self.rtol,
                self.atol,
                self.optimizer,
            )
            self.params_ = mean
            self.covariance_ = cov
            self.params_std_ = np.sqrt(np.diag(cov))
            self.n_iter_ = n

        return self
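The Welford-style online covariance that the docstring attributes to the "mc" solver can be sketched as a generic accumulator (this is an illustration of the technique, not the library's actual internals):

```python
import numpy as np

class WelfordCovariance:
    """Online mean and covariance accumulator (Welford-style updates)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self._m2 = np.zeros((dim, dim))  # running sum of outer-product deviations

    def update(self, x):
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean          # deviation from the old mean
        self.mean += delta / self.n
        # Outer product of (x - old mean) and (x - new mean): the standard
        # numerically stable covariance update.
        self._m2 += np.outer(delta, x - self.mean)

    @property
    def covariance(self):
        return self._m2 / (self.n - 1)  # unbiased estimate (ddof=1)

rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 2))
acc = WelfordCovariance(2)
for s in samples:
    acc.update(s)
# acc.mean and acc.covariance now match the batch statistics of `samples`
```

The advantage for Monte Carlo solvers is that parameter samples never need to be stored: each fitted draw is folded into the accumulator and discarded, keeping memory constant in n_iter.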