onlinerake.OnlineRakingSGD

class onlinerake.OnlineRakingSGD(targets: Targets, learning_rate: float = 5.0, min_weight: float = 0.001, max_weight: float = 100.0, n_sgd_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False)

Bases: object

Online raking via stochastic gradient descent.

Parameters:
  • targets (Targets) – Target population proportions for each demographic characteristic.

  • learning_rate (float, optional) – Step size used in the gradient descent update. Larger values lead to more aggressive updates but may cause oscillation or divergence.

  • min_weight (float, optional) – Lower bound applied to the weights after each update to prevent weights from collapsing to zero. Must be positive.

  • max_weight (float, optional) – Upper bound applied to the weights after each update to prevent runaway weights. Must exceed min_weight.

  • n_sgd_steps (int, optional) – Number of gradient steps applied each time a new observation arrives. Values larger than 1 can help reduce oscillations but increase computational cost.

  • compute_weight_stats (bool or int, optional) – Controls how often the weight distribution statistics are recomputed, which affects performance. If True, recompute on every call. If False, use cached values. If an integer k, recompute every k observations. Default is False.
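The interplay of learning_rate, n_sgd_steps, min_weight and max_weight can be illustrated with a minimal sketch. The helper sgd_update and its grad_fn callback are hypothetical names introduced here for illustration and are not part of onlinerake. The sketch assumes a plain gradient step followed by clipping, repeated n_sgd_steps times.

```python
from typing import Callable

import numpy as np


def sgd_update(
    weights: np.ndarray,
    grad_fn: Callable[[np.ndarray], np.ndarray],
    learning_rate: float = 5.0,
    n_sgd_steps: int = 3,
    min_weight: float = 0.001,
    max_weight: float = 100.0,
) -> np.ndarray:
    """Hypothetical helper: run n_sgd_steps clipped gradient steps."""
    if min_weight <= 0 or max_weight <= min_weight:
        raise ValueError("need 0 < min_weight < max_weight")
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(n_sgd_steps):
        w = w - learning_rate * grad_fn(w)       # plain gradient step
        w = np.clip(w, min_weight, max_weight)   # enforce the bounds after each update
    return w


# Toy objective 0.5 * ||w - 2||^2, whose gradient is (w - 2).
gentle = sgd_update(np.ones(4), lambda w: w - 2.0, learning_rate=0.5)
# An oversized step makes the iterates bounce between the two bounds.
aggressive = sgd_update(np.ones(4), lambda w: w - 2.0, learning_rate=500.0)
```

With the small step the weights move steadily toward 2. With the large step they jump between max_weight and min_weight, which is the oscillation the learning_rate description warns about.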

Notes

  • For binary demographic indicators the gradient of the margin with respect to each weight can be derived analytically. See the documentation for details.

  • The algorithm does not currently support categorical controls with more than two levels. Extending it to multi-level categories would require storing one-hot encodings and expanding the margin loss accordingly.
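The analytic gradient mentioned in the notes can be sketched as follows. This assumes the loss is the sum of squared differences between the weighted margins and the targets; the library may scale or average it differently. The function names are hypothetical. For a weighted margin m_f = sum_i w_i x_if / sum_i w_i, the quotient rule gives dm_f/dw_j = (x_jf - m_f) / sum_i w_i, and the chain rule yields the loss gradient.

```python
import numpy as np


def margins_and_loss(w, X, targets):
    """Weighted margins m_f = sum_i w_i x_if / sum_i w_i and their squared-error loss."""
    m = (w @ X) / w.sum()
    return m, float(np.sum((m - targets) ** 2))


def loss_gradient(w, X, targets):
    """Analytic gradient: dL/dw_j = sum_f 2 (m_f - t_f) (x_jf - m_f) / sum_i w_i."""
    m, _ = margins_and_loss(w, X, targets)
    return ((X - m) * (2.0 * (m - targets))).sum(axis=1) / w.sum()


# Four observations, two binary indicators (illustrative data only).
X = np.array([[1, 0], [0, 1], [1, 1], [0, 0]], dtype=float)
w = np.array([1.0, 2.0, 0.5, 1.5])
targets = np.array([0.5, 0.4])

# Central finite differences as an independent check of the formula.
eps = 1e-6
numeric = np.array([
    (margins_and_loss(w + eps * e, X, targets)[1]
     - margins_and_loss(w - eps * e, X, targets)[1]) / (2 * eps)
    for e in np.eye(len(w))
])
analytic = loss_gradient(w, X, targets)
```

The analytic and numeric gradients agree to within finite-difference error, which is a quick sanity check when adapting the formula.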

__init__(targets: Targets, learning_rate: float = 5.0, min_weight: float = 0.001, max_weight: float = 100.0, n_sgd_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False) → None

Methods

__init__(targets[, learning_rate, ...])

check_convergence([tolerance])

Check if algorithm has converged based on loss stability.

detect_oscillation([threshold])

Detect if loss is oscillating rather than converging.

fit_one(obs)

Consume a single observation and update weights.

partial_fit(obs)

Consume a single observation and update weights.

Attributes

converged

Return True if the algorithm has detected convergence.

convergence_step

Return the step number where convergence was detected, if any.

effective_sample_size

Return the effective sample size (ESS).

gradient_norm_history

Return history of gradient norms for convergence analysis.

loss

Return the current squared‑error loss on margins.

loss_moving_average

Return moving average of loss over convergence window.

margins

Return current weighted margins as a dictionary.

raw_margins

Return unweighted (raw) margins as a dictionary.

weight_distribution_stats

Return comprehensive weight distribution statistics.

weights

Return a copy of the current weight vector.

check_convergence(tolerance: float = 1e-06) → bool

Check if algorithm has converged based on loss stability.

Parameters:

tolerance (float) – Convergence tolerance for loss stability.

Returns:

True if convergence is detected.

Return type:

bool
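The exact stability criterion is not spelled out here. One plausible reading, shown purely as an assumption, is that the loss has converged when its range over the most recent window of steps falls below the tolerance. The function loss_is_stable is a hypothetical stand-in, not the library's method.

```python
from collections.abc import Sequence


def loss_is_stable(
    loss_history: Sequence[float],
    window: int = 20,
    tolerance: float = 1e-6,
) -> bool:
    """Hypothetical criterion: range of the last `window` losses is below `tolerance`."""
    if len(loss_history) < window:
        return False  # not enough history to judge stability
    recent = loss_history[-window:]
    return max(recent) - min(recent) < tolerance


flat = [0.5] * 20                               # loss has stopped moving
decaying = [1.0 / (k + 1) for k in range(20)]   # loss is still falling
```

Here loss_is_stable(flat) is True and loss_is_stable(decaying) is False. Note that a flat loss does not imply a small loss, so the loss value itself is still worth inspecting.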

property converged: bool

Return True if the algorithm has detected convergence.

property convergence_step: int | None

Return the step number where convergence was detected, if any.

detect_oscillation(threshold: float = 0.1) → bool

Detect if loss is oscillating rather than converging.

Parameters:

threshold (float) – Relative threshold for detecting oscillation vs trend.

Returns:

True if oscillation is detected in recent loss history.

Return type:

bool
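How the library separates oscillation from trend is not documented in this section. As an assumed illustration, one simple heuristic compares the net change in loss with the total distance travelled: a loss that moves a lot but ends up where it started is oscillating. The name looks_oscillating is hypothetical.

```python
def looks_oscillating(loss_history: list[float], threshold: float = 0.1) -> bool:
    """Hypothetical heuristic: net change is small relative to total movement."""
    if len(loss_history) < 3:
        return False  # too short to distinguish a trend from a wobble
    steps = [b - a for a, b in zip(loss_history, loss_history[1:])]
    path_length = sum(abs(s) for s in steps)
    if path_length == 0.0:
        return False  # a perfectly flat loss is not oscillating
    net_change = abs(loss_history[-1] - loss_history[0])
    return net_change / path_length < threshold


zigzag = [1.0, 0.5, 1.0, 0.5, 1.0]   # large movement, no net progress
falling = [1.0, 0.8, 0.6, 0.4]       # steady decrease
```

Under this heuristic zigzag is flagged and falling is not. When oscillation is detected, lowering learning_rate or raising n_sgd_steps is the remedy suggested by the parameter descriptions.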

property effective_sample_size: float

Return the effective sample size (ESS).

ESS is defined as (sum w_i)^2 / (sum w_i^2). It reflects the number of equally weighted observations that would yield the same variance as the current weighted estimator.
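The ESS formula above translates directly into a few lines of NumPy. The standalone function below is an illustrative sketch and not the property's actual implementation.

```python
import numpy as np


def effective_sample_size(weights) -> float:
    """ESS = (sum w_i)^2 / (sum w_i^2)."""
    w = np.asarray(weights, dtype=float)
    return float(w.sum() ** 2 / np.sum(w ** 2))


equal = effective_sample_size([1.0, 1.0, 1.0, 1.0])    # equal weights: ESS equals n = 4
skewed = effective_sample_size([10.0, 1.0, 1.0, 1.0])  # 169 / 103, roughly 1.64
```

Equal weights give an ESS equal to the number of observations, while a single dominant weight pulls the ESS toward 1.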

fit_one(obs: Any) → None

Consume a single observation and update weights.

Parameters:

obs (mapping or object) – An observation containing the binary demographic indicators age, gender, education and region, accessible as keys (for a mapping) or as attributes (for an object). Values should be 0 or 1; any truthy value is interpreted as 1.

Returns:

The internal state is updated in place. The caller can inspect the properties weights, margins and loss after the call for diagnostics.

Return type:

None
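The accepted observation shapes can be illustrated with a small sketch. The helper extract_indicators is hypothetical and only mirrors the description above: it reads the four indicators from either a mapping or an object and coerces truthy values to 1.

```python
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any

FEATURES = ("age", "gender", "education", "region")


def extract_indicators(obs: Any) -> dict[str, int]:
    """Hypothetical helper: read the four indicators from a mapping or an object."""
    out = {}
    for name in FEATURES:
        # Mappings are read by key, everything else by attribute.
        value = obs[name] if isinstance(obs, Mapping) else getattr(obs, name)
        out[name] = 1 if value else 0  # anything truthy counts as 1
    return out


from_dict = extract_indicators(
    {"age": 1, "gender": 0, "education": True, "region": "north"}
)
from_obj = extract_indicators(SimpleNamespace(age=0, gender=1, education=0, region=1))
```

Truthiness coercion has a sharp edge: non-empty strings such as "0" or "no" are truthy and would count as 1 under this rule, so encoding indicators as 0/1 integers or booleans before fitting is the safer choice.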

property gradient_norm_history: list[float]

Return history of gradient norms for convergence analysis.

property loss: float

Return the current squared‑error loss on margins.

property loss_moving_average: float

Return moving average of loss over convergence window.

property margins: dict[str, float]

Return current weighted margins as a dictionary.

partial_fit(obs: Any) → None

Consume a single observation and update weights.

Parameters:

obs (mapping or object) – An observation containing the binary demographic indicators age, gender, education and region, accessible as keys (for a mapping) or as attributes (for an object). Values should be 0 or 1; any truthy value is interpreted as 1.

Returns:

The internal state is updated in place. The caller can inspect the properties weights, margins and loss after the call for diagnostics.

Return type:

None

property raw_margins: dict[str, float]

Return unweighted (raw) margins as a dictionary.
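The difference between margins and raw_margins can be sketched with plain NumPy. The functions and data below are illustrative assumptions, not the library's code: the weighted margin of an indicator is its weighted mean, and the raw margin is its unweighted mean.

```python
import numpy as np

FEATURES = ("age", "gender", "education", "region")


def weighted_margins(X: np.ndarray, w: np.ndarray) -> dict[str, float]:
    """Weighted share of observations with each indicator equal to 1."""
    return dict(zip(FEATURES, ((w @ X) / w.sum()).tolist()))


def raw_margins(X: np.ndarray) -> dict[str, float]:
    """Unweighted share of observations with each indicator equal to 1."""
    return dict(zip(FEATURES, X.mean(axis=0).tolist()))


# Rows are observations, columns follow FEATURES (illustrative data only).
X = np.array(
    [[1, 0, 1, 0],
     [0, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 1]],
    dtype=float,
)
w = np.array([2.0, 1.0, 1.0, 1.0])

weighted = weighted_margins(X, w)
raw = raw_margins(X)
```

Doubling the first observation's weight moves the age margin from 0.5 (raw) to 0.6 (weighted). With equal weights the two dictionaries coincide.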

property weight_distribution_stats: dict[str, float]

Return comprehensive weight distribution statistics.

property weights: ndarray

Return a copy of the current weight vector.