onlinerake.OnlineRakingSGD
- class onlinerake.OnlineRakingSGD(targets: Targets, learning_rate: float = 5.0, min_weight: float = 0.001, max_weight: float = 100.0, n_sgd_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False)
Bases: object
Online raking via stochastic gradient descent.
- Parameters:
targets (Targets) – Target population proportions for each demographic characteristic.
learning_rate (float, optional) – Step size used in the gradient descent update. Larger values lead to more aggressive updates but may cause oscillation or divergence.
min_weight (float, optional) – Lower bound applied to the weights after each update to prevent weights from collapsing to zero. Must be positive.
max_weight (float, optional) – Upper bound applied to the weights after each update to prevent runaway weights. Must exceed min_weight.
n_sgd_steps (int, optional) – Number of gradient steps applied each time a new observation arrives. Values larger than 1 can help reduce oscillations but increase computational cost.
compute_weight_stats (bool or int, optional) – Controls computation of weight distribution statistics for performance. If True, compute on every call. If False, use cached values. If integer k, compute every k observations. Default is False.
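The three modes of compute_weight_stats amount to a simple refresh schedule. The sketch below only illustrates that rule as documented here; the helper name `should_recompute_stats` and the counter `n_obs` are hypothetical and not part of onlinerake, and the library's own caching logic may differ.

```python
def should_recompute_stats(compute_weight_stats, n_obs):
    """Decide whether to refresh weight statistics after n_obs observations.

    Hypothetical helper illustrating the documented modes; not library code.
    """
    # bool is a subclass of int in Python, so it has to be tested first.
    if isinstance(compute_weight_stats, bool):
        return compute_weight_stats
    if isinstance(compute_weight_stats, int):
        if compute_weight_stats <= 0:
            raise ValueError("integer compute_weight_stats must be positive")
        # Assumption: "every k observations" means n_obs is a multiple of k.
        return n_obs % compute_weight_stats == 0
    raise TypeError("compute_weight_stats must be a bool or an int")
```

Checking for bool before int matters: `isinstance(True, int)` is true in Python, so the reverse order would treat True as k = 1 and False as k = 0.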
Notes
For binary demographic indicators the gradient of the margin with respect to each weight can be derived analytically. See the documentation for details.
The algorithm does not currently support categorical controls with more than two levels. Extending to multi‑level categories would require storing one‑hot encodings and expanding the margin loss accordingly.
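As a rough illustration of the analytic gradient mentioned in the Notes: if the weighted margin of indicator k is m_k = Σ w_i x_ik / Σ w_i, then ∂m_k/∂w_i = (x_ik − m_k) / Σ w_j. The sketch below assumes the loss is the sum of squared differences between margins and targets, and it recomputes margins over all stored rows at each step. All function names are hypothetical; this is not the library's implementation.

```python
def weighted_margins(weights, rows):
    """Weighted mean of each binary indicator column."""
    total = sum(weights)
    n_features = len(rows[0])
    return [
        sum(w * row[k] for w, row in zip(weights, rows)) / total
        for k in range(n_features)
    ]


def margin_loss(weights, rows, targets):
    """Squared-error loss between weighted margins and target proportions."""
    margins = weighted_margins(weights, rows)
    return sum((m - t) ** 2 for m, t in zip(margins, targets))


def margin_loss_gradient(weights, rows, targets):
    """Analytic gradient of margin_loss with respect to each weight."""
    total = sum(weights)
    margins = weighted_margins(weights, rows)
    return [
        sum(
            2.0 * (margins[k] - targets[k]) * (row[k] - margins[k]) / total
            for k in range(len(targets))
        )
        for row in rows
    ]


def clipped_sgd_step(weights, rows, targets, learning_rate=5.0,
                     min_weight=0.001, max_weight=100.0):
    """One gradient step followed by clipping to [min_weight, max_weight]."""
    grad = margin_loss_gradient(weights, rows, targets)
    return [
        min(max(w - learning_rate * g, min_weight), max_weight)
        for w, g in zip(weights, grad)
    ]


# One indicator, four observations, target proportion 0.5.
rows = [[1], [0], [0], [0]]
targets = [0.5]
weights = [1.0, 1.0, 1.0, 1.0]
new_weights = clipped_sgd_step(weights, rows, targets)
```

In this toy case the single observation with the indicator set is up-weighted and the other three are down-weighted, which moves the weighted margin from 0.25 toward the 0.5 target.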
- __init__(targets: Targets, learning_rate: float = 5.0, min_weight: float = 0.001, max_weight: float = 100.0, n_sgd_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False) → None
Methods
__init__(targets[, learning_rate, ...])
check_convergence([tolerance]) – Check if algorithm has converged based on loss stability.
detect_oscillation([threshold]) – Detect if loss is oscillating rather than converging.
fit_one(obs) – Consume a single observation and update weights.
partial_fit(obs) – Consume a single observation and update weights.
Attributes
Return True if the algorithm has detected convergence.
Return the step number where convergence was detected, if any.
Return the effective sample size (ESS).
Return history of gradient norms for convergence analysis.
Return the current squared‑error loss on margins.
Return moving average of loss over convergence window.
Return current weighted margins as a dictionary.
Return unweighted (raw) margins as a dictionary.
Return comprehensive weight distribution statistics.
Return a copy of the current weight vector.
- check_convergence(tolerance: float = 1e-06) → bool
Check if algorithm has converged based on loss stability.
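The exact stability criterion is not spelled out on this page. One plausible reading, shown purely as an assumption, is that the loss counts as stable when its spread over the most recent window of steps falls within the tolerance. The function `loss_is_stable` and its `loss_history` argument are hypothetical and are not the library's code.

```python
def loss_is_stable(loss_history, window=20, tolerance=1e-6):
    """Return True when the last `window` losses vary by at most `tolerance`.

    Assumed criterion for illustration only; the library may use another rule.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(loss_history) < window:
        # Not enough history yet to judge stability.
        return False
    recent = loss_history[-window:]
    return max(recent) - min(recent) <= tolerance
```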
- property convergence_step: int | None
Return the step number where convergence was detected, if any.
- detect_oscillation(threshold: float = 0.1) → bool
Detect if loss is oscillating rather than converging.
- property effective_sample_size: float
Return the effective sample size (ESS).
ESS is defined as (sum w_i)^2 / (sum w_i^2). It reflects the number of equally weighted observations that would yield the same variance as the current weighted estimator.
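The ESS formula can be checked by hand with a few lines of plain Python. This is a standalone sketch of the definition, not the property's actual implementation.

```python
def effective_sample_size(weights):
    """Compute (sum w_i)^2 / (sum w_i^2) for a sequence of weights."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return total * total / total_sq


equal = effective_sample_size([1.0, 1.0, 1.0, 1.0])  # 16 / 4
skewed = effective_sample_size([3.0, 1.0])           # 16 / 10
```

Equal weights give an ESS equal to the number of observations, while uneven weights shrink it: two observations weighted 3 and 1 carry an ESS of 1.6.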
- fit_one(obs: Any) → None
Consume a single observation and update weights.
- Parameters:
obs (mapping or object) – An observation containing demographic indicators. The attributes/keys age, gender, education and region must be accessible on the object. The values should be 0 or 1. Anything truthy is interpreted as 1.
- Returns:
The internal state is updated in place. The caller can inspect the properties weights, margins and loss after the call for diagnostics.
- Return type:
None
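To make the accepted observation shapes concrete, the sketch below shows one way a mapping or an attribute-style object could be reduced to 0/1 indicators, with truthy values coerced to 1. The helper `read_indicators` and the constant `FEATURES` are hypothetical names used for illustration; the library's internal handling may differ.

```python
from collections.abc import Mapping
from types import SimpleNamespace

# The four indicator names listed in the parameter description.
FEATURES = ("age", "gender", "education", "region")


def read_indicators(obs):
    """Return a dict of 0/1 indicators from a mapping or an object."""
    if isinstance(obs, Mapping):
        get = obs.__getitem__
    else:
        # Fall back to attribute access for plain objects.
        def get(name):
            return getattr(obs, name)
    # Anything truthy is interpreted as 1, everything else as 0.
    return {name: 1 if get(name) else 0 for name in FEATURES}


from_dict = read_indicators(
    {"age": 1, "gender": 0, "education": True, "region": "yes"}
)
from_object = read_indicators(
    SimpleNamespace(age=0, gender=1, education=0, region=1)
)
```

A missing key or attribute raises KeyError or AttributeError in this sketch, which surfaces malformed observations early rather than silently treating them as 0.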
- property gradient_norm_history: list[float]
Return history of gradient norms for convergence analysis.
- partial_fit(obs: Any) → None
Consume a single observation and update weights.
- Parameters:
obs (mapping or object) – An observation containing demographic indicators. The attributes/keys age, gender, education and region must be accessible on the object. The values should be 0 or 1. Anything truthy is interpreted as 1.
- Returns:
The internal state is updated in place. The caller can inspect the properties weights, margins and loss after the call for diagnostics.
- Return type:
None