API Reference
This page provides detailed documentation for all public classes and functions in the onlinerake package.
Core Classes
- Targets: Target population proportions for binary demographics.
- OnlineRakingSGD: Online raking via stochastic gradient descent.
- OnlineRakingMWU: Online raking via multiplicative weights updates.
Targets
- class onlinerake.Targets(age: float = 0.5, gender: float = 0.5, education: float = 0.4, region: float = 0.3)
Bases: object
Target population proportions for binary demographics.
Each attribute represents the desired proportion of cases with indicator value 1. If your survey uses different definitions or more categories per characteristic, either extend this class with additional fields or refactor your raking logic accordingly.
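The suggestion to extend the class with additional fields can be sketched as follows. This is a minimal illustration, not the package's code: `TargetsSketch`, `ExtendedTargetsSketch`, the `urban` field and `check_proportions` are hypothetical names, and treating Targets as a dataclass is an assumption. Only the four field names and their defaults come from the signature above.

```python
from dataclasses import dataclass, fields


@dataclass
class TargetsSketch:
    """Hypothetical stand-in mirroring the documented Targets signature."""
    age: float = 0.5
    gender: float = 0.5
    education: float = 0.4
    region: float = 0.3


@dataclass
class ExtendedTargetsSketch(TargetsSketch):
    """Hypothetical extension that adds one more binary indicator."""
    urban: float = 0.6  # illustrative value, not taken from the package


def check_proportions(targets):
    """Return targets unchanged, or raise ValueError if a value is outside [0, 1]."""
    for field in fields(targets):
        value = getattr(targets, field.name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{field.name} must be in [0, 1], got {value}")
    return targets


extended = check_proportions(ExtendedTargetsSketch(age=0.45))
```

Whether the raking classes would pick up an added field such as `urban` depends on how they read the targets, so an extension like this may also require the refactoring mentioned above.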
OnlineRakingSGD
- class onlinerake.OnlineRakingSGD(targets: Targets, learning_rate: float = 5.0, min_weight: float = 0.001, max_weight: float = 100.0, n_sgd_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False)
Bases: object
Online raking via stochastic gradient descent.
- Parameters:
targets (Targets) – Target population proportions for each demographic characteristic.
learning_rate (float, optional) – Step size used in the gradient descent update. Larger values lead to more aggressive updates but may cause oscillation or divergence.
min_weight (float, optional) – Lower bound applied to the weights after each update to prevent weights from collapsing to zero. Must be positive.
max_weight (float, optional) – Upper bound applied to the weights after each update to prevent runaway weights. Must exceed min_weight.
n_sgd_steps (int, optional) – Number of gradient steps applied each time a new observation arrives. Values larger than 1 can help reduce oscillations but increase computational cost.
compute_weight_stats (bool or int, optional) – Controls how often weight distribution statistics are computed, as a performance trade-off. If True, compute on every call. If False, use cached values. If an integer k, compute every k observations. Default is False.
Notes
For binary demographic indicators the gradient of the margin with respect to each weight can be derived analytically. See the documentation for details.
The algorithm does not currently support categorical controls with more than two levels. Extending to multi-level categories would require storing one-hot encodings and expanding the margin loss accordingly.
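To make the analytic gradient in the notes concrete: for a weighted margin m_k = (sum_i w_i x_ik) / (sum_i w_i), the derivative with respect to a single weight w_j is (x_jk - m_k) / (sum_i w_i). The sketch below applies that to a squared-error margin loss and clips the result to the weight bounds. The squared-error loss and the helper names `weighted_margins`, `squared_margin_loss` and `sgd_step` are assumptions made for illustration; the source does not state which loss the package uses.

```python
def weighted_margins(weights, rows):
    """Weighted share of 1s for each binary indicator column."""
    total = sum(weights)
    n_features = len(rows[0])
    return [
        sum(w * row[k] for w, row in zip(weights, rows)) / total
        for k in range(n_features)
    ]


def squared_margin_loss(weights, rows, targets):
    """Assumed loss: sum over indicators of (margin - target)^2."""
    margins = weighted_margins(weights, rows)
    return sum((m - t) ** 2 for m, t in zip(margins, targets))


def sgd_step(weights, rows, targets, learning_rate=5.0,
             min_weight=0.001, max_weight=100.0):
    """One gradient step on every weight, followed by clipping to the bounds."""
    total = sum(weights)
    margins = weighted_margins(weights, rows)
    updated = []
    for w, row in zip(weights, rows):
        # d(loss)/d(w_j) = sum_k 2 * (m_k - t_k) * (x_jk - m_k) / total
        grad = sum(
            2.0 * (m - t) * (x - m) / total
            for m, t, x in zip(margins, targets, row)
        )
        updated.append(min(max(w - learning_rate * grad, min_weight), max_weight))
    return updated


rows = [(1, 0), (1, 1), (0, 0), (1, 1)]   # two binary indicators per case
targets = (0.5, 0.4)
weights = [1.0] * len(rows)
for _ in range(3):                        # plays the role of n_sgd_steps
    weights = sgd_step(weights, rows, targets)
```

Clipping after each step is what keeps every weight inside the range set by min_weight and max_weight, even when the learning rate is large.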
- check_convergence(tolerance: float = 1e-06) → bool
Check if algorithm has converged based on loss stability.
- property convergence_step: int | None
Return the step number where convergence was detected, if any.
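One way to read "loss stability" is that the loss varies by no more than the tolerance over a recent window of steps. The sketch below implements that reading only. `first_stable_step` is a hypothetical helper, and the max-minus-min criterion is an assumption rather than the package's documented rule; the default window of 20 simply echoes the convergence_window argument in the class signature.

```python
def first_stable_step(losses, tolerance=1e-6, window=20):
    """Return the index of the first step that closes a stable window, or None.

    A window is called stable when the spread (max - min) of the last
    `window` losses is at most `tolerance`. This criterion is an assumption.
    """
    for end in range(window, len(losses) + 1):
        recent = losses[end - window:end]
        if max(recent) - min(recent) <= tolerance:
            return end - 1
    return None


# A loss that halves for ten steps and then stays flat at zero.
history = [1.0 / (2 ** i) for i in range(10)] + [0.0] * 5
step = first_stable_step(history, tolerance=1e-6, window=3)
```

Returning None when no window qualifies mirrors the `int | None` type of the convergence_step property.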
- detect_oscillation(threshold: float = 0.1) → bool
Detect if loss is oscillating rather than converging.
- property effective_sample_size: float
Return the effective sample size (ESS).
ESS is defined as (sum w_i)^2 / (sum w_i^2). It reflects the number of equally weighted observations that would yield the same variance as the current weighted estimator.
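The ESS formula above can be checked by hand with a few lines of plain Python. This standalone function is for illustration only and is not the property's implementation; the guard against all-zero weights is an added assumption.

```python
def effective_sample_size(weights):
    """Compute (sum w_i)^2 / (sum w_i^2) for a sequence of weights."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return (total * total) / total_sq


equal = effective_sample_size([1.0, 1.0, 1.0, 1.0])   # equal weights
skewed = effective_sample_size([3.0, 1.0])            # one dominant weight
```

With equal weights the ESS equals the number of observations; as the weights become more uneven it falls toward 1.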
- fit_one(obs: Any) → None
Consume a single observation and update weights.
- Parameters:
obs (mapping or object) – An observation containing demographic indicators. The attributes/keys age, gender, education and region must be accessible on the object. The values should be 0 or 1. Anything truthy is interpreted as 1.
- Returns:
The internal state is updated in place. The caller can inspect the properties weights, margins and loss after the call for diagnostics.
- Return type:
None
- property gradient_norm_history: list[float]
Return history of gradient norms for convergence analysis.
- partial_fit(obs: Any) → None
Consume a single observation and update weights.
- Parameters:
obs (mapping or object) – An observation containing demographic indicators. The attributes/keys age, gender, education and region must be accessible on the object. The values should be 0 or 1. Anything truthy is interpreted as 1.
- Returns:
The internal state is updated in place. The caller can inspect the properties weights, margins and loss after the call for diagnostics.
- Return type:
None
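The observation contract described for fit_one and partial_fit (a mapping or an object, with truthy values read as 1) can be illustrated with a small helper. `read_indicators` and `FEATURES` are hypothetical names for this sketch and are not part of the package; the four keys are the ones listed in the parameter description.

```python
from collections.abc import Mapping
from types import SimpleNamespace

FEATURES = ("age", "gender", "education", "region")


def read_indicators(obs):
    """Read the four indicators from a mapping or an object as 0/1 values."""
    if isinstance(obs, Mapping):
        values = [obs[name] for name in FEATURES]            # key access
    else:
        values = [getattr(obs, name) for name in FEATURES]   # attribute access
    # Anything truthy counts as 1, everything else as 0.
    return tuple(1 if value else 0 for value in values)


as_mapping = read_indicators({"age": 1, "gender": 0, "education": True, "region": ""})
as_object = read_indicators(SimpleNamespace(age=0, gender=2, education=None, region="x"))
```

Note that under a truthiness rule a non-empty string such as "0" would count as 1, so it is safer to pass explicit 0/1 integers or booleans.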
OnlineRakingMWU
- class onlinerake.OnlineRakingMWU(targets, learning_rate: float = 1.0, min_weight: float = 0.001, max_weight: float = 100.0, n_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False)
Bases: OnlineRakingSGD
Online raking via multiplicative weights updates.
- Parameters:
targets (Targets) – Target population proportions for each demographic characteristic.
learning_rate (float, optional) – Step size used in the exponent of the multiplicative update. A typical default is learning_rate=1.0. The algorithm automatically clips extreme exponents based on the weights dtype to prevent numerical overflow/underflow, making it robust even with very large learning rates.
min_weight (float, optional) – Lower bound applied to the weights after each update. This prevents weights from collapsing to zero. Must be positive.
max_weight (float, optional) – Upper bound applied to the weights after each update. This prevents runaway weights. Must exceed min_weight.
n_steps (int, optional) – Number of multiplicative updates applied each time a new observation arrives.
compute_weight_stats (bool or int, optional) – Controls how often weight distribution statistics are computed, as a performance trade-off. If True, compute on every call. If False, use cached values. If an integer k, compute every k observations. Default is False.
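A multiplicative update with exponent clipping, as described for learning_rate, might look like the sketch below. It assumes the common form w_i ← w_i · exp(−learning_rate · g_i), where g_i is the loss gradient for weight i. That form, the sign convention, the float64 limit and the names `MAX_EXPONENT` and `mwu_step` are assumptions for illustration, not the package's confirmed implementation.

```python
import math
import sys

# Largest exponent that math.exp can take for a Python float (float64)
# without overflowing, with a small safety margin.
MAX_EXPONENT = math.log(sys.float_info.max) - 1.0


def mwu_step(weights, gradients, learning_rate=1.0,
             min_weight=0.001, max_weight=100.0):
    """Apply one multiplicative update per weight, then clip to the bounds."""
    updated = []
    for w, g in zip(weights, gradients):
        # Clip the exponent first so math.exp cannot raise OverflowError.
        exponent = max(-MAX_EXPONENT, min(MAX_EXPONENT, -learning_rate * g))
        scaled = w * math.exp(exponent)
        updated.append(min(max(scaled, min_weight), max_weight))
    return updated


# Extreme gradients end up at the weight bounds instead of overflowing.
extreme = mwu_step([1.0, 1.0], [1e6, -1e6])
```

Because the update multiplies by a positive factor, weights stay positive by construction; the bounds then control how far any single weight can shrink or grow.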