API Reference

This page provides detailed documentation for all public classes and functions in the onlinerake package.

Core Classes

  • Targets – Target population proportions for binary demographics.

  • OnlineRakingSGD – Online raking via stochastic gradient descent.

  • OnlineRakingMWU – Online raking via multiplicative weights updates.

Targets

class onlinerake.Targets(age: float = 0.5, gender: float = 0.5, education: float = 0.4, region: float = 0.3)

Bases: object

Target population proportions for binary demographics.

Each attribute represents the desired proportion of cases with indicator value 1. If your survey uses different definitions or more categories per characteristic, either extend this class with additional fields or refactor your raking logic accordingly.

age: float = 0.5
as_dict() → dict

Return the targets as a plain dictionary.

Useful for iterating over targets programmatically.

education: float = 0.4
gender: float = 0.5
region: float = 0.3
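The following is a minimal sketch of how the documented fields and as_dict() fit together. TargetsSketch is a hypothetical stand-in that mirrors the signature shown above; it assumes a dataclass-style layout and is not the package's source.

```python
from dataclasses import dataclass, asdict


@dataclass
class TargetsSketch:
    """Hypothetical stand-in mirroring the documented Targets signature."""

    age: float = 0.5
    gender: float = 0.5
    education: float = 0.4
    region: float = 0.3

    def as_dict(self) -> dict:
        # Plain dict of feature name -> target proportion of indicator value 1.
        return asdict(self)


# Override two defaults and iterate over all targets programmatically.
targets = TargetsSketch(age=0.45, education=0.35)
for name, proportion in targets.as_dict().items():
    print(f"{name}: {proportion:.2f}")
```

With the package installed, onlinerake.Targets would take the place of the stand-in class.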

OnlineRakingSGD

class onlinerake.OnlineRakingSGD(targets: Targets, learning_rate: float = 5.0, min_weight: float = 0.001, max_weight: float = 100.0, n_sgd_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False)

Bases: object

Online raking via stochastic gradient descent.

Parameters:
  • targets (Targets) – Target population proportions for each demographic characteristic.

  • learning_rate (float, optional) – Step size used in the gradient descent update. Larger values lead to more aggressive updates but may cause oscillation or divergence.

  • min_weight (float, optional) – Lower bound applied to the weights after each update to prevent weights from collapsing to zero. Must be positive.

  • max_weight (float, optional) – Upper bound applied to the weights after each update to prevent runaway weights. Must exceed min_weight.

  • n_sgd_steps (int, optional) – Number of gradient steps applied each time a new observation arrives. Values larger than 1 can help reduce oscillations but increase computational cost.

  • compute_weight_stats (bool or int, optional) – Controls how often weight distribution statistics are recomputed, as a performance trade-off. If True, recompute on every call. If False, use cached values. If an integer k, recompute every k observations. Default is False.
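One way a bool-or-int flag such as compute_weight_stats could be interpreted is sketched below. The helper should_recompute and the observation counter n_obs are hypothetical names used for illustration; the package's internal logic may differ.

```python
def should_recompute(compute_weight_stats, n_obs):
    """Hypothetical helper: decide whether to refresh weight statistics."""
    # Check bool before int, because True and False are also ints in Python.
    if isinstance(compute_weight_stats, bool):
        return compute_weight_stats
    # Integer k: refresh on every k-th observation.
    return compute_weight_stats > 0 and n_obs % compute_weight_stats == 0


# With k = 5, statistics would be refreshed at observations 5 and 10.
schedule = [n for n in range(1, 11) if should_recompute(5, n)]
print(schedule)
```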

Notes

  • For binary demographic indicators the gradient of the margin with respect to each weight can be derived analytically. See the documentation for details.

  • The algorithm does not currently support categorical controls with more than two levels. Extending to multi‑level categories would require storing one-hot encodings and expanding the margin loss accordingly.
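The analytic gradient mentioned in the notes can be sketched as follows. The sketch assumes the weighted margin for feature k is m_k = sum_i(w_i x_ik) / sum_i(w_i) and the loss is sum_k (m_k - t_k)^2, which gives dL/dw_i = sum_k 2 (m_k - t_k) (x_ik - m_k) / sum_j(w_j). The function names are hypothetical and the package's actual implementation may differ in detail.

```python
def margins_and_gradient(weights, rows, targets):
    """Weighted margins and the gradient of the squared-error margin loss."""
    total = sum(weights)
    k_range = range(len(targets))
    # m_k = sum_i(w_i * x_ik) / sum_i(w_i)
    margins = [sum(w * row[k] for w, row in zip(weights, rows)) / total for k in k_range]
    # dL/dw_i = sum_k 2 * (m_k - t_k) * (x_ik - m_k) / sum_j(w_j)
    grad = [
        sum(2.0 * (margins[k] - targets[k]) * (row[k] - margins[k]) / total for k in k_range)
        for row in rows
    ]
    return margins, grad


def sgd_step(weights, rows, targets, learning_rate=5.0, min_weight=1e-3, max_weight=100.0):
    """One gradient step followed by clipping to [min_weight, max_weight]."""
    _, grad = margins_and_gradient(weights, rows, targets)
    return [min(max(w - learning_rate * g, min_weight), max_weight) for w, g in zip(weights, grad)]


# One binary feature, three observations, target proportion 0.5.
rows = [[1], [0], [0]]
weights = [1.0, 1.0, 1.0]
targets = [0.5]
before, _ = margins_and_gradient(weights, rows, targets)
weights = sgd_step(weights, rows, targets)
after, _ = margins_and_gradient(weights, rows, targets)
print(before[0], after[0])
```

In this toy case a single step moves the weighted margin from 1/3 toward the target of 0.5 by raising the weight of the one observation with indicator value 1.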

check_convergence(tolerance: float = 1e-06) → bool

Check whether the algorithm has converged, based on loss stability.

Parameters:

tolerance (float) – Convergence tolerance for loss stability.

Returns:

True if convergence is detected.

Return type:

bool

property converged: bool

Return True if the algorithm has detected convergence.

property convergence_step: int | None

Return the step number where convergence was detected, if any.

detect_oscillation(threshold: float = 0.1) → bool

Detect whether the loss is oscillating rather than converging.

Parameters:

threshold (float) – Relative threshold for distinguishing oscillation from a trend.

Returns:

True if oscillation is detected in recent loss history.

Return type:

bool

property effective_sample_size: float

Return the effective sample size (ESS).

ESS is defined as (sum w_i)^2 / (sum w_i^2). It reflects the number of equally weighted observations that would yield the same variance as the current weighted estimator.
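The ESS formula above can be checked by hand with a few lines of plain Python. The function below is an illustrative sketch, not the package's property implementation.

```python
def effective_sample_size(weights):
    """ESS = (sum w_i)^2 / (sum w_i^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


# Equal weights: ESS equals the number of observations.
equal = effective_sample_size([1.0, 1.0, 1.0, 1.0])   # 16 / 4 = 4.0
# One dominant weight: ESS drops well below the sample size of 4.
skewed = effective_sample_size([1.0, 1.0, 1.0, 5.0])  # 64 / 28, about 2.29
print(equal, skewed)
```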

fit_one(obs: Any) → None

Consume a single observation and update weights.

Parameters:

obs (mapping or object) – An observation containing demographic indicators. The attributes/keys age, gender, education and region must be accessible on the object. The values should be 0 or 1. Anything truthy is interpreted as 1.

Returns:

The internal state is updated in place. The caller can inspect the properties weights, margins and loss after the call for diagnostics.

Return type:

None

property gradient_norm_history: list[float]

Return history of gradient norms for convergence analysis.

property loss: float

Return the current squared‑error loss on margins.

property loss_moving_average: float

Return moving average of loss over convergence window.

property margins: dict[str, float]

Return current weighted margins as a dictionary.

partial_fit(obs: Any) → None

Consume a single observation and update weights.

Parameters:

obs (mapping or object) – An observation containing demographic indicators. The attributes/keys age, gender, education and region must be accessible on the object. The values should be 0 or 1. Anything truthy is interpreted as 1.

Returns:

The internal state is updated in place. The caller can inspect the properties weights, margins and loss after the call for diagnostics.

Return type:

None
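The accepted observation shapes described above (a mapping or an object, with truthy values read as 1) can be illustrated with a small normalizer. The helper to_indicators is hypothetical and only mirrors the documented behaviour; it is not part of the package.

```python
from collections.abc import Mapping
from types import SimpleNamespace

FEATURES = ("age", "gender", "education", "region")


def to_indicators(obs):
    """Hypothetical helper: read the four indicators from a mapping or an object."""
    out = {}
    for name in FEATURES:
        # Mappings are read by key, other objects by attribute.
        value = obs[name] if isinstance(obs, Mapping) else getattr(obs, name)
        # Anything truthy counts as 1, everything else as 0.
        out[name] = 1 if value else 0
    return out


as_mapping = to_indicators({"age": 1, "gender": 0, "education": True, "region": "yes"})
as_object = to_indicators(SimpleNamespace(age=0, gender=1, education=0, region=None))
print(as_mapping, as_object)
```

With the package installed, either observation shape could be passed to partial_fit one at a time, and weights, margins and loss inspected after each call.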

property raw_margins: dict[str, float]

Return unweighted (raw) margins as a dictionary.

property weight_distribution_stats: dict[str, float]

Return comprehensive weight distribution statistics.

property weights: ndarray

Return a copy of the current weight vector.

OnlineRakingMWU

class onlinerake.OnlineRakingMWU(targets, learning_rate: float = 1.0, min_weight: float = 0.001, max_weight: float = 100.0, n_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False)

Bases: OnlineRakingSGD

Online raking via multiplicative weights updates.

Parameters:
  • targets (Targets) – Target population proportions for each demographic characteristic.

  • learning_rate (float, optional) – Step size used in the exponent of the multiplicative update. The default is 1.0. The algorithm automatically clips extreme exponents based on the dtype of the weights to prevent numerical overflow or underflow, which keeps it stable even with very large learning rates.

  • min_weight (float, optional) – Lower bound applied to the weights after each update. This prevents weights from collapsing to zero. Must be positive.

  • max_weight (float, optional) – Upper bound applied to the weights after each update. This prevents runaway weights. Must exceed min_weight.

  • n_steps (int, optional) – Number of multiplicative updates applied each time a new observation arrives.

  • compute_weight_stats (bool or int, optional) – Controls how often weight distribution statistics are recomputed, as a performance trade-off. If True, recompute on every call. If False, use cached values. If an integer k, recompute every k observations. Default is False.
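The parameters above can be related to one common form of a multiplicative weights update, w_i ← w_i · exp(−learning_rate · g_i), followed by clipping. This is a sketch under that assumption: the function mwu_step, the gradient input grad, and the choice of exponent bound are illustrative and may not match the package's internals.

```python
import math
import sys

# Largest exponent that is safe for float64, with a small safety margin.
MAX_EXPONENT = math.log(sys.float_info.max) - 1.0


def mwu_step(weights, grad, learning_rate=1.0, min_weight=1e-3, max_weight=100.0):
    """One multiplicative update with exponent clipping and weight bounds."""
    updated = []
    for w, g in zip(weights, grad):
        # Clip the exponent so math.exp cannot overflow.
        exponent = max(-MAX_EXPONENT, min(MAX_EXPONENT, -learning_rate * g))
        w_new = w * math.exp(exponent)
        # Keep the weight inside [min_weight, max_weight].
        updated.append(min(max(w_new, min_weight), max_weight))
    return updated


# Extreme gradients and a very large learning rate still give bounded weights.
extreme = mwu_step([1.0, 1.0], [1e6, -1e6], learning_rate=1000.0)
# A zero gradient leaves the weight unchanged.
unchanged = mwu_step([1.0], [0.0])
print(extreme, unchanged)
```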

fit_one(obs: Any) → None

Consume a single observation and update weights multiplicatively.

partial_fit(obs: Any) → None

Consume a single observation and update weights multiplicatively.