API Reference¶

This page provides detailed documentation for all public classes and functions in the onlinerake package.

Core Classes¶

Targets

Target population margins for binary and continuous features.

OnlineRakingSGD

Online raking via stochastic gradient descent.

OnlineRakingMWU

Online raking via multiplicative weights updates.

Targets¶

class onlinerake.Targets(**kwargs: float | tuple[float, str])[source]¶

Bases: object

Target population margins for binary and continuous features.

A flexible container for specifying target proportions (for binary features) or target means (for continuous features).

Parameters:

**kwargs –

Named feature targets. Each key is a feature name and each value specifies the target:

  • For binary features: a float in [0, 1] representing the target proportion of the population where that feature is 1/True.

  • For continuous features: a tuple (value, "mean") where value is the target mean (any real number).

_targets¶

Internal storage of target values.

Type:

dict[str, float]

_feature_types¶

Maps feature names to “binary” or “continuous”.

Type:

dict[str, str]

_feature_names¶

Sorted list of feature names for consistent ordering.

Type:

list[str]

Private Methods:

_validate_feature_exists: Validates that a feature is defined in targets.

Examples

>>> # Binary features only (backward compatible)
>>> targets = Targets(owns_car=0.4, is_subscriber=0.2, likes_coffee=0.7)
>>> print(targets.feature_names)
['is_subscriber', 'likes_coffee', 'owns_car']
>>> # Mixed binary and continuous features
>>> targets = Targets(
...     gender=0.5,                 # binary: 50% female
...     college=0.35,               # binary: 35% college educated
...     age=(42.0, "mean"),         # continuous: mean age 42
...     income=(65000, "mean"),     # continuous: mean income $65k
... )
>>> print(targets.is_binary("gender"))
True
>>> print(targets.is_continuous("age"))
True
>>> print(targets["age"])
42.0
>>> # Access target values
>>> print(targets['college'])
0.35
>>> # Check if feature exists
>>> print('college' in targets)
True
Raises:

ValueError – If any binary target proportion is not between 0 and 1, or if the tuple syntax is malformed.

Note

Feature names are stored in sorted order for consistent behavior across different Python versions and hash randomization settings.
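To make the binary/continuous distinction concrete, here is a simplified, hypothetical sketch of how a Targets-like container could classify and validate its keyword arguments; the names and internals are illustrative, not the actual onlinerake implementation.

```python
def classify_targets(**kwargs):
    """Split keyword targets into values and feature types (illustrative only)."""
    values, feature_types = {}, {}
    for name, spec in kwargs.items():
        if isinstance(spec, tuple):
            # Continuous feature: expects (value, "mean").
            if len(spec) != 2 or spec[1] != "mean":
                raise ValueError(f"malformed tuple target for {name!r}: {spec!r}")
            values[name] = float(spec[0])
            feature_types[name] = "continuous"
        else:
            # Binary feature: a proportion in [0, 1].
            value = float(spec)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"binary target for {name!r} must be in [0, 1]")
            values[name] = value
            feature_types[name] = "binary"
    # Sorted names give deterministic ordering across runs.
    return values, feature_types, sorted(kwargs)

values, types, names = classify_targets(gender=0.5, age=(42.0, "mean"))
```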

as_dict() dict[str, float][source]¶

Convert targets to a dictionary of values.

Returns:

Dictionary mapping feature names to target values (proportions for binary, means for continuous).

Return type:

dict[str, float]

Examples

>>> targets = Targets(owns_car=0.4, is_subscriber=0.2)
>>> targets.as_dict()
{'owns_car': 0.4, 'is_subscriber': 0.2}
property binary_features: list[str]¶

Get list of binary feature names.

Returns:

Sorted list of binary feature names.

Return type:

list[str]

Examples

>>> targets = Targets(gender=0.5, age=(35.0, "mean"))
>>> targets.binary_features
['gender']
property continuous_features: list[str]¶

Get list of continuous feature names.

Returns:

Sorted list of continuous feature names.

Return type:

list[str]

Examples

>>> targets = Targets(gender=0.5, age=(35.0, "mean"))
>>> targets.continuous_features
['age']
property feature_names: list[str]¶

Get ordered list of feature names.

Returns:

Sorted list of feature names.

Return type:

list[str]

Examples

>>> targets = Targets(b=0.5, a=0.3, c=0.7)
>>> targets.feature_names
['a', 'b', 'c']
feature_type(feature: str) str[source]¶

Get the type of a feature.

Parameters:

feature – Feature name to look up.

Returns:

Either “binary” or “continuous”.

Return type:

str

Raises:

KeyError – If feature name is not defined in targets.

Examples

>>> targets = Targets(gender=0.5, age=(35.0, "mean"))
>>> targets.feature_type("gender")
'binary'
>>> targets.feature_type("age")
'continuous'
property has_continuous_features: bool¶

Check if any continuous features are defined.

Returns:

True if at least one continuous feature is defined.

Return type:

bool

is_binary(feature: str) bool[source]¶

Check if a feature is binary.

Parameters:

feature – Feature name to check.

Returns:

True if feature is binary, False otherwise.

Return type:

bool

Raises:

KeyError – If feature name is not defined in targets.

Examples

>>> targets = Targets(gender=0.5, age=(35.0, "mean"))
>>> targets.is_binary("gender")
True
>>> targets.is_binary("age")
False
is_continuous(feature: str) bool[source]¶

Check if a feature is continuous.

Parameters:

feature – Feature name to check.

Returns:

True if feature is continuous, False otherwise.

Return type:

bool

Raises:

KeyError – If feature name is not defined in targets.

Examples

>>> targets = Targets(gender=0.5, age=(35.0, "mean"))
>>> targets.is_continuous("age")
True
>>> targets.is_continuous("gender")
False
property n_features: int¶

Get number of features.

Returns:

Number of features defined in these targets.

Return type:

int

Examples

>>> targets = Targets(a=0.5, b=0.3, c=0.7)
>>> targets.n_features
3

OnlineRakingSGD¶

class onlinerake.OnlineRakingSGD(targets: Targets, learning_rate: float | LearningRateSchedule = 5.0, min_weight: float = 0.001, max_weight: float = 100.0, n_sgd_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False, max_history: int | None = 1000, track_kl_divergence: bool = False)[source]¶

Bases: object

Online raking via stochastic gradient descent.

A streaming weight calibration algorithm that adjusts observation weights to match target population margins using stochastic gradient descent (SGD). The algorithm minimizes squared-error loss between weighted margins and target proportions.

Parameters:
  • targets – Target population proportions for each feature.

  • learning_rate – Step size for gradient descent updates. Can be a float for a fixed learning rate (default: 5.0) or a LearningRateSchedule for a dynamic schedule (e.g., robbins_monro_schedule()). Larger values lead to faster convergence but may cause oscillation.

  • min_weight – Lower bound for weights to prevent collapse. Must be positive. Default: 0.001.

  • max_weight – Upper bound for weights to prevent explosion. Must exceed min_weight. Default: 100.0.

  • n_sgd_steps – Number of gradient steps per observation. More steps can reduce oscillations but increase computation. Default: 3.

  • verbose – If True, log progress information. Default: False.

  • track_convergence – If True, monitor convergence metrics. Default: True.

  • convergence_window – Number of observations for convergence detection. Default: 20.

  • compute_weight_stats – Control weight statistics computation. If True: compute every observation. If False: never compute (best performance). If int k: compute every k observations. Default: False.

  • max_history – Maximum historical states to retain. None for unlimited (may cause memory issues). Default: 1000.

  • track_kl_divergence – If True, track KL divergence between consecutive weight distributions. Useful for monitoring convergence and comparing algorithms (e.g., to IPF). Default: False.

targets¶

The target proportions.

history¶

List of historical states after each update.

Examples

>>> # General features
>>> targets = Targets(owns_car=0.4, is_subscriber=0.2)
>>> raker = OnlineRakingSGD(targets, learning_rate=5.0)
>>> raker.partial_fit({'owns_car': 1, 'is_subscriber': 0})
>>> print(f"Loss: {raker.loss:.4f}")
>>> # Process multiple observations
>>> for obs in stream:
...     raker.partial_fit(obs)
...     if raker.converged:
...         break
Raises:

ValueError – If any parameter is invalid (negative learning rate, invalid weight bounds, non-positive convergence window, invalid compute_weight_stats).

Note

The algorithm supports arbitrary binary features, not limited to demographics. Feature names must match those defined in targets.
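The squared-error update described above can be sketched in a few lines of NumPy. This is a self-contained toy, assuming the loss L = Σ_f (margin_f − target_f)² over weighted feature means; the function name and defaults are illustrative, not the library's internals.

```python
import numpy as np

def sgd_rake_step(weights, X, targets, lr=5.0, min_w=1e-3, max_w=100.0):
    """One gradient step on L = sum_f (margin_f - target_f)**2."""
    W = weights.sum()
    margins = weights @ X / W                    # current weighted feature means
    # dL/dw_i = sum_f 2 * (margin_f - t_f) * (x_if - margin_f) / W
    grad = 2.0 * (X - margins) @ (margins - targets) / W
    return np.clip(weights - lr * grad, min_w, max_w)

# Toy stream: 200 rows of two binary features with ~60% / ~30% incidence,
# calibrated toward targets of 40% / 20%.
rng = np.random.default_rng(0)
X = (rng.random((200, 2)) < np.array([0.6, 0.3])).astype(float)
targets = np.array([0.4, 0.2])
weights = np.ones(len(X))
for _ in range(300):
    weights = sgd_rake_step(weights, X, targets)
# The weighted margins move toward the targets as the loss shrinks.
```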

check_convergence(tolerance: float = 1e-06) bool[source]¶

Check if algorithm has converged based on loss stability.

Parameters:

tolerance – Convergence tolerance. Smaller values require more stable loss. Default: 1e-6.

Returns:

True if convergence detected, False otherwise.

Note

Convergence is detected when loss is near zero or when relative standard deviation of recent losses is below tolerance.
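The rule in the note above (loss near zero, or relative standard deviation of recent losses below tolerance) can be sketched as follows; the library's internal logic may differ in detail.

```python
import numpy as np

def has_converged(recent_losses, tolerance=1e-6):
    """Illustrative convergence check over a window of recent loss values."""
    losses = np.asarray(recent_losses, dtype=float)
    if losses.size == 0:
        return False
    if losses[-1] < tolerance:
        return True  # essentially perfect calibration
    mean = losses.mean()
    # Relative standard deviation: stable losses imply convergence.
    return bool(mean > 0 and losses.std() / mean < tolerance)
```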

property converged: bool¶

Return True if the algorithm has detected convergence.

property convergence_step: int | None¶

Get step number where convergence was detected.

Returns:

Observation number where convergence detected, or None if not yet converged.

property cumulative_kl_divergence: float¶

Get total KL divergence accumulated over all updates.

Sum of D_KL(w_t || w_{t-1}) across all t. This measures the total “path length” in KL space taken by the algorithm.

Returns:

Cumulative KL divergence. Returns 0.0 if tracking disabled or no updates processed.

Note

Useful for comparing algorithms: MWU should accumulate less KL divergence than SGD when starting from uniform weights, since MWU explicitly minimizes KL.
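One plausible way to compute a single term D_KL(w_t || w_{t-1}) of this sum is to normalize each weight vector to a probability distribution first; the library may use a different convention, and this sketch assumes both vectors have the same length and strictly positive entries.

```python
import numpy as np

def kl_divergence(w_new, w_old):
    """D_KL between two positive weight vectors, each normalized to sum to 1."""
    p = np.asarray(w_new, dtype=float)
    q = np.asarray(w_old, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))
```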

property current_learning_rate: float¶

Get current learning rate (may vary if using a schedule).

detect_oscillation(threshold: float = 0.1) bool[source]¶

Detect if loss is oscillating rather than converging.

Parameters:

threshold – Relative threshold for detecting oscillation vs trend. Higher values are less sensitive to oscillation. Default: 0.1.

Returns:

True if oscillation detected in recent loss history, False otherwise.

Note

Oscillation suggests the learning rate may be too high.

property effective_sample_size: float¶

Return the effective sample size (ESS).

ESS is defined as (sum w_i)^2 / (sum w_i^2). It reflects the number of equally weighted observations that would yield the same variance as the current weighted estimator.
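The formula above translates directly into code (a straightforward computation, not the library's internals):

```python
import numpy as np

def effective_sample_size(weights):
    """ESS = (sum w_i)^2 / (sum w_i^2)."""
    w = np.asarray(weights, dtype=float)
    return float(w.sum() ** 2 / np.sum(w ** 2))
```

Equal weights give ESS equal to the number of observations; one dominant weight drives ESS toward 1.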

fit_one(obs: dict[str, Any] | Any) None¶

Process single observation and update weights.

Parameters:

obs – Observation containing feature values. Can be a dict (keys should match feature names in targets) or an object (features accessed as attributes). For binary features: values should be 0/1 or False/True. For continuous features: values should be numeric (float/int). Missing features default to 0.

Returns:

None. Updates internal state in place.

Examples

>>> # Binary features only
>>> targets = Targets(owns_car=0.4, is_subscriber=0.2)
>>> raker = OnlineRakingSGD(targets)
>>> raker.partial_fit({'owns_car': 1, 'is_subscriber': 0})
>>> # Mixed binary and continuous features
>>> targets = Targets(gender=0.5, age=(35.0, "mean"))
>>> raker = OnlineRakingSGD(targets)
>>> raker.partial_fit({'gender': 1, 'age': 42.5})
>>> # Object input (e.g., dataclass or namedtuple)
>>> from dataclasses import dataclass
>>> @dataclass
... class Obs:
...     owns_car: int
...     is_subscriber: int
>>> raker.partial_fit(Obs(owns_car=1, is_subscriber=0))

Note

After calling, inspect weights, margins, and loss properties for current state.

property gradient_norm_history: list[float]¶

Get history of gradient norms.

Returns:

List of gradient norms from each SGD step. Useful for analyzing convergence behavior.

property kl_divergence_history: list[float]¶

Get history of KL divergence between consecutive weight updates.

Only populated if track_kl_divergence=True was passed to __init__. Each entry represents D_KL(w_t || w_{t-1}) after processing observation t.

Returns:

List of KL divergence values. Empty if tracking disabled.

Note

MWU minimizes KL divergence from previous weights at each step, so this tracks how much the distribution changes per update. Smaller values indicate the algorithm is stabilizing.

property learning_rate_history: list[float]¶

Get history of learning rates used at each observation.

property loss: float¶

Get current squared-error loss.

Computes sum of squared differences between current weighted margins and target proportions.

Returns:

Squared-error loss. Returns NaN if no observations processed. Lower values indicate better calibration to targets.

Examples

>>> # Perfect calibration would have loss near 0
>>> raker = OnlineRakingSGD(targets)
>>> # Process many observations...
>>> if raker.loss < 0.001:
...     print("Well calibrated")
property loss_moving_average: float¶

Return moving average of loss over convergence window.

property margins: dict[str, float]¶

Get current weighted margins.

Computes the weighted proportion of observations where each feature equals 1, using the current weight vector.

Returns:

Dictionary mapping feature names to weighted proportions. Returns NaN for all features if no observations processed.

Examples

>>> targets = Targets(a=0.5, b=0.3)
>>> raker = OnlineRakingSGD(targets)
>>> raker.partial_fit({'a': 1, 'b': 0})
>>> margins = raker.margins
>>> print(margins['a'] > margins['b'])  # a=1, b=0 in observation
True
partial_fit(obs: dict[str, Any] | Any) None[source]¶

Process single observation and update weights.

Parameters:

obs – Observation containing feature values. Can be a dict (keys should match feature names in targets) or an object (features accessed as attributes). For binary features: values should be 0/1 or False/True. For continuous features: values should be numeric (float/int). Missing features default to 0.

Returns:

None. Updates internal state in place.

Examples

>>> # Binary features only
>>> targets = Targets(owns_car=0.4, is_subscriber=0.2)
>>> raker = OnlineRakingSGD(targets)
>>> raker.partial_fit({'owns_car': 1, 'is_subscriber': 0})
>>> # Mixed binary and continuous features
>>> targets = Targets(gender=0.5, age=(35.0, "mean"))
>>> raker = OnlineRakingSGD(targets)
>>> raker.partial_fit({'gender': 1, 'age': 42.5})
>>> # Object input (e.g., dataclass or namedtuple)
>>> from dataclasses import dataclass
>>> @dataclass
... class Obs:
...     owns_car: int
...     is_subscriber: int
>>> raker.partial_fit(Obs(owns_car=1, is_subscriber=0))

Note

After calling, inspect weights, margins, and loss properties for current state.

partial_fit_batch(observations: list[dict[str, Any] | Any]) None[source]¶

Process multiple observations in batch.

Parameters:

observations – List of observations, each in same format as for partial_fit method.

Returns:

None. Updates internal state for all observations.

Examples

>>> observations = [
...     {'feature_a': 1, 'feature_b': 0},
...     {'feature_a': 0, 'feature_b': 1},
...     {'feature_a': 1, 'feature_b': 1},
... ]
>>> raker.partial_fit_batch(observations)

Note

Currently processes observations sequentially. Future versions may implement true batch processing for better performance.

property raw_margins: dict[str, float]¶

Get unweighted (raw) margins.

Computes the simple proportion of observations where each feature equals 1, without using weights.

Returns:

Dictionary mapping feature names to unweighted proportions. Returns NaN for all features if no observations processed.

Note

Useful for comparing weighted vs unweighted margins to assess the impact of the raking process.
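The comparison described in the note can be sketched for a batch of binary observations; the function name here is illustrative, not part of the package's API.

```python
import numpy as np

def raw_and_weighted_margins(X, weights):
    """Return (unweighted, weighted) feature proportions for binary rows."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    raw = X.mean(axis=0)          # simple proportions, ignoring weights
    weighted = w @ X / w.sum()    # weight-calibrated proportions
    return raw, weighted
```

With uniform weights the two coincide; the gap between them shows how much the raking process has shifted the margins.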

property uses_lr_schedule: bool¶

Return True if using a dynamic learning rate schedule.

property weight_distribution_stats: dict[str, float]¶

Return comprehensive weight distribution statistics.

property weights: ndarray[tuple[Any, ...], dtype[float64]]¶

Get copy of current weight vector.

Returns:

Array of shape (n_obs,) containing current weights.

Examples

>>> raker = OnlineRakingSGD(targets)
>>> raker.partial_fit({'feature_a': 1, 'feature_b': 0})
>>> weights = raker.weights
>>> print(weights.shape)
(1,)

OnlineRakingMWU¶

class onlinerake.OnlineRakingMWU(targets, learning_rate: float = 1.0, min_weight: float = 0.001, max_weight: float = 100.0, n_sgd_steps: int = 3, verbose: bool = False, track_convergence: bool = True, convergence_window: int = 20, compute_weight_stats: bool | int = False, track_kl_divergence: bool = False)[source]¶

Bases: OnlineRakingSGD

Online raking via multiplicative weights updates.

Parameters:
  • targets (Targets) – Target population proportions for each feature.

  • learning_rate (float, optional) – Step size used in the exponent of the multiplicative update. A typical default is learning_rate=1.0. The algorithm automatically clips extreme exponents based on the weights dtype to prevent numerical overflow/underflow, making it robust even with very large learning rates.

  • min_weight (float, optional) – Lower bound applied to the weights after each update. This prevents weights from collapsing to zero. Must be positive.

  • max_weight (float, optional) – Upper bound applied to the weights after each update. This prevents runaway weights. Must exceed min_weight.

  • n_sgd_steps (int, optional) – Number of multiplicative updates applied each time a new observation arrives.

  • compute_weight_stats (bool or int, optional) – Controls computation of weight distribution statistics for performance. If True, compute on every call. If False, never compute. If integer k, compute every k observations. Default is False.
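The multiplicative update with exponent clipping described for learning_rate can be sketched as below. This is a toy under the same squared-error loss as the SGD variant; the exact update rule and clipping bounds inside onlinerake may differ.

```python
import numpy as np

def mwu_rake_step(weights, X, targets, lr=1.0, min_w=1e-3, max_w=100.0):
    """One multiplicative-weights update on the squared-error raking loss."""
    W = weights.sum()
    margins = weights @ X / W
    grad = 2.0 * (X - margins) @ (margins - targets) / W
    # Clip the exponent so exp() cannot overflow/underflow, even for
    # very large learning rates.
    exponent = np.clip(-lr * grad, -50.0, 50.0)
    return np.clip(weights * np.exp(exponent), min_w, max_w)
```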

fit_one(obs: dict[str, Any] | Any) None¶

Consume a single observation and update weights multiplicatively.

Parameters:

obs (dict or object) – An observation containing feature values. For dict input, keys should match feature names in targets. For object input, features are accessed as attributes. For binary features, values should be 0/1 or False/True. For continuous features, values should be numeric (float/int).

Returns:

The internal state is updated in place.

Return type:

None

partial_fit(obs: dict[str, Any] | Any) None[source]¶

Consume a single observation and update weights multiplicatively.

Parameters:

obs (dict or object) – An observation containing feature values. For dict input, keys should match feature names in targets. For object input, features are accessed as attributes. For binary features, values should be 0/1 or False/True. For continuous features, values should be numeric (float/int).

Returns:

The internal state is updated in place.

Return type:

None