API Reference

This page contains the API reference for fairlex.

Calibration Functions

Core calibration routines for fairlex.

This module contains implementations of leximin-style calibration for survey weights. Two variants are provided:

  • leximin_residual minimises the worst absolute deviation between the calibrated and target margins (a min-max problem). It is akin to solving a Chebyshev approximation on the residuals. While this drives margin errors down, it can lead to large deviations from the original weights if the margin targets are difficult to meet within bounds.

  • leximin_weight_fair first performs the residual leximin step and then minimises the largest relative change from the base weights, subject to keeping the residuals at the optimum level (plus a small optional slack). This spreads the adjustments more evenly across units and yields a more stable set of weights.

Both functions accept a membership matrix A of shape (m, n) where m is the number of margins and n is the number of units. Each row should correspond to a margin (e.g., a demographic group), and the entries indicate whether a unit belongs to that margin (1.0 or 0.0, or continuous weights for soft membership). The target totals b must be of length m. The base weights w0 must be of length n.
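
As an illustration of the layout described above, the following sketch builds a 0/1 membership matrix from per-unit group labels in plain Python. The helper name membership_matrix and the toy labels are assumptions made for this example and are not part of fairlex; the final all-ones row is one common way to include a population-total margin.

```python
def membership_matrix(labels, groups):
    """Return a list of rows: one 0/1 row per group, then an all-ones total row.

    Hypothetical helper for illustration only (not part of fairlex).
    """
    rows = [[1.0 if label == group else 0.0 for label in labels]
            for group in groups]
    rows.append([1.0] * len(labels))  # margin covering every unit
    return rows


# Four units (n = 4) and three groups plus a total row (m = 4).
labels_demo = ["a", "b", "a", "c"]
A_demo = membership_matrix(labels_demo, ["a", "b", "c"])
for row in A_demo:
    print(row)
```

Each row of the result has length n, so the target vector b needs one entry per row and the base weights w0 one entry per column.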

Weight bounds are specified as multiplicative factors relative to w0. For example, min_ratio=0.5 and max_ratio=2.0 constrain each calibrated weight to lie between half and twice its original value.
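
The sketch below shows how such ratios translate into per-unit lower and upper bounds. It is a plain-Python illustration under the stated convention; the helper name weight_bounds is hypothetical and is not a fairlex function.

```python
def weight_bounds(w0, min_ratio, max_ratio):
    """Return (lower, upper) lists with one bound per unit, scaled from w0.

    Hypothetical helper for illustration only (not part of fairlex).
    """
    if not 0 <= min_ratio <= max_ratio:
        raise ValueError("expected 0 <= min_ratio <= max_ratio")
    lower = [min_ratio * w for w in w0]
    upper = [max_ratio * w for w in w0]
    return lower, upper


base_demo = [1.0, 2.0, 4.0]
lower_demo, upper_demo = weight_bounds(base_demo, min_ratio=0.5, max_ratio=2.0)
print(lower_demo)  # half of each base weight
print(upper_demo)  # twice each base weight
```

Because the bounds scale with w0, a unit with a large base weight is allowed a proportionally larger absolute adjustment.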

The underlying optimisation problems are solved via scipy.optimize.linprog using the HiGHS solvers. If SciPy is unavailable, attempting to call these functions will raise an informative ImportError.

class fairlex.calibration.CalibrationResult(w, epsilon, t, status, message)

Bases: object

Structured result from a calibration call.

w

Calibrated weights of shape (n,).

Type:

ndarray

epsilon

The worst absolute residual achieved in the residual minimisation problem. Only meaningful for leximin_residual.

Type:

float

t

The worst relative weight change achieved in the weight fairness problem. None if only the residual stage is performed.

Type:

Optional[float]

status

Status code from the linear programme (0 indicates success).

Type:

int

message

Solver termination message for diagnostics.

Type:

str

Parameters:
  • w (ndarray)

  • epsilon (float)

  • t (float | None)

  • status (int)

  • message (str)

__init__(w, epsilon, t, status, message)

fairlex.calibration.leximin_residual(A, b, w0, *, min_ratio=0.1, max_ratio=10.0)

Compute weights by minimising the worst absolute margin residual.

This function solves

\[\min_{w, \epsilon}\;\epsilon \quad\text{such that}\quad -\epsilon \le A\,w - b \le \epsilon, \quad \text{and}\quad w_i \in [w_{0,i}\,\text{min\_ratio},\, w_{0,i}\,\text{max\_ratio}].\]
Parameters:
  • A (ndarray) – Membership matrix of shape (m, n).

  • b (ndarray) – Target totals of shape (m,).

  • w0 (ndarray) – Base weights of shape (n,).

  • min_ratio (float, optional) – Lower bound on weights relative to w0. Defaults to 0.1.

  • max_ratio (float, optional) – Upper bound on weights relative to w0. Defaults to 10.0.

Returns:

Structured result containing the weights, the optimum epsilon, and solver diagnostics.

Return type:

CalibrationResult

Notes

If the problem is infeasible (e.g., because the bounds preclude any solution), the returned status will be nonzero and the weights may not be meaningful. Check status and message on the result.
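
The min-max problem above can be posed as an ordinary linear programme by stacking the weights and epsilon into one decision vector. The following is a minimal sketch of that reformulation with scipy.optimize.linprog. It is not the fairlex source: the helper name solve_minimax_residual, the way the constraints are assembled, and the toy data are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def solve_minimax_residual(A, b, w0, min_ratio=0.1, max_ratio=10.0):
    """Minimise the worst |A w - b| over box-bounded weights (illustrative)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    w0 = np.asarray(w0, dtype=float)
    m, n = A.shape

    # Decision vector x = (w_1, ..., w_n, epsilon); only epsilon has a cost.
    c = np.zeros(n + 1)
    c[-1] = 1.0

    # A w - epsilon <= b  and  -A w - epsilon <= -b, stacked as A_ub @ x <= b_ub.
    ones = np.ones((m, 1))
    A_ub = np.vstack([np.hstack([A, -ones]), np.hstack([-A, -ones])])
    b_ub = np.concatenate([b, -b])

    # Each weight is boxed relative to w0; epsilon is non-negative.
    bounds = [(min_ratio * wi, max_ratio * wi) for wi in w0] + [(0.0, None)]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n], res.x[-1], res.status, res.message


# Toy data: two groups of two units each, plus a total margin.
A_toy = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 1]]
b_toy = [3.0, 5.0, 8.0]
w0_toy = [1.0, 1.0, 1.0, 1.0]

w_opt, eps_opt, status, message = solve_minimax_residual(
    A_toy, b_toy, w0_toy, min_ratio=0.5, max_ratio=2.0
)
print(status, eps_opt)
print(w_opt)
```

In this toy problem the second group can reach a total of at most 4 under max_ratio=2.0 while its target is 5, so the smallest achievable worst residual is 1. The solver still reports success; it is the value of epsilon that signals the targets could not be met exactly within the bounds.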

fairlex.calibration.leximin_weight_fair(A, b, w0, *, min_ratio=0.1, max_ratio=10.0, slack=0.0, return_stages=False)

Compute weights via residual leximin followed by weight-fair refinement.

This function first solves the residual minimisation problem as in leximin_residual(), yielding an optimum epsilon. It then fixes residuals to remain within epsilon + slack and minimises the worst relative change from the base weights. The optimisation problem for the second stage is:

\[\min_{w, t}\;t \quad\text{such that}\quad |A\,w - b| \le \epsilon^* + \text{slack}, \quad |w_i - w_{0,i}| \le t\,w_{0,i}, \quad w_i \in [w_{0,i}\,\text{min\_ratio},\, w_{0,i}\,\text{max\_ratio}].\]
Parameters:
  • A (ndarray) – Membership matrix of shape (m, n); see leximin_residual().

  • b (ndarray) – Target totals of shape (m,); see leximin_residual().

  • w0 (ndarray) – Base weights of shape (n,); see leximin_residual().

  • min_ratio (float, optional) – Lower bound on weights relative to w0. Defaults to 0.1.

  • max_ratio (float, optional) – Upper bound on weights relative to w0. Defaults to 10.0.

  • slack (float, optional) – Additional slack added to the optimum residual when constraining residuals in the second stage. Allows the algorithm to trade a small increase in margin error for improved weight stability. Defaults to 0.0.

  • return_stages (bool, optional) – If True, return the intermediate result of the residual stage as well as the final result. Defaults to False.

Returns:

If return_stages is False (default), a single CalibrationResult containing the final weights and both the residual and weight fairness optima. If return_stages is True, a tuple (stage1_result, stage2_result).

Return type:

CalibrationResult or tuple
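
The second-stage problem can also be written as a linear programme, because each absolute-value constraint splits into two linear inequalities. The sketch below is one possible formulation with scipy.optimize.linprog, assuming the first-stage optimum is already known and passed in as epsilon_star. The helper name solve_weight_fair_stage and the toy data are hypothetical, and this is not the fairlex implementation.

```python
import numpy as np
from scipy.optimize import linprog


def solve_weight_fair_stage(A, b, w0, epsilon_star, slack=0.0,
                            min_ratio=0.1, max_ratio=10.0):
    """Minimise the worst relative weight change at a fixed residual level."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    w0 = np.asarray(w0, dtype=float)
    m, n = A.shape
    tol = epsilon_star + slack  # residual level allowed in this stage

    # Decision vector x = (w_1, ..., w_n, t); only t has a cost.
    c = np.zeros(n + 1)
    c[-1] = 1.0

    zeros = np.zeros((m, 1))
    eye = np.eye(n)
    col = w0.reshape(-1, 1)
    A_ub = np.vstack([
        np.hstack([A, zeros]),    #  A w <= b + tol
        np.hstack([-A, zeros]),   # -A w <= -b + tol
        np.hstack([eye, -col]),   #  w_i - t * w0_i <= w0_i
        np.hstack([-eye, -col]),  # -w_i - t * w0_i <= -w0_i
    ])
    b_ub = np.concatenate([b + tol, -b + tol, w0, -w0])
    bounds = [(min_ratio * wi, max_ratio * wi) for wi in w0] + [(0.0, None)]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if res.status != 0:
        raise RuntimeError(res.message)
    return res.x[:n], res.x[-1]


# Toy data whose targets are exactly attainable, so epsilon_star is 0 here.
A_toy2 = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 1]]
b_toy2 = [3.0, 3.0, 6.0]
w0_toy2 = [1.0, 1.0, 1.0, 1.0]

w_fair, t_opt = solve_weight_fair_stage(A_toy2, b_toy2, w0_toy2, epsilon_star=0.0)
print(w_fair, t_opt)
```

Many weight vectors hit these targets exactly (for instance 1 and 2 within each group), but the one with the smallest worst relative change moves every unit by the same 50 percent. This is the sense in which the second stage spreads adjustments evenly across units.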

Core Classes

class fairlex.CalibrationResult(w, epsilon, t, status, message)

Bases: object

Structured result from a calibration call.

w

Calibrated weights of shape (n,).

Type:

ndarray

epsilon

The worst absolute residual achieved in the residual minimisation problem. Only meaningful for leximin_residual.

Type:

float

t

The worst relative weight change achieved in the weight fairness problem. None if only the residual stage is performed.

Type:

Optional[float]

status

Status code from the linear programme (0 indicates success).

Type:

int

message

Solver termination message for diagnostics.

Type:

str

Parameters:
  • w (ndarray)

  • epsilon (float)

  • t (float | None)

  • status (int)

  • message (str)

__init__(w, epsilon, t, status, message)

Metrics and Evaluation

Metric and summary utilities for fairlex.

This module contains helper functions to assess the quality of calibrated weights. The primary entry point, evaluate_solution(), returns a dictionary of commonly used diagnostics:

  • resid_max_abs - the maximum absolute residual across all margins.

  • resid_p95 - the 95th percentile of absolute residuals.

  • resid_median - the median absolute residual.

  • ESS - the Kish effective sample size of the weights.

  • deff - design effect due to the weights (n / ESS).

  • weight_max - maximum weight.

  • weight_p99 - 99th percentile of weights.

  • weight_p95 - 95th percentile of weights.

  • weight_median - median weight.

  • weight_min - minimum weight.

  • total_error - the difference between the weighted total and the target total in the last margin (assumed to be the sum of membership for the entire population).

Additional convenience functions are provided for computing the effective sample size and design effect alone.

fairlex.metrics.design_effect(weights)

Compute the design effect due to weighting.

The design effect is given by n / ESS where n is the number of observations. It quantifies the inflation in variance attributable to unequal weights.

Parameters:

weights (ndarray) – Array of weights.

Returns:

Design effect. Returns np.nan if the effective sample size is undefined.

Return type:

float

fairlex.metrics.effective_sample_size(weights)

Compute the Kish effective sample size.

The Kish effective sample size is defined as

\[\mathrm{ESS} = \frac{\left(\sum_i w_i\right)^2}{\sum_i w_i^2}.\]
Parameters:

weights (ndarray) – Array of weights.

Returns:

Effective sample size. Returns np.nan if the denominator is zero.

Return type:

float
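
To make the two definitions concrete, here is a small plain-Python sketch of the Kish effective sample size and the corresponding design effect n / ESS. The function names kish_ess and deff_from_weights are chosen for this example, and the use of math.nan for the undefined case is an assumption mirroring the documented np.nan behaviour rather than the library's code.

```python
import math


def kish_ess(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    if sum_sq == 0:
        return math.nan  # undefined when every weight is zero
    return total * total / sum_sq


def deff_from_weights(weights):
    """Design effect due to weighting: n / ESS."""
    ess = kish_ess(weights)
    if math.isnan(ess) or ess == 0:
        return math.nan
    return len(weights) / ess


equal_w = [1.0, 1.0, 1.0, 1.0]
unequal_w = [1.0, 1.0, 2.0]

print(kish_ess(equal_w), deff_from_weights(equal_w))
print(kish_ess(unequal_w), deff_from_weights(unequal_w))
```

With equal weights the effective sample size equals n and the design effect is 1. For the weights 1, 1, 2 the sum is 4 and the sum of squares is 6, giving ESS = 16 / 6 and a design effect of 1.125.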

fairlex.metrics.evaluate_solution(A, b, w, *, quantiles=(0.99, 0.95, 0.5), base_weights=None)

Compute summary diagnostics for a calibration solution.

Parameters:
  • A (ndarray) – Membership matrix of shape (m, n) used in the calibration.

  • b (ndarray) – Target totals of shape (m,).

  • w (ndarray) – Calibrated weights of shape (n,).

  • quantiles (tuple of float, optional) – Quantiles to compute on the weight distribution. Defaults to (0.99, 0.95, 0.5), corresponding to the 99th percentile, 95th percentile and median.

  • base_weights (ndarray, optional) – Original/base weights. If provided, relative deviations will be computed and returned under the keys max_rel_dev, p95_rel_dev, and median_rel_dev.

Returns:

A dictionary containing residual and weight diagnostics. See module docstring for the key descriptions.

Return type:

dict
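
The residual diagnostics listed in the module description can be reproduced by hand for small problems. The sketch below computes three of them in plain Python. The helper name residual_summary is hypothetical, and the sign convention for total_error (calibrated total minus target total) is an assumption, since the description only says "difference".

```python
from statistics import median


def residual_summary(A, b, w):
    """Compute a few residual diagnostics for weights w (illustrative only)."""
    # Calibrated margin totals: one dot product per row of A.
    fitted = [sum(a_ij * w_j for a_ij, w_j in zip(row, w)) for row in A]
    abs_resid = [abs(f - target) for f, target in zip(fitted, b)]
    return {
        "resid_max_abs": max(abs_resid),
        "resid_median": median(abs_resid),
        # Assumed convention: calibrated total minus target, last margin.
        "total_error": fitted[-1] - b[-1],
    }


A_small = [[1, 1, 0], [0, 0, 1], [1, 1, 1]]
b_small = [3.0, 2.5, 5.5]
w_small = [1.5, 1.5, 2.0]

summary = residual_summary(A_small, b_small, w_small)
print(summary)
```

Here the calibrated margins are 3.0, 2.0 and 5.0, so the first margin is met exactly while the second and the total each fall short by 0.5.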

Main Package

Top level package for fairlex.

This package provides routines for leximin-style calibration of survey weights.

Two primary calibration strategies are exposed:

  • leximin_residual - minimises the worst absolute margin residual across all constraints (min-max), optionally refining the next worst in lexicographic order. This approach will tend to squeeze margin errors to near zero at the cost of increased leverage on the weights.

  • leximin_weight_fair - after achieving the smallest possible worst residual, this method minimises the largest relative change from the base weights. It balances fairness in both the errors and the weight movements, offering a compromise between calibration accuracy and variance inflation.

The core implementation lives in fairlex.calibration. Convenience functions and metric helpers live in fairlex.metrics.

fairlex.evaluate_solution(A, b, w, *, quantiles=(0.99, 0.95, 0.5), base_weights=None)

Compute summary diagnostics for a calibration solution.

Parameters:
  • A (ndarray) – Membership matrix of shape (m, n) used in the calibration.

  • b (ndarray) – Target totals of shape (m,).

  • w (ndarray) – Calibrated weights of shape (n,).

  • quantiles (tuple of float, optional) – Quantiles to compute on the weight distribution. Defaults to (0.99, 0.95, 0.5), corresponding to the 99th percentile, 95th percentile and median.

  • base_weights (ndarray, optional) – Original/base weights. If provided, relative deviations will be computed and returned under the keys max_rel_dev, p95_rel_dev, and median_rel_dev.

Returns:

A dictionary containing residual and weight diagnostics. See module docstring for the key descriptions.

Return type:

dict

fairlex.leximin_residual(A, b, w0, *, min_ratio=0.1, max_ratio=10.0)

Compute weights by minimising the worst absolute margin residual.

This function solves

\[\min_{w, \epsilon}\;\epsilon \quad\text{such that}\quad -\epsilon \le A\,w - b \le \epsilon, \quad \text{and}\quad w_i \in [w_{0,i}\,\text{min\_ratio},\, w_{0,i}\,\text{max\_ratio}].\]
Parameters:
  • A (ndarray) – Membership matrix of shape (m, n).

  • b (ndarray) – Target totals of shape (m,).

  • w0 (ndarray) – Base weights of shape (n,).

  • min_ratio (float, optional) – Lower bound on weights relative to w0. Defaults to 0.1.

  • max_ratio (float, optional) – Upper bound on weights relative to w0. Defaults to 10.0.

Returns:

Structured result containing the weights, the optimum epsilon, and solver diagnostics.

Return type:

CalibrationResult

Notes

If the problem is infeasible (e.g., because the bounds preclude any solution), the returned status will be nonzero and the weights may not be meaningful. Check status and message on the result.

fairlex.leximin_weight_fair(A, b, w0, *, min_ratio=0.1, max_ratio=10.0, slack=0.0, return_stages=False)

Compute weights via residual leximin followed by weight-fair refinement.

This function first solves the residual minimisation problem as in leximin_residual(), yielding an optimum epsilon. It then fixes residuals to remain within epsilon + slack and minimises the worst relative change from the base weights. The optimisation problem for the second stage is:

\[\min_{w, t}\;t \quad\text{such that}\quad |A\,w - b| \le \epsilon^* + \text{slack}, \quad |w_i - w_{0,i}| \le t\,w_{0,i}, \quad w_i \in [w_{0,i}\,\text{min\_ratio},\, w_{0,i}\,\text{max\_ratio}].\]
Parameters:
  • A (ndarray) – Membership matrix of shape (m, n); see leximin_residual().

  • b (ndarray) – Target totals of shape (m,); see leximin_residual().

  • w0 (ndarray) – Base weights of shape (n,); see leximin_residual().

  • min_ratio (float, optional) – Lower bound on weights relative to w0. Defaults to 0.1.

  • max_ratio (float, optional) – Upper bound on weights relative to w0. Defaults to 10.0.

  • slack (float, optional) – Additional slack added to the optimum residual when constraining residuals in the second stage. Allows the algorithm to trade a small increase in margin error for improved weight stability. Defaults to 0.0.

  • return_stages (bool, optional) – If True, return the intermediate result of the residual stage as well as the final result. Defaults to False.

Returns:

If return_stages is False (default), a single CalibrationResult containing the final weights and both the residual and weight fairness optima. If return_stages is True, a tuple (stage1_result, stage2_result).

Return type:

CalibrationResult or tuple