Diagnostics and Monitoring

The onlinerake package provides comprehensive diagnostics and monitoring capabilities to help you understand the behavior of your streaming raking algorithms. These features are particularly useful for debugging convergence issues, tuning parameters, and monitoring the quality of your weight calibration.

The algorithms are designed with numerical stability in mind, automatically handling extreme cases that could cause overflow, underflow, or convergence detection failures.

Enhanced Monitoring Features

Both OnlineRakingSGD and OnlineRakingMWU support the following diagnostic features:

Convergence Monitoring

Automatic Convergence Detection

The algorithms can automatically detect when they have converged based on loss stability:

from onlinerake import OnlineRakingSGD, Targets

targets = Targets(age=0.5, gender=0.5, education=0.4, region=0.3)
raker = OnlineRakingSGD(
    targets,
    learning_rate=3.0,
    track_convergence=True,    # Enable convergence detection
    convergence_window=20      # Use last 20 observations for convergence check
)

# Process observations
for obs in observations:
    raker.partial_fit(obs)

    if raker.converged:
        print(f"Converged at observation {raker.convergence_step}")
        break

Gradient Norm Tracking

Monitor the magnitude of gradient updates to understand convergence behavior:

# After processing observations
gradient_norms = raker.gradient_norm_history

# Plot convergence
import matplotlib.pyplot as plt
plt.plot(gradient_norms)
plt.xlabel('Observation')
plt.ylabel('Gradient Norm')
plt.title('Convergence Behavior')
plt.show()

Loss Moving Average

Track smoothed loss over a configurable window:

print(f"Current loss: {raker.loss:.6f}")
print(f"Moving average: {raker.loss_moving_average:.6f}")

Oscillation Detection

Detect when algorithms are oscillating rather than converging:

# Check if algorithm is oscillating
oscillating = raker.detect_oscillation(threshold=0.1)

if oscillating:
    print("Warning: Algorithm may be oscillating")
    print("Consider reducing learning rate")

Weight Distribution Analysis

Monitor the distribution of weights to detect outliers and understand effective sample size:

weight_stats = raker.weight_distribution_stats

print(f"Weight range: [{weight_stats['min']:.3f}, {weight_stats['max']:.3f}]")
print(f"Median weight: {weight_stats['median']:.3f}")
print(f"Outliers detected: {weight_stats['outliers_count']}")
print(f"Effective sample size: {raker.effective_sample_size:.1f}")

Verbose Mode

Enable verbose output for real-time monitoring:

raker = OnlineRakingSGD(
    targets,
    learning_rate=3.0,
    verbose=True  # Print progress every 100 observations
)

# Output will show:
# Obs 100: loss=0.001234, grad_norm=0.005678, ess=85.3

Configuration Options

The diagnostic features can be configured via constructor parameters:

raker = OnlineRakingSGD(
    targets,
    learning_rate=5.0,
    verbose=False,               # Disable verbose output
    track_convergence=True,      # Enable convergence detection
    convergence_window=20        # Window size for convergence check
)

Parameters:

  • verbose (bool): Enable progress output every 100 observations

  • track_convergence (bool): Enable automatic convergence detection

  • convergence_window (int): Number of recent observations to use for convergence analysis

Comprehensive History Tracking

All diagnostic information is automatically stored in the history attribute:

# Access full history
for i, state in enumerate(raker.history):
    print(f"Step {i+1}:")
    print(f"  Loss: {state['loss']:.6f}")
    print(f"  Gradient norm: {state['gradient_norm']:.6f}")
    print(f"  ESS: {state['ess']:.1f}")
    print(f"  Converged: {state['converged']}")
    print(f"  Oscillating: {state['oscillating']}")

Each history entry contains:

  • loss: Current squared-error loss on margins

  • gradient_norm: L2 norm of the gradient vector

  • loss_moving_avg: Moving average of loss over convergence window

  • ess: Effective sample size

  • converged: Whether convergence has been detected

  • oscillating: Whether oscillation is detected

  • weight_stats: Comprehensive weight distribution statistics

  • weighted_margins: Current weighted demographic margins

  • raw_margins: Unweighted demographic margins
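Because each entry is a plain dictionary with the keys above, the history is easy to reshape into arrays for offline analysis or plotting:

import numpy as np
import matplotlib.pyplot as plt

# Pull per-step series out of the history for side-by-side plots.
losses = np.array([state["loss"] for state in raker.history])
grad_norms = np.array([state["gradient_norm"] for state in raker.history])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(losses)
ax1.set_title("Loss per observation")
ax2.plot(grad_norms)
ax2.set_title("Gradient norm per observation")
plt.tight_layout()
plt.show()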

Practical Examples

Debugging Convergence Issues

import numpy as np
from onlinerake import OnlineRakingSGD, Targets

targets = Targets(age=0.5, gender=0.5, education=0.4, region=0.3)
raker = OnlineRakingSGD(
    targets,
    learning_rate=10.0,  # Potentially too high
    verbose=True,
    track_convergence=True
)

# Simulate data
for i in range(200):
    obs = {
        "age": np.random.binomial(1, 0.3),
        "gender": np.random.binomial(1, 0.4),
        "education": np.random.binomial(1, 0.6),
        "region": np.random.binomial(1, 0.2)
    }
    raker.partial_fit(obs)

    # Check for problems
    if i > 50 and raker.detect_oscillation():
        print(f"Oscillation detected at step {i+1}")
        print("Consider reducing learning rate")
        break

    if raker.converged:
        print(f"Successfully converged at step {i+1}")
        break

Monitoring Real-time Performance

# Set up monitoring
raker = OnlineRakingSGD(targets, learning_rate=3.0, verbose=True)

for obs in data_stream:
    raker.partial_fit(obs)

    # Monitor every 100 observations
    if raker._n_obs % 100 == 0:
        stats = raker.weight_distribution_stats
        print(f"\\nDiagnostics at observation {raker._n_obs}:")
        print(f"  Loss: {raker.loss:.6f}")
        print(f"  ESS: {raker.effective_sample_size:.1f}")
        print(f"  Weight outliers: {stats['outliers_count']}")

        if stats['outliers_count'] > raker._n_obs * 0.1:
            print("  Warning: High proportion of weight outliers")

Numerical Stability and Robustness

The algorithms include several built-in safeguards for numerical stability:

MWU Exponent Clipping

The multiplicative weights update algorithm automatically clips exponential arguments to prevent overflow:

# Internally, MWU clips extreme exponents
expo = np.clip(-learning_rate * grad, -50.0, 50.0)
update = np.exp(expo)

This prevents NaN/Inf values even with extreme learning rates or gradients.
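The effect of the clip is easy to verify in isolation. The snippet below is a standalone illustration (not package code) using deliberately extreme values:

import numpy as np

learning_rate = 50.0
grad = np.array([-30.0, 2.0])  # deliberately extreme gradient

# Without clipping, exp(1500) overflows to inf (NumPy emits an overflow warning).
unclipped = np.exp(-learning_rate * grad)
# With clipping, the multiplicative update stays bounded and finite.
clipped = np.exp(np.clip(-learning_rate * grad, -50.0, 50.0))

print(np.isinf(unclipped).any())   # True
print(np.isfinite(clipped).all())  # True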

Robust Convergence Detection

Convergence detection handles edge cases gracefully:

# Convergence when loss approaches zero
raker = OnlineRakingSGD(targets, track_convergence=True)

# ... feed observations with raker.partial_fit(obs) ...

# The algorithm automatically detects:
# 1. Perfect convergence (loss ≈ 0)
# 2. Relative stability (low variance of recent losses)

if raker.converged:
    print(f"Converged at step {raker.convergence_step}")

Extreme Parameter Handling

Both algorithms are robust to extreme parameter settings:

import numpy as np
from onlinerake import OnlineRakingMWU, Targets

# High learning rates with extreme targets
extreme_targets = Targets(age=0.1, gender=0.9, education=0.1, region=0.9)
raker = OnlineRakingMWU(extreme_targets, learning_rate=50.0)

# Algorithm remains stable despite extreme settings
for obs in challenging_data:
    raker.partial_fit(obs)
    assert np.all(np.isfinite(raker.weights))  # Always finite

For complete examples demonstrating all diagnostic features, see examples/diagnostics_demo.py in the package repository.