Mathematical Foundations and Methods¶
This document provides comprehensive information about the mathematical foundations, methods, and best practices for the incline package.
Core Mathematical Problem¶
The package addresses the problem of estimating the instantaneous rate of change (derivative) at a point \(t_0\) in a noisy time series:

\[y_i = f(t_i) + \epsilon_i\]

where \(f\) is the underlying smooth function and \(\epsilon_i\) is noise. We want to estimate \(f'(t_0)\) along with confidence intervals.
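A minimal illustration of this setup (hypothetical data, no incline calls): simulate a noisy series whose true derivative is known, so any estimator can be checked against it.

```python
import numpy as np

# Hypothetical setup: y_i = f(t_i) + eps_i with f(t) = sin(t),
# so the estimation target at t0 = 5.0 is f'(5.0) = cos(5.0)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 201)
y = np.sin(t) + rng.normal(scale=0.1, size=t.size)
true_slope = np.cos(5.0)
```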
Available Methods and Capabilities¶
The incline package provides a comprehensive suite of trend estimation methods:
Basic Methods
Naive (Central Differences): Simple finite differences for clean data
Savitzky-Golay Filtering: Local polynomial fitting with uniform sampling
Spline Interpolation: Smooth curve fitting with automatic time scaling
Advanced Nonparametric Methods
LOESS/LOWESS: Locally weighted scatterplot smoothing with robust options
Local Polynomial Regression: Kernel-weighted polynomial fits
L1 Trend Filtering: Piecewise linear trends with changepoint detection
Bayesian and State-Space Methods
Gaussian Process Regression: Full posterior distribution with principled uncertainty
Kalman Filtering: Local linear trend models with adaptive parameters
Structural Time Series: Seasonal decomposition with state-space modeling
Multiscale Analysis
SiZer Maps: Significance of trends across multiple smoothing scales
Adaptive Methods: Time-varying parameters for non-stationary series
Seasonal and Robust Methods
STL Decomposition: Seasonal-trend decomposition using Loess
Robust Statistics: Outlier-resistant trend ranking and aggregation
Bootstrap Confidence Intervals: Non-parametric uncertainty quantification
Key Features and Improvements¶
Automatic Time Scaling ✅
All methods now handle:
DateTime indices with proper scaling
Irregular sampling (where applicable)
Custom time columns
Automatic detection of time units
```python
# Automatic time handling
result = spline_trend(df)  # Uses datetime index
result = gp_trend(df, time_column='timestamp')  # Custom time column
```
Parameter Selection ✅
Cross-validation: select_smoothing_parameter_cv()
Automatic selection: Built into GP and Kalman methods
Adaptive methods: adaptive_gp_trend(), adaptive_kalman_trend()
```python
# Automatic parameter selection
best_s, cv_scores = select_smoothing_parameter_cv(df, method='spline')
result = gp_trend(df)  # Auto-optimizes hyperparameters
```
Comprehensive Uncertainty Quantification ✅
Bootstrap confidence intervals: All basic methods
Bayesian posteriors: Gaussian Process methods
Kalman uncertainty: State-space models
Significance testing: SiZer analysis
```python
# Multiple uncertainty quantification approaches
result = bootstrap_derivative_ci(df, n_bootstrap=200)
result = gp_trend(df, confidence_level=0.95)
result = kalman_trend(df)  # Natural uncertainty from Kalman filter
```
Robust to Irregular Sampling ✅
Most methods handle irregular sampling:
Splines, LOESS, GP, Kalman: Native support
Savitzky-Golay: Requires regular sampling (automatically detected)
Multiscale Significance Analysis ✅
```python
# SiZer analysis across scales
sizer = sizer_analysis(df, n_bandwidths=20)
features = sizer.find_significant_features()

# Combined trend + significance
result = trend_with_sizer(df, trend_method='loess')
```
Mathematical Details¶
Savitzky-Golay Derivatives¶
The filter fits a polynomial of degree \(p\) by least squares over a window of \(2m+1\) samples centered at \(t_0\):

\[\min_{c_0,\dots,c_p} \sum_{i=-m}^{m} \left( y_{t_0+i} - \sum_{j=0}^{p} c_j\, i^j \right)^2\]

The derivative is read off the fitted coefficients (in sample-index units):

\[\hat{f}^{(k)}(t_0) = k!\, c_k\]
Key requirement: For \(k\)-th derivative, need \(p \geq k\) and window size \(\geq p + 1\).
Scaling: Derivative must be divided by \(\Delta t^k\) where \(\Delta t\) is the time step.
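The scaling requirement can be sketched with SciPy's `savgol_filter` (not the incline API), which applies the \(1/\Delta t^k\) factor itself when given `delta`:

```python
import numpy as np
from scipy.signal import savgol_filter

# Regular sampling: f(t) = t**2, so f'(t) = 2t
dt = 0.1
t = np.arange(0.0, 5.0, dt)
y = t ** 2

# deriv=1 requests the first derivative; delta=dt applies the
# 1/dt scaling so the result is in units of y per unit time
dy = savgol_filter(y, window_length=11, polyorder=2, deriv=1, delta=dt)
```

Because the signal is exactly quadratic and the fit is degree 2, the recovered derivative matches \(2t\) to numerical precision.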
Spline Derivatives¶
Cubic splines minimize the roughness penalty:

\[\int \left( f''(x) \right)^2 \, dx\]
subject to \(\sum w_i (y_i - f(x_i))^2 \leq s\).
The derivative is obtained analytically from the spline coefficients.
Advantage: Handles irregular sampling naturally.
Challenge: Choosing \(s\) requires domain knowledge or cross-validation.
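A sketch with SciPy's `UnivariateSpline` (not the incline API): the derivative comes analytically from the fitted spline, and irregular sample points need no special handling.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Irregularly sampled, noise-free test signal
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 10.0, 80))
y = np.sin(x)

# s bounds the residual sum of squares; s=0 interpolates
spl = UnivariateSpline(x, y, s=0.0)
slope_at_5 = spl.derivative()(5.0)   # analytic derivative of the spline
```

For noisy data, `s` would be raised above zero, ideally via cross-validation as discussed below.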
Naive Method (Central Differences)¶
The derivative is estimated directly from neighboring observations:

\[\hat{f}'(t_i) = \frac{y_{i+1} - y_{i-1}}{t_{i+1} - t_{i-1}}\]
Pros: Simple, unbiased for linear trends
Cons: High variance, sensitive to noise, poor at boundaries
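The "unbiased for linear trends" property is easy to verify with NumPy's `np.gradient`, which computes central differences in the interior and one-sided differences at the ends:

```python
import numpy as np

# Central differences are exact for a linear trend, since the
# secant slope between any two points equals the true slope
t = np.linspace(0.0, 1.0, 11)
y = 3.0 * t + 1.0
dy = np.gradient(y, t)   # central differences; one-sided at the ends
```

With noise added, the same estimator amplifies it by roughly \(1/\Delta t\), which is the high-variance drawback noted above.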
Autocorrelation and Serial Dependence¶
Time series typically have autocorrelated errors, e.g. an AR(1) structure:

\[\epsilon_i = \rho\, \epsilon_{i-1} + \eta_i, \qquad \eta_i \ \text{i.i.d. white noise}\]
This violates independence assumptions and means:
Standard errors are underestimated
Confidence intervals are too narrow
Significance tests are invalid
Solution: Use block bootstrap or model the autocorrelation explicitly.
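The block-bootstrap idea can be sketched in plain NumPy (the helper `moving_block_bootstrap` below is hypothetical, not part of incline): resampling contiguous blocks of residuals preserves short-range autocorrelation, so the resulting slope interval is not falsely narrow.

```python
import numpy as np

def moving_block_bootstrap(x, block_len, rng):
    """Resample by concatenating random contiguous blocks, preserving
    short-range autocorrelation inside each block."""
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    return np.concatenate([x[s:s + block_len] for s in starts])[:n]

# Series with a true slope of 0.2 and AR(1) errors (rho = 0.6)
rng = np.random.default_rng(2)
n = 100
t = np.arange(n, dtype=float)
eps = np.zeros(n)
for i in range(1, n):
    eps[i] = 0.6 * eps[i - 1] + rng.normal(scale=0.5)
y = 0.2 * t + eps

# Residual-based block bootstrap for the OLS slope
fit = np.polyfit(t, y, 1)
resid = y - np.polyval(fit, t)
slopes = [
    np.polyfit(t, np.polyval(fit, t) + moving_block_bootstrap(resid, 10, rng), 1)[0]
    for _ in range(200)
]
lo, hi = np.percentile(slopes, [2.5, 97.5])
```

The block length should exceed the correlation length of the errors; an i.i.d. bootstrap (block length 1) would understate the interval width here.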
Best Practices¶
Always specify time units
```python
# BAD: Assumes unit time
result = spline_trend(df)

# GOOD: Explicit time handling
result = improved_spline_trend(df, time_column='date')
```
Check sampling regularity
```python
time_diffs = df.index.to_series().diff()
if time_diffs.std() / time_diffs.mean() > 0.1:
    print("Warning: Irregular sampling detected")
    # Use splines, not Savitzky-Golay
```
Validate smoothing parameters
```python
# Use cross-validation
best_s, cv_results = select_smoothing_parameter_cv(
    df, param_name='s', method='spline'
)
```
Quantify uncertainty
```python
# Get confidence intervals
result = bootstrap_derivative_ci(
    df, method='spline', n_bootstrap=100
)

# Check if trend is significant
significant = result['significant_trend']
```
Handle seasonality
For seasonal data, consider:
Pre-deseasonalizing with STL decomposition
Using longer smoothing windows (> seasonal period)
Fitting seasonal models explicitly
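The deseasonalize-then-smooth idea can be sketched without statsmodels by subtracting per-month means; this is a crude stand-in for a full STL decomposition, and the series and period below are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series: linear trend plus period-12 seasonality
period = 12
n = 72
idx = pd.date_range("2020-01-01", periods=n, freq="MS")
y = pd.Series(
    0.05 * np.arange(n) + np.sin(2 * np.pi * np.arange(n) / period),
    index=idx,
)

# Subtract each calendar month's mean, then restore the overall level;
# the residual series keeps the trend but drops the seasonal cycle
seasonal_means = y.groupby(y.index.month).transform("mean")
deseasonalized = y - seasonal_means + y.mean()
```

Each calendar month of the result averages to the overall mean, so a smoother applied afterwards sees the trend without the seasonal cycle; for real data with evolving seasonality, STL (available in statsmodels) is the more robust choice.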
Be cautious at boundaries
```python
# Mark unreliable edge estimates
window = 15
result['reliable'] = True
col = result.columns.get_loc('reliable')
result.iloc[:window // 2, col] = False
result.iloc[-(window // 2):, col] = False
```
Alternative Approaches¶
For more robust trend estimation, consider:
Local polynomial regression (LOESS)
More flexible than Savitzky-Golay
Better edge handling
Available in statsmodels

State-space models
Explicit modeling of trend component
Natural uncertainty quantification
Handles missing data

Gaussian processes
Full posterior distribution for derivatives
Principled uncertainty quantification
Computationally heavy

L1 trend filtering
Piecewise linear trends
Automatic changepoint detection
Robust to outliers
References¶
Savitzky, A., & Golay, M. J. (1964). Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 36(8), 1627-1639.
De Boor, C. (1978). A practical guide to splines. Springer-Verlag.
Fan, J., & Gijbels, I. (1996). Local polynomial modelling and its applications. Chapman and Hall.
Kim, S. J., Koh, K., Boyd, S., & Gorinevsky, D. (2009). ℓ1 trend filtering. SIAM Review, 51(2), 339-360.