Basic Usage Examples¶
This page demonstrates the core functionality of incline with executable examples that run automatically during documentation build.
Quick Start: Basic Trend Estimation¶
Let’s start with a simple example using sample time series data:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from incline import naive_trend, spline_trend, sgolay_trend
# Load sample data
np.random.seed(42)
dates = pd.date_range('2020-01-01', periods=30, freq='D')
# Create a time series with trend + noise
trend_component = 0.5 * np.arange(30)
noise = np.random.normal(0, 1, 30)
values = 100 + trend_component + noise
df = pd.DataFrame({'value': values}, index=dates)
print("Sample time series data:")
print(df.head())
print(f"\nData shape: {df.shape}")
Sample time series data:
value
2020-01-01 100.496714
2020-01-02 100.361736
2020-01-03 101.647689
2020-01-04 103.023030
2020-01-05 101.765847
Data shape: (30, 1)
Method Comparison: Naive vs Spline vs Savitzky-Golay¶
# Apply all three basic methods
naive_result = naive_trend(df)
spline_result = spline_trend(df, function_order=3, s=5)
sgolay_result = sgolay_trend(df, window_length=7, function_order=3)
# Create comparison plot
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8))
# Plot 1: Original data and smoothed versions
ax1.plot(df.index, df['value'], 'ko-', alpha=0.6, markersize=4, label='Original Data')
ax1.plot(spline_result.index, spline_result['smoothed_value'], 'r-', linewidth=2, label='Spline Smoothed')
ax1.plot(sgolay_result.index, sgolay_result['smoothed_value'], 'b-', linewidth=2, label='Savitzky-Golay Smoothed')
ax1.set_ylabel('Value')
ax1.set_title('Time Series Smoothing Methods')
ax1.legend()
ax1.grid(True, alpha=0.3)
# Plot 2: Derivative estimates (trends)
ax2.plot(naive_result.index, naive_result['derivative_value'], 'g-', linewidth=2, label='Naive Trend', alpha=0.8)
ax2.plot(spline_result.index, spline_result['derivative_value'], 'r-', linewidth=2, label='Spline Trend')
ax2.plot(sgolay_result.index, sgolay_result['derivative_value'], 'b-', linewidth=2, label='S-G Trend')
ax2.axhline(y=0.5, color='black', linestyle='--', alpha=0.7, label='True Trend (0.5)')
ax2.axhline(y=0, color='gray', linestyle='-', alpha=0.5)
ax2.set_ylabel('Trend (derivative)')
ax2.set_xlabel('Date')
ax2.set_title('Trend Estimates')
ax2.legend()
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Performance Analysis¶
# Calculate performance metrics
true_derivative = 0.5 # Known true trend
methods = {
'Naive': naive_result['derivative_value'],
'Spline': spline_result['derivative_value'],
'Savitzky-Golay': sgolay_result['derivative_value']
}
performance_metrics = {}
for method_name, derivatives in methods.items():
# Remove NaN values for fair comparison
valid_derivatives = derivatives.dropna()
mse = np.mean((valid_derivatives - true_derivative) ** 2)
bias = np.mean(valid_derivatives - true_derivative)
std = np.std(valid_derivatives)
performance_metrics[method_name] = {
'MSE': mse,
'Bias': bias,
'Std Dev': std,
'Valid Points': len(valid_derivatives)
}
# Create performance comparison table
performance_df = pd.DataFrame(performance_metrics).T
print("Performance Comparison (True trend = 0.5):")
print("=" * 50)
print(performance_df.round(4))
Performance Comparison (True trend = 0.5):
==================================================
MSE Bias Std Dev Valid Points
Naive 0.3759 -0.0317 0.6123 30.0
Spline 1.2734 -0.1262 1.1214 30.0
Savitzky-Golay 0.1858 0.0004 0.4310 30.0
Parameter Sensitivity Analysis¶
Understanding how smoothing parameters affect results:
# Test different smoothing parameters for spline method
smoothing_factors = [0.1, 1, 5, 20, 100]
colors = ['purple', 'blue', 'green', 'orange', 'red']
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8))
# Plot smoothed curves
ax1.plot(df.index, df['value'], 'ko-', alpha=0.6, markersize=4, label='Original Data')
for i, s_factor in enumerate(smoothing_factors):
result = spline_trend(df, function_order=3, s=s_factor)
ax1.plot(result.index, result['smoothed_value'],
color=colors[i], linewidth=2, label=f's = {s_factor}')
ax1.set_ylabel('Value')
ax1.set_title('Effect of Smoothing Parameter on Spline Fits')
ax1.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
ax1.grid(True, alpha=0.3)
# Plot corresponding derivatives
for i, s_factor in enumerate(smoothing_factors):
result = spline_trend(df, function_order=3, s=s_factor)
ax2.plot(result.index, result['derivative_value'],
color=colors[i], linewidth=2, label=f's = {s_factor}')
ax2.axhline(y=0.5, color='black', linestyle='--', alpha=0.7, label='True Trend')
ax2.axhline(y=0, color='gray', linestyle='-', alpha=0.5)
ax2.set_ylabel('Trend (derivative)')
ax2.set_xlabel('Date')
ax2.set_title('Effect of Smoothing Parameter on Trend Estimates')
ax2.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Calculate MSE for each smoothing parameter
print("\nSmoothing Parameter Analysis:")
print("=" * 35)
for s_factor in smoothing_factors:
result = spline_trend(df, function_order=3, s=s_factor)
mse = np.mean((result['derivative_value'] - true_derivative) ** 2)
print(f"s = {s_factor:3.0f}: MSE = {mse:.4f}")
Smoothing Parameter Analysis:
===================================
s = 0: MSE = 6.9108
s = 1: MSE = 4.1879
s = 5: MSE = 1.2734
s = 20: MSE = 0.0050
s = 100: MSE = 0.0050
Working with Real Time Series Features¶
# Create a more complex time series with multiple characteristics
np.random.seed(123)
n_points = 60
dates = pd.date_range('2020-01-01', periods=n_points, freq='D')
# Complex signal: trend + seasonality + noise + outliers
t = np.arange(n_points)
trend = 0.3 * t
seasonal = 5 * np.sin(2 * np.pi * t / 7) # Weekly seasonality
noise = np.random.normal(0, 2, n_points)
# Add some outliers
outlier_indices = [15, 35, 50]
complex_values = 100 + trend + seasonal + noise
for idx in outlier_indices:
complex_values[idx] += np.random.choice([-10, 10])
complex_df = pd.DataFrame({'value': complex_values}, index=dates)
# Apply different methods
methods_results = {
'Naive': naive_trend(complex_df),
'Spline (s=5)': spline_trend(complex_df, s=5),
'Spline (s=50)': spline_trend(complex_df, s=50),
'S-G (win=7)': sgolay_trend(complex_df, window_length=7, function_order=3),
'S-G (win=15)': sgolay_trend(complex_df, window_length=15, function_order=3),
}
# Plot results
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 10))
# Original data
ax1.plot(complex_df.index, complex_df['value'], 'k-', alpha=0.6, linewidth=1, label='Original Data')
ax1.scatter([complex_df.index[i] for i in outlier_indices],
[complex_df.iloc[i]['value'] for i in outlier_indices],
color='red', s=50, zorder=5, label='Outliers')
# Show some smoothed curves
colors_dict = {'Spline (s=5)': 'blue', 'Spline (s=50)': 'red', 'S-G (win=15)': 'green'}
for name, color in colors_dict.items():
result = methods_results[name]
if 'smoothed_value' in result.columns:
ax1.plot(result.index, result['smoothed_value'],
color=color, linewidth=2, label=name)
ax1.set_ylabel('Value')
ax1.set_title('Complex Time Series: Trend + Seasonality + Outliers')
ax1.legend()
ax1.grid(True, alpha=0.3)
# Compare trend estimates
for name, result in methods_results.items():
ax2.plot(result.index, result['derivative_value'],
linewidth=2, label=name, alpha=0.8)
ax2.axhline(y=0.3, color='black', linestyle='--', alpha=0.7, label='True Trend (0.3)')
ax2.axhline(y=0, color='gray', linestyle='-', alpha=0.5)
ax2.set_ylabel('Trend (derivative)')
ax2.set_xlabel('Date')
ax2.set_title('Trend Estimates - Different Methods Handle Complexity Differently')
ax2.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
print("Method Comparison on Complex Data:")
print("=" * 40)
for name, result in methods_results.items():
derivatives = result['derivative_value'].dropna()
mse = np.mean((derivatives - 0.3) ** 2)
mean_trend = np.mean(derivatives)
print(f"{name:12s}: MSE = {mse:.3f}, Mean = {mean_trend:.3f}")
Method Comparison on Complex Data:
========================================
Naive : MSE = 13.369, Mean = 0.446
Spline (s=5): MSE = 26.257, Mean = 0.489
Spline (s=50): MSE = 23.498, Mean = 0.486
S-G (win=7) : MSE = 12.164, Mean = 0.410
S-G (win=15): MSE = 0.660, Mean = 0.403
Key Takeaways¶
print("📊 BASIC USAGE SUMMARY")
print("=" * 50)
print()
print("✅ Method Characteristics:")
print(" • Naive: Fast, high variance, poor at boundaries")
print(" • Spline: Smooth, handles irregular data, parameter sensitive")
print(" • Savitzky-Golay: Good for regular data, edge effects")
print()
print("✅ Parameter Guidelines:")
print(" • Lower smoothing = follows data more closely")
print(" • Higher smoothing = smoother trends, less noise sensitivity")
print(" • Window size affects boundary behavior")
print()
print("✅ When to Use Each Method:")
print(" • Naive: Quick estimates, clean data")
print(" • Spline: Irregular sampling, need smoothness")
print(" • S-G: Regular sampling, moderate noise")
📊 BASIC USAGE SUMMARY
==================================================
✅ Method Characteristics:
• Naive: Fast, high variance, poor at boundaries
• Spline: Smooth, handles irregular data, parameter sensitive
• Savitzky-Golay: Good for regular data, edge effects
✅ Parameter Guidelines:
• Lower smoothing = follows data more closely
• Higher smoothing = smoother trends, less noise sensitivity
• Window size affects boundary behavior
✅ When to Use Each Method:
• Naive: Quick estimates, clean data
• Spline: Irregular sampling, need smoothness
• S-G: Regular sampling, moderate noise
This completes the basic usage examples. Each code block executes during documentation build and produces static outputs for GitHub Pages.