hessband documentation
Hessband: Analytic-Hessian bandwidth selection for univariate kernel smoothers.
This package provides tools for selecting bandwidths for Nadaraya–Watson regression using analytic derivatives of the leave-one-out cross-validation (LOOCV) risk. The main entry point is select_nw_bandwidth, which returns the selected bandwidth under one of several optimisation strategies, including the analytic-Hessian method.
Example
>>> import numpy as np
>>> from hessband import select_nw_bandwidth, nw_predict
>>> # Generate synthetic data
>>> X = np.linspace(0, 1, 200)
>>> y = np.sin(2 * np.pi * X) + 0.1 * np.random.randn(200)
>>> # Select bandwidth via analytic-Hessian method
>>> h_opt = select_nw_bandwidth(X, y, method='analytic')
>>> # Predict at new points
>>> y_pred = nw_predict(X, y, X, h_opt)
- hessband.select_nw_bandwidth(X, y, kernel='gaussian', method='analytic', folds=5, h_bounds=(0.01, 1.0), grid_size=30, init_bandwidth=None)
Select the optimal bandwidth for Nadaraya–Watson regression.
- Parameters:
X (array-like, shape (n_samples,)) – Input values.
y (array-like, shape (n_samples,)) – Target values.
kernel (str, optional (default='gaussian')) – Kernel type ('gaussian' or 'epanechnikov').
method (str, optional (default='analytic')) – Bandwidth selection method: one of {'analytic', 'grid', 'plugin', 'newton_fd', 'golden', 'bayes'}.
folds (int, optional (default=5)) – Number of folds for cross-validation.
h_bounds (tuple, optional (default=(0.01, 1.0))) – Lower and upper bounds for the bandwidth search.
grid_size (int, optional (default=30)) – Number of grid points for grid search.
init_bandwidth (float, optional) – Initial bandwidth for Newton-based methods. If None, the plug-in rule supplies the starting value.
- Returns:
Selected bandwidth.
- Return type:
float
- hessband.nw_predict(X_train, y_train, X_test, h, kernel='gaussian')
Compute Nadaraya–Watson predictions using a specified kernel.
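For orientation, the Nadaraya–Watson estimate at a test point is a kernel-weighted average of the training targets. The following is a minimal sketch of that formula for the Gaussian kernel only; nw_predict_sketch is a hypothetical name used for illustration and is not the hessband implementation.

```python
import numpy as np

def nw_predict_sketch(x_train, y_train, x_test, h):
    """Nadaraya-Watson estimate with a Gaussian kernel (illustrative only)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_test = np.asarray(x_test, dtype=float)
    # Pairwise scaled differences, shape (n_test, n_train)
    u = (x_test[:, None] - x_train[None, :]) / h
    # Gaussian weights; the normalising constant cancels in the ratio
    w = np.exp(-0.5 * u**2)
    # Weighted average of the training targets for each test point
    return (w @ y_train) / w.sum(axis=1)

# A constant target is reproduced exactly, whatever the bandwidth
x = np.linspace(0.0, 1.0, 50)
y = np.full_like(x, 2.0)
pred = nw_predict_sketch(x, y, np.array([0.25, 0.5]), h=0.1)
```

Because the weights are normalised by their sum, the estimate is always a convex combination of the observed targets.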
- hessband.grid_search_cv(X, y, kernel, predict_fn, h_grid, folds=5)
Grid search for the best bandwidth using cross-validation.
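The idea behind a cross-validated grid search can be sketched as follows: for each candidate bandwidth, average the held-out squared error over the folds and keep the minimiser. This is an illustration under stated assumptions, not the library's code. The names grid_search_cv_sketch and nw_gauss, the random fold assignment, and the seed argument are all hypothetical.

```python
import numpy as np

def nw_gauss(x_train, y_train, x_test, h):
    """Gaussian-kernel Nadaraya-Watson predictor (hypothetical helper)."""
    u = (x_test[:, None] - x_train[None, :]) / h
    w = np.exp(-0.5 * u**2)
    return (w @ y_train) / w.sum(axis=1)

def grid_search_cv_sketch(x, y, predict_fn, h_grid, folds=5, seed=0):
    """Return (best_h, best_mse) over h_grid using K-fold CV (assumed scheme)."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(x)), folds)
    best_h, best_mse = None, np.inf
    for h in h_grid:
        sq_err = 0.0
        for test_idx in parts:
            # Train on everything except the current fold
            train_mask = np.ones(len(x), dtype=bool)
            train_mask[test_idx] = False
            pred = predict_fn(x[train_mask], y[train_mask], x[test_idx], h)
            sq_err += np.sum((y[test_idx] - pred) ** 2)
        mse = sq_err / len(x)
        if mse < best_mse:
            best_h, best_mse = float(h), mse
    return best_h, best_mse

rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(200)
h_grid = np.linspace(0.01, 1.0, 30)
best_h, best_mse = grid_search_cv_sketch(x, y, nw_gauss, h_grid)
```

The cost is one full cross-validation pass per grid point, which is the expense the Newton-type methods aim to reduce.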
- hessband.newton_fd(X, y, kernel, predict_fn, h_init, h_min=0.001, folds=5, tol=0.001, max_iter=10, eps=0.0001)
Finite-difference Newton method for bandwidth selection.
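A finite-difference Newton iteration approximates the first and second derivatives of the risk with central differences of step eps, then takes the step grad / hess. The sketch below applies this to a generic one-dimensional risk function. newton_fd_sketch and the risk callable are hypothetical, and stopping when the estimated curvature is not positive is an assumption made here for simplicity, not documented hessband behaviour.

```python
def newton_fd_sketch(risk, h_init, h_min=1e-3, tol=1e-3, max_iter=10, eps=1e-4):
    """Minimise a 1-D risk with Newton steps built from central differences."""
    h = h_init
    for _ in range(max_iter):
        f0, fp, fm = risk(h), risk(h + eps), risk(h - eps)
        grad = (fp - fm) / (2.0 * eps)          # central first difference
        hess = (fp - 2.0 * f0 + fm) / eps**2    # central second difference
        if hess <= 0.0:
            # Assumption: give up when the local curvature is not positive
            break
        h_new = max(h - grad / hess, h_min)     # keep the bandwidth above h_min
        if abs(h_new - h) < tol:
            return h_new
        h = h_new
    return h

# Toy quadratic risk with its minimum at h = 0.3
h_opt = newton_fd_sketch(lambda h: (h - 0.3) ** 2 + 0.05, h_init=0.5)
```

Each iteration costs three risk evaluations, so with a cross-validated risk this means three CV passes per step.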
- hessband.analytic_newton(X, y, kernel, predict_fn, h_init, h_min=0.001, folds=5, tol=0.001, max_iter=10)
Analytic Newton method for LOOCV risk minimisation. Returns the selected bandwidth; no cross-validation evaluations are performed inside the Newton loop.
- hessband.golden_section(X, y, kernel, predict_fn, a, b, folds=5, tol=0.001, max_iter=20)
Golden-section search for bandwidth selection.
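Golden-section search shrinks a bracket [a, b] around the minimum of a unimodal function by a factor of about 0.618 per iteration, reusing one interior evaluation each time. The following is a generic sketch of the technique; golden_section_sketch and the risk callable are hypothetical names, and returning the bracket midpoint is an assumption rather than confirmed library behaviour.

```python
import math

def golden_section_sketch(risk, a, b, tol=1e-3, max_iter=20):
    """Minimise a unimodal 1-D risk on [a, b] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = risk(c), risk(d)
    for _ in range(max_iter):
        if abs(b - a) < tol:
            break
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = risk(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = risk(d)
    return (a + b) / 2.0

# Toy quadratic risk with its minimum at h = 0.3
h_opt = golden_section_sketch(lambda h: (h - 0.3) ** 2, 0.01, 1.0)
```

The method needs no derivatives and only one new risk evaluation per iteration, at the price of linear rather than quadratic convergence.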
- hessband.bayes_opt_bandwidth(X, y, kernel, predict_fn, a, b, folds=5, init_points=5, n_iter=10)
Bayesian optimisation for bandwidth selection.
- hessband.select_kde_bandwidth(x, kernel='gauss', method='analytic', h_bounds=(0.01, 1.0), grid_size=30, h_init=None)
Select an optimal bandwidth for univariate kernel density estimation (KDE) using least-squares cross-validation (LSCV).
- Parameters:
x (array-like) – Data samples.
kernel (str, optional) – Kernel name: 'gauss' or 'epan'.
method (str, optional) – Selection method: 'analytic' (Newton–Armijo), 'grid', or 'golden'.
h_bounds (tuple, optional) – Lower and upper bounds for the search.
grid_size (int, optional) – Number of grid points for grid search.
h_init (float, optional) – Initial bandwidth for Newton optimisation. Defaults to plug-in estimate.
- Returns:
Selected bandwidth.
- Return type:
float
- hessband.lscv_generic(x, h, kernel)
Return (LSCV, gradient, Hessian) at bandwidth h for the chosen kernel.
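As background, the LSCV criterion is the integral of the squared density estimate minus twice the average leave-one-out density at the sample points. For the Gaussian kernel the integral has a closed form, since the convolution of two N(0, h²) densities is an N(0, 2h²) density. The sketch below computes the score only; the gradient and Hessian returned by lscv_generic are not reproduced here. lscv_gauss_sketch and gauss_pdf are hypothetical names.

```python
import numpy as np

def gauss_pdf(u, s):
    """Density of N(0, s^2) evaluated at u."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def lscv_gauss_sketch(x, h):
    """LSCV score for a Gaussian-kernel KDE at bandwidth h (score only)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x[:, None] - x[None, :]  # all pairwise differences
    # Integral of the squared estimate: pairwise N(0, 2 h^2) densities
    int_f2 = gauss_pdf(d, np.sqrt(2.0) * h).sum() / n**2
    # Mean leave-one-out density at the sample points (diagonal excluded)
    k = gauss_pdf(d, h)
    loo_mean = (k.sum() - np.trace(k)) / (n * (n - 1))
    return int_f2 - 2.0 * loo_mean

rng = np.random.default_rng(0)
sample = rng.standard_normal(200)
score_mid = lscv_gauss_sketch(sample, 0.35)
score_wide = lscv_gauss_sketch(sample, 5.0)
```

A heavily oversmoothing bandwidth such as 5.0 gives a higher (worse) score than a moderate one on standard normal data, which is the behaviour a minimiser of this criterion exploits.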
- Return type:
Tuple[float, float, float]