hessband documentation

Hessband: Analytic-Hessian bandwidth selection for univariate kernel smoothers.

This package selects bandwidths for Nadaraya–Watson regression using analytic derivatives of the leave-one-out cross-validation (LOOCV) risk. The main entry point is select_nw_bandwidth, which returns the bandwidth chosen by one of several optimisation strategies, including the analytic-Hessian method. For univariate kernel density estimation, select_kde_bandwidth selects a bandwidth by least-squares cross-validation (LSCV).

Example

>>> import numpy as np
>>> from hessband import select_nw_bandwidth, nw_predict
>>> # Generate synthetic data
>>> X = np.linspace(0, 1, 200)
>>> y = np.sin(2 * np.pi * X) + 0.1 * np.random.randn(200)
>>> # Select bandwidth via analytic-Hessian method
>>> h_opt = select_nw_bandwidth(X, y, method='analytic')
>>> # Predict at new points
>>> y_pred = nw_predict(X, y, X, h_opt)

hessband.select_nw_bandwidth(X, y, kernel='gaussian', method='analytic', folds=5, h_bounds=(0.01, 1.0), grid_size=30, init_bandwidth=None)

Select the optimal bandwidth for Nadaraya–Watson regression.

Parameters:
  • X (array-like, shape (n_samples,)) – Input values.

  • y (array-like, shape (n_samples,)) – Target values.

  • kernel (str, optional (default='gaussian')) – Kernel type (‘gaussian’ or ‘epanechnikov’).

  • method (str, optional (default='analytic')) – Bandwidth selection method: one of {‘analytic’, ‘grid’, ‘plugin’, ‘newton_fd’, ‘golden’, ‘bayes’}.

  • folds (int, optional (default=5)) – Number of folds for cross-validation.

  • h_bounds (tuple, optional (default=(0.01, 1.0))) – Lower and upper bounds for the bandwidth search.

  • grid_size (int, optional (default=30)) – Number of grid points for grid search.

  • init_bandwidth (float, optional) – Initial bandwidth for the Newton-based methods. If None, the plug-in rule is used.

Returns:

Selected bandwidth.

Return type:

float

hessband.nw_predict(X_train, y_train, X_test, h, kernel='gaussian')

Compute Nadaraya–Watson predictions using a specified kernel.
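
For orientation, the Nadaraya–Watson estimate at a test point is the kernel-weighted average of the training targets. The following is a minimal NumPy sketch for the Gaussian kernel only. The name nw_predict_sketch is hypothetical, and the code illustrates the estimator rather than reproducing hessband's implementation.

```python
import numpy as np

def nw_predict_sketch(X_train, y_train, X_test, h):
    """Gaussian-kernel Nadaraya-Watson estimate (illustrative, not hessband's code)."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    # Scaled pairwise differences, shape (n_test, n_train)
    u = (X_test[:, None] - X_train[None, :]) / h
    # Unnormalised Gaussian weights; the normalising constant cancels in the ratio
    w = np.exp(-0.5 * u**2)
    # Guard against a zero denominator when every weight underflows
    denom = np.maximum(w.sum(axis=1), np.finfo(float).tiny)
    return (w @ y_train) / denom
```

A small h makes the weights concentrate on the nearest training points, while a large h pulls every prediction towards the global mean of y_train.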

hessband.grid_search_cv(X, y, kernel, predict_fn, h_grid, folds=5)

Grid search for the best bandwidth using cross-validation.
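
The idea behind a cross-validated grid search can be sketched as follows: score every candidate bandwidth by its K-fold mean squared error and keep the best one. This is an illustration under stated assumptions, not the library's code. The names gaussian_nw and grid_search_cv_sketch, the seed argument, and the calling convention predict_fn(X_train, y_train, X_test, h) are all assumed for the example.

```python
import numpy as np

def gaussian_nw(X_train, y_train, X_test, h):
    """Stand-in Gaussian Nadaraya-Watson predictor (hypothetical helper)."""
    u = (X_test[:, None] - X_train[None, :]) / h
    w = np.exp(-0.5 * u**2)
    return (w @ y_train) / np.maximum(w.sum(axis=1), np.finfo(float).tiny)

def grid_search_cv_sketch(X, y, predict_fn, h_grid, folds=5, seed=0):
    """Return (best_h, best_mse) over h_grid using K-fold cross-validation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Shuffle once so every candidate bandwidth is scored on the same folds
    idx = np.random.default_rng(seed).permutation(len(X))
    splits = np.array_split(idx, folds)
    best_h, best_mse = None, np.inf
    for h in h_grid:
        sq_err = 0.0
        for test_idx in splits:
            train_mask = np.ones(len(X), dtype=bool)
            train_mask[test_idx] = False
            pred = predict_fn(X[train_mask], y[train_mask], X[test_idx], h)
            sq_err += np.sum((y[test_idx] - pred) ** 2)
        mse = sq_err / len(X)
        if mse < best_mse:
            best_h, best_mse = h, mse
    return best_h, best_mse

# Usage on synthetic data similar to the package example
rng = np.random.default_rng(0)
X = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(200)
h_grid = np.linspace(0.01, 1.0, 30)
h_best, mse_best = grid_search_cv_sketch(X, y, gaussian_nw, h_grid)
```

The cost grows linearly with the number of grid points, which is the overhead that the Newton-type and golden-section methods aim to reduce.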

hessband.plug_in_bandwidth(X)

Plug-in bandwidth based on Silverman’s rule of thumb.
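
One common form of Silverman's rule of thumb is h = 0.9 · min(σ, IQR / 1.34) · n^(-1/5). The sketch below implements that form. Whether plug_in_bandwidth uses this variant or the simpler 1.06 · σ · n^(-1/5) is not stated here, so the constants are an assumption and silverman_bandwidth_sketch is a hypothetical name.

```python
import numpy as np

def silverman_bandwidth_sketch(x):
    """Silverman's rule of thumb, 0.9 * min(sigma, IQR / 1.34) * n**(-1/5)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sigma = x.std(ddof=1)                      # sample standard deviation
    q75, q25 = np.percentile(x, [75, 25])
    iqr = q75 - q25
    # Fall back to sigma alone if the interquartile range is degenerate
    spread = min(sigma, iqr / 1.34) if iqr > 0 else sigma
    return 0.9 * spread * n ** (-0.2)
```

The result scales with the data: doubling every sample doubles the bandwidth, and shifting the samples leaves it unchanged.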

hessband.newton_fd(X, y, kernel, predict_fn, h_init, h_min=0.001, folds=5, tol=0.001, max_iter=10, eps=0.0001)

Finite-difference Newton method for bandwidth selection.
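
A finite-difference Newton iteration replaces the exact gradient and Hessian of the risk with central differences of step eps. The sketch below applies this to an arbitrary one-dimensional objective. The function name, the risk callable, and the fallback to a plain gradient step when the estimated curvature is not positive are assumptions made for illustration, not a description of hessband's internals.

```python
def newton_fd_sketch(risk, h_init, h_min=1e-3, tol=1e-3, max_iter=10, eps=1e-4):
    """Minimise a 1-D risk(h) by Newton steps with central-difference derivatives."""
    h = h_init
    for _ in range(max_iter):
        f_plus, f_0, f_minus = risk(h + eps), risk(h), risk(h - eps)
        grad = (f_plus - f_minus) / (2 * eps)            # first derivative estimate
        hess = (f_plus - 2 * f_0 + f_minus) / eps**2     # second derivative estimate
        # Newton step when curvature is positive, otherwise a plain gradient step
        step = grad / hess if hess > 0 else grad
        h_new = max(h - step, h_min)                     # keep the bandwidth above h_min
        if abs(h_new - h) < tol:
            return h_new
        h = h_new
    return h

# Usage on a toy quadratic risk with its minimum at h = 0.3
h_hat = newton_fd_sketch(lambda h: (h - 0.3) ** 2, h_init=0.8)
```

Each iteration costs three risk evaluations, so when the risk is a cross-validation score this is noticeably more expensive per step than a method with closed-form derivatives.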

hessband.analytic_newton(X, y, kernel, predict_fn, h_init, h_min=0.001, folds=5, tol=0.001, max_iter=10)

Analytic Newton method for minimising the LOOCV risk. The Newton updates use analytic derivatives of the risk, so no cross-validation evaluations are performed inside the loop.

hessband.golden_section(X, y, kernel, predict_fn, a, b, folds=5, tol=0.001, max_iter=20)

Golden-section search for bandwidth selection.
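
Golden-section search shrinks a bracket [a, b] around the minimum of a unimodal function by a constant factor of about 0.618 per iteration, reusing one interior evaluation each time. The sketch below shows the technique on a generic objective. The name golden_section_sketch and the risk callable are hypothetical; the tol and max_iter defaults simply mirror the signature above.

```python
import math

def golden_section_sketch(risk, a, b, tol=1e-3, max_iter=20):
    """Minimise a unimodal 1-D risk(h) on [a, b] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2          # 1 / golden ratio, about 0.618
    c = b - inv_phi * (b - a)                 # left interior point
    d = a + inv_phi * (b - a)                 # right interior point
    fc, fd = risk(c), risk(d)
    for _ in range(max_iter):
        if b - a < tol:
            break
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new right interior point
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = risk(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new left interior point
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = risk(d)
    return (a + b) / 2

# Usage on a toy quadratic risk with its minimum at h = 0.3
h_hat = golden_section_sketch(lambda h: (h - 0.3) ** 2, 0.01, 1.0)
```

The method needs no derivatives and only one new risk evaluation per iteration, at the price of linear rather than quadratic convergence.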

hessband.bayes_opt_bandwidth(X, y, kernel, predict_fn, a, b, folds=5, init_points=5, n_iter=10)

Bayesian optimisation for bandwidth selection.

hessband.select_kde_bandwidth(x, kernel='gauss', method='analytic', h_bounds=(0.01, 1.0), grid_size=30, h_init=None)

Select an optimal bandwidth for univariate KDE using LSCV.

Parameters:
  • x (array-like) – Data samples.

  • kernel (str, optional) – Kernel name: ‘gauss’ or ‘epan’.

  • method (str, optional) – Selection method: ‘analytic’ (Newton–Armijo), ‘grid’, or ‘golden’.

  • h_bounds (tuple, optional) – Lower and upper bounds for the search.

  • grid_size (int, optional) – Number of grid points for grid search.

  • h_init (float, optional) – Initial bandwidth for Newton optimisation. Defaults to plug-in estimate.

Returns:

Selected bandwidth.

Return type:

float

hessband.lscv_generic(x, h, kernel)

Return the LSCV score and its first and second derivatives with respect to h, as (LSCV, gradient, Hessian), evaluated at bandwidth h for the chosen kernel.
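
To make the returned triple concrete, the sketch below evaluates the LSCV criterion for a Gaussian kernel in closed form, together with its first and second derivatives in h. It relies on the standard identity that the integral of the squared Gaussian KDE is a double sum of N(0, 2h²) densities. The names _phi_and_derivs and lscv_gauss_sketch are hypothetical, and this is a reference calculation under those assumptions rather than hessband's implementation.

```python
import numpy as np

def _phi_and_derivs(d, s):
    """N(0, s^2) density at d, with its first and second derivatives in s."""
    phi = np.exp(-0.5 * (d / s) ** 2) / (s * np.sqrt(2 * np.pi))
    a = d**2 / s**3 - 1 / s               # derivative of log(phi) in s
    da = -3 * d**2 / s**4 + 1 / s**2      # derivative of a in s
    return phi, phi * a, phi * (a**2 + da)

def lscv_gauss_sketch(x, h):
    """Return (LSCV, dLSCV/dh, d2LSCV/dh2) for a Gaussian-kernel KDE."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x[:, None] - x[None, :]           # pairwise differences, shape (n, n)
    off_diag = ~np.eye(n, dtype=bool)
    c = np.sqrt(2.0)
    # Term 1: integral of fhat^2 = (1 / n^2) * sum_{i,j} phi_{sqrt(2) h}(d_ij)
    t1 = _phi_and_derivs(d, c * h)
    # Term 2: leave-one-out part, sum over i != j of phi_h(d_ij)
    t2 = _phi_and_derivs(d[off_diag], h)
    out = []
    for k in range(3):
        # Chain rule for term 1: each derivative in h brings a factor sqrt(2)
        value = c**k * t1[k].sum() / n**2 - 2.0 * t2[k].sum() / (n * (n - 1))
        out.append(float(value))
    return tuple(out)
```

A Newton step on the bandwidth is then h - gradient / Hessian whenever the Hessian is positive, which is the kind of update the 'analytic' method name suggests.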

Return type:

Tuple[float, float, float]
