Cross-Validation Utilities¶

The cv module provides functions for robust threshold estimation using cross-validation techniques.

Cross-Validation Functions¶

optimal_cutoffs.cv.cross_validate(y_true: ArrayLike, y_score: ArrayLike, *, metric: str = 'f1', cv: int = 5, random_state: int | None = None, **optimize_kwargs) → tuple[ndarray, ndarray][source]¶

Cross-validate threshold optimization.

Parameters:
  • y_true (array-like) – True labels

  • y_score (array-like) – Predicted scores/probabilities

  • metric (str, default="f1") – Metric to optimize and evaluate

  • cv (int, default=5) – Number of cross-validation folds

  • random_state (int, optional) – Random seed for reproducibility

  • **optimize_kwargs – Additional arguments passed to optimize_thresholds()

Returns:

Arrays of per-fold thresholds and scores.

Return type:

tuple[np.ndarray, np.ndarray]

Examples

>>> thresholds, scores = cross_validate(y_true, y_scores, metric="f1", cv=5)
>>> print(f"CV Score: {np.mean(scores):.3f} ± {np.std(scores):.3f}")
optimal_cutoffs.cv.nested_cross_validate(y_true: ArrayLike, y_score: ArrayLike, *, metric: str = 'f1', inner_cv: int = 3, outer_cv: int = 5, random_state: int | None = None, **optimize_kwargs) → tuple[ndarray, ndarray][source]¶

Nested cross-validation for unbiased threshold optimization evaluation.

Inner CV optimizes the thresholds; outer CV evaluates the optimization procedure.

Parameters:
  • y_true (array-like) – True labels

  • y_score (array-like) – Predicted scores/probabilities

  • metric (str, default="f1") – Metric to optimize and evaluate

  • inner_cv (int, default=3) – Number of inner CV folds (for threshold optimization)

  • outer_cv (int, default=5) – Number of outer CV folds (for evaluation)

  • random_state (int, optional) – Random seed for reproducibility

  • **optimize_kwargs – Additional arguments passed to optimize_thresholds()

Returns:

Arrays of per-outer-fold threshold estimates and outer test scores.

Return type:

tuple[np.ndarray, np.ndarray]

Examples

>>> # Get an unbiased estimate of threshold optimization performance
>>> thresholds, scores = nested_cross_validate(y_true, y_scores, metric="f1")
>>> print(f"Unbiased CV Score: {np.mean(scores):.3f}")

Usage Examples¶

Basic Cross-Validation¶

from optimal_cutoffs import cv
import numpy as np

# Synthetic example data (replace with your own labels and predicted probabilities)
y_true = np.random.randint(0, 2, 1000)
y_prob = np.random.uniform(0, 1, 1000)

# 5-fold cross-validation
thresholds, scores = cv.cross_validate(
    y_true, y_prob,
    metric='f1',
    cv=5,
    method='auto'
)

print(f"CV thresholds: {thresholds}")
print(f"CV scores: {scores}")
print(f"Mean threshold: {np.mean(thresholds):.3f} ± {np.std(thresholds):.3f}")

Stratified Cross-Validation¶

from sklearn.model_selection import StratifiedKFold

# Use stratified splits for imbalanced data
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

thresholds, scores = cv.cross_validate(
    y_true, y_prob,
    metric='f1',
    cv=skf,  # Pass a custom CV splitter instead of a fold count
    method='auto'
)

Nested Cross-Validation¶

from optimal_cutoffs import cv

# Nested CV for unbiased performance estimation
thresholds, scores = cv.nested_cross_validate(
    y_true, y_prob,
    metric='f1',
    outer_cv=5,
    inner_cv=3,
    method='auto'
)

print(f"Outer CV scores: {scores}")
print(f"Mean performance: {np.mean(scores):.3f} ± {np.std(scores):.3f}")

Custom Cross-Validation¶

from sklearn.model_selection import TimeSeriesSplit

# Time series cross-validation
tscv = TimeSeriesSplit(n_splits=5)

thresholds, scores = cv.cross_validate(
    y_true, y_prob,
    metric='precision',
    cv=tscv,
    method='sort_scan'
)

With Sample Weights¶

# Sample weights for imbalanced data
sample_weights = np.where(y_true == 1, 2.0, 0.5)  # Upweight the minority (positive) class

thresholds, scores = cv.cross_validate(
    y_true, y_prob,
    metric='f1',
    cv=5,
    sample_weight=sample_weights
)
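
If you would rather derive the weights from the observed class frequencies than hard-code them, scikit-learn's compute_sample_weight can generate balanced weights. How cross_validate uses sample_weight internally (forwarding it to optimize_thresholds, splitting it per fold, or both) is up to the library, so treat this as a sketch:

from sklearn.utils.class_weight import compute_sample_weight

# 'balanced' gives every class the same total weight, derived from the data
balanced_weights = compute_sample_weight('balanced', y_true)

thresholds, scores = cv.cross_validate(
    y_true, y_prob,
    metric='f1',
    cv=5,
    sample_weight=balanced_weights
)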

Multiclass Cross-Validation¶

# Multiclass data
y_true_mc = np.random.randint(0, 3, 1000)
y_prob_mc = np.random.dirichlet([1, 1, 1], 1000)  # 3 classes

# Returns list of threshold arrays (one per fold)
thresholds_list, scores = cv.cross_validate(
    y_true_mc, y_prob_mc,
    metric='f1',
    cv=5,
    average='macro'  # Macro-averaged F1
)

# Average thresholds across folds
mean_thresholds = np.mean(thresholds_list, axis=0)
print(f"Mean per-class thresholds: {mean_thresholds}")

Best Practices¶

Choosing CV Strategy¶

  • Balanced data: Use standard KFold or cv=5

  • Imbalanced data: Use StratifiedKFold to preserve class ratios

  • Time series: Use TimeSeriesSplit to respect temporal order

  • Small datasets: Prefer a larger k or repeated k-fold; leave-one-out yields single-sample test folds on which metrics like F1 are ill-defined (see the splitter sketch after this list)
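
A minimal sketch of wiring these strategies into cross_validate, assuming (as in the examples above) that cv accepts either an integer fold count or a scikit-learn splitter; the make_splitter helper is purely illustrative:

from sklearn.model_selection import KFold, StratifiedKFold, TimeSeriesSplit

def make_splitter(kind, n_splits=5, random_state=42):
    """Pick a CV splitter for the data at hand (illustrative helper)."""
    if kind == 'balanced':
        return KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    if kind == 'imbalanced':
        return StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    if kind == 'time_series':
        return TimeSeriesSplit(n_splits=n_splits)
    raise ValueError(f"unknown kind: {kind}")

thresholds, scores = cv.cross_validate(
    y_true, y_prob, metric='f1', cv=make_splitter('imbalanced')
)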

Threshold Aggregation¶

# Multiple strategies for combining CV thresholds
thresholds, scores = cv.cross_validate(y_true, y_prob, metric='f1', cv=10)

# Different aggregation methods
mean_threshold = np.mean(thresholds)
median_threshold = np.median(thresholds)

# Weighted by CV scores
weights = scores / np.sum(scores)
weighted_threshold = np.average(thresholds, weights=weights)

# Choose best single fold
best_idx = np.argmax(scores)
best_threshold = thresholds[best_idx]

Uncertainty Quantification¶

# Confidence interval from the CV fold thresholds (t-interval)
from scipy import stats

thresholds, scores = cv.cross_validate(y_true, y_prob, metric='f1', cv=10)

# 95% confidence interval for threshold
threshold_mean = np.mean(thresholds)
threshold_se = stats.sem(thresholds)
ci_lower, ci_upper = stats.t.interval(
    0.95, len(thresholds) - 1, loc=threshold_mean, scale=threshold_se
)

print(f"Threshold: {threshold_mean:.3f} [{ci_lower:.3f}, {ci_upper:.3f}]")