Core Functions
This module contains the main optimization functions that form the core of the library.
Main Optimization Function
Multiclass Optimization
Cost-Sensitive Optimization
Legacy Functions
Internal Functions
These functions are used internally but may be useful for advanced users:
Optimized O(n log n) sort-and-scan kernel for piecewise-constant metrics.
This module provides an exact optimizer for binary classification metrics that are piecewise-constant with respect to the decision threshold. The algorithm sorts predictions once and scans all n cuts in a single pass, achieving true O(n log n) complexity with vectorized operations.
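The sort-and-scan idea can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the library's vectorized implementation: the helper name best_f1_cut is hypothetical, the metric is fixed to F1, sample weights are omitted, and tied scores are ignored.

```python
def best_f1_cut(y_true, scores):
    """Return (k, f1): predicting the k highest scores positive maximizes F1.

    Hypothetical helper for illustration; assumes binary labels in {0, 1}
    and no tied scores.
    """
    # Sort sample indices by descending score: the O(n log n) step.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(y_true)

    # Cut k = 0 predicts nothing positive, so F1 is 0 there.
    best_k, best_f1 = 0, 0.0
    tp = 0
    # Single O(n) pass: moving from cut k-1 to cut k flips exactly one sample
    # to "positive", so the confusion counts update incrementally.
    for k, i in enumerate(order, start=1):
        tp += y_true[i]
        fp = k - tp
        fn = total_pos - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_k, best_f1 = k, f1
    return best_k, best_f1


k, f1 = best_f1_cut([0, 1, 1, 0, 1], [0.1, 0.9, 0.8, 0.7, 0.3])
```

Because the metric only changes when a cut crosses a sample's score, checking the cuts between sorted scores is enough to find the exact optimum; no grid over threshold values is needed.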
Notes on require_proba:
- If require_proba=True, inputs are validated to lie in [0, 1].
- The returned threshold is usually in [0, 1]; however, in boundary or tie cases, it may be nudged by one floating-point ULP beyond the range to correctly realize strict inclusivity/exclusivity (e.g., to ensure “predict none” with ‘>=’ when max p == 1.0).
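The boundary case can be reproduced with the standard library alone. The sketch below is an assumption about how such a nudge might look, not the library's code: predict is a hypothetical helper, and math.nextafter (Python 3.9+) stands in for whatever mechanism the library actually uses. The below-zero case is the symmetric counterpart of the example given above.

```python
import math


def predict(probs, threshold, inclusive):
    """Hypothetical helper: apply '>=' if inclusive, else '>'."""
    if inclusive:
        return [int(p >= threshold) for p in probs]
    return [int(p > threshold) for p in probs]


probs = [0.2, 0.7, 1.0]

# With '>=', a threshold of 1.0 still labels p == 1.0 as positive, so no
# threshold inside [0, 1] can express "predict none".
still_positive = predict(probs, 1.0, inclusive=True)

# Stepping one ULP above 1.0 excludes every sample.
t_none = math.nextafter(1.0, math.inf)
none_positive = predict(probs, t_none, inclusive=True)

# Symmetric case (assumed): with '>', p == 0.0 is only included by a
# threshold one ULP below 0.0.
t_all = math.nextafter(0.0, -math.inf)
all_positive = predict([0.0, 0.5], t_all, inclusive=False)
```

Callers that clip or validate the returned threshold to [0, 1] would undo this nudge, which is worth keeping in mind when post-processing results.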
- optimal_cutoffs.piecewise.optimal_threshold_sortscan(y_true: ndarray[Any, Any], pred_prob: ndarray[Any, Any], metric: str | Callable[[ndarray[Any, Any], ndarray[Any, Any], ndarray[Any, Any], ndarray[Any, Any]], ndarray[Any, Any]], *, sample_weight: ndarray[Any, Any] | None = None, inclusive: bool = False, require_proba: bool = True, tolerance: float = 1e-10) -> OptimizationResult
Exact optimizer for piecewise-constant metrics using O(n log n) sort-and-scan.
- Parameters:
y_true (array-like of shape (n_samples,)) – Binary labels in {0, 1}.
pred_prob (array-like of shape (n_samples,)) – Predicted probabilities in [0, 1] or arbitrary scores if require_proba=False.
metric (str or callable) – Metric name (e.g., “f1”, “precision”) or a vectorized function. A string is resolved automatically to its vectorized implementation. A callable must have the form (tp_vec, tn_vec, fp_vec, fn_vec) -> score_vec.
sample_weight (array-like, optional) – Non-negative sample weights of shape (n_samples,).
inclusive (bool, default=False) – If True, use “>=”; if False, use “>”.
require_proba (bool, default=True) – If True, validate that inputs lie in [0, 1]. The returned threshold may still be nudged by ±1 ULP outside [0, 1] to exactly realize inclusivity/exclusivity in boundary/tie cases.
tolerance (float, default=1e-10) – Numerical tolerance for floating-point comparisons when computing threshold midpoints and handling ties between scores.
- Returns:
thresholds : array([optimal_threshold])
scores : array([achieved_score])
predict : callable(probs) -> {0,1}^n
metric : str, set to “piecewise_metric”
n_classes : 2
diagnostics : dict with keys:
k_argmax: theoretical best cut index (0..n) from the sweep
k_realized: positives realized by the returned threshold
score_theoretical: score at k_argmax
score_actual: score achieved by the returned threshold
tie_discrepancy: abs(score_theoretical - score_actual)
inclusive: bool
require_proba: bool
- Return type:
OptimizationResult
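The k_argmax and k_realized diagnostics can differ when scores are tied, since a single threshold cannot separate samples with equal scores. The following is a minimal sketch of that effect using a hypothetical helper, realized_positives; it does not show how the library itself resolves such ties.

```python
def realized_positives(scores, threshold, inclusive=False):
    """Hypothetical helper: count samples a threshold labels positive."""
    if inclusive:
        return sum(s >= threshold for s in scores)
    return sum(s > threshold for s in scores)


scores = [0.9, 0.6, 0.6, 0.2]

# A sweep over cut indices may prefer k = 2 (the top two scores positive),
# but that cut would split the tied pair at 0.6.
k_exclusive = realized_positives(scores, 0.6, inclusive=False)  # tied pair excluded
k_inclusive = realized_positives(scores, 0.6, inclusive=True)   # tied pair included

# No threshold realizes exactly two positives for these scores.
candidates = [0.1, 0.2, 0.4, 0.6, 0.75, 0.9, 1.0]
achievable = {realized_positives(scores, t) for t in candidates}
```

When this happens, score_actual is computed at the cut the threshold really produces, and tie_discrepancy reports the gap from score_theoretical.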