Core Functions

This module contains the main optimization functions that form the core of the library.

Main Optimization Function

Multiclass Optimization

Cost-Sensitive Optimization

Legacy Functions

Internal Functions

These functions are used internally but may be useful for advanced users:

Optimized O(n log n) sort-and-scan kernel for piecewise-constant metrics.

This module provides an exact optimizer for binary classification metrics that are piecewise-constant with respect to the decision threshold. The algorithm sorts the predictions once and then evaluates every candidate cut (indices 0..n) in a single vectorized pass, so the sort dominates and the total cost is O(n log n).
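The sort-and-scan idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names `sort_scan_best_cut` and `f1_vec` are hypothetical, sample weights are omitted, and tied scores are not handled.

```python
import numpy as np

def sort_scan_best_cut(y_true, scores, metric):
    """Illustrative sort-and-scan: score every cut k = 0..n in one pass."""
    y = np.asarray(y_true, dtype=float)
    s = np.asarray(scores, dtype=float)
    # Stable descending sort; this is the O(n log n) step.
    order = np.argsort(-s, kind="mergesort")
    y_sorted = y[order]

    # Cut k labels the k highest-scored samples positive, so the confusion
    # counts for all n + 1 cuts follow from cumulative sums.
    tp = np.concatenate(([0.0], np.cumsum(y_sorted)))
    fp = np.concatenate(([0.0], np.cumsum(1.0 - y_sorted)))
    fn = y.sum() - tp
    tn = (1.0 - y).sum() - fp

    score_vec = metric(tp, tn, fp, fn)
    k_best = int(np.argmax(score_vec))
    return k_best, float(score_vec[k_best])

def f1_vec(tp, tn, fp, fn):
    """Vectorized F1, defined as 0 where the denominator is 0."""
    denom = 2.0 * tp + fp + fn
    return np.divide(2.0 * tp, denom, out=np.zeros_like(denom), where=denom > 0)

k, best = sort_scan_best_cut([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8], f1_vec)
```

Here the best cut marks the three highest-scored samples positive. Turning a cut index into an actual threshold is a separate step and depends on the comparison operator and on ties.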

Notes on require_proba:

  • If require_proba=True, inputs are validated to lie in [0, 1].

  • The returned threshold is usually in [0, 1]; however, in boundary or tie cases, we may nudge it by one floating-point ULP beyond the range to correctly realize strict inclusivity/exclusivity (e.g., to ensure “predict none” with ‘>=’ when max p == 1.0).
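The boundary case can be reproduced with `np.nextafter`, which returns the adjacent representable float. This sketch only shows the floating-point behavior; whether the library uses this exact call is an assumption, and the lower-boundary case is added here as the symmetric counterpart of the documented example.

```python
import numpy as np

p = np.array([0.2, 0.7, 1.0])

# With ">=" a threshold of exactly 1.0 still selects the sample with p == 1.0.
at_one = p >= 1.0

# One ULP above 1.0 is the smallest float64 threshold that selects nothing.
nudged_up = np.nextafter(1.0, np.inf)
none_selected = p >= nudged_up

# Symmetric case (assumed): with ">" and min p == 0.0, "predict all" needs a
# threshold just below 0.0.
q = np.array([0.0, 0.5])
nudged_down = np.nextafter(0.0, -np.inf)
all_selected = q > nudged_down
```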

optimal_cutoffs.piecewise.optimal_threshold_sortscan(y_true: ndarray[Any, Any], pred_prob: ndarray[Any, Any], metric: str | Callable[[ndarray[Any, Any], ndarray[Any, Any], ndarray[Any, Any], ndarray[Any, Any]], ndarray[Any, Any]], *, sample_weight: ndarray[Any, Any] | None = None, inclusive: bool = False, require_proba: bool = True, tolerance: float = 1e-10) -> OptimizationResult

Exact optimizer for piecewise-constant metrics using O(n log n) sort-and-scan.

Parameters:
  • y_true (array-like of shape (n_samples,)) – Binary labels in {0, 1}.

  • pred_prob (array-like of shape (n_samples,)) – Predicted probabilities in [0, 1] or arbitrary scores if require_proba=False.

  • metric (str or callable) – Metric name (e.g., “f1”, “precision”) or a vectorized function. A string is resolved automatically to its vectorized implementation. A callable must have the signature (tp_vec, tn_vec, fp_vec, fn_vec) -> score_vec.

  • sample_weight (array-like, optional) – Non-negative sample weights of shape (n_samples,).

  • inclusive (bool, default=False) – Comparison used to predict the positive class: if True, score “>=” threshold; if False, score “>” threshold.

  • require_proba (bool, default=True) – If True, validate that inputs lie in [0, 1]. The returned threshold may still be nudged by ±1 ULP outside [0, 1] to exactly realize inclusivity/exclusivity in boundary/tie cases.

  • tolerance (float, default=1e-10) – Numerical tolerance for floating-point comparisons when computing threshold midpoints and handling ties between scores.
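As an illustration of the callable form of metric, the sketch below defines a vectorized F-beta score with the documented (tp_vec, tn_vec, fp_vec, fn_vec) -> score_vec shape. The name `fbeta_vec` and the choice to return 0 where the denominator is 0 are assumptions made for this example, not part of the library.

```python
import numpy as np

def fbeta_vec(tp, tn, fp, fn, beta=2.0):
    """Hypothetical vectorized F-beta: one score per candidate cut.

    tn is unused but kept so the function matches the four-array signature.
    """
    b2 = beta * beta
    num = (1.0 + b2) * tp
    denom = num + b2 * fn + fp
    # Guard against 0/0 by returning 0 where the score is undefined.
    return np.divide(num, denom, out=np.zeros_like(denom, dtype=float), where=denom > 0)

# Confusion counts for three candidate cuts (made-up values for illustration).
tp = np.array([0.0, 1.0, 2.0])
tn = np.array([2.0, 2.0, 1.0])
fp = np.array([0.0, 0.0, 1.0])
fn = np.array([2.0, 1.0, 0.0])
scores = fbeta_vec(tp, tn, fp, fn)
```

Because the documented callable takes exactly four arrays, beta is bound through a default argument here; `functools.partial` would serve equally well.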

Returns:

  • thresholds – array([optimal_threshold])

  • scores – array([achieved_score])

  • predict – callable(probs) -> {0,1}^n

  • metric – str, set to “piecewise_metric”

  • n_classes – 2

  • diagnostics – dict with keys:

      • k_argmax: theoretical best cut index (0..n) from the sweep

      • k_realized: positives realized by the returned threshold

      • score_theoretical: score at k_argmax

      • score_actual: score achieved by the returned threshold

      • tie_discrepancy: abs(theoretical - actual)

      • inclusive: bool

      • require_proba: bool

Return type:

OptimizationResult
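The gap between k_argmax and k_realized can be seen with tied scores: a single threshold cannot separate samples that share a score, so some cut indices are unreachable. The sketch below is a standalone illustration of that effect, with a hypothetical helper `realized_positives`; it does not call the library.

```python
import numpy as np

scores = np.array([0.9, 0.5, 0.5, 0.1])

def realized_positives(threshold, inclusive=False):
    """Count the samples a single threshold actually labels positive."""
    if inclusive:
        return int(np.sum(scores >= threshold))
    return int(np.sum(scores > threshold))

# Try every distinct score as a threshold under both comparison operators.
candidates = np.unique(scores)
realizable = sorted(
    {realized_positives(t, inc) for t in candidates for inc in (False, True)}
)
# realizable is [0, 1, 3, 4]: no threshold yields exactly 2 positives,
# because the second and third samples share the score 0.5.
```

If a sweep over cut indices preferred k = 2 here, the returned threshold would have to settle for a neighboring realizable count, which is the kind of difference tie_discrepancy reports.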