Core Functions
This module contains the main optimization functions that form the core of the library.
Main Optimization Functions
- optimal_cutoffs.api.optimize_thresholds(y_true: ArrayLike, y_score: ArrayLike, *, metric: str = 'f1', task: Task = Task.AUTO, average: Average = Average.AUTO, method: str = 'auto', mode: str = 'empirical', sample_weight: ArrayLike | None = None, **kwargs) -> OptimizationResult
Find optimal thresholds for classification problems.
This is THE canonical entry point for threshold optimization. Auto-detects problem type and selects appropriate algorithms.
- Parameters:
y_true – True labels
y_score – Predicted scores/probabilities:
- Binary: 1D array of scores
- Multiclass: 2D array (n_samples, n_classes)
- Multilabel: 2D array (n_samples, n_labels)
metric – Metric to optimize (“f1”, “precision”, “recall”, “accuracy”, etc.)
task – Problem type. AUTO infers from data shape and probability sums.
average – Averaging strategy for multiclass/multilabel. AUTO selects sensible default.
method – Optimization algorithm. AUTO selects best method per task+metric.
mode – “empirical” (standard) or “expected” (requires calibrated probabilities)
sample_weight – Sample weights
**kwargs – Additional keyword arguments passed to optimization algorithms. Common options include ‘comparison’ (“>”, “>=”), ‘tolerance’, ‘utility’.
- Returns:
Result with .thresholds, .predict(), and explanation of auto-selections
- Return type:
OptimizationResult
- Raises:
TypeError – If ‘bayes’ is passed as a keyword argument (deprecated).
ValueError – Raised in any of the following cases:
- mode=’bayes’ is used without a utility parameter
- the comparison operator is not ‘>’ or ‘>=’
- mode=’expected’ is used with an unsupported metric
- method is deprecated (‘dinkelbach’, ‘smart_brute’)
- an unknown metric name is provided
- true_labels are required for empirical mode but not provided
Examples
>>> # Binary classification - simple case
>>> result = optimize_thresholds(y_true, y_scores, metric="f1")
>>> print(f"Optimal threshold: {result.threshold}")

>>> # Multiclass classification
>>> result = optimize_thresholds(y_true, y_probs, metric="f1")
>>> print(f"Per-class thresholds: {result.thresholds}")
>>> print(f"Task inferred as: {result.task.value}")

>>> # Explicit control when needed
>>> result = optimize_thresholds(
...     y_true, y_probs,
...     metric="precision",
...     task=Task.MULTICLASS,
...     average=Average.MACRO
... )
- optimal_cutoffs.api.optimize_decisions(y_prob: ArrayLike, cost_matrix: ArrayLike, **kwargs) -> OptimizationResult
Find optimal decisions using cost matrix (no thresholds).
For problems where thresholds aren’t the right abstraction. Uses Bayes-optimal decision rule: argmin_action E[cost | probabilities].
- Parameters:
y_prob – Predicted probabilities (n_samples, n_classes)
cost_matrix – Cost matrix of shape (n_classes, n_actions) or (n_classes, n_classes), where cost_matrix[i, j] is the cost of predicting action j when the true class is i
**kwargs – Additional keyword arguments passed to the Bayes optimal decision function.
- Returns:
Result with .predict() function (no .thresholds)
- Return type:
OptimizationResult
Examples
>>> # Cost matrix: rows=true class, cols=predicted class
>>> costs = [[0, 1, 10], [5, 0, 1], [50, 10, 0]]  # e.g. predicting class 0 when the true class is 2 costs 50
>>> result = optimize_decisions(y_probs, costs)
>>> y_pred = result.predict(y_probs_test)
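The decision rule argmin_action E[cost | probabilities] can be sketched in plain Python. This is only an illustration of the rule stated above, not the library's implementation; the helper name `bayes_decisions` and the sample probabilities are hypothetical.

```python
def bayes_decisions(y_prob, cost_matrix):
    """Pick, for each sample, the action with the lowest expected cost.

    Hypothetical helper: cost_matrix[i][j] is the cost of action j
    when the true class is i, as in the parameter description.
    """
    n_actions = len(cost_matrix[0])
    decisions = []
    for probs in y_prob:
        # Expected cost of action j = sum_i P(class i) * cost_matrix[i][j]
        expected = [
            sum(p * cost_matrix[i][j] for i, p in enumerate(probs))
            for j in range(n_actions)
        ]
        decisions.append(min(range(n_actions), key=expected.__getitem__))
    return decisions


costs = [[0, 1, 10], [5, 0, 1], [50, 10, 0]]
y_prob = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]  # made-up probabilities
decisions = bayes_decisions(y_prob, costs)
print(decisions)
```

For the first sample the most probable class is 0, yet the rule picks action 1: the small chance that the true class is 2 makes action 0 expensive, because that mistake costs 50.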
Binary Classification
- optimal_cutoffs.binary.optimize_f1_binary(true_labels: ArrayLike, pred_proba: ArrayLike, *, beta: float = 1.0, sample_weight: ArrayLike | None = None, comparison: str = '>') -> OptimizationResult
Optimize F-beta score for binary classification using sort-and-scan.
Uses the O(n log n) sort-and-scan algorithm, which exploits the piecewise-constant structure of F-beta metrics: the score can only change when the threshold crosses one of the predicted values. This finds the exact optimal threshold.
- Parameters:
true_labels – True binary labels in {0, 1}. Shape: (n_samples,)
pred_proba – Predicted probabilities for positive class in [0, 1]. Shape: (n_samples,)
beta – F-beta parameter. beta=1 gives F1 score
sample_weight – Sample weights. Shape: (n_samples,)
comparison – Comparison operator for threshold. Must be “>” or “>=”
- Returns:
Result with optimal threshold, F-beta score, and predict function
- Return type:
OptimizationResult
Examples
>>> y_true = [0, 1, 1, 0, 1]
>>> y_prob = [0.2, 0.8, 0.7, 0.3, 0.9]
>>> result = optimize_f1_binary(y_true, y_prob)
>>> result.threshold  # any cut between 0.3 and 0.7 separates the classes
0.5
>>> result.score  # F1 score at optimal threshold
1.0
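A minimal pure-Python sketch of the sort-and-scan idea, for illustration only. It assumes unweighted samples, the strict rule `score > threshold`, non-empty input, and Python 3.9+ (for `math.nextafter`). The function name `sort_scan_fbeta` and the data are hypothetical, and the library's own kernel may differ in tie handling and threshold placement.

```python
import math


def sort_scan_fbeta(y_true, scores, beta=1.0):
    """Return (threshold, score) maximizing F-beta under `score > threshold`."""
    # Sort sample indices by descending score: the O(n log n) step.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(y_true)
    b2 = beta * beta
    # Cut 0: threshold at the maximum score predicts nothing positive, F-beta = 0.
    best_thr, best_score = scores[order[0]], 0.0
    tp = fp = 0
    for rank, i in enumerate(order):
        # Moving the cut past sample i turns it into a predicted positive.
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        last = rank + 1 == len(order)
        nxt = None if last else scores[order[rank + 1]]
        if nxt == scores[i]:
            continue  # tied scores cannot be split by a threshold
        fn = total_pos - tp
        score = (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)
        if score > best_score:
            # Midpoint between neighbours, or just below the minimum score
            # when every sample is predicted positive.
            thr = math.nextafter(scores[i], -math.inf) if last else (scores[i] + nxt) / 2
            best_thr, best_score = thr, score
    return best_thr, best_score


y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.6, 0.7]  # made-up data
threshold, score = sort_scan_fbeta(y_true, scores)
print(threshold, score)
```

The single pass updates TP and FP incrementally, so every candidate cut is evaluated in O(1) after the sort.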
- optimal_cutoffs.binary.optimize_metric_binary(true_labels: ArrayLike, pred_proba: ArrayLike, *, metric: str = 'f1', method: str = 'auto', sample_weight: ArrayLike | None = None, comparison: str = '>', tolerance: float = 1e-10) -> OptimizationResult
General binary metric optimization with automatic method selection.
Automatically selects the best optimization algorithm based on metric properties and data characteristics.
- Parameters:
true_labels – True binary labels in {0, 1}. Shape: (n_samples,)
pred_proba – Predicted probabilities for positive class in [0, 1]. Shape: (n_samples,)
metric – Metric to optimize (“f1”, “precision”, “recall”, “accuracy”, etc.)
method – Optimization method:
- “auto”: Automatically select best method
- “sort_scan”: O(n log n) sort-and-scan (exact for piecewise metrics)
- “minimize”: Scipy optimization
- “gradient”: Simple gradient ascent
sample_weight – Sample weights. Shape: (n_samples,)
comparison – Comparison operator for threshold. Must be “>” or “>=”
tolerance – Numerical tolerance for optimization
- Returns:
Result with optimal threshold, metric score, and predict function
- Return type:
OptimizationResult
- Raises:
ValueError – If method is unknown or not supported.
Examples
>>> result = optimize_metric_binary(y_true, y_prob, metric="precision")
>>> result = optimize_metric_binary(y_true, y_prob, metric="f1", method="sort_scan")
- optimal_cutoffs.binary.optimize_utility_binary(true_labels: ArrayLike | None, pred_proba: ArrayLike, *, utility: dict[str, float], sample_weight: ArrayLike | None = None) -> OptimizationResult
Optimize binary classification using utility/cost specification.
Computes the Bayes-optimal threshold using the closed-form formula: τ* = (u_tn - u_fp) / [(u_tp - u_fn) + (u_tn - u_fp)]
This is exact and runs in O(1) time.
- Parameters:
true_labels – True binary labels. Can be None for pure Bayes optimization. Shape: (n_samples,)
pred_proba – Predicted probabilities for positive class in [0, 1]. Shape: (n_samples,)
utility – Utility specification with keys “tp”, “tn”, “fp”, “fn”
sample_weight – Sample weights (affects expected utility computation). Shape: (n_samples,)
- Returns:
Result with optimal threshold, expected utility, and predict function
- Return type:
OptimizationResult
- Raises:
ValueError – If probabilities are not in the range [0, 1] for utility optimization.
Examples
>>> # FN costs 5x more than FP
>>> utility = {"tp": 10, "tn": 1, "fp": -1, "fn": -5}
>>> result = optimize_utility_binary(None, y_prob, utility=utility)
>>> result.threshold  # Closed-form optimal: (1 - (-1)) / [(10 - (-5)) + (1 - (-1))] = 2/17
0.118
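The closed-form formula τ* = (u_tn - u_fp) / [(u_tp - u_fn) + (u_tn - u_fp)] can be evaluated directly. The sketch below is an illustration under that formula only; the helper name `utility_threshold` is hypothetical, and the cost-only utilities are chosen so that correct decisions carry no reward.

```python
def utility_threshold(utility):
    """Evaluate (u_tn - u_fp) / [(u_tp - u_fn) + (u_tn - u_fp)]."""
    gain_neg = utility["tn"] - utility["fp"]  # value of rejecting a true negative
    gain_pos = utility["tp"] - utility["fn"]  # value of accepting a true positive
    return gain_neg / (gain_pos + gain_neg)


# Cost-only case: a false negative is 5x as costly as a false positive.
cost_only = {"tp": 0.0, "tn": 0.0, "fp": -1.0, "fn": -5.0}
tau = utility_threshold(cost_only)
print(round(tau, 3))  # 1 / (5 + 1)
```

With zero utility for correct decisions the formula reduces to cost_fp / (cost_fp + cost_fn), so a costlier false negative pushes the threshold below 0.5.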
Multiclass Classification
- optimal_cutoffs.multiclass.optimize_multiclass(true_labels: ArrayLike, pred_proba: ArrayLike, *, metric: str = 'f1', average: str = 'macro', method: str = 'auto', sample_weight: ArrayLike | None = None, comparison: str = '>', tolerance: float = 1e-10) -> OptimizationResult
General multiclass threshold optimization with automatic method selection.
Routes to appropriate algorithm based on averaging strategy and method:
Macro + auto/coord_ascent: Margin rule with coordinate ascent (single-label)
Macro + independent: Independent OvR optimization (can predict multiple)
Micro: Single threshold optimization (single-label)
- Parameters:
true_labels (array-like of shape (n_samples,)) – True class labels in {0, 1, …, K-1}
pred_proba (array-like of shape (n_samples, n_classes)) – Predicted probabilities for each class
metric (str, default="f1") – Metric to optimize
average ({"macro", "micro"}, default="macro") – Averaging strategy
method ({"auto", "coord_ascent", "independent"}, default="auto") – Optimization method:
- “auto”: For macro, uses coord_ascent (margin rule)
- “coord_ascent”: Margin rule with coordinate ascent
- “independent”: Independent per-class optimization (OvR)
sample_weight (array-like of shape (n_samples,), optional) – Sample weights
comparison (str, default=">") – Comparison operator
tolerance (float, default=1e-10) – Numerical tolerance
- Returns:
Result with optimal thresholds and prediction function
- Return type:
OptimizationResult
Examples
>>> # Margin rule (single-label, coordinate ascent)
>>> result = optimize_multiclass(y_true, y_prob, method="coord_ascent")
>>>
>>> # Independent optimization (can predict multiple classes)
>>> result = optimize_multiclass(y_true, y_prob, method="independent")
>>>
>>> # Micro averaging (single threshold)
>>> result = optimize_multiclass(y_true, y_prob, average="micro")
- optimal_cutoffs.multiclass.optimize_ovr_independent(true_labels: ArrayLike, pred_proba: ArrayLike, *, metric: str = 'f1', method: str = 'auto', sample_weight: ArrayLike | None = None, comparison: str = '>', tolerance: float = 1e-10) -> OptimizationResult
Optimize multiclass metrics using independent per-class thresholds (OvR).
Treats each class as an independent binary problem (class vs rest). This does NOT enforce single-label predictions - can predict 0, 1, or multiple classes. Use this for macro-averaged metrics when you want exact optimization per class.
Decision rule: ŷ_j = 1 if p_j ≥ τ_j (independent for each class)
- Parameters:
true_labels (array-like of shape (n_samples,)) – True class labels in {0, 1, …, K-1}
pred_proba (array-like of shape (n_samples, n_classes)) – Predicted probabilities for each class
metric (str, default="f1") – Metric to optimize per class
method (str, default="auto") – Binary optimization method
sample_weight (array-like of shape (n_samples,), optional) – Sample weights
comparison (str, default=">") – Comparison operator
tolerance (float, default=1e-10) – Numerical tolerance
- Returns:
Result with per-class thresholds optimized independently
- Return type:
OptimizationResult
Examples
>>> y_true = [0, 1, 2, 0, 1]
>>> y_prob = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8], ...]
>>> result = optimize_ovr_independent(y_true, y_prob, metric="f1")
>>> predictions = result.predict(y_prob)  # Can predict multiple classes
- optimal_cutoffs.multiclass.optimize_ovr_margin(true_labels: ArrayLike, pred_proba: ArrayLike, *, metric: str = 'f1', max_iter: int = 30, sample_weight: ArrayLike | None = None, comparison: str = '>', tolerance: float = 1e-12) -> OptimizationResult
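To make the independent decision rule concrete, the sketch below applies one threshold per class and returns a 0/1 flag for every class. It only illustrates the rule; the helper name `predict_independent`, the thresholds, and the probabilities are hypothetical, and the format returned by the library's `.predict()` may differ.

```python
def predict_independent(y_prob, thresholds, comparison=">"):
    """Flag each class independently; a row may contain 0, 1, or several ones."""
    rows = []
    for probs in y_prob:
        if comparison == ">=":
            rows.append([int(p >= t) for p, t in zip(probs, thresholds)])
        else:
            rows.append([int(p > t) for p, t in zip(probs, thresholds)])
    return rows


thresholds = [0.5, 0.4, 0.5]                    # made-up per-class cutoffs
y_prob = [[0.6, 0.5, 0.1], [0.1, 0.2, 0.3]]      # made-up scores
flags = predict_independent(y_prob, thresholds)
print(flags)
```

The first row flags two classes and the second flags none, which is the behaviour described above: nothing forces exactly one class per sample.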
Optimize multiclass metrics using margin rule with coordinate ascent.
Uses margin-based prediction: ŷ = argmax_j (p_j - τ_j) This ensures exactly one class is predicted per sample (single-label).
Thresholds are coupled because changing τ_j affects which samples are assigned to class j, which affects confusion matrices for all classes. Uses coordinate ascent to find local optimum.
- Parameters:
true_labels (array-like of shape (n_samples,)) – True class labels in {0, 1, …, K-1}
pred_proba (array-like of shape (n_samples, n_classes)) – Predicted probabilities for each class
metric (str, default="f1") – Metric to optimize (currently supports “f1” only)
max_iter (int, default=30) – Maximum coordinate ascent iterations
sample_weight (array-like of shape (n_samples,), optional) – Sample weights
comparison (str, default=">") – Comparison operator (only “>” supported for margin rule)
tolerance (float, default=1e-12) – Convergence tolerance
- Returns:
Result with per-class thresholds optimized via coordinate ascent
- Return type:
OptimizationResult
Examples
>>> result = optimize_ovr_margin(y_true, y_prob, metric="f1")
>>> predictions = result.predict(y_prob)  # Exactly one class per sample
Notes
The margin rule is Bayes-optimal when costs have OvR structure: C(i,j) = -r_j if i=j, else c_j
In this case, optimal thresholds are: τ_j = c_j/(c_j + r_j) (closed form!)
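The margin rule ŷ = argmax_j (p_j - τ_j) is easy to sketch. The helper name `predict_margin` and the numbers are hypothetical; ties are broken here in favour of the lowest class index, which is an assumption rather than documented behaviour.

```python
def predict_margin(y_prob, thresholds):
    """Return one class index per sample: the class with the largest p_j - tau_j."""
    preds = []
    for probs in y_prob:
        margins = [p - t for p, t in zip(probs, thresholds)]
        preds.append(max(range(len(margins)), key=margins.__getitem__))
    return preds


y_prob = [[0.5, 0.4, 0.1]]                          # made-up probabilities
plain = predict_margin(y_prob, [0.0, 0.0, 0.0])      # zero thresholds: ordinary argmax
shifted = predict_margin(y_prob, [0.4, 0.1, 0.0])    # raising tau_0 hands the sample to class 1
print(plain, shifted)
```

Raising τ_0 moves the sample from class 0 to class 1, which changes the confusion counts of both classes at once. This is the coupling that motivates coordinate ascent.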
- optimal_cutoffs.multiclass.optimize_micro_multiclass(true_labels: ArrayLike, pred_proba: ArrayLike, *, metric: str = 'f1', method: str = 'auto', sample_weight: ArrayLike | None = None, comparison: str = '>', tolerance: float = 1e-10) -> OptimizationResult
Optimize micro-averaged multiclass metrics using single threshold.
For micro averaging, we use a single threshold applied to all classes, then predict the class with highest valid probability. This reduces to a single binary optimization problem on flattened data.
Decision rule: ŷ = argmax{j: p_j ≥ τ} p_j (or argmax p_j if none valid)
- Parameters:
true_labels (array-like of shape (n_samples,)) – True class labels in {0, 1, …, K-1}
pred_proba (array-like of shape (n_samples, n_classes)) – Predicted probabilities for each class
metric (str, default="f1") – Metric to optimize
method (str, default="auto") – Binary optimization method
sample_weight (array-like of shape (n_samples,), optional) – Sample weights
comparison (str, default=">") – Comparison operator
tolerance (float, default=1e-10) – Numerical tolerance
- Returns:
Result with single threshold applied to all classes
- Return type:
OptimizationResult
Examples
>>> result = optimize_micro_multiclass(y_true, y_prob, metric="f1")
>>> result.thresholds  # Same threshold for all classes
[0.3, 0.3, 0.3]
Multilabel Classification
- optimal_cutoffs.multilabel.optimize_multilabel(true_labels: ArrayLike, pred_proba: ArrayLike, *, metric: str = 'f1', average: str = 'macro', method: str = 'auto', sample_weight: ArrayLike | None = None, comparison: str = '>', tolerance: float = 1e-10) -> OptimizationResult
General multi-label threshold optimization with automatic method selection.
Routes to appropriate algorithm based on averaging strategy:
- Macro: Independent optimization per label (exact, O(K·n log n))
- Micro: Coordinate ascent for coupled thresholds (local optimum)
- Parameters:
true_labels (array-like of shape (n_samples, n_labels)) – True multi-label binary matrix
pred_proba (array-like of shape (n_samples, n_labels)) – Predicted probabilities for each label
metric (str, default="f1") – Metric to optimize
average ({"macro", "micro"}, default="macro") – Averaging strategy
method (str, default="auto") – Optimization method (passed to binary optimizer for macro)
sample_weight (array-like of shape (n_samples,), optional) – Sample weights
comparison (str, default=">") – Comparison operator
tolerance (float, default=1e-10) – Numerical tolerance
- Returns:
Result with optimal thresholds and metric score
- Return type:
OptimizationResult
Examples
>>> # Independent per-label optimization
>>> result = optimize_multilabel(y_true, y_prob, average="macro")
>>>
>>> # Coupled optimization for global metric
>>> result = optimize_multilabel(y_true, y_prob, average="micro")
- optimal_cutoffs.multilabel.optimize_macro_multilabel(true_labels: ArrayLike, pred_proba: ArrayLike, *, metric: str = 'f1', method: str = 'auto', sample_weight: ArrayLike | None = None, comparison: str = '>', tolerance: float = 1e-10) -> OptimizationResult
Optimize macro-averaged metrics for multi-label classification.
For macro averaging, each label is optimized independently: Macro-F1 = (1/K) Σ_j F1_j(τ_j)
Since each F1_j depends only on τ_j, we can optimize each threshold independently using binary optimization. This is exact and efficient.
- Parameters:
true_labels (array-like of shape (n_samples, n_labels)) – True multi-label binary matrix
pred_proba (array-like of shape (n_samples, n_labels)) – Predicted probabilities for each label
metric (str, default="f1") – Metric to optimize per label (“f1”, “precision”, “recall”)
method (str, default="auto") – Binary optimization method for each label
sample_weight (array-like of shape (n_samples,), optional) – Sample weights
comparison (str, default=">") – Comparison operator
tolerance (float, default=1e-10) – Numerical tolerance
- Returns:
Result with per-label thresholds and macro-averaged score
- Return type:
OptimizationResult
Examples
>>> # 3 independent labels
>>> y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 1]]
>>> y_prob = [[0.8, 0.2, 0.9], [0.1, 0.7, 0.3], [0.9, 0.8, 0.7]]
>>> result = optimize_macro_multilabel(y_true, y_prob, metric="f1")
>>> len(result.thresholds)  # One per label
3
- optimal_cutoffs.multilabel.optimize_micro_multilabel(true_labels: ArrayLike, pred_proba: ArrayLike, *, metric: str = 'f1', max_iter: int = 30, sample_weight: ArrayLike | None = None, comparison: str = '>', tolerance: float = 1e-12) -> OptimizationResult
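The separability claim, Macro-F1 = (1/K) Σ_j F1_j(τ_j), can be checked numerically with a small sketch. The helper names `label_f1` and `macro_f1` are hypothetical; the sketch assumes unweighted samples, the rule `p > τ`, and F1 = 0 when a label has no positives and no predictions.

```python
def label_f1(y_true_col, y_prob_col, tau):
    """F1 for a single label under the rule `p > tau`."""
    tp = sum(1 for y, p in zip(y_true_col, y_prob_col) if y == 1 and p > tau)
    fp = sum(1 for y, p in zip(y_true_col, y_prob_col) if y == 0 and p > tau)
    fn = sum(1 for y, p in zip(y_true_col, y_prob_col) if y == 1 and not p > tau)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_prob, thresholds):
    """Return (macro average, per-label F1 list)."""
    per_label = [
        label_f1([row[j] for row in y_true], [row[j] for row in y_prob], tau)
        for j, tau in enumerate(thresholds)
    ]
    return sum(per_label) / len(per_label), per_label


y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 1]]
y_prob = [[0.8, 0.2, 0.9], [0.1, 0.7, 0.3], [0.9, 0.8, 0.7]]
base, base_parts = macro_f1(y_true, y_prob, [0.5, 0.5, 0.5])
moved, moved_parts = macro_f1(y_true, y_prob, [0.85, 0.5, 0.5])
print(base_parts, moved_parts)
```

Changing only τ_0 changes only the first per-label score; the other two stay put, which is why each threshold can be optimized on its own.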
Optimize micro-averaged metrics for multi-label classification.
For micro averaging, thresholds are coupled through global TP/FP/FN: Micro-F1 = 2·TP_total / (2·TP_total + FP_total + FN_total)
where TP_total = Σ_j TP_j(τ_j). Changing any τ_j affects the global metric, so we use coordinate ascent to optimize the coupled problem.
- Parameters:
true_labels (array-like of shape (n_samples, n_labels)) – True multi-label binary matrix
pred_proba (array-like of shape (n_samples, n_labels)) – Predicted probabilities for each label
metric (str, default="f1") – Metric to optimize (“f1”, “precision”, “recall”)
max_iter (int, default=30) – Maximum coordinate ascent iterations
sample_weight (array-like of shape (n_samples,), optional) – Sample weights
comparison (str, default=">") – Comparison operator
tolerance (float, default=1e-12) – Convergence tolerance
- Returns:
Result with per-label thresholds optimized for micro averaging
- Return type:
OptimizationResult
Examples
>>> result = optimize_micro_multilabel(y_true, y_prob, metric="f1")
>>> # Thresholds are coupled - changing one affects global metric
Bayes-Optimal Decisions
- optimal_cutoffs.bayes.threshold(cost_fp: float, cost_fn: float, *, prior: float | None = None) -> float
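A small sketch of the pooled computation Micro-F1 = 2·TP_total / (2·TP_total + FP_total + FN_total), shown only to illustrate why the thresholds are coupled. The helper name `micro_f1` and the data are hypothetical; it assumes unweighted samples and the rule `p > τ`.

```python
def micro_f1(y_true, y_prob, thresholds):
    """Pool TP/FP/FN over all labels before computing F1."""
    tp = fp = fn = 0
    for true_row, prob_row in zip(y_true, y_prob):
        for y, p, tau in zip(true_row, prob_row, thresholds):
            pred = p > tau
            tp += int(pred and y == 1)
            fp += int(pred and y == 0)
            fn += int((not pred) and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 1]]
y_prob = [[0.8, 0.2, 0.9], [0.1, 0.7, 0.3], [0.9, 0.8, 0.7]]
base = micro_f1(y_true, y_prob, [0.5, 0.5, 0.5])
moved = micro_f1(y_true, y_prob, [0.85, 0.5, 0.5])
print(base, moved)
```

Because every label feeds the same numerator and denominator, moving one threshold shifts the single global score, so the best value for τ_j depends on where the other thresholds currently sit.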
Compute binary Bayes-optimal threshold from costs.
- Parameters:
cost_fp (float) – Cost of a false positive
cost_fn (float) – Cost of a false negative
prior (float, optional) – Optional prior (default: None)
- Returns:
Optimal threshold
- Return type:
float
Examples
>>> # FN costs 5x more than FP
>>> t = threshold(cost_fp=1.0, cost_fn=5.0)
>>> # Will be < 0.5 (a lower cutoff flags more positives, avoiding costly false negatives)
- optimal_cutoffs.bayes.thresholds_from_costs(fp_costs: ndarray[tuple[Any, ...], dtype[_ScalarT]] | list[float], fn_costs: ndarray[tuple[Any, ...], dtype[_ScalarT]] | list[float], **kwargs) -> ndarray
Compute per-class Bayes-optimal thresholds from OvR costs.
- Parameters:
fp_costs (array-like) – False positive costs per class
fn_costs (array-like) – False negative costs per class
- Returns:
Per-class optimal thresholds
- Return type:
np.ndarray
Examples
>>> # Different costs per class
>>> fp_costs = [1.0, 2.0, 0.5]  # Class 1 FP costs 2x more
>>> fn_costs = [5.0, 1.0, 10.0]  # Class 2 FN costs 10x more
>>> thresholds = thresholds_from_costs(fp_costs, fn_costs)
- optimal_cutoffs.bayes.policy(cost_matrix: ndarray[tuple[Any, ...], dtype[_ScalarT]]) -> OptimizationResult
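As a rough illustration, the sketch below applies the standard cost-ratio rule τ_j = c_fp,j / (c_fp,j + c_fn,j) to each class. Treating this as what `thresholds_from_costs` computes is an assumption: the rule follows from the binary closed form when correct decisions have zero utility, and the extra keyword arguments the library accepts may change the result. The helper name `ovr_cost_thresholds` is hypothetical.

```python
def ovr_cost_thresholds(fp_costs, fn_costs):
    """Per-class cost-ratio thresholds (assumed rule, not the library's code)."""
    return [c_fp / (c_fp + c_fn) for c_fp, c_fn in zip(fp_costs, fn_costs)]


fp_costs = [1.0, 2.0, 0.5]
fn_costs = [5.0, 1.0, 10.0]
taus = ovr_cost_thresholds(fp_costs, fn_costs)
print([round(t, 3) for t in taus])
```

Classes whose false negatives are expensive relative to false positives get low thresholds, and the reverse holds for classes with expensive false positives.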
Create Bayes-optimal decision policy from cost matrix.
This is for general decision making where thresholds aren’t the right abstraction.
- Parameters:
cost_matrix (array-like) – Cost matrix of shape (n_classes, n_actions), where cost_matrix[i, j] is the cost of taking action j when the true class is i
- Returns:
Policy with .predict() method (no .thresholds)
- Return type:
OptimizationResult
Examples
>>> costs = [[0, 1, 10], [5, 0, 1], [50, 10, 0]]
>>> decision_policy = policy(costs)  # a separate name avoids rebinding the function `policy`
>>> decisions = decision_policy.predict(probabilities)
Internal Functions
These functions are used internally but may be useful for advanced users:
Optimized O(n log n) sort-and-scan kernel for piecewise-constant metrics.
This module provides an exact optimizer for binary classification metrics that are piecewise-constant with respect to the decision threshold. The algorithm sorts predictions once and scans all n cuts in a single pass, achieving true O(n log n) complexity with vectorized operations.
- Notes on require_proba:
If require_proba=True, inputs are validated to lie in [0, 1].
The returned threshold is usually in [0, 1]; however, in boundary or tie cases, we may nudge it by one floating-point ULP beyond the range to correctly realize strict inclusivity/exclusivity (e.g., to ensure “predict none” with ‘>=’ when max p == 1.0).
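The one-ULP nudge can be reproduced with the standard library. This sketch only demonstrates the floating-point idea for the “predict none” case with ‘>=’; it uses `math.nextafter` (Python 3.9+) and made-up probabilities, and is not taken from the module's source.

```python
import math

probs = [0.2, 0.7, 1.0]  # made-up scores with max p == 1.0

# With ">=" and a threshold of exactly 1.0, the top-scoring sample is still positive.
at_one = [p >= 1.0 for p in probs]

# Nudging the threshold one ULP above 1.0 realizes "predict none".
nudged = math.nextafter(1.0, math.inf)
above_one = [p >= nudged for p in probs]

print(at_one, above_one, nudged > 1.0)
```

The nudged value is the smallest double greater than 1.0, so it leaves the valid probability range by the minimum amount needed.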
- optimal_cutoffs.piecewise.optimal_threshold_sortscan(y_true: ndarray[Any, Any], pred_prob: ndarray[Any, Any], metric: str | Callable[[ndarray[Any, Any], ndarray[Any, Any], ndarray[Any, Any], ndarray[Any, Any]], ndarray[Any, Any]], *, sample_weight: ndarray[Any, Any] | None = None, inclusive: bool = False, require_proba: bool = True, tolerance: float = 1e-10) -> OptimizationResult
Exact optimizer for piecewise-constant metrics using O(n log n) sort-and-scan.
- Parameters:
y_true (array-like of shape (n_samples,)) – Binary labels in {0, 1}.
pred_prob (array-like of shape (n_samples,)) – Predicted probabilities in [0, 1] or arbitrary scores if require_proba=False.
metric (str or callable) – Metric name (e.g., “f1”, “precision”) or vectorized function. If string, automatically resolves to vectorized implementation. If callable: (tp_vec, tn_vec, fp_vec, fn_vec) -> score_vec.
sample_weight (array-like, optional) – Non-negative sample weights of shape (n_samples,).
inclusive (bool, default=False) – If True, use “>=”; if False, use “>”.
require_proba (bool, default=True) – Validate inputs in [0, 1]. Threshold may be nudged by ±1 ULP outside [0,1] to exactly realize inclusivity/exclusivity in boundary/tie cases.
tolerance (float, default=1e-10) – Numerical tolerance for floating-point comparisons when computing threshold midpoints and handling ties between scores.
- Returns:
thresholds : array([optimal_threshold])
scores : array([achieved_score])
predict : callable(probs) -> {0,1}^n
metric : str, set to “piecewise_metric”
n_classes : 2
diagnostics : dict with keys:
k_argmax: theoretical best cut index (0..n) from the sweep
k_realized: positives realized by the returned threshold
score_theoretical: score at k_argmax
score_actual: score achieved by the returned threshold
tie_discrepancy: abs(theoretical - actual)
inclusive: bool
require_proba: bool
- Return type:
OptimizationResult
Unified threshold optimization for binary and multiclass classification.
This module consolidates all threshold optimization functionality into a single, streamlined interface. It includes high-performance Numba kernels, multiple optimization algorithms, and support for both binary and multiclass problems.
Key features:
- Fast Numba kernels with Python fallbacks
- Binary and multiclass threshold optimization
- Multiple algorithms: sort-scan, scipy, gradient, coordinate ascent
- Sample weight support (including in coordinate ascent)
- Direct functional API without over-engineered abstractions
- optimal_cutoffs.optimize.fast_f1_score(tp: float, tn: float, fp: float, fn: float) -> float
Compute F1 score from confusion matrix.
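For reference, F1 from confusion counts is 2·TP / (2·TP + FP + FN). The sketch below is a hypothetical stand-in (`f1_from_counts`), not the library's kernel; returning 0.0 when the denominator is zero is an assumed convention.

```python
def f1_from_counts(tp, tn, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); tn is accepted but does not affect F1."""
    denom = 2.0 * tp + fp + fn
    return 2.0 * tp / denom if denom > 0 else 0.0


print(f1_from_counts(tp=3, tn=5, fp=1, fn=1))
```

True negatives never enter the formula, which is why F1 is insensitive to how many easy negatives a dataset contains.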
- optimal_cutoffs.optimize.compute_confusion_matrix_weighted(labels: ndarray, predictions: ndarray, weights: ndarray | None) -> tuple[float, float, float, float]
Compute weighted confusion matrix elements (serial, race-free).
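A plain-Python sketch of what a weighted confusion count looks like: each sample adds its weight, rather than 1, to the matching cell. The helper name `weighted_confusion` is hypothetical, and the (tp, tn, fp, fn) return order is an assumption borrowed from the argument order of `fast_f1_score`.

```python
def weighted_confusion(labels, predictions, weights=None):
    """Return (tp, tn, fp, fn), each a sum of sample weights."""
    if weights is None:
        weights = [1.0] * len(labels)  # uniform weights
    tp = tn = fp = fn = 0.0
    for y, y_hat, w in zip(labels, predictions, weights):
        if y == 1 and y_hat == 1:
            tp += w
        elif y == 0 and y_hat == 0:
            tn += w
        elif y == 0 and y_hat == 1:
            fp += w
        else:
            fn += w
    return tp, tn, fp, fn


counts = weighted_confusion([1, 0, 1, 0], [1, 1, 0, 0], [2.0, 1.0, 0.5, 3.0])
print(counts)
```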
- optimal_cutoffs.optimize.sort_scan_kernel(labels: ndarray, scores: ndarray, weights: ndarray, inclusive: bool) -> tuple[float, float]
Python fallback for sort_scan_kernel.
Note: weights must be a valid array (use np.ones for uniform weights).
- optimal_cutoffs.optimize.coordinate_ascent_kernel(y_true: ndarray, probs: ndarray, weights: ndarray | None, max_iter: int, tol: float) -> tuple[ndarray, float, ndarray]
- optimal_cutoffs.optimize.optimize_sort_scan(labels: ndarray, scores: ndarray, metric: str, weights: ndarray | None = None, operator: str = '>=') -> OptimizationResult
Sort-and-scan optimization for piecewise-constant metrics.
- optimal_cutoffs.optimize.optimize_scipy(labels: ndarray, scores: ndarray, metric: str, weights: ndarray | None = None, operator: str = '>=', method: str = 'bounded', tol: float = 1e-06) -> OptimizationResult
Scipy-based optimization for smooth metrics.
- optimal_cutoffs.optimize.optimize_gradient(labels: ndarray, scores: ndarray, metric: str, weights: ndarray | None = None, operator: str = '>=', learning_rate: float = 0.01, max_iter: int = 100, tol: float = 1e-06) -> OptimizationResult
Simple gradient ascent optimization (use for smooth metrics).
- optimal_cutoffs.optimize.find_optimal_threshold_multiclass(true_labs: ndarray, pred_prob: ndarray, metric: str = 'f1', method: str = 'auto', average: str = 'macro', sample_weight: ndarray | None = None, comparison: str = '>', tolerance: float = 1e-10) -> OptimizationResult
Find optimal per-class thresholds for multiclass classification.
- optimal_cutoffs.optimize.find_optimal_threshold(labels: ndarray, scores: ndarray, metric: str = 'f1', weights: ndarray | None = None, strategy: str = 'auto', operator: str = '>=', require_probability: bool = True, tolerance: float = 1e-10) -> OptimizationResult
Simple functional interface for binary threshold optimization.