Utility Functions
This module provides utility functions for data validation and array operations.
Data Validation
- calibre.utils.check_arrays(X: ndarray, y: ndarray)
Check and validate input arrays for calibration.
This function ensures that X and y are valid numpy arrays with compatible shapes and no invalid values.
- Parameters:
X – The input predictions/probabilities.
y – The target values/labels.
- Returns:
tuple[np.ndarray, np.ndarray] – Tuple of (validated_X, validated_y).
- Raises:
ValueError – If arrays are empty or have incompatible lengths.
Examples
>>> import numpy as np
>>> from calibre.utils.validation import check_arrays
>>>
>>> X = np.array([0.1, 0.2, 0.3])
>>> y = np.array([0, 1, 1])
>>> X_checked, y_checked = check_arrays(X, y)
>>> print(X_checked.shape, y_checked.shape)
(3,) (3,)
- calibre.utils.check_array_1d(X: ndarray, name: str = 'X')
Check that an array is 1-dimensional.
- Parameters:
X – The array to check.
name – Name of the array for error messages.
- Returns:
ndarray – Validated 1D array.
- Raises:
ValueError – If array is not 1-dimensional or is empty.
Examples
>>> import numpy as np
>>> from calibre.utils.validation import check_array_1d
>>>
>>> X = np.array([0.1, 0.2, 0.3])
>>> X_checked = check_array_1d(X)
>>> print(X_checked.shape)
(3,)
- calibre.utils.check_consistent_length(*arrays: ndarray)
Check that all arrays have the same length along their first dimension.
- Parameters:
*arrays – Arrays to check for consistent length.
- Raises:
ValueError – If arrays have inconsistent lengths.
- Return type:
None
Examples
>>> import numpy as np
>>> from calibre.utils.validation import check_consistent_length
>>>
>>> X = np.array([0.1, 0.2, 0.3])
>>> y = np.array([0, 1, 1])
>>> check_consistent_length(X, y)  # No error
>>>
>>> z = np.array([0, 1])  # Different length
>>> try:
...     check_consistent_length(X, z)
... except ValueError as e:
...     print("Error:", e)
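The checks described in this section can be approximated with plain NumPy. The following is a minimal sketch under stated assumptions, not calibre's actual implementation: the helper name `validate_pair` and its error messages are hypothetical and chosen only for illustration.

```python
import numpy as np


def validate_pair(X, y):
    """Hypothetical stand-in for the kind of checks check_arrays describes."""
    # Convert inputs to arrays (at least 1-D) so lists and tuples are accepted.
    X = np.atleast_1d(np.asarray(X))
    y = np.atleast_1d(np.asarray(y))

    # Reject empty inputs.
    if X.size == 0 or y.size == 0:
        raise ValueError("Input arrays must not be empty.")

    # Compare first dimensions only, as documented for check_consistent_length.
    if X.shape[0] != y.shape[0]:
        raise ValueError(
            f"Inconsistent lengths: X has {X.shape[0]}, y has {y.shape[0]}."
        )

    return X, y


# Matching lengths pass through unchanged.
X_ok, y_ok = validate_pair([0.1, 0.2, 0.3], [0, 1, 1])

# Mismatched lengths raise ValueError.
try:
    validate_pair([0.1, 0.2, 0.3], [0, 1])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Raising `ValueError` for both failure modes mirrors the Raises entries documented above, so callers can handle either case with a single `except` clause.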
Array Operations
- calibre.utils.sort_by_x(X: ndarray, y: ndarray)
Sort arrays by X values and return sort indices.
- Parameters:
X – Values to sort by.
y – Values to sort along with X.
- Returns:
sort_idx (ndarray of shape (n_samples,)) – Indices that would sort X.
X_sorted (ndarray of shape (n_samples,)) – Sorted X array.
y_sorted (ndarray of shape (n_samples,)) – Sorted y array.
Examples
>>> import numpy as np
>>> from calibre.utils.array_ops import sort_by_x
>>>
>>> X = np.array([0.3, 0.1, 0.2])
>>> y = np.array([1, 0, 0])
>>> idx, X_sorted, y_sorted = sort_by_x(X, y)
>>> print(X_sorted)
[0.1 0.2 0.3]
>>> print(y_sorted)
[0 0 1]
- calibre.utils.clip_to_range(X: ndarray, lower: float = 0.0, upper: float = 1.0)
Clip array values to a specified range.
- Parameters:
X – Array to clip.
lower – Lower bound.
upper – Upper bound.
- Returns:
X_clipped (ndarray) – Clipped array.
Examples
>>> import numpy as np
>>> from calibre.utils.array_ops import clip_to_range
>>>
>>> X = np.array([-0.1, 0.5, 1.2])
>>> X_clipped = clip_to_range(X, 0.0, 1.0)
>>> print(X_clipped)
[0.  0.5 1. ]
- calibre.utils.ensure_1d(X: ndarray)
Ensure array is 1-dimensional by raveling.
- Parameters:
X – Array to ensure is 1D.
- Returns:
X_1d (ndarray of shape (n_samples,)) – 1-dimensional array.
Examples
>>> import numpy as np
>>> from calibre.utils.array_ops import ensure_1d
>>>
>>> X = np.array([[0.1, 0.2, 0.3]])
>>> X_1d = ensure_1d(X)
>>> print(X_1d.shape)
(3,)
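The descriptions of `ensure_1d` ("by raveling") and `clip_to_range` suggest behavior similar to NumPy's `np.ravel` and `np.clip`. The sketch below uses only those NumPy functions and assumes that correspondence; it is not taken from calibre's source. It also illustrates one caveat of raveling: a genuinely two-dimensional array is flattened row by row rather than rejected.

```python
import numpy as np

# A column vector of shape (3, 1) flattens to shape (3,).
column = np.array([[0.1], [0.2], [0.3]])
flat = np.ravel(column)

# Caveat: a 2-D array with several columns is concatenated in row-major order.
grid = np.array([[1, 2, 3], [4, 5, 6]])
flat_grid = np.ravel(grid)  # shape (6,)

# np.clip bounds each element to the closed interval [lower, upper].
clipped = np.clip(np.array([-0.1, 0.5, 1.2]), 0.0, 1.0)

print(flat.shape)
print(flat_grid)
print(clipped)
```

If silently flattening multi-column input is undesirable, validating with `check_array_1d` first would surface the shape problem as a `ValueError` instead.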
Usage Examples
Input Validation
from calibre.utils import check_arrays
import numpy as np
# Valid input
X = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
y = np.array([0, 0, 1, 1, 1])
try:
    X_checked, y_checked = check_arrays(X, y)
    print("Arrays are valid")
except ValueError as e:
    print(f"Validation error: {e}")
Sorting Operations
from calibre.utils import sort_by_x
import numpy as np
# Unsorted data
X = np.array([0.7, 0.1, 0.9, 0.3, 0.5])
y = np.array([1, 0, 1, 0, 1])
# Sort by X values
sort_indices, X_sorted, y_sorted = sort_by_x(X, y)
print(f"Original X: {X}")
print(f"Sorted X: {X_sorted}")
print(f"Sorted y: {y_sorted}")
print(f"Sort indices: {sort_indices}")
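The sort indices are useful when results computed on the sorted data need to be mapped back to the original sample order. The sketch below assumes `sort_by_x` behaves like `np.argsort` followed by indexing (an assumption, since the source is not shown here) and uses only NumPy; the doubling step is a placeholder for any per-sample computation.

```python
import numpy as np

X = np.array([0.7, 0.1, 0.9, 0.3, 0.5])
y = np.array([1, 0, 1, 0, 1])

# Assumed equivalent of sort_by_x: argsort, then index both arrays.
sort_indices = np.argsort(X, kind="stable")
X_sorted = X[sort_indices]
y_sorted = y[sort_indices]

# Placeholder for a computation carried out on the sorted data.
result_sorted = X_sorted * 2.0

# Scatter the results back to the original sample order.
result_original = np.empty_like(result_sorted)
result_original[sort_indices] = result_sorted

print(result_original)
```

Assigning through `sort_indices` inverts the permutation without computing a second argsort.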
Array Processing
from calibre.utils import ensure_1d, clip_to_range
import numpy as np
# Ensure array is 1D
arr_2d = np.array([[1], [2], [3]])
arr_1d = ensure_1d(arr_2d)
print(f"1D array: {arr_1d}")
# Clip values to valid range
values = np.array([-0.1, 0.5, 1.2])
clipped = clip_to_range(values, 0.0, 1.0)
print(f"Clipped: {clipped}")
Note
These utility functions are primarily for internal use within calibration algorithms. For typical calibration workflows, use the main calibrator classes directly:
from calibre import IsotonicCalibrator, expected_calibration_error
import numpy as np
# This is the recommended approach for users
X = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
y = np.array([0, 0, 1, 1, 1])
cal = IsotonicCalibrator()
cal.fit(X, y)
X_calibrated = cal.transform(X)
ece = expected_calibration_error(y, X_calibrated)
print(f"ECE: {ece:.4f}")