Utility Functions

This module provides utility functions for data validation and array operations.

Data Validation

calibre.utils.check_arrays(X: ndarray, y: ndarray)

Check and validate input arrays for calibration.

This function ensures that X and y are valid numpy arrays with compatible shapes and no invalid values.

Parameters:

X – The input predictions/probabilities.
y – The target values/labels.

Returns:

tuple[np.ndarray, np.ndarray] – Tuple of (validated_X, validated_y).

Raises:

ValueError – If arrays are empty or have incompatible lengths.

Examples

>>> import numpy as np
>>> from calibre.utils.validation import check_arrays
>>>
>>> X = np.array([0.1, 0.2, 0.3])
>>> y = np.array([0, 1, 1])
>>> X_checked, y_checked = check_arrays(X, y)
>>> print(X_checked.shape, y_checked.shape)
(3,) (3,)
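
The checks described above can be approximated with plain NumPy. The following is a minimal sketch, not calibre's implementation: the helper name `validate_pair` is hypothetical, and treating NaN and infinity as the "invalid values" is an assumption made for illustration.

```python
import numpy as np


def validate_pair(X, y):
    """Hypothetical stand-in for the checks described above (not calibre's code)."""
    X = np.atleast_1d(np.asarray(X, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    # Reject empty inputs.
    if X.size == 0 or y.size == 0:
        raise ValueError("Input arrays must not be empty.")
    # Reject mismatched lengths along the first axis.
    if X.shape[0] != y.shape[0]:
        raise ValueError(
            f"Inconsistent lengths: X has {X.shape[0]}, y has {y.shape[0]}."
        )
    # Assumption: "invalid values" means NaN or infinity.
    if not (np.isfinite(X).all() and np.isfinite(y).all()):
        raise ValueError("Input arrays must contain only finite values.")
    return X, y


X_ok, y_ok = validate_pair([0.1, 0.2, 0.3], [0, 1, 1])
print(X_ok.shape, y_ok.shape)
```

Each failure mode raises ValueError with its own message, so a caller can tell an empty input from a length mismatch.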

calibre.utils.check_array_1d(X: ndarray, name: str = 'X')

Check that an array is 1-dimensional.

Parameters:

X – The array to check.
name – Name of the array for error messages.

Returns:

ndarray – Validated 1D array.

Raises:

ValueError – If array is not 1-dimensional or is empty.

Examples

>>> import numpy as np
>>> from calibre.utils.validation import check_array_1d
>>>
>>> X = np.array([0.1, 0.2, 0.3])
>>> X_checked = check_array_1d(X)
>>> print(X_checked.shape)
(3,)
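
To show how the name parameter could end up in an error message, here is a minimal sketch of a 1D check written with NumPy only. The helper `require_1d` and the wording of its messages are hypothetical; calibre's actual messages may differ.

```python
import numpy as np


def require_1d(X, name="X"):
    """Hypothetical 1D check (illustrative only, not calibre's code)."""
    X = np.asarray(X)
    # ndim counts array axes: 1 for a vector, 2 for a matrix, and so on.
    if X.ndim != 1:
        raise ValueError(f"{name} must be 1-dimensional, got {X.ndim} dimensions.")
    if X.size == 0:
        raise ValueError(f"{name} must not be empty.")
    return X


try:
    require_1d(np.array([[0.1, 0.2], [0.3, 0.4]]), name="predictions")
except ValueError as exc:
    message = str(exc)
    print(message)
```

Passing a descriptive name makes the error point at the offending argument when several arrays are validated in a row.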

calibre.utils.check_consistent_length(*arrays: ndarray)

Check that all arrays have the same first dimension.

Parameters:

*arrays – Arrays to check for consistent length.

Raises:

ValueError – If arrays have inconsistent lengths.

Return type:

None

Examples

>>> import numpy as np
>>> from calibre.utils.validation import check_consistent_length
>>>
>>> X = np.array([0.1, 0.2, 0.3])
>>> y = np.array([0, 1, 1])
>>> check_consistent_length(X, y)  # No error
>>>
>>> z = np.array([0, 1])  # Different length
>>> try:
...     check_consistent_length(X, z)
... except ValueError as e:
...     print("Error:", e)
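
The comparison is on the first dimension only, so a 1D array and a 2D array with the same number of rows count as consistent. A minimal sketch of that idea follows; the helper `lengths_match` is hypothetical and returns a boolean instead of raising.

```python
import numpy as np


def lengths_match(*arrays):
    """Hypothetical helper: True if all arrays share the same first dimension."""
    lengths = {np.asarray(a).shape[0] for a in arrays}
    return len(lengths) <= 1


X = np.array([0.1, 0.2, 0.3])
y = np.array([0, 1, 1])
z = np.array([0, 1])
features = np.zeros((3, 2))  # 3 rows, 2 columns

print(lengths_match(X, y))         # True
print(lengths_match(X, features))  # True: only the first axis is compared
print(lengths_match(X, z))         # False
```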

Array Operations

calibre.utils.sort_by_x(X: ndarray, y: ndarray)

Sort arrays by X values and return sort indices.

Parameters:

X – Values to sort by.
y – Values to sort along with X.

Returns:

sort_idx (ndarray of shape (n_samples,)) – Indices that would sort X.
X_sorted (ndarray of shape (n_samples,)) – Sorted X array.
y_sorted (ndarray of shape (n_samples,)) – Sorted y array.

Examples

>>> import numpy as np
>>> from calibre.utils.array_ops import sort_by_x
>>>
>>> X = np.array([0.3, 0.1, 0.2])
>>> y = np.array([1, 0, 0])
>>> idx, X_sorted, y_sorted = sort_by_x(X, y)
>>> print(X_sorted)
[0.1 0.2 0.3]
>>> print(y_sorted)
[0 0 1]
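
The returned sort indices are useful for mapping results computed on sorted data back to the original order. The sketch below reproduces the same three outputs with np.argsort and then inverts the permutation. It illustrates the technique only; it is an assumption that sort_by_x behaves like a plain ascending argsort, and tie handling is not specified here.

```python
import numpy as np

X = np.array([0.3, 0.1, 0.2])
y = np.array([1, 0, 0])

# Indices that would sort X in ascending order.
sort_idx = np.argsort(X)
X_sorted = X[sort_idx]
y_sorted = y[sort_idx]

print(sort_idx)  # [1 2 0]
print(X_sorted)  # [0.1 0.2 0.3]
print(y_sorted)  # [0 0 1]

# Argsorting the permutation gives its inverse, which restores the input order.
inverse_idx = np.argsort(sort_idx)
X_restored = X_sorted[inverse_idx]
print(X_restored)  # [0.3 0.1 0.2]
```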

calibre.utils.clip_to_range(X: ndarray, lower: float = 0.0, upper: float = 1.0)

Clip array values to a specified range.

Parameters:

X – Array to clip.
lower – Lower bound.
upper – Upper bound.

Returns:

X_clipped (ndarray) – Clipped array.

Examples

>>> import numpy as np
>>> from calibre.utils.array_ops import clip_to_range
>>>
>>> X = np.array([-0.1, 0.5, 1.2])
>>> X_clipped = clip_to_range(X, 0.0, 1.0)
>>> print(X_clipped)
[0.  0.5 1. ]
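
For comparison, NumPy's own np.clip produces the same result on this input. This is only a point of reference; whether clip_to_range wraps np.clip internally is an assumption, not something stated in this reference.

```python
import numpy as np

X = np.array([-0.1, 0.5, 1.2])

# Values below 0.0 become 0.0, values above 1.0 become 1.0.
X_clipped = np.clip(X, 0.0, 1.0)

print(X_clipped)  # [0.  0.5 1. ]
print(X)          # [-0.1  0.5  1.2]  (np.clip returns a new array)
```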

calibre.utils.ensure_1d(X: ndarray)

Ensure array is 1-dimensional by raveling.

Parameters:

X – Array to convert to 1D.

Returns:

X_1d (ndarray of shape (n_samples,)) – 1-dimensional array.

Examples

>>> import numpy as np
>>> from calibre.utils.array_ops import ensure_1d
>>>
>>> X = np.array([[0.1, 0.2, 0.3]])
>>> X_1d = ensure_1d(X)
>>> print(X_1d.shape)
(3,)
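
Raveling flattens any shape, not just row or column vectors. The sketch below uses np.ravel directly to show the effect on three shapes. Whether ensure_1d adds any guard against genuinely 2D input is not stated in this reference, so the last case is a caution and not a description of calibre's behavior.

```python
import numpy as np

row = np.array([[0.1, 0.2, 0.3]])      # shape (1, 3)
col = np.array([[0.1], [0.2], [0.3]])  # shape (3, 1)
mat = np.arange(6).reshape(2, 3)       # shape (2, 3)

print(np.ravel(row).shape)  # (3,)
print(np.ravel(col).shape)  # (3,)

# A true matrix is flattened row by row, so the result has 6 elements.
flat = np.ravel(mat)
print(flat)  # [0 1 2 3 4 5]
```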

Usage Examples

Input Validation

from calibre.utils import check_arrays
import numpy as np

# Valid input
X = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
y = np.array([0, 0, 1, 1, 1])

try:
    X_checked, y_checked = check_arrays(X, y)
    print("Arrays are valid")
except ValueError as e:
    print(f"Validation error: {e}")

Sorting Operations

from calibre.utils import sort_by_x
import numpy as np

# Unsorted data
X = np.array([0.7, 0.1, 0.9, 0.3, 0.5])
y = np.array([1, 0, 1, 0, 1])

# Sort by X values
sort_indices, X_sorted, y_sorted = sort_by_x(X, y)

print(f"Original X: {X}")
print(f"Sorted X: {X_sorted}")
print(f"Sorted y: {y_sorted}")
print(f"Sort indices: {sort_indices}")

Array Processing

from calibre.utils import ensure_1d, clip_to_range
import numpy as np

# Ensure array is 1D
arr_2d = np.array([[1], [2], [3]])
arr_1d = ensure_1d(arr_2d)
print(f"1D array: {arr_1d}")

# Clip values to valid range
values = np.array([-0.1, 0.5, 1.2])
clipped = clip_to_range(values, 0.0, 1.0)
print(f"Clipped: {clipped}")

Note

These utility functions are primarily for internal use within calibration algorithms. For typical calibration workflows, use the main calibrator classes directly:

from calibre import IsotonicCalibrator, expected_calibration_error
import numpy as np

# This is the recommended approach for users
X = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
y = np.array([0, 0, 1, 1, 1])

cal = IsotonicCalibrator()
cal.fit(X, y)
X_calibrated = cal.transform(X)

ece = expected_calibration_error(y, X_calibrated)
print(f"ECE: {ece:.4f}")