{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Getting Started with Calibre\n", "\n", "This notebook provides a quick introduction to probability calibration using the Calibre library.\n", "\n", "**What you'll learn:**\n", "1. Basic calibration workflow from start to finish\n", "2. How to choose the right calibration method for your data\n", "3. How to evaluate calibration quality\n", "4. Common patterns and best practices\n", "\n", "**When to use this notebook:** Start here if you're new to calibration or the Calibre library." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import required libraries\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.ensemble import RandomForestClassifier\n", "\n", "# Import calibre components\n", "from calibre import IsotonicCalibrator, mean_calibration_error, brier_score\n", "from calibre import calibration_curve\n", "\n", "# Set random seed for reproducibility\n", "np.random.seed(42)\n", "plt.style.use('default')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Create Sample Data\n", "\n", "Let's generate some sample data and train a model that produces poorly calibrated predictions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Generate synthetic dataset\n", "n_samples = 1000\n", "X = np.random.randn(n_samples, 5)\n", "y = (X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 2] + np.random.randn(n_samples) * 0.1 > 0).astype(int)\n", "\n", "# Split into train/test\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)\n", "\n", "# Train a model that tends to be poorly calibrated\n", "model = RandomForestClassifier(n_estimators=100, random_state=42)\n", "model.fit(X_train, y_train)\n", "\n", "# Get uncalibrated predictions\n", "y_proba_uncal = model.predict_proba(X_test)[:, 1]\n", "\n", "print(f\"Dataset: {len(X_train)} training, {len(X_test)} test samples\")\n", "print(f\"Class distribution: {np.mean(y_test):.1%} positive class\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Basic Calibration Workflow\n", "\n", "The standard calibration workflow has three steps:\n", "1. **Fit** the calibrator on training predictions\n", "2. **Transform** test predictions \n", "3. **Evaluate** calibration quality" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Step 1: Get training predictions for calibration\n", "y_proba_train = model.predict_proba(X_train)[:, 1]\n", "\n", "# Step 2: Fit calibrator\n", "calibrator = IsotonicCalibrator(enable_diagnostics=True)\n", "calibrator.fit(y_proba_train, y_train)\n", "\n", "# Step 3: Apply calibration to test data\n", "y_proba_cal = calibrator.transform(y_proba_uncal)\n", "\n", "print(\"āœ… Calibration complete!\")\n", "print(f\"Uncalibrated range: [{y_proba_uncal.min():.3f}, {y_proba_uncal.max():.3f}]\")\n", "print(f\"Calibrated range: [{y_proba_cal.min():.3f}, {y_proba_cal.max():.3f}]\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Evaluate Calibration Quality\n", "\n", "Let's measure how much calibration improved our predictions:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Calculate calibration metrics before and after calibration\n", "mce_before = mean_calibration_error(y_test, y_proba_uncal)\n", "mce_after = mean_calibration_error(y_test, y_proba_cal)\n", "\n", "brier_before = brier_score(y_test, y_proba_uncal)\n", "brier_after = brier_score(y_test, y_proba_cal)\n", "\n", "print(\"šŸ“Š Calibration Improvement (lower is better):\")\n", "print(f\"Mean Calibration Error: {mce_before:.3f} → {mce_after:.3f} ({(mce_after/mce_before-1)*100:+.1f}%)\")\n", "print(f\"Brier Score: {brier_before:.3f} → {brier_after:.3f} ({(brier_after/brier_before-1)*100:+.1f}%)\")\n", "\n", "# Check diagnostics (enabled above via enable_diagnostics=True)\n", "if calibrator.has_diagnostics():\n", "    print(f\"\\nšŸ” Diagnostics: {calibrator.diagnostic_summary()}\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Visualize the Results\n", "\n", "The best way to understand calibration is to visualize the calibration curve:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": "# Create calibration curves\nbin_means_uncal, bin_edges_uncal, _ = calibration_curve(y_test, y_proba_uncal, n_bins=10)\nbin_means_cal, bin_edges_cal, _ = calibration_curve(y_test, y_proba_cal, n_bins=10)\n\n# Plot comparison\nfig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))\n\n# Before calibration\nax1.plot([0, 1], [0, 1], 'k--', alpha=0.5, label='Perfect calibration')\nax1.plot(bin_edges_uncal, bin_means_uncal, 'o-', color='red', label='Uncalibrated')\nax1.set_xlabel('Mean Predicted Probability')\nax1.set_ylabel('Fraction of Positives')\nax1.set_title('Before Calibration')\nax1.legend()\nax1.grid(True, alpha=0.3)\n\n# After calibration\nax2.plot([0, 1], [0, 1], 'k--', alpha=0.5, label='Perfect calibration')\nax2.plot(bin_edges_cal, bin_means_cal, 'o-', color='blue', label='Calibrated')\nax2.set_xlabel('Mean Predicted Probability')\nax2.set_ylabel('Fraction of Positives')\nax2.set_title('After Calibration')\nax2.legend()\nax2.grid(True, alpha=0.3)\n\nplt.tight_layout()\nplt.show()\n\nprint(\"šŸ“ˆ A well-calibrated model has points close to the diagonal: the closer, the better.\")" },
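{ "cell_type": "markdown", "metadata": {}, "source": [ "### Optional: cross-check with a hand-rolled ECE\n", "\n", "If you want to sanity-check the library's numbers, a binned expected calibration error (ECE) is easy to compute with plain NumPy. The function below is a rough sketch, not part of the Calibre API, and its value may differ slightly from `mean_calibration_error` depending on how the bins are defined." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Rough NumPy sketch of a binned expected calibration error (ECE).\n", "# Not part of the Calibre API; binning choices may differ from the library's.\n", "def simple_ece(y_true, y_prob, n_bins=10):\n", "    bins = np.linspace(0.0, 1.0, n_bins + 1)\n", "    # Assign each prediction to a bin (clip so that 1.0 lands in the last bin)\n", "    bin_ids = np.clip(np.digitize(y_prob, bins) - 1, 0, n_bins - 1)\n", "    ece = 0.0\n", "    for b in range(n_bins):\n", "        mask = bin_ids == b\n", "        if mask.any():\n", "            # Bin-size-weighted gap between mean confidence and observed frequency\n", "            gap = abs(y_prob[mask].mean() - y_true[mask].mean())\n", "            ece += mask.mean() * gap\n", "    return ece\n", "\n", "print(f\"Hand-rolled ECE (uncalibrated): {simple_ece(y_test, y_proba_uncal):.4f}\")\n", "print(f\"Hand-rolled ECE (calibrated):   {simple_ece(y_test, y_proba_cal):.4f}\")" ] },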
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Try Different Calibration Methods\n", "\n", "Calibre provides several calibration methods. Let's compare a few:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from calibre import NearlyIsotonicCalibrator, SplineCalibrator\n", "\n", "# Test different calibrators\n", "calibrators = {\n", "    'Isotonic': IsotonicCalibrator(),\n", "    'Nearly Isotonic': NearlyIsotonicCalibrator(),\n", "    'Spline': SplineCalibrator(n_splines=5)\n", "}\n", "\n", "results = {'Uncalibrated': (y_proba_uncal, mce_before)}\n", "\n", "# Fit and evaluate each calibrator\n", "for name, cal in calibrators.items():\n", "    cal.fit(y_proba_train, y_train)\n", "    y_cal = cal.transform(y_proba_uncal)\n", "    mce = mean_calibration_error(y_test, y_cal)\n", "    results[name] = (y_cal, mce)\n", "\n", "# Print comparison\n", "print(\"šŸ† Method Comparison (Mean Calibration Error):\")\n", "for name, (_, mce) in results.items():\n", "    print(f\"{name:15}: {mce:.4f}\")\n", "\n", "# Find the method with the lowest calibration error\n", "best_method = min(results.items(), key=lambda x: x[1][1])[0]\n", "print(f\"\\nšŸ„‡ Best method: {best_method}\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Key Takeaways\n", "\n", "šŸŽÆ **Quick Start Pattern:**\n", "```python\n", "from calibre import IsotonicCalibrator, mean_calibration_error\n", "\n", "# Fit calibrator on predictions with known labels\n", "calibrator = IsotonicCalibrator()\n", "calibrator.fit(train_probabilities, train_labels)\n", "\n", "# Apply to test predictions\n", "calibrated_probabilities = calibrator.transform(test_probabilities)\n", "\n", "# Evaluate calibration quality (lower is better)\n", "error_after = mean_calibration_error(test_labels, calibrated_probabilities)\n", "```\n", "\n", "šŸ“‹ **Best Practices:**\n", "- Fit the calibrator on data the model was not trained on (a held-out split or cross-validated predictions)\n", "- Enable diagnostics to understand calibration behavior\n", "- Visualize calibration curves to verify improvement\n", "- Try multiple methods and pick the best for your data\n", "\n", "āž”ļø **Next Steps:**\n", "- **Validation & Evaluation**: See detailed calibration analysis\n", "- **Diagnostics & Troubleshooting**: Learn when calibration fails\n", "- **Performance Comparison**: Systematic method comparison" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }