BaseStableTree
- class BaseStableTree(task='regression', max_depth=5, min_samples_split=40, min_samples_leaf=20, enable_honest_estimation=True, split_frac=0.6, val_frac=0.2, est_frac=0.2, enable_stratified_sampling=True, enable_validation_checking=True, validation_metric='variance_penalized', validation_consistency_weight=1.0, enable_prefix_consensus=False, prefix_levels=2, consensus_samples=12, consensus_threshold=0.5, enable_quantile_grid_thresholds=False, max_threshold_bins=24, leaf_smoothing=0.0, leaf_smoothing_strategy='m_estimate', enable_calibrated_smoothing=False, min_leaf_samples_for_stability=5, enable_winsorization=False, winsor_quantiles=(0.01, 0.99), enable_feature_standardization=False, enable_oblique_splits=False, oblique_strategy='root_only', oblique_regularization='lasso', enable_correlation_gating=True, min_correlation_threshold=0.3, enable_lookahead=False, lookahead_depth=1, beam_width=8, enable_ambiguity_gating=True, ambiguity_threshold=0.05, min_samples_for_lookahead=100, enable_deterministic_preprocessing=False, enable_deterministic_tiebreaks=True, enable_margin_vetoes=False, margin_threshold=0.03, enable_variance_aware_stopping=False, variance_stopping_weight=1.0, variance_stopping_strategy='variance_penalty', enable_bootstrap_variance_tracking=False, variance_tracking_samples=10, enable_explicit_variance_penalty=False, variance_penalty_weight=0.1, split_strategy=None, algorithm_focus='stability', classification_criterion='gini', random_state=None, enable_threshold_binning=False, enable_gain_margin_logic=False, enable_beam_search_for_consensus=False, enable_robust_consensus_for_ambiguous=False)
Bases: BaseEstimator
Unified base class implementing all 7 stability primitives.
The 7 stability primitives are:
1. Prefix stability (robust consensus on early splits)
2. Validation-checked split selection
3. Honesty (separate data for structure vs estimation)
4. Leaf stabilization (shrinkage/smoothing)
5. Data regularization (winsorization, etc.)
6. Candidate diversity with deterministic resolution
7. Variance-aware stopping
All tree methods inherit from this and configure different defaults to maintain their distinct personalities while sharing the unified stability infrastructure.
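As a rough illustration of primitive 3 (honesty), the sketch below cuts the sample indices into three disjoint subsets for structure, validation, and estimation. The helper name `honest_partition`, the rounding rule, and the use of plain shuffling (rather than stratified sampling) are assumptions made for illustration; the library's actual partitioning may differ.

```python
import random


def honest_partition(n_samples, split_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle indices and cut them into three disjoint index lists.

    Hypothetical helper: whatever remains after the structure and
    validation shares is assigned to estimation.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded, so the cut is reproducible

    n_split = int(round(split_frac * n_samples))
    n_val = int(round(val_frac * n_samples))

    split_idx = indices[:n_split]                # used to choose splits
    val_idx = indices[n_split:n_split + n_val]   # used to check candidate splits
    est_idx = indices[n_split + n_val:]          # used to estimate leaf values
    return split_idx, val_idx, est_idx


split_idx, val_idx, est_idx = honest_partition(100)
print(len(split_idx), len(val_idx), len(est_idx))
```

Because no sample appears in more than one subset, leaf values are estimated on data that played no part in choosing the tree structure.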
- Parameters:
task (str) – The prediction task type.
max_depth (int) – Maximum tree depth.
min_samples_split (int) – Minimum samples required to split an internal node.
min_samples_leaf (int) – Minimum samples required in a leaf node.
enable_honest_estimation (bool) – Enable honest estimation (separate data for structure vs estimation).
split_frac (float) – Fraction of data used for building tree structure.
val_frac (float) – Fraction of data used for validation.
est_frac (float) – Fraction of data used for estimation.
enable_stratified_sampling (bool) – Use stratified sampling for data partitioning.
enable_validation_checking (bool) – Enable validation-checked split selection.
validation_metric (Literal['median', 'one_se', 'variance_penalized']) – Metric for validation-based split selection.
validation_consistency_weight (float) – Weight for validation consistency in split selection.
enable_prefix_consensus (bool) – Enable prefix stability through consensus on early splits.
prefix_levels (int) – Number of tree levels to apply prefix consensus.
consensus_samples (int) – Number of bootstrap samples for consensus building.
consensus_threshold (float) – Minimum agreement threshold for consensus splits.
enable_quantile_grid_thresholds (bool) – Use quantile-based threshold grids.
max_threshold_bins (int) – Maximum number of threshold bins per feature.
leaf_smoothing (float) – Smoothing parameter for leaf value stabilization.
leaf_smoothing_strategy (Literal['m_estimate', 'shrink_to_parent', 'beta_smoothing']) – Strategy for leaf value stabilization.
enable_calibrated_smoothing (bool) – Use calibrated smoothing based on sample size.
min_leaf_samples_for_stability (int) – Minimum samples required for stable leaf estimation.
enable_winsorization (bool) – Enable feature winsorization for robustness.
winsor_quantiles (tuple[float, float]) – Quantiles for winsorization bounds.
enable_feature_standardization (bool) – Standardize features before splitting.
enable_oblique_splits (bool) – Enable oblique (linear combination) splits.
oblique_strategy (Literal['root_only', 'all_levels', 'adaptive']) – Where to apply oblique splits in the tree.
oblique_regularization (Literal['lasso', 'ridge', 'elastic_net']) – Regularization for oblique split learning.
enable_correlation_gating (bool) – Gate splits based on feature correlations.
min_correlation_threshold (float) – Minimum correlation for correlation gating.
enable_lookahead (bool) – Enable lookahead for better split selection.
lookahead_depth (int) – Depth of lookahead search.
beam_width (int) – Beam width for lookahead search.
enable_ambiguity_gating (bool) – Gate splits in ambiguous regions.
ambiguity_threshold (float) – Threshold for ambiguity detection.
min_samples_for_lookahead (int) – Minimum samples required for lookahead.
enable_deterministic_preprocessing (bool) – Use deterministic preprocessing for reproducibility.
enable_deterministic_tiebreaks (bool) – Use deterministic tiebreaking in split selection.
enable_margin_vetoes (bool) – Enable margin-based split vetoing.
margin_threshold (float) – Threshold for margin-based vetoing.
enable_variance_aware_stopping (bool) – Enable variance-aware stopping criteria.
variance_stopping_weight (float) – Weight for variance in stopping decisions.
variance_stopping_strategy (Literal['one_se', 'variance_penalty', 'both']) – Strategy for variance-aware stopping.
enable_bootstrap_variance_tracking (bool) – Track split variance using bootstrap sampling.
variance_tracking_samples (int) – Number of bootstrap samples for variance tracking.
enable_explicit_variance_penalty (bool) – Apply explicit variance penalty to splits.
variance_penalty_weight (float) – Weight for variance penalty.
split_strategy (str | None) – Explicit split strategy specification.
algorithm_focus (Literal['speed', 'stability', 'accuracy']) – Algorithm focus for automatic strategy selection.
classification_criterion (Literal['gini', 'entropy']) – Splitting criterion for classification.
random_state (int | None) – Random state for reproducibility.
enable_threshold_binning (bool) – Enable threshold binning for continuous features.
enable_gain_margin_logic (bool) – Apply margin logic to information gain.
enable_beam_search_for_consensus (bool) – Use beam search for consensus building.
enable_robust_consensus_for_ambiguous (bool) – Use robust consensus in ambiguous regions.
- Raises:
ValueError – If split_frac + val_frac + est_frac does not sum to 1.0.
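The documentation states only that a ValueError is raised when the three fractions do not sum to 1.0; how the check is implemented is not shown. A minimal sketch, assuming a tolerance-based comparison so that floating-point rounding does not reject valid inputs (the function name `check_fractions` and the `tol` argument are hypothetical):

```python
import math


def check_fractions(split_frac, val_frac, est_frac, tol=1e-9):
    """Raise ValueError unless the three fractions sum to 1.0 within tol."""
    total = split_frac + val_frac + est_frac
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(
            f"split_frac + val_frac + est_frac must sum to 1.0, got {total}"
        )


check_fractions(0.6, 0.2, 0.2)  # the documented defaults pass silently

try:
    check_fractions(0.6, 0.3, 0.2)  # sums to more than 1.0, so it is rejected
except ValueError as exc:
    print(exc)
```

When changing one fraction, adjust at least one of the others in the same call so the total stays at 1.0.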
- __init__(task='regression', max_depth=5, min_samples_split=40, min_samples_leaf=20, enable_honest_estimation=True, split_frac=0.6, val_frac=0.2, est_frac=0.2, enable_stratified_sampling=True, enable_validation_checking=True, validation_metric='variance_penalized', validation_consistency_weight=1.0, enable_prefix_consensus=False, prefix_levels=2, consensus_samples=12, consensus_threshold=0.5, enable_quantile_grid_thresholds=False, max_threshold_bins=24, leaf_smoothing=0.0, leaf_smoothing_strategy='m_estimate', enable_calibrated_smoothing=False, min_leaf_samples_for_stability=5, enable_winsorization=False, winsor_quantiles=(0.01, 0.99), enable_feature_standardization=False, enable_oblique_splits=False, oblique_strategy='root_only', oblique_regularization='lasso', enable_correlation_gating=True, min_correlation_threshold=0.3, enable_lookahead=False, lookahead_depth=1, beam_width=8, enable_ambiguity_gating=True, ambiguity_threshold=0.05, min_samples_for_lookahead=100, enable_deterministic_preprocessing=False, enable_deterministic_tiebreaks=True, enable_margin_vetoes=False, margin_threshold=0.03, enable_variance_aware_stopping=False, variance_stopping_weight=1.0, variance_stopping_strategy='variance_penalty', enable_bootstrap_variance_tracking=False, variance_tracking_samples=10, enable_explicit_variance_penalty=False, variance_penalty_weight=0.1, split_strategy=None, algorithm_focus='stability', classification_criterion='gini', random_state=None, enable_threshold_binning=False, enable_gain_margin_logic=False, enable_beam_search_for_consensus=False, enable_robust_consensus_for_ambiguous=False)
Methods
__init__([task, max_depth, ...])
fit(X, y) – Fit the stable tree to the training data.
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator.
predict(X) – Predict targets for samples in X.
predict_proba(X) – Predict class probabilities for classification tasks.
score(X, y) – Return the mean accuracy (classification) or R² (regression).
set_params(**params) – Set the parameters of this estimator.
- fit(X, y)
Fit the stable tree to the training data.
- Parameters:
X (ndarray[tuple[Any, ...], dtype[floating]]) – Training feature matrix of shape (n_samples, n_features).
y (ndarray[tuple[Any, ...], dtype[Any]]) – Training target values of shape (n_samples,).
- Returns:
Fitted estimator.
- Return type:
- Raises:
ValueError – If multi-class classification is attempted (not yet supported).
- predict(X)
Predict targets for samples in X.
- Parameters:
X (ndarray[tuple[Any, ...], dtype[floating]]) – Feature matrix of shape (n_samples, n_features).
- Returns:
Predicted values of shape (n_samples,).
- Return type:
NDArray[Any]
- Raises:
ValueError – If the tree has not been fitted.
- predict_proba(X)
Predict class probabilities for classification tasks.
- Parameters:
X (ndarray[tuple[Any, ...], dtype[floating]]) – Feature matrix of shape (n_samples, n_features).
- Returns:
Class probabilities of shape (n_samples, n_classes).
- Return type:
NDArray[np.floating]
- Raises:
ValueError – If called on a regression task or if the tree has not been fitted.
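For a binary task, the documented output shape (n_samples, n_classes) means two columns per row. The sketch below shows one way a leaf's class counts could be turned into such a row, using the textbook m-estimate formula. The function name `leaf_proba`, its arguments, and the link to the 'm_estimate' smoothing option are assumptions for illustration; the library's own computation may differ.

```python
def leaf_proba(n_pos, n_total, prior=0.5, m=0.0):
    """Return [P(class 0), P(class 1)] for one leaf.

    Textbook m-estimate: (n_pos + m * prior) / (n_total + m).
    With m=0 this reduces to the raw positive rate.
    """
    if n_total + m <= 0:
        raise ValueError("leaf has no samples and no smoothing mass")
    p1 = (n_pos + m * prior) / (n_total + m)
    return [1.0 - p1, p1]


# Three samples routed to leaves with (positives, total) training counts.
leaves = [(3, 4), (0, 5), (10, 10)]
proba = [leaf_proba(k, n, prior=0.5, m=2.0) for k, n in leaves]
for row in proba:
    print(row)
```

Smoothing pulls pure leaves such as (0, 5) and (10, 10) away from exact 0 and 1, which is the usual motivation for stabilizing small leaves.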
- score(X, y)
Return the mean accuracy (classification) or R² (regression).
- Parameters:
X (ndarray[tuple[Any, ...], dtype[floating]]) – Feature matrix for evaluation.
y (ndarray[tuple[Any, ...], dtype[Any]]) – True target values.
- Returns:
Accuracy for classification, R² for regression.
- Return type:
float
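The two metrics named above have standard definitions, sketched here in plain Python. The function names `accuracy` and `r2_score` are illustrative and not part of this class; the class presumably computes the same quantities from its own predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot  # not defined when y_true is constant


print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))
print(r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

An R² of 1.0 means perfect predictions, 0.0 matches always predicting the mean of y, and negative values are worse than that baseline.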
- classmethod __init_subclass__(**kwargs)
Set the set_{method}_request methods.
This uses PEP-487 [1] to set the set_{method}_request methods. It looks for the information available in the set default values, which are set using __metadata_request__* class attributes or inferred from method signatures.
The __metadata_request__* class attributes are used when a method does not explicitly accept metadata through its arguments, or when the developer would like to specify a request value for that metadata which differs from the default None.
References
[1] PEP 487, Simpler customisation of class creation
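The PEP 487 hook can be illustrated with a heavily simplified toy. This is not scikit-learn's implementation: `RequestMixin`, `_make_setter`, `ToyTree`, and the `_requests` attribute are hypothetical names, and the real mechanism also inspects method signatures and `__metadata_request__*` attributes.

```python
class RequestMixin:
    """Toy mixin: adds set_<method>_request setters to every subclass."""

    def __init_subclass__(cls, **kwargs):
        # Called automatically once for each subclass at class-creation time.
        super().__init_subclass__(**kwargs)
        for method in ("fit", "predict", "score"):
            if callable(getattr(cls, method, None)):
                setattr(cls, f"set_{method}_request", _make_setter(method))


def _make_setter(method):
    def setter(self, **requests):
        # Record which metadata keys the method asks for, then allow chaining.
        self.__dict__.setdefault("_requests", {})[method] = dict(requests)
        return self

    setter.__name__ = f"set_{method}_request"
    return setter


class ToyTree(RequestMixin):
    def fit(self, X, y, sample_weight=None):
        return self


tree = ToyTree().set_fit_request(sample_weight=True)
print(tree._requests)
```

Only `fit` is defined on `ToyTree`, so only `set_fit_request` is generated; no decorator or metaclass is needed on the subclass itself.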
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
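In the scikit-learn convention that this method is inherited from, parameter names are read from the `__init__` signature and each value is looked up as an attribute of the same name. A minimal toy sketch of that idea follows; `ToyEstimator` is hypothetical and ignores `deep`, which matters only for nested estimators.

```python
import inspect


class ToyEstimator:
    def __init__(self, max_depth=5, min_samples_leaf=20, random_state=None):
        # Store every argument under its own name, unmodified.
        self.max_depth = max_depth
        self.min_samples_leaf = min_samples_leaf
        self.random_state = random_state

    def get_params(self, deep=True):
        """Map each __init__ argument name to the attribute of the same name."""
        names = list(inspect.signature(self.__init__).parameters)
        return {name: getattr(self, name) for name in names}


params = ToyEstimator(max_depth=3).get_params()
print(params)
```

This is why estimators following the convention keep constructor arguments as plain attributes: anything renamed or transformed in `__init__` would no longer round-trip through `get_params`.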
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
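The `<component>__<parameter>` form can be pictured as splitting each key at the first double underscore and forwarding the remainder to the named component. The toy class below sketches that dispatch; `Component` is a hypothetical stand-in, not the scikit-learn implementation.

```python
class Component:
    """Toy estimator-like object with flat and nested parameter updates."""

    def __init__(self, **params):
        self.__dict__.update(params)

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            if name not in self.__dict__:
                raise ValueError(f"Invalid parameter {name!r}")
            if sep:
                # "<component>__<parameter>": forward to the nested object.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self  # returning self allows call chaining


tree = Component(max_depth=5, leaf_smoothing=0.0)
pipeline = Component(tree=tree, verbose=False)
pipeline.set_params(tree__max_depth=3, verbose=True)
print(tree.max_depth, pipeline.verbose)
```

The same pattern is what lets a grid search address a nested estimator's parameter with a single string key.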