BaseStableTree

class BaseStableTree(task='regression', max_depth=5, min_samples_split=40, min_samples_leaf=20, enable_honest_estimation=True, split_frac=0.6, val_frac=0.2, est_frac=0.2, enable_stratified_sampling=True, enable_validation_checking=True, validation_metric='variance_penalized', validation_consistency_weight=1.0, enable_prefix_consensus=False, prefix_levels=2, consensus_samples=12, consensus_threshold=0.5, enable_quantile_grid_thresholds=False, max_threshold_bins=24, leaf_smoothing=0.0, leaf_smoothing_strategy='m_estimate', enable_calibrated_smoothing=False, min_leaf_samples_for_stability=5, enable_winsorization=False, winsor_quantiles=(0.01, 0.99), enable_feature_standardization=False, enable_oblique_splits=False, oblique_strategy='root_only', oblique_regularization='lasso', enable_correlation_gating=True, min_correlation_threshold=0.3, enable_lookahead=False, lookahead_depth=1, beam_width=8, enable_ambiguity_gating=True, ambiguity_threshold=0.05, min_samples_for_lookahead=100, enable_deterministic_preprocessing=False, enable_deterministic_tiebreaks=True, enable_margin_vetoes=False, margin_threshold=0.03, enable_variance_aware_stopping=False, variance_stopping_weight=1.0, variance_stopping_strategy='variance_penalty', enable_bootstrap_variance_tracking=False, variance_tracking_samples=10, enable_explicit_variance_penalty=False, variance_penalty_weight=0.1, split_strategy=None, algorithm_focus='stability', classification_criterion='gini', random_state=None, enable_threshold_binning=False, enable_gain_margin_logic=False, enable_beam_search_for_consensus=False, enable_robust_consensus_for_ambiguous=False)

Bases: BaseEstimator

Unified base class implementing all 7 stability primitives.

The 7 stability primitives are:

  1. Prefix stability (robust consensus on early splits)

  2. Validation-checked split selection

  3. Honesty (separate data for structure vs estimation)

  4. Leaf stabilization (shrinkage/smoothing)

  5. Data regularization (winsorization, etc.)

  6. Candidate diversity with deterministic resolution

  7. Variance-aware stopping

All tree methods inherit from this and configure different defaults to maintain their distinct personalities while sharing the unified stability infrastructure.
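As a rough illustration of primitive 4 (leaf stabilization), the sketch below shows the common textbook form of an m-estimate, which shrinks a leaf mean toward a prior value. It is an assumption that the 'm_estimate' strategy resembles this form; the library's exact formula is not stated on this page, and the names m_estimate, prior_mean, and m are hypothetical.

```python
def m_estimate(leaf_values, prior_mean, m):
    """Shrink a leaf mean toward a prior (for example a parent or global mean).

    m behaves like a count of pseudo-observations placed at prior_mean,
    so m = 0 returns the plain leaf mean. Hypothetical helper, not library code.
    """
    n = len(leaf_values)
    if n + m == 0:
        raise ValueError("need at least one sample or a positive m")
    return (sum(leaf_values) + m * prior_mean) / (n + m)


leaf = [10.0, 12.0, 14.0]  # training targets that landed in one leaf
unsmoothed = m_estimate(leaf, prior_mean=2.0, m=0.0)  # plain mean of the leaf
smoothed = m_estimate(leaf, prior_mean=2.0, m=3.0)    # pulled toward the prior
```

With m equal to the leaf size, the smoothed value sits halfway between the leaf mean and the prior, which is why small leaves are affected most.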

Parameters:
  • task (str) – The prediction task type.

  • max_depth (int) – Maximum tree depth.

  • min_samples_split (int) – Minimum samples required to split an internal node.

  • min_samples_leaf (int) – Minimum samples required in a leaf node.

  • enable_honest_estimation (bool) – Enable honest estimation (separate data for structure vs estimation).

  • split_frac (float) – Fraction of data used for building tree structure.

  • val_frac (float) – Fraction of data used for validation.

  • est_frac (float) – Fraction of data used for estimation.

  • enable_stratified_sampling (bool) – Use stratified sampling for data partitioning.

  • enable_validation_checking (bool) – Enable validation-checked split selection.

  • validation_metric (Literal['median', 'one_se', 'variance_penalized']) – Metric for validation-based split selection.

  • validation_consistency_weight (float) – Weight for validation consistency in split selection.

  • enable_prefix_consensus (bool) – Enable prefix stability through consensus on early splits.

  • prefix_levels (int) – Number of tree levels to apply prefix consensus.

  • consensus_samples (int) – Number of bootstrap samples for consensus building.

  • consensus_threshold (float) – Minimum agreement threshold for consensus splits.

  • enable_quantile_grid_thresholds (bool) – Use quantile-based threshold grids.

  • max_threshold_bins (int) – Maximum number of threshold bins per feature.

  • leaf_smoothing (float) – Smoothing parameter for leaf value stabilization.

  • leaf_smoothing_strategy (Literal['m_estimate', 'shrink_to_parent', 'beta_smoothing']) – Strategy for leaf value stabilization.

  • enable_calibrated_smoothing (bool) – Use calibrated smoothing based on sample size.

  • min_leaf_samples_for_stability (int) – Minimum samples required for stable leaf estimation.

  • enable_winsorization (bool) – Enable feature winsorization for robustness.

  • winsor_quantiles (tuple[float, float]) – Quantiles for winsorization bounds.

  • enable_feature_standardization (bool) – Standardize features before splitting.

  • enable_oblique_splits (bool) – Enable oblique (linear combination) splits.

  • oblique_strategy (Literal['root_only', 'all_levels', 'adaptive']) – Where to apply oblique splits in the tree.

  • oblique_regularization (Literal['lasso', 'ridge', 'elastic_net']) – Regularization for oblique split learning.

  • enable_correlation_gating (bool) – Gate splits based on feature correlations.

  • min_correlation_threshold (float) – Minimum correlation for correlation gating.

  • enable_lookahead (bool) – Enable lookahead for better split selection.

  • lookahead_depth (int) – Depth of lookahead search.

  • beam_width (int) – Beam width for lookahead search.

  • enable_ambiguity_gating (bool) – Gate splits in ambiguous regions.

  • ambiguity_threshold (float) – Threshold for ambiguity detection.

  • min_samples_for_lookahead (int) – Minimum samples required for lookahead.

  • enable_deterministic_preprocessing (bool) – Use deterministic preprocessing for reproducibility.

  • enable_deterministic_tiebreaks (bool) – Use deterministic tiebreaking in split selection.

  • enable_margin_vetoes (bool) – Enable margin-based split vetoing.

  • margin_threshold (float) – Threshold for margin-based vetoing.

  • enable_variance_aware_stopping (bool) – Enable variance-aware stopping criteria.

  • variance_stopping_weight (float) – Weight for variance in stopping decisions.

  • variance_stopping_strategy (Literal['one_se', 'variance_penalty', 'both']) – Strategy for variance-aware stopping.

  • enable_bootstrap_variance_tracking (bool) – Track split variance using bootstrap sampling.

  • variance_tracking_samples (int) – Number of bootstrap samples for variance tracking.

  • enable_explicit_variance_penalty (bool) – Apply explicit variance penalty to splits.

  • variance_penalty_weight (float) – Weight for variance penalty.

  • split_strategy (str | None) – Explicit split strategy specification.

  • algorithm_focus (Literal['speed', 'stability', 'accuracy']) – Algorithm focus for automatic strategy selection.

  • classification_criterion (Literal['gini', 'entropy']) – Splitting criterion for classification.

  • random_state (int | None) – Random state for reproducibility.

  • enable_threshold_binning (bool) – Enable threshold binning for continuous features.

  • enable_gain_margin_logic (bool) – Apply margin logic to information gain.

  • enable_beam_search_for_consensus (bool) – Use beam search for consensus building.

  • enable_robust_consensus_for_ambiguous (bool) – Use robust consensus in ambiguous regions.

Raises:

ValueError – If split_frac + val_frac + est_frac does not sum to 1.0.
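The fraction constraint above can be pictured with a minimal sketch of a three-way honest partition. This is not the library's implementation: the function three_way_split and its seed argument are hypothetical, stratification is omitted, and only the default fractions (0.6, 0.2, 0.2) are taken from the signature.

```python
import math
import random


def three_way_split(n_samples, split_frac=0.6, val_frac=0.2, est_frac=0.2, seed=0):
    """Partition range(n_samples) into structure, validation and estimation indices."""
    total = split_frac + val_frac + est_frac
    # Compare with a tolerance because the fractions are floating-point values.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"fractions must sum to 1.0, got {total}")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_split = round(n_samples * split_frac)
    n_val = round(n_samples * val_frac)
    split_idx = indices[:n_split]               # used to choose the tree structure
    val_idx = indices[n_split:n_split + n_val]  # used to check candidate splits
    est_idx = indices[n_split + n_val:]         # remainder, used for leaf estimates
    return split_idx, val_idx, est_idx


split_idx, val_idx, est_idx = three_way_split(100)
```

The three index sets are disjoint, so the samples that estimate leaf values never influence which splits were chosen.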

__init__(task='regression', max_depth=5, min_samples_split=40, min_samples_leaf=20, enable_honest_estimation=True, split_frac=0.6, val_frac=0.2, est_frac=0.2, enable_stratified_sampling=True, enable_validation_checking=True, validation_metric='variance_penalized', validation_consistency_weight=1.0, enable_prefix_consensus=False, prefix_levels=2, consensus_samples=12, consensus_threshold=0.5, enable_quantile_grid_thresholds=False, max_threshold_bins=24, leaf_smoothing=0.0, leaf_smoothing_strategy='m_estimate', enable_calibrated_smoothing=False, min_leaf_samples_for_stability=5, enable_winsorization=False, winsor_quantiles=(0.01, 0.99), enable_feature_standardization=False, enable_oblique_splits=False, oblique_strategy='root_only', oblique_regularization='lasso', enable_correlation_gating=True, min_correlation_threshold=0.3, enable_lookahead=False, lookahead_depth=1, beam_width=8, enable_ambiguity_gating=True, ambiguity_threshold=0.05, min_samples_for_lookahead=100, enable_deterministic_preprocessing=False, enable_deterministic_tiebreaks=True, enable_margin_vetoes=False, margin_threshold=0.03, enable_variance_aware_stopping=False, variance_stopping_weight=1.0, variance_stopping_strategy='variance_penalty', enable_bootstrap_variance_tracking=False, variance_tracking_samples=10, enable_explicit_variance_penalty=False, variance_penalty_weight=0.1, split_strategy=None, algorithm_focus='stability', classification_criterion='gini', random_state=None, enable_threshold_binning=False, enable_gain_margin_logic=False, enable_beam_search_for_consensus=False, enable_robust_consensus_for_ambiguous=False)

Methods

  • __init__([task, max_depth, ...])

  • fit(X, y) – Fit the stable tree to the training data.

  • get_metadata_routing() – Get metadata routing of this object.

  • get_params([deep]) – Get parameters for this estimator.

  • predict(X) – Predict targets for samples in X.

  • predict_proba(X) – Predict class probabilities for classification tasks.

  • score(X, y) – Return the mean accuracy (classification) or R² (regression).

  • set_params(**params) – Set the parameters of this estimator.

fit(X, y)

Fit the stable tree to the training data.

Parameters:
  • X (ndarray[tuple[Any, ...], dtype[floating]]) – Training feature matrix of shape (n_samples, n_features).

  • y (ndarray[tuple[Any, ...], dtype[Any]]) – Training target values of shape (n_samples,).

Returns:

Fitted estimator.

Return type:

BaseStableTree

Raises:

ValueError – If multi-class classification is attempted (not yet supported).

predict(X)

Predict targets for samples in X.

Parameters:

X (ndarray[tuple[Any, ...], dtype[floating]]) – Feature matrix of shape (n_samples, n_features).

Returns:

Predicted values of shape (n_samples,).

Return type:

NDArray[Any]

Raises:

ValueError – If the tree has not been fitted.

predict_proba(X)

Predict class probabilities for classification tasks.

Parameters:

X (ndarray[tuple[Any, ...], dtype[floating]]) – Feature matrix of shape (n_samples, n_features).

Returns:

Class probabilities of shape (n_samples, n_classes).

Return type:

NDArray[np.floating]

Raises:

ValueError – If called on a regression task or if the tree has not been fitted.

score(X, y)

Return the mean accuracy (classification) or R² (regression).

Parameters:
  • X (ndarray[tuple[Any, ...], dtype[floating]]) – Feature matrix for evaluation.

  • y (ndarray[tuple[Any, ...], dtype[Any]]) – True target values.

Returns:

Accuracy for classification, R² for regression.

Return type:

float
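For reference, the two quantities that score is documented to return can be written out in a few lines. This is a sketch of the standard definitions of mean accuracy and R², not the library's code; the helper names accuracy and r2 are hypothetical, and r2 assumes the true targets are not all identical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot  # undefined when y_true is constant


acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])                 # 3 of 4 labels match
fit_quality = r2([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

An R² of 1.0 means perfect predictions, 0.0 matches always predicting the mean of y_true, and negative values are possible for worse models.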

classmethod __init_subclass__(**kwargs)

Set the set_{method}_request methods.

This uses PEP 487 [1] to set the set_{method}_request methods. It looks up the default request values, which are either set through __metadata_request__* class attributes or inferred from method signatures.

The __metadata_request__* class attributes are used when a method does not explicitly accept metadata through its arguments, or when the developer wants to specify a request value for that metadata other than the default None.

References

[1] PEP 487, Simpler customisation of class creation.

get_metadata_routing()

Get metadata routing of this object.

See the User Guide for how the routing mechanism works.

Returns:

routing – A MetadataRequest encapsulating routing information.

Return type:

MetadataRequest

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance
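The <component>__<parameter> convention can be illustrated with a toy stand-in. ToyEstimator below is hypothetical and is not scikit-learn's or this library's code; it only mimics how a double-underscore key is split into a component name and a parameter that is forwarded to that component.

```python
class ToyEstimator:
    """Hypothetical stand-in that mimics the get_params/set_params conventions."""

    def __init__(self, max_depth=5, inner=None):
        self.max_depth = max_depth
        self.inner = inner  # optional nested estimator

    def get_params(self, deep=True):
        params = {"max_depth": self.max_depth, "inner": self.inner}
        if deep and hasattr(self.inner, "get_params"):
            # Expose nested parameters under the "inner__" prefix.
            for key, value in self.inner.get_params(deep=True).items():
                params[f"inner__{key}"] = value
        return params

    def set_params(self, **params):
        valid = self.get_params(deep=False)
        for key, value in params.items():
            # "inner__max_depth" -> name="inner", sub_key="max_depth"
            name, delim, sub_key = key.partition("__")
            if name not in valid:
                raise ValueError(f"invalid parameter {name!r}")
            if delim:
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self  # returning self allows chained calls


outer = ToyEstimator(max_depth=5, inner=ToyEstimator(max_depth=2))
returned = outer.set_params(max_depth=3, inner__max_depth=4)
```

Here the plain key updates the outer object, while the prefixed key is routed to the nested component.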