pyppur package

class pyppur.GridOptimizer(objective_func: Callable[[...], float], n_components: int, n_directions: int = 250, n_iterations: int = 10, max_iter: int = 1000, tol: float = 1e-06, random_state: int | None = None, verbose: bool = False, **kwargs: Any)

Bases: BaseOptimizer

Optimizer using a grid-based search approach.

This optimizer is particularly useful for projection indices that are not differentiable or have many local minima. It systematically explores the space of projection directions using a grid-based approach.
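The general idea of a grid-based direction search can be sketched in a few lines. This is a minimal illustration in plain Python, not the implementation used by GridOptimizer: the helper names (projection_spread, grid_search_direction) are hypothetical, the data is 2-D so that directions can be enumerated by angle, and the objective is a simple non-differentiable index chosen for the example.

```python
import math


def projection_spread(points, direction):
    """Non-differentiable index: mean absolute deviation of the 1-D projections."""
    proj = [p[0] * direction[0] + p[1] * direction[1] for p in points]
    mean = sum(proj) / len(proj)
    return sum(abs(v - mean) for v in proj) / len(proj)


def grid_search_direction(points, n_directions=180):
    """Evaluate evenly spaced unit directions on the half-circle and keep the
    one with the smallest objective (negative spread, so spread is maximized)."""
    best_dir, best_val = None, math.inf
    for k in range(n_directions):
        theta = math.pi * k / n_directions
        d = (math.cos(theta), math.sin(theta))
        val = -projection_spread(points, d)
        if val < best_val:
            best_dir, best_val = d, val
    return best_dir, best_val


# Toy data stretched along the line y = x, with a small alternating offset.
points = [(t, t + 0.1 * ((-1) ** i)) for i, t in enumerate(range(-5, 6))]
direction, value = grid_search_direction(points)
```

Because every candidate direction is evaluated independently, no gradient is needed and a poor starting point cannot trap the search, at the cost of one objective evaluation per grid point.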

optimize(X: ndarray, initial_guess: ndarray | None = None, **kwargs: Any) → tuple[ndarray, float, dict[str, Any]]

Optimize the projection directions using a grid-based approach.

Parameters:

X – Input data, shape (n_samples, n_features).
initial_guess – Optional initial guess for projection directions.
**kwargs – Additional arguments for the objective function.

Returns:

Tuple containing:
Optimized projection directions, shape (n_components, n_features)
Final objective value
Additional optimizer information

Return type:

tuple[ndarray, float, dict[str, Any]]

class pyppur.Objective(*values)

Bases: str, Enum

Objective types for projection pursuit.

DISTANCE_DISTORTION = 'distance_distortion'

RECONSTRUCTION = 'reconstruction'
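Since the enum derives from both str and Enum, its members can be looked up by value and compared with plain strings. The sketch below uses a local stand-in class that mirrors the two documented members, so it runs without pyppur installed; it only demonstrates standard enum behavior.

```python
from enum import Enum


class Objective(str, Enum):
    """Local stand-in mirroring the documented members of pyppur.Objective."""

    DISTANCE_DISTORTION = "distance_distortion"
    RECONSTRUCTION = "reconstruction"


# Look up a member from its string value, e.g. one read from a config file.
chosen = Objective("reconstruction")

# Because of the str mixin, members compare equal to their plain string values.
same_as_string = chosen == "reconstruction"
```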

class pyppur.ProjectionPursuit(n_components: int = 2, objective: Objective = Objective.DISTANCE_DISTORTION, alpha: float = 1.0, max_iter: int = 500, tol: float = 1e-06, random_state: int | None = None, optimizer: str = 'L-BFGS-B', n_init: int = 3, verbose: bool = False, center: bool = True, scale: bool = True, weight_by_distance: bool = False, tied_weights: bool = True, l2_reg: float = 0.0, use_nonlinearity_in_distance: bool = True)

Bases: object

Implementation of Projection Pursuit for dimensionality reduction.

This class provides methods to find optimal projections by minimizing either reconstruction loss or distance distortion. It supports multiple initializations (n_init) and a choice of optimizers (optimizer).

property best_loss_: float

Get the best loss value achieved.

Returns:: Best loss value.

compute_silhouette(X: ndarray, labels: ndarray) → float

Compute the silhouette score for the dimensionality reduction.

The silhouette score measures how well clusters are separated. A score close to 1.0 indicates well-separated clusters, a score near 0.0 indicates overlapping clusters, and a score close to -1.0 indicates that samples are likely assigned to the wrong cluster.

Parameters:

X – Input data, shape (n_samples, n_features).
labels – Cluster labels for each sample.

Returns:

Silhouette score between -1.0 and 1.0.
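For reference, the standard silhouette definition can be written out directly. This is a minimal plain-Python sketch with Euclidean distances and toy data; the function name and data are illustrative, and it is not necessarily how compute_silhouette is implemented.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all samples (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so it does not affect the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Two tight, well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = silhouette_score(points, [0, 0, 0, 1, 1, 1])
mixed = silhouette_score(points, [0, 1, 0, 1, 0, 1])
```

With labels that match the two groups, the score is close to 1.0; with labels that cut across the groups, it drops sharply.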

compute_trustworthiness(X: ndarray, n_neighbors: int = 5) → float

Compute the trustworthiness score for the dimensionality reduction.

Trustworthiness measures how well the local structure is preserved. A score of 1.0 indicates perfect trustworthiness, while a score of 0.0 indicates that the local structure is not preserved at all.

Parameters:

X – Input data, shape (n_samples, n_features).
n_neighbors – Number of neighbors to consider for trustworthiness.

Returns:

Trustworthiness score between 0.0 and 1.0.
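The usual rank-based definition of trustworthiness penalizes points that appear among the k nearest neighbors in the embedding but not in the original space. The sketch below is a plain-Python illustration of that definition with hypothetical helper names and toy data; it assumes n_neighbors is smaller than half the number of samples and is not necessarily how compute_trustworthiness is implemented.

```python
import math


def neighbors_by_distance(points, i):
    """Indices of all other points, sorted from nearest to farthest from point i."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))


def trustworthiness(original, embedded, n_neighbors=5):
    """1 minus the normalized rank penalty of 'false' neighbors in the embedding."""
    n, k = len(original), n_neighbors
    penalty = 0
    for i in range(n):
        # Rank of every other point in the original space (1 = nearest).
        rank_orig = {j: r + 1 for r, j in enumerate(neighbors_by_distance(original, i))}
        # Embedded neighbors that were not among the k nearest originally.
        for j in neighbors_by_distance(embedded, i)[:k]:
            if rank_orig[j] > k:
                penalty += rank_orig[j] - k
    return 1.0 - 2.0 / (n * k * (2 * n - 3 * k - 1)) * penalty


original = [(0, 0.0), (1, 0.3), (2, 0.1), (3, 0.4), (4, 0.2), (5, 0.5)]
only_y = [(y,) for _, y in original]  # embedding that discards the x coordinate

perfect = trustworthiness(original, original, n_neighbors=2)
degraded = trustworthiness(original, only_y, n_neighbors=2)
```

An embedding identical to the input scores exactly 1.0, while discarding the dominant coordinate pulls distant points together and lowers the score.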

property decoder_weights_: ndarray | None

Get the decoder weights (for untied weights only).

Returns: Decoder weights, shape (n_components, n_features), or None if using tied weights.

distance_distortion(X: ndarray) → float

Compute the distance distortion for X.

Parameters: X – Input data, shape (n_samples, n_features).
Returns: Mean squared distance distortion.
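One plausible reading of "mean squared distance distortion" is the mean squared difference between pairwise distances before and after projection. The sketch below assumes exactly that, with a purely linear projection and Euclidean distances; the library may additionally apply a nonlinearity or distance weighting (see use_nonlinearity_in_distance and weight_by_distance), which this illustration ignores. Function names and data are hypothetical.

```python
import math
from itertools import combinations


def project(points, directions):
    """Linear projection: one output coordinate per direction (dot product)."""
    return [
        tuple(sum(x * w for x, w in zip(p, d)) for d in directions)
        for p in points
    ]


def distance_distortion(points, directions):
    """Mean squared difference between pairwise distances in the original
    space and in the projected space."""
    projected = project(points, directions)
    pairs = list(combinations(range(len(points)), 2))
    total = 0.0
    for i, j in pairs:
        d_orig = math.dist(points[i], points[j])
        d_proj = math.dist(projected[i], projected[j])
        total += (d_orig - d_proj) ** 2
    return total / len(pairs)


# 3-D points that all lie in the z = 0 plane.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 1.0, 0.0)]

keep_xy = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]  # preserves every pairwise distance
keep_xz = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]  # discards the y coordinate

lossless = distance_distortion(points, keep_xy)
lossy = distance_distortion(points, keep_xz)
```

Projecting onto the plane that contains the data leaves all distances intact, whereas dropping an informative coordinate produces a positive distortion.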

evaluate(X: ndarray, labels: ndarray | None = None, n_neighbors: int = 5) → dict[str, float]

Evaluate the dimensionality reduction with multiple metrics.

Parameters:

X – Input data, shape (n_samples, n_features).
labels – Optional cluster labels for silhouette score.
n_neighbors – Number of neighbors for trustworthiness.

Returns:

Dictionary with evaluation metrics.

fit(X: ndarray) → ProjectionPursuit

Fit the ProjectionPursuit model to the data.

Parameters: X – Input data, shape (n_samples, n_features).
Returns: The fitted model.

property fit_time_: float

Get the time taken to fit the model.

Returns: Time in seconds.

fit_transform(X: ndarray) → ndarray

Fit the model with X and apply dimensionality reduction on X.

Parameters: X – Input data, shape (n_samples, n_features).
Returns: Transformed data, shape (n_samples, n_components).

property loss_curve_: list[float]

Get the loss curve during optimization.

Returns: Loss values during optimization.

property optimizer_info_: dict[str, Any]

Get additional information from the optimizer.

Returns: Optimizer information.

reconstruct(X: ndarray) → ndarray

Reconstruct X from the projected data.

Parameters: X – Input data, shape (n_samples, n_features).
Returns: Reconstructed data, shape (n_samples, n_features).

reconstruction_error(X: ndarray) → float

Compute the reconstruction error for X.

Parameters: X – Input data, shape (n_samples, n_features).
Returns: Mean squared reconstruction error.
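To make the relationship between projection, reconstruction, and reconstruction error concrete, here is a minimal plain-Python sketch. It assumes a purely linear model with tied weights, where the same directions are used to encode and to decode; any nonlinearity, centering, or scaling the library applies is left out, and the helper names are hypothetical.

```python
def encode(points, directions):
    """Project each point onto every direction (tied-weights encoder)."""
    return [[sum(x * w for x, w in zip(p, d)) for d in directions] for p in points]


def decode(codes, directions):
    """Map codes back to feature space using the same directions (tied weights)."""
    n_features = len(directions[0])
    return [
        [sum(z * d[f] for z, d in zip(code, directions)) for f in range(n_features)]
        for code in codes
    ]


def reconstruction_error(points, directions):
    """Mean squared difference between the input and its reconstruction."""
    recon = decode(encode(points, directions), directions)
    sq = [(x - r) ** 2 for p, q in zip(points, recon) for x, r in zip(p, q)]
    return sum(sq) / len(sq)


# 3-D points that all lie in the z = 0 plane.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 1.0, 0.0)]

err_xy = reconstruction_error(points, [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
err_x = reconstruction_error(points, [(1.0, 0.0, 0.0)])
```

Two orthonormal directions spanning the data plane reconstruct the points exactly, while a single direction loses the y coordinate and leaves its squared values as the error.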

transform(X: ndarray) → ndarray

Apply dimensionality reduction to X.

Parameters: X – Input data, shape (n_samples, n_features).
Returns: Transformed data, shape (n_samples, n_components).

property x_loadings_: ndarray

Get the projection directions (encoder).

Returns: Projection directions, shape (n_components, n_features).

class pyppur.ScipyOptimizer(objective_func: Callable[[...], float], n_components: int, method: str = 'L-BFGS-B', max_iter: int = 1000, tol: float = 1e-06, random_state: int | None = None, verbose: bool = False, **kwargs: Any)

Bases: BaseOptimizer

Optimizer using SciPy’s optimization methods.

This optimizer uses SciPy’s optimization routines, in particular the L-BFGS-B method, which is well suited to projection pursuit problems.

optimize(X: ndarray, initial_guess: ndarray | None = None, **kwargs: Any) → tuple[ndarray, float, dict[str, Any]]

Optimize the projection directions using SciPy’s optimization methods.

Parameters:

X – Input data, shape (n_samples, n_features).
initial_guess – Optional initial guess for projection directions.
**kwargs – Additional arguments for the objective function.

Returns:

Tuple containing:
Optimized projection directions, shape (n_components, n_features)
Final objective value
Additional optimizer information

Return type:

tuple[ndarray, float, dict[str, Any]]

Subpackages

Note

For detailed API documentation including methods and properties, see the API Reference page.