alsgls package

class alsgls.ALSGLS(*, rank: int | str | None = 'auto', lam_F: float = 0.001, lam_B: float = 0.001, max_sweeps: int = 12, rel_tol: float = 1e-06, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, scale_correct: bool = True, scale_floor: float = 1e-08)[source]

Bases: object

Scikit-learn-style estimator for low-rank GLS via ALS.

fit(Xs: Sequence[Any], Y: Any) → ALSGLS[source]
get_params(deep: bool = True) → dict[str, Any][source]
predict(Xs: Sequence[Any]) → ndarray[source]
score(Xs: Sequence[Any], Y: Any) → float[source]
set_params(**params: Any) → ALSGLS[source]
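
Examples

A minimal usage sketch with the package's simulate_sur helper. Exactly what score returns (e.g. negative MSE versus average log-likelihood) follows the estimator's own convention and is not pinned down here:

>>> from alsgls import ALSGLS, simulate_sur
>>> Xtr, Ytr, Xte, Yte = simulate_sur(200, 50, K=6, p=4, k=2, seed=0)
>>> est = ALSGLS(rank=2).fit(Xtr, Ytr)
>>> Yhat = est.predict(Xte)   # (50, 6) array of held-out predictions
>>> s = est.score(Xte, Yte)   # scalar goodness-of-fit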
class alsgls.ALSGLSSystem(system: Mapping[Any, tuple[Any, Any]] | Sequence[tuple[Any, tuple[Any, Any]]], *, rank: int | str | None = 'auto', lam_F: float = 0.001, lam_B: float = 0.001, max_sweeps: int = 12, rel_tol: float = 1e-06, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, scale_correct: bool = True, scale_floor: float = 1e-08)[source]

Bases: object

Statsmodels-style system container for ALS GLS fitting.

as_arrays() → tuple[list[ndarray], ndarray][source]
fit() → ALSGLSSystemResults[source]
property keqs: int
property nobs: int
class alsgls.ALSGLSSystemResults(model: ALSGLSSystem, estimator: ALSGLS)[source]

Bases: object

Lightweight results container mimicking statsmodels outputs.

params_as_series()[source]
predict(exog: Mapping[Any, Any] | Sequence[Any] | None = None) → ndarray[source]
summary_dict() → dict[str, Any][source]
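
Examples

A minimal end-to-end sketch. The per-equation (endog, exog) tuple order mirrors statsmodels system APIs and is an assumption here, as is treating exog=None in predict as "reuse the training design":

>>> from alsgls import ALSGLSSystem, simulate_sur
>>> Xtr, Ytr, _, _ = simulate_sur(200, 20, K=3, p=4, k=2, seed=0)
>>> system = {f"eq{j}": (Ytr[:, j], Xtr[j]) for j in range(3)}  # assumed (endog, exog) order
>>> res = ALSGLSSystem(system, rank=2).fit()
>>> Yhat = res.predict()        # exog=None: assumed in-sample predictions
>>> info = res.summary_dict()   # lightweight dict summary of the fit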
alsgls.XB_from_Blist(Xs, B_list)[source]

Return the N×K matrix of predictions; column j is Xs[j] @ B_list[j].

alsgls.als_gls(Xs, Y, k, lam_F: float = 0.001, lam_B: float = 0.001, sweeps: int = 8, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, *, scale_correct: bool = True, scale_floor: float = 1e-08, rel_tol: float = 1e-06)[source]

Alternating-least-squares GLS with low-rank-plus-diagonal covariance. Uses Woodbury throughout; never materializes K×K dense Σ.

Enhancements (correctness-first):
  • Cached Woodbury pieces per sweep.

  • Block-Jacobi preconditioner using diag(Σ^{-1}).

  • PCA init of F with F F^T ≈ R^T R / N.

  • Guarded MLE scale-correction of Σ each sweep.

  • β-step revert if it worsens the NLL (keeps the trace non-increasing).

  • Backtracking/damped acceptance on (F, D) to accept only NLL-improving updates.

  • Dual traces in info: nll_beta_trace (post-β), nll_trace/nll_sigma_trace (post-Σ).

Parameters:
  • Xs (list of ndarray) – K design matrices; Xs[j] has shape (N, p_j).

  • Y (ndarray) – Response matrix of shape (N, K).

  • k (int) – Rank of the low-rank component (F has shape K×k).

  • lam_F (float) – Ridge penalty for the U/F ALS updates.

  • lam_B (float) – Ridge penalty for the β-step (per equation).

  • sweeps (int) – Maximum number of ALS sweeps.

  • d_floor (float) – Minimum variance for D, keeping Σ SPD.

  • cg_maxit (int) – Maximum CG iterations in the β-step.

  • cg_tol (float) – CG relative tolerance.

  • scale_correct (bool) – If True, attempt the guarded MLE scale fix on Σ each sweep.

  • scale_floor (float) – Minimum scalar for the scale correction.

  • rel_tol (float) – Relative NLL improvement threshold for early stopping.

Returns:

(B_list, F, D, mem_MB_est, info), where info includes:
  • p_list

  • cg (last sweep)

  • nll_trace (post-Σ, non-increasing)

  • nll_sigma_trace (alias of nll_trace)

  • nll_beta_trace (post-β baseline per sweep)

  • accept_t (list of accepted backtracking step sizes t)

  • scale_used (list of accepted scale factors; 1.0 if not applied)
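
Examples

A minimal sketch tying the pieces together, using simulate_gls for data and XB_from_Blist/mse for evaluation:

>>> from alsgls import als_gls, simulate_gls, XB_from_Blist, mse
>>> Xtr, Ytr, Xte, Yte = simulate_gls(200, 50, [3, 4, 5], k=2, seed=0)
>>> B_list, F, D, mem_mb, info = als_gls(Xtr, Ytr, k=2)
>>> test_mse = mse(Yte, XB_from_Blist(Xte, B_list))   # held-out predictive error
>>> info["nll_trace"][-1] <= info["nll_trace"][0]     # post-Σ trace is non-increasing
True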

alsgls.em_gls(Xs, Y, k, lam_F=0.001, lam_B=0.001, iters=30, d_floor=1e-08)[source]

Dense-ish EM baseline for low-rank-plus-diagonal GLS. Builds Σ^{-1} explicitly (K×K) in the β-step to mimic O(K^2) memory. Returns (B_list, F, D, mem_MB_est, info).
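
Examples

A sketch contrasting the memory estimates of the two fitters; the O(K^2) footprint of the dense-ish baseline only becomes visible at larger K, so the comparison below is indicative rather than guaranteed:

>>> from alsgls import als_gls, em_gls, simulate_gls
>>> Xs, Y, _, _ = simulate_gls(300, 10, [3] * 40, k=2, seed=0)   # K = 40 equations
>>> *_, mem_als, _ = als_gls(Xs, Y, k=2)
>>> *_, mem_em, _ = em_gls(Xs, Y, k=2)
>>> ratio = mem_em / mem_als   # dense-ish EM baseline vs Woodbury-based ALS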

alsgls.mse(Y, Yhat)[source]

Mean squared error between two matrices.

alsgls.nll_per_row(R, F, D)[source]

Negative log-likelihood per row for residual matrix R under Σ = F F^T + diag(D) with Gaussian errors.

Returns:

0.5 * [ tr(R Σ^{-1} R^T)/N + log det(Σ) + K log(2π) ] where N is the number of rows in R.

Return type:

float
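
Examples

A minimal sketch evaluating a fitted covariance on training residuals:

>>> from alsgls import als_gls, nll_per_row, XB_from_Blist, simulate_gls
>>> Xs, Y, _, _ = simulate_gls(200, 10, [3, 4], k=1, seed=0)
>>> B_list, F, D, _, _ = als_gls(Xs, Y, k=1)
>>> R = Y - XB_from_Blist(Xs, B_list)   # N×K residual matrix
>>> nll = nll_per_row(R, F, D)          # per-row Gaussian NLL under Σ = F F^T + diag(D)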

alsgls.simulate_gls(N_tr, N_te, p_list, k, seed=0)[source]

Simulate a generalized least squares (GLS) dataset.

This variant allows each response equation to have its own number of features as specified by p_list.

Parameters:
  • N_tr (int) – Number of training samples.

  • N_te (int) – Number of test samples.

  • p_list (sequence of int) – Number of features for each equation.

  • k (int) – Latent factor dimension controlling correlated noise.

  • seed (int, optional) – Seed for the NumPy random number generator. Defaults to 0.

Returns:

  • X_tr (list of ndarray) – Feature matrices for the training set. X_tr[j] has shape (N_tr, p_list[j]).

  • Y_tr (ndarray) – Training responses of shape (N_tr, K) where K = len(p_list).

  • X_te (list of ndarray) – Feature matrices for the test set. X_te[j] has shape (N_te, p_list[j]).

  • Y_te (ndarray) – Test responses of shape (N_te, K).

Notes

Randomness is controlled via numpy.random.default_rng(seed); pass a different seed for different simulations.

Examples

>>> p_list = [3, 5, 2]
>>> Xtr, Ytr, Xte, Yte = simulate_gls(100, 20, p_list, k=2, seed=0)
alsgls.simulate_sur(N_tr, N_te, K, p, k, seed=0)[source]

Simulate a Seemingly Unrelated Regression (SUR) dataset.

Parameters:
  • N_tr (int) – Number of training samples.

  • N_te (int) – Number of test samples.

  • K (int) – Number of response equations.

  • p (int) – Number of features per equation.

  • k (int) – Latent factor dimension controlling correlated noise.

  • seed (int, optional) – Seed for the NumPy random number generator. Defaults to 0.

Returns:

  • X_tr (list of ndarray) – Feature matrices for the training set. Each element has shape (N_tr, p).

  • Y_tr (ndarray) – Training responses of shape (N_tr, K).

  • X_te (list of ndarray) – Feature matrices for the test set. Each element has shape (N_te, p).

  • Y_te (ndarray) – Test responses of shape (N_te, K).

Notes

Randomness is controlled via numpy.random.default_rng(seed); pass a different seed for different simulations.

Examples

>>> Xtr, Ytr, Xte, Yte = simulate_sur(100, 20, K=3, p=5, k=2, seed=42)

Submodules

alsgls.als module

alsgls.als.als_gls(Xs, Y, k, lam_F: float = 0.001, lam_B: float = 0.001, sweeps: int = 8, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, *, scale_correct: bool = True, scale_floor: float = 1e-08, rel_tol: float = 1e-06)[source]

Alternating-least-squares GLS with low-rank-plus-diagonal covariance. Uses Woodbury throughout; never materializes K×K dense Σ.

Enhancements (correctness-first):
  • Cached Woodbury pieces per sweep.

  • Block-Jacobi preconditioner using diag(Σ^{-1}).

  • PCA init of F with F F^T ≈ R^T R / N.

  • Guarded MLE scale-correction of Σ each sweep.

  • β-step revert if it worsens the NLL (keeps the trace non-increasing).

  • Backtracking/damped acceptance on (F, D) to accept only NLL-improving updates.

  • Dual traces in info: nll_beta_trace (post-β), nll_trace/nll_sigma_trace (post-Σ).

Parameters:
  • Xs (list of ndarray) – K design matrices; Xs[j] has shape (N, p_j).

  • Y (ndarray) – Response matrix of shape (N, K).

  • k (int) – Rank of the low-rank component (F has shape K×k).

  • lam_F (float) – Ridge penalty for the U/F ALS updates.

  • lam_B (float) – Ridge penalty for the β-step (per equation).

  • sweeps (int) – Maximum number of ALS sweeps.

  • d_floor (float) – Minimum variance for D, keeping Σ SPD.

  • cg_maxit (int) – Maximum CG iterations in the β-step.

  • cg_tol (float) – CG relative tolerance.

  • scale_correct (bool) – If True, attempt the guarded MLE scale fix on Σ each sweep.

  • scale_floor (float) – Minimum scalar for the scale correction.

  • rel_tol (float) – Relative NLL improvement threshold for early stopping.

Returns:

(B_list, F, D, mem_MB_est, info), where info includes:
  • p_list

  • cg (last sweep)

  • nll_trace (post-Σ, non-increasing)

  • nll_sigma_trace (alias of nll_trace)

  • nll_beta_trace (post-β baseline per sweep)

  • accept_t (list of accepted backtracking step sizes t)

  • scale_used (list of accepted scale factors; 1.0 if not applied)

alsgls.em module

alsgls.em.em_gls(Xs, Y, k, lam_F=0.001, lam_B=0.001, iters=30, d_floor=1e-08)[source]

Dense-ish EM baseline for low-rank-plus-diagonal GLS. Builds Σ^{-1} explicitly (K×K) in the β-step to mimic O(K^2) memory. Returns (B_list, F, D, mem_MB_est, info).

alsgls.ops module

alsgls.ops.XB_from_Blist(Xs, B_list)[source]

Return the N×K matrix of predictions; column j is Xs[j] @ B_list[j].

alsgls.ops.apply_siginv_to_matrix(M: ndarray, F: ndarray, D: ndarray, *, Dinv: ndarray | None = None, C_inv: ndarray | None = None, C_chol: ndarray | None = None) → ndarray[source]

Right-multiply an (N×K) matrix M by Σ^{-1} using Woodbury, where Σ = F F^T + diag(D).

You may pass either C_inv (explicit inverse of I + F^T D^{-1} F) or C_chol (its Cholesky factor). If neither is provided, a small explicit inverse will be computed internally for convenience.
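
Examples

A sanity-check sketch against the dense inverse; forming Σ explicitly is viable only at toy sizes:

>>> import numpy as np
>>> from alsgls.ops import apply_siginv_to_matrix
>>> rng = np.random.default_rng(0)
>>> F = rng.standard_normal((6, 2))
>>> D = rng.uniform(0.5, 1.5, size=6)
>>> M = rng.standard_normal((10, 6))
>>> out = apply_siginv_to_matrix(M, F, D)   # Woodbury pieces computed internally
>>> np.allclose(out, M @ np.linalg.inv(F @ F.T + np.diag(D)))
True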

alsgls.ops.cg_solve(operator_mv, b, x0=None, maxit=500, tol=1e-07, M_pre=None)[source]

Conjugate gradient for SPD operator A (matrix-free).

Parameters:
  • operator_mv (callable) – Function that returns A @ x for a given x.

  • b (ndarray) – Right-hand side.

  • x0 (ndarray, optional) – Initial guess.

  • maxit (int) – Maximum CG iterations.

  • tol (float) – Relative residual tolerance.

  • M_pre (callable, optional) – Preconditioner application: returns M^{-1} @ r.

Returns:

  • x (ndarray) – Approximate solution.

  • info (dict) – Iterations and final residual norm.
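
Examples

A minimal matrix-free usage sketch on a small synthetic SPD system:

>>> import numpy as np
>>> from alsgls.ops import cg_solve
>>> rng = np.random.default_rng(0)
>>> A = rng.standard_normal((20, 20))
>>> A = A @ A.T + 20.0 * np.eye(20)   # shift the Gram matrix to make A SPD
>>> b = rng.standard_normal(20)
>>> x, info = cg_solve(lambda v: A @ v, b, tol=1e-10)
>>> np.allclose(A @ x, b)
True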

alsgls.ops.siginv_diag(F: ndarray, Dinv: ndarray, C_chol: ndarray) → ndarray[source]

Compute the diagonal of Σ^{-1} = D^{-1} - D^{-1} F C^{-1} F^T D^{-1} efficiently, where C = I + F^T D^{-1} F, given Dinv and the Cholesky factor of C.

Returns:

diag_Sinv – The diagonal entries of Σ^{-1}.

Return type:

(K,) array
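
Examples

A check sketch against the dense diagonal (toy sizes only):

>>> import numpy as np
>>> from alsgls.ops import woodbury_chol, siginv_diag
>>> rng = np.random.default_rng(0)
>>> F = rng.standard_normal((5, 2))
>>> D = rng.uniform(0.5, 2.0, size=5)
>>> Dinv, C_chol = woodbury_chol(F, D)
>>> d = siginv_diag(F, Dinv, C_chol)
>>> np.allclose(d, np.diag(np.linalg.inv(F @ F.T + np.diag(D))))
True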

alsgls.ops.stack_B_list(B_list)[source]

Stack list of (p_j×1) blocks into a flat vector.

alsgls.ops.unstack_B_vec(bvec, p_list)[source]

Inverse of stack_B_list: split a flat vector back into a list of (p_j×1) blocks.
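
Examples

A round-trip sketch; only the round trip is relied on here, not the exact flat layout:

>>> import numpy as np
>>> from alsgls.ops import stack_B_list, unstack_B_vec
>>> B_list = [np.ones((2, 1)), np.arange(3.0).reshape(3, 1)]
>>> bvec = stack_B_list(B_list)
>>> [b.shape for b in unstack_B_vec(bvec, [2, 3])]
[(2, 1), (3, 1)]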

alsgls.ops.woodbury_chol(F: ndarray, D: ndarray)[source]

Return (Dinv, C_chol) where C_chol is the Cholesky factor of C = I + F^T D^{-1} F.

Intended for numerically stable downstream solves that avoid forming C^{-1} explicitly.

alsgls.ops.woodbury_pieces(F: ndarray, D: ndarray)[source]

Return (Dinv, C_inv) used in Woodbury for Σ = F F^T + diag(D).

Σ^{-1} = D^{-1} - D^{-1} F C^{-1} F^T D^{-1}, where C = (I + F^T D^{-1} F). For backward compatibility, this returns the explicit small inverse C_inv.
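
Examples

A sketch assembling Σ^{-1} from the returned pieces. It assumes Dinv comes back as a length-K vector holding 1/D, which is plausible but not stated above:

>>> import numpy as np
>>> from alsgls.ops import woodbury_pieces
>>> rng = np.random.default_rng(0)
>>> F = rng.standard_normal((4, 2))
>>> D = rng.uniform(0.5, 2.0, size=4)
>>> Dinv, C_inv = woodbury_pieces(F, D)   # Dinv assumed to be a (K,) vector
>>> DiF = Dinv[:, None] * F               # D^{-1} F
>>> Sinv = np.diag(Dinv) - DiF @ C_inv @ DiF.T
>>> np.allclose(Sinv, np.linalg.inv(F @ F.T + np.diag(D)))
True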

alsgls.api module

High-level estimator APIs for ALS-based GLS fitting.

class alsgls.api.ALSGLS(*, rank: int | str | None = 'auto', lam_F: float = 0.001, lam_B: float = 0.001, max_sweeps: int = 12, rel_tol: float = 1e-06, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, scale_correct: bool = True, scale_floor: float = 1e-08)[source]

Bases: object

Scikit-learn-style estimator for low-rank GLS via ALS.

fit(Xs: Sequence[Any], Y: Any) → ALSGLS[source]
get_params(deep: bool = True) → dict[str, Any][source]
predict(Xs: Sequence[Any]) → ndarray[source]
score(Xs: Sequence[Any], Y: Any) → float[source]
set_params(**params: Any) → ALSGLS[source]
class alsgls.api.ALSGLSSystem(system: Mapping[Any, tuple[Any, Any]] | Sequence[tuple[Any, tuple[Any, Any]]], *, rank: int | str | None = 'auto', lam_F: float = 0.001, lam_B: float = 0.001, max_sweeps: int = 12, rel_tol: float = 1e-06, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, scale_correct: bool = True, scale_floor: float = 1e-08)[source]

Bases: object

Statsmodels-style system container for ALS GLS fitting.

as_arrays() → tuple[list[ndarray], ndarray][source]
fit() → ALSGLSSystemResults[source]
property keqs: int
property nobs: int
class alsgls.api.ALSGLSSystemResults(model: ALSGLSSystem, estimator: ALSGLS)[source]

Bases: object

Lightweight results container mimicking statsmodels outputs.

params_as_series()[source]
predict(exog: Mapping[Any, Any] | Sequence[Any] | None = None) → ndarray[source]
summary_dict() → dict[str, Any][source]

alsgls.sim module

alsgls.sim.simulate_gls(N_tr, N_te, p_list, k, seed=0)[source]

Simulate a generalized least squares (GLS) dataset.

This variant allows each response equation to have its own number of features as specified by p_list.

Parameters:
  • N_tr (int) – Number of training samples.

  • N_te (int) – Number of test samples.

  • p_list (sequence of int) – Number of features for each equation.

  • k (int) – Latent factor dimension controlling correlated noise.

  • seed (int, optional) – Seed for the NumPy random number generator. Defaults to 0.

Returns:

  • X_tr (list of ndarray) – Feature matrices for the training set. X_tr[j] has shape (N_tr, p_list[j]).

  • Y_tr (ndarray) – Training responses of shape (N_tr, K) where K = len(p_list).

  • X_te (list of ndarray) – Feature matrices for the test set. X_te[j] has shape (N_te, p_list[j]).

  • Y_te (ndarray) – Test responses of shape (N_te, K).

Notes

Randomness is controlled via numpy.random.default_rng(seed); pass a different seed for different simulations.

Examples

>>> p_list = [3, 5, 2]
>>> Xtr, Ytr, Xte, Yte = simulate_gls(100, 20, p_list, k=2, seed=0)
alsgls.sim.simulate_sur(N_tr, N_te, K, p, k, seed=0)[source]

Simulate a Seemingly Unrelated Regression (SUR) dataset.

Parameters:
  • N_tr (int) – Number of training samples.

  • N_te (int) – Number of test samples.

  • K (int) – Number of response equations.

  • p (int) – Number of features per equation.

  • k (int) – Latent factor dimension controlling correlated noise.

  • seed (int, optional) – Seed for the NumPy random number generator. Defaults to 0.

Returns:

  • X_tr (list of ndarray) – Feature matrices for the training set. Each element has shape (N_tr, p).

  • Y_tr (ndarray) – Training responses of shape (N_tr, K).

  • X_te (list of ndarray) – Feature matrices for the test set. Each element has shape (N_te, p).

  • Y_te (ndarray) – Test responses of shape (N_te, K).

Notes

Randomness is controlled via numpy.random.default_rng(seed); pass a different seed for different simulations.

Examples

>>> Xtr, Ytr, Xte, Yte = simulate_sur(100, 20, K=3, p=5, k=2, seed=42)

alsgls.metrics module

alsgls.metrics.mse(Y, Yhat)[source]

Mean squared error between two matrices.

alsgls.metrics.nll_per_row(R, F, D)[source]

Negative log-likelihood per row for residual matrix R under Σ = F F^T + diag(D) with Gaussian errors.

Returns:

0.5 * [ tr(R Σ^{-1} R^T)/N + log det(Σ) + K log(2π) ] where N is the number of rows in R.

Return type:

float