alsgls package¶
- class alsgls.ALSGLS(*, rank: int | str | None = 'auto', lam_F: float = 0.001, lam_B: float = 0.001, max_sweeps: int = 12, rel_tol: float = 1e-06, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, scale_correct: bool = True, scale_floor: float = 1e-08)[source]¶
Bases: object
Scikit-learn style estimator for low-rank GLS via ALS.
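A minimal usage sketch in the scikit-learn idiom the class advertises; the fit/predict method names are assumed from that convention rather than documented here, so check them against the class reference.
>>> from alsgls import ALSGLS, simulate_sur
>>> Xtr, Ytr, Xte, Yte = simulate_sur(200, 50, K=3, p=5, k=2, seed=0)
>>> est = ALSGLS(rank="auto")
>>> est = est.fit(Xtr, Ytr)      # fit(Xs, Y): assumed sklearn-style signature
>>> Yhat = est.predict(Xte)      # predict: assumed, returning an (N_te, K) array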
- class alsgls.ALSGLSSystem(system: Mapping[Any, tuple[Any, Any]] | Sequence[tuple[Any, tuple[Any, Any]]], *, rank: int | str | None = 'auto', lam_F: float = 0.001, lam_B: float = 0.001, max_sweeps: int = 12, rel_tol: float = 1e-06, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, scale_correct: bool = True, scale_floor: float = 1e-08)[source]¶
Bases: object
Statsmodels-style system container for ALS GLS fitting.
- class alsgls.ALSGLSSystemResults(model: ALSGLSSystem, estimator: ALSGLS)[source]¶
Bases: object
Lightweight results container mimicking statsmodels outputs.
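A sketch of the intended statsmodels-style flow. Both the (endog, exog) ordering inside the system mapping and fit() returning an ALSGLSSystemResults are assumptions taken from the statsmodels convention, not from a documented signature.
>>> from alsgls import ALSGLSSystem, simulate_sur
>>> Xtr, Ytr, _, _ = simulate_sur(200, 10, K=3, p=5, k=2, seed=1)
>>> system = {f"eq{j}": (Ytr[:, j], Xtr[j]) for j in range(3)}  # assumed (endog, exog) order
>>> results = ALSGLSSystem(system, rank=2).fit()  # fit() assumed to return ALSGLSSystemResults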
- alsgls.als_gls(Xs, Y, k, lam_F: float = 0.001, lam_B: float = 0.001, sweeps: int = 8, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, *, scale_correct: bool = True, scale_floor: float = 1e-08, rel_tol: float = 1e-06)[source]¶
Alternating-least-squares GLS with a low-rank-plus-diagonal covariance. Uses the Woodbury identity throughout and never materializes the dense K×K Σ.
- Enhancements (correctness-first):
Cached Woodbury pieces per sweep.
Block-Jacobi preconditioner built from diag(Σ^{-1}).
PCA initialization of F so that F F^T ≈ R^T R / N.
Guarded MLE scale-correction of Σ each sweep.
The β-step is reverted if it worsens the NLL (keeps the trace non-increasing).
Backtracking/damped acceptance on (F, D), so only NLL-improving updates are accepted.
Dual traces in info: nll_beta_trace (post-β) and nll_trace/nll_sigma_trace (post-Σ).
- Parameters:
Xs (list of ndarray) – Length-K list of (N × p_j) design matrices.
Y (ndarray) – (N × K) response matrix.
k (int) – Rank of the low-rank component (F is K × k).
lam_F (float) – Ridge penalty for the U/F ALS updates.
lam_B (float) – Ridge penalty for the β-step (per equation).
sweeps (int) – Maximum number of ALS sweeps.
d_floor (float) – Minimum variance for D, ensuring Σ stays SPD.
cg_maxit (int) – Maximum CG iterations in the β-step.
cg_tol (float) – CG relative tolerance.
scale_correct (bool) – If True, attempt a guarded MLE scale fix on Σ each sweep.
scale_floor (float) – Minimum scalar allowed in the scale correction.
rel_tol (float) – Relative NLL-improvement threshold for early stopping.
- Returns:
- info includes:
p_list
cg (last sweep)
nll_trace (post-Σ, non-increasing)
nll_sigma_trace (alias of nll_trace)
nll_beta_trace (post-β baseline per sweep)
accept_t (list of accepted backtracking step sizes t)
scale_used (list of accepted scale factors; 1.0 if no correction was applied)
- Return type:
B_list, F, D, mem_MB_est, info
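End-to-end example of the documented signature and return tuple, using the package's own simulator; the residual construction assumes each element of B_list is the per-equation coefficient vector.
>>> import numpy as np
>>> from alsgls import als_gls, nll_per_row, simulate_sur
>>> Xtr, Ytr, Xte, Yte = simulate_sur(300, 100, K=4, p=3, k=2, seed=0)
>>> B_list, F, D, mem_MB, info = als_gls(Xtr, Ytr, k=2)
>>> R_te = Yte - np.column_stack([X @ b for X, b in zip(Xte, B_list)])
>>> nll_te = nll_per_row(R_te, F, D)  # held-out fit under the learned Σ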
- alsgls.em_gls(Xs, Y, k, lam_F=0.001, lam_B=0.001, iters=30, d_floor=1e-08)[source]¶
Dense-ish EM baseline for low-rank-plus-diagonal GLS. Builds Σ^{-1} explicitly (K×K) in the β-step to mimic O(K²) memory. Returns (B_list, F, D, mem_MB_est, info).
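Since both solvers share the (Xs, Y, k) calling convention and return the same tuple, a side-by-side memory estimate is direct; the gap between mem_als and mem_em should widen as K grows, given the O(K²) build-up above.
>>> from alsgls import als_gls, em_gls, simulate_sur
>>> Xtr, Ytr, _, _ = simulate_sur(300, 10, K=4, p=3, k=2, seed=0)
>>> _, _, _, mem_als, _ = als_gls(Xtr, Ytr, k=2)
>>> _, _, _, mem_em, _ = em_gls(Xtr, Ytr, k=2)  # dense Σ^{-1}, O(K²) memory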
- alsgls.nll_per_row(R, F, D)[source]¶
Negative log-likelihood per row for residual matrix R under Σ = F F^T + diag(D) with Gaussian errors.
- Returns:
0.5 * [ tr(R Σ^{-1} R^T)/N + log det(Σ) + K log(2π) ], where N is the number of rows in R.
- Return type:
float
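For reference, the returned quantity can be evaluated without ever forming Σ, using the Woodbury identity for the quadratic term and the matrix determinant lemma for log det(Σ). A minimal NumPy sketch follows; the packaged implementation may differ in details such as flooring and caching.
import numpy as np

def nll_per_row_sketch(R, F, D):
    # Per-row Gaussian NLL under Sigma = F F^T + diag(D); the K×K Sigma is never formed.
    N, K = R.shape
    Dinv = 1.0 / D
    C = np.eye(F.shape[1]) + (F * Dinv[:, None]).T @ F   # C = I + F^T D^{-1} F  (k×k)
    L = np.linalg.cholesky(C)
    RDinv = R * Dinv[None, :]
    # tr(R Sigma^{-1} R^T) = tr(R D^{-1} R^T) - ||L^{-1} (R D^{-1} F)^T||_F^2
    W = np.linalg.solve(L, (RDinv @ F).T)
    quad = np.sum(RDinv * R) - np.sum(W ** 2)
    # Matrix determinant lemma: log det Sigma = log det C + sum(log D)
    logdet = 2.0 * np.sum(np.log(np.diag(L))) + np.sum(np.log(D))
    return 0.5 * (quad / N + logdet + K * np.log(2.0 * np.pi))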
- alsgls.simulate_gls(N_tr, N_te, p_list, k, seed=0)[source]¶
Simulate a generalized least squares (GLS) dataset.
This variant allows each response equation to have its own number of features, as specified by p_list.
- Parameters:
N_tr (int) – Number of training samples.
N_te (int) – Number of test samples.
p_list (list of int) – Number of features for each response equation; K = len(p_list).
k (int) – Latent factor dimension controlling correlated noise.
seed (int, optional) – Seed for the NumPy random number generator. Defaults to 0.
- Returns:
X_tr (list of ndarray) – Feature matrices for the training set. X_tr[j] has shape (N_tr, p_list[j]).
Y_tr (ndarray) – Training responses of shape (N_tr, K) where K = len(p_list).
X_te (list of ndarray) – Feature matrices for the test set. X_te[j] has shape (N_te, p_list[j]).
Y_te (ndarray) – Test responses of shape (N_te, K).
Notes
Randomness is controlled via numpy.random.default_rng(seed); pass a different seed for different simulations.
Examples
>>> p_list = [3, 5, 2]
>>> Xtr, Ytr, Xte, Yte = simulate_gls(100, 20, p_list, k=2, seed=0)
- alsgls.simulate_sur(N_tr, N_te, K, p, k, seed=0)[source]¶
Simulate a Seemingly Unrelated Regression (SUR) dataset.
- Parameters:
N_tr (int) – Number of training samples.
N_te (int) – Number of test samples.
K (int) – Number of response equations.
p (int) – Number of features per equation.
k (int) – Latent factor dimension controlling correlated noise.
seed (int, optional) – Seed for the NumPy random number generator. Defaults to 0.
- Returns:
X_tr (list of ndarray) – Feature matrices for the training set. Each element has shape (N_tr, p).
Y_tr (ndarray) – Training responses of shape (N_tr, K).
X_te (list of ndarray) – Feature matrices for the test set. Each element has shape (N_te, p).
Y_te (ndarray) – Test responses of shape (N_te, K).
Notes
Randomness is controlled via numpy.random.default_rng(seed); pass a different seed for different simulations.
Examples
>>> Xtr, Ytr, Xte, Yte = simulate_sur(100, 20, K=3, p=5, k=2, seed=42)
Submodules¶
alsgls.als module¶
- alsgls.als.als_gls(Xs, Y, k, lam_F: float = 0.001, lam_B: float = 0.001, sweeps: int = 8, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, *, scale_correct: bool = True, scale_floor: float = 1e-08, rel_tol: float = 1e-06)[source]¶
Alternating-least-squares GLS with a low-rank-plus-diagonal covariance. Uses the Woodbury identity throughout and never materializes the dense K×K Σ.
- Enhancements (correctness-first):
Cached Woodbury pieces per sweep.
Block-Jacobi preconditioner built from diag(Σ^{-1}).
PCA initialization of F so that F F^T ≈ R^T R / N.
Guarded MLE scale-correction of Σ each sweep.
The β-step is reverted if it worsens the NLL (keeps the trace non-increasing).
Backtracking/damped acceptance on (F, D), so only NLL-improving updates are accepted.
Dual traces in info: nll_beta_trace (post-β) and nll_trace/nll_sigma_trace (post-Σ).
- Parameters:
Xs (list of ndarray) – Length-K list of (N × p_j) design matrices.
Y (ndarray) – (N × K) response matrix.
k (int) – Rank of the low-rank component (F is K × k).
lam_F (float) – Ridge penalty for the U/F ALS updates.
lam_B (float) – Ridge penalty for the β-step (per equation).
sweeps (int) – Maximum number of ALS sweeps.
d_floor (float) – Minimum variance for D, ensuring Σ stays SPD.
cg_maxit (int) – Maximum CG iterations in the β-step.
cg_tol (float) – CG relative tolerance.
scale_correct (bool) – If True, attempt a guarded MLE scale fix on Σ each sweep.
scale_floor (float) – Minimum scalar allowed in the scale correction.
rel_tol (float) – Relative NLL-improvement threshold for early stopping.
- Returns:
- info includes:
p_list
cg (last sweep)
nll_trace (post-Σ, non-increasing)
nll_sigma_trace (alias of nll_trace)
nll_beta_trace (post-β baseline per sweep)
accept_t (list of accepted backtracking step sizes t)
scale_used (list of accepted scale factors; 1.0 if no correction was applied)
- Return type:
B_list, F, D, mem_MB_est, info
alsgls.em module¶
alsgls.ops module¶
- alsgls.ops.apply_siginv_to_matrix(M: ndarray, F: ndarray, D: ndarray, *, Dinv: ndarray | None = None, C_inv: ndarray | None = None, C_chol: ndarray | None = None) → ndarray[source]¶
Right-multiply an (N×K) matrix M by Σ^{-1} using the Woodbury identity, where Σ = F F^T + diag(D).
You may pass either C_inv (the explicit inverse of C = I + F^T D^{-1} F) or C_chol (its Cholesky factor). If neither is provided, the small k×k inverse is computed internally for convenience.
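For reference, a minimal sketch of the Woodbury application this routine performs when no cached pieces are passed; illustrative only, not the library code.
import numpy as np

def apply_siginv_sketch(M, F, D):
    # M Sigma^{-1} = M D^{-1} - (M D^{-1} F) C^{-1} (F^T D^{-1}),  C = I + F^T D^{-1} F
    Dinv = 1.0 / D
    MDinv = M * Dinv[None, :]                           # (N, K)
    C = np.eye(F.shape[1]) + (F * Dinv[:, None]).T @ F  # small k×k system
    T = np.linalg.solve(C, (MDinv @ F).T).T             # (N, k) = M D^{-1} F C^{-1}
    return MDinv - T @ (F * Dinv[:, None]).T            # (N, K); Sigma never formed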
- alsgls.ops.cg_solve(operator_mv, b, x0=None, maxit=500, tol=1e-07, M_pre=None)[source]¶
Conjugate-gradient solver for a symmetric positive-definite (SPD) operator A, applied matrix-free.
- Parameters:
operator_mv (callable) – Function computing the matrix-vector product A @ v.
b (ndarray) – Right-hand side.
x0 (ndarray, optional) – Initial guess.
maxit (int) – Maximum number of iterations.
tol (float) – Relative residual tolerance.
M_pre (optional) – Preconditioner applied to the residual.
- Returns:
x (ndarray) – Approximate solution.
info (dict) – Iterations and final residual norm.
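Usage against the documented signature; M_pre is omitted here because its expected form (callable vs. diagonal array) is not specified on this page.
>>> import numpy as np
>>> from alsgls.ops import cg_solve
>>> rng = np.random.default_rng(0)
>>> A = rng.standard_normal((50, 50)); A = A @ A.T + 50 * np.eye(50)  # SPD by construction
>>> b = rng.standard_normal(50)
>>> x, info = cg_solve(lambda v: A @ v, b, tol=1e-10)  # only the matvec closure is passed
>>> resid = np.linalg.norm(A @ x - b)   # small for this well-conditioned system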
- alsgls.ops.siginv_diag(F: ndarray, Dinv: ndarray, C_chol: ndarray) → ndarray[source]¶
Compute the diagonal of Σ^{-1} = D^{-1} - D^{-1} F C^{-1} F^T D^{-1} efficiently given Dinv and the Cholesky factor of C.
- Returns:
diag_Sinv – The diagonal entries of Σ^{-1}.
- Return type:
(K,) array
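A minimal sketch of the same computation, assuming C = I + F^T D^{-1} F and that C_chol is its lower-triangular factor; SciPy's triangular solve is used purely for illustration.
import numpy as np
from scipy.linalg import solve_triangular

def siginv_diag_sketch(F, Dinv, C_chol):
    # diag(Sigma^{-1})_i = Dinv_i - Dinv_i^2 * ||L^{-1} F[i]||^2, with C = L L^T
    G = solve_triangular(C_chol, F.T, lower=True)   # (k, K): one triangular solve
    return Dinv - Dinv ** 2 * np.sum(G ** 2, axis=0)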
alsgls.api module¶
High-level estimator APIs for ALS-based GLS fitting.
- class alsgls.api.ALSGLS(*, rank: int | str | None = 'auto', lam_F: float = 0.001, lam_B: float = 0.001, max_sweeps: int = 12, rel_tol: float = 1e-06, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, scale_correct: bool = True, scale_floor: float = 1e-08)[source]¶
Bases: object
Scikit-learn style estimator for low-rank GLS via ALS.
- class alsgls.api.ALSGLSSystem(system: Mapping[Any, tuple[Any, Any]] | Sequence[tuple[Any, tuple[Any, Any]]], *, rank: int | str | None = 'auto', lam_F: float = 0.001, lam_B: float = 0.001, max_sweeps: int = 12, rel_tol: float = 1e-06, d_floor: float = 1e-08, cg_maxit: int = 800, cg_tol: float = 3e-07, scale_correct: bool = True, scale_floor: float = 1e-08)[source]¶
Bases: object
Statsmodels-style system container for ALS GLS fitting.
- class alsgls.api.ALSGLSSystemResults(model: ALSGLSSystem, estimator: ALSGLS)[source]¶
Bases: object
Lightweight results container mimicking statsmodels outputs.
alsgls.sim module¶
- alsgls.sim.simulate_gls(N_tr, N_te, p_list, k, seed=0)[source]¶
Simulate a generalized least squares (GLS) dataset.
This variant allows each response equation to have its own number of features, as specified by p_list.
- Parameters:
N_tr (int) – Number of training samples.
N_te (int) – Number of test samples.
p_list (list of int) – Number of features for each response equation; K = len(p_list).
k (int) – Latent factor dimension controlling correlated noise.
seed (int, optional) – Seed for the NumPy random number generator. Defaults to 0.
- Returns:
X_tr (list of ndarray) – Feature matrices for the training set. X_tr[j] has shape (N_tr, p_list[j]).
Y_tr (ndarray) – Training responses of shape (N_tr, K) where K = len(p_list).
X_te (list of ndarray) – Feature matrices for the test set. X_te[j] has shape (N_te, p_list[j]).
Y_te (ndarray) – Test responses of shape (N_te, K).
Notes
Randomness is controlled via numpy.random.default_rng(seed); pass a different seed for different simulations.
Examples
>>> p_list = [3, 5, 2]
>>> Xtr, Ytr, Xte, Yte = simulate_gls(100, 20, p_list, k=2, seed=0)
- alsgls.sim.simulate_sur(N_tr, N_te, K, p, k, seed=0)[source]¶
Simulate a Seemingly Unrelated Regression (SUR) dataset.
- Parameters:
N_tr (int) – Number of training samples.
N_te (int) – Number of test samples.
K (int) – Number of response equations.
p (int) – Number of features per equation.
k (int) – Latent factor dimension controlling correlated noise.
seed (int, optional) – Seed for the NumPy random number generator. Defaults to 0.
- Returns:
X_tr (list of ndarray) – Feature matrices for the training set. Each element has shape (N_tr, p).
Y_tr (ndarray) – Training responses of shape (N_tr, K).
X_te (list of ndarray) – Feature matrices for the test set. Each element has shape (N_te, p).
Y_te (ndarray) – Test responses of shape (N_te, K).
Notes
Randomness is controlled via numpy.random.default_rng(seed); pass a different seed for different simulations.
Examples
>>> Xtr, Ytr, Xte, Yte = simulate_sur(100, 20, K=3, p=5, k=2, seed=42)