adelie.adelie_core.state.StateMultiGaussianNaive32#

class adelie.adelie_core.state.StateMultiGaussianNaive32#

Core state class for MultiGaussian, naive method.

Methods

__init__(*args, **kwargs)

Overloaded function.

Attributes

X

Feature matrix.

X_means

Column means of X (weighted by \(W\)).

abs_grad

The \(\ell_2\) norms of (corrected) grad across each group.

active_set

List of indices into screen_set that correspond to active groups.

active_set_size

Number of active groups.

active_sizes

Active set size for every saved solution.

adev_tol

Percent deviance explained tolerance.

alpha

Elastic net parameter.

benchmark_fit_active

Fit time on the active set for each iteration.

benchmark_fit_screen

Fit time on the screen set for each iteration.

benchmark_invariance

Invariance time for each iteration.

benchmark_kkt

KKT time for each iteration.

benchmark_screen

Screen time for each iteration.

betas

betas[i] is the solution at lmdas[i].

constraint_buffer_size

Max constraint buffer size.

constraints

List of constraints for each group.

ddev_tol

Difference in percent deviance explained tolerance.

devs

devs[i] is the (normalized) \(R^2\) at betas[i].

dual_groups

List of starting indices of each dual group. The list has length G, where G is the number of groups.

duals

duals[i] is the dual at lmdas[i].

early_exit

True if the function should early exit based on training percent deviance explained.

grad

The full gradient \(-X^\top \nabla \ell(\eta)\).

group_sizes

List of group sizes corresponding to each element in groups.

group_type

Multi-response group type.

groups

List of starting indices of each group. The list has length G, where G is the number of groups.

intercept

True if the function should fit with intercept.

intercepts

intercepts[i] is the intercept at lmdas[i] for each class.

lmda

The last regularization parameter that was attempted to be solved.

lmda_max

The smallest \(\lambda\) such that the true solution is zero for all coefficients that have a non-vanishing group lasso penalty (\(\ell_2\)-norm).

lmda_path

The regularization path to solve for.

lmda_path_size

Number of regularizations in the path if it is to be generated.

lmdas

lmdas[i] is the regularization \(\lambda\) used for the i th solution.

loss_full

Full loss \(-\frac{1}{2} \|y\|_W^2\).

loss_null

Null loss \(-\frac{1}{2} \overline{y}^2\) where \(\overline{y}\) is given by y_mean.

max_active_size

Maximum number of active groups allowed.

max_iters

Maximum number of coordinate descent iterations.

max_screen_size

Maximum number of screen groups allowed.

min_ratio

The ratio of the smallest to the largest \(\lambda\) in the regularization sequence if it is to be generated.

multi_intercept

True if an intercept is added for each response.

n_classes

Number of classes.

n_threads

Number of threads.

n_valid_solutions

Number of valid solutions for each iteration.

newton_max_iters

Maximum number of iterations for the BCD update.

newton_tol

Convergence tolerance for the BCD update.

penalty

Penalty factor for each group in the same order as groups.

pivot_slack_ratio

If screening takes place, then pivot_slack_ratio number of groups with next smallest (new) active scores below the pivot point are also added to the screen set as slack.

pivot_subset_min

If screening takes place, then at least pivot_subset_min number of active scores are used to determine the pivot point.

pivot_subset_ratio

If screening takes place, then the (1 + pivot_subset_ratio) * s largest active scores are used to determine the pivot point where s is the current screen set size.

resid

Residual \(y_c - X \beta\) where \(\beta\) is given by screen_beta.

resid_sum

Weighted (by \(W\)) sum of resid.

rsq

The change in unnormalized \(R^2\) given by \(\|y_c-X_c\beta_{\mathrm{old}}\|_{W}^2 - \|y_c-X_c\beta_{\mathrm{curr}}\|_{W}^2\).

screen_X_means

Column means of \(X\) for screen groups (weighted by \(W\)).

screen_begins

List of indices that index a corresponding list of values for each screen group.

screen_beta

Coefficient vector on the screen set.

screen_hashset

Hash set containing the same values as screen_set.

screen_is_active

Boolean vector that indicates whether each screen group in groups is active or not.

screen_rule

Strong rule type.

screen_set

List of indices into groups that correspond to the screen groups.

screen_sizes

Screen set size for every saved solution.

screen_transforms

List of \(V_k\) where \(V_k\) is from the SVD of \(\sqrt{W} X_{c,k}\) along the screen groups \(k\) and for possibly column-centered (weighted by \(W\)) \(X_k\).

screen_vars

List of \(D_k^2\) where \(D_k\) is from the SVD of \(\sqrt{W} X_{c,k}\) along the screen groups \(k\) and for possibly column-centered (weighted by \(W\)) \(X_k\).

setup_lmda_max

True if the function should set up \(\lambda_\max\).

setup_lmda_path

True if the function should set up the regularization path.

tol

Coordinate descent convergence tolerance.

weights

Observation weights \(W\).

y_mean

Mean of the response vector \(y\) (weighted by \(W\)), i.e. \(\mathbf{1}^\top W y\).

y_var

Variance of the response vector \(y\) (weighted by \(W\)), i.e. \(\|y_c\|_{W}^2\).

__init__(*args, **kwargs)#

Overloaded function.

  1. __init__(self: adelie.adelie_core.state.StateMultiGaussianNaive32, group_type: str, n_classes: int, multi_intercept: bool, X: adelie.adelie_core.matrix.MatrixNaiveBase32, X_means: numpy.ndarray[numpy.float32[1, n]], y_mean: float, y_var: float, resid: numpy.ndarray[numpy.float32[1, n]], resid_sum: float, constraints: adelie.adelie_core.constraint.VectorConstraintBase32, groups: numpy.ndarray[numpy.int64[1, n]], group_sizes: numpy.ndarray[numpy.int64[1, n]], dual_groups: numpy.ndarray[numpy.int64[1, n]], alpha: float, penalty: numpy.ndarray[numpy.float32[1, n]], weights: numpy.ndarray[numpy.float32[1, n]], lmda_path: numpy.ndarray[numpy.float32[1, n]], lmda_max: float, min_ratio: float, lmda_path_size: int, max_screen_size: int, max_active_size: int, pivot_subset_ratio: float, pivot_subset_min: int, pivot_slack_ratio: float, screen_rule: str, max_iters: int, tol: float, adev_tol: float, ddev_tol: float, newton_tol: float, newton_max_iters: int, early_exit: bool, setup_lmda_max: bool, setup_lmda_path: bool, intercept: bool, n_threads: int, screen_set: numpy.ndarray[numpy.int64[1, n]], screen_beta: numpy.ndarray[numpy.float32[1, n]], screen_is_active: numpy.ndarray[bool[1, n]], active_set_size: int, active_set: numpy.ndarray[numpy.int64[1, n]], rsq: float, lmda: float, grad: numpy.ndarray[numpy.float32[1, n]]) -> None

  2. __init__(self: adelie.adelie_core.state.StateMultiGaussianNaive32, arg0: adelie.adelie_core.state.StateMultiGaussianNaive32) -> None

X#

Feature matrix.

X_means#

Column means of X (weighted by \(W\)).

abs_grad#

The \(\ell_2\) norms of (corrected) grad across each group. abs_grad[i] is given by np.linalg.norm(grad[g:g+gs] - lmda * penalty[i] * (1-alpha) * beta[g:g+gs] - correction) where g = groups[i], gs = group_sizes[i], beta is the full solution vector represented by screen_beta, and correction is the output from calling constraints[i].gradient().
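The formula for abs_grad[i] can be sketched in NumPy as follows. This is only an illustration on made-up arrays, not the library's implementation: it assumes that no group has a constraint (so correction is taken to be zero) and that beta is already available as a dense vector. The helper name abs_grad_sketch is hypothetical.

```python
import numpy as np

# Hypothetical toy inputs: 2 groups covering 3 coefficients.
groups = np.array([0, 1])          # starting index of each group
group_sizes = np.array([1, 2])     # size of each group
penalty = np.array([1.0, 2.0])     # penalty factor of each group
alpha, lmda = 0.5, 0.1
grad = np.array([3.0, 0.6, 0.8])
beta = np.zeros(3)                 # dense solution vector (all zero here)

def abs_grad_sketch(grad, beta, groups, group_sizes, penalty, alpha, lmda):
    """Group-wise l2 norm of the ridge-corrected gradient, without constraints."""
    out = np.empty(len(groups))
    for i, (g, gs) in enumerate(zip(groups, group_sizes)):
        correction = 0.0  # assumption: constraints[i] is None
        block = (
            grad[g:g + gs]
            - lmda * penalty[i] * (1 - alpha) * beta[g:g + gs]
            - correction
        )
        out[i] = np.linalg.norm(block)
    return out

abs_grad = abs_grad_sketch(grad, beta, groups, group_sizes, penalty, alpha, lmda)
```

With beta equal to zero the ridge term vanishes, so each entry reduces to the plain \(\ell_2\) norm of the corresponding block of grad.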

active_set#

List of indices into screen_set that correspond to active groups. screen_set[active_set[i]] is the i th active group. An active group is one with a non-zero coefficient block, that is, for every i th active group, screen_beta[b:b+p] != 0 where j = active_set[i], k = screen_set[j], b = screen_begins[j], and p = group_sizes[k].
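The chain of indices from active_set through screen_set to a coefficient block can be traced with plain Python lists. The values below are made up for illustration, and the helper active_blocks is hypothetical rather than part of the class.

```python
# Hypothetical toy state: 4 groups, 3 of them screened, 2 of those active.
group_sizes = [2, 1, 3, 1]
screen_set = [2, 0, 3]             # indices into groups
screen_begins = [0, 3, 5]          # start of each screen group's block
screen_beta = [0.5, 0.0, -1.2, 0.0, 0.0, 0.7]
active_set = [0, 2]                # indices into screen_set
active_set_size = 2

def active_blocks(active_set, active_set_size, screen_set, screen_begins,
                  group_sizes, screen_beta):
    """Map each active group index to its coefficient block."""
    blocks = {}
    for i in range(active_set_size):   # only [0, active_set_size) is valid
        j = active_set[i]              # position in screen_set
        k = screen_set[j]              # index into groups
        b, p = screen_begins[j], group_sizes[k]
        blocks[k] = screen_beta[b:b + p]
    return blocks

blocks = active_blocks(active_set, active_set_size, screen_set,
                       screen_begins, group_sizes, screen_beta)
```

In this toy state groups 2 and 3 are active, and each of their blocks contains at least one non-zero coefficient, while screened group 0 has an all-zero block and is not listed.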

active_set_size#

Number of active groups. active_set[i] is only well-defined for i in the range [0, active_set_size).

active_sizes#

Active set size for every saved solution.

adev_tol#

Percent deviance explained tolerance.

alpha#

Elastic net parameter.

benchmark_fit_active#

Fit time on the active set for each iteration.

benchmark_fit_screen#

Fit time on the screen set for each iteration.

benchmark_invariance#

Invariance time for each iteration.

benchmark_kkt#

KKT time for each iteration.

benchmark_screen#

Screen time for each iteration.

betas#

betas[i] is the solution at lmdas[i].

constraint_buffer_size#

Max constraint buffer size. Equivalent to np.max([0 if c is None else c.buffer_size() for c in constraints]).
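A small sketch of the expression above, using a hypothetical stand-in class ToyConstraint that exposes only a buffer_size() method. It uses the built-in max with default=0 so that an empty constraint list also yields 0; whether the library treats that case the same way is an assumption.

```python
class ToyConstraint:
    """Hypothetical stand-in that exposes only buffer_size()."""

    def __init__(self, size):
        self._size = size

    def buffer_size(self):
        return self._size

# One entry per group; None means the group is unconstrained.
constraints = [None, ToyConstraint(4), ToyConstraint(10), None]

constraint_buffer_size = max(
    (0 if c is None else c.buffer_size() for c in constraints),
    default=0,
)
```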

constraints#

List of constraints for each group. constraints[i] is the constraint object corresponding to group i.

ddev_tol#

Difference in percent deviance explained tolerance.

devs#

devs[i] is the (normalized) \(R^2\) at betas[i].

dual_groups#

List of starting indices of each dual group. The list has length G, where G is the number of groups. dual_groups[i] is the starting index of the i th dual group.

duals#

duals[i] is the dual at lmdas[i].

early_exit#

True if the function should early exit based on training percent deviance explained.

grad#

The full gradient \(-X^\top \nabla \ell(\eta)\).

group_sizes#

List of group sizes corresponding to each element in groups. group_sizes[i] is the group size of the i th group.

group_type#

Multi-response group type.

groups#

List of starting indices of each group. The list has length G, where G is the number of groups. groups[i] is the starting index of the i th group.
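Together, groups and group_sizes describe how a flat coefficient vector is cut into blocks. The sketch below uses made-up values and two hypothetical helpers; sizes_from_starts additionally assumes that the groups tile the coefficients contiguously with no gaps.

```python
groups = [0, 2, 3]        # starting index of each group
group_sizes = [2, 1, 3]   # number of coefficients in each group
values = [0.1, 0.2, 0.0, 1.0, 2.0, 3.0]   # any vector laid out by group

def group_block(values, groups, group_sizes, i):
    """Slice out the entries of `values` that belong to the i-th group."""
    g, gs = groups[i], group_sizes[i]
    return values[g:g + gs]

def sizes_from_starts(groups, n_total):
    """Recover group sizes from start indices (assumes contiguous groups)."""
    ends = groups[1:] + [n_total]
    return [e - g for g, e in zip(groups, ends)]

last_block = group_block(values, groups, group_sizes, 2)
derived_sizes = sizes_from_starts(groups, len(values))
```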

intercept#

True if the function should fit with intercept.

intercepts#

intercepts[i] is the intercept at lmdas[i] for each class.

lmda#

The last regularization parameter that was attempted to be solved.

lmda_max#

The smallest \(\lambda\) such that the true solution is zero for all coefficients that have a non-vanishing group lasso penalty (\(\ell_2\)-norm).
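As a rough illustration of this definition, the textbook KKT argument for the group elastic net says that a penalized group stays at zero when the \(\ell_2\) norm of its gradient block is at most \(\lambda \alpha\) times its penalty factor. The sketch below applies that argument to made-up numbers. It assumes the group norms were computed with all penalized coefficients at zero, and it simply skips groups whose lasso penalty vanishes; the library may treat edge cases such as alpha = 0 differently. The helper lmda_max_sketch is hypothetical.

```python
def lmda_max_sketch(abs_grad, penalty, alpha):
    """Smallest lambda that keeps every lasso-penalized group at zero."""
    ratios = [
        a / (alpha * pf)
        for a, pf in zip(abs_grad, penalty)
        if alpha * pf > 0          # ignore groups with vanishing lasso penalty
    ]
    return max(ratios, default=0.0)

# Hypothetical group gradient norms at the all-zero (penalized) solution.
abs_grad = [3.0, 1.0, 5.0]
penalty = [1.0, 2.0, 0.0]          # last group is unpenalized
lmda_max = lmda_max_sketch(abs_grad, penalty, alpha=0.5)
```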

lmda_path#

The regularization path to solve for.

lmda_path_size#

Number of regularizations in the path if it is to be generated.

lmdas#

lmdas[i] is the regularization \(\lambda\) used for the i th solution.

loss_full#

Full loss \(-\frac{1}{2} \|y\|_W^2\).

loss_null#

Null loss \(-\frac{1}{2} \overline{y}^2\) where \(\overline{y}\) is given by y_mean.

max_active_size#

Maximum number of active groups allowed.

max_iters#

Maximum number of coordinate descent iterations.

max_screen_size#

Maximum number of screen groups allowed.

min_ratio#

The ratio of the smallest to the largest \(\lambda\) in the regularization sequence if it is to be generated.
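To show how lmda_max, min_ratio, and lmda_path_size could combine into a generated path, here is a minimal sketch. It assumes that min_ratio is the smallest \(\lambda\) divided by the largest (a value below 1) and that the path is evenly spaced on the log scale from lmda_max downwards; both are assumptions for illustration, and the helper lmda_path_sketch is hypothetical.

```python
import math

def lmda_path_sketch(lmda_max, min_ratio, lmda_path_size):
    """Decreasing, log-spaced path from lmda_max to lmda_max * min_ratio."""
    if lmda_path_size == 1:
        return [lmda_max]
    step = math.log(min_ratio) / (lmda_path_size - 1)
    return [lmda_max * math.exp(step * i) for i in range(lmda_path_size)]

path = lmda_path_sketch(lmda_max=10.0, min_ratio=1e-2, lmda_path_size=3)
```

For these inputs the path runs from 10 down to roughly 0.1, passing through roughly 1 at the midpoint.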

multi_intercept#

True if an intercept is added for each response.

n_classes#

Number of classes.

n_threads#

Number of threads.

n_valid_solutions#

Number of valid solutions for each iteration.

newton_max_iters#

Maximum number of iterations for the BCD update.

newton_tol#

Convergence tolerance for the BCD update.

penalty#

Penalty factor for each group in the same order as groups.

pivot_slack_ratio#

If screening takes place, then pivot_slack_ratio number of groups with next smallest (new) active scores below the pivot point are also added to the screen set as slack.

pivot_subset_min#

If screening takes place, then at least pivot_subset_min number of active scores are used to determine the pivot point.

pivot_subset_ratio#

If screening takes place, then the (1 + pivot_subset_ratio) * s largest active scores are used to determine the pivot point where s is the current screen set size.
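One way to read pivot_subset_ratio together with pivot_subset_min is as a count of how many of the largest active scores enter the pivot computation. The sketch below encodes that reading: take (1 + pivot_subset_ratio) * s scores, but never fewer than pivot_subset_min and never more than the number of groups. The rounding and the way the two parameters are combined are assumptions, and pivot_subset_size is a hypothetical helper.

```python
def pivot_subset_size(s, n_groups, pivot_subset_ratio, pivot_subset_min):
    """Number of largest active scores used to locate the pivot (assumed rule)."""
    wanted = int((1 + pivot_subset_ratio) * s)   # s = current screen set size
    return min(n_groups, max(pivot_subset_min, wanted))

typical = pivot_subset_size(s=10, n_groups=100, pivot_subset_ratio=0.5, pivot_subset_min=1)
floored = pivot_subset_size(s=2, n_groups=100, pivot_subset_ratio=0.5, pivot_subset_min=8)
capped = pivot_subset_size(s=90, n_groups=100, pivot_subset_ratio=0.5, pivot_subset_min=1)
```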

resid#

Residual \(y_c - X \beta\) where \(\beta\) is given by screen_beta.

resid_sum#

Weighted (by \(W\)) sum of resid.
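The relationship between resid and resid_sum can be checked on a tiny example. This sketch is a single-response simplification with made-up data held in plain lists; X, y_c, beta, and weights are hypothetical values, and the multi-response layout used by this class is not reproduced.

```python
# Hypothetical data: 3 observations, 2 features.
X = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
y_c = [1.0, -1.0, 0.0]             # centered response
beta = [0.5, -0.5]                 # dense coefficient vector
weights = [0.5, 0.25, 0.25]        # observation weights W (sum to 1)

# resid = y_c - X @ beta, computed row by row
resid = [
    yi - sum(xij * bj for xij, bj in zip(row, beta))
    for row, yi in zip(X, y_c)
]

# resid_sum = weighted (by W) sum of resid
resid_sum = sum(w * r for w, r in zip(weights, resid))
```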

rsq#

The change in unnormalized \(R^2\) given by \(\|y_c-X_c\beta_{\mathrm{old}}\|_{W}^2 - \|y_c-X_c\beta_{\mathrm{curr}}\|_{W}^2\).

screen_X_means#

Column means of \(X\) for screen groups (weighted by \(W\)).

screen_begins#

List of indices that index a corresponding list of values for each screen group. screen_begins[i] is the starting index corresponding to the i th screen group. From this index, reading group_sizes[screen_set[i]] number of elements will grab values corresponding to the full i th screen group block.

screen_beta#

Coefficient vector on the screen set. screen_beta[b:b+p] is the coefficient for the i th screen group where k = screen_set[i], b = screen_begins[i], and p = group_sizes[k].
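The layout of screen_beta can be reconstructed from screen_set and group_sizes. The sketch below assumes that the blocks are packed back to back in screen_set order, so that screen_begins is a running sum of the screened group sizes; that packing is an assumption, and the values and the helper screen_block are hypothetical.

```python
from itertools import accumulate

group_sizes = [2, 1, 3, 1]
screen_set = [2, 0, 3]             # indices into groups

# Sizes of the screened groups, in screen_set order.
sizes = [group_sizes[k] for k in screen_set]

# Assumed packing: each block starts where the previous one ended.
screen_begins = [0] + list(accumulate(sizes))[:-1]

screen_beta = [0.5, 0.0, -1.2, 0.0, 0.0, 0.7]

def screen_block(i):
    """Coefficients of the i-th screen group."""
    k = screen_set[i]
    b, p = screen_begins[i], group_sizes[k]
    return screen_beta[b:b + p]
```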

screen_hashset#

Hash set containing the same values as screen_set.

screen_is_active#

Boolean vector that indicates whether each screen group in groups is active or not. screen_is_active[i] is True if and only if screen_set[i] is active.
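Since screen_is_active flags positions in screen_set and active_set lists those same positions, the two should describe the same set. A minimal consistency check, on made-up values and with a hypothetical helper name, might look like this.

```python
def active_flags_consistent(screen_is_active, active_set, active_set_size):
    """True if the flagged positions equal the positions listed in active_set."""
    flagged = {i for i, is_active in enumerate(screen_is_active) if is_active}
    listed = set(active_set[:active_set_size])   # entries past the size are ignored
    return flagged == listed

screen_is_active = [True, False, True]
active_set = [2, 0, 0]             # last entry lies beyond active_set_size
active_set_size = 2

consistent = active_flags_consistent(screen_is_active, active_set, active_set_size)
```

The comparison uses sets because this sketch makes no assumption about the order in which active_set stores its entries.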

screen_rule#

Strong rule type.

screen_set#

List of indices into groups that correspond to the screen groups. screen_set[i] is the i th screen group.

screen_sizes#

Screen set size for every saved solution.

screen_transforms#

List of \(V_k\) where \(V_k\) is from the SVD of \(\sqrt{W} X_{c,k}\) along the screen groups \(k\) and for possibly column-centered (weighted by \(W\)) \(X_k\). It only needs to be properly initialized for groups with size > 1. screen_transforms[i] is \(V_k\) for the i th screen group where k = screen_set[i].

screen_vars#

List of \(D_k^2\) where \(D_k\) is from the SVD of \(\sqrt{W} X_{c,k}\) along the screen groups \(k\) and for possibly column-centered (weighted by \(W\)) \(X_k\). screen_vars[b:b+p] is \(D_k^2\) for the i th screen group where k = screen_set[i], b = screen_begins[i], and p = group_sizes[k].
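For a single group, the quantities stored in screen_transforms and screen_vars can be illustrated with NumPy's SVD. The sketch below uses random made-up data, assumes the weights sum to one, and applies column centering; it is not the library's code path. The final identity \(V_k D_k^2 V_k^\top = X_{c,k}^\top W X_{c,k}\) is what makes the pair useful.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2
X_k = rng.normal(size=(n, p))      # hypothetical columns of one screen group
w = np.full(n, 1.0 / n)            # observation weights W (assumed to sum to 1)

# Column-center X_k using the weighted means.
X_mean = w @ X_k
X_c = X_k - X_mean

# Thin SVD of sqrt(W) X_c; rows of Vt are the right singular vectors.
_, D, Vt = np.linalg.svd(np.sqrt(w)[:, None] * X_c, full_matrices=False)

V_k = Vt.T                         # what screen_transforms[i] would hold
D_sq = D ** 2                      # what screen_vars[b:b+p] would hold

# The weighted Gram matrix that the decomposition diagonalizes.
gram = X_c.T @ (w[:, None] * X_c)
```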

setup_lmda_max#

True if the function should set up \(\lambda_\max\).

setup_lmda_path#

True if the function should set up the regularization path.

tol#

Coordinate descent convergence tolerance.

weights#

Observation weights \(W\).

y_mean#

Mean of the response vector \(y\) (weighted by \(W\)), i.e. \(\mathbf{1}^\top W y\).

y_var#

Variance of the response vector \(y\) (weighted by \(W\)), i.e. \(\|y_c\|_{W}^2\).
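The definitions of y_mean and y_var can be worked through on a small example. This is a single-response simplification with made-up numbers, and it assumes the weights sum to one so that \(\mathbf{1}^\top W y\) is a proper weighted mean.

```python
weights = [0.25, 0.25, 0.25, 0.25]   # observation weights W (sum to 1)
y = [1.0, 2.0, 3.0, 6.0]

# y_mean = 1^T W y
y_mean = sum(w * yi for w, yi in zip(weights, y))

# y_c is the response centered by its weighted mean
y_c = [yi - y_mean for yi in y]

# y_var = ||y_c||_W^2 = sum_i w_i * y_c[i]^2
y_var = sum(w * yc ** 2 for w, yc in zip(weights, y_c))
```

Here y_mean is 3.0 and y_var is 3.5.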