adelie.adelie_core.state.StateMultiGaussianNaive32#

class adelie.adelie_core.state.StateMultiGaussianNaive32#

Core state class for MultiGaussian, naive method.

Methods

`__init__`(*args, **kwargs)	Overloaded function.
`solve`(self, arg0, arg1)	Solves the state-specific problem.

Attributes

`X`	Feature matrix.
`X_means`	Column means of `X` (weighted by $W$ ).
`abs_grad`	The $ℓ_{2}$ norms of (corrected) `grad` across each group.
`active_set`	List of indices into `screen_set` that correspond to active groups.
`active_set_size`	Number of active groups.
`active_sizes`	Active set size for every saved solution.
`adev_tol`	Percent deviance explained tolerance.
`alpha`	Elastic net parameter.
`benchmark_fit_active`	Fit time on the active set for each iteration.
`benchmark_fit_screen`	Fit time on the screen set for each iteration.
`benchmark_invariance`	Invariance time for each iteration.
`benchmark_kkt`	KKT time for each iteration.
`benchmark_screen`	Screen time for each iteration.
`betas`	`betas[i]` is the solution at `lmdas[i]`.
`constraint_buffer_size`	Max constraint buffer size.
`constraints`	List of constraints for each group.
`ddev_tol`	Difference in percent deviance explained tolerance.
`devs`	`devs[i]` is the (normalized) $R^{2}$ at `betas[i]`.
`dual_groups`	List of G starting indices, one for each dual group, where G is the number of groups.
`duals`	`duals[i]` is the dual at `lmdas[i]`.
`early_exit`	`True` if the function should early exit based on training percent deviance explained.
`grad`	The full gradient $- X^{⊤} \nabla ℓ (η)$ .
`group_sizes`	List of group sizes corresponding to each element in `groups`.
`groups`	List of G starting indices, one for each group, where G is the number of groups.
`intercept`	`True` if the function should fit with intercept.
`intercepts`	`intercepts[i]` is the intercept at `lmdas[i]` for each class.
`lmda`	The last regularization parameter that was attempted to be solved.
`lmda_max`	The smallest $λ$ such that the true solution is zero for all coefficients that have a non-vanishing group lasso penalty ( $ℓ_{2}$ -norm).
`lmda_path`	The regularization path to solve for.
`lmda_path_size`	Number of regularizations in the path if it is to be generated.
`lmdas`	`lmdas[i]` is the regularization $λ$ used for the `i` th solution.
`loss_full`	Full loss $- \frac{1}{2} ‖ y ‖_{W}^{2}$ .
`loss_null`	Null loss $- \frac{1}{2} \bar{y}^{2}$ where $\bar{y}$ is given by `y_mean`.
`max_active_size`	Maximum number of active groups allowed.
`max_iters`	Maximum number of coordinate descent iterations.
`max_screen_size`	Maximum number of screen groups allowed.
`min_ratio`	The ratio between the largest and smallest $λ$ in the regularization sequence if it is to be generated.
`multi_intercept`	`True` if an intercept is added for each response.
`n_classes`	Number of classes.
`n_threads`	Number of threads.
`n_valid_solutions`	Number of valid solutions for each iteration.
`newton_max_iters`	Maximum number of iterations for the BCD update.
`newton_tol`	Convergence tolerance for the BCD update.
`penalty`	Penalty factor for each group in the same order as `groups`.
`pivot_slack_ratio`	If screening takes place, then `pivot_slack_ratio` number of groups with next smallest (new) active scores below the pivot point are also added to the screen set as slack.
`pivot_subset_min`	If screening takes place, then at least `pivot_subset_min` number of active scores are used to determine the pivot point.
`pivot_subset_ratio`	If screening takes place, then the `(1 + pivot_subset_ratio) * s` largest active scores are used to determine the pivot point where `s` is the current screen set size.
`resid`	Residual $y_{c} - X β$ where $β$ is given by `screen_beta`.
`resid_sum`	Weighted (by $W$ ) sum of `resid`.
`rsq`	The change in unnormalized $R^{2}$ given by $‖ y_{c} - X_{c} β_{old} ‖_{W}^{2} - ‖ y_{c} - X_{c} β_{curr} ‖_{W}^{2}$ .
`screen_X_means`	Column means of $X$ for screen groups (weighted by $W$ ).
`screen_begins`	List of indices that index a corresponding list of values for each screen group.
`screen_beta`	Coefficient vector on the screen set.
`screen_hashset`	Hashmap containing the same values as `screen_set`.
`screen_is_active`	Boolean vector that indicates whether each screen group in `groups` is active or not.
`screen_rule`	Strong rule type.
`screen_set`	List of indices into `groups` that correspond to the screen groups.
`screen_sizes`	Strong set size for every saved solution.
`screen_transforms`	List of $V_{k}$ where $V_{k}$ is from the SVD of $\sqrt{W} X_{c, k}$ along the screen groups $k$ and for possibly column-centered (weighted by $W$ ) $X_{k}$ .
`screen_vars`	List of $D_{k}^{2}$ where $D_{k}$ is from the SVD of $\sqrt{W} X_{c, k}$ along the screen groups $k$ and for possibly column-centered (weighted by $W$ ) $X_{k}$ .
`setup_lmda_max`	`True` if the function should setup $λ_{max}$ .
`setup_lmda_path`	`True` if the function should setup the regularization path.
`tol`	Coordinate descent convergence tolerance.
`weights`	Observation weights $W$ .
`y_mean`	Mean of the response vector $y$ (weighted by $W$ ), i.e. $1^{⊤} W y$ .
`y_var`	Variance of the response vector $y$ (weighted by $W$ ), i.e. $‖ y_{c} ‖_{W}^{2}$ .

__init__(*args, **kwargs)#

Overloaded function.

__init__(self: adelie.adelie_core.state.StateMultiGaussianNaive32, n_classes: int, multi_intercept: bool, X: adelie.adelie_core.matrix.MatrixNaiveBase32, X_means: numpy.ndarray[numpy.float32[1, n]], y_mean: float, y_var: float, resid: numpy.ndarray[numpy.float32[1, n]], resid_sum: float, constraints: adelie.adelie_core.constraint.VectorConstraintBase32, groups: numpy.ndarray[numpy.int64[1, n]], group_sizes: numpy.ndarray[numpy.int64[1, n]], dual_groups: numpy.ndarray[numpy.int64[1, n]], alpha: float, penalty: numpy.ndarray[numpy.float32[1, n]], weights: numpy.ndarray[numpy.float32[1, n]], lmda_path: numpy.ndarray[numpy.float32[1, n]], lmda_max: float, min_ratio: float, lmda_path_size: int, max_screen_size: int, max_active_size: int, pivot_subset_ratio: float, pivot_subset_min: int, pivot_slack_ratio: float, screen_rule: str, max_iters: int, tol: float, adev_tol: float, ddev_tol: float, newton_tol: float, newton_max_iters: int, early_exit: bool, setup_lmda_max: bool, setup_lmda_path: bool, intercept: bool, n_threads: int, screen_set: numpy.ndarray[numpy.int64[1, n]], screen_beta: numpy.ndarray[numpy.float32[1, n]], screen_is_active: numpy.ndarray[bool[1, n]], active_set_size: int, active_set: numpy.ndarray[numpy.int64[1, n]], rsq: float, lmda: float, grad: numpy.ndarray[numpy.float32[1, n]]) -> None
__init__(self: adelie.adelie_core.state.StateMultiGaussianNaive32, arg0: adelie.adelie_core.state.StateMultiGaussianNaive32) -> None

solve(self: adelie.adelie_core.state.StateMultiGaussianNaive32, arg0: bool, arg1: Callable[[adelie.adelie_core.state.StateMultiGaussianNaive32], bool]) → dict#: Solves the state-specific problem.

X#: Feature matrix.

X_means#: Column means of X (weighted by $W$ ).

abs_grad#: The $ℓ_{2}$ norms of (corrected) grad across each group. abs_grad[i] is given by np.linalg.norm(grad[g:g+gs] - lmda * penalty[i] * (1-alpha) * beta[g:g+gs] - correction) where g = groups[i], gs = group_sizes[i], beta is the full solution vector represented by screen_beta, and correction is the output from calling constraints[i].gradient().
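The formula for abs_grad can be sketched in plain NumPy as follows. This is only an illustration of the expression above, not the library's implementation: the helper name abs_grad_sketch and the corrections argument are hypothetical, and a missing correction is assumed to mean that the group is unconstrained.

```python
import numpy as np

def abs_grad_sketch(grad, beta, groups, group_sizes, penalty, lmda, alpha,
                    corrections=None):
    """Group-wise l2 norms of the corrected gradient (illustrative only)."""
    out = np.empty(len(groups))
    for i, (g, gs) in enumerate(zip(groups, group_sizes)):
        # Assumption: no correction term when no constraint output is supplied.
        corr = 0.0 if corrections is None else corrections[i]
        block = (
            grad[g:g + gs]
            - lmda * penalty[i] * (1 - alpha) * beta[g:g + gs]
            - corr
        )
        out[i] = np.linalg.norm(block)
    return out

# Two groups: columns [0, 2) and [2, 3).
grad = np.array([3.0, 4.0, 1.0])
beta = np.zeros(3)
groups = np.array([0, 2])
group_sizes = np.array([2, 1])
penalty = np.ones(2)

result = abs_grad_sketch(grad, beta, groups, group_sizes, penalty,
                         lmda=0.5, alpha=1.0)
```

With alpha = 1 the ridge term vanishes, so each entry of result is simply the norm of the corresponding gradient block.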

active_set#: List of indices into screen_set that correspond to active groups. screen_set[active_set[i]] is the i th active group. An active group is one with a non-zero coefficient block, that is, for every i th active group, screen_beta[b:b+p] != 0 (the block is not identically zero) where j = active_set[i], k = screen_set[j], b = screen_begins[j], and p = group_sizes[k].
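The chain of indices from active_set to a coefficient block can be traced with a small example. The arrays below are made-up toy values, not output from the solver; only the indexing pattern follows the description above.

```python
import numpy as np

# Toy layout: three groups starting at columns 0, 2, and 5.
group_sizes = np.array([2, 3, 1])
screen_set = np.array([2, 0])             # groups 2 and 0 are screened
screen_begins = np.array([0, 1])          # offsets into screen_beta
screen_beta = np.array([0.0, 0.7, -0.2])  # group 2 block, then group 0 block
active_set = np.array([1])                # index into screen_set
active_set_size = 1

active_blocks = {}
for i in range(active_set_size):
    j = active_set[i]       # position in the screen set
    k = screen_set[j]       # group index
    b = screen_begins[j]    # start of the block in screen_beta
    p = group_sizes[k]      # block length
    active_blocks[int(k)] = screen_beta[b:b + p]
```

Here the only active group is group 0, whose block is screen_beta[1:3].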

active_set_size#: Number of active groups. active_set[i] is only well-defined for i in the range [0, active_set_size).

active_sizes#: Active set size for every saved solution.

adev_tol#: Percent deviance explained tolerance.

alpha#: Elastic net parameter.

benchmark_fit_active#: Fit time on the active set for each iteration.

benchmark_fit_screen#: Fit time on the screen set for each iteration.

benchmark_invariance#: Invariance time for each iteration.

benchmark_kkt#: KKT time for each iteration.

benchmark_screen#: Screen time for each iteration.

betas#: betas[i] is the solution at lmdas[i].

constraint_buffer_size#: Max constraint buffer size. Equivalent to np.max([0 if c is None else c.buffer_size() for c in constraints]).

constraints#: List of constraints for each group. constraints[i] is the constraint object corresponding to group i.

ddev_tol#: Difference in percent deviance explained tolerance.

devs#: devs[i] is the (normalized) $R^{2}$ at betas[i].

dual_groups#: List of G starting indices, one for each dual group, where G is the number of groups. dual_groups[i] is the starting index of the i th dual group.

duals#: duals[i] is the dual at lmdas[i].

early_exit#: True if the function should early exit based on training percent deviance explained.

grad#: The full gradient $- X^{⊤} \nabla ℓ (η)$ .

group_sizes#: List of group sizes corresponding to each element in groups. group_sizes[i] is the group size of the i th group.

groups#: List of G starting indices, one for each group, where G is the number of groups. groups[i] is the starting index of the i th group.
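As an illustration of how groups and group_sizes relate, the sketch below derives the sizes from the starting indices. It assumes that the groups are sorted and partition the p columns contiguously; the value of p and the arrays are made up for the example.

```python
import numpy as np

p = 6                            # total number of columns (assumed)
groups = np.array([0, 2, 5])     # starting column of each group

# Under the contiguity assumption, each size is the gap to the next start.
group_sizes = np.diff(np.append(groups, p))

# Columns belonging to group 1.
columns_of_group_1 = np.arange(groups[1], groups[1] + group_sizes[1])
```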

intercept#: True if the function should fit with intercept.

intercepts#: intercepts[i] is the intercept at lmdas[i] for each class.

lmda#: The last regularization parameter that was attempted to be solved.

lmda_max#: The smallest $λ$ such that the true solution is zero for all coefficients that have a non-vanishing group lasso penalty ( $ℓ_{2}$ -norm).

lmda_path#: The regularization path to solve for.

lmda_path_size#: Number of regularizations in the path if it is to be generated.

lmdas#: lmdas[i] is the regularization $λ$ used for the i th solution.

loss_full#: Full loss $- \frac{1}{2} ‖ y ‖_{W}^{2}$ .

loss_null#: Null loss $- \frac{1}{2} \bar{y}^{2}$ where $\bar{y}$ is given by y_mean.

max_active_size#: Maximum number of active groups allowed.

max_iters#: Maximum number of coordinate descent iterations.

max_screen_size#: Maximum number of screen groups allowed.

min_ratio#: The ratio between the largest and smallest $λ$ in the regularization sequence if it is to be generated.
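One common way to generate such a sequence is to space the values evenly on the log scale. The sketch below makes two assumptions that this page does not confirm: that min_ratio is the smallest value divided by the largest, and that the path is log-spaced and decreasing. The helper name lmda_path_sketch is hypothetical.

```python
import numpy as np

def lmda_path_sketch(lmda_max, min_ratio, lmda_path_size):
    """Decreasing, log-spaced path from lmda_max down to min_ratio * lmda_max."""
    return np.geomspace(lmda_max, lmda_max * min_ratio, lmda_path_size)

path = lmda_path_sketch(lmda_max=2.0, min_ratio=1e-2, lmda_path_size=5)
```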

multi_intercept#: True if an intercept is added for each response.

n_classes#: Number of classes.

n_threads#: Number of threads.

n_valid_solutions#: Number of valid solutions for each iteration.

newton_max_iters#: Maximum number of iterations for the BCD update.

newton_tol#: Convergence tolerance for the BCD update.

penalty#: Penalty factor for each group in the same order as groups.

pivot_slack_ratio#: If screening takes place, then pivot_slack_ratio number of groups with next smallest (new) active scores below the pivot point are also added to the screen set as slack.

pivot_subset_min#: If screening takes place, then at least pivot_subset_min number of active scores are used to determine the pivot point.

pivot_subset_ratio#: If screening takes place, then the (1 + pivot_subset_ratio) * s largest active scores are used to determine the pivot point where s is the current screen set size.

resid#: Residual $y_{c} - X β$ where $β$ is given by screen_beta.

resid_sum#: Weighted (by $W$ ) sum of resid.

rsq#: The change in unnormalized $R^{2}$ given by $‖ y_{c} - X_{c} β_{old} ‖_{W}^{2} - ‖ y_{c} - X_{c} β_{curr} ‖_{W}^{2}$ .
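The expression for rsq can be evaluated directly on a toy problem. The data below are made up, and X_c and y_c are assumed to be already centered; the helper weighted_sq_norm is a hypothetical name for the squared $W$-weighted norm.

```python
import numpy as np

def weighted_sq_norm(v, w):
    """Squared norm of v weighted by the diagonal weights w."""
    return float(w @ (v * v))

X_c = np.array([[1.0], [-1.0]])
y_c = np.array([1.0, -1.0])
w = np.array([0.5, 0.5])

beta_old = np.array([0.0])
beta_curr = np.array([1.0])

rsq = (
    weighted_sq_norm(y_c - X_c @ beta_old, w)
    - weighted_sq_norm(y_c - X_c @ beta_curr, w)
)
```

Because beta_curr fits y_c exactly in this example, rsq equals the full weighted sum of squares of y_c.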

screen_X_means#: Column means of $X$ for screen groups (weighted by $W$ ).

screen_begins#: List of indices that index a corresponding list of values for each screen group. screen_begins[i] is the starting index corresponding to the i th screen group. From this index, reading group_sizes[screen_set[i]] number of elements will grab values corresponding to the full i th screen group block.
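If the screen group blocks are packed back to back in screen-set order (an assumption, not something stated on this page), screen_begins is the running sum of the screened group sizes shifted by one position. A minimal sketch with toy arrays:

```python
import numpy as np

group_sizes = np.array([2, 3, 1])
screen_set = np.array([2, 0])

# Sizes of the screened groups, in screen-set order.
sizes = group_sizes[screen_set]

# Assumed packing: each block starts where the previous one ends.
screen_begins = np.concatenate([[0], np.cumsum(sizes)[:-1]])
```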

screen_beta#: Coefficient vector on the screen set. screen_beta[b:b+p] is the coefficient for the i th screen group where k = screen_set[i], b = screen_begins[i], and p = group_sizes[k].
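To see how screen_beta maps back to a full-length coefficient vector, the following sketch scatters each screen block into its group's columns. The helper name expand_screen_beta and the toy arrays are illustrative only; groups outside the screen set are assumed to have zero coefficients.

```python
import numpy as np

def expand_screen_beta(p, groups, group_sizes, screen_set, screen_begins,
                       screen_beta):
    """Scatter the screen-set coefficients into a dense vector of length p."""
    beta = np.zeros(p)
    for i, k in enumerate(screen_set):
        b = screen_begins[i]     # start of the block in screen_beta
        size = group_sizes[k]    # block length
        g = groups[k]            # start of the group in the full vector
        beta[g:g + size] = screen_beta[b:b + size]
    return beta

groups = np.array([0, 2, 5])
group_sizes = np.array([2, 3, 1])
screen_set = np.array([2, 0])
screen_begins = np.array([0, 1])
screen_beta = np.array([0.5, 0.7, -0.2])

beta = expand_screen_beta(6, groups, group_sizes, screen_set, screen_begins,
                          screen_beta)
```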

screen_hashset#: Hashmap containing the same values as screen_set.

screen_is_active#: Boolean vector that indicates whether each screen group in groups is active or not. screen_is_active[i] is True if and only if screen_set[i] is active.

screen_rule#: Strong rule type.

screen_set#: List of indices into groups that correspond to the screen groups. screen_set[i] is the i th screen group.

screen_sizes#: Strong set size for every saved solution.

screen_transforms#: List of $V_{k}$ where $V_{k}$ is from the SVD of $\sqrt{W} X_{c, k}$ along the screen groups $k$ and for possibly column-centered (weighted by $W$ ) $X_{k}$ . It only needs to be properly initialized for groups with size > 1. screen_transforms[i] is $V_{k}$ for the i th screen group where k = screen_set[i].

screen_vars#: List of $D_{k}^{2}$ where $D_{k}$ is from the SVD of $\sqrt{W} X_{c, k}$ along the screen groups $k$ and for possibly column-centered (weighted by $W$ ) $X_{k}$ . screen_vars[b:b+p] is $D_{k}^{2}$ for the i th screen group where k = screen_set[i], b = screen_begins[i], and p = group_sizes[k].
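The quantities in screen_transforms and screen_vars can be reproduced for a single group with NumPy's SVD. This is a sketch on random data, assuming weights that sum to one and a column-centered block; it is not how the library necessarily computes or stores them.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p_k = 6, 2
X_k = rng.normal(size=(n, p_k))   # columns of one screen group (toy data)
w = np.full(n, 1.0 / n)           # observation weights, assumed to sum to 1

# Weighted column-centering, then scale the rows by sqrt(W).
X_ck = X_k - w @ X_k
A = np.sqrt(w)[:, None] * X_ck

# Thin SVD: A = U diag(D) V^T.
U, D, Vt = np.linalg.svd(A, full_matrices=False)
V_k = Vt.T        # what screen_transforms would hold for this group
D_sq = D ** 2     # what screen_vars would hold for this group
```

In this construction V_k diagonalizes $X_{c,k}^{⊤} W X_{c,k}$, with D_sq as the eigenvalues.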

setup_lmda_max#: True if the function should setup $λ_{max}$ .

setup_lmda_path#: True if the function should setup the regularization path.

tol#: Coordinate descent convergence tolerance.

weights#: Observation weights $W$ .

y_mean#: Mean of the response vector $y$ (weighted by $W$ ), i.e. $1^{⊤} W y$ .

y_var#: Variance of the response vector $y$ (weighted by $W$ ), i.e. $‖ y_{c} ‖_{W}^{2}$ .
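The two weighted summaries y_mean and y_var can be checked by hand on a small vector. The sketch assumes a single response vector, weights that sum to one, and that $y_{c}$ is $y$ minus its weighted mean.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])
w = np.full(3, 1.0 / 3.0)     # weights, assumed to sum to 1

y_mean = float(w @ y)         # 1^T W y
y_c = y - y_mean              # centered response
y_var = float(w @ y_c ** 2)   # squared W-weighted norm of y_c
```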