adelie.adelie_core.state.StateGaussianNaive32#
- class adelie.adelie_core.state.StateGaussianNaive32#
- Core state class for Gaussian, naive method.
- Methods
- __init__(*args, **kwargs): Overloaded function.
- solve(self, arg0, arg1): Solves the state-specific problem.
 - __init__(*args, **kwargs)#
- Overloaded function.
- 1. __init__(
      self: adelie.adelie_core.state.StateGaussianNaive32,
      X: adelie.adelie_core.matrix.MatrixNaiveBase32,
      X_means: numpy.ndarray[numpy.float32[1, n]],
      y_mean: float,
      y_var: float,
      resid: numpy.ndarray[numpy.float32[1, n]],
      resid_sum: float,
      constraints: adelie.adelie_core.constraint.VectorConstraintBase32,
      groups: numpy.ndarray[numpy.int64[1, n]],
      group_sizes: numpy.ndarray[numpy.int64[1, n]],
      dual_groups: numpy.ndarray[numpy.int64[1, n]],
      alpha: float,
      penalty: numpy.ndarray[numpy.float32[1, n]],
      weights: numpy.ndarray[numpy.float32[1, n]],
      lmda_path: numpy.ndarray[numpy.float32[1, n]],
      lmda_max: float,
      min_ratio: float,
      lmda_path_size: int,
      max_screen_size: int,
      max_active_size: int,
      pivot_subset_ratio: float,
      pivot_subset_min: int,
      pivot_slack_ratio: float,
      screen_rule: str,
      max_iters: int,
      tol: float,
      adev_tol: float,
      ddev_tol: float,
      newton_tol: float,
      newton_max_iters: int,
      early_exit: bool,
      setup_lmda_max: bool,
      setup_lmda_path: bool,
      intercept: bool,
      n_threads: int,
      screen_set: numpy.ndarray[numpy.int64[1, n]],
      screen_beta: numpy.ndarray[numpy.float32[1, n]],
      screen_is_active: numpy.ndarray[bool[1, n]],
      active_set_size: int,
      active_set: numpy.ndarray[numpy.int64[1, n]],
      rsq: float,
      lmda: float,
      grad: numpy.ndarray[numpy.float32[1, n]]
  ) -> None
- 2. __init__(self: adelie.adelie_core.state.StateGaussianNaive32, arg0: adelie.adelie_core.state.StateGaussianNaive32) -> None
 
 - solve(self: adelie.adelie_core.state.StateGaussianNaive32, arg0: bool, arg1: Callable[[adelie.adelie_core.state.StateGaussianNaive32], bool]) -> dict#
- Solves the state-specific problem. 
 - X#
- Feature matrix. 
 - X_means#
- Column means of `X` (weighted by \(W\)).
 - abs_grad#
- The \(\ell_2\) norms of (corrected) `grad` across each group. `abs_grad[i]` is given by `np.linalg.norm(grad[g:g+gs] - lmda * penalty[i] * (1-alpha) * beta[g:g+gs] - correction)` where `g = groups[i]`, `gs = group_sizes[i]`, `beta` is the full solution vector represented by `screen_beta`, and `correction` is the output from calling `constraints[i].gradient()`.
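The formula for `abs_grad[i]` can be sketched in plain NumPy. This is only an illustration of the expression above, not the library's implementation: the helper name `abs_grad_sketch` and the `corrections` argument are hypothetical, and the correction is assumed to be zero for groups without a constraint.

```python
import numpy as np

def abs_grad_sketch(grad, beta, groups, group_sizes, penalty, alpha, lmda,
                    corrections=None):
    """Group-wise l2 norms of the corrected gradient (illustrative only)."""
    out = np.empty(len(groups))
    for i, (g, gs) in enumerate(zip(groups, group_sizes)):
        # Assumption: no constraint on group i means a zero correction.
        correction = 0.0 if corrections is None else corrections[i]
        out[i] = np.linalg.norm(
            grad[g:g + gs]
            - lmda * penalty[i] * (1 - alpha) * beta[g:g + gs]
            - correction
        )
    return out

# Toy inputs: two groups of sizes 2 and 1, all coefficients at zero.
grad = np.array([3.0, 4.0, 1.0])
beta = np.zeros(3)
groups = np.array([0, 2])
group_sizes = np.array([2, 1])
penalty = np.ones(2)

abs_grad = abs_grad_sketch(grad, beta, groups, group_sizes, penalty,
                           alpha=1.0, lmda=0.5)
```

With `beta` at zero the ridge term vanishes, so each entry reduces to the plain norm of that group's gradient block.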
 - active_set#
- List of indices into `screen_set` that correspond to active groups. `screen_set[active_set[i]]` is the `i`-th active group. An active group is one with a non-zero coefficient block; that is, for every `i`-th active group, `screen_beta[b:b+p]` is not identically zero, where `j = active_set[i]`, `k = screen_set[j]`, `b = screen_begins[j]`, and `p = group_sizes[k]`.
 - active_set_size#
- Number of active groups. `active_set[i]` is only well-defined for `i` in the range `[0, active_set_size)`.
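The chain of indices from `active_set` through `screen_set` and `screen_begins` into `screen_beta` can be traced with a small NumPy sketch. All values below are made up for illustration, and the `active_blocks` dictionary is a hypothetical helper, not part of the state object.

```python
import numpy as np

group_sizes = np.array([2, 1, 3])      # sizes of all groups
screen_set = np.array([2, 0])          # screen groups, as indices into groups
screen_begins = np.array([0, 3])       # offsets of each screen group in screen_beta
screen_beta = np.array([0.0, 0.0, 0.0, 1.5, -0.5])
active_set = np.array([1, 0])          # buffer; only the first active_set_size entries are valid
active_set_size = 1

# Collect the coefficient block of each active group, keyed by group index.
active_blocks = {}
for i in range(active_set_size):
    j = active_set[i]        # position within the screen set
    k = screen_set[j]        # index into groups
    b = screen_begins[j]     # start of the block in screen_beta
    p = group_sizes[k]       # length of the block
    active_blocks[int(k)] = screen_beta[b:b + p]
```

Only entries `active_set[:active_set_size]` are read, which mirrors the note that indices beyond `active_set_size` are not well-defined.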
 - active_sizes#
- Active set size for every saved solution. 
 - adev_tol#
- Percent deviance explained tolerance. 
 - alpha#
- Elastic net parameter. 
 - benchmark_fit_active#
- Fit time on the active set for each iteration. 
 - benchmark_fit_screen#
- Fit time on the screen set for each iteration. 
 - benchmark_invariance#
- Invariance time for each iteration. 
 - benchmark_kkt#
- KKT time for each iteration. 
 - benchmark_screen#
- Screen time for each iteration. 
 - betas#
- `betas[i]` is the solution at `lmdas[i]`.
 - constraint_buffer_size#
- Max constraint buffer size. Equivalent to `np.max([0 if c is None else c.buffer_size() for c in constraints])`.
 - constraints#
- List of constraints for each group. `constraints[i]` is the constraint object corresponding to group `i`.
 - ddev_tol#
- Difference in percent deviance explained tolerance. 
 - devs#
- `devs[i]` is the (normalized) \(R^2\) at `betas[i]`.
 - dual_groups#
- List of starting indices to each dual group where G is the number of groups. `dual_groups[i]` is the starting index of the `i`-th dual group.
 - duals#
- `duals[i]` is the dual at `lmdas[i]`.
 - early_exit#
- `True` if the function should exit early based on training percent deviance explained.
 - grad#
- The full gradient \(-X^\top \nabla \ell(\eta)\). 
 - group_sizes#
- List of group sizes corresponding to each element in `groups`. `group_sizes[i]` is the group size of the `i`-th group.
 - groups#
- List of starting indices to each group where G is the number of groups. `groups[i]` is the starting index of the `i`-th group.
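The relationship between `groups` and `group_sizes` can be sketched as follows. This assumes the groups are contiguous, non-overlapping, and together cover every column, which this section does not state explicitly.

```python
import numpy as np

# Three groups covering 6 columns: sizes 2, 1, and 3.
group_sizes = np.array([2, 1, 3])

# Under the contiguity assumption, each group starts where the previous one ends.
groups = np.concatenate([[0], np.cumsum(group_sizes)[:-1]])

# Column slice belonging to each group.
slices = [slice(int(g), int(g + gs)) for g, gs in zip(groups, group_sizes)]
```

Here `groups` comes out as `[0, 2, 3]`, so the last group occupies columns 3 through 5.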
 - intercept#
- `True` if the function should fit with intercept.
 - intercepts#
- `intercepts[i]` is the intercept at `lmdas[i]`.
 - lmda#
- The last regularization parameter that was attempted to be solved. 
 - lmda_max#
- The smallest \(\lambda\) such that the true solution is zero for all coefficients that have a non-vanishing group lasso penalty (\(\ell_2\)-norm). 
 - lmda_path#
- The regularization path to solve for. 
 - lmda_path_size#
- Number of regularizations in the path if it is to be generated. 
 - lmdas#
- `lmdas[i]` is the regularization \(\lambda\) used for the `i`-th solution.
 - loss_full#
- Full loss \(-\frac{1}{2} \|y\|_W^2\). 
 - loss_null#
- Null loss \(-\frac{1}{2} \overline{y}^2\) where \(\overline{y}\) is given by `y_mean`.
 - max_active_size#
- Maximum number of active groups allowed. 
 - max_iters#
- Maximum number of coordinate descents. 
 - max_screen_size#
- Maximum number of screen groups allowed. 
 - min_ratio#
- The ratio of the smallest to the largest \(\lambda\) in the regularization sequence if it is to be generated.
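How `lmda_max`, `min_ratio`, and `lmda_path_size` could combine into a generated path is sketched below. It assumes a log-spaced (geometric) grid that decreases from `lmda_max` to `min_ratio * lmda_max`, a common convention for regularization paths; the library's exact generation rule is not given in this section, and the numeric values are made up.

```python
import numpy as np

lmda_max = 2.0        # illustrative value
min_ratio = 1e-2      # smallest lambda divided by largest lambda
lmda_path_size = 5

# Assumption: geometric spacing between lmda_max and min_ratio * lmda_max.
exponents = np.arange(lmda_path_size) / (lmda_path_size - 1)
lmda_path = lmda_max * min_ratio ** exponents
```

Consecutive entries of `lmda_path` then differ by a constant factor, and the last entry equals `min_ratio * lmda_max`.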
 - n_threads#
- Number of threads. 
 - n_valid_solutions#
- Number of valid solutions for each iteration. 
 - newton_max_iters#
- Maximum number of iterations for the BCD update. 
 - newton_tol#
- Convergence tolerance for the BCD update. 
 - penalty#
- Penalty factor for each group in the same order as `groups`.
 - pivot_slack_ratio#
- If screening takes place, then `pivot_slack_ratio` number of groups with next smallest (new) active scores below the pivot point are also added to the screen set as slack.
 - pivot_subset_min#
- If screening takes place, then at least `pivot_subset_min` number of active scores are used to determine the pivot point.
 - pivot_subset_ratio#
- If screening takes place, then the `(1 + pivot_subset_ratio) * s` largest active scores are used to determine the pivot point where `s` is the current screen set size.
 - resid#
- Residual \(y_c - X \beta\) where \(\beta\) is given by `screen_beta`.
 - resid_sum#
- Weighted (by \(W\)) sum of `resid`.
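The `resid` and `resid_sum` quantities can be sketched with NumPy as below. The sketch assumes observation weights that sum to one and a response centered by its weighted mean (the intercept case); the data are random placeholders, and `beta` here stands for the full coefficient vector rather than the packed `screen_beta`.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
weights = np.full(n, 1.0 / n)      # assumption: weights sum to one
beta = np.array([0.5, 0.0, -1.0])

y_c = y - weights @ y              # response centered by its weighted mean
resid = y_c - X @ beta             # residual y_c - X beta
resid_sum = weights @ resid        # weighted sum of the residual
```

Because the weighted sum of `y_c` is zero under these assumptions, `resid_sum` equals the negated weighted sum of `X @ beta`.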
 - rsq#
- The change in unnormalized \(R^2\) given by \(\|y_c-X_c\beta_{\mathrm{old}}\|_{W}^2 - \|y_c-X_c\beta_{\mathrm{curr}}\|_{W}^2\). 
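The difference of weighted squared norms above can be evaluated directly on a toy problem. The helper `weighted_sq_norm` and all data are hypothetical; the inputs are chosen so that `X_c` and `y_c` are already centered under the weights.

```python
import numpy as np

def weighted_sq_norm(v, w):
    """Squared weighted norm ||v||_W^2 = sum_i w_i * v_i**2."""
    return float(w @ (v ** 2))

weights = np.array([0.5, 0.5])
X_c = np.array([[1.0], [-1.0]])    # column-centered under the weights
y_c = np.array([1.0, -1.0])        # centered under the weights
beta_old = np.array([0.0])
beta_curr = np.array([1.0])

# Change in unnormalized R^2 when moving from beta_old to beta_curr.
rsq_change = (
    weighted_sq_norm(y_c - X_c @ beta_old, weights)
    - weighted_sq_norm(y_c - X_c @ beta_curr, weights)
)
```

In this example `beta_curr` fits `y_c` exactly, so the change equals the full weighted squared norm of `y_c`.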
 - screen_X_means#
- Column means of \(X\) for screen groups (weighted by \(W\)). 
 - screen_begins#
- List of indices that index a corresponding list of values for each screen group. `screen_begins[i]` is the starting index corresponding to the `i`-th screen group. From this index, reading `group_sizes[screen_set[i]]` number of elements will grab values corresponding to the full `i`-th screen group block.
 - screen_beta#
- Coefficient vector on the screen set. `screen_beta[b:b+p]` is the coefficient for the `i`-th screen group where `k = screen_set[i]`, `b = screen_begins[i]`, and `p = group_sizes[k]`.
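The packed layout of `screen_beta` can be illustrated by scattering it back into a full-length coefficient vector. This is a sketch under the assumption that `screen_begins` is the running sum of the screen group sizes; the values and the `beta` array are made up for illustration.

```python
import numpy as np

groups = np.array([0, 2, 3])           # starting column of each group
group_sizes = np.array([2, 1, 3])      # size of each group
screen_set = np.array([2, 0])          # screen groups, in screen order

# Assumption: screen blocks are packed back to back in screen order.
screen_group_sizes = group_sizes[screen_set]
screen_begins = np.concatenate([[0], np.cumsum(screen_group_sizes)[:-1]])

screen_beta = np.array([0.1, 0.2, 0.3, 1.5, -0.5])

# Scatter each screen block into its columns of the full coefficient vector.
beta = np.zeros(int(group_sizes.sum()))
for i, k in enumerate(screen_set):
    b, p, g = screen_begins[i], group_sizes[k], groups[k]
    beta[g:g + p] = screen_beta[b:b + p]
```

Groups that are not in the screen set (group 1 here) keep zero coefficients in `beta`.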
 - screen_hashset#
- Hashset containing the same values as `screen_set`.
 - screen_is_active#
- Boolean vector that indicates whether each screen group in `groups` is active or not. `screen_is_active[i]` is `True` if and only if `screen_set[i]` is active.
 - screen_rule#
- Strong rule type. 
 - screen_set#
- List of indices into `groups` that correspond to the screen groups. `screen_set[i]` is the `i`-th screen group.
 - screen_sizes#
- Strong set size for every saved solution. 
 - screen_transforms#
- List of \(V_k\) where \(V_k\) is from the SVD of \(\sqrt{W} X_{c,k}\) along the screen groups \(k\) and for possibly column-centered (weighted by \(W\)) \(X_k\). It only needs to be properly initialized for groups with size > 1. `screen_transforms[i]` is \(V_k\) for the `i`-th screen group where `k = screen_set[i]`.
 - screen_vars#
- List of \(D_k^2\) where \(D_k\) is from the SVD of \(\sqrt{W} X_{c,k}\) along the screen groups \(k\) and for possibly column-centered (weighted by \(W\)) \(X_k\). `screen_vars[b:b+p]` is \(D_k^2\) for the `i`-th screen group where `k = screen_set[i]`, `b = screen_begins[i]`, and `p = group_sizes[k]`.
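For a single group, \(V_k\) and \(D_k^2\) can be obtained from a thin SVD as sketched below. The data are random placeholders, the weights are assumed to sum to one, and the column-centering step corresponds to the intercept case; this shows the linear algebra only, not how the library stores or computes these quantities.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3
X_k = rng.normal(size=(n, p))          # columns of one group (illustrative)
weights = np.full(n, 1.0 / n)          # assumption: weights sum to one

# Column-center with the weighted means, then scale rows by sqrt(W).
X_ck = X_k - weights @ X_k
A = np.sqrt(weights)[:, None] * X_ck

# Thin SVD: A = U diag(D) V^T.
_, D, Vt = np.linalg.svd(A, full_matrices=False)
V_k = Vt.T                             # candidate screen_transforms entry
D_sq = D ** 2                          # candidate screen_vars block

# Weighted Gram matrix X_ck^T W X_ck, which V_k diagonalizes.
gram = X_ck.T @ (weights[:, None] * X_ck)
```

The identity \(X_{c,k}^\top W X_{c,k} = V_k D_k^2 V_k^\top\) is what makes these two quantities sufficient for working with a group's weighted Gram matrix.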
 - setup_lmda_max#
- `True` if the function should set up \(\lambda_\max\).
 - setup_lmda_path#
- `True` if the function should set up the regularization path.
 - tol#
- Coordinate descent convergence tolerance. 
 - weights#
- Observation weights \(W\). 
 - y_mean#
- Mean of the response vector \(y\) (weighted by \(W\)), i.e. \(\mathbf{1}^\top W y\). 
 - y_var#
- Variance of the response vector \(y\) (weighted by \(W\)), i.e. \(\|y_c\|_{W}^2\).
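The weighted mean and variance of the response can be computed as in the following sketch, which assumes the observation weights sum to one. The data are made up for illustration.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0])
weights = np.full(4, 0.25)             # assumption: weights sum to one

y_mean = float(weights @ y)            # 1^T W y
y_c = y - y_mean                       # centered response
y_var = float(weights @ y_c ** 2)      # ||y_c||_W^2
```

With uniform weights these reduce to the ordinary sample mean and the population (divide-by-n) variance.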