adelie.adelie_core.state.StateGlmNaive64#
- class adelie.adelie_core.state.StateGlmNaive64#
Core state class for GLM, naive method.
Methods
__init__(*args, **kwargs): Overloaded function.
solve(self, arg0, arg1, arg2): Solves the state-specific problem.
Attributes
X: Feature matrix.
abs_grad: The \(\ell_2\) norms of (corrected) grad across each group.
active_set: List of indices into screen_set that correspond to active groups.
active_set_size: Number of active groups.
active_sizes: Active set size for every saved solution.
adev_tol: Percent deviance explained tolerance.
alpha: Elastic net parameter.
benchmark_fit_active: Fit time on the active set for each iteration.
benchmark_fit_screen: Fit time on the screen set for each iteration.
benchmark_invariance: Invariance time for each iteration.
benchmark_kkt: KKT time for each iteration.
benchmark_screen: Screen time for each iteration.
beta0: The current intercept value.
betas: betas[i] is the solution at lmdas[i].
constraint_buffer_size: Max constraint buffer size.
constraints: List of constraints for each group.
ddev_tol: Difference in percent deviance explained tolerance.
devs: devs[i] is the (normalized) \(R^2\) at betas[i].
dual_groups: List of starting indices to each dual group where G is the number of groups.
duals: duals[i] is the dual at lmdas[i].
early_exit: True if the function should early exit based on training percent deviance explained.
eta: The natural parameter \(\eta = X\beta + \beta_0 \mathbf{1} + \eta^0\) where \(\beta\) and \(\beta_0\) are given by screen_beta and beta0.
grad: The full gradient \(-X^\top \nabla \ell(\eta)\).
group_sizes: List of group sizes corresponding to each element in groups.
groups: List of starting indices to each group where G is the number of groups.
intercept: True if the function should fit with intercept.
intercepts: intercepts[i] is the intercept at lmdas[i].
irls_max_iters: Maximum number of IRLS iterations.
irls_tol: IRLS convergence tolerance.
lmda: The last regularization parameter that was attempted to be solved.
lmda_max: The smallest \(\lambda\) such that the true solution is zero for all coefficients that have a non-vanishing group lasso penalty (\(\ell_2\)-norm).
lmda_path: The regularization path to solve for.
lmda_path_size: Number of regularizations in the path if it is to be generated.
lmdas: lmdas[i] is the regularization \(\lambda\) used for the ith solution.
loss_full: Full loss \(\ell(\eta^\star)\) where \(\eta^\star\) is the minimizer.
loss_null: Null loss \(\ell(\beta_0^\star \mathbf{1} + \eta^0)\) from fitting an intercept-only model (if intercept is True) and otherwise \(\ell(\eta^0)\).
max_active_size: Maximum number of active groups allowed.
max_iters: Maximum number of coordinate descents.
max_screen_size: Maximum number of screen groups allowed.
min_ratio: The ratio between the largest and smallest \(\lambda\) in the regularization sequence if it is to be generated.
n_threads: Number of threads.
n_valid_solutions: Number of valid solutions for each iteration.
newton_max_iters: Maximum number of iterations for the BCD update.
newton_tol: Convergence tolerance for the BCD update.
offsets: Observation offsets \(\eta^0\).
penalty: Penalty factor for each group in the same order as groups.
pivot_slack_ratio: If screening takes place, then pivot_slack_ratio number of groups with the next smallest (new) active scores below the pivot point are also added to the screen set as slack.
pivot_subset_min: If screening takes place, then at least pivot_subset_min number of active scores are used to determine the pivot point.
pivot_subset_ratio: If screening takes place, then the (1 + pivot_subset_ratio) * s largest active scores are used to determine the pivot point, where s is the current screen set size.
resid: Residual \(-\nabla \ell(\eta)\) where \(\eta\) is given by eta.
screen_begins: List of indices that index a corresponding list of values for each screen group.
screen_beta: Coefficient vector on the screen set.
screen_hashset: Hashmap containing the same values as screen_set.
screen_is_active: Boolean vector that indicates whether each screen group in groups is active or not.
screen_rule: Strong rule type.
screen_set: List of indices into groups that correspond to the screen groups.
screen_sizes: Strong set size for every saved solution.
setup_lmda_max: True if the function should setup \(\lambda_\max\).
setup_lmda_path: True if the function should setup the regularization path.
setup_loss_null: True if the function should setup loss_null.
tol: Coordinate descent convergence tolerance.
- __init__(*args, **kwargs)#
Overloaded function.
__init__(self: adelie.adelie_core.state.StateGlmNaive64, X: adelie.adelie_core.matrix.MatrixNaiveBase64, eta: numpy.ndarray[numpy.float64[1, n]], resid: numpy.ndarray[numpy.float64[1, n]], constraints: adelie.adelie_core.constraint.VectorConstraintBase64, groups: numpy.ndarray[numpy.int64[1, n]], group_sizes: numpy.ndarray[numpy.int64[1, n]], dual_groups: numpy.ndarray[numpy.int64[1, n]], alpha: float, penalty: numpy.ndarray[numpy.float64[1, n]], offsets: numpy.ndarray[numpy.float64[1, n]], lmda_path: numpy.ndarray[numpy.float64[1, n]], loss_null: float, loss_full: float, lmda_max: float, min_ratio: float, lmda_path_size: int, max_screen_size: int, max_active_size: int, pivot_subset_ratio: float, pivot_subset_min: int, pivot_slack_ratio: float, screen_rule: str, irls_max_iters: int, irls_tol: float, max_iters: int, tol: float, adev_tol: float, ddev_tol: float, newton_tol: float, newton_max_iters: int, early_exit: bool, setup_loss_null: bool, setup_lmda_max: bool, setup_lmda_path: bool, intercept: bool, n_threads: int, screen_set: numpy.ndarray[numpy.int64[1, n]], screen_beta: numpy.ndarray[numpy.float64[1, n]], screen_is_active: numpy.ndarray[bool[1, n]], active_set_size: int, active_set: numpy.ndarray[numpy.int64[1, n]], beta0: float, lmda: float, grad: numpy.ndarray[numpy.float64[1, n]]) -> None
__init__(self: adelie.adelie_core.state.StateGlmNaive64, arg0: adelie.adelie_core.state.StateGlmNaive64) -> None
- solve(self: adelie.adelie_core.state.StateGlmNaive64, arg0: adelie.adelie_core.glm.GlmBase64, arg1: bool, arg2: Callable[[adelie.adelie_core.state.StateGlmNaive64], bool]) dict#
Solves the state-specific problem.
- X#
Feature matrix.
- abs_grad#
The \(\ell_2\) norms of (corrected) grad across each group. abs_grad[i] is given by np.linalg.norm(grad[g:g+gs] - lmda * penalty[i] * (1-alpha) * beta[g:g+gs] - correction) where g = groups[i], gs = group_sizes[i], beta is the full solution vector represented by screen_beta, and correction is the output from calling constraints[i].gradient().
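The norm expression above can be sketched in plain Python. This is a minimal illustration of the formula, not adelie's implementation: the group layout and values are made up, and the constraint correction is taken to be zero (unconstrained groups).

```python
import math

# Hypothetical two-group layout: group 0 has size 2, group 1 has size 3.
groups = [0, 2]
group_sizes = [2, 3]
grad = [0.5, -0.3, 1.0, 0.2, -0.4]
beta = [0.1, 0.0, 0.0, 0.0, 0.0]   # full solution vector (screen_beta scattered out)
penalty = [1.0, 1.0]
lmda, alpha = 0.5, 0.9

abs_grad = []
for i in range(len(groups)):
    g, gs = groups[i], group_sizes[i]
    # constraints[i].gradient() would supply a correction here; zero if unconstrained.
    corrected = [
        grad[g + j] - lmda * penalty[i] * (1 - alpha) * beta[g + j]
        for j in range(gs)
    ]
    # ell_2 norm of the corrected gradient block for group i
    abs_grad.append(math.sqrt(sum(v * v for v in corrected)))
```

For group 1 the solution block is zero, so abs_grad[1] is simply the norm of grad[2:5].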
- active_set#
List of indices into screen_set that correspond to active groups. screen_set[active_set[i]] is the ith active group. An active group is one with a non-zero coefficient block, that is, for every ith active group, screen_beta[b:b+p] != 0 where j = active_set[i], k = screen_set[j], b = screen_begins[j], and p = group_sizes[k].
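The double indirection above (active_set indexes screen_set, which indexes groups) can be illustrated with hypothetical values:

```python
# Hypothetical screening state: groups 0, 3, and 7 were screened,
# and only the first and last of those screened groups are active.
screen_set = [0, 3, 7]   # indices into groups
active_set = [0, 2]      # indices into screen_set

# The i-th active group, as described above:
active_groups = [screen_set[j] for j in active_set]
# → [0, 7]
```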
- active_set_size#
Number of active groups.
active_set[i] is only well-defined for i in the range [0, active_set_size).
- active_sizes#
Active set size for every saved solution.
- adev_tol#
Percent deviance explained tolerance.
- alpha#
Elastic net parameter.
- benchmark_fit_active#
Fit time on the active set for each iteration.
- benchmark_fit_screen#
Fit time on the screen set for each iteration.
- benchmark_invariance#
Invariance time for each iteration.
- benchmark_kkt#
KKT time for each iteration.
- benchmark_screen#
Screen time for each iteration.
- beta0#
The current intercept value.
- betas#
betas[i] is the solution at lmdas[i].
- constraint_buffer_size#
Max constraint buffer size. Equivalent to np.max([0 if c is None else c.buffer_size() for c in constraints]).
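The equivalence above can be checked with a stand-in constraint object (a hypothetical class for illustration only, exposing just buffer_size() like the real constraint objects):

```python
class FakeConstraint:
    """Stand-in constraint exposing only buffer_size()."""
    def __init__(self, size):
        self._size = size

    def buffer_size(self):
        return self._size

# None entries denote unconstrained groups and contribute 0.
constraints = [None, FakeConstraint(16), None, FakeConstraint(8)]
constraint_buffer_size = max(
    0 if c is None else c.buffer_size() for c in constraints
)
# → 16
```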
- constraints#
List of constraints for each group.
constraints[i] is the constraint object corresponding to group i.
- ddev_tol#
Difference in percent deviance explained tolerance.
- devs#
devs[i] is the (normalized) \(R^2\) at betas[i].
- dual_groups#
List of starting indices to each dual group where G is the number of groups.
dual_groups[i] is the starting index of the ith dual group.
- duals#
duals[i] is the dual at lmdas[i].
- early_exit#
True if the function should early exit based on training percent deviance explained.
- eta#
The natural parameter \(\eta = X\beta + \beta_0 \mathbf{1} + \eta^0\) where \(\beta\) and \(\beta_0\) are given by screen_beta and beta0.
- grad#
The full gradient \(-X^\top \nabla \ell(\eta)\).
- group_sizes#
List of group sizes corresponding to each element in
groups. group_sizes[i] is the group size of the ith group.
- groups#
List of starting indices to each group where G is the number of groups.
groups[i] is the starting index of the ith group.
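Together, groups and group_sizes partition the coefficient vector into contiguous blocks. A small sketch with hypothetical values:

```python
# Hypothetical layout: 5 features split into groups of sizes 2, 1, 2.
groups = [0, 2, 3]        # starting index of each group
group_sizes = [2, 1, 2]   # size of each group
beta = [0.5, -0.2, 0.0, 1.1, 0.3]

# Slice out each group's coefficient block.
blocks = [beta[g:g + gs] for g, gs in zip(groups, group_sizes)]
# → [[0.5, -0.2], [0.0], [1.1, 0.3]]
```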
- intercept#
True if the function should fit with intercept.
- intercepts#
intercepts[i] is the intercept at lmdas[i].
- irls_max_iters#
Maximum number of IRLS iterations.
- irls_tol#
IRLS convergence tolerance.
- lmda#
The last regularization parameter that was attempted to be solved.
- lmda_max#
The smallest \(\lambda\) such that the true solution is zero for all coefficients that have a non-vanishing group lasso penalty (\(\ell_2\)-norm).
- lmda_path#
The regularization path to solve for.
- lmda_path_size#
Number of regularizations in the path if it is to be generated.
- lmdas#
lmdas[i] is the regularization \(\lambda\) used for the ith solution.
- loss_full#
Full loss \(\ell(\eta^\star)\) where \(\eta^\star\) is the minimizer.
- loss_null#
Null loss \(\ell(\beta_0^\star \mathbf{1} + \eta^0)\) from fitting an intercept-only model (if intercept is True) and otherwise \(\ell(\eta^0)\).
- max_active_size#
Maximum number of active groups allowed.
- max_iters#
Maximum number of coordinate descents.
- max_screen_size#
Maximum number of screen groups allowed.
- min_ratio#
The ratio between the largest and smallest \(\lambda\) in the regularization sequence if it is to be generated.
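When the path is generated, a common construction is a log-spaced sequence running from lmda_max down to min_ratio * lmda_max over lmda_path_size points. Whether adelie uses exactly this spacing is an assumption; the sketch below only illustrates the role of min_ratio:

```python
import math

def lmda_sequence(lmda_max, min_ratio, path_size):
    """Log-spaced sequence from lmda_max down to min_ratio * lmda_max."""
    if path_size == 1:
        return [lmda_max]
    step = math.log(min_ratio) / (path_size - 1)
    return [lmda_max * math.exp(step * i) for i in range(path_size)]

path = lmda_sequence(1.0, 0.01, 5)
# path[0] == lmda_max, path[-1] == min_ratio * lmda_max, strictly decreasing
```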
- n_threads#
Number of threads.
- n_valid_solutions#
Number of valid solutions for each iteration.
- newton_max_iters#
Maximum number of iterations for the BCD update.
- newton_tol#
Convergence tolerance for the BCD update.
- offsets#
Observation offsets \(\eta^0\).
- penalty#
Penalty factor for each group in the same order as
groups.
- pivot_slack_ratio#
If screening takes place, then pivot_slack_ratio number of groups with the next smallest (new) active scores below the pivot point are also added to the screen set as slack.
- pivot_subset_min#
If screening takes place, then at least pivot_subset_min number of active scores are used to determine the pivot point.
- pivot_subset_ratio#
If screening takes place, then the (1 + pivot_subset_ratio) * s largest active scores are used to determine the pivot point, where s is the current screen set size.
- resid#
Residual \(-\nabla \ell(\eta)\) where \(\eta\) is given by eta.
- screen_begins#
List of indices that index a corresponding list of values for each screen group. screen_begins[i] is the starting index corresponding to the ith screen group. From this index, reading group_sizes[screen_set[i]] number of elements will grab values corresponding to the full ith screen group block.
- screen_beta#
Coefficient vector on the screen set.
screen_beta[b:b+p] is the coefficient for the ith screen group where k = screen_set[i], b = screen_begins[i], and p = group_sizes[k].
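The indexing above can be sketched with hypothetical values (this mirrors the stated relation between screen_set, screen_begins, and group_sizes; it is an illustration, not adelie's code):

```python
# Hypothetical state: groups 1 and 3 (sizes 2 and 3) were screened.
group_sizes = [1, 2, 1, 3]
screen_set = [1, 3]                     # indices into groups
screen_begins = [0, 2]                  # cumulative sizes of screened groups
screen_beta = [0.4, -0.1, 0.0, 0.7, 0.2]

def screen_block(i):
    """Coefficient block of the i-th screen group, per the indexing above."""
    k = screen_set[i]
    b = screen_begins[i]
    p = group_sizes[k]
    return screen_beta[b:b + p]

# screen_block(0) → [0.4, -0.1]; screen_block(1) → [0.0, 0.7, 0.2]
```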
- screen_hashset#
Hashmap containing the same values as screen_set.
- screen_is_active#
Boolean vector that indicates whether each screen group in groups is active or not. screen_is_active[i] is True if and only if screen_set[i] is active.
- screen_rule#
Strong rule type.
- screen_set#
List of indices into groups that correspond to the screen groups. screen_set[i] is the ith screen group.
- screen_sizes#
Strong set size for every saved solution.
- setup_lmda_max#
True if the function should setup \(\lambda_\max\).
- setup_lmda_path#
True if the function should setup the regularization path.
- setup_loss_null#
True if the function should setup loss_null.
- tol#
Coordinate descent convergence tolerance.