adelie.solver.grpnet
- adelie.solver.grpnet(X: ndarray | MatrixNaiveBase32 | MatrixNaiveBase64, glm: GlmBase32 | GlmBase64 | GlmMultiBase32 | GlmMultiBase64, *, constraints: list[ConstraintBase32 | ConstraintBase64] | None = None, groups: ndarray | None = None, alpha: float = 1, penalty: ndarray | None = None, offsets: ndarray | None = None, lmda_path: ndarray | None = None, irls_max_iters: int = 10000, irls_tol: float = 1e-07, max_iters: int = 100000, tol: float = 1e-07, adev_tol: float = 0.9, ddev_tol: float = 0, newton_tol: float = 1e-12, newton_max_iters: int = 1000, n_threads: int = 1, early_exit: bool = True, intercept: bool = True, screen_rule: str = 'pivot', min_ratio: float = 0.01, lmda_path_size: int = 100, max_screen_size: int | None = None, max_active_size: int | None = None, pivot_subset_ratio: float = 0.1, pivot_subset_min: int = 1, pivot_slack_ratio: float = 1.25, check_state: bool = False, progress_bar: bool = True, warm_start=None, exit_cond: Callable | None = None)
Solves the group elastic net problem via the naive method.
The group elastic net problem minimizes the following:
\[\begin{split}\begin{align*} \mathrm{minimize}_{\beta, \beta_0} \quad& \ell(\eta) + \lambda \sum\limits_{g=1}^G \omega_g \left( \alpha \|\beta_g\|_2 + \frac{1-\alpha}{2} \|\beta_g\|_2^2 \right) \\ \text{subject to} \quad& \eta = X\beta + \beta_0 \mathbf{1} + \eta^0 \end{align*}\end{split}\]where \(\beta_0\) is the intercept, \(\beta\) is the coefficient vector, \(X\) is the feature matrix, \(\eta^0\) is a fixed offset vector, \(\lambda \geq 0\) is the regularization parameter, \(G\) is the number of groups, \(\omega \geq 0\) is the penalty factor, \(\alpha \in [0,1]\) is the elastic net parameter, \(\beta_g\) are the coefficients for the \(g\) th group, and \(\ell(\cdot)\) is the loss function defined by a GLM.
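To make the penalty term concrete, here is a minimal pure-Python sketch (not adelie's internal implementation) that evaluates \(\sum_{g=1}^G \omega_g \left( \alpha \|\beta_g\|_2 + \frac{1-\alpha}{2} \|\beta_g\|_2^2 \right)\) for toy inputs:

```python
import math

def group_enet_penalty(beta_groups, alpha, penalty):
    """Evaluate the group elastic net penalty (without the lambda factor).

    beta_groups: list of coefficient lists, one per group (beta_g).
    alpha: elastic net parameter in [0, 1].
    penalty: per-group penalty factors (omega_g).
    """
    total = 0.0
    for b_g, w_g in zip(beta_groups, penalty):
        norm = math.sqrt(sum(x * x for x in b_g))  # ||beta_g||_2
        # omega_g * (alpha * ||beta_g||_2 + (1 - alpha)/2 * ||beta_g||_2^2)
        total += w_g * (alpha * norm + 0.5 * (1.0 - alpha) * norm ** 2)
    return total

# alpha = 1 recovers the group lasso penalty; alpha = 0 the ridge penalty.
print(group_enet_penalty([[3.0, 4.0]], 1.0, [1.0]))  # -> 5.0
```

Setting alpha strictly between 0 and 1 mixes the two, which is what makes the penalty an elastic net.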
For multi-response problems (i.e. when \(y\) is 2-dimensional) such as in multigaussian or multinomial, the group elastic net problem minimizes the following:
\[\begin{split}\begin{align*} \mathrm{minimize}_{\beta, \beta_0} \quad& \ell(\eta) + \lambda \sum\limits_{g=1}^G \omega_g \left( \alpha \|\beta_g\|_2 + \frac{1-\alpha}{2} \|\beta_g\|_2^2 \right) \\ \text{subject to} \quad& \mathrm{vec}(\eta^\top) = (X\otimes I_K) \beta + (\mathbf{1}\otimes I_K) \beta_0 + \mathrm{vec}(\eta^{0\top}) \end{align*}\end{split}\]where \(\mathrm{vec}(\cdot)\) is the operator that flattens the input as column-major. Note that if intercept is True, then an intercept for each class is provided as additional unpenalized features in the data matrix and the global intercept is turned off.
- Parameters:
- X : (n, p) Union[ndarray, MatrixNaiveBase32, MatrixNaiveBase64]
  Feature matrix. It is typically one of the matrices defined in the adelie.matrix submodule or a numpy.ndarray.
- glm : Union[GlmBase32, GlmBase64, GlmMultiBase32, GlmMultiBase64]
  GLM object. It is typically one of the GLM classes defined in the adelie.glm submodule.
- constraints : (G,) list[Union[ConstraintBase32, ConstraintBase64]], optional
  List of constraints for each group. constraints[i] is the constraint object corresponding to group i. If constraints[i] is None, then the i-th group is unconstrained. If None, every group is unconstrained. Default is None.
- groups : (G,) ndarray, optional
  List of starting indices to each group, where G is the number of groups. groups[i] is the starting index of the i-th group. If glm is of multi-response type, then groups[i] is the starting feature index of the i-th group. In either case, groups[i] must be a value in the range \(\{0,\ldots, p-1\}\). Default is None, in which case it is set to np.arange(p).
- alpha : float, optional
  Elastic net parameter. It must be in the range \([0,1]\). Default is 1.
- penalty : (G,) ndarray, optional
  Penalty factor for each group in the same order as groups. It must be a non-negative vector. Default is None, in which case it is set to np.sqrt(group_sizes).
- offsets : (n,) or (n, K) ndarray, optional
  Observation offsets \(\eta^0\). Default is None, in which case it is set to np.zeros(n) if y is single-response and np.zeros((n, K)) if multi-response.
- lmda_path : (L,) ndarray, optional
  The regularization path to solve for. The full path is not considered if early_exit is True. It is recommended that the path is sorted in decreasing order. If None, the path is generated. Default is None.
- irls_max_iters : int, optional
  Maximum number of IRLS iterations. This parameter is only used if glm is not of gaussian type. Default is int(1e4).
- irls_tol : float, optional
  IRLS convergence tolerance. This parameter is only used if glm is not of gaussian type. Default is 1e-7.
- max_iters : int, optional
  Maximum number of coordinate descents. Default is int(1e5).
- tol : float, optional
  Coordinate descent convergence tolerance. Default is 1e-7.
- adev_tol : float, optional
  Percent deviance explained tolerance. Default is 0.9.
- ddev_tol : float, optional
  Difference in percent deviance explained tolerance. Default is 0.
- newton_tol : float, optional
  Convergence tolerance for the BCD update. Default is 1e-12.
- newton_max_iters : int, optional
  Maximum number of iterations for the BCD update. Default is 1000.
- n_threads : int, optional
  Number of threads. Default is 1.
- early_exit : bool, optional
  True if the function should early exit based on training deviance explained. Default is True.
- min_ratio : float, optional
  The ratio between the largest and smallest \(\lambda\) in the regularization sequence if it is to be generated. Default is 1e-2.
- lmda_path_size : int, optional
  Number of regularizations in the path if it is to be generated. Default is 100.
- intercept : bool, optional
  True if the function should fit with intercept. If y is multi-response, then an intercept for each class is added and the global intercept is turned off. Default is True.
- screen_rule : str, optional
  The type of screening rule to use. It must be one of the following options:
  - "strong": adds groups whose active scores are above the strong threshold.
  - "pivot": adds groups whose active scores are above the pivot cutoff with slack.
  Default is "pivot".
- max_screen_size : int, optional
  Maximum number of screen groups allowed. The function returns a valid state and guarantees that the screen set size is less than or equal to max_screen_size. If None, it is set to the total number of groups. Default is None.
- max_active_size : int, optional
  Maximum number of active groups allowed. The function returns a valid state and guarantees that the active set size is less than or equal to max_active_size. If None, it is set to the total number of groups. Default is None.
- pivot_subset_ratio : float, optional
  If screening takes place, then the (1 + pivot_subset_ratio) * s largest active scores are used to determine the pivot point, where s is the current screen set size. It is only used if screen_rule="pivot". Default is 0.1.
- pivot_subset_min : int, optional
  If screening takes place, then at least pivot_subset_min active scores are used to determine the pivot point. It is only used if screen_rule="pivot". Default is 1.
- pivot_slack_ratio : float, optional
  If screening takes place, then pivot_slack_ratio times the number of groups with the next smallest (new) active scores below the pivot point are also added to the screen set as slack. It is only used if screen_rule="pivot". Default is 1.25.
- check_state : bool, optional
  True if the state should be checked for inconsistencies before calling the solver. Default is False.
  Warning: The check may take a long time if the inputs are big!
- progress_bar : bool, optional
  True to enable the progress bar. Default is True.
- warm_start : optional
  If no warm-start is provided, the initial solution is set to 0 and other invariance quantities are set accordingly. Otherwise, the warm-start is used to extract all necessary state variables. If warm-start is used, the user must still provide consistent inputs; that is, warm-start will not overwrite most arguments passed into this function. However, changing configuration settings such as tolerance levels is well-defined. Default is None.
  Note: The primary use-case is when a user already called the function with warm_start=False but would like to continue fitting down a longer path of regularizations. This way, the user does not have to restart the fit at the beginning, but can simply continue from the last returned state.
  Warning: We have only tested warm-starts in the setting described in the note above, that is, when lmda_path and possibly static configurations have changed. Use with caution in other settings!
- exit_cond : Callable, optional
  If not None, it must be a callable object that takes a single argument: the current state object, of the same type as the return value. During the optimization, after obtaining the solution at each regularization value, exit_cond(state) is evaluated as an opportunity for the user to early exit the program based on their own rule. Default is None.
  Note: The algorithm early exits if exit_cond(state) evaluates to True or the built-in early exit function evaluates to True (if early_exit is True). The latter can be disabled with early_exit=False.
- Returns:
- state
The resulting state after running the solver.
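When lmda_path is None, the generated path interacts with min_ratio and lmda_path_size as described above. The sketch below shows the standard log-linear scheme for such a path (a sequence equally spaced on the log scale from the largest \(\lambda\) down to min_ratio times it); it is an illustration of the parameters' meaning, not a restatement of adelie's exact internal rule:

```python
def default_lmda_path(lmda_max, min_ratio=0.01, path_size=100):
    """Log-linearly spaced regularization path from lmda_max down to
    min_ratio * lmda_max with path_size points (illustrative sketch)."""
    if path_size == 1:
        return [lmda_max]
    # common ratio so that the last value equals min_ratio * lmda_max
    r = min_ratio ** (1.0 / (path_size - 1))
    return [lmda_max * r ** i for i in range(path_size)]

# e.g. with min_ratio=0.01 and 3 points, each step divides lambda by 10
print(default_lmda_path(2.0, min_ratio=0.01, path_size=3))
```

Because the path is decreasing, solutions along it are naturally suited to warm-starting, which is why a decreasing order is recommended for user-supplied paths as well.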
See also
adelie.adelie_core.state.StateGaussianNaive32
adelie.adelie_core.state.StateGaussianNaive64
adelie.adelie_core.state.StateGlmNaive32
adelie.adelie_core.state.StateGlmNaive64
adelie.adelie_core.state.StateMultiGaussianNaive32
adelie.adelie_core.state.StateMultiGaussianNaive64
adelie.adelie_core.state.StateMultiGlmNaive32
adelie.adelie_core.state.StateMultiGlmNaive64
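As a usage illustration for exit_cond, any callable over the returned state works. The sketch below uses a stand-in state class so it is self-contained; the devs attribute (deviance explained at each solved regularization) is an assumed name for illustration only, so consult the attributes of the state actually returned by grpnet before relying on it:

```python
class FakeState:
    """Stand-in for the solver state object (illustration only).

    `devs` is an assumed attribute name holding the percent deviance
    explained at each regularization solved so far.
    """
    def __init__(self, devs):
        self.devs = devs

def make_exit_cond(dev_target):
    """Build an exit_cond callable that stops the path early once the
    most recent solution explains at least dev_target of the deviance."""
    def exit_cond(state):
        return bool(state.devs) and state.devs[-1] >= dev_target
    return exit_cond

cond = make_exit_cond(0.8)
print(cond(FakeState([0.5, 0.85])))  # -> True
```

A callable built this way would be passed as exit_cond=cond; it is evaluated after each regularization value, in addition to (not instead of) the built-in early-exit rule unless early_exit=False.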