Frequently-Asked Questions#
[1]:
import adelie as ad
import logging
import numpy as np
How to properly solve for a single regularization value?#
The bread-and-butter of our solver is block-coordinate descent, which, like most iterative algorithms, works stunningly well when warm-started with a good initial value. As such, we strongly recommend that users always solve the group elastic net along a path of regularization values. If no path is provided, adelie
auto-generates one that is evenly spaced on the log-scale, starting from \(\lambda_{\max}\) (the smallest \(\lambda\) at which the optimal penalized coefficients are exactly
\(0\)) and ending at min_ratio
times \(\lambda_{\max}\). If the user is only interested in the solution at a particular \(\lambda^\star\), we recommend the methods discussed below.
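For intuition, this default path construction can be sketched in a few lines of numpy. This is an illustration of the idea only, not adelie's internal code, and the lmda_max and min_ratio values below are placeholders:

```python
import numpy as np

lmda_max = 0.5        # placeholder; in practice this comes from the solver
min_ratio = 1e-2      # illustrative ratio
lmda_path_size = 20   # number of grid points

# Evenly spaced on the log-scale, decreasing from lmda_max
# down to min_ratio * lmda_max.
lmda_path = lmda_max * np.logspace(
    0, np.log10(min_ratio), num=lmda_path_size, endpoint=True
)

assert np.isclose(lmda_path[0], lmda_max)
assert np.isclose(lmda_path[-1], min_ratio * lmda_max)
```

The log-scale spacing places more grid points near the small-\(\lambda\) end of the path, where the solution typically changes most rapidly.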
For simplicity, we work under the lasso setting, though the discussion carries over to the general group elastic net setting as well. We first generate a dataset.
[2]:
n = 100 # number of samples
p = 200 # number of features
seed = 0 # random seed
lmda_star = 1e-2 # user-specified lambda
np.random.seed(seed)
X = np.asfortranarray(np.random.normal(0, 1, (n, p)))
y = X[:,0] * np.random.normal(0, 1) + np.random.normal(0, 1, n)
Next, we run a “dry-run” of the solver to find \(\lambda_{\max}\). We recommend this method in general since \(\lambda_{\max}\) is difficult to determine when some coefficients are unpenalized (e.g. \(0\) penalty factor for some groups).
[3]:
with ad.logger.logger_level(logging.ERROR):
    state = ad.grpnet(X, ad.glm.gaussian(y), lmda_path_size=0, progress_bar=False)
    lmda_max = state.lmda_max
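For the plain lasso, \(\lambda_{\max}\) also has a closed form, which can help sanity-check the dry-run. The sketch below uses the textbook KKT condition for the objective \(\frac{1}{2n}\|y - X\beta\|_2^2 + \lambda \|\beta\|_1\) with an unpenalized intercept; this is an assumption about the parametrization, and adelie's own weighting or standardization conventions may make its reported value differ by a constant factor:

```python
import numpy as np

# Regenerate the same toy data as above (standalone for illustration).
n, p = 100, 200
np.random.seed(0)
X = np.random.normal(0, 1, (n, p))
y = X[:, 0] * np.random.normal(0, 1) + np.random.normal(0, 1, n)

# With an unpenalized intercept, the residual at the all-zero solution
# is the centered response. beta = 0 is optimal iff
# max_j |x_j^T (y - mean(y))| / n <= lmda.
resid = y - y.mean()
lmda_max_manual = np.max(np.abs(X.T @ resid)) / n

# By construction, the zero solution satisfies the KKT conditions
# at every lmda >= lmda_max_manual.
assert np.all(np.abs(X.T @ resid) / n <= lmda_max_manual)
```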
We now discuss the first method to solve for \(\lambda^\star\). The idea is to generate an evenly-spaced path on the log-scale from \(\lambda_{\max}\) to \(\lambda^\star\). The easiest way to do this is to set min_ratio
such that min_ratio
times \(\lambda_{\max}\) is precisely \(\lambda^\star\). Note that if min_ratio
is larger than \(1\) (i.e. \(\lambda^\star > \lambda_{\max}\)), then the solution at min_ratio
times \(\lambda_{\max}\) is equivalent to that at \(\lambda_{\max}\), which is why we cap it at \(1\) below. Moreover, the
user can control the fineness of the gridding via the lmda_path_size
argument. Finally, we set early_exit
to False
so that the solver always fits until the end of the path.
[9]:
min_ratio = lmda_star / lmda_max # min_ratio * lmda_max == lmda_star
lmda_path_size = 20 # number of grid points on the path
state = ad.grpnet(
    X,
    ad.glm.gaussian(y),
    min_ratio=min(1, min_ratio),
    lmda_path_size=lmda_path_size,
    early_exit=False,
)
100%|██████████| 20/20 [00:00:00<00:00:00, 4873.59it/s] [dev:98.0%]
We can now verify that the last fitted regularization is indeed \(\lambda^\star\).
[11]:
assert state.lmdas[-1] == lmda_star
The more general method is to provide a path of \(\lambda\) values directly to the solver. Suppose the user wishes to supply a differently-generated path from \(\lambda_{\max}\) to \(\lambda^\star\) (e.g. evenly spaced on the original scale rather than the log-scale). Then, we may run the solver with the following arguments:
[16]:
lmda_path = np.linspace(lmda_max, lmda_star, num=lmda_path_size, endpoint=True)
state = ad.grpnet(
    X,
    ad.glm.gaussian(y),
    lmda_path=lmda_path,
    early_exit=False,
)
100%|██████████| 20/20 [00:00:00<00:00:00, 4388.17it/s] [dev:98.0%]
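To see how this linearly-spaced path differs from the default log-scale gridding, here is a quick standalone numpy comparison (purely illustrative; the endpoint values are placeholders):

```python
import numpy as np

lmda_max, lmda_star, size = 0.5, 1e-2, 20  # placeholder values

linear = np.linspace(lmda_max, lmda_star, num=size, endpoint=True)
log_scale = np.geomspace(lmda_max, lmda_star, num=size, endpoint=True)

# Both paths share the same endpoints...
assert np.isclose(linear[0], log_scale[0])
assert np.isclose(linear[-1], log_scale[-1])

# ...but the log-scale path concentrates far more grid points near
# lmda_star, where the fitted coefficients typically change fastest.
assert np.sum(log_scale < 0.1) > np.sum(linear < 0.1)
```

Either spacing is acceptable to the solver; the path simply needs to decrease from \(\lambda_{\max}\) to \(\lambda^\star\) so that each fit warm-starts the next.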
Once again, we verify that the last fitted regularization is indeed \(\lambda^\star\).
[17]:
assert state.lmdas[-1] == lmda_star