Frequently-Asked Questions#
[1]:
import adelie as ad
import logging
import numpy as np
How to properly solve for a single regularization value?#
The bread-and-butter of our solver is block-coordinate descent, which, like most iterative algorithms, works stunningly well when warm-started with a good initial value. As such, we strongly recommend that users always solve the group elastic net along a path of regularization values. If no path is provided, adelie
auto-generates one that is evenly spaced on the log-scale, starting from \(\lambda_{\max}\) (the smallest \(\lambda\) at which the optimal penalized coefficients are exactly
\(0\)) and ending at min_ratio
times \(\lambda_{\max}\). If the user is only interested in the solution at a particular \(\lambda^\star\), we recommend the methods discussed below.
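For intuition, this default path construction can be sketched in a few lines of numpy. This is an illustration of the idea only, not adelie's internal code, and the lmda_max and min_ratio values below are placeholders:

```python
import numpy as np

lmda_max = 0.5        # placeholder; in practice this comes from the solver
min_ratio = 1e-2      # illustrative ratio
lmda_path_size = 20   # number of grid points

# Evenly spaced on the log-scale, decreasing from lmda_max
# down to min_ratio * lmda_max.
lmda_path = lmda_max * np.logspace(
    0, np.log10(min_ratio), num=lmda_path_size, endpoint=True
)

assert np.isclose(lmda_path[0], lmda_max)
assert np.isclose(lmda_path[-1], min_ratio * lmda_max)
```

The log-scale spacing places more grid points near the small-\(\lambda\) end of the path, where the solution typically changes most rapidly.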
For simplicity, we work under the lasso setting, though the discussion carries over to the general group elastic net setting as well. We first generate a dataset.
[2]:
n = 100 # number of samples
p = 200 # number of features
seed = 0 # random seed
lmda_star = 1e-2 # user-specified lambda
np.random.seed(seed)
X = np.asfortranarray(np.random.normal(0, 1, (n, p)))
y = X[:,0] * np.random.normal(0, 1) + np.random.normal(0, 1, n)
Next, we run a “dry-run” of the solver to find \(\lambda_{\max}\). We recommend this method in general since \(\lambda_{\max}\) is difficult to determine when some coefficients are unpenalized (e.g. \(0\) penalty factor for some groups).
[3]:
with ad.logger.logger_level(logging.ERROR):
    state = ad.grpnet(X, ad.glm.gaussian(y), lmda_path_size=0, progress_bar=False)
    lmda_max = state.lmda_max
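For the plain lasso, \(\lambda_{\max}\) also has a closed form, which can help sanity-check the dry-run. The sketch below uses the textbook KKT condition for the objective \(\frac{1}{2n}\|y - X\beta\|_2^2 + \lambda \|\beta\|_1\) with an unpenalized intercept; this is an assumption about the parametrization, and adelie's own weighting or standardization conventions may make its reported value differ by a constant factor:

```python
import numpy as np

# Regenerate the same toy data as above (standalone for illustration).
n, p = 100, 200
np.random.seed(0)
X = np.random.normal(0, 1, (n, p))
y = X[:, 0] * np.random.normal(0, 1) + np.random.normal(0, 1, n)

# With an unpenalized intercept, the residual at the all-zero solution
# is the centered response. beta = 0 is optimal iff
# max_j |x_j^T (y - mean(y))| / n <= lmda.
resid = y - y.mean()
lmda_max_manual = np.max(np.abs(X.T @ resid)) / n

# By construction, the zero solution satisfies the KKT conditions
# at every lmda >= lmda_max_manual.
assert np.all(np.abs(X.T @ resid) / n <= lmda_max_manual)
```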
We now discuss the first method to solve for \(\lambda^\star\). The idea is to generate an evenly-spaced path on the log-scale from \(\lambda_{\max}\) to \(\lambda^\star\). The easiest way to do this is to set min_ratio
such that min_ratio
times \(\lambda_{\max}\) is precisely \(\lambda^\star\). Note that if min_ratio
is larger than \(1\) (i.e. \(\lambda^\star > \lambda_{\max}\)), then the solution at min_ratio
times \(\lambda_{\max}\) is equivalent to that at \(\lambda_{\max}\), which is why we cap it at \(1\) below. Moreover, the
user can control the fineness of the gridding via the lmda_path_size
argument. Finally, we set early_exit
to False
so that the solver always fits until the end of the path.
[9]:
min_ratio = lmda_star / lmda_max # min_ratio * lmda_max == lmda_star
lmda_path_size = 20 # number of grid points on the path
state = ad.grpnet(
    X,
    ad.glm.gaussian(y),
    min_ratio=min(1, min_ratio),
    lmda_path_size=lmda_path_size,
    early_exit=False,
)
100%|██████████| 20/20 [00:00:00<00:00:00, 4873.59it/s] [dev:98.0%]
We can now verify that the last fitted regularization is indeed \(\lambda^\star\).
[11]:
assert state.lmdas[-1] == lmda_star
The more general method is to provide a path of \(\lambda\) values directly to the solver. Suppose the user wishes to supply a differently-generated path from \(\lambda_{\max}\) to \(\lambda^\star\) (e.g. evenly spaced on the original scale rather than the log-scale). Then, we may run the solver with the following arguments:
[16]:
lmda_path = np.linspace(lmda_max, lmda_star, num=lmda_path_size, endpoint=True)
state = ad.grpnet(
    X,
    ad.glm.gaussian(y),
    lmda_path=lmda_path,
    early_exit=False,
)
100%|██████████| 20/20 [00:00:00<00:00:00, 4388.17it/s] [dev:98.0%]
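To see how this linearly-spaced path differs from the default log-scale gridding, here is a quick standalone numpy comparison (purely illustrative; the endpoint values are placeholders):

```python
import numpy as np

lmda_max, lmda_star, size = 0.5, 1e-2, 20  # placeholder values

linear = np.linspace(lmda_max, lmda_star, num=size, endpoint=True)
log_scale = np.geomspace(lmda_max, lmda_star, num=size, endpoint=True)

# Both paths share the same endpoints...
assert np.isclose(linear[0], log_scale[0])
assert np.isclose(linear[-1], log_scale[-1])

# ...but the log-scale path concentrates far more grid points near
# lmda_star, where the fitted coefficients typically change fastest.
assert np.sum(log_scale < 0.1) > np.sum(linear < 0.1)
```

Either spacing is acceptable to the solver; the path simply needs to decrease from \(\lambda_{\max}\) to \(\lambda^\star\) so that each fit warm-starts the next.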
Once again, we verify that the last fitted regularization is indeed \(\lambda^\star\).
[17]:
assert state.lmdas[-1] == lmda_star