adelie.cv.cv_grpnet
- adelie.cv.cv_grpnet(X: ndarray | MatrixNaiveBase32 | MatrixNaiveBase64, glm: GlmBase32 | GlmBase64 | GlmMultiBase32 | GlmMultiBase64, *, n_threads: int = 1, early_exit: bool = False, min_ratio: float = 0.1, lmda_path_size: int = 100, n_folds: int = 5, seed: int | None = None, **grpnet_params)
Solves cross-validated group elastic net via the naive method.
This function was written with the intent that glm is one of the GLMs defined in adelie.glm. In particular, we assume the observation weights w associated with glm have the property that if w[i] == 0, then the i-th prediction \(\eta_i\) is ignored in the computation of the loss.
- Parameters:
- X (n, p) Union[ndarray, MatrixNaiveBase32, MatrixNaiveBase64]
Feature matrix. It is typically one of the matrices defined in the adelie.matrix submodule or a numpy.ndarray.
- glm Union[GlmBase32, GlmBase64, GlmMultiBase32, GlmMultiBase64]
GLM object. It is typically one of the GLM classes defined in the adelie.glm submodule.
- n_threads int, optional
Number of threads. Default is 1.
- early_exit bool, optional
True if the function should exit early based on training deviance explained. Unlike in adelie.solver.grpnet(), the default value is False. This is because internally, we construct a common regularization path that roughly contains every path generated from each training fold. If early_exit is True, then some training folds may not fit some of the smaller \(\lambda\)’s, in which case an extrapolation method based on adelie.diagnostic.coefficient() is used. To avoid misinterpretation of the CV loss curve by the general user, we disable early exiting and fit on the entire (common) path for every training fold. If early_exit is True, the user may see a flat component to the right of the loss curve and must be aware that this may be due to the extrapolation giving the same coefficients. Default is False.
- min_ratio float, optional
The ratio of the smallest to the largest \(\lambda\) in the regularization sequence. Unlike in adelie.solver.grpnet(), the default value is increased. This is because CV tends to pick a \(\lambda\) early in the path. If the loss curve does not look bowl-shaped, the user may decrease this value to fit further down the regularization path. Default is 1e-1.
- lmda_path_size int, optional
Number of regularizations in the path. Default is 100.
- n_folds int, optional
Number of CV folds. Default is 5.
- seed int, optional
Seed for random number generation. If None, the seed is not explicitly set. Default is None.
- **grpnet_params optional
Parameters to adelie.solver.grpnet(). The following cannot be specified: ddev_tol, which is internally enforced to be 0. Otherwise, the solver may stop too early when early_exit=True.
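To make the roles of min_ratio and lmda_path_size concrete, the following is a minimal sketch of how a regularization path could be laid out. It assumes geometric (log-linear) spacing from the largest \(\lambda\) down to min_ratio times that value, which is conventional for elastic net solvers but is not stated on this page. The helper lambda_path and the value of lmda_max are hypothetical and are not part of adelie.

```python
import numpy as np


def lambda_path(lmda_max, min_ratio=0.1, lmda_path_size=100):
    """Hypothetical helper: a decreasing, geometrically spaced path.

    Assumes the path starts at lmda_max and ends at min_ratio * lmda_max.
    """
    # Exponents run from 0 (largest lambda) to 1 (smallest lambda).
    t = np.linspace(0.0, 1.0, lmda_path_size)
    return lmda_max * min_ratio ** t


# lmda_max = 2.0 is an arbitrary illustrative value.
path = lambda_path(lmda_max=2.0, min_ratio=0.1, lmda_path_size=100)
print(path[0], path[-1], len(path))

# A smaller min_ratio extends the same number of points further down the path.
longer = lambda_path(lmda_max=2.0, min_ratio=0.01, lmda_path_size=100)
print(longer[-1])
```

Under this assumption, decreasing min_ratio keeps the starting point fixed and pushes the end of the path toward smaller \(\lambda\), which is the adjustment suggested above when the loss curve does not look bowl-shaped.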
- Returns:
- result CVGrpnetResult
Result of running K-fold CV.
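The weight assumption stated at the top of this page can be illustrated with a small NumPy sketch. It shows one way a held-out fold could be expressed by setting its observation weights to zero, so that predictions for those observations do not affect the training loss. The helpers fold_weights and weighted_squared_loss are hypothetical and are not adelie functions. The random fold assignment and the squared-error loss are assumptions made for illustration, and the actual procedure used by cv_grpnet may differ.

```python
import numpy as np


def fold_weights(n, n_folds=5, seed=None):
    """Hypothetical helper: one row of training weights per fold.

    Held-out observations get weight 0; the remaining weights are
    renormalized to sum to 1.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    folds = np.array_split(order, n_folds)
    weights = np.full((n_folds, n), 1.0 / n)
    for k, held_out in enumerate(folds):
        weights[k, held_out] = 0.0
        weights[k] /= weights[k].sum()
    return weights


def weighted_squared_loss(y, eta, w):
    """Hypothetical loss: weighted squared error between y and predictions eta."""
    return 0.5 * np.sum(w * (y - eta) ** 2)


n = 10
W = fold_weights(n, n_folds=5, seed=0)
y = np.arange(n, dtype=float)
eta = np.zeros(n)

# Perturb the predictions only on the held-out observations of fold 0.
eta_perturbed = eta.copy()
eta_perturbed[W[0] == 0] += 100.0

# Both losses agree because zero-weight observations contribute nothing.
print(weighted_squared_loss(y, eta, W[0]))
print(weighted_squared_loss(y, eta_perturbed, W[0]))
```

Fixing seed makes the fold assignment in this sketch reproducible, which mirrors the purpose of the seed parameter described above.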