adelie.diagnostic.gradient_norms

adelie.diagnostic.gradient_norms(grads: ndarray, betas: csr_matrix, duals: csr_matrix, lmdas: ndarray, *, constraints: list[ConstraintBase32 | ConstraintBase64] | None = None, groups: ndarray | None = None, alpha: float = 1, penalty: ndarray | None = None)

Computes the group-wise gradient norms.

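The group-wise gradient norm is given by $\hat{h} \in \mathbb{R}^{G}$ where

$$\hat{h}_g = \|\hat{\gamma}_g - \lambda (1-\alpha) \omega_g \beta_g - \phi_g'(\beta_g)^\top \mu_g\|_2, \quad g = 1, \ldots, G$$

Here, $\hat{\gamma}_g$ is the gradient as in adelie.diagnostic.gradients(), $\lambda$ is the regularization, $\alpha$ is the elastic net proportion, $\omega_g$ is the penalty factor, $\beta_g$ is the coefficient block for group $g$, $\phi_g$ is the constraint function for group $g$, and $\mu_g$ is the dual block for group $g$.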
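As a rough illustration of the formula above, the following NumPy sketch computes the norms for a single regularization value in the unconstrained case, where the constraint term drops out. It is not adelie's implementation: `group_gradient_norms` is a hypothetical helper, the inputs are dense arrays rather than `csr_matrix` rows, and group starts are assumed to be zero-based indices into the coefficient vector.

```python
import numpy as np

def group_gradient_norms(grad, beta, lmda, groups, group_sizes, alpha=1.0, penalty=None):
    """Hypothetical reference: unconstrained group-wise gradient norms for one lambda."""
    if penalty is None:
        # Assumed to mirror the documented default penalty.
        penalty = np.sqrt(group_sizes)
    norms = np.empty(len(groups))
    for g, (start, size) in enumerate(zip(groups, group_sizes)):
        block = slice(start, start + size)
        # gamma_g - lambda * (1 - alpha) * omega_g * beta_g (no constraint term)
        shifted = grad[block] - lmda * (1 - alpha) * penalty[g] * beta[block]
        norms[g] = np.linalg.norm(shifted)
    return norms

# Toy data: p = 3 features split into two groups of sizes 2 and 1.
grad = np.array([3.0, 4.0, 1.0])
beta = np.array([1.0, 0.0, 2.0])
groups = np.array([0, 2])
group_sizes = np.array([2, 1])

lasso_norms = group_gradient_norms(grad, beta, 0.5, groups, group_sizes, alpha=1.0)
enet_norms = group_gradient_norms(grad, beta, 0.5, groups, group_sizes, alpha=0.5)
```

With `alpha=1` the ridge term vanishes, so each entry reduces to the plain Euclidean norm of the gradient block.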

Parameters:
grads : (L, p) or (L, p, K) ndarray

Gradients.

betas : (L, p) or (L, p*K) csr_matrix

Coefficient vectors $\beta$.

duals : (L, d) csr_matrix

Dual vectors $\mu$.

lmdas : (L,) ndarray

Regularization parameters $\lambda$.

constraints : (G,) list[Union[ConstraintBase32, ConstraintBase64]], optional

List of constraints for each group. constraints[i] is the constraint object corresponding to group i. If constraints[i] is None, then the i-th group is unconstrained. If constraints itself is None, every group is unconstrained. Default is None.

groups : (G,) ndarray, optional

List of starting indices to each group where G is the number of groups. groups[i] is the starting index of the i-th group. If glm is of multi-response type, then groups[i] is the starting feature index of the i-th group. In either case, groups[i] must then be a value in the range $\{1, \ldots, p\}$. Default is None, in which case it is set to np.arange(p).
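To make the starting-index layout concrete, here is a small NumPy sketch that recovers group sizes from a groups array. It assumes zero-based starting indices, consistent with the np.arange(p) default; the values of `p` and `groups` are made up for illustration.

```python
import numpy as np

p = 6                          # number of features (illustrative)
groups = np.array([0, 2, 5])   # group 0 starts at 0, group 1 at 2, group 2 at 5

# Each group runs until the next start; the last one runs to p.
group_sizes = np.diff(np.append(groups, p))

# The default grouping puts every feature in its own group of size 1.
default_groups = np.arange(p)
default_sizes = np.diff(np.append(default_groups, p))
```

Under this layout the three groups cover features 0-1, 2-4, and 5 respectively.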

alpha : float, optional

Elastic net parameter $\alpha$. It must be in the range $[0, 1]$. Default is 1.

penalty : (G,) ndarray, optional

Penalty factor for each group in the same order as groups. It must be a non-negative vector. Default is None, in which case it is set to np.sqrt(group_sizes).
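The default penalty described above can be reproduced by hand, which is useful as a starting point when only some groups need a custom weight. This is a sketch with made-up group sizes; setting an entry to zero to leave a group unpenalized is shown only as an example of a valid non-negative vector.

```python
import numpy as np

group_sizes = np.array([2, 3, 1])          # illustrative group sizes

# Documented default: square root of each group's size.
default_penalty = np.sqrt(group_sizes)

# A custom non-negative vector in the same order as groups,
# here leaving the first group unpenalized.
custom_penalty = default_penalty.copy()
custom_penalty[0] = 0.0
```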

Returns:
norms : (L, G) ndarray

Gradient norms.