adelie.diagnostic.gradient_norms

adelie.diagnostic.gradient_norms(grads: ndarray, betas: csr_matrix, duals: csr_matrix, lmdas: ndarray, *, constraints: list[ConstraintBase32 | ConstraintBase64] | None = None, groups: ndarray | None = None, alpha: float = 1, penalty: ndarray | None = None)

Computes the group-wise gradient norms.

The group-wise gradient norm is given by \(\hat{h} \in \mathbb{R}^{G}\) where

\[\begin{align*} \hat{h}_g = \| \hat{\gamma}_g - \lambda (1-\alpha) \omega_g \beta_g - \phi_g'(\beta_g)^\top \mu_g \|_2 \quad g=1,\ldots, G \end{align*}\]

where \(\hat{\gamma}_g\) is the gradient block for group \(g\) as in adelie.diagnostic.gradients(), \(\lambda\) is the regularization parameter, \(\alpha\) is the elastic net proportion, \(\omega_g\) is the penalty factor for group \(g\), \(\beta_g\) is the coefficient block for group \(g\), \(\phi_g\) is the constraint function for group \(g\), and \(\mu_g\) is the dual block for group \(g\).
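To make the formula concrete, here is a minimal sketch of the unconstrained, single-response case for one value of \(\lambda\), where the \(\phi_g'(\beta_g)^\top \mu_g\) term drops out. It is written in plain Python so the arithmetic is visible. It is not adelie's implementation: the function name `group_gradient_norms` and the explicit `group_sizes` argument are assumptions made for illustration.

```python
import math


def group_gradient_norms(grad, beta, lmda, groups, group_sizes, penalty, alpha=1.0):
    """Illustrative only: h_g = || gamma_g - lmda * (1 - alpha) * omega_g * beta_g ||_2.

    Assumes no constraints (the dual term is zero) and dense lists of length p.
    `groups` holds 0-based starting indices; `group_sizes` holds each group's length.
    """
    norms = []
    for start, size, omega in zip(groups, group_sizes, penalty):
        # Residual for every coordinate in the block of group g.
        resid = [
            grad[j] - lmda * (1.0 - alpha) * omega * beta[j]
            for j in range(start, start + size)
        ]
        norms.append(math.sqrt(sum(r * r for r in resid)))
    return norms


# Two groups: features {0, 1} and feature {2}. With alpha = 1 the ridge term
# vanishes, so each entry is just the Euclidean norm of the gradient block.
print(group_gradient_norms(
    grad=[3.0, 4.0, 1.0],
    beta=[1.0, 0.0, 2.0],
    lmda=2.0,
    groups=[0, 2],
    group_sizes=[2, 1],
    penalty=[1.0, 1.0],
    alpha=1.0,
))  # [5.0, 1.0]
```

With `alpha < 1`, each block is shifted by the ridge term \(\lambda (1-\alpha) \omega_g \beta_g\) before the norm is taken.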

Parameters:

grads (L, p) or (L, p, K) ndarray: Gradients.
betas (L, p) or (L, p*K) csr_matrix: Coefficient vectors \(\beta\).
duals (L, d) csr_matrix: Dual vectors \(\mu\).
lmdas (L,) ndarray: Regularization parameters \(\lambda\).
constraints (G,) list[Union[ConstraintBase32, ConstraintBase64]], optional: List of constraints for each group. constraints[i] is the constraint object corresponding to group i. If constraints[i] is None, then the i-th group is unconstrained. If None, every group is unconstrained. Default is None.
groups (G,) ndarray, optional: List of starting indices to each group where G is the number of groups. groups[i] is the starting index of the i-th group. If glm is of multi-response type, then groups[i] is the starting feature index of the i-th group. In either case, groups[i] must be a value in the range \(\{0,\ldots, p-1\}\). Default is None, in which case it is set to np.arange(p).
alpha float, optional: Elastic net parameter \(\alpha\). It must be in the range \([0,1]\). Default is 1.
penalty (G,) ndarray, optional: Penalty factor for each group in the same order as groups. It must be a non-negative vector. Default is None, in which case it is set to np.sqrt(group_sizes).
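The defaults for groups and penalty interact: when groups is left as np.arange(p), every group has size one, so the default penalty np.sqrt(group_sizes) is a vector of ones. The sketch below illustrates this in plain Python. The helper name `resolve_group_defaults` is hypothetical, and deriving group sizes as differences between consecutive starting indices (with p as the final boundary) is an assumption about how group_sizes is obtained, not something this page states.

```python
import math


def resolve_group_defaults(p, groups=None, penalty=None):
    """Illustrative only: fill in the documented defaults for groups and penalty."""
    if groups is None:
        # Documented default: np.arange(p), i.e. one feature per group.
        groups = list(range(p))
    # Assumed: each group runs from its start up to the next group's start.
    ends = list(groups[1:]) + [p]
    group_sizes = [end - start for start, end in zip(groups, ends)]
    if penalty is None:
        # Documented default: np.sqrt(group_sizes).
        penalty = [math.sqrt(size) for size in group_sizes]
    return list(groups), group_sizes, list(penalty)


print(resolve_group_defaults(4))
# ([0, 1, 2, 3], [1, 1, 1, 1], [1.0, 1.0, 1.0, 1.0])
```

Passing explicit starting indices, for example `resolve_group_defaults(5, groups=[0, 2])`, gives group sizes `[2, 3]` and a default penalty of the square roots of those sizes.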

Returns:

norms (L, G) ndarray: Gradient norms, with one row per regularization parameter in lmdas and one column per group.