adelie.glm.cox

adelie.glm.cox(start: ndarray, stop: ndarray, status: ndarray, *, strata: ndarray | None = None, weights: ndarray | None = None, tie_method: str = 'efron', dtype: float32 | float64 | None = None)

Creates a Cox GLM family object.

The Cox GLM family specifies the loss function as:

\[\begin{split}\begin{align*} \ell(\eta) &= -\sum\limits_{i=1}^n w_i \delta_i \eta_i +\sum\limits_{i=1}^n \overline{w}_i \delta_i A_i(\eta) \\ A_i(\eta) &= \log\left( \sum\limits_{k \in R(t_i)} w_k e^{\eta_k} - \sigma_i \sum\limits_{k \in H(t_i)} w_k e^{\eta_k} \right) \end{align*}\end{split}\]

where

\[\begin{split}\begin{align*} R(u) &= \{i : u \in (s_i, t_i]\} \\ H(u) &= \{i : t_i = u, \delta_i = 1\} \\ \overline{w}_i &= \frac{\sum_{k \in H(t_i)} w_k}{\sum_{k \in H(t_i)} 1_{w_k > 0}} 1_{\delta_i = 1, w_i > 0} \end{align*}\end{split}\]

Here, \(\delta\) is the status vector (1 for event, 0 for censored), \(s\) is the vector of start times, \(t\) is the vector of stop times, \(R(u)\) is the at-risk set at time \(u\), \(H(u)\) is the set of events tied at time \(u\), \(\overline{w}\) is the vector of average weights within ties, where the average is taken over the tied observations with positive weight, and \(\sigma\) is the vector of correction scales for ties, which is determined by the tie-breaking method (Breslow or Efron). Note that \(\overline{w}_i\) and \(A_i(\eta)\) are only well-defined when \(\delta_i=1\). This is not an issue when computing \(\ell(\eta)\), because the corresponding term is multiplied by \(\delta_i\).
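To make the definitions above concrete, here is a minimal NumPy sketch that evaluates the loss directly from the formula in the Breslow case, assuming that Breslow corresponds to \(\sigma_i = 0\) for every \(i\) (the standard Breslow approximation). The function name cox_loss_breslow is hypothetical and not part of adelie. The loop is quadratic in \(n\) and is meant only for checking small examples, not as a description of how the library computes the loss.

```python
import numpy as np


def cox_loss_breslow(eta, start, stop, status, weights):
    """Evaluate the Cox loss above, assuming sigma_i = 0 for all i (Breslow)."""
    eta = np.asarray(eta, dtype=float)
    start = np.asarray(start, dtype=float)
    stop = np.asarray(stop, dtype=float)
    status = np.asarray(status, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()          # normalize so that the weights sum to 1
    we = w * np.exp(eta)     # w_k * exp(eta_k)

    # First term: -sum_i w_i * delta_i * eta_i
    loss = -np.sum(w * status * eta)

    for i in range(len(eta)):
        if status[i] != 1 or w[i] <= 0:
            continue  # w_bar_i is zero for these observations
        # R(t_i): observations whose interval (s_k, t_k] contains t_i
        at_risk = (start < stop[i]) & (stop[i] <= stop)
        # H(t_i): events tied at time t_i
        ties = (stop == stop[i]) & (status == 1)
        # Average weight over tied events with positive weight
        w_bar = w[ties].sum() / np.count_nonzero(w[ties] > 0)
        # A_i(eta) with sigma_i = 0
        loss += w_bar * np.log(we[at_risk].sum())
    return loss


# Three subjects, all entering at time 0, with events at times 1 and 3.
example_loss = cox_loss_breslow(
    eta=np.zeros(3),
    start=np.zeros(3),
    stop=np.array([1.0, 2.0, 3.0]),
    status=np.array([1.0, 0.0, 1.0]),
    weights=np.ones(3),
)
```

With \(\eta = 0\) and uniform weights, the first event sees the full risk set (total weight 1, contributing \(\log 1 = 0\)) and the last event sees only itself, so example_loss equals \(\tfrac{1}{3}\log\tfrac{1}{3}\).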

The link function is given by

\[\begin{align*} g(\mu)_i = \log(\mu_i) \end{align*}\]

If strata is specified, then the loss function is simply the sum of the losses over the strata. Namely, given the strata vector \(S\),

\[\begin{align*} \ell(\eta) &= \sum_{m=0}^{M-1} \ell_m(\eta_{\mathcal{I}_m}) \end{align*}\]

where \(M\) is the number of strata, \(\mathcal{I}_m = \{i : S_i = m\}\) is the set of individuals in stratum \(m\), \(\eta_I\) is the subset of \(\eta\) given by the indices in \(I\), and \(\ell_m\) is the usual Cox loss function as above using only the input data in \(\mathcal{I}_m\).

Note

The strata indicator \(S_i\) must take on values in the set \(\{0, \ldots, M-1\}\).
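Because the strata indicator must be coded as \(0, \ldots, M-1\), raw labels (strings, site IDs, and so on) need to be re-encoded first. The sketch below shows one way to do this with numpy.unique, and how the stratified loss decomposes over the index sets \(\mathcal{I}_m\). Both encode_strata and stratified_loss are hypothetical helpers, and loss_fn is a stand-in callable, not adelie's Cox loss.

```python
import numpy as np


def encode_strata(labels):
    """Map arbitrary stratum labels to integer codes 0, ..., M-1."""
    uniques, codes = np.unique(np.asarray(labels), return_inverse=True)
    return codes.reshape(-1), len(uniques)


def stratified_loss(loss_fn, eta, strata, n_strata):
    """Sum loss_fn over the index sets I_m = {i : S_i = m}.

    loss_fn(eta_sub, idx) is a hypothetical callable that receives the
    sub-vector of eta and the indices of the stratum, so it can slice
    its own data (start, stop, status, weights) accordingly.
    """
    eta = np.asarray(eta, dtype=float)
    total = 0.0
    for m in range(n_strata):
        idx = np.flatnonzero(strata == m)  # I_m
        total += loss_fn(eta[idx], idx)
    return total


labels = ["b", "a", "b", "c"]
strata, n_strata = encode_strata(labels)  # codes follow sorted label order

# Stand-in loss (sum of squares), used only to show the decomposition.
total = stratified_loss(
    lambda eta_sub, idx: float(np.sum(eta_sub ** 2)),
    eta=[1.0, 2.0, 3.0, 4.0],
    strata=strata,
    n_strata=n_strata,
)
```

Any encoding works as long as every code in \(\{0, \ldots, M-1\}\) is used consistently; sorted order is simply what numpy.unique produces.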

Parameters:
start : (n,) ndarray

Start time vector \(s\).

stop : (n,) ndarray

Stop time vector \(t\).

status : (n,) ndarray

Status vector \(\delta\).

strata : (n,) ndarray, optional

Strata vector \(S\). If None, there is only one stratum. Default is None.

weights : (n,) ndarray, optional

Observation weights \(w\). Weights are normalized such that they sum to 1. Default is None, in which case they are set to np.full(n, 1/n).

tie_method : str, optional

The tie-breaking method that determines the scales \(\sigma\). It must be one of the following:

  • "efron"

  • "breslow"

Default is "efron".

dtype : Union[float32, float64], optional

The underlying data type. If None, it is inferred from status, in which case status must have an underlying data type of numpy.float32 or numpy.float64. Default is None.
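As a rough illustration of the weights and tie_method parameters above, the following NumPy sketch normalizes a weight vector and computes tie scales \(\sigma\). Both normalize_weights and tie_scales are hypothetical helpers, not adelie functions. The Breslow scales are taken to be all zeros, and the Efron scales follow the textbook unweighted rule \(j/d\) for the \(j\)-th of \(d\) tied events (\(j = 0, \ldots, d-1\)). Whether adelie orders ties this way, and how it treats zero-weight observations within a tie, is not stated on this page, so treat those details as assumptions. The non-negativity check on weights is likewise an assumption.

```python
import numpy as np


def normalize_weights(n, weights=None):
    """Return weights that sum to 1; uniform 1/n when weights is None."""
    if weights is None:
        return np.full(n, 1 / n)
    w = np.asarray(weights, dtype=float)
    if w.shape != (n,):
        raise ValueError("weights must have shape (n,)")
    # Assumption: weights are non-negative with a positive total.
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    return w / w.sum()


def tie_scales(stop, status, tie_method="efron"):
    """Textbook tie-correction scales (unweighted case, assumed ordering)."""
    stop = np.asarray(stop, dtype=float)
    status = np.asarray(status)
    sigma = np.zeros(len(stop))
    if tie_method == "breslow":
        return sigma  # no correction: the full risk-set sum is used
    if tie_method != "efron":
        raise ValueError('tie_method must be "efron" or "breslow"')
    for t in np.unique(stop[status == 1]):
        ties = np.flatnonzero((stop == t) & (status == 1))  # H(t)
        d = len(ties)
        sigma[ties] = np.arange(d) / d  # 0, 1/d, ..., (d-1)/d
    return sigma


w = normalize_weights(3, [1.0, 1.0, 2.0])
sigma = tie_scales(
    stop=[1.0, 1.0, 2.0, 3.0, 3.0, 3.0],
    status=[1, 1, 0, 1, 1, 0],
    tie_method="efron",
)
```

In this example the two events tied at time 1 and the two events tied at time 3 each receive scales 0 and 1/2, while censored observations keep a scale of 0, which is harmless because their terms are multiplied by \(\delta_i = 0\).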

Returns:
glm

Cox GLM object.
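A minimal usage sketch, based only on the signature documented above. The data are synthetic, and the import is guarded so that the input preparation still runs where adelie is not installed. All arrays are cast to numpy.float64 so that dtype can be inferred from status. Casting start and stop as well is a precaution rather than a documented requirement.

```python
import numpy as np

try:
    import adelie as ad  # optional: only needed for the final call
except ImportError:
    ad = None

rng = np.random.default_rng(0)
n = 8

# Synthetic right-censored data: everyone enters at time 0.
start = np.zeros(n, dtype=np.float64)
stop = rng.uniform(1.0, 5.0, size=n).astype(np.float64)
# status must be float32 or float64 when dtype is left as None.
status = rng.integers(0, 2, size=n).astype(np.float64)

glm = None
if ad is not None:
    glm = ad.glm.cox(
        start=start,
        stop=stop,
        status=status,
        tie_method="breslow",
    )
```

Passing dtype explicitly instead of relying on inference from status is the other option described in the dtype parameter.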