adelie.solver.css_cov#
- adelie.solver.css_cov(S: ndarray, subset_size: int | None = None, *, subset: ndarray | None = None, method: str = 'swapping', loss: str = 'least_squares', max_iters: int = 1000, n_threads: int = 1)[source]#
Solves column subset selection via covariance method.
Column subset selection via covariance method solves
\[\begin{align*} \mathrm{minimize}_{T \subseteq [p] : |T|=k} &\quad \ell\left( \Sigma, T \right) \end{align*}\]for a variety of loss functions \(\ell\) where \(\Sigma \in \mathbb{R}^{p \times p}\) is a positive semi-definite matrix and \(T\) is an index set of size \(k\).
The least squares loss is given by
\[\begin{align*} \ell(\Sigma, T) = \mathrm{Tr}\left( \Sigma - \Sigma_{\cdot T} \Sigma_{T,T}^\dagger \Sigma_{T \cdot} \right) \end{align*}\]The subset factor loss is given by
\[\begin{align*} \ell(\Sigma, T) = \log|\Sigma_T| + \log(|\mathrm{diag}( \Sigma_{-T,-T} - \Sigma_{-T,T} \Sigma_{T,T}^\dagger \Sigma_{T,-T} )|) \end{align*}\]The minimum determinant loss is given by
\[\begin{align*} \ell(\Sigma, T) = \left|\Sigma_T\right| \end{align*}\]The greedy method begins with the empty set for \(T\) and adds at every iteration the column that most minimizes the loss until \(T\) is of size \(k\). The swapping method begins with an initial set \(T\) of size \(k\) and iteratively attempts to swap out each entry of \(T\) for the column that most improves the loss (strictly).
Note
The greedy method is generally significantly faster than the swapping method. However, the swapping method yields a much more accurate solution to the CSS problem. We recommend using the greedy method only if the swapping method is too time-consuming or an accurate solution is not necessary.
- Parameters:
- S(p, p) ndarray
Positive semi-definite matrix \(\Sigma\).
- subset_sizeint, optional
Subset size \(k\). It must satisfy the following conditions for each method type:
"greedy"
: must be an integer."swapping"
: must satisfy the conditions for"greedy"
ifsubset
isNone
. Otherwise, it is ignored.
Default is
None
.- subsetndarray, optional
Initial subset \(T\). This argument is only used by the swapping method. If
None
, the greedy method is used to first initialize a subset of sizesubset_size
. Default isNone
.- methodstr, optional
Search method to identify the optimal \(T\). It must be one of the following:
"greedy"
: greedy method."swapping"
: swapping method.
Default is
"swapping"
.- lossstr, optional
Loss type. It must be one of the following:
"least_squares"
: least squares loss."subset_factor"
: subset factor loss."min_det"
: minimum determinant loss.
Default is
"least_squares"
.- max_itersint, optional
Maximum number of cycles. Default is
int(1e3)
.- n_threadsint, optional
Number of threads. Default is
1
.
- Returns:
- state
The resulting state after running the solver.