adelie.matrix.interaction

adelie.matrix.interaction(mat: ndarray, intr_map: dict, levels: ndarray | None = None, *, copy: bool = False, n_threads: int = 1)

Creates a matrix with pairwise interactions.

This matrix \(X \in \mathbb{R}^{n\times p}\) represents pairwise interaction terms within a given base matrix \(Z \in \mathbb{R}^{n\times d}\), where the interaction structure is defined as follows. In general, the columns of \(Z\) are a mix of continuous and discrete features. Let \(L : \{1,\ldots, d\} \to \mathbb{N}\) map each feature index of \(Z\) to its number of levels: \(L(i) = 0\) means feature \(i\) is continuous, and \(L(i) > 0\) means it is discrete with \(L(i)\) levels (or categories). Let \(S \subseteq \{1,\ldots, d\}^2\) denote the set of valid and unique pairs of feature indices of \(Z\). A pair \((i, j)\) is valid if \(i \neq j\). Uniqueness is defined up to ordering, so \((i,j)\) and \((j,i)\) are considered the same pair. Finally, for each pair \((i, j) \in S\), define the interaction term \(Z_{i:j}\) as

\[\begin{split}\begin{align*} Z_{i:j} &:= \begin{cases} \begin{bmatrix} Z_{i} & Z_{j} & Z_i \odot Z_j \end{bmatrix} ,& L(i) = 0, L(j) = 0 \\ \begin{bmatrix} \mathbf{1} & Z_{i} \end{bmatrix} \star I_{Z_{j}} ,& L(i) = 0, L(j) > 0 \\ I_{Z_{i}} \star \begin{bmatrix} \mathbf{1} & Z_{j} \end{bmatrix} ,& L(i) > 0, L(j) = 0 \\ I_{Z_{i}} \star I_{Z_{j}} ,& L(i) > 0, L(j) > 0 \end{cases} \end{align*}\end{split}\]

Here, \(Z_i\) is the \(i\)-th column of \(Z\), \(I_{v}\) is the indicator matrix (one-hot encoding) of \(v\), and for any two matrices \(A \in \mathbb{R}^{n\times d_A}\), \(B \in \mathbb{R}^{n\times d_B}\),

\[\begin{align*} A \star B &= \begin{bmatrix} A_{1} \odot B_{1} & \cdots & A_{d_A} \odot B_{1} & A_{1} \odot B_{2} & \cdots & A_{d_A} \odot B_{2} & \cdots \end{bmatrix} \end{align*}\]
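The column ordering of \(A \star B\) is easy to get backwards, so a small NumPy sketch may help. It is an illustration of the definition only, not the library's implementation; the helper names `star` and `one_hot` are hypothetical.

```python
import numpy as np

def star(A, B):
    """Dense A * B (star product): columns of A vary fastest, columns of B slowest."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    n = A.shape[0]
    # prod[r, b, a] = A[r, a] * B[r, b]; flattening (b, a) keeps a as the inner index.
    prod = A[:, None, :] * B[:, :, None]
    return prod.reshape(n, -1)

def one_hot(v, n_levels):
    """Indicator matrix I_v for a vector v with values in {0, ..., n_levels - 1}."""
    v = np.asarray(v).astype(int)
    return (v[:, None] == np.arange(n_levels)[None, :]).astype(float)

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[10.0, 100.0], [1000.0, 10000.0]])
AB = star(A, B)        # columns: A_1*B_1, A_2*B_1, A_1*B_2, A_2*B_2
I_v = one_hot([0, 2, 1], 3)
```

In the first row of `AB`, the two columns of `A` are first scaled by `B[0, 0]` and then by `B[0, 1]`, matching the ordering in the formula above.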

Then, \(X\) is defined as the column-wise concatenation of \(Z_{i:j}\) in lexicographical order of \((i,j) \in S\).
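As a cross-check of the definition, the following NumPy sketch builds a dense version of \(X\) from \(Z\), a list of pairs, and the level counts. It is a reference construction under stated assumptions, not how the library stores or computes the matrix: indices are 0-based (the formulas use 1-based indices), the pairs are taken as already valid and unique, and the names `dense_interaction` and `interaction_block` are hypothetical.

```python
import numpy as np

def _one_hot(v, n_levels):
    return (np.asarray(v).astype(int)[:, None] == np.arange(n_levels)).astype(float)

def _star(A, B):
    # Columns of A vary fastest, columns of B slowest.
    return (A[:, None, :] * B[:, :, None]).reshape(A.shape[0], -1)

def interaction_block(Z, i, j, levels):
    """Dense Z_{i:j} for one pair (i, j), following the four cases of the definition."""
    zi, zj = Z[:, i], Z[:, j]
    ones = np.ones_like(zi)
    if levels[i] <= 0 and levels[j] <= 0:      # continuous-continuous
        return np.column_stack([zi, zj, zi * zj])
    if levels[i] <= 0:                         # continuous-discrete
        return _star(np.column_stack([ones, zi]), _one_hot(zj, levels[j]))
    if levels[j] <= 0:                         # discrete-continuous
        return _star(_one_hot(zi, levels[i]), np.column_stack([ones, zj]))
    return _star(_one_hot(zi, levels[i]), _one_hot(zj, levels[j]))  # discrete-discrete

def dense_interaction(Z, pairs, levels):
    """Concatenate the blocks column-wise in lexicographical order of the pairs."""
    Z = np.asarray(Z, dtype=float)
    return np.hstack([interaction_block(Z, i, j, levels) for i, j in sorted(pairs)])

# Column 1 is discrete with 2 levels; columns 0 and 2 are continuous.
Z = np.array([[0.5, 1.0, 2.0],
              [1.5, 0.0, 3.0]])
levels = [0, 2, 0]
X = dense_interaction(Z, [(0, 1), (0, 2)], levels)
```

Here the pair `(0, 1)` contributes \(2 \times 2 = 4\) columns and the pair `(0, 2)` contributes 3, so `X` has 7 columns.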

Note

Every discrete feature of Z must take on values in the set \(\{0, \ldots, \ell-1\}\) where \(\ell\) is the number of levels for that feature.
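A quick pre-check of this requirement can be sketched in NumPy as below. The helper `invalid_discrete_columns` is hypothetical and not part of the library; it only reports which discrete columns contain values outside \(\{0, \ldots, \ell-1\}\).

```python
import numpy as np

def invalid_discrete_columns(Z, levels):
    """Return 0-based indices of discrete columns whose values are not in {0, ..., l-1}."""
    Z = np.asarray(Z, dtype=float)
    bad = []
    for j, n_levels in enumerate(levels):
        if n_levels <= 0:
            continue  # continuous column: nothing to check
        col = Z[:, j]
        is_integer = np.all(col == np.round(col))
        in_range = np.all((col >= 0) & (col <= n_levels - 1))
        if not (is_integer and in_range):
            bad.append(j)
    return bad

Z = np.array([[0.3, 0.0, 2.0],
              [1.2, 1.0, 3.0]])
bad = invalid_discrete_columns(Z, levels=[0, 2, 3])  # column 2 contains 3, but only 0..2 are allowed
```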

Note

This matrix only works with the naive method.

Parameters:

mat : (n, d) ndarray
    The base matrix \(Z\) from which to construct interaction terms.
intr_map : dict
    Dictionary mapping a column index of mat to a list of (column) indices to pair with. If the value of a key-value pair is None, then every column is paired with the key. Internally, only valid and unique (as defined above) pairs are registered to construct \(S\). Moreover, the pairs are stored in lexicographical order of (key, val) for each val in intr_map[key] and for each key.
levels : (d,) ndarray, optional
    Number of levels for each column in mat. A non-positive value indicates that the column is a continuous variable whereas a positive value indicates that it is a discrete variable with that many levels (or categories). If None, it is initialized to be np.zeros(d) so that every column is a continuous variable. Default is None.
copy : bool, optional
    If True, a copy of mat is stored internally. Otherwise, a reference is stored instead. Default is False.
n_threads : int, optional
    Number of threads. Default is 1.
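To make the role of intr_map concrete, the sketch below expands a dictionary into a list of valid and unique pairs. It is one plausible reading of the description above, not the library's code: the helper `expand_pairs` is hypothetical, indices are 0-based, and keeping the first orientation encountered for a duplicated pair is an assumption.

```python
def expand_pairs(intr_map, d):
    """Expand intr_map into valid, unique (key, val) pairs in lexicographical order."""
    seen = set()
    pairs = []
    for key in sorted(intr_map):
        vals = intr_map[key]
        if vals is None:
            vals = range(d)          # None pairs the key with every column
        for val in sorted(set(vals)):
            if key == val:
                continue             # invalid: a feature is not paired with itself
            unordered = frozenset((key, val))
            if unordered in seen:
                continue             # (i, j) and (j, i) count as the same pair
            seen.add(unordered)
            pairs.append((key, val))
    return pairs

# Key 0 pairs with every column; key 2 lists 0, 1, and itself.
pairs = expand_pairs({0: None, 2: [0, 1, 2]}, d=3)
```

In this example `(0, 0)` and `(2, 2)` are dropped as invalid and `(2, 0)` is dropped as a duplicate of `(0, 2)`, leaving `[(0, 1), (0, 2), (2, 1)]`.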

Returns:

wrap: Wrapper matrix object.