Matrix#

In group elastic net problems, the matrix object plays a crucial role in the performance of the solver. Our optimization algorithm (and our benchmark analysis) makes it apparent that most of the runtime lies in interacting with the matrix object, e.g. computing inner-products. Hence, a highly performant matrix object implementing the select set of methods that the solver requires will yield tremendous speed gains overall. In addition, we have found numerous examples where a matrix class admits special structure that can be exploited for further speed and memory gains. One simple example is a large sparse matrix, which may not fit in memory as a dense matrix. Another example is genomics datasets, which are not only sparse but also take on only 3 possible integer values (see Examples), and can contain over 160 billion entries with 30% of them non-zero.

For these reasons, we found it fruitful to abstract out the matrix class. adelie provides a few matrix classes in the ad.matrix submodule. We discuss below some ways a user may interact with these classes, as well as how to define a class of their own to plug into our solver.

[19]:
import adelie as ad
import numpy as np

Naive Matrix Class#

The naive matrix class refers to matrix classes that abstract out the feature matrix \(X\). The simplest example is a dense matrix. Let us first construct a dense matrix and wrap it using ad.matrix.dense.

[20]:
n = 100
p = 1000
seed = 0

np.random.seed(seed)
X = np.random.normal(0, 1, (n, p))
X_wrap = ad.matrix.dense(X, method="naive")

X_wrap can be thought of as a simple wrapper around X, exposing only the few methods that our solver requires. For example, .bmul() is a method that computes X[:, i:i+q].T @ (w * v). The canonical application of .bmul() is in computing the correlation between a block of the feature matrix and the residual under observation weights w. It is worth mentioning that adelie also relies on such member functions when computing diagnostic quantities in ad.diagnostic.

As an example, we generate data below and show the equivalence of calling .bmul() and the equivalent numpy code.

[21]:
i = 10      # starting column index
q = 3       # column block size

w = np.random.uniform(0, 1, n)
v = np.random.normal(0, 1, n)
out = np.empty(q)
X_wrap.bmul(i, q, v, w, out)
assert np.allclose(
    out,
    X[:, i:i+q].T @ (w * v),
)

For the full set of methods, we refer the readers to MatrixNaiveBase64.
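
Another member function worth highlighting is .mul(), which computes the full product X.T @ (w * v) across all columns; it is also one of the methods the custom class later in this notebook must implement. The cell below is an illustrative addition (not part of the original walkthrough) that checks its behavior, reusing v and w from above and assuming the same calling convention as .bmul():

[ ]:
out_full = np.empty(p)
X_wrap.mul(v, w, out_full)   # out_full = X.T @ (w * v)
assert np.allclose(
    out_full,
    X.T @ (w * v),
)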

Covariance Matrix Class#

The covariance matrix class refers to matrix classes that abstract out the covariance matrix \(A = X^\top X\). This matrix is currently only used in the context of the ad.gaussian_cov solver. Nonetheless, like the naive matrix class, it exposes its own set of member functions that the covariance method solver requires. For example, it also exposes .bmul(), but this version computes a different quantity: A[:, subset].T @ v, where subset is a subset of column indices and v is a sparse vector specified by its non-zero indices and values.

We take the same data as above and show the equivalence of calling .bmul() and the equivalent numpy code.

[22]:
A = X.T @ X
A_wrap = ad.matrix.dense(A, method="cov")

i, j = 2, 5     # starting (i, j) position of the block of A
p, q = 3, 7     # number of rows/cols of the block

subset = np.arange(j, j + q)
indices = np.arange(i, i + p)
values = np.random.normal(0, 1, p)
out = np.empty(q)
A_wrap.bmul(subset, indices, values, out)
assert np.allclose(
    out,
    A[i:i+p, j:j+q].T @ values,
)

For the full set of methods, we refer the readers to MatrixCovBase64.

Custom Matrix Class#

One of the defining features of adelie is the flexibility for the user to specify her own matrix classes. The custom matrix class can be implemented in either C++ or Python! In this section, we demonstrate this feature by defining, in Python, a custom naive matrix class equivalent to ad.matrix.dense. All of the discussion carries through for the covariance matrix class, and any important differences will be mentioned in passing.

We first show the full code for our custom matrix class.

[23]:
class Dense(ad.matrix.MatrixNaiveBase64):
    def __init__(self, mat):
        self.mat = mat
        # MUST call base class __init__!
        ad.matrix.MatrixNaiveBase64.__init__(self)
    def bmul(self, j, q, v, w, out):
        # out = X[:, j:j+q].T @ (w * v)
        out[...] = self.mat[:, j:j+q].T @ (w * v)
    def btmul(self, j, q, v, out):
        # out += X[:, j:j+q] @ v
        out[...] += self.mat[:, j:j+q] @ v
    def cmul(self, j, v, w):
        # returns X[:, j] @ (w * v)
        return self.mat[:, j] @ (w * v)
    def ctmul(self, j, v, out):
        # out += X[:, j] * v
        out[...] += self.mat[:, j] * v
    def rows(self):
        return self.mat.shape[0]
    def cols(self):
        return self.mat.shape[1]
    def cov(self, j, q, sqrt_weights, out, buffer):
        # out = X[:, j:j+q].T @ diag(weights) @ X[:, j:j+q]
        # (note: this rebinds the local name rather than filling the
        # preallocated buffer; assign with buffer[...] = ... to reuse its memory)
        buffer = sqrt_weights[:, None] * self.mat[:, j:j+q]
        out[...] = buffer.T @ buffer
    def mul(self, v, w, out):
        # out = X.T @ (w * v)
        out[...] = self.mat.T @ (w * v)
    def sp_btmul(self, v, out):
        # out = v @ X.T for a sparse matrix v
        # (note: X.T, not X, so that the shapes conform)
        out[...] = v @ self.mat.T
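
As a quick sanity check (an illustrative addition, not required by the solver), we can verify that our class agrees with the wrapped dense matrix on .bmul(), reusing v and w from the naive example above:

[ ]:
X_custom = Dense(X)

j0, q0 = 10, 3                  # block chosen to mirror the earlier example
out_custom = np.empty(q0)
X_custom.bmul(j0, q0, v, w, out_custom)
assert np.allclose(
    out_custom,
    X[:, j0:j0+q0].T @ (w * v),
)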

We remark on a few important points:

  • The custom (naive) matrix class must inherit from the provided base class ad.matrix.MatrixNaiveBase64. Users interested in 32-bit floats should inherit from ad.matrix.MatrixNaiveBase32 instead; however, beware of the numerical instability of 32-bit floats! You may observe strange behaviors in the optimizer, so use with caution! Users implementing a covariance matrix class must inherit from ad.matrix.MatrixCovBase64 or ad.matrix.MatrixCovBase32.

  • The base class constructor must be called exactly as shown above, without the use of super()! This is a quirk of pybind11, the package we rely on for exporting C++ classes to Python.

  • Many of the member functions are given an output container in which to store the result of the computation. Hence, we use the syntax out[...] = expression to modify the container in-place. A common pitfall is to write out = expression, which only rebinds the local variable out to a different object in memory and leaves the caller's container untouched (see the demonstration below). This low-level interface is done for memory efficiency.
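
The following standalone numpy snippet (independent of adelie) demonstrates the difference between in-place assignment and rebinding:

[ ]:
def fill_inplace(out):
    out[...] = np.arange(3.0)   # writes into the caller's array

def rebind(out):
    out = np.arange(3.0)        # only rebinds the local name; the caller sees no change

a = np.zeros(3)
fill_inplace(a)
assert np.allclose(a, [0.0, 1.0, 2.0])

b = np.zeros(3)
rebind(b)
assert np.allclose(b, 0.0)      # b is unchanged!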

We now show that the matrix can be passed into our existing solver ad.grpnet with no further changes to the code. We first generate a response vector and call ad.grpnet.

[24]:
y = X[:, -1] * np.random.normal(0, 1) + np.random.normal(0, 1, n)
[25]:
state = ad.grpnet(
    X=Dense(X),
    glm=ad.glm.gaussian(y=y),
)
 52%|     | 52/100 [00:00:00<00:00:00, 535.72it/s] [dev:90.2%]

We can verify the solutions by comparing against those obtained from passing in the dense matrix directly, which will internally get wrapped using ad.matrix.dense.

[26]:
state_exp = ad.grpnet(
    X=X,
    glm=ad.glm.gaussian(y=y),
)
 52%|     | 52/100 [00:00:00<00:00:00, 13808.06it/s] [dev:90.2%]
[27]:
assert np.allclose(
    state.betas.toarray(),
    state_exp.betas.toarray(),
)

The convenience of defining a matrix class in Python comes at a performance cost: every call to a member function must dispatch back to the Python interpreter. However, if the matrix is sufficiently large that the member functions themselves dominate the runtime, this dispatch cost becomes negligible. For the best performance, users may be interested in porting the code to C++ and exporting the bindings. We refer the readers to the matrix directory in our C++ backend for numerous examples of how to extend the base classes in C++. The associated binding code can be found in matrix.cpp.
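
To get a rough sense of this overhead on your own machine, one could time the two solver calls; the sketch below is an illustrative addition (timings vary by machine) that reuses Dense, X, and y from above:

[ ]:
import time

for mat in [Dense(X), X]:
    start = time.perf_counter()
    ad.grpnet(X=mat, glm=ad.glm.gaussian(y=y))
    elapsed = time.perf_counter() - start
    print(f"{type(mat).__name__}: {elapsed:.3f}s")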