{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# `scikit-learn`__-style API__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, we give an overview of our user-friendly `scikit-learn`-style API.\n", "Existing data analysis pipelines often assume the `scikit-learn` API for their models,\n", "so it is useful to have an `adelie` solver that obeys this API for seamless integration.\n", "Our `scikit-learn`-style solver is a simple wrapper of the more generic tools provided in `adelie`.\n", "Consequently, it is slightly less flexible (e.g., one cannot supply a user-specified GLM).\n", "However, it provides many commonly used functionalities,\n", "so `adelie` can serve as a quick drop-in replacement for many existing solvers in `scikit-learn`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "from sklearn.metrics import r2_score\n", "from sklearn.datasets import (\n", "    load_breast_cancer,\n", "    load_diabetes,\n", "    load_digits,\n", ")\n", "from sklearn.preprocessing import OneHotEncoder\n", "import adelie as ad\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import scipy.stats as st\n", "\n", "np.random.seed(42)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## __Regression__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We load the `diabetes` dataset as a regression example and perform a train-test split.\n", "Note that we convert $X$ to Fortran-order (column-major) storage\n", "since it is more efficient for our solver." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((353, 10), (353,), (89, 10), (89,))" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = load_diabetes()\n", "X, y = data.data, data.target\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n", "X_train = np.asfortranarray(X_train)\n", "X_test = np.asfortranarray(X_test)\n", "X_train.shape, y_train.shape, X_test.shape, y_test.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We check that the train and test responses are similar in distribution." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "f, ax = plt.subplots(1, 1, figsize=(6, 6))\n", "ax.hist(y_train, bins=50, alpha=0.5, label='train')\n", "ax.hist(y_test, bins=50, alpha=0.5, label='test')\n", "ax.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### __Lasso__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first instantiate our `scikit-learn`-style solver class `GroupElasticNet`.\n", "We will use the default setting, which solves the lasso problem with Gaussian loss.\n", "As usual, we call `fit()` with our training data `X_train` and `y_train` to solve \n", "the lasso along a path of regularization values $\\lambda$." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:00<00:00:00, 2392.51it/s] [dev:52.5%]\n" ] }, { "data": { "text/html": [ "
GroupElasticNet()
" ], "text/plain": [ "GroupElasticNet()" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_gaussian = ad.GroupElasticNet()\n", "model_gaussian.fit(X_train, y_train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We call `predict()` to compute the linear predictions for each $\\lambda$ (each row)." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(100, 89)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yhatmat = model_gaussian.predict(X_test) \n", "yhatmat.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we compute the out-of-sample $R^2$ for each $\\lambda$ value." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "r2vec = np.apply_along_axis(lambda yhat: r2_score(y_test, yhat), axis=1, arr=yhatmat)\n", "lam_path = model_gaussian.lambda_\n", "plt.plot(-np.log(lam_path), r2vec, linestyle=\"None\", marker=\".\")\n", "plt.axhline(r2vec.max(), color=\"red\", linestyle=\"--\", label=f\"max $R^2$ = {r2vec.max():.2f}\")\n", "plt.title(r\"Out-of-sample $R^2$ over $-\\log(\\lambda)$\")\n", "plt.xlabel(r\"$-\\log(\\lambda$)\")\n", "plt.ylabel(r\"$R^2$\")\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### __K-Fold Cross-Validation__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "More often, the user wishes to perform model selection via cross-validation before deploying a model.\n", "If the model class is initalized with the `\"cv_grpnet\"` solver, \n", "we select the best $\\lambda$ by cross-validation and return the model at that $\\lambda$." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 101/101 [00:00:00<00:00:00, 21478.51it/s] [dev:51.8%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 101/101 [00:00:00<00:00:00, 31226.25it/s] [dev:50.8%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 101/101 [00:00:00<00:00:00, 39503.28it/s] [dev:57.5%]\n", 
"100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:00<00:00:00, 39044.08it/s] [dev:52.6%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:00<00:00:00, 33957.77it/s] [dev:53.0%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:00<00:00:00, 43455.42it/s] [dev:52.6%]\n" ] }, { "data": { "text/html": [ "
GroupElasticNet(solver='cv_grpnet')
" ], "text/plain": [ "GroupElasticNet(solver='cv_grpnet')" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_cv_gaussian = ad.GroupElasticNet(solver=\"cv_grpnet\")\n", "model_cv_gaussian.fit(X_train, y_train, min_ratio=1e-3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the `cv_grpnet` backend, calling predict automatically uses the model with the best $\\lambda$." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(89,)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_cv_gaussian.predict(X_test).shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The class also has a score method that computes $R^2$. " ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.45673736460682524" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_cv_gaussian.score(X_test, y_test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## __Binary Classification__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### __Logistic Regression__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We load the `breast_cancer` dataset as a binary classification example and perform a train-test split." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((455, 30), (455,), (114, 30), (114,))" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = load_breast_cancer()\n", "X, y = data.data, data.target\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n", "X_train = np.asfortranarray(X_train)\n", "X_test = np.asfortranarray(X_test)\n", "X_train.shape, y_train.shape, X_test.shape, y_test.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we have a binary classification task, we specify our solver to use the `\"binomial\"` family for our response.\n", "We fit using our cross-validation solver as before." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 102/102 [00:00:00<00:00:00, 33514.96it/s] [dev:61.6%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:00<00:00:00, 36201.82it/s] [dev:61.8%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 101/101 [00:00:00<00:00:00, 42699.36it/s] [dev:64.6%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 101/101 
[00:00:00<00:00:00, 42445.89it/s] [dev:64.7%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:00<00:00:00, 42242.36it/s] [dev:64.2%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:00<00:00:00, 41538.31it/s] [dev:63.3%]\n" ] }, { "data": { "text/html": [ "
GroupElasticNet(family='binomial', solver='cv_grpnet')
" ], "text/plain": [ "GroupElasticNet(family='binomial', solver='cv_grpnet')" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_cv_binomial = ad.GroupElasticNet(solver=\"cv_grpnet\", family=\"binomial\")\n", "model_cv_binomial.fit(\n", " X_train, \n", " y_train.astype(np.float64),\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the `\"binomial\"` family, the predict method outputs\n", "the class label predictions based on the probability estimates.\n", "Once we have our class predictions, we can output a contingency table.\n", "Here, we see that we never mispredict when the test $y_i = 1$, but have 16 mispredictions when $y_i = 0$." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[28, 16],\n", " [ 0, 70]])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yhat = model_cv_binomial.predict(X_test)\n", "st.contingency.crosstab(y_test, yhat).count" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can compute the probability estimates at the test points and plot the frequency." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.hist(model_cv_binomial.predict_proba(X_test)[:, 1], alpha = 0.5)\n", "plt.xlabel(\"Probability of label 1\")\n", "plt.ylabel(\"Frequency\")\n", "plt.title(\"Probability Frequency\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## __Multi-Class Classification__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### __Multinomial Regression__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We load the `digits` dataset as a multi-class classification example and perform a train-test split." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((1437, 64), (1437,), (1437, 10), (360, 64), (360,))" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = load_digits()\n", "X, y = data.data, data.target\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n", "X_train = np.asfortranarray(X_train)\n", "X_test = np.asfortranarray(X_test)\n", "# one-hot-encode labels\n", "oh = OneHotEncoder(sparse_output=False)\n", "y_train2 = oh.fit_transform(y_train[:, np.newaxis])\n", "X_train.shape, y_train.shape, y_train2.shape, X_test.shape, y_test.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we have a multi-class classification task, \n", "we specify our solver to use the `\"multinomial\"` family for our response.\n", "We fit using our cross-validation solver as before." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:01<00:00:00, 75.73it/s] [dev:84.6%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 101/101 [00:00:01<00:00:00, 79.43it/s] [dev:84.9%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:01<00:00:00, 72.49it/s] [dev:84.8%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 101/101 [00:00:01<00:00:00, 75.69it/s] [dev:85.0%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 101/101 [00:00:01<00:00:00, 77.03it/s] [dev:85.1%]\n", "100%|\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m\u001b[1;32m█\u001b[0m| 100/100 [00:00:01<00:00:00, 75.64it/s] [dev:84.7%]\n" ] }, { "data": { "text/html": [ "
GroupElasticNet(family='multinomial', solver='cv_grpnet')
" ], "text/plain": [ "GroupElasticNet(family='multinomial', solver='cv_grpnet')" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_cv_multinomial = ad.GroupElasticNet(solver=\"cv_grpnet\", family=\"multinomial\")\n", "model_cv_multinomial.fit(\n", " X_train, \n", " y_train2.astype(np.float64),\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the `\"multinomial\"` family, the predict method outputs\n", "the class label predictions based on the probability estimates just like the `\"binomial\"` family.\n", "Once we have our class predictions, we can output a contingency table." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[29, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n", " [ 0, 32, 0, 0, 0, 0, 0, 0, 2, 2],\n", " [ 0, 0, 41, 1, 0, 0, 0, 0, 0, 1],\n", " [ 0, 1, 0, 34, 0, 0, 0, 0, 1, 0],\n", " [ 0, 1, 0, 0, 37, 0, 0, 1, 0, 1],\n", " [ 0, 0, 0, 0, 0, 37, 0, 0, 0, 3],\n", " [ 0, 1, 0, 0, 0, 0, 37, 0, 0, 0],\n", " [ 0, 0, 0, 0, 0, 1, 0, 31, 0, 0],\n", " [ 0, 3, 0, 1, 0, 2, 0, 2, 23, 0],\n", " [ 0, 0, 0, 0, 0, 0, 0, 1, 0, 34]])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yhat = model_cv_multinomial.predict(X_test)\n", "st.contingency.crosstab(y_test, yhat).count" ] } ], "metadata": { "kernelspec": { "display_name": "adelie", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.10" } }, "nbformat": 4, "nbformat_minor": 2 }