{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Deep QEPs and DSPPs w/ Multiple Outputs\n", "\n", "## Introduction\n", "\n", "In this example, we will demonstrate how to construct deep QEPs that can model vector-valued functions (e.g. multitask/multi-output QEPs).\n", "\n", "This tutorial can also be used to construct multitask [deep sigma point processes](./Deep_Sigma_Point_Processes.ipynb) by replacing `DeepQEPLayer`/`DeepQEP`/`DeepApproximateMLL` with `DSPPLayer`/`DSPP`/`DeepPredictiveLogLikelihood`.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import os\n", "import torch\n", "import tqdm\n", "import math\n", "import qpytorch\n", "from torch.nn import Linear\n", "from qpytorch.means import ConstantMean, LinearMean\n", "from qpytorch.kernels import MaternKernel, ScaleKernel\n", "from qpytorch.variational import VariationalStrategy, CholeskyVariationalDistribution, \\\n", "    LMCVariationalStrategy\n", "from qpytorch.distributions import MultivariateQExponential\n", "from qpytorch.models.deep_qeps import DeepQEPLayer, DeepQEP\n", "from qpytorch.mlls import DeepApproximateMLL, VariationalELBO\n", "from qpytorch.likelihoods import MultitaskQExponentialLikelihood\n", "from matplotlib import pyplot as plt\n", "\n", "smoke_test = ('CI' in os.environ)\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up training data\n", "\n", "In the next cell, we set up the training data for this example. We use 100 regularly spaced points on [0, 1], evaluate each function at those points, and add Gaussian noise to get the training labels.\n", "\n", "We'll have four functions, all of which are sinusoids. Our `train_y` will have two dimensions, with the second dimension corresponding to the different tasks." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "train_x = torch.linspace(0, 1, 100)\n", "\n", "train_y = torch.stack([\n", "    torch.sin(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * 0.2,\n", "    torch.cos(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * 0.2,\n", "    torch.sin(train_x * (2 * math.pi)) + 2 * torch.cos(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * 0.2,\n", "    -torch.cos(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * 0.2,\n", "], -1)\n", "\n", "train_x = train_x.unsqueeze(-1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Structure of a multitask deep QEP\n", "\n", "The layers of a multitask deep QEP will look identical to the layers of a [single-output deep QEP](./Deep_QExponential_Processes.ipynb)." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Here's a simple standard layer\n", "POWER = 1.0\n", "class DQEPHiddenLayer(DeepQEPLayer):\n", "    def __init__(self, input_dims, output_dims, num_inducing=128, linear_mean=True):\n", "        self.power = torch.tensor(POWER)\n", "        inducing_points = torch.randn(output_dims, num_inducing, input_dims)\n", "        batch_shape = torch.Size([output_dims])\n", "\n", "        variational_distribution = CholeskyVariationalDistribution(\n", "            num_inducing_points=num_inducing,\n", "            batch_shape=batch_shape,\n", "            power=self.power\n", "        )\n", "        variational_strategy = VariationalStrategy(\n", "            self,\n", "            inducing_points,\n", "            variational_distribution,\n", "            learn_inducing_locations=True\n", "        )\n", "\n", "        super().__init__(variational_strategy, input_dims, output_dims)\n", "        # Use a linear mean when `linear_mean` is True, otherwise a constant mean\n", "        self.mean_module = LinearMean(input_dims) if linear_mean else ConstantMean()\n", "        self.covar_module = ScaleKernel(\n", "            MaternKernel(nu=2.5, batch_shape=batch_shape, ard_num_dims=input_dims),\n", "            batch_shape=batch_shape, ard_num_dims=None\n", "        )\n", "\n", "    def forward(self, x):\n", "        mean_x = self.mean_module(x)\n",
"        covar_x = self.covar_module(x)\n", "        return MultivariateQExponential(mean_x, covar_x, power=self.power)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The main body of the deep QEP will look very similar to the single-output deep QEP, with a few changes.\n", "\n", "**Most importantly** - the last layer will have `output_dims=num_tasks`, rather than `output_dims=None`. As a result, the output of the model will be a `MultitaskMultivariateQExponential` rather than a standard `MultivariateQExponential` distribution.\n", "\n", "There are two other small changes, which are noted in the comments." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "num_tasks = train_y.size(-1)\n", "num_hidden_dgp_dims = 3\n", "\n", "\n", "class MultitaskDeepQEP(DeepQEP):\n", "    def __init__(self, train_x_shape):\n", "        hidden_layer = DQEPHiddenLayer(\n", "            input_dims=train_x_shape[-1],\n", "            output_dims=num_hidden_dgp_dims,\n", "            linear_mean=True\n", "        )\n", "        last_layer = DQEPHiddenLayer(\n", "            input_dims=hidden_layer.output_dims,\n", "            output_dims=num_tasks,\n", "            linear_mean=False\n", "        )\n", "\n", "        super().__init__()\n", "\n", "        self.hidden_layer = hidden_layer\n", "        self.last_layer = last_layer\n", "\n", "        # We're going to use a multitask likelihood instead of the standard QExponentialLikelihood\n", "        self.likelihood = MultitaskQExponentialLikelihood(num_tasks=num_tasks)\n", "\n", "    def forward(self, inputs):\n", "        hidden_rep1 = self.hidden_layer(inputs)\n", "        output = self.last_layer(hidden_rep1)\n", "        return output\n", "\n", "    def predict(self, test_x):\n", "        with torch.no_grad():\n", "\n", "            # The output of the model is a multitask QEP, where both the data points\n", "            # and the tasks are jointly distributed\n", "            # To compute the marginal predictive NLL of each data point,\n", "            # we will call `to_data_uncorrelated_dist`,\n", "            # which removes the data cross-covariance terms from the distribution.\n",
"            preds = self.likelihood(self(test_x)).to_data_uncorrelated_dist()\n", "\n", "        return preds.mean.mean(0), preds.variance.mean(0)\n", "\n", "\n", "model = MultitaskDeepQEP(train_x.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Training and making predictions\n", "\n", "This code should look similar to the single-output deep QEP training code." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f162dd03ba434cc2acc57f26ef106503", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Epoch: 0%| | 0/200 [00:00" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Train the model with a standard variational training loop.\n", "# Assumption: the Adam optimizer and the learning rate below are illustrative choices.\n", "model.train()\n", "optimizer = torch.optim.Adam(model.parameters(), lr=0.01)\n", "mll = DeepApproximateMLL(VariationalELBO(model.likelihood, model, num_data=train_y.size(0)))\n", "\n", "num_epochs = 1 if smoke_test else 200\n", "epochs_iter = tqdm.tqdm(range(num_epochs), desc='Epoch')\n", "for i in epochs_iter:\n", "    optimizer.zero_grad()\n", "    output = model(train_x)\n", "    loss = -mll(output, train_y)\n", "    epochs_iter.set_postfix(loss=loss.item())\n", "    loss.backward()\n", "    optimizer.step()\n", "\n", "# Make predictions\n", "model.eval()\n", "with torch.no_grad(), qpytorch.settings.fast_pred_var():\n", "    test_x = torch.linspace(0, 1, 51).unsqueeze(-1)\n", "    mean, var = model.predict(test_x)\n", "    lower = mean - 2 * var.sqrt()\n", "    upper = mean + 2 * var.sqrt()\n", "\n", "# Plot results\n", "fig, axs = plt.subplots(1, num_tasks, figsize=(4 * num_tasks, 3))\n", "for task, ax in enumerate(axs):\n", "    ax.plot(train_x.squeeze(-1).detach().numpy(), train_y[:, task].detach().numpy(), 'k*')\n", "    ax.plot(test_x.squeeze(-1).numpy(), mean[:, task].numpy(), 'b')\n", "    ax.fill_between(test_x.squeeze(-1).numpy(), lower[:, task].numpy(), upper[:, task].numpy(), alpha=0.5)\n", "    ax.set_ylim([-3, 3])\n", "    ax.legend(['Observed Data', 'Mean', 'Confidence'])\n", "    ax.set_title(f'Task {task + 1}')\n", "fig.tight_layout()\n", "None" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.12" } }, "nbformat": 4, "nbformat_minor": 4 }