{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Hyperparameters in QPyTorch\n",
    "\n",
    "The purpose of this notebook is to explain how QEP hyperparameters in QPyTorch work, how they are handled, what options are available for constraints and priors, and how things may differ from other packages.\n",
    "\n",
    "**Note:** This is a *basic* introduction to hyperparameters in QPyTorch. If you want to use QPyTorch hyperparameters with things like Pyro distributions, that will be covered in a less \"basic usage\" tutorial."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# smoke_test (this makes sure this example notebook gets tested)\n",
    "\n",
    "import math\n",
    "import torch\n",
    "import qpytorch\n",
    "from matplotlib import pyplot as plt\n",
    "\n",
    "from IPython.display import Markdown, display\n",
    "def printmd(string):\n",
    "    display(Markdown(string))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Defining an example model\n",
    "\n",
    "In the next cell, we define our simple exact QEP from the <a href=\"../01_Exact_QEPs/Simple_QEP_Regression.ipynb\">Simple QEP Regression</a> tutorial. We'll be using this model to demonstrate certain aspects of hyperparameter creation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_x = torch.linspace(0, 1, 100)\n",
    "train_y = torch.sin(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * 0.2\n",
    "POWER = 1.0\n",
    "\n",
    "# We will use the simplest form of QEP model, exact inference\n",
    "class ExactQEPModel(qpytorch.models.ExactQEP):\n",
    "    def __init__(self, train_x, train_y, likelihood):\n",
    "        super(ExactQEPModel, self).__init__(train_x, train_y, likelihood)\n",
    "        self.power = torch.tensor(POWER)\n",
    "        self.mean_module = qpytorch.means.ConstantMean()\n",
    "        self.covar_module = qpytorch.kernels.ScaleKernel(qpytorch.kernels.RBFKernel())\n",
    "    \n",
    "    def forward(self, x):\n",
    "        mean_x = self.mean_module(x)\n",
    "        covar_x = self.covar_module(x)\n",
    "        return qpytorch.distributions.MultivariateQExponential(mean_x, covar_x, power=self.power)\n",
    "\n",
    "# initialize likelihood and model\n",
    "likelihood = qpytorch.likelihoods.QExponentialLikelihood(power = torch.tensor(POWER))\n",
    "model = ExactQEPModel(train_x, train_y, likelihood)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Viewing model hyperparameters\n",
    "\n",
    "Let's take a look at the model parameters. By \"parameters\", here I mean explicitly objects of type `torch.nn.Parameter` that will have gradients filled in by autograd. To access these, there are two ways of doing this in torch. One way is to use `model.state_dict()`, which we demonstrate the use of for saving models <a href=\"Saving_and_Loading_Models.ipynb\">here</a>.\n",
    "\n",
    "In the next cell we demonstrate another way to do this, by looping over the `model.named_parameters()` generator:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Parameter name: likelihood.noise_covar.raw_noise           value = 0.0\n",
      "Parameter name: mean_module.raw_constant                   value = 0.0\n",
      "Parameter name: covar_module.raw_outputscale               value = 0.0\n",
      "Parameter name: covar_module.base_kernel.raw_lengthscale   value = 0.0\n"
     ]
    }
   ],
   "source": [
    "for param_name, param in model.named_parameters():\n",
    "    print(f'Parameter name: {param_name:42} value = {param.item()}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Raw vs Actual Parameters\n",
    "\n",
    "The most important thing to note here is that the actual learned parameters of the model are things like `raw_noise`, `raw_outputscale`, `raw_lengthscale`, etc. The reason for this is that these parameters **must be positive**. This brings us to our next topic for parameters: constraints, and the difference between *raw* parameters and *actual* parameters.\n",
    "\n",
    "In order to enforce positiveness and other constraints for hyperparameters, QPyTorch has **raw** parameters (e.g., `model.covar_module.raw_outputscale`) that are transformed to actual values via some constraint. Let's take a look at the raw outputscale, its constraint, and the final value:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "raw_outputscale,  Parameter containing:\n",
      "tensor(0., requires_grad=True)\n",
      "\n",
      "raw_outputscale_constraint1 Positive()\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "\n",
       "**Printing all model constraints...**\n"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Constraint name: likelihood.noise_covar.raw_noise_constraint             constraint = GreaterThan(1.000E-04)\n",
      "Constraint name: covar_module.raw_outputscale_constraint                 constraint = Positive()\n",
      "Constraint name: covar_module.base_kernel.raw_lengthscale_constraint     constraint = Positive()\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Getting raw outputscale constraint from model...**"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Positive()\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "\n",
       "**Getting raw outputscale constraint from model.covar_module...**"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Positive()\n"
     ]
    }
   ],
   "source": [
    "raw_outputscale = model.covar_module.raw_outputscale\n",
    "print('raw_outputscale, ', raw_outputscale)\n",
    "\n",
    "# Three ways of accessing the raw outputscale constraint\n",
    "print('\\nraw_outputscale_constraint1', model.covar_module.raw_outputscale_constraint)\n",
    "\n",
    "printmd('\\n\\n**Printing all model constraints...**\\n')\n",
    "for constraint_name, constraint in model.named_constraints():\n",
    "    print(f'Constraint name: {constraint_name:55} constraint = {constraint}')\n",
    "\n",
    "printmd('\\n**Getting raw outputscale constraint from model...**')\n",
    "print(model.constraint_for_parameter_name(\"covar_module.raw_outputscale\"))\n",
    "\n",
    "\n",
    "printmd('\\n**Getting raw outputscale constraint from model.covar_module...**')\n",
    "print(model.covar_module.constraint_for_parameter_name(\"raw_outputscale\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### How do constraints work?\n",
    "\n",
    "Constraints define `transform` and `inverse_transform` methods that turn raw parameters in to real ones. For a positive constraint, we expect the **transformed** values to always be positive. Let's see:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Transformed outputscale tensor(0.6931, grad_fn=<SoftplusBackward0>)\n",
      "tensor(0., grad_fn=<AddBackward0>)\n",
      "True\n",
      "Transform a bunch of negative tensors:  tensor([0.3133, 0.1269, 0.0486])\n"
     ]
    }
   ],
   "source": [
    "raw_outputscale = model.covar_module.raw_outputscale\n",
    "constraint = model.covar_module.raw_outputscale_constraint\n",
    "\n",
    "print('Transformed outputscale', constraint.transform(raw_outputscale))\n",
    "print(constraint.inverse_transform(constraint.transform(raw_outputscale)))\n",
    "print(torch.equal(constraint.inverse_transform(constraint.transform(raw_outputscale)), raw_outputscale))\n",
    "\n",
    "print('Transform a bunch of negative tensors: ', constraint.transform(torch.tensor([-1., -2., -3.])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Convenience Getters/Setters for Transformed Values\n",
    "\n",
    "Because dealing with raw parameter values is annoying (e.g., we might know what a noise variance of 0.01 means, but maybe not a `raw_noise` of `-2.791`), virtually all built in QPyTorch modules that define raw parameters define convenience getters and setters for dealing with transformed values directly.\n",
    "\n",
    "In the next cells, we demonstrate the \"inconvenient way\" and the \"convenient\" way of getting and setting the outputscale."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Actual outputscale: 0.6931471824645996\n",
      "Actual outputscale after setting: 2.0\n"
     ]
    }
   ],
   "source": [
    "# Recreate model to reset outputscale\n",
    "model = ExactQEPModel(train_x, train_y, likelihood)\n",
    "\n",
    "# Inconvenient way of getting true outputscale\n",
    "raw_outputscale = model.covar_module.raw_outputscale\n",
    "constraint = model.covar_module.raw_outputscale_constraint\n",
    "outputscale = constraint.transform(raw_outputscale)\n",
    "print(f'Actual outputscale: {outputscale.item()}')\n",
    "\n",
    "# Inconvenient way of setting true outputscale\n",
    "model.covar_module.raw_outputscale.data.fill_(constraint.inverse_transform(torch.tensor(2.)))\n",
    "raw_outputscale = model.covar_module.raw_outputscale\n",
    "outputscale = constraint.transform(raw_outputscale)\n",
    "print(f'Actual outputscale after setting: {outputscale.item()}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ouch, that is ugly! Fortunately, there is a better way:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Actual outputscale: 0.6931471824645996\n",
      "Actual outputscale after setting: 2.0\n"
     ]
    }
   ],
   "source": [
    "# Recreate model to reset outputscale\n",
    "model = ExactQEPModel(train_x, train_y, likelihood)\n",
    "\n",
    "# Convenient way of getting true outputscale\n",
    "print(f'Actual outputscale: {model.covar_module.outputscale}')\n",
    "\n",
    "# Convenient way of setting true outputscale\n",
    "model.covar_module.outputscale = 2.\n",
    "print(f'Actual outputscale after setting: {model.covar_module.outputscale}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Changing Parameter Constraints\n",
    "\n",
    "If we look at the actual noise of the model, QPyTorch defines a default lower bound of `1e-4` for the noise variance:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Actual noise value: tensor([0.6932], grad_fn=<AddBackward0>)\n",
      "Noise constraint: GreaterThan(1.000E-04)\n"
     ]
    }
   ],
   "source": [
    "print(f'Actual noise value: {likelihood.noise}')\n",
    "print(f'Noise constraint: {likelihood.noise_covar.raw_noise_constraint}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can change the noise constraint either on the fly or when the likelihood is created:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Noise constraint: GreaterThan(1.000E-03)\n",
      "Noise constraint: Positive()\n"
     ]
    }
   ],
   "source": [
    "likelihood = qpytorch.likelihoods.QExponentialLikelihood(noise_constraint=qpytorch.constraints.GreaterThan(1e-3))\n",
    "print(f'Noise constraint: {likelihood.noise_covar.raw_noise_constraint}')\n",
    "\n",
    "## Changing the constraint after the module has been created\n",
    "likelihood.noise_covar.register_constraint(\"raw_noise\", qpytorch.constraints.Positive())\n",
    "print(f'Noise constraint: {likelihood.noise_covar.raw_noise_constraint}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Priors\n",
    "\n",
    "In QPyTorch, priors are things you register to the model that act on any arbitrary function of any parameter. Like constraints, these can usually be defined either when you create an object (like a Kernel or Likelihood), or set afterwards on the fly.\n",
    "\n",
    "Here are some examples:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Registers a prior on the sqrt of the noise parameter \n",
    "# (e.g., a prior for the noise standard deviation instead of variance)\n",
    "likelihood.noise_covar.register_prior(\n",
    "    \"noise_std_prior\",\n",
    "    qpytorch.priors.QExponentialPrior(0, 1, torch.tensor(POWER)),\n",
    "    lambda module: module.noise.sqrt()\n",
    ")\n",
    "\n",
    "# Create a QExponentialLikelihood with a q-exponential prior for the noise\n",
    "likelihood = qpytorch.likelihoods.QExponentialLikelihood(\n",
    "    noise_constraint=qpytorch.constraints.GreaterThan(1e-3),\n",
    "    noise_prior=qpytorch.priors.QExponentialPrior(0, 1)\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Putting it Together\n",
    "\n",
    "In the next cell, we augment our `ExactQEP` definition to place several priors over hyperparameters and tighter constraints when creating the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "# We will use the simplest form of QEP model, exact inference\n",
    "class FancyQEPWithPriors(qpytorch.models.ExactQEP):\n",
    "    def __init__(self, train_x, train_y, likelihood):\n",
    "        super(FancyQEPWithPriors, self).__init__(train_x, train_y, likelihood)\n",
    "        self.mean_module = qpytorch.means.ConstantMean()\n",
    "        \n",
    "        lengthscale_prior = qpytorch.priors.GammaPrior(3.0, 6.0)\n",
    "        outputscale_prior = qpytorch.priors.GammaPrior(2.0, 0.15)\n",
    "        \n",
    "        self.covar_module = qpytorch.kernels.ScaleKernel(\n",
    "            qpytorch.kernels.RBFKernel(\n",
    "                lengthscale_prior=lengthscale_prior,\n",
    "            ),\n",
    "            outputscale_prior=outputscale_prior\n",
    "        )\n",
    "        \n",
    "        # Initialize lengthscale and outputscale to mean of priors\n",
    "        self.covar_module.base_kernel.lengthscale = lengthscale_prior.mean\n",
    "        self.covar_module.outputscale = outputscale_prior.mean\n",
    "    \n",
    "    def forward(self, x):\n",
    "        mean_x = self.mean_module(x)\n",
    "        covar_x = self.covar_module(x)\n",
    "        return qpytorch.distributions.MultivariateQExponential(mean_x, covar_x, power=torch.tensor(POWER))\n",
    "\n",
    "likelihood = qpytorch.likelihoods.QExponentialLikelihood(\n",
    "    noise_constraint=qpytorch.constraints.GreaterThan(1e-2),\n",
    ")\n",
    "\n",
    "model = FancyQEPWithPriors(train_x, train_y, likelihood)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initializing hyperparameters in One Call\n",
    "\n",
    "For convenience, QPyTorch modules also define an `initialize` method that allow you to update a full dictionary of parameters on submodules. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.0000001192092896 0.5 2.0\n"
     ]
    }
   ],
   "source": [
    "hypers = {\n",
    "    'likelihood.noise_covar.noise': torch.tensor(1.),\n",
    "    'covar_module.base_kernel.lengthscale': torch.tensor(0.5),\n",
    "    'covar_module.outputscale': torch.tensor(2.),\n",
    "}\n",
    "\n",
    "model.initialize(**hypers)\n",
    "print(\n",
    "    model.likelihood.noise_covar.noise.item(),\n",
    "    model.covar_module.base_kernel.lengthscale.item(),\n",
    "    model.covar_module.outputscale.item()\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}