{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Bayesian Inference and Python Emcee" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "##\n", "# Ayush Pandey\n", "# Feb 18th 2022\n", "## \n", "## Note:\n", "# Parts of this notebook are directly taken from the emcee package documentation\n", "# available here: https://emcee.readthedocs.io/en/stable/\n", "##\n", "\n", "\n", "# Plotting parameters\n", "%matplotlib inline\n", "%config InlineBackend.figure_format = \"retina\"\n", "\n", "from matplotlib import rcParams\n", "rcParams[\"savefig.dpi\"] = 100\n", "rcParams[\"figure.dpi\"] = 100\n", "rcParams[\"font.size\"] = 20" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bayesian Inference: Linear Fit\n", "With this example, we identify the parameters of a linear equation, $y = mx + b$. Given noisy data of the output $y$, we estimate the posterior distributions of the parameters $m$ and $b$. \n", "\n", "### Create artificial data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "np.random.seed(123)\n", "\n", "# Choose the \"true\" parameters.\n", "m_true = -0.9594\n", "b_true = 4.294\n", "f_true = 0.534\n", "\n", "# Generate some synthetic data from the model.\n", "N = 50\n", "x = np.sort(10 * np.random.rand(N))\n", "yerr = 0.1 + 0.5 * np.random.rand(N)\n", "y = m_true * x + b_true\n", "y += np.abs(f_true * y) * np.random.randn(N)\n", "y += yerr * np.random.randn(N)\n", "\n", "plt.errorbar(x, y, yerr=yerr, fmt=\".k\", capsize=0)\n", "x0 = np.linspace(0, 10, 500)\n", "plt.plot(x0, m_true * x0 + b_true, \"k\", alpha=0.3, lw=3)\n", "plt.xlim(0, 10)\n", "plt.xlabel(\"x\")\n", "plt.ylabel(\"y\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Least-Squares Fit:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A = np.vander(x, 2)\n", "C = np.diag(yerr * yerr)\n", "ATA = np.dot(A.T, A / (yerr ** 2)[:, None])\n", "cov = np.linalg.inv(ATA)\n", "w = np.linalg.solve(ATA, np.dot(A.T, y / yerr ** 2))\n", "print(\"Least-squares estimates:\")\n", "print(\"m = {0:.3f} ± {1:.3f}\".format(w[0], np.sqrt(cov[0, 0])))\n", "print(\"b = {0:.3f} ± {1:.3f}\".format(w[1], np.sqrt(cov[1, 1])))\n", "\n", "plt.errorbar(x, y, yerr=yerr, fmt=\".k\", capsize=0)\n", "plt.plot(x0, m_true * x0 + b_true, \"k\", alpha=0.3, lw=3, label=\"truth\")\n", "plt.plot(x0, np.dot(np.vander(x0, 2), w), \"--k\", label=\"LS\")\n", "plt.legend(fontsize=14)\n", "plt.xlim(0, 10)\n", "plt.xlabel(\"x\")\n", "plt.ylabel(\"y\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Maximum likelihood estimation\n", "\n", "The least squares solution found in the previous section is the maximum\n", "likelihood result for a model where the error bars are assumed correct,\n", "Gaussian and independent.\n", "We know, of course, that this isn't the right model.\n", "Unfortunately, there isn't a generalization of least squares that supports a\n", "model like the one that we know to be true.\n", "Instead, we need to write down the likelihood function and numerically\n", "optimize it.\n", "In mathematical notation, the correct likelihood function is:\n", "\n", "$$\n", " \\ln\\,p(y\\,|\\,x,\\sigma,m,b,f) =\n", " -\\frac{1}{2} \\sum_n \\left[\n", " \\frac{(y_n-m\\,x_n-b)^2}{s_n^2}\n", " + \\ln \\left ( 2\\pi\\,s_n^2 \\right )\n", " \\right]\n", "$$\n", "\n", "where\n", "\n", "$$\n", " s_n^2 = \\sigma_n^2+f^2\\,(m\\,x_n+b)^2 \\quad .\n", "$$\n", "\n", "This likelihood function is simply a Gaussian where the variance is\n", "underestimated by some fractional amount: $f$.\n", "In Python, you would code this up as:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def log_likelihood(theta, x, y, yerr):\n", " m, b, log_f = theta\n", " model = m * x + b\n", " sigma2 = yerr ** 2 + model ** 2 * np.exp(2 * log_f)\n", " return -0.5 * np.sum((y - model) ** 2 / sigma2 + np.log(sigma2))\n", "\n", "from scipy.optimize import minimize\n", "\n", "np.random.seed(42)\n", "nll = lambda *args: -log_likelihood(*args)\n", "initial = np.array([m_true, b_true, np.log(f_true)]) + 0.1 * np.random.randn(3)\n", "soln = minimize(nll, initial, args=(x, y, yerr))\n", "m_ml, b_ml, log_f_ml = soln.x\n", "\n", "print(\"Maximum likelihood estimates:\")\n", "print(\"m = {0:.3f}\".format(m_ml))\n", "print(\"b = {0:.3f}\".format(b_ml))\n", "print(\"f = {0:.3f}\".format(np.exp(log_f_ml)))\n", "\n", "plt.errorbar(x, y, yerr=yerr, fmt=\".k\", capsize=0)\n", "plt.plot(x0, m_true * x0 + b_true, \"k\", alpha=0.3, lw=3, label=\"truth\")\n", "plt.plot(x0, np.dot(np.vander(x0, 2), w), \"--k\", label=\"LS\")\n", "plt.plot(x0, np.dot(np.vander(x0, 2), [m_ml, b_ml]), \":k\", label=\"ML\")\n", "plt.legend(fontsize=14)\n", "plt.xlim(0, 10)\n", "plt.xlabel(\"x\")\n", "plt.ylabel(\"y\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Markov Chain Monte Carlo for Parameter Inference\n", "\n", "The parameter estimates do not reflect the uncertainties and noise in the data in maximum-likelihood estimate. So, we cannot estimate the uncertainties in $m$ and $b$. Further, it may not always be feasible to use scipy.optimize.minimize to obtain maximum likelihood estimates. MCMC is a hammer that can be applied to all kinds of problems! " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cost_progress = []\n", "def log_prior(theta):\n", " m, b = theta\n", " if -500.0 < m < 500 and -1100.0 < b < 1100.0:\n", " return 0.0\n", " return -np.inf\n", "\n", "def log_likelihood(theta, x, y):\n", " m, b = theta\n", " model = m * x + b\n", " cost = - np.linalg.norm(y - model)\n", " cost_progress.append(cost)\n", " return cost\n", "\n", "def log_probability(theta, x, y):\n", " lp = log_prior(theta)\n", " if not np.isfinite(lp):\n", " return -np.inf\n", " return lp + log_likelihood(theta, x, y)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import emcee\n", "\n", "pos = 1*np.zeros(2) + 1e-4 * np.random.randn(32, 2)\n", "nwalkers, ndim = pos.shape\n", "\n", "sampler = emcee.EnsembleSampler(nwalkers, ndim, log_probability, args=(x, y))\n", "sampler.run_mcmc(pos, 3000, progress=True);" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, axes = plt.subplots(2, figsize=(10, 7), sharex=True)\n", "samples = sampler.get_chain()\n", "# labels = [\"m\", \"b\", \"log(f)\"]\n", "labels = [\"m\", \"b\"]\n", "for i in range(ndim):\n", " ax = axes[i]\n", " ax.plot(samples[:, :, i], \"k\", alpha=0.3)\n", " ax.set_xlim(0, len(samples))\n", " ax.set_ylabel(labels[i])\n", " ax.yaxis.set_label_coords(-0.1, 0.5)\n", "\n", "axes[-1].set_xlabel(\"step number\");" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(cost_progress)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import corner\n", "flat_samples = sampler.get_chain(discard=100, thin=15, flat=True)\n", "print(flat_samples.shape)\n", "fig = corner.corner(\n", " flat_samples, labels=labels, truths=[m_true, b_true]\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Let us try a nonlinear model:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def vehicle_model(x, t, *args):\n", " l, v, phi = args\n", " x, y, theta = x\n", " dx_dt = np.cos(theta)*v\n", " dy_dt = np.sin(theta)*v\n", " dtheta_dt = (1/l)*np.tan(phi)*v\n", " return np.array([dx_dt, dy_dt, dtheta_dt])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Simulate the model and create synthetic data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.integrate import odeint\n", "def degree_to_radian(theta):\n", " return theta*np.pi/180\n", "l = 3\n", "v = 10\n", "phi = degree_to_radian(5)\n", "# phi = 0.99\n", "timepoints = np.linspace(0,10,100)\n", "initial_cond = np.array([0, 2, 0])\n", "solution = odeint(func = vehicle_model, y0 = initial_cond,\n", " t = timepoints, args = (l, v, phi))\n", "data_y = solution[:,1] + 2 * np.random.randn(np.shape(timepoints)[0])\n", "ax = plt.axes()\n", "ax.plot(timepoints, solution[:,1], lw = 3, label = 'y (Model)')\n", "ax.plot(timepoints, data_y, lw = 3, label = 'y (Data)')\n", "ax.set_ylabel('y')\n", "ax.set_xlabel('t')\n", "ax.legend()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cost_progress = []\n", "def log_prior(params):\n", " l, v, phi = params\n", " if 0 < l < 100 and -100 < v < 100 and -10.0 < phi < 10:\n", " return 0.0\n", " return -np.inf\n", "\n", "def log_likelihood(params, x, y):\n", " l, v, phi = params\n", " sol = odeint(func = vehicle_model, y0 = initial_cond,\n", " t = x, args = (l, v, phi))\n", " cost = - np.linalg.norm(y - sol[:,1])\n", " cost_progress.append(cost)\n", " return cost\n", "\n", "def log_probability(params, x, y):\n", " lp = log_prior(params)\n", " if not np.isfinite(lp):\n", " return -np.inf\n", " return lp + log_likelihood(params, x, y)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import emcee\n", "nwalkers = 32\n", "pos = 1*np.array([l, v, phi]) + 1e-4 * np.random.randn(nwalkers, 3)\n", "nwalkers, ndim = pos.shape\n", "x = timepoints\n", "y = data_y\n", "sampler = emcee.EnsembleSampler(nwalkers, ndim, log_probability, args=(x, y))\n", "sampler.run_mcmc(pos, 4000, progress=True);" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, axes = plt.subplots(3, figsize=(10, 7), sharex=True)\n", "samples = sampler.get_chain()\n", "\n", "labels = [\"$l$\", \"$v$\", \"$\\phi$\"]\n", "for i in range(ndim):\n", " ax = axes[i]\n", " ax.plot(samples[:, :, i], \"k\", alpha=0.3)\n", " ax.set_xlim(0, len(samples))\n", " ax.set_ylabel(labels[i])\n", " ax.yaxis.set_label_coords(-0.1, 0.5)\n", "\n", "axes[-1].set_xlabel(\"step number\");" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import corner\n", "flat_samples = sampler.get_chain(discard=50, thin=15, flat=True)\n", "print(flat_samples.shape)\n", "fig = corner.corner(\n", " flat_samples, labels=labels, truths=[l, v, phi]\n", ");" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "inds = np.random.randint(len(flat_samples), size=100)\n", "ax = plt.axes()\n", "add_label = True\n", "for ind in inds:\n", " sample = flat_samples[ind]\n", " solution = odeint(func = vehicle_model, y0 = initial_cond,\n", " t = timepoints, args = tuple(sample))\n", " if add_label:\n", " ax.plot(timepoints, solution[:,1], \"C1\", alpha=0.1, label = 'Fitted Model')\n", " add_label = False\n", " else:\n", " ax.plot(timepoints, solution[:,1], \"C1\", alpha=0.1)\n", " \n", "ax.plot(timepoints, data_y, lw = 3, label = 'y (Data)')\n", "ax.set_ylabel('y')\n", "ax.set_xlabel('t')\n", "ax.legend()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 2 }