{ "cells": [ { "cell_type": "markdown", "id": "dd522981", "metadata": {}, "source": [ "# Linear quadratic optimal control example\n", "\n", "Richard M. Murray, 20 Jan 2022\n", "\n", "This example works through the linear quadratic finite time optimal control problem. We assume that we have a linear system of the form\n", "$$\n", "\\dot x = A x + Bu \n", "$$\n", "and that we want to minimize a cost function of the form\n", "$$\n", "\\int_0^T (x^T Q_x x + u^T Q_u u) dt + x^T P_1 x.\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "id": "866842ea", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import scipy as sp\n", "import matplotlib.pyplot as plt\n", "import control as ct\n", "import control.optimal as opt\n", "import time" ] }, { "cell_type": "markdown", "id": "83a32e85", "metadata": {}, "source": [ "## System dynamics\n", "\n", "We use the linearized dynamics of the vehicle steering problem as our linear system. This is mainly for convenient (since we have some intuition about it). " ] }, { "cell_type": "code", "execution_count": null, "id": "de9d85f3", "metadata": {}, "outputs": [], "source": [ "# Use the linearized dynamics of the vehicle control problem\n", "# (you can find vehicle.py on the course website)\n", "from vehicle import vehicle, plot_lanechange\n", "\n", "# Initial conditions\n", "x0 = np.array([-40, -2., 0.])\n", "u0 = np.array([10, 0]) # only used for linearization\n", "Tf = 4\n", "\n", "# Linearized dynamics\n", "sys = vehicle.linearize(x0, u0)\n", "print(sys)\n", "sys" ] }, { "cell_type": "markdown", "id": "c5c0abe9", "metadata": {}, "source": [ "## Optimal trajectory generation\n", "\n", "We generate an trajectory for the system that minimizes the cost function above. Namely, starting from some initial function $x(0) = x_0$, we wish to bring the system toward the origin without using too much control effort." ] }, { "cell_type": "code", "execution_count": null, "id": "02e9f87c", "metadata": {}, "outputs": [], "source": [ "# Define the cost function and the terminal cost\n", "# (try changing these later to see what happens)\n", "Qx = np.diag([1, 1, 1]) # state costs\n", "Qu = np.diag([1, 100]) # input costs\n", "Pf = np.diag([1, 1, 1]) # terminal costs" ] }, { "cell_type": "markdown", "id": "62c76e5e", "metadata": {}, "source": [ "### Finite time, linear quadratic optimization\n", "\n", "The optimal solution satisfies the following equations, which follow from the maximum principle:\n", "\n", "\n", " \\begin{aligned}\n", " \\dot x &= \\left(\\frac{\\partial H}{\\partial \\lambda}\\right)^T\n", " = A x + Bu, \\qquad & x(0) &= x_0, \\\\\n", " -\\dot \\lambda &= \\left(\\frac{\\partial H}{\\partial x}\\right)^T\n", " = Q_x x + A^T \\lambda, \\qquad\n", " & \\lambda(T) &= P_1 x(T), \\\\\n", " 0 &= \\left(\\frac{\\partial H}{\\partial u}\\right)^T\n", " = Q_u u + B^T \\lambda. &&\n", " \\end{aligned}\n", "\n", "\n", "The last condition can be solved to obtain the optimal controller\n", "\n", "$$\n", " u = -Q_u^{-1} B^T \\lambda,\n", "$$\n", "\n", "which can be substituted into the equations for the optimal solution.\n", "\n", "Given the linear nature of the dynamics, we attempt to find a solution\n", "by setting $\\lambda(t) = P(t) x(t)$ where $P(t) \\in {\\mathbb R}^{n \\times\n", "n}$. Substituting this into the necessary condition, we obtain\n", "\n", "\n", " \\begin{aligned}\n", " & \\dot\\lambda =\n", " \\dot P x + P \\dot x = \\dot P x + P(Ax - BQ_u^{-1} B^T P) x, \\\\\n", " & \\quad\\implies\\quad\n", " -\\dot P x - PA x + PBQ_u^{-1}B P x = Q_xx + A^T P x.\n", " \\end{aligned}\n", "\n", "\n", "This equation is satisfied if we can find $P(t)$ such that\n", "\n", "$$\n", " -\\dot P = PA + A^T P - P B Q_u^{-1} B^T P + Q_x,\n", " \\qquad P(T) = P_1.\n", "$$" ] }, { "cell_type": "markdown", "id": "b63aed88", "metadata": {}, "source": [ "To solve a final value problem with $x(T) = x_\\text{f}$, we set the \"initial\" condition to $x_\\text{f}$ and then invert time, so that we solve\n", "\n", "$$\n", "\\frac{dx}{d(-t)} = -\\frac{dx}{dt} = -f(x), \\qquad x(0) = x_\\text{f}\n", "$$\n", "\n", "Solving this equation from time $t = 0$ to time $t = T$ will give us an solution that goes from $x(T)$ to $x(0)$." ] }, { "cell_type": "code", "execution_count": null, "id": "02d74789", "metadata": {}, "outputs": [], "source": [ "# Set up the Riccatti ODE\n", "def Pdot_reverse(t, x):\n", " # Get the P matrix from the state by resizing\n", " P = np.reshape(x, (sys.nstates, sys.nstates))\n", " \n", " # Compute the right hand side of Riccati ODE\n", " Prhs = P @ sys.A + sys.A.T @ P + Qx - \\\n", " P @ sys.B @ np.linalg.inv(Qu) @ sys.B.T @ P\n", " \n", " # Return P as a vector, *backwards* in time (no minus sign)\n", " return Prhs.reshape((-1))\n", "\n", "# Solve the Riccati ODE (converting from matrix to vector and back)\n", "P0 = np.reshape(Pf, (-1))\n", "Psol = sp.integrate.solve_ivp(Pdot_reverse, (0, Tf), P0)\n", "Pfwd = np.reshape(Psol.y, (sys.nstates, sys.nstates, -1))\n", "\n", "# Reorder the solution in time\n", "Prev = Pfwd[:, :, ::-1] \n", "trev = Tf - Psol.t[::-1]\n", "\n", "print(\"Trange = \", trev[0], \"to\", trev[-1])\n", "print(\"P[Tf] =\", Prev[:,:,-1])\n", "print(\"P[0] =\", Prev[:,:,0])\n", "\n", "# Internal comparison (we'll talk about this next week)\n", "_, P_lqr, _ = ct.lqr(sys.A, sys.B, Qx, Qu)\n", "print(\"P_lqr =\", P_lqr)" ] }, { "cell_type": "markdown", "id": "f4fb1166", "metadata": {}, "source": [ "For solving the $x$ dynamics, we need a function to evaluate $P(t)$ at an arbitrary time (used by the integrator). We can do this with the SciPy interp1d function:" ] }, { "cell_type": "code", "execution_count": null, "id": "728f675b", "metadata": {}, "outputs": [], "source": [ "# Define an interpolation function for P\n", "P = sp.interpolate.interp1d(trev, Prev)\n", "\n", "print(\"P(0) =\", P(0))\n", "print(\"P(3.5) =\", P(3.5))\n", "print(\"P(4) =\", P(4))" ] }, { "cell_type": "markdown", "id": "eb30c3fa", "metadata": {}, "source": [ "We now solve the $\\dot x$ equations *forward* in time, using $P(t)$:" ] }, { "cell_type": "code", "execution_count": null, "id": "84092dcd", "metadata": {}, "outputs": [], "source": [ "# Now solve the state forward in time\n", "def xdot_forward(t, x):\n", " u = -np.linalg.inv(Qu) @ sys.B.T @ P(t) @ x\n", " return sys.A @ x + sys.B @ u\n", "\n", "# Now simulate from a shifted initial condition\n", "xsol = sp.integrate.solve_ivp(xdot_forward, (0, Tf), x0)\n", "tvec = xsol.t\n", "x = xsol.y\n", "print(\"x[0] =\", x[:, 0])\n", "print(\"x[Tf] =\", x[:, -1])" ] }, { "cell_type": "code", "execution_count": null, "id": "8488acad", "metadata": {}, "outputs": [], "source": [ "# Now compute the \"desired\" state and input values\n", "xd = x\n", "ud = np.zeros((sys.ninputs, tvec.size))\n", "for i, t in enumerate(tvec):\n", " ud[:, i] = -np.linalg.inv(Qu) @ sys.B.T @ P(t) @ x[:, i]\n", "\n", "plot_lanechange(tvec, xd, ud)" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 5 }