{ "cells": [ { "cell_type": "markdown", "id": "dd522981", "metadata": {}, "source": [ "# Linear quadratic optimal control example\n", "\n", "Richard M. Murray, 20 Jan 2022 (updated 17 Jan 2023)\n", "\n", "This example works through the linear quadratic finite time optimal control problem. We assume that we have a linear system of the form\n", "$$\n", "\\dot x = A x + Bu \n", "$$\n", "and that we want to minimize a cost function of the form\n", "$$\n", "\\int_0^T (x^T Q_x x + u^T Q_u u) dt + x^T P_1 x.\n", "$$\n", "We show how to compute the solution the the Riccati ODE and use this to obtain an optimal (time-varying) linear controller." ] }, { "cell_type": "code", "execution_count": null, "id": "866842ea", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import scipy as sp\n", "import matplotlib.pyplot as plt\n", "import control as ct\n", "import control.optimal as opt\n", "import time" ] }, { "cell_type": "markdown", "id": "83a32e85", "metadata": {}, "source": [ "## System dynamics\n", "\n", "We use the linearized dynamics of the vehicle steering problem as our linear system. This is mainly for convenient (since we have some intuition about it). " ] }, { "cell_type": "code", "execution_count": null, "id": "de9d85f3", "metadata": {}, "outputs": [], "source": [ "# Use the linearized dynamics of the vehicle control problem\n", "# (you can find kincar.py on the course website)\n", "from kincar import kincar, plot_lanechange\n", "\n", "# Initial conditions\n", "x0 = np.array([-40, -2., 0.])\n", "u0 = np.array([10, 0]) # only used for linearization\n", "Tf = 4\n", "\n", "# Linearized dynamics\n", "sys = kincar.linearize(x0, u0)\n", "print(sys)" ] }, { "cell_type": "markdown", "id": "c5c0abe9", "metadata": {}, "source": [ "## Optimal trajectory generation\n", "\n", "We generate an trajectory for the system that minimizes the cost function above. Namely, starting from some initial function $x(0) = x_0$, we wish to bring the system toward the origin without using too much control effort." ] }, { "cell_type": "code", "execution_count": null, "id": "02e9f87c", "metadata": {}, "outputs": [], "source": [ "# Define the cost function and the terminal cost\n", "# (try changing these later to see what happens)\n", "Qx = np.diag([1, 1, 1]) # state costs\n", "Qu = np.diag([1, 1]) # input costs\n", "Pf = np.diag([1, 1, 1]) # terminal costs" ] }, { "cell_type": "markdown", "id": "62c76e5e", "metadata": {}, "source": [ "### Finite time, linear quadratic optimization\n", "\n", "The optimal solution satisfies the following equations, which follow from the maximum principle:\n", "\n", "\n", " \\begin{aligned}\n", " \\dot x &= \\left(\\frac{\\partial H}{\\partial \\lambda}\\right)^T\n", " = A x + Bu, \\qquad & x(0) &= x_0, \\\\\n", " -\\dot \\lambda &= \\left(\\frac{\\partial H}{\\partial x}\\right)^T\n", " = Q_x x + A^T \\lambda, \\qquad\n", " & \\lambda(T) &= P_1 x(T), \\\\\n", " 0 &= \\left(\\frac{\\partial H}{\\partial u}\\right)^T\n", " = Q_u u + B^T \\lambda. &&\n", " \\end{aligned}\n", "\n", "\n", "The last condition can be solved to obtain the optimal controller\n", "\n", "$$\n", " u = -Q_u^{-1} B^T \\lambda,\n", "$$\n", "\n", "which can be substituted into the equations for the optimal solution.\n", "\n", "Given the linear nature of the dynamics, we attempt to find a solution\n", "by setting $\\lambda(t) = P(t) x(t)$ where $P(t) \\in {\\mathbb R}^{n \\times\n", "n}$. Substituting this into the necessary condition, we obtain\n", "\n", "\n", " \\begin{aligned}\n", " & \\dot\\lambda =\n", " \\dot P x + P \\dot x = \\dot P x + P(Ax - BQ_u^{-1} B^T P) x, \\\\\n", " & \\quad\\implies\\quad\n", " -\\dot P x - PA x + PBQ_u^{-1}B P x = Q_xx + A^T P x.\n", " \\end{aligned}\n", "\n", "\n", "This equation is satisfied if we can find $P(t)$ such that\n", "\n", "$$\n", " -\\dot P = PA + A^T P - P B Q_u^{-1} B^T P + Q_x,\n", " \\qquad P(T) = P_1.\n", "$$" ] }, { "cell_type": "markdown", "id": "b63aed88", "metadata": {}, "source": [ "To solve a final value problem with $P(T) = P_1$, we set the \"initial\" condition to $P_1$ and then invert time, so that we solve\n", "\n", "$$\n", "\\frac{dP}{d(-t)} = -\\frac{dP}{dt} = -F(P), \\qquad P(0) = P_1\n", "$$\n", "\n", "Solving this equation from time $t = 0$ to time $t = T$ will give us an solution that goes from $P(T)$ to $P(0)$." ] }, { "cell_type": "code", "execution_count": null, "id": "02d74789", "metadata": {}, "outputs": [], "source": [ "# Set up the Riccatti ODE\n", "def Pdot_reverse(t, x):\n", " # Get the P matrix from the state by resizing\n", " P = np.reshape(x, (sys.nstates, sys.nstates))\n", " \n", " # Compute the right hand side of Riccati ODE\n", " Prhs = P @ sys.A + sys.A.T @ P + Qx - \\\n", " P @ sys.B @ np.linalg.inv(Qu) @ sys.B.T @ P\n", " \n", " # Return P as a vector, *backwards* in time (no minus sign)\n", " return Prhs.reshape((-1))\n", "\n", "# Solve the Riccati ODE (converting from matrix to vector and back)\n", "P0 = np.reshape(Pf, (-1))\n", "Psol = sp.integrate.solve_ivp(Pdot_reverse, (0, Tf), P0)\n", "Pfwd = np.reshape(Psol.y, (sys.nstates, sys.nstates, -1))\n", "\n", "# Reorder the solution in time\n", "Prev = Pfwd[:, :, ::-1] \n", "trev = Tf - Psol.t[::-1]\n", "\n", "print(\"Trange = \", trev[0], \"to\", trev[-1])\n", "print(\"P[Tf] =\", Prev[:,:,-1])\n", "print(\"P[0] =\", Prev[:,:,0])\n", "\n", "# Internal comparison: show that initial value is close to algebraic solution\n", "_, P_lqr, _ = ct.lqr(sys.A, sys.B, Qx, Qu)\n", "print(\"P_lqr =\", P_lqr)" ] }, { "cell_type": "markdown", "id": "f4fb1166", "metadata": {}, "source": [ "For solving the $x$ dynamics, we need a function to evaluate $P(t)$ at an arbitrary time (used by the integrator). We can do this with the SciPy interp1d function:" ] }, { "cell_type": "code", "execution_count": null, "id": "728f675b", "metadata": {}, "outputs": [], "source": [ "# Define an interpolation function for P\n", "P = sp.interpolate.interp1d(trev, Prev)\n", "\n", "print(\"P(0) =\", P(0))\n", "print(\"P(3.5) =\", P(3.5))\n", "print(\"P(4) =\", P(4))" ] }, { "cell_type": "markdown", "id": "eb30c3fa", "metadata": {}, "source": [ "We now solve the $\\dot x$ equations *forward* in time, using $P(t)$:" ] }, { "cell_type": "code", "execution_count": null, "id": "84092dcd", "metadata": {}, "outputs": [], "source": [ "# Now solve the state forward in time\n", "def xdot_forward(t, x):\n", " u = -np.linalg.inv(Qu) @ sys.B.T @ P(t) @ x\n", " return sys.A @ x + sys.B @ u\n", "\n", "# Now simulate from a shifted initial condition\n", "xsol = sp.integrate.solve_ivp(xdot_forward, (0, Tf), x0)\n", "tvec = xsol.t\n", "x = xsol.y\n", "print(\"x[0] =\", x[:, 0])\n", "print(\"x[Tf] =\", x[:, -1])" ] }, { "cell_type": "code", "execution_count": null, "id": "8488acad", "metadata": {}, "outputs": [], "source": [ "# Finally compute the \"desired\" state and input values\n", "xd = x\n", "ud = np.zeros((sys.ninputs, tvec.size))\n", "for i, t in enumerate(tvec):\n", " ud[:, i] = -np.linalg.inv(Qu) @ sys.B.T @ P(t) @ x[:, i]\n", "\n", "plot_lanechange(tvec, xd, ud)" ] }, { "cell_type": "markdown", "id": "89483f4b", "metadata": {}, "source": [ "Note here that we are stabilizing the system to the origin (compared to some of other examples where we change langes and so the final $y$ position is $y_\\text{f} = 2$." ] }, { "cell_type": "code", "execution_count": null, "id": "7ed4c5eb", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 5 }