Lagrangian Functions
We will later look at the method of Lagrange multipliers for finding extreme values of a function subject to constraint equations. We begin by laying the foundation for this sort of problem.
Theorem 1: Let $z = f(x, y)$ be a function subject to the constraint $g(x, y) = 0$, where $f$ and $g$ have continuous first partial derivatives near the point $P(x_0, y_0)$. Suppose that $f$, restricted to points $(x, y)$ on the level curve $g(x, y) = 0$, has an extreme value at $P$, that $P$ is not an endpoint of the level curve $g(x, y) = 0$, and that $\nabla g(x_0, y_0) \neq (0, 0)$. Then there exists a number $\lambda_0$ for which $(x_0, y_0, \lambda_0)$ is a critical point of the function $L(x, y, \lambda) = f(x, y) + \lambda g(x, y)$.
- Proof: Since $P$ is not an endpoint of the level curve $g(x, y) = 0$ and $\nabla g(x_0, y_0) \neq (0, 0)$, we can assume that the level curve $g(x, y) = 0$ is sufficiently smooth near $P$, and so a tangent line exists at the point $P$. Now recall from the page The Perpendicularity of The Gradient at a Point on a Level Curve that the gradient $\nabla g(x_0, y_0)$ is perpendicular to the level curve of $g$ that passes through the point $(x_0, y_0)$, i.e., $\nabla g(x_0, y_0)$ is perpendicular to the curve $g(x, y) = 0$ at the point $P$. Thus, $\nabla g(x_0, y_0)$ is perpendicular to the tangent line at $P$.
- Now suppose that the gradient of $f$ at $(x_0, y_0)$, $\nabla f(x_0, y_0)$, is not parallel to $\nabla g(x_0, y_0)$ as depicted in the following illustration.
- Then the projection of $\nabla f(x_0, y_0)$ onto this tangent line is a nonzero vector, call it $v$. Hence $f$ has a positive directional derivative in the direction of $v$ and a negative directional derivative in the direction of $-v$. It follows that $f(x, y)$ increases as we move away from $P$ along the level curve $g(x, y) = 0$ in one direction and decreases in the other, so $f$ cannot have an extreme value at $P$. This contradicts our hypothesis, so the assumption that $\nabla f(x_0, y_0)$ is not parallel to $\nabla g(x_0, y_0)$ must be false.
- Therefore $\nabla f(x_0, y_0)$ must be parallel to $\nabla g(x_0, y_0)$.
- Since $\nabla g(x_0, y_0) \neq (0, 0)$ and these vectors are parallel, there must exist a number $\lambda_0 \in \mathbb{R}$ such that:

  (1) $\nabla f(x_0, y_0) = -\lambda_0 \nabla g(x_0, y_0)$
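- The equation above implies that when evaluated at the point $(x_0, y_0, \lambda_0)$, $\frac{\partial L}{\partial x} = 0$ and $\frac{\partial L}{\partial y} = 0$, or equivalently:

  (2) $\frac{\partial f}{\partial x}(x_0, y_0) + \lambda_0 \frac{\partial g}{\partial x}(x_0, y_0) = 0, \quad \frac{\partial f}{\partial y}(x_0, y_0) + \lambda_0 \frac{\partial g}{\partial y}(x_0, y_0) = 0$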
- Now note that for $(x_0, y_0, \lambda_0)$ to be a critical point of the function $L$, we must also have that $\frac{\partial L}{\partial \lambda} = 0$ there. Partially differentiating $L(x, y, \lambda) = f(x, y) + \lambda g(x, y)$ with respect to $\lambda$ gives $\frac{\partial L}{\partial \lambda} = g(x, y)$, so we need $g(x, y) = 0$ at $(x_0, y_0, \lambda_0)$. But $P(x_0, y_0)$ is a point on the level curve $g(x, y) = 0$, so indeed $g(x,y) \biggr \rvert_{(x_0, y_0, \lambda_0)} = 0$, and so $(x_0, y_0, \lambda_0)$ is a critical point of $L$. $\blacksquare$
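The projection step in the proof can be checked numerically. The following is a minimal sketch, assuming the illustrative choices $f(x, y) = x + y$ and $g(x, y) = x^2 + y^2 - 2$ (neither appears in the theorem itself), with a hypothetical helper `tangent_component` that measures the signed length of the projection of $\nabla f$ onto the tangent line of $g(x, y) = 0$.

```python
import math

def grad_f(x, y):
    # Assumed example: f(x, y) = x + y, so grad f = (1, 1) everywhere.
    return (1.0, 1.0)

def grad_g(x, y):
    # Assumed example: g(x, y) = x^2 + y^2 - 2, so grad g = (2x, 2y).
    return (2 * x, 2 * y)

def tangent_component(x, y):
    """Signed length of the projection of grad f onto the unit tangent of g = 0."""
    gx, gy = grad_g(x, y)
    norm = math.hypot(gx, gy)
    # Rotating grad g by 90 degrees gives a unit vector along the tangent line.
    tx, ty = -gy / norm, gx / norm
    fx, fy = grad_f(x, y)
    return fx * tx + fy * ty

# At (sqrt(2), 0) the gradients are not parallel, so the projection is nonzero.
not_extreme = tangent_component(math.sqrt(2), 0.0)
# At (1, 1) the gradients are parallel, so the projection vanishes.
extreme = tangent_component(1.0, 1.0)

print(not_extreme, extreme)
```

A nonzero value means $f$ still changes to first order along the curve, so the point cannot be a constrained extreme value. A zero value is what the theorem requires at such a point.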
Definition: If $f(x, y)$ is a function whose extreme values are sought subject to the constraint $g(x, y) = 0$, then the corresponding Lagrangian Function is $L(x, y, \lambda) = f(x, y) + \lambda g(x, y)$.
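From Theorem 1, if we want to find the extreme values of $f(x,y)$ restricted to points $(x, y)$ lying on the level curve $g(x, y) = C$ (noting that the form $g(x, y) = 0$ assumed in Theorem 1 can be obtained by absorbing $C$ into $g(x, y)$, that is, by replacing $g(x, y)$ with $g(x, y) - C$), then we need to examine the critical points of the corresponding Lagrangian function $L(x, y, \lambda) = f(x, y) + \lambda g(x, y)$. The critical points of this function are the values of $(x, y, \lambda)$ such that $\nabla L (x, y, \lambda ) = 0$, that is:

(3) $\frac{\partial L}{\partial x} = \frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0, \quad \frac{\partial L}{\partial y} = \frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0, \quad \frac{\partial L}{\partial \lambda} = g(x, y) = 0$

Note that we do not necessarily have to solve for $\lambda$.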
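To make the system $\nabla L = 0$ concrete, here is a small sketch under assumed choices that are not part of the text above: $f(x, y) = x + y$ and $g(x, y) = x^2 + y^2 - 2$. Solving $1 + 2\lambda x = 0$, $1 + 2\lambda y = 0$, $x^2 + y^2 = 2$ by hand gives the candidates $(1, 1, -\frac{1}{2})$ and $(-1, -1, \frac{1}{2})$. The code checks that these are critical points of $L$ and compares them with a brute-force scan of the constraint circle.

```python
import math

def f(x, y):
    # Assumed objective for illustration.
    return x + y

def g(x, y):
    # Assumed constraint g(x, y) = 0: the circle of radius sqrt(2).
    return x**2 + y**2 - 2

def grad_L(x, y, lam):
    """Gradient of L = f + lam * g, with the partials worked out by hand."""
    return (1 + 2 * lam * x, 1 + 2 * lam * y, g(x, y))

# Candidates (x, y, lam) obtained by solving grad L = 0 by hand.
candidates = [(1.0, 1.0, -0.5), (-1.0, -1.0, 0.5)]
residuals = [max(abs(c) for c in grad_L(*p)) for p in candidates]

# Brute-force check: sample f along the parametrized constraint circle.
n = 100000
r = math.sqrt(2)
samples = [f(r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
           for k in range(n)]
brute_max, brute_min = max(samples), min(samples)

print(residuals, brute_max, brute_min)
```

The sampled maximum and minimum agree with $f(1, 1) = 2$ and $f(-1, -1) = -2$. Classifying the points only required comparing values of $f$, so the values of $\lambda$ were not needed for anything beyond finding the candidates.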
It is also important to note the requirement that $\nabla g(x_0, y_0) \neq (0, 0)$. That is, at a point of interest $(x_0, y_0)$ that produces an extreme value of $f$, the gradient of $g$ at $(x_0, y_0)$ cannot equal the zero vector. This is because the argument above relied on the fact that $\nabla f (x_0, y_0)$ is parallel to $\nabla g(x_0, y_0)$. If $\nabla g(x_0, y_0) = (0, 0)$, then every vector is parallel to $\nabla g(x_0, y_0)$, and so parallelism tells us nothing about $\nabla f(x_0, y_0)$!
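A standard example shows what can go wrong when $\nabla g$ vanishes. The sketch below assumes $f(x, y) = x$ and $g(x, y) = y^2 - x^3$, chosen here for illustration only. On the curve $y^2 = x^3$ we have $x \geq 0$, so the constrained minimum of $f$ is at $(0, 0)$, yet $\frac{\partial L}{\partial x} = 1 - 3\lambda x^2$ equals $1$ there for every $\lambda$, so $L$ has no critical point at the minimum.

```python
def f(x, y):
    # Assumed objective for illustration.
    return x

def g(x, y):
    # Assumed constraint: the cusp curve y^2 = x^3.
    return y**2 - x**3

def grad_g(x, y):
    return (-3 * x**2, 2 * y)

def dL_dx(x, y, lam):
    # Partial derivative of L = f + lam * g with respect to x.
    return 1 + lam * (-3 * x**2)

# Sample the curve via x = |y|^(2/3) and locate the constrained minimum of f.
ys = [k / 1000 for k in range(-2000, 2001)]
points = [(abs(y) ** (2 / 3), y) for y in ys]
best = min(points, key=lambda p: f(*p))

# At the minimum, grad g is the zero vector and dL/dx cannot vanish.
grad_at_best = grad_g(*best)
dL_values = [dL_dx(best[0], best[1], lam) for lam in (-10.0, 0.0, 3.5)]

print(best, grad_at_best, dL_values)
```

Here the extreme value exists, but the Lagrangian system has no solution at that point, which is why the hypothesis $\nabla g(x_0, y_0) \neq (0, 0)$ cannot be dropped.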
However, it is not a requirement that $\nabla f(x_0, y_0) \neq (0, 0)$. Suppose instead that $\nabla f(x_0, y_0) = (0, 0)$. Then $\nabla f (x_0, y_0)$ is still parallel to $\nabla g(x_0, y_0) \neq (0, 0)$, and so there exists a $\lambda \in \mathbb{R}$ such that:

(4) $\nabla f(x_0, y_0) = -\lambda \nabla g(x_0, y_0)$

Since $\nabla f(x_0, y_0) = (0, 0)$ and $\nabla g(x_0, y_0) \neq (0, 0)$, this forces $\lambda = 0$.
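The case $\lambda = 0$ can be seen in a small sketch, assuming the illustrative functions $f(x, y) = x^2 + y^2$ and $g(x, y) = y$ (not taken from the text above). The constrained minimum along the line $y = 0$ is at the origin, where $\nabla f = (0, 0)$ and $\nabla g = (0, 1)$.

```python
def grad_L(x, y, lam):
    # Assumed example: L = x^2 + y^2 + lam * y, i.e. f = x^2 + y^2 and g = y.
    return (2 * x, 2 * y + lam, y)

# Constrained minimum of f on the line y = 0.
x0, y0 = 0.0, 0.0

# With lam = 0 all three partial derivatives of L vanish at the origin.
at_zero_lam = grad_L(x0, y0, 0.0)

# With any nonzero lam, dL/dy = lam is nonzero, so the point is not critical.
at_nonzero_lam = grad_L(x0, y0, 1.0)

print(at_zero_lam, at_nonzero_lam)
```

So $(0, 0, 0)$ is the only critical point of this Lagrangian lying over the origin, consistent with $\lambda = 0$ being forced when $\nabla f(x_0, y_0) = (0, 0)$.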