# Newton's Method for Solving Systems of Two Nonlinear Equations

Recall from the Newton's Method for Approximating Roots page that if $\alpha$ is a root of a function $y = f(x)$ and $x_0$ is an initial approximation to this root, then the tangent line to $f$ at the point $(x_0, f(x_0))$ is given by:

(1)
\begin{align} \quad p_1(x) = f(x_0) + f'(x_0)(x - x_0) \end{align}

We then take the root of this line, which we denote $x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$. Under ideal circumstances, $x_1$ will be a better approximation of the root $\alpha$ than $x_0$ was. We then repeat the process to obtain a sequence of approximations $\{ x_0, x_1, ..., x_n, ... \}$ that, once again under ideal circumstances, converges to the root $\alpha$. The general formula for the $x$-intercept approximations is:

(2)
\begin{align} \quad x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \end{align}
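
As a concrete sketch of this iteration (the function $f(x) = x^2 - 2$, whose root is $\sqrt{2}$, and the tolerance and iteration cap are all illustrative choices, not part of the text):

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until |f(x_n)| < tol."""
    x = x0
    for _ in range(max_iter):
        if abs(f(x)) < tol:
            break
        x = x - f(x) / df(x)
    return x

# Hypothetical example: f(x) = x^2 - 2 with f'(x) = 2x, starting at x0 = 1
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)
print(root)  # converges to sqrt(2) ~ 1.4142135623730951
```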

We will now look at a slightly modified form of Newton's Method in approximating the solutions to a system of two nonlinear equations with two unknowns. Consider the following system of two nonlinear equations of the two variables $x$ and $y$:

(3)
\begin{align} \left\{\begin{matrix} f(x, y) = 0\\ g(x, y) = 0 \end{matrix}\right. \end{align}

Now suppose that a solution $(\alpha, \beta)$ to this system exists and that $(x_0, y_0)$ is an initial approximation to this solution. Note that $z = f(x, y)$ and $z = g(x, y)$ represent surfaces in $\mathbb{R}^3$. The best linear approximation of the surface $z = f(x, y)$ at the point $(x_0, y_0, f(x_0, y_0))$ is the tangent plane that passes through this point. The general equation for this tangent plane is given by:

(4)
\begin{align} \quad p_1(x, y) = f(x_0, y_0) + (x - x_0) \frac{\partial}{\partial x} f(x_0, y_0) + (y - y_0) \frac{\partial}{\partial y} f(x_0, y_0) \end{align}

Provided that $f(x_0, y_0)$ is close enough to $0$ (that is, $(x_0, y_0)$ nearly satisfies $f(x, y) = 0$), the level curve $p_1(x, y) = 0$, which is in fact a straight line in $\mathbb{R}^2$, can be used to approximate the level curve $f(x, y) = 0$ for points $(x, y)$ near $(x_0, y_0)$.

Furthermore, we can apply the same procedure to the surface $z = g(x, y)$. The best linear approximation to this surface at the point $(x_0, y_0, g(x_0, y_0))$ is the tangent plane that passes through this point and is given by the following equation:

(5)
\begin{align} \quad q_1(x, y) = g(x_0, y_0) + (x - x_0) \frac{\partial}{\partial x} g(x_0, y_0) + (y - y_0) \frac{\partial}{\partial y} g(x_0, y_0) \end{align}

Provided that $g(x_0, y_0)$ is close enough to $0$ then the level curve $q_1(x, y) = 0$ (which represents a straight line in $\mathbb{R}^2$) can be used to approximate the level curve $g(x, y) = 0$ for points $(x, y)$ near $(x_0, y_0)$.

Now we can approximate the solution $(\alpha, \beta)$, which lies at the intersection of the curves $f(x, y) = 0$ and $g(x, y) = 0$, with the intersection of the lines $p_1(x, y) = 0$ and $q_1(x, y) = 0$. Let $(x_1, y_1)$ be the solution to the now linear system $\left\{\begin{matrix} p_1(x, y) = 0\\ q_1(x, y) = 0 \end{matrix}\right.$. Then $(x_1, y_1)$ will hopefully be a better approximation to the solution $(\alpha, \beta)$ of the nonlinear system from earlier.

Finding the intersection of $p_1(x, y) = 0$ and $q_1(x, y) = 0$ is simple. We only need to solve the following system of equations:

(6)
\begin{align} \quad f(x_0, y_0) + (x - x_0)\frac{\partial}{\partial x} f(x_0, y_0) + (y - y_0) \frac{\partial}{\partial y} f(x_0, y_0) = 0 \\ \quad g(x_0, y_0) + (x - x_0)\frac{\partial}{\partial x} g(x_0, y_0) + (y - y_0) \frac{\partial}{\partial y} g(x_0, y_0) = 0 \end{align}

This system can be written more compactly using matrices. If we let $x - x_0 = \delta_x$ and $y - y_0 = \delta_y$, then:

(7)
\begin{align} \quad \begin{bmatrix} \frac{\partial}{\partial x}f(x_0, y_0) & \frac{\partial}{\partial y} f(x_0, y_0)\\ \frac{\partial}{\partial x}g(x_0, y_0) & \frac{\partial}{\partial y}g(x_0, y_0) \end{bmatrix} \begin{bmatrix} \delta_x \\ \delta_y \end{bmatrix} = - \begin{bmatrix} f(x_0, y_0)\\ g(x_0, y_0) \end{bmatrix} \end{align}
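
A single step of this procedure can be sketched by solving the $2 \times 2$ linear system for $(\delta_x, \delta_y)$ by Cramer's rule. The system $f(x, y) = x^2 + y^2 - 4 = 0$, $g(x, y) = xy - 1 = 0$ and the starting point $(2, 1)$ below are hypothetical examples, not taken from the text:

```python
def newton_step(f, g, fx, fy, gx, gy, x0, y0):
    """One Newton step: return (x0 + delta_x, y0 + delta_y)."""
    # Jacobian entries evaluated at (x0, y0)
    a, b = fx(x0, y0), fy(x0, y0)
    c, d = gx(x0, y0), gy(x0, y0)
    det = a * d - b * c  # assumed nonzero (invertible Jacobian)
    # Solve [a b; c d][dx; dy] = -[f; g] by Cramer's rule
    dx = (-f(x0, y0) * d + g(x0, y0) * b) / det
    dy = (-a * g(x0, y0) + c * f(x0, y0)) / det
    return x0 + dx, y0 + dy

# Hypothetical system: f(x, y) = x^2 + y^2 - 4, g(x, y) = x*y - 1
f  = lambda x, y: x**2 + y**2 - 4
g  = lambda x, y: x * y - 1
fx = lambda x, y: 2 * x   # partial f / partial x
fy = lambda x, y: 2 * y   # partial f / partial y
gx = lambda x, y: y       # partial g / partial x
gy = lambda x, y: x       # partial g / partial y

x1, y1 = newton_step(f, g, fx, fy, gx, gy, x0=2.0, y0=1.0)
print(x1, y1)  # 2.0 0.5 -- residuals shrink from (1, 1) to (0.25, 0)
```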

The (hopefully) better approximation to the solution $(\alpha, \beta)$ will be $(x_1, y_1)$ where $x_1 = x_0 + \delta_x$ and $y_1 = y_0 + \delta_y$. We can then repeat this process in hopes that the sequence of approximations $(x_0, y_0)$, $(x_1, y_1)$, …, $(x_n, y_n)$, … converges to the solution $(\alpha, \beta)$. More generally, each iteration of this algorithm for $n = 0, 1, 2, ...$ can be computed in matrix form as:

(8)
\begin{align} \quad \begin{bmatrix} \frac{\partial}{\partial x}f(x_n, y_n) & \frac{\partial}{\partial y} f(x_n, y_n)\\ \frac{\partial}{\partial x}g(x_n, y_n) & \frac{\partial}{\partial y}g(x_n, y_n) \end{bmatrix} \begin{bmatrix} \delta_{x, n}\\ \delta_{y, n} \end{bmatrix} = - \begin{bmatrix} f(x_n, y_n)\\ g(x_n, y_n) \end{bmatrix} \end{align}

And each successive approximation is given by $x_{n+1} = x_n + \delta_{x, n}$ and $y_{n+1} = y_n + \delta_{y, n}$.
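
The full iteration can be sketched as follows. The test system $f(x, y) = x^2 + y^2 - 4 = 0$, $g(x, y) = x - y = 0$ (whose positive solution is $(\sqrt{2}, \sqrt{2})$), the starting point, and the stopping tolerance are all illustrative assumptions:

```python
def newton_system(F, J, x0, y0, tol=1e-12, max_iter=50):
    """F(x, y) -> (f, g); J(x, y) -> ((fx, fy), (gx, gy)).

    Repeatedly solves J * (dx, dy) = -(f, g) and updates the iterate."""
    x, y = x0, y0
    for _ in range(max_iter):
        f, g = F(x, y)
        if abs(f) < tol and abs(g) < tol:
            break
        (a, b), (c, d) = J(x, y)
        det = a * d - b * c  # assumed nonzero at each iterate
        # Cramer's rule for the 2x2 linear system
        dx = (-f * d + g * b) / det
        dy = (-a * g + c * f) / det
        x, y = x + dx, y + dy
    return x, y

# Hypothetical system: x^2 + y^2 = 4 and x = y, starting from (1, 2)
x, y = newton_system(
    lambda x, y: (x**2 + y**2 - 4, x - y),
    lambda x, y: ((2 * x, 2 * y), (1, -1)),
    x0=1.0, y0=2.0,
)
print(x, y)  # both converge to sqrt(2) ~ 1.41421356...
```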