# The Jacobian Matrix of Differentiable Functions from Rn to Rm

Recall from The Total Derivative of a Function from Rn to Rm as a Linear Combination of Its Partial Derivatives page that if $S \subseteq \mathbb{R}^n$ is open, $\mathbf{c} \in S$, and $\mathbf{f} : S \to \mathbb{R}^m$ is differentiable at $\mathbf{c}$ with total derivative $\mathbf{T}_{\mathbf{c}}$, then for each $\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + ... + v_n\mathbf{e}_n$ (where $\mathbf{e}_1, \mathbf{e}_2, ..., \mathbf{e}_n$ are the standard basis vectors of $\mathbb{R}^n$) the value of $\mathbf{T}_{\mathbf{c}}$ at $\mathbf{v}$ is a linear combination of the partial derivatives of $\mathbf{f}$ at $\mathbf{c}$:

(1)
\begin{align} \quad \mathbf{T}_{\mathbf{c}} (\mathbf{v}) &= v_1 \mathbf{f}'(\mathbf{c}, \mathbf{e}_1) + v_2 \mathbf{f}'(\mathbf{c}, \mathbf{e}_2) + ... + v_n \mathbf{f}'(\mathbf{c}, \mathbf{e}_n) \\ &= \sum_{i=1}^{n} v_i \mathbf{f}'(\mathbf{c}, \mathbf{e}_i) \end{align}

With this observation, we will be able to determine exactly what the total derivative of $\mathbf{f}$ at $\mathbf{c}$, $\mathbf{T}_{\mathbf{c}}$, will look like.

 Definition: Let $S \subseteq \mathbb{R}^n$ be open, $\mathbf{c} \in S$, and $\mathbf{f} : S \to \mathbb{R}^m$ with $\mathbf{f} = (f_1, f_2, ..., f_m)$. If $\mathbf{f}$ is differentiable at $\mathbf{c}$ then the Jacobian Matrix of $\mathbf{f}$ at $\mathbf{c}$ is denoted $\mathbf{D} \mathbf{f} (\mathbf{c})$ and is the $m \times n$ matrix given by $\displaystyle{\mathbf{D} \mathbf{f} (\mathbf{c}) = \begin{bmatrix} D_1 f_1 (\mathbf{c}) & D_2 f_1 (\mathbf{c}) & \cdots & D_n f_1 (\mathbf{c}) \\ D_1 f_2 (\mathbf{c}) & D_2 f_2 (\mathbf{c}) & \cdots & D_n f_2 (\mathbf{c}) \\ \vdots & \vdots & \ddots & \vdots \\ D_1 f_m (\mathbf{c}) & D_2 f_m (\mathbf{c}) & \cdots & D_n f_m (\mathbf{c}) \\ \end{bmatrix}}$.

Notice that for each $k \in \{ 1, 2, ..., m \}$, the $k^{\mathrm{th}}$ row of the Jacobian matrix $\mathbf{D} \mathbf{f} (\mathbf{c})$ is the row vector $\begin{bmatrix} D_1 f_k (\mathbf{c}) & D_2 f_k (\mathbf{c}) & \cdots & D_n f_k (\mathbf{c}) \end{bmatrix}$, i.e., the $1 \times n$ vector whose entries are the partial derivatives of the real-valued function $f_k : S \to \mathbb{R}$ at $\mathbf{c}$. This means that the $k^{\mathrm{th}}$ row of $\mathbf{D} \mathbf{f} (\mathbf{c})$ is simply the gradient of $f_k$ at $\mathbf{c}$, i.e., $\nabla f_k (\mathbf{c})$. So the formula for the Jacobian matrix of $\mathbf{f}$ at $\mathbf{c}$ can be remembered as $\mathbf{D} \mathbf{f} (\mathbf{c}) = \begin{bmatrix} \nabla f_1(\mathbf{c})\\ \nabla f_2(\mathbf{c})\\ \vdots\\ \nabla f_m(\mathbf{c}) \end{bmatrix}$.

For example, let $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ be defined for all $(x, y) \in \mathbb{R}^2$ by:

(2)
\begin{align} \quad \mathbf{f}(x, y) = (x^2 + y^2, 2xy^2) \end{align}

Then we have that $\mathbf{f} = (f_1, f_2)$ where $f_1, f_2 : \mathbb{R}^2 \to \mathbb{R}$ are given by $f_1(x, y) = x^2 + y^2$ and $f_2(x, y) = 2xy^2$. We have that:

(3)
\begin{align} \quad \nabla f_1(x, y) = \left ( D_1 f_1(x, y), D_2 f_1(x, y) \right ) = (2x, 2y) \end{align}
(4)
\begin{align} \quad \nabla f_2(x, y) = \left ( D_1 f_2(x, y), D_2 f_2(x, y) \right ) = (2y^2, 4xy) \end{align}

Therefore the Jacobian matrix of $\mathbf{f}$ is:

(5)
\begin{align} \quad \mathbf{D} \mathbf{f}(x, y) = \begin{bmatrix} 2x & 2y \\ 2y^2 & 4xy \end{bmatrix} \end{align}
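
As a sanity check on this example, the following is a minimal sketch that compares the partial derivatives of $f_1(x, y) = x^2 + y^2$ and $f_2(x, y) = 2xy^2$, differentiated by hand, against a central-difference approximation. The helper names `jacobian_exact` and `jacobian_numeric`, the step size `h`, and the sample point $\mathbf{c} = (1, 2)$ are assumptions chosen for illustration and do not come from the page itself.

```python
def f(x, y):
    """The example map f(x, y) = (x^2 + y^2, 2xy^2)."""
    return (x**2 + y**2, 2 * x * y**2)


def jacobian_exact(x, y):
    """Rows are the gradients of f_1 and f_2, differentiated by hand."""
    return [[2 * x, 2 * y],
            [2 * y**2, 4 * x * y]]


def jacobian_numeric(func, point, h=1e-6):
    """Approximate each partial derivative D_j f_i by a central difference."""
    n = len(point)
    m = len(func(*point))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus = list(point)
        minus = list(point)
        plus[j] += h    # step forward along e_j
        minus[j] -= h   # step backward along e_j
        f_plus = func(*plus)
        f_minus = func(*minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J


c = (1.0, 2.0)  # sample point, chosen arbitrarily
print(jacobian_exact(*c))       # [[2.0, 4.0], [8.0, 8.0]]
print(jacobian_numeric(f, c))   # agrees with the line above up to rounding
```

Entry $(i, j)$ of the numerical matrix approximates $D_j f_i(\mathbf{c})$, so each row approximates one gradient $\nabla f_i(\mathbf{c})$.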

The importance of the Jacobian matrix of $\mathbf{f}$ at $\mathbf{c}$ is that if $\mathbf{f}$ is differentiable at $\mathbf{c}$ then the matrix of the total derivative of $\mathbf{f}$ at $\mathbf{c}$ (with respect to the standard bases) is precisely the Jacobian matrix of $\mathbf{f}$ at $\mathbf{c}$.

 Theorem 1: Let $S \subseteq \mathbb{R}^n$ be open, $\mathbf{c} \in S$, and $\mathbf{f} : S \to \mathbb{R}^m$. If $\mathbf{f}$ is differentiable at $\mathbf{c}$ then the total derivative of $\mathbf{f}$ at $\mathbf{c}$ is given by the Jacobian matrix of $\mathbf{f}$ at $\mathbf{c}$. That is, $\mathbf{f}'(\mathbf{c}) = \mathbf{D} \mathbf{f}(\mathbf{c})$.
• Proof: Let $\mathbf{f}$ be differentiable at $\mathbf{c}$ and let $\mathbf{f} = (f_1, f_2, ..., f_m)$. Let $\mathbf{e}_1, \mathbf{e}_2, ..., \mathbf{e}_n$ be the unit coordinate vectors in $\mathbb{R}^n$ and let $\mathbf{e}_1', \mathbf{e}_2', ..., \mathbf{e}_m'$ be the unit coordinate vectors in $\mathbb{R}^m$. Then applying the total derivative of $\mathbf{f}$ at $\mathbf{c}$ to $\mathbf{e}_k$ for each $k \in \{ 1, 2, ..., n \}$ gives us:
(6)
\begin{align} \quad \mathbf{f}'(\mathbf{c})(\mathbf{e}_k) &= \mathbf{f}'(\mathbf{c}, \mathbf{e}_k) \\ &= (f_1'(\mathbf{c}, \mathbf{e}_k), f_2'(\mathbf{c}, \mathbf{e}_k), ..., f_m'(\mathbf{c}, \mathbf{e}_k)) \\ &= (D_k f_1 (\mathbf{c}), D_k f_2 (\mathbf{c}), ..., D_k f_m (\mathbf{c})) \\ &= D_k f_1 (\mathbf{c}) (1, 0, 0, ..., 0) + D_k f_2 (\mathbf{c}) (0, 1, 0, ..., 0) + ... + D_k f_m (\mathbf{c}) (0, 0, ..., 0, 1) \\ &= D_k f_1 (\mathbf{c}) \mathbf{e}_1' + D_k f_2 (\mathbf{c}) \mathbf{e}_2' + ... + D_k f_m (\mathbf{c}) \mathbf{e}_m' \\ &= \sum_{i=1}^{m} D_k f_i (\mathbf{c}) \mathbf{e}_i' \end{align}
• But $\mathbf{f}'(\mathbf{c})(\mathbf{e}_k)$ gives us the $k^{\mathrm{th}}$ column of the matrix of the total derivative of $\mathbf{f}$ at $\mathbf{c}$ with respect to the standard bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, and so:
(7)
\begin{align} \quad \mathbf{f}'(\mathbf{c}) = \begin{bmatrix} D_1 f_1 (\mathbf{c}) & D_2 f_1 (\mathbf{c}) & \cdots & D_n f_1 (\mathbf{c}) \\ D_1 f_2 (\mathbf{c}) & D_2 f_2 (\mathbf{c}) & \cdots & D_n f_2 (\mathbf{c}) \\ \vdots & \vdots & \ddots & \vdots \\ D_1 f_m (\mathbf{c}) & D_2 f_m (\mathbf{c}) & \cdots & D_n f_m (\mathbf{c}) \\ \end{bmatrix} = \mathbf{D}\mathbf{f}(\mathbf{c}) \quad \blacksquare \end{align}

For each vector $\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + ... + v_n\mathbf{e}_n$ we can write $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2\\ \vdots \\ v_n \end{bmatrix}$ and so:

(8)
\begin{align} \quad \mathbf{D} \mathbf{f} (\mathbf{c})(\mathbf{v}) = \begin{bmatrix} D_1 f_1 (\mathbf{c}) & D_2 f_1 (\mathbf{c}) & \cdots & D_n f_1 (\mathbf{c}) \\ D_1 f_2 (\mathbf{c}) & D_2 f_2 (\mathbf{c}) & \cdots & D_n f_2 (\mathbf{c}) \\ \vdots & \vdots & \ddots & \vdots \\ D_1 f_m (\mathbf{c}) & D_2 f_m (\mathbf{c}) & \cdots & D_n f_m (\mathbf{c}) \\ \end{bmatrix} \begin{bmatrix} v_1 \\ v_2\\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} v_iD_i f_1 (\mathbf{c}) \\ \sum_{i=1}^{n} v_iD_i f_2 (\mathbf{c}) \\ \vdots \\ \sum_{i=1}^{n} v_iD_i f_m (\mathbf{c}) \end{bmatrix} = \begin{bmatrix} \mathbf{v} \cdot \nabla f_1(\mathbf{c}) \\ \mathbf{v} \cdot \nabla f_2(\mathbf{c}) \\ \vdots \\ \mathbf{v} \cdot \nabla f_m(\mathbf{c}) \end{bmatrix} = \sum_{k=1}^{m} [\mathbf{v} \cdot \nabla f_k(\mathbf{c})] \mathbf{e}_k' \end{align}
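
The two ways of reading this product can be illustrated with a short sketch: row by row as dot products with the gradients, and column by column as the linear combination $\sum_{i=1}^{n} v_i \mathbf{f}'(\mathbf{c}, \mathbf{e}_i)$ from $(1)$. It reuses the example $\mathbf{f}(x, y) = (x^2 + y^2, 2xy^2)$; the point $\mathbf{c} = (1, 2)$, the vector $\mathbf{v} = (3, -1)$, and the function names are assumptions made only for illustration.

```python
def dot(u, w):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, w))


def apply_by_rows(gradients, v):
    """k-th output entry is v . grad f_k(c); gradients are the Jacobian rows."""
    return [dot(v, g) for g in gradients]


def apply_by_columns(columns, v):
    """Sum of v_i times the i-th Jacobian column, i.e. v_i * f'(c, e_i)."""
    m = len(columns[0])
    return [sum(v[i] * columns[i][k] for i in range(len(v))) for k in range(m)]


# Gradients of f_1 = x^2 + y^2 and f_2 = 2xy^2 at c = (1, 2), computed by hand.
grad_f1 = (2.0, 4.0)
grad_f2 = (8.0, 8.0)
rows = [grad_f1, grad_f2]

# Columns of the same matrix: the partial derivatives f'(c, e_1) and f'(c, e_2).
cols = [(2.0, 8.0), (4.0, 8.0)]

v = (3.0, -1.0)
print(apply_by_rows(rows, v))      # [2.0, 16.0]
print(apply_by_columns(cols, v))   # [2.0, 16.0]
```

Both functions compute the same matrix-vector product, which is why the row form (gradients) and the column form (partial derivatives of $\mathbf{f}$) describe the same total derivative.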

Note that if $m = 1$, i.e., $f : S \to \mathbb{R}$ ($f = (f)$) then we have that:

(9)
\begin{align} \quad \mathbf{D} f(\mathbf{c})(\mathbf{v}) = \sum_{i=1}^{n} v_i D_i f(\mathbf{c}) = \mathbf{v} \cdot \nabla f(\mathbf{c}) \quad (*) \end{align}

But we know from the Differentiable Functions from Rn to Rm and Their Total Derivatives page that evaluating the total derivative of $\mathbf{f}$ at $\mathbf{c}$, $\mathbf{T}_{\mathbf{c}} = \mathbf{D} \mathbf{f} (\mathbf{c})$, at a vector $\mathbf{v}$ gives us the directional derivative of $\mathbf{f}$ at $\mathbf{c}$ in the direction of $\mathbf{v}$. Thus by $(*)$, if $f$ is a real-valued function that is differentiable at $\mathbf{c}$, then the directional derivative of $f$ at $\mathbf{c}$ in the direction of $\mathbf{v}$ can be computed simply as the dot product of $\mathbf{v}$ with $\nabla f(\mathbf{c})$, a property we used earlier in one of the proofs on The Gradient of a Differentiable Function from Rn to R page.
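
To make the real-valued case concrete, here is a small sketch that compares $\mathbf{v} \cdot \nabla f(\mathbf{c})$ with a symmetric difference quotient of $f$ along $\mathbf{v}$. It assumes $f(x, y) = x^2 + y^2$ (the first component of the earlier example), the point $\mathbf{c} = (1, 2)$, and the vector $\mathbf{v} = (3, -1)$; the helper `directional_derivative_numeric` is hypothetical, and $\mathbf{v}$ is deliberately not normalized, matching the directional derivative $f'(\mathbf{c}, \mathbf{v})$ used on this page.

```python
def f(x, y):
    """Real-valued example f(x, y) = x^2 + y^2."""
    return x**2 + y**2


def grad_f(x, y):
    """Gradient of f, differentiated by hand."""
    return (2 * x, 2 * y)


def directional_derivative_numeric(func, c, v, h=1e-6):
    """Symmetric difference quotient (f(c + hv) - f(c - hv)) / (2h)."""
    plus = [ci + h * vi for ci, vi in zip(c, v)]
    minus = [ci - h * vi for ci, vi in zip(c, v)]
    return (func(*plus) - func(*minus)) / (2 * h)


c = (1.0, 2.0)
v = (3.0, -1.0)

# v . grad f(c) = 3 * 2 + (-1) * 4
exact = sum(vi * gi for vi, gi in zip(v, grad_f(*c)))
approx = directional_derivative_numeric(f, c, v)

print(exact)    # 2.0
print(approx)   # close to 2.0, up to floating-point rounding
```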