The Jacobian Matrix of Differentiable Functions from Rn to Rm
Recall from the page The Total Derivative of a Function from Rn to Rm as a Linear Combination of Its Partial Derivatives that if $S \subseteq \mathbb{R}^n$ is open, $\mathbf{c} \in S$, and $\mathbf{f} : S \to \mathbb{R}^m$ is differentiable at $\mathbf{c}$ with total derivative $\mathbf{T}_{\mathbf{c}}$, then for each $\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + ... + v_n\mathbf{e}_n$ (where $\mathbf{e}_1, \mathbf{e}_2, ..., \mathbf{e}_n$ are the standard basis vectors of $\mathbb{R}^n$), the total derivative of $\mathbf{f}$ at $\mathbf{c}$ evaluated at $\mathbf{v}$ can be expressed as a linear combination of the partial derivatives of $\mathbf{f}$ at $\mathbf{c}$:
\begin{equation} \mathbf{T}_{\mathbf{c}}(\mathbf{v}) = \sum_{k=1}^{n} v_k D_k \mathbf{f}(\mathbf{c}) \tag{1} \end{equation}
With this observation, we can determine exactly what the total derivative of $\mathbf{f}$ at $\mathbf{c}$, $\mathbf{T}_{\mathbf{c}}$, looks like.
Definition: Let $S \subseteq \mathbb{R}^n$ be open, $\mathbf{c} \in S$, and $\mathbf{f} : S \to \mathbb{R}^m$ with $\mathbf{f} = (f_1, f_2, ..., f_m)$. If $\mathbf{f}$ is differentiable at $\mathbf{c}$ then the Jacobian Matrix of $\mathbf{f}$ at $\mathbf{c}$ is the $m \times n$ matrix denoted $\mathbf{D} \mathbf{f} (\mathbf{c})$ and given by $\displaystyle{\mathbf{D} \mathbf{f} (\mathbf{c}) = \begin{bmatrix} D_1 f_1 (\mathbf{c}) & D_2 f_1 (\mathbf{c}) & \cdots & D_n f_1 (\mathbf{c}) \\ D_1 f_2 (\mathbf{c}) & D_2 f_2 (\mathbf{c}) & \cdots & D_n f_2 (\mathbf{c}) \\ \vdots & \vdots & \ddots & \vdots \\ D_1 f_m (\mathbf{c}) & D_2 f_m (\mathbf{c}) & \cdots & D_n f_m (\mathbf{c}) \\ \end{bmatrix}}$.
Notice that for each $k \in \{ 1, 2, ..., m \}$, the $k^{\mathrm{th}}$ row of the Jacobian matrix $\mathbf{D} \mathbf{f} (\mathbf{c})$ is the row vector $\begin{bmatrix} D_1 f_k (\mathbf{c}) & D_2 f_k (\mathbf{c}) & \cdots & D_n f_k (\mathbf{c}) \end{bmatrix}$, i.e., the $1 \times n$ vector whose entries are the partial derivatives of the real-valued function $f_k : S \to \mathbb{R}$ at $\mathbf{c}$. This means that the $k^{\mathrm{th}}$ row of $\mathbf{D} \mathbf{f} (\mathbf{c})$ is simply the gradient of $f_k$ at $\mathbf{c}$, i.e., $\nabla f_k (\mathbf{c})$. So the formula for the Jacobian matrix of $\mathbf{f}$ at $\mathbf{c}$ can be remembered as $\mathbf{D} \mathbf{f} (\mathbf{c}) = \begin{bmatrix} \nabla f_1(\mathbf{c})\\ \nabla f_2(\mathbf{c})\\ \vdots\\ \nabla f_m(\mathbf{c}) \end{bmatrix}$.
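The row-by-row description above can be sketched numerically. The following is a minimal illustration, not a definitive implementation: the helper names `numerical_gradient` and `jacobian_from_gradients`, the step size `h`, and the map from $\mathbb{R}^3$ to $\mathbb{R}^2$ with components $g_1(x, y, z) = xy$ and $g_2(x, y, z) = y + z^2$ are all assumptions introduced here. Each gradient is estimated with central differences and the gradients are stacked as rows.

```python
# Sketch: build a Jacobian by stacking numerically estimated gradients.
# All names below are illustrative and are not taken from the text.

def numerical_gradient(g, c, h=1e-5):
    """Central-difference estimate of the gradient of g: R^n -> R at c."""
    grad = []
    for j in range(len(c)):
        forward = list(c)
        backward = list(c)
        forward[j] += h    # step forward in the j-th coordinate
        backward[j] -= h   # step backward in the j-th coordinate
        grad.append((g(forward) - g(backward)) / (2 * h))
    return grad


def jacobian_from_gradients(components, c, h=1e-5):
    """Row k of the result approximates the gradient of components[k] at c."""
    return [numerical_gradient(g, c, h) for g in components]


# Hypothetical component functions of a map from R^3 to R^2.
def g1(p):
    x, y, z = p
    return x * y


def g2(p):
    x, y, z = p
    return y + z ** 2


J = jacobian_from_gradients([g1, g2], [1.0, 2.0, 3.0])
for row in J:
    print([round(entry, 6) for entry in row])
```

For this hypothetical map the exact Jacobian at $(1, 2, 3)$ is $\begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 6 \end{bmatrix}$, so the printed rows should be close to those values, with two rows (one per component) and three columns (one per variable).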
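For example, let $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ be defined for all $(x, y) \in \mathbb{R}^2$ by:
\begin{equation} \mathbf{f}(x, y) = (x^2 + y^2, 2xy^2) \tag{2} \end{equation}
Then we have that $\mathbf{f} = (f_1, f_2)$ where $f_1, f_2 : \mathbb{R}^2 \to \mathbb{R}$ are given by $f_1(x, y) = x^2 + y^2$ and $f_2(x, y) = 2xy^2$. We have that:
\begin{align} \nabla f_1(x, y) &= (2x, 2y) \tag{3} \\ \nabla f_2(x, y) &= (2y^2, 4xy) \tag{4} \end{align}
Therefore the Jacobian matrix of $\mathbf{f}$ is:
\begin{equation} \mathbf{D} \mathbf{f}(x, y) = \begin{bmatrix} 2x & 2y \\ 2y^2 & 4xy \end{bmatrix} \tag{5} \end{equation}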
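As a quick check at a specific point (the point $(1, 2)$ is chosen here purely for illustration), the partial derivatives $D_1 f_1 = 2x$, $D_2 f_1 = 2y$, $D_1 f_2 = 2y^2$, and $D_2 f_2 = 4xy$ evaluate to:
\begin{equation} \mathbf{D} \mathbf{f}(1, 2) = \begin{bmatrix} 2(1) & 2(2) \\ 2(2)^2 & 4(1)(2) \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 8 & 8 \end{bmatrix} \end{equation}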
The importance of the Jacobian matrix of $\mathbf{f}$ at $\mathbf{c}$ is that if $\mathbf{f}$ is differentiable at $\mathbf{c}$, then the total derivative of $\mathbf{f}$ at $\mathbf{c}$ is exactly the linear map given by the Jacobian matrix of $\mathbf{f}$ at $\mathbf{c}$.
Theorem 1: Let $S \subseteq \mathbb{R}^n$ be open, $\mathbf{c} \in S$, and $\mathbf{f} : S \to \mathbb{R}^m$. If $\mathbf{f}$ is differentiable at $\mathbf{c}$ then the total derivative of $\mathbf{f}$ at $\mathbf{c}$ is given by the Jacobian matrix of $\mathbf{f}$ at $\mathbf{c}$. That is, $\mathbf{f}'(\mathbf{c}) = \mathbf{D} \mathbf{f}(\mathbf{c})$.
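- Proof: Let $\mathbf{f}$ be differentiable at $\mathbf{c}$ and let $\mathbf{f} = (f_1, f_2, ..., f_m)$. Let $\mathbf{e}_1, \mathbf{e}_2, ..., \mathbf{e}_n$ be the unit coordinate vectors in $\mathbb{R}^n$ and let $\mathbf{e}_1', \mathbf{e}_2', ..., \mathbf{e}_m'$ be the unit coordinate vectors in $\mathbb{R}^m$. Then applying the total derivative of $\mathbf{f}$ at $\mathbf{c}$ to $\mathbf{e}_k$ for each $k \in \{ 1, 2, ..., n \}$ gives us:
\begin{equation} \mathbf{f}'(\mathbf{c})(\mathbf{e}_k) = D_k \mathbf{f}(\mathbf{c}) = \sum_{j=1}^{m} D_k f_j(\mathbf{c}) \mathbf{e}_j' \tag{6} \end{equation}
- But $\mathbf{f}'(\mathbf{c})(\mathbf{e}_k)$ gives us the $k^{\mathrm{th}}$ column of the matrix of the total derivative of $\mathbf{f}$ at $\mathbf{c}$ with respect to the standard bases, and so:
\begin{equation} \mathbf{f}'(\mathbf{c}) = \begin{bmatrix} D_1 f_1 (\mathbf{c}) & D_2 f_1 (\mathbf{c}) & \cdots & D_n f_1 (\mathbf{c}) \\ D_1 f_2 (\mathbf{c}) & D_2 f_2 (\mathbf{c}) & \cdots & D_n f_2 (\mathbf{c}) \\ \vdots & \vdots & \ddots & \vdots \\ D_1 f_m (\mathbf{c}) & D_2 f_m (\mathbf{c}) & \cdots & D_n f_m (\mathbf{c}) \end{bmatrix} = \mathbf{D} \mathbf{f}(\mathbf{c}) \tag{7} \end{equation}
- For each vector $\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + ... + v_n\mathbf{e}_n$ we can write $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2\\ \vdots \\ v_n \end{bmatrix}$ and so:
\begin{equation} \mathbf{f}'(\mathbf{c})(\mathbf{v}) = \mathbf{D} \mathbf{f}(\mathbf{c}) \mathbf{v} = \begin{bmatrix} D_1 f_1 (\mathbf{c}) & D_2 f_1 (\mathbf{c}) & \cdots & D_n f_1 (\mathbf{c}) \\ D_1 f_2 (\mathbf{c}) & D_2 f_2 (\mathbf{c}) & \cdots & D_n f_2 (\mathbf{c}) \\ \vdots & \vdots & \ddots & \vdots \\ D_1 f_m (\mathbf{c}) & D_2 f_m (\mathbf{c}) & \cdots & D_n f_m (\mathbf{c}) \end{bmatrix} \begin{bmatrix} v_1 \\ v_2\\ \vdots \\ v_n \end{bmatrix} \tag{8} \end{equation}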
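The conclusion that $\mathbf{f}'(\mathbf{c})(\mathbf{v})$ is the matrix-vector product $\mathbf{D} \mathbf{f}(\mathbf{c}) \mathbf{v}$ can be checked numerically. The sketch below uses the example map with $f_1(x, y) = x^2 + y^2$ and $f_2(x, y) = 2xy^2$; the point $\mathbf{c} = (1, 2)$, the direction $\mathbf{v} = (3, 4)$, the step size `h`, and the helper names are assumptions chosen only for illustration. It compares $\mathbf{D} \mathbf{f}(\mathbf{c}) \mathbf{v}$ with a central difference quotient of $\mathbf{f}$ along $\mathbf{v}$.

```python
# Sketch: compare Df(c) v with a difference quotient of f along v.
# The point c, direction v, and step h are arbitrary illustrative choices.

def f(x, y):
    """The example map f(x, y) = (x^2 + y^2, 2xy^2)."""
    return (x ** 2 + y ** 2, 2 * x * y ** 2)


def jacobian(x, y):
    """Analytic Jacobian of f: rows are the gradients of f_1 and f_2."""
    return [[2 * x, 2 * y],
            [2 * y ** 2, 4 * x * y]]


def mat_vec(A, v):
    """Matrix-vector product A v for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def directional_quotient(c, v, h=1e-5):
    """Central difference quotient (f(c + hv) - f(c - hv)) / (2h)."""
    plus = f(c[0] + h * v[0], c[1] + h * v[1])
    minus = f(c[0] - h * v[0], c[1] - h * v[1])
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]


c = (1.0, 2.0)
v = (3.0, 4.0)
linear = mat_vec(jacobian(*c), v)       # Df(c) v
quotient = directional_quotient(c, v)   # numerical estimate along v
print(linear)
print(quotient)
```

By hand, $\mathbf{D} \mathbf{f}(1, 2) \mathbf{v} = (2 \cdot 3 + 4 \cdot 4, \; 8 \cdot 3 + 8 \cdot 4) = (22, 56)$, and the difference quotient should agree with this up to a small numerical error.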
Note that if $m = 1$, i.e., $\mathbf{f} = (f)$ for a real-valued function $f : S \to \mathbb{R}$, then the Jacobian matrix has the single row $\nabla f(\mathbf{c})$ and we have that:
\begin{equation} f'(\mathbf{c})(\mathbf{v}) = \begin{bmatrix} D_1 f (\mathbf{c}) & D_2 f (\mathbf{c}) & \cdots & D_n f (\mathbf{c}) \end{bmatrix} \begin{bmatrix} v_1 \\ v_2\\ \vdots \\ v_n \end{bmatrix} = \nabla f(\mathbf{c}) \cdot \mathbf{v} \tag{9} \end{equation}
But we know from the Differentiable Functions from Rn to Rm and Their Total Derivatives page that plugging a vector $\mathbf{v}$ into the total derivative of $\mathbf{f}$ at $\mathbf{c}$, $\mathbf{T}_{\mathbf{c}} = \mathbf{D} \mathbf{f} (\mathbf{c})$, gives us the directional derivative of $\mathbf{f}$ at $\mathbf{c}$ in the direction of $\mathbf{v}$. Thus by $(9)$, if $f$ is a real-valued function, the directional derivative of $f$ at $\mathbf{c}$ in the direction of $\mathbf{v}$ can be computed simply as the dot product of $\mathbf{v}$ with $\nabla f(\mathbf{c})$, a property we used earlier in one of the proofs on The Gradient of a Differentiable Function from Rn to R page.
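To illustrate with a hypothetical choice of function, point, and direction, take the real-valued function $f(x, y) = x^2 + y^2$, $\mathbf{c} = (1, 2)$, and $\mathbf{v} = (3, 4)$. Since $\nabla f(x, y) = (2x, 2y)$, the dot product formula gives:
\begin{equation} f'(\mathbf{c})(\mathbf{v}) = \nabla f(1, 2) \cdot (3, 4) = (2, 4) \cdot (3, 4) = 6 + 16 = 22 \end{equation}
This agrees with differentiating $g(t) = f(1 + 3t, 2 + 4t) = (1 + 3t)^2 + (2 + 4t)^2$ directly at $t = 0$, since $g'(t) = 6(1 + 3t) + 8(2 + 4t)$ and so $g'(0) = 6 + 16 = 22$.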