# The Best Approximation of a Function from an Orthonormal System

Recall from the Orthogonal and Orthonormal Systems of Functions page that if $I$ is an interval in $\mathbb{R}$ then a collection of functions $\{ \varphi_0(x), \varphi_1(x), ... \}$ in $L^2(I)$ is said to be an orthogonal system of functions on $I$ if $(\varphi_i(x), \varphi_j(x)) = 0$ for all $i, j \in \{ 0, 1, ... \}$ such that $i \neq j$. Moreover, we said that this collection of functions is an orthonormal system of functions on $I$ if we further have that $(\varphi_i(x), \varphi_i(x)) = 1$ for all $i$.

Now let $\mathcal S = \{ \varphi_0(x), \varphi_1(x), ... \}$ be an orthonormal system of functions on $I$ and let $f \in L^2(I)$. Then we may attempt to approximate $f$ as a linear combination of functions in $\mathcal S$. For some $b_0, b_1, ..., b_n \in \mathbb{C}$, consider the following function:

(1)
\begin{align} \quad t_n(x) = \sum_{k=0}^{n} b_k \varphi_k(x) \end{align}

Since we are working in the space $L^2(I)$ of square Lebesgue integrable functions on $I$, we will use the $L^2$-norm as a measure of error; that is, we will say that $t_n$ approximates $f$ more accurately when $\| f(x) - t_n(x) \|$ is small.
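As a quick numerical illustration of this error measure (a sketch only, assuming NumPy and simple trapezoidal quadrature; the interval $I = [-\pi, \pi]$, the function $f(x) = x$, and the approximant $t_1(x) = 2\sin x$ are illustrative choices, not part of the discussion above):

```python
import numpy as np

# Discretize the interval I = [-pi, pi].
x = np.linspace(-np.pi, np.pi, 20001)
dx = x[1] - x[0]

def l2_norm(u):
    """Approximate the L^2 norm (integral of |u|^2 dx)^(1/2) by the trapezoidal rule."""
    w = np.abs(u) ** 2
    return np.sqrt(np.sum((w[1:] + w[:-1]) / 2) * dx)

# Illustrative choice: f(x) = x and the approximant t_1(x) = 2 sin(x),
# the first partial Fourier sine sum of x on [-pi, pi].
f = x
t1 = 2 * np.sin(x)
error = l2_norm(f - t1)
```

The smaller this quantity, the better $t_1$ approximates $f$ in the $L^2$ sense.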

Now suppose that $f$ is identically a finite linear combination of functions in $\mathcal S$. Then there exist $c_0, c_1, ..., c_n \in \mathbb{C}$ such that $\displaystyle{f(x) = \sum_{k=0}^{n} c_k \varphi_k(x)}$, and if $b_k = c_k$ for each $k$ then $\| f(x) - t_n(x) \| = 0$. To determine what the values of $c_0, c_1, ..., c_n$ must be, consider the inner product $(f, \varphi_m)$ for each $m \in \{ 0, 1, ..., n \}$:

(2)
\begin{align} \quad (f(x), \varphi_m(x)) &= \left ( \sum_{k=0}^n c_k \varphi_k(x), \varphi_m(x) \right ) \\ \quad &= \left ( c_0\varphi_0(x) + c_1\varphi_1(x) + ... + c_n\varphi_n(x), \varphi_m(x) \right ) \\ \quad &= (c_0\varphi_0(x), \varphi_m(x)) + (c_1\varphi_1(x), \varphi_m(x)) + ... + (c_n\varphi_n(x), \varphi_m(x)) \\ \quad &= c_0\underbrace{(\varphi_0(x), \varphi_m(x))}_{=0} + c_1\underbrace{(\varphi_1(x), \varphi_m(x))}_{=0} + ... + c_m\underbrace{(\varphi_m(x), \varphi_m(x))}_{=1} + ... + c_n\underbrace{(\varphi_n(x), \varphi_m(x))}_{=0} \\ &= c_m \end{align}

So if $f$ is a finite linear combination of functions in $\mathcal S$, say of $\varphi_0, \varphi_1, ..., \varphi_n$, then the coefficients of this linear combination are given as $c_m = (f(x), \varphi_m(x))$ for each $m \in \{ 0, 1, ..., n \}$.
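This recovery of the coefficients can be checked numerically. The sketch below (assuming NumPy; the orthonormal system $\varphi_k(x) = \sin(kx)/\sqrt{\pi}$ on $[-\pi, \pi]$ and the coefficient values are illustrative choices) builds $f$ as a known finite linear combination and recovers each $c_m$ as the inner product $(f, \varphi_m)$:

```python
import numpy as np

# Discretize I = [-pi, pi].
x = np.linspace(-np.pi, np.pi, 40001)
dx = x[1] - x[0]

def inner(u, v):
    """Trapezoidal approximation of the L^2 inner product (u, v) = integral of u * conj(v) dx."""
    w = u * np.conj(v)
    return np.sum((w[1:] + w[:-1]) / 2) * dx

# Orthonormal system on [-pi, pi]: phi_k(x) = sin(k x)/sqrt(pi), k = 1, 2, 3.
phis = [np.sin(k * x) / np.sqrt(np.pi) for k in (1, 2, 3)]

# Build f as a finite linear combination with known coefficients.
true_c = [2.0, -1.5, 0.5]
f = sum(ck * phi for ck, phi in zip(true_c, phis))

# Recover each coefficient as c_m = (f, phi_m).
recovered = [inner(f, phi).real for phi in phis]
```

Up to quadrature error, the recovered values agree with the coefficients used to build $f$.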

The following theorem shows that choosing the coefficients $c_m$ as described above is always the best way to approximate $f$ as a linear combination of the functions $\varphi_0, \varphi_1, ..., \varphi_n$ regardless of whether or not $f$ is identically a linear combination of such functions.

 Theorem 1: Let $\mathcal S = \{ \varphi_0(x), \varphi_1(x), ... \}$ be an orthonormal system of functions on $I$ in $L^2(I)$ and let $f \in L^2(I)$. Let $b_0, b_1, ..., b_n \in \mathbb{C}$, and let $c_k = (f, \varphi_k)$ for each $k \in \{ 0, 1, ..., n \}$. Define $\displaystyle{s_n(x) = \sum_{k=0}^n c_k \varphi_k(x)}$ and $\displaystyle{t_n(x) = \sum_{k=0}^{n} b_k \varphi_k(x)}$. Then for each $n \in \mathbb{N}$: a) $\| f(x) - s_n(x) \| \leq \| f(x) - t_n(x) \|$. b) $\| f(x) - s_n(x) \| = \| f(x) - t_n(x) \|$ if and only if $b_k = c_k$ for all $k \in \{ 0, 1, ..., n \}$.
• Proof of a) and b): To carry out this proof we will expand $\| f(x) - t_n(x) \|^2$. We have that:
(3)
\begin{align} \quad \| f(x) - t_n(x) \|^2 &= (f(x) - t_n(x), f(x) - t_n(x)) \\ &= (f(x), f(x) - t_n(x)) - (t_n(x), f(x) - t_n(x)) \\ &= (f(x), f(x)) - (f(x), t_n(x)) - (t_n(x), f(x)) + (t_n(x), t_n(x)) \\ &= \| f(x) \|^2 - (f(x), t_n(x)) - \overline{(f(x), t_n(x))} + \| t_n(x) \|^2 \quad (\dagger) \end{align}
• The term $(f(x), t_n(x))$ is given by:
(4)
\begin{align} \quad (f(x), t_n(x)) &= \left ( f(x), \sum_{k=0}^{n} b_k \varphi_k(x) \right ) = \overline{b_0} (f(x), \varphi_0(x)) + \overline{b_1} (f(x), \varphi_1(x)) + ... + \overline{b_n} (f(x), \varphi_n(x)) = \overline{b_0}c_0 + \overline{b_1}c_1+ ... + \overline{b_n}c_n = \sum_{k=0}^{n} \overline{b_k} c_k \quad (*) \end{align}
• While the term $\overline{(f(x), t_n(x))}$ is given by:
(5)
\begin{align} \quad \overline{(f(x), t_n(x))} &= \overline{\sum_{k=0}^{n} \overline{b_k} c_k} = \sum_{k=0}^{n} b_k\overline{c_k} \quad (**) \end{align}
• Lastly, the term $\| t_n(x) \|^2$ is given by:
(6)
\begin{align} \quad \| t_n(x) \|^2 &= (t_n(x), t_n(x)) = \left ( \sum_{k=0}^{n} b_k \varphi_k(x), \sum_{j=0}^{n} b_j \varphi_j(x) \right ) = \sum_{k=0}^n \sum_{j=0}^n b_k \overline{b_j} (\varphi_k(x), \varphi_j(x)) = \sum_{k=0}^{n} | b_k |^2 (\varphi_k(x), \varphi_k(x)) = \sum_{k=0}^{n} | b_k |^2 \quad (***) \end{align}
• Using $(*)$, $(**)$, and $(***)$ at $(\dagger)$ we see that:
(7)
\begin{align} \quad \| f(x) - t_n(x) \|^2 = \| f(x) \|^2 - \sum_{k=0}^{n} \overline{b_k} c_k - \sum_{k=0}^{n} b_k \overline{c_k} + \sum_{k=0}^{n} | b_k |^2 \quad (\dagger \dagger) \end{align}
• Now consider the following identity:
(8)
\begin{align} \quad \sum_{k=0}^{n} | b_k - c_k |^2 = \sum_{k=0}^{n} (b_k - c_k)(\overline{b_k} - \overline{c_k}) &= \sum_{k=0}^{n} \left ( | b_k |^2 - b_k\overline{c_k} - \overline{b_k}c_k + | c_k |^2 \right ) \\ &= \sum_{k=0}^{n} | b_k |^2 - \sum_{k=0}^{n} b_k\overline{c_k} - \sum_{k=0}^{n} \overline{b_k}c_k + \sum_{k=0}^{n} | c_k |^2 \quad (****) \end{align}
• Using $(****)$ at $(\dagger \dagger)$ yields:
(9)
\begin{align} \quad \| f(x) - t_n(x) \|^2 = \| f(x) \|^2 + \sum_{k=0}^{n} | b_k - c_k |^2 - \sum_{k=0}^{n} | c_k |^2 \end{align}
• The right-hand side depends on the $b_k$ only through the term $\sum_{k=0}^{n} | b_k - c_k |^2 \geq 0$, which vanishes if and only if $b_k = c_k$ for all $k \in \{0, 1, ..., n \}$. Hence $\| f(x) - t_n(x) \|^2$ is minimized uniquely by taking $b_k = c_k$, in which case $t_n = s_n$, which proves both a) and b):
(10)
\begin{align} \quad \| f(x) - s_n(x) \| \leq \| f(x) - t_n(x) \| \quad \blacksquare \end{align}
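Theorem 1 can also be observed numerically. In the sketch below (assuming NumPy; the system $\varphi_k(x) = \sin(kx)/\sqrt{\pi}$ on $[-\pi, \pi]$ and $f(x) = x$ are illustrative choices, and the $b_k$ are random perturbations of the $c_k$), the coefficients $c_k = (f, \varphi_k)$ give an error no larger than any other choice of coefficients:

```python
import numpy as np

# Discretize I = [-pi, pi].
x = np.linspace(-np.pi, np.pi, 40001)
dx = x[1] - x[0]

def inner(u, v):
    """Trapezoidal approximation of (u, v) = integral of u * conj(v) dx."""
    w = u * np.conj(v)
    return np.sum((w[1:] + w[:-1]) / 2) * dx

def norm(u):
    """The L^2 norm ||u|| = (u, u)^(1/2)."""
    return np.sqrt(inner(u, u).real)

# Orthonormal system, and a target f that is NOT a finite combination of it.
phis = [np.sin(k * x) / np.sqrt(np.pi) for k in (1, 2, 3)]
f = x

# Best approximation s_n uses the coefficients c_k = (f, phi_k).
c = [inner(f, phi).real for phi in phis]
s_n = sum(ck * phi for ck, phi in zip(c, phis))

# Any other coefficients b_k give a t_n with error at least as large.
rng = np.random.default_rng(0)
b = [ck + rng.normal(scale=0.5) for ck in c]
t_n = sum(bk * phi for bk, phi in zip(b, phis))

best_err = norm(f - s_n)
other_err = norm(f - t_n)
```

The identity of equation (9) can likewise be checked here: up to quadrature error, $\| f - t_n \|^2 = \| f \|^2 + \sum_k | b_k - c_k |^2 - \sum_k | c_k |^2$.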