The Best Approximation of a Function from an Orthonormal System
Recall from the Orthogonal and Orthonormal Systems of Functions page that if $I$ is an interval in $\mathbb{R}$ then a collection of functions $\{ \varphi_0(x), \varphi_1(x), ... \}$ in $L^2(I)$ is said to be an orthogonal system of functions on $I$ if $(\varphi_i(x), \varphi_j(x)) = 0$ for all $i, j \in \{ 0, 1, ... \}$ such that $i \neq j$. Moreover, we said that this collection of functions is an orthonormal system of functions on $I$ if we further have that $(\varphi_i(x), \varphi_i(x)) = 1$ for all $i$.
Now let $\mathcal S = \{ \varphi_0(x), \varphi_1(x), ... \}$ be an orthonormal system of functions on $I$ and let $f \in L^2(I)$. Then we may attempt to approximate $f$ as a linear combination of functions in $\mathcal S$. For some $b_0, b_1, ..., b_n \in \mathbb{C}$, consider the following function:
$$t_n(x) = \sum_{k=0}^{n} b_k \varphi_k(x) \tag{1}$$
Since we are working in the space $L^2(I)$ of square Lebesgue integrable functions on $I$, we will use the $L^2$-norm as a measure of error, that is, we will say that $t_n$ approximates $f$ more accurately the smaller $\| f(x) - t_n(x) \|$ is.
Now suppose that $f$ is identically a finite linear combination of functions in $\mathcal S$. Then there exist $c_0, c_1, ..., c_n \in \mathbb{C}$ such that $\displaystyle{f(x) = \sum_{k=0}^{n} c_k \varphi_k(x)}$. If $b_k = c_k$ for each $k \in \{ 0, 1, ..., n \}$ then $t_n = f$ and so $\| f(x) - t_n(x) \| = 0$. To determine what the values of $c_0, c_1, ..., c_n$ must be, consider the inner product $(f, \varphi_m)$ for each $m \in \{ 0, 1, ..., n \}$:
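$$(f, \varphi_m) = \left ( \sum_{k=0}^{n} c_k \varphi_k, \varphi_m \right ) = \sum_{k=0}^{n} c_k (\varphi_k, \varphi_m) = c_m \tag{2}$$
The last equality holds because $\mathcal S$ is orthonormal, so $(\varphi_k, \varphi_m) = 0$ when $k \neq m$ and $(\varphi_m, \varphi_m) = 1$. So if $f$ is a finite linear combination of functions in $\mathcal S$, say of $\varphi_0, \varphi_1, ..., \varphi_n$, then the coefficients of this linear combination are given by $c_m = (f(x), \varphi_m(x))$ for each $m \in \{ 0, 1, ..., n \}$.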
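The coefficient formula $c_m = (f, \varphi_m)$ can be checked numerically. The sketch below is only an illustration under assumptions that are not part of this page: it takes $I = [0, 1]$, the orthonormal system $\varphi_k(x) = e^{2 \pi i k x}$, the inner product $(f, g) = \int_I f(x) \overline{g(x)} \, dx$ approximated by a midpoint rule, and arbitrarily chosen coefficients. The names `phi`, `inner`, `coeffs`, and `recovered` are hypothetical.

```python
import cmath

def phi(k, x):
    """Assumed orthonormal system on I = [0, 1]: phi_k(x) = exp(2*pi*i*k*x)."""
    return cmath.exp(2j * cmath.pi * k * x)

def inner(f, g, samples=2000):
    """Midpoint-rule approximation of (f, g) = integral over [0, 1] of f * conj(g)."""
    h = 1.0 / samples
    total = 0j
    for j in range(samples):
        x = (j + 0.5) * h
        total += f(x) * g(x).conjugate()
    return total * h

# Arbitrary illustrative coefficients c_0, c_1, c_2 (an assumption of this sketch).
coeffs = [2 + 0j, 1 - 3j, 0.5j]

def f(x):
    """f is exactly a finite linear combination of phi_0, phi_1, phi_2."""
    return sum(c * phi(k, x) for k, c in enumerate(coeffs))

# Recover each coefficient as the inner product (f, phi_m).
recovered = [inner(f, lambda x, m=m: phi(m, x)) for m in range(len(coeffs))]

for m, (c, r) in enumerate(zip(coeffs, recovered)):
    print(f"m = {m}: chosen c_m = {c}, recovered (f, phi_m) = {r}")
```

Up to rounding and quadrature error, the recovered values should agree with the coefficients used to build $f$.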
The following theorem shows that choosing the coefficients $c_m$ as described above is always the best way to approximate $f$ as a linear combination of the functions $\varphi_0, \varphi_1, ..., \varphi_n$ regardless of whether or not $f$ is identically a linear combination of such functions.
Theorem 1: Let $\mathcal S = \{ \varphi_0(x), \varphi_1(x), ... \}$ be an orthonormal system of functions on $I$ in $L^2(I)$ and let $f \in L^2(I)$. Let $b_0, b_1, ..., b_n \in \mathbb{C}$, and let $c_k = (f, \varphi_k)$ for each $k \in \{ 0, 1, ..., n \}$. Define $\displaystyle{s_n(x) = \sum_{k=0}^n c_k \varphi_k(x)}$ and $\displaystyle{t_n(x) = \sum_{k=0}^{n} b_k \varphi_k(x)}$. Then for each $n \in \mathbb{N}$: a) $\| f(x) - s_n(x) \| \leq \| f(x) - t_n(x) \|$. b) $\| f(x) - s_n(x) \| = \| f(x) - t_n(x) \|$ if and only if $b_k = c_k$ for all $k \in \{ 0, 1, ..., n \}$.
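- Proof of a) and b): To carry out this proof we will expand $\| f(x) - t_n(x) \|^2$. We have that:
$$\| f - t_n \|^2 = (f - t_n, f - t_n) = (f, f) - (f, t_n) - (t_n, f) + (t_n, t_n) = \| f \|^2 - (f, t_n) - \overline{(f, t_n)} + \| t_n \|^2 \qquad (\dagger)$$
- The term $(f(x), t_n(x))$ is given by (using that the inner product is conjugate-linear in its second argument):
$$(f, t_n) = \left ( f, \sum_{k=0}^{n} b_k \varphi_k \right ) = \sum_{k=0}^{n} \overline{b_k} (f, \varphi_k) = \sum_{k=0}^{n} \overline{b_k} c_k \qquad (*)$$
- While the term $\overline{(f(x), t_n(x))}$ is given by:
$$\overline{(f, t_n)} = \overline{\sum_{k=0}^{n} \overline{b_k} c_k} = \sum_{k=0}^{n} b_k \overline{c_k} \qquad (**)$$
- Lastly, the term $\| t_n(x) \|^2$ is given by (using the orthonormality of $\mathcal S$):
$$\| t_n \|^2 = \left ( \sum_{k=0}^{n} b_k \varphi_k, \sum_{j=0}^{n} b_j \varphi_j \right ) = \sum_{k=0}^{n} \sum_{j=0}^{n} b_k \overline{b_j} (\varphi_k, \varphi_j) = \sum_{k=0}^{n} | b_k |^2 \qquad (***)$$
- Using $(*)$, $(**)$, and $(***)$ at $(\dagger)$ we see that:
$$\| f - t_n \|^2 = \| f \|^2 - \sum_{k=0}^{n} \overline{b_k} c_k - \sum_{k=0}^{n} b_k \overline{c_k} + \sum_{k=0}^{n} | b_k |^2 \qquad (\dagger \dagger)$$
- Now consider the following identity:
$$\sum_{k=0}^{n} | b_k - c_k |^2 = \sum_{k=0}^{n} (b_k - c_k) \overline{(b_k - c_k)} = \sum_{k=0}^{n} | b_k |^2 - \sum_{k=0}^{n} b_k \overline{c_k} - \sum_{k=0}^{n} \overline{b_k} c_k + \sum_{k=0}^{n} | c_k |^2 \qquad (****)$$
- Using $(****)$ at $(\dagger \dagger)$ yields:
$$\| f - t_n \|^2 = \| f \|^2 - \sum_{k=0}^{n} | c_k |^2 + \sum_{k=0}^{n} | b_k - c_k |^2$$
- Only the last sum depends on $b_0, b_1, ..., b_n$, and it is a sum of nonnegative real numbers, so the right-hand side is minimized uniquely when $b_k = c_k$ for all $k \in \{0, 1, ..., n \}$, in which case $t_n = s_n$. Thus we have that:
$$\| f - s_n \|^2 = \| f \|^2 - \sum_{k=0}^{n} | c_k |^2 \leq \| f \|^2 - \sum_{k=0}^{n} | c_k |^2 + \sum_{k=0}^{n} | b_k - c_k |^2 = \| f - t_n \|^2$$
- Taking square roots gives a). Equality holds if and only if $\displaystyle{\sum_{k=0}^{n} | b_k - c_k |^2 = 0}$, that is, if and only if $b_k = c_k$ for all $k \in \{ 0, 1, ..., n \}$, which gives b). $\blacksquare$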
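As a numerical sanity check of Theorem 1, the following sketch compares the error of $s_n$ with the error of a perturbed combination $t_n$. Everything specific here is an assumption made for illustration and does not come from this page: the interval $I = [0, 1]$, the system $\varphi_k(x) = e^{2 \pi i k x}$, the function $f(x) = x$, the choice $n = 3$, the perturbation of the coefficients, and the midpoint-rule approximation of $(f, g) = \int_I f(x) \overline{g(x)} \, dx$. The names `phi`, `inner`, `norm_sq`, and `combo` are hypothetical.

```python
import cmath

def phi(k, x):
    """Assumed orthonormal system on I = [0, 1]: phi_k(x) = exp(2*pi*i*k*x)."""
    return cmath.exp(2j * cmath.pi * k * x)

def inner(f, g, samples=2000):
    """Midpoint-rule approximation of (f, g) = integral over [0, 1] of f * conj(g)."""
    h = 1.0 / samples
    total = 0j
    for j in range(samples):
        x = (j + 0.5) * h
        total += f(x) * g(x).conjugate()
    return total * h

def norm_sq(f, samples=2000):
    """Squared L^2 norm ||f||^2 = (f, f), which is real."""
    return inner(f, f, samples).real

def combo(coefficients):
    """Return the function x -> sum_k coefficients[k] * phi_k(x)."""
    return lambda x: sum(a * phi(k, x) for k, a in enumerate(coefficients))

def f(x):
    """Illustrative target function f(x) = x, not in the span of phi_0, ..., phi_n."""
    return complex(x)

n = 3
# c_k = (f, phi_k), the coefficients singled out by the theorem.
c = [inner(f, lambda x, k=k: phi(k, x)) for k in range(n + 1)]
# b_k: an arbitrary perturbation of c_k, used to build a competing t_n.
b = [ck + 0.1 * (1 + 1j) for ck in c]

s_n, t_n = combo(c), combo(b)
err_s = norm_sq(lambda x: f(x) - s_n(x))  # ||f - s_n||^2
err_t = norm_sq(lambda x: f(x) - t_n(x))  # ||f - t_n||^2
gap = sum(abs(bk - ck) ** 2 for bk, ck in zip(b, c))
predicted = norm_sq(f) - sum(abs(ck) ** 2 for ck in c) + gap

print("||f - s_n||^2 =", err_s)
print("||f - t_n||^2 =", err_t)
print("||f||^2 - sum |c_k|^2 + sum |b_k - c_k|^2 =", predicted)
```

In this sketch `err_s` should not exceed `err_t`, and `err_t` should match `predicted` up to rounding, mirroring the final identity in the proof.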