23 Power Series
Highlights of this Chapter: we prove two marvelous results about power series. First, we show that they are differentiable (and obtain a formula for their derivative). Second, we prove a formula for how well a power series approximates a function, which in the limit gives a power series representation of a known function in terms of its derivatives at a single point.
23.1 Differentiating Term-By-Term
The goal of this section is to prove that power series are differentiable, and that we can differentiate them term by term. That is, we seek to prove
\[\left(\sum_{k\geq 0}a_kx^k\right)^\prime=\sum_{k\geq 0}(a_k x^k)^\prime = \sum_{k\geq 1}ka_kx^{k-1}\]
Because a derivative is defined as a limit, this process of bringing the derivative inside the sum is really an exchange of limits: and we know the tool for that, Dominated Convergence! This applies quite generally, so we give a general formulation and then apply it to power series.
Theorem 23.1 Consider an infinite sum \(\sum_k f_k(x)\) of functions which (1) converges on a domain \(D\) and (2) has each \(f_k\) differentiable on \(D\). Suppose there is a sequence \(M_k\) such that
- \(|f_k^\prime(x)|\leq M_k\) for all \(x\in D\)
- The sum \(\sum M_k\) is convergent.
Then, the sum \(\sum_k f^\prime_k(x)\) is convergent, and \[\left(\sum_k f_k(x)\right)^\prime=\sum_k f^\prime_k(x)\]
Proof. Recall the limit definition of the derivative (Definition 21.1): \[\left(\sum_k f_k(x)\right)^\prime=\lim_{y\to x}\frac{\sum_k f_k(y)-\sum_k f_k(x)}{y-x}\] Writing each sum as the limit of finite sums, we may use the limit theorems (Theorem 7.3, Theorem 7.2) to combine this into a single sum \[\lim_{y\to x}\frac{\lim_N\sum_{k=0}^N f_k(y)-\lim_N\sum_{k=0}^N f_k(x)}{y-x}=\lim_{y\to x}\lim_N\sum_{k=0}^N\frac{f_k(y)-f_k(x)}{y-x}\]
And now, rewriting the limit of partial sums as an infinite sum, we see \[\left(\sum_k f_k(x)\right)^\prime=\lim_{y\to x}\sum_k \frac{f_k(y)-f_k(x)}{y-x}\]
If we are justified in switching the limit and the sum via Dominated Convergence, this becomes
\[\sum_k\lim_{y\to x}\frac{f_k(y)-f_k(x)}{y-x}=\sum_k f^\prime_k(x)\]
which is exactly what we want. Thus, all we need to do is justify that the conditions of Dominated Convergence are satisfied for the terms appearing here. To be precise, this is a limit of functions, and we evaluate it by showing it exists for arbitrary sequences \(y_n\to x\) with \(y_n\neq x\). Choosing such a sequence and plugging in, we see we are really considering the limit \(\lim_n \sum_k\frac{f_k(y_n)-f_k(x)}{y_n-x}\).
Dominated convergence tells us we need to bound the terms \(\frac{f_k(y_n)-f_k(x)}{y_n-x}\) in absolute value by some \(M_k\). As part of the theorem hypothesis, we are given that there exists an \(M_k\) bounding the derivative of \(f_k\) on \(D\), so we just need to show that these suffice. For any \(x\neq y_n\), the quotient \(\frac{f_k(y_n)-f_k(x)}{y_n-x}\) measures the slope of the secant line of \(f_k\) between \(x\) and \(y_n\), so by the Mean Value Theorem (Theorem 22.2) there is some \(c_n\) between \(x\) and \(y_n\) with \[\left|\frac{f_k(y_n)-f_k(x)}{y_n-x}\right|=\left|f^\prime_k(c_n)\right|\] Since \(|f^\prime_k(c_n)|\leq M_k\) by assumption (as \(c_n\in D\)), \(M_k\) is a bound for this difference quotient, as required.
Now recall our other assumption on the \(M_k\): that \(\sum M_k\) converges! This means we can apply dominated convergence, bringing the limit inside the sum:
\[\lim_n \sum_k\frac{f_k(y_n)-f_k(x)}{y_n-x}= \sum_k \lim_n \frac{f_k(y_n)-f_k(x)}{y_n-x}\]
Since \(f_k\) is differentiable and \(y_n\to x\), by definition this limit converges to the derivative \(f_k^\prime(x)\). Thus, the limit of our sums is actually equal to \(\sum_k f^\prime_k(x)\). And, as \(y_n\to x\) was arbitrary, this holds for all such sequences. This means the limit defining the derivative exists, and putting it all together, that \(\left(\sum_k f_k(x)\right)^\prime=\sum_k f^\prime_k(x)\) as required.
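As a quick numerical sanity check of the theorem (an illustration only, not part of the formal development), we can take \(f_k(x)=x^k/k!\) on \(D=[-1,1]\), where \(|f_k^\prime(x)|\leq 1/(k-1)!=M_k\) and \(\sum M_k\) converges. The sum is \(e^x\), so the termwise derivative series should also converge to \(e^x\):

```python
import math

# Sketch: verify term-by-term differentiation on f(x) = sum_k x^k/k! = e^x.
# The termwise derivative series has coefficients (k+1)*a_{k+1} = 1/k!,
# so it should again converge to e^x.

def partial_sum(coeffs, x, N):
    """Evaluate sum_{k=0}^{N} coeffs[k] * x**k."""
    return sum(c * x**k for k, c in enumerate(coeffs[:N + 1]))

N = 30
coeffs = [1 / math.factorial(k) for k in range(N + 1)]      # a_k = 1/k!
deriv_coeffs = [(k + 1) * coeffs[k + 1] for k in range(N)]  # (k+1) a_{k+1}

x = 0.7
series_derivative = partial_sum(deriv_coeffs, x, N - 1)
print(series_derivative, math.exp(x))   # both ≈ 2.01375...
```

The agreement to many decimal places reflects the fast convergence guaranteed by the bounds \(M_k=1/(k-1)!\).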
Now we look to apply this to the specific case of power series \(\sum_k a_k x^k\) within their intervals of convergence. The proof is in much the same spirit as for continuity (where we also used dominated convergence): we provide bounds \(M_k\) by looking at a larger point that remains in the interval of convergence. To do so, we need to understand the convergence of the power series of termwise derivatives. This is very similar to a previous homework problem (where you considered termwise antiderivatives), so we again leave it as an exercise:
Exercise 23.1 Assume that \(\sum_k a_k x^k\) has radius of convergence \(R\). Show that \(\sum_k ka_k x^{k-1}\) has the same radius of convergence.
Using this, we can put everything together.
Theorem 23.2 Let \(f=\sum_{k\geq 0}a_kx^k\) be a power series with radius of convergence \(R\). Then for \(x\in(-R,R)\): \[f^\prime(x)= \sum_{k\geq 1} ka_k x^{k-1}\]
Proof. Let \(x\in(-R,R)\) be arbitrary. Since \(x\) lies strictly within the interval of convergence, we may choose some closed interval \([-y,y]\subset (-R,R)\) containing \(x\): for concreteness, take any \(y\) with \(|x|<y<R\). This interval will serve as the domain \(D\) when applying Theorem 23.1.
Getting to work verifying the assumptions of this theorem, our series converges on \(D\) (as \(D\) is a subset of the interval of convergence). The individual functions \(f_k\) of the sum are just the \(k^{th}\) terms of the series, \(f_k(x)=a_kx^k\). These are differentiable (as they are constant multiples of the monomial \(x^k\)) with derivatives \(f_k^\prime = ka_kx^{k-1}\). The bounds \(M_k\) we seek are numbers which are greater in absolute value than this derivative on the domain \(D=[-y,y]\). Choose some value \(z>y\) within the radius of convergence (say, \(z=(R+y)/2\)). Then for all \(x\in D\) we have \(|x|< z\) and so
\[|x|^{k-1}\leq z^{k-1}=|z^{k-1}|\,\implies |ka_k x^{k-1}|\leq |ka_kz^{k-1}|\]
So we may take \(M_k =|ka_kz^{k-1}|\). But since the series of termwise derivatives has the same radius of convergence as the original series, and \(z\) is within the radius of convergence, we know \(\sum ka_k z^{k-1}\) converges absolutely! That is, \(\sum M_k\) converges, and we are done.
Example 23.1 We know the geometric series converges to \(1/(1-x)\) on \((-1,1)\): \[\sum_{k\geq 0}x^k=\frac{1}{1-x}\] Differentiating term by term yields a power series for \(1/(1-x)^2\): \[\begin{align*}\frac{1}{(1-x)^2}&=\left(\frac{1}{1-x}\right)^\prime\\ &=\left(\sum_{k\geq 0}x^k\right)^\prime\\ &=\sum_{k\geq 0}\left(x^k\right)^\prime\\ &= \sum_{k\geq 1}kx^{k-1}\\ &= 1+2x+3x^2+4x^3+\cdots \end{align*}\]
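We can spot-check this example numerically (an illustration only): inside \((-1,1)\), partial sums of \(\sum_{k\geq 1}kx^{k-1}\) should approach \(1/(1-x)^2\).

```python
# Sketch: partial sums of the differentiated geometric series
# 1 + 2x + 3x^2 + ... should approach 1/(1-x)^2 for |x| < 1.

def diff_geometric_partial(x, N):
    """Partial sum 1 + 2x + 3x^2 + ... + N x^(N-1)."""
    return sum(k * x**(k - 1) for k in range(1, N + 1))

x = 0.5
approx = diff_geometric_partial(x, 60)
exact = 1 / (1 - x)**2
print(approx, exact)   # both ≈ 4.0
```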
The fact that power series are differentiable on their entire radius of convergence puts a strong constraint on which sort of functions can ever be written as the limit of such a series.
Example 23.2 The absolute value \(|x|\) is not expressible as a power series centered at zero: any such series would be differentiable throughout its interval of convergence, but \(|x|\) fails to be differentiable at \(0\).
But the constraint is even stronger than this: we can show that a power series must be infinitely differentiable at each point of its domain!
Corollary 23.1 (Power Series are Smooth) A power series is infinitely differentiable on its interval of convergence.
Proof. We proceed by induction on \(N\). We know a power series \(\sum_k a_kx^k\) is at least \(N=1\) times differentiable on its interval of convergence, by our big result above. Now assume it is \(N\) times differentiable. Because we can differentiate power series term by term, the \(N^{th}\) derivative is also a power series, which has the same radius of convergence as the original.
But now we can apply our main theorem again: this power series is differentiable, with the same radius of convergence! Thus our original function is \(N+1\) times differentiable, completing the induction.
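To make the corollary concrete (a numerical sketch, not part of the proof): differentiating the geometric series term by term twice gives \(\sum_{k\geq 2}k(k-1)x^{k-2}\), which should match the second derivative \(2/(1-x)^3\) inside \((-1,1)\).

```python
# Sketch: the twice-differentiated geometric series should converge
# to the second derivative of 1/(1-x), namely 2/(1-x)^3.

def second_deriv_partial(x, N):
    """Partial sum of sum_{k>=2} k(k-1) x^(k-2)."""
    return sum(k * (k - 1) * x**(k - 2) for k in range(2, N + 1))

x = 0.5
approx = second_deriv_partial(x, 80)
exact = 2 / (1 - x)**3
print(approx, exact)   # both ≈ 16.0
```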
23.2 Power Series Representations
While power series are interesting in their own right, our main purpose for them is to compute functions we already care about. In this section we use their differentiability to provide tools to do so.
Definition 23.1 (Power Series Representation) A power series representation of a function \(f\) at a point \(a\) is a power series \(p\) where \(p(x)=f(x)\) on some neighborhood of \(a\).
Proposition 23.1 (Candidate Series Representation) Let \(f\) be a smooth real valued function whose domain contains a neighborhood of \(0\), and let \(p(x)=\sum_{k\geq 0}a_kx^k\) be a power series which equals \(f\) on some neighborhood of zero. Then, the power series \(p\) is uniquely determined:
\[p(x)=\sum_{k\geq 0}\frac{f^{(k)}(0)}{k!}x^k\]
Proof. Let \(f(x)\) be a smooth function and \(p(x)=\sum_{k\geq 0 }a_kx^k\) be a power series which equals \(f\) on some neighborhood of zero. Then in particular, \(p(0)=f(0)\), so
\[\begin{align*} f(0)&=\lim_N (a_0+a_1\cdot 0+a_2\cdot 0^2+\cdots+a_N\cdot 0^N)\\ &= \lim_N (a_0+0+0+\cdots +0)\\ &= a_0 \end{align*}\]
Now, we know the first coefficient of \(p\). How can we get the next? Differentiate!
\[p^\prime(x)=\left(\sum_{k\geq 0}a_kx^k\right)^\prime = \sum_{k\geq 0}(a_kx^k)^\prime=\sum_{k\geq 1}ka_kx^{k-1}\]
Since \(f(x)=p(x)\) on some small neighborhood of zero and the derivative is a limit, \(f^\prime(0)=p^\prime(0)\). Evaluating this at \(0\) will give the constant term of the power series \(p^\prime\)
\[\begin{align*} f^\prime(0)&=\lim_N (a_1+2a_2\cdot 0+3a_3\cdot 0^2+\cdots+Na_N\cdot 0^{N-1})\\ &= \lim_N (a_1+0+0+\cdots +0)\\ &= a_1 \end{align*}\]
Continuing in this way, the second derivative will have a multiple of \(a_2\) as its constant term:
\[p^{\prime\prime}(x)=2a_2 + 3\cdot 2 \cdot a_3 x+4\cdot 3\cdot a_4 x^2+\cdots\]
And evaluating the equality \(f^{\prime\prime}(x)=p^{\prime\prime}(x)\) at zero yields
\[f^{\prime\prime}(0)=2a_2,\hspace{1cm}\mathrm{so}\hspace{1cm} a_2=\frac{f^{\prime\prime}(0)}{2}\]
This pattern continues indefinitely, as \(f\) is infinitely differentiable. The term \(a_n\) arrives in the constant term after \(n\) differentiations (as it was originally the coefficient of \(x^n\)), at which point it becomes
\[a_nx^n\mapsto na_nx^{n-1}\mapsto n(n-1)a_nx^{n-2}\mapsto\cdots\mapsto n(n-1)(n-2)\cdots 3\cdot 2\cdot 1 a_n\]
As the constant term of \(p^{(n)}\) this means \(p^{(n)}(0)=n!a_n\), and so using \(f^{(n)}(0)=p^{(n)}(0)\), \[a_n=\frac{f^{(n)}(0)}{n!}\]
In each case there was no choice to be made: so long as \(f=p\) in any small neighborhood of zero, the unique formula for \(p\) is
\[p(x)=\sum_{k\geq 0}\frac{f^{(k)}(0)}{k!}x^k\]
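We can check this recipe on our running example (an illustrative sketch, using the standard closed form \(f^{(k)}(x)=k!/(1-x)^{k+1}\) for \(f(x)=1/(1-x)\)): the candidate coefficients \(f^{(k)}(0)/k!\) should all equal \(1\), recovering the geometric series.

```python
import math

# Sketch: for f(x) = 1/(1-x), the k-th derivative is k!/(1-x)^(k+1),
# so each candidate coefficient f^(k)(0)/k! should equal 1.

def candidate_coefficient(k):
    deriv_at_zero = math.factorial(k) / (1 - 0)**(k + 1)
    return deriv_at_zero / math.factorial(k)

coeffs = [candidate_coefficient(k) for k in range(6)]
print(coeffs)   # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```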
This candidate series makes it very easy to search for power series representations of known smooth functions: there’s only one series to even consider! This series is usually named after Brook Taylor, who gave the general formula in 1715.
Definition 23.2 (Taylor Series) For any smooth function \(f(x)\) we define the Taylor Polynomial (centered at \(0\)) of degree \(N\) to be \[p_N(x)=\sum_{0\leq k\leq N}\frac{f^{(k)}(0)}{k!}x^k\]
In the limit as \(N\to\infty\), this defines the Taylor Series \(p(x)\) for \(f\).
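For a concrete illustration of the definition (a sketch, not taken from the text): the derivatives of \(\sin\) cycle through \(\sin,\cos,-\sin,-\cos\), so the values \(f^{(k)}(0)\) cycle through \(0,1,0,-1\), which determines every Taylor polynomial of \(\sin\) at \(0\):

```python
import math

# Sketch: Taylor polynomials of sin centered at 0, built from the
# derivative pattern f^(k)(0) = 0, 1, 0, -1, 0, 1, ...

def sin_taylor(x, N):
    """Degree-N Taylor polynomial of sin centered at 0."""
    cycle = [0, 1, 0, -1]
    return sum(cycle[k % 4] / math.factorial(k) * x**k for k in range(N + 1))

x = 1.2
for N in (1, 3, 7, 11):
    print(N, sin_taylor(x, N), math.sin(x))
```

Increasing the degree \(N\) visibly drives the partial sums toward \(\sin(1.2)\), previewing the convergence question studied next.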
We’ve seen, for example, that the geometric series \(\sum_{k\geq 0}x^k\) is a power series representation of the function \(1/(1-x)\) at zero: it actually converges on the entire interval \((-1,1)\). There are many reasons one may be interested in finding a power series representation of a function, and the above theorem tells us that if we were to search for one, there is a single natural candidate. If there is any power series representation, it’s this one!
So the next natural step is to study this representation: does it actually converge to \(f(x)\)?
23.2.1 Taylor’s Error Formula
To prove that our series actually do what we want, we are going to need some tools relating a function’s derivatives to its values. Rolle’s Theorem / the Mean Value Theorem does this for the first derivative, so we present a generalization here, the polynomial mean value theorem, which does so for \(n^{th}\) derivatives.
Proposition 23.2 (A Generalized Rolle’s Theorem) Let \(f\) be a function which is \(n+1\) times differentiable on the interior of an interval \([a,b]\). Assume that \(f(a)=f(b)=0\), and further that the first \(n\) derivatives at \(a\) are zero: \[f(a)=f^\prime(a)=f^{\prime\prime}(a)=\cdots=f^{(n)}(a)=0\] Then, there exists some \(c\in(a,b)\) where \(f^{(n+1)}(c)=0\).
Proof. Because \(f\) is continuous and differentiable, and \(f(a)=f(b)\), the original Rolle’s Theorem implies that there exists some \(c_1\in(a,b)\) where \(f^\prime(c_1)=0\). But now, we know that \(f^\prime(a)=f^\prime(c_1)=0\), so we can apply Rolle’s theorem to \(f^\prime\) on \([a,c_1]\) to get a point \(c_2\in(a,c_1)\) with \(f^{\prime\prime}(c_2)=0\).
Continuing in this way, we get a \(c_3\in(a,c_2)\) with \(f^{(3)}(c_3)=0\), all the way up to a \(c_n\in(a,c_{n-1})\) where \(f^{(n)}(c_n)=0\). This leaves one more application of Rolle’s theorem possible, as we assumed \(f^{(n)}(a)=0\), so we get a \(c\in(a,c_n)\) with \(f^{(n+1)}(c)=0\) as claimed.
Proposition 23.3 (A Polynomial Mean Value Theorem) Let \(f(x)\) be an \(n+1\)-times differentiable function on \([a,b]\) and \(h(x)\) a polynomial which shares the first \(n\) derivatives with \(f\) at \(a\): \[f(a)=h(a),\hspace{0.2cm}f^\prime(a)=h^\prime(a),\ldots,\hspace{0.2cm}f^{(n)}(a)=h^{(n)}(a)\] Then, if additionally \(f(b)=h(b)\), there must exist some point \(c\in(a,b)\) where \[f^{(n+1)}(c)=h^{(n+1)}(c)\]
Proof. Define the function \(g(x)=f(x)-h(x)\). Then all the first \(n\) derivatives of \(g\) at \(x=a\) are zero (as \(f\) and \(h\) had the same derivatives), and furthermore \(g(b)=0\) as well, since \(f(b)=h(b)\). This means we can apply the generalized Rolle’s theorem and find a \(c\in(a,b)\) with \[g^{(n+1)}(c)=0\] That is, \(f^{(n+1)}(c)=h^{(n+1)}(c)\).
Theorem 23.3 (Taylor Remainder) Let \(f(x)\) be an \(n+1\)-times differentiable function, and \(p_n(x)\) the degree \(n\) Taylor polynomial \(p_n(x)=\sum_{0\leq k\leq n}\frac{f^{(k)}(0)}{k!}x^k\).
Then for any fixed \(b\in\RR\), we have \[f(b)=p_n(b)+\frac{f^{(n+1)}(c)}{(n+1)!}b^{n+1}\]
For some \(c\in[0,b]\).
Proof. Fix a point \(b\), and consider the functions \(f(x)\) and \(p_n(x)\) on the interval \([0,b]\). These share their first \(n\) derivatives at \(0\), but in general \(f(b)\neq p_n(b)\): in fact, it is precisely this error we are trying to quantify.
We need to modify \(p_n\) in some way without affecting its first \(n\) derivatives at zero. One natural way is to add a multiple of \(x^{n+1}\), so define
\[q(x)=p_n(x)+\lambda x^{n+1}\] for some \(\lambda\in\RR\), where we choose \(\lambda\) so that \(f(b)=q(b)\). Because we ensured \(q^{(k)}(0)=f^{(k)}(0)\) for \(k\leq n\), we can now apply the polynomial mean value theorem to these two functions, and get some \(c\in(0,b)\) where \[f^{(n+1)}(c)=q^{(n+1)}(c)\]
Since \(p_n\) has degree \(n\), its \((n+1)^{st}\) derivative is zero, and \[q^{(n+1)}(x)=0+\left(\lambda x^{n+1}\right)^{(n+1)}=(n+1)!\lambda\] Putting these last two observations together yields
\[f^{(n+1)}(c)=(n+1)!\lambda \implies \lambda = \frac{f^{(n+1)}(c)}{(n+1)!}\]
As \(q(b)=f(b)\) by construction, this in turn gives what we were after:
\[f(b)=p_n(b)+\frac{f^{(n+1)}(c)}{(n+1)!}b^{n+1}\]
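The remainder formula gives a computable error bound. For instance (a numerical sketch, not part of the proof): every derivative of \(\sin\) is bounded by \(1\) in absolute value, so \(|\sin(b)-p_n(b)|\leq |b|^{n+1}/(n+1)!\), which we can check directly:

```python
import math

# Sketch: for f = sin, |f^(n+1)(c)| <= 1 always, so Theorem 23.3
# predicts the error |sin(b) - p_n(b)| is at most |b|^(n+1)/(n+1)!.

def sin_taylor(b, n):
    cycle = [0, 1, 0, -1]   # sin's derivatives at 0 repeat 0, 1, 0, -1
    return sum(cycle[k % 4] / math.factorial(k) * b**k for k in range(n + 1))

b = 2.0
for n in range(1, 10):
    error = abs(math.sin(b) - sin_taylor(b, n))
    bound = abs(b)**(n + 1) / math.factorial(n + 1)
    print(n, error <= bound)   # True for every n
```

Since \((n+1)!\) eventually dominates \(|b|^{n+1}\) for any fixed \(b\), this bound also shows the Taylor polynomials of \(\sin\) converge at every point.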
23.2.2 Series Centered at \(a\in\mathbb{R}\)
All of our discussion (and indeed, everything we will need about power series for our course) dealt with defining a power series based on derivative information at zero. But of course, this was an arbitrary choice: one could do exactly the same thing based at any point \(a\in\RR\).
Theorem 23.4 Let \(f\) be a smooth function, defined in a neighborhood of \(a\in\RR\). Then there is a unique power series which has all the same derivatives as \(f\) at \(a\): \[p(x)=\sum_{k\geq 0}\frac{f^{(k)}(a)}{k!}(x-a)^k\] And, for any \(N\) the error between \(f\) and the \(N^{th}\) partial sum is quantified as \[f(x)-p_N(x)=\frac{f^{(N+1)}(c)}{(N+1)!}(x-a)^{N+1}\] for some \(c\) between \(a\) and \(x\).
Exercise 23.2 Prove this.
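As a closing illustration of series centered away from zero (a sketch under standard facts, not from the text): for \(f(x)=\ln(x)\) at \(a=1\), the derivatives \(f^{(k)}(x)=(-1)^{k-1}(k-1)!/x^k\) give coefficients \(f^{(k)}(1)/k!=(-1)^{k-1}/k\), and the resulting series converges to \(\ln(x)\) near \(1\):

```python
import math

# Sketch: the Taylor series of ln(x) centered at a = 1 has coefficients
# (-1)^(k-1)/k, so its partial sums should approach ln(x) near x = 1.

def log_taylor_at_1(x, N):
    """Degree-N Taylor polynomial of ln centered at 1."""
    return sum((-1)**(k - 1) / k * (x - 1)**k for k in range(1, N + 1))

x = 1.5
print(log_taylor_at_1(x, 40), math.log(x))   # both ≈ 0.405465...
```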