11 Contraction Maps
The cauchy condition (that terms “bunch up”) appears in many natural situations, which makes it a very useful equivalent to convergence. Here we investigate one such instance, when sequences are generated by iterating certain functions.
Definition 11.1 (Contractive Sequence) A sequence \(s_n\) is contractive if there is some positive constant \(k<1\) such that for all \(n\) \[|a_{n+1}-a_n| < k|a_n-a_{n-1}|\]
Definition 11.2 (Contraction Map) A function \(f\) is a contraction map if there is some positive constant \(k<1\) such that for all \(x,y\) in the domain of \(f\), \[|f(x)-f(y)|<k|x-y|\]
Proposition 11.1 If \(f\) is a contraction map, iterating \(f\) starting at any point of the domain produces a convergent sequence.
Proof. We prove the existence of a fixed point of \(f\) by constructing a sequence that converges to it. Start by choosing any \(x_0\in\RR\) and set \(\delta = |f(x_0)-x_0|\). We can define a sequence \(x_n\) by iterating \(f\): \(x_{n+1}:=f(x_n)\). Our goal is to show that \(x_n\) is Cauchy by bounding \(|x_m-x_n|\) appropriately. As a starting point, note that as \(f\) is a contraction map we can choose a positive \(k<1\) where \(|f(x)-f(y)|\leq k|x-y|\) for all \(x,y\) and compute
\[\begin{align*} |x_{n+1}-x_n|&=k|f(x_{n})-f(x_{n-1})|\\ &\leq k^2|f(x_{n-1})-f(x_{n-2})|\\ &\leq \cdots\\ &\leq k^n |f(x_0)-x_0|\\ &= k^n\delta \end{align*}\]
This is not the quantity we are really interested in however, we wish to bound \(|x_m-x_n|\). Using the triangle inequality, we can replace this with \(n-m\) bounds of the form above:
\[\begin{align*} |x_m-x_n|&=|x_m-x_{m-1}+x_{m-1}-x_{m-2}+x_{m-2}-\cdots -x_{n+1}+x_{n+1}-x_n|\\ &\leq |x_m-x_{m-1}|+|x_{m-1}-x_{m-2}|+\cdots +|x_{n+1}-x_n|\\ &\leq k^{m-1}\delta + k^{m-2}\delta + \cdots + k^n\delta\\ \end{align*}\]
Simplifying this we can factor out the \(\delta\) to get a sum
\[\delta \left(k^{m-1}+k^{m-2}+\cdots + k^n\right) = \delta \sum_{j=n}^{m-1}k^j=\delta k^n \sum_{j=0}^{m-n-1}k^j\]
The final sum here we may recognize as a geometric series (Exercise 1.6), where
\[\sum_{j=0}^N k^j = \frac{1-k^{N+1}}{1-k}\leq \frac{1}{1-k}\]
Putting this all together, we have managed to bound \(|x_m-x_n|\) by
\[|x_m-x_n|\leq \delta k^n \sum_{j=0}^{m-n-1}k^j\leq \delta k^n\frac{1}{1-k}\]
The numbers \(\delta\) and \(1/(1-k)\) are both constants, and as \(|k|<1\) we know by Example 6.8, \(k^n\to 0\). Thus by the limit laws our bound \(\delta k^n\frac{1}{1-k}\to 0\), which means for every \(\epsilon\) there is some \(N\) where \(n>N\) implies this is less than \(\epsilon\), and hence that \(|x_m-x_n|<\epsilon\). This means the sequence of \(x\)’s is Cauchy, and hence convergent. Thus there is some \(x\in\RR\) with \(x_n\to x\).
Thinking harder about the limit of this sequence proves a rather important theorem in analysis:
Theorem 11.1 (The Contraction Mapping Theorem) Let \(f\colon\RR\to\RR\) be a contraction map. Then there is a unique real number \(x\) such that \(f(x)=x\).
Proof. Starting from any \(x_0\) in the domain, iterating \(f\) produces a sequence converging to some limit point \(x\). Since \(\lim x_n=\lim x_{n+1}=\lim f(x_n)\) we see this same point \(x\) is also the limit of the sequence \(f(x_n)\), so it suffices to prove that \(f(x_n)\to f(x)\). Then since \(f\) is a contraction map, there’s a \(k<1\) where \(|f(x_n)-f(x)|\leq k|x_n-x|\). This second sequence converges to \(0\) by assumption, so \(f(x_n)\to f(x)\) as required (Exercise 6.17). Finally by uniqueness of limits (Theorem 6.1) since \(f(x_n)\to x\) and \(f(x_n)\to f(x)\) we conclude \(x=f(x)\) is a fixed point.
Finally, we check uniqueness. Assume there are two fixed points \(x,y\) with \(x=f(x)\) and \(y=f(y)\). We apply the condition that \(f\) is a contraction map to \(|x-y|\) to get \[|x-y|\leq k|x-y|\] Since \(k\) is strictly less than \(1\), the only solution to this is that \(|x-y|=0\), so \(x=y\) and there is only one fixed point after all.
The contraction mapping theorem makes quick work of some proofs that previously took us rather a lot of work.
Example 11.1 (Proving \(a^n\to 0\) via contraction mapping) If \(|a|<1\) the sequence \(a^n\) converges to zero.
Proof. This sequence is produced by iteration of the map \(f(x)=ax\) starting from \(x=1\) at \(n=0\). When \(|a|<1\) we see immediately that \(f\) is a contraction map as \(|f(x)-f(y)|=|ax-ay|=|a||x-y|\), which is less than \(k|x-y|\) for any \(k\in (a,1)\). Its immediate to see the fixed point of \(x\mapsto ax\) is \(x=0\), and the proof above assures this fixed point is the limit of any sequence of iterates of \(f\). Thus \(a^n\to 0\).
Example 11.2 (\(\sqrt{1+\sqrt{1+\sqrt{1+\ldots}}}\) via Contraction Mapping)
Proof. The recursive sequence \(\sqrt{1+\sqrt{1+\sqrt{1+\ldots}}}\) we previously studied with monotone convergence is generated by iterating the function \(f(x)=\sqrt{1+x}\). This map sends nonnegative numbers to nonnegative numbers, so we can consider it a function \(f\colon [0,\infty)\to [0,\infty)\).
Furthermore, its straightforward to verify its a contraction map on this domain, by a clever factoring (difference of squares). Starting with \(x,y\geq 0\) and noticing \(|x-y|=|(1+x)-(1+y)|\), we factor as
\[|(1+x)-(1+y)|=\left|\left(\sqrt{1+x}+\sqrt{1+y}\right)\left(\sqrt{1+x}-\sqrt{1+y}\right)\right|\]
As \(x,y\geq 0\) we see \(\sqrt{1+x}\geq 1\), \(\sqrt{1+y}\geq 1\) so the first factor \(\sqrt{1+x}+\sqrt{1+y}\geq 2\). Thus
\[|x-y|=\left|\sqrt{1+x}+\sqrt{1+y}\right|\left|\sqrt{1+x}-\sqrt{1+y}\right|\geq 2\left|\sqrt{1+x}-\sqrt{1+y}\right|\]
Dividng by \(2\) directly shows \(f\) is a contraction map with constatn \(k=\frac{1}{2}\):
\[\frac{1}{2}|x-y|\geq \left|\sqrt{1+x}-\sqrt{1+y}\right|=|f(x)-f(y)|\]
Thus, the contraction mapping theorem applies, and iterating \(f\) from any starting point produces a convergent sequence, whose limit is the unique fixed point of \(f\). Solving for this fixed point we see \[f(x)=x\implies x=\sqrt{1+x}\implies x^2=x+1\] Which together with the contstraint \(x\in[1,\infty)\) (which is the domain we proved \(f\) is a contraction on) yields \(x=\tfrac{1+\sqrt{5}}{2}\), the golden ratio.
Note this proves something stronger than our original claim: we know not just the sequence starting with \(x_0=1\) converges to the golden ratio, but *any starting point \(\geq 1\), for example,
\[\sqrt{17},\sqrt{1+\sqrt{17}},\sqrt{1+\sqrt{1+\sqrt{17}}},\ldots\]
Exercise 11.1 (Contraction Maps and \(\sqrt{2}\)) The babylonian procedure for approximating \(\sqrt{2}\) defined the recursive sequence \[w_{n+1}=\frac{1}{2}\left(w_n+\frac{2}{w_n}\right)\] which is generated by iterating the function \[f(x)=\frac{1}{2}\left(x+\frac{2}{x}\right)\]
- Show that \(f\) maps \([1,\infty)\) into \([1,\infty)\).
- Show that on this domain, \(f\) is a contraction map.
- Show that if \(f(x)=x\) then \(x^2=2\).
Now apply the contraction mapping theorem to re-do a large collection of previous work; proving that (1) \(\sqrt{2}\) exists in \(\RR\) and (2) that the babylonian sequence starting from any possible initial rectangle of area 2 converges to \(\sqrt{2}\). (Before we’d only shown this for the \(2\times 1\) rectangle, so \(w_0=1\).)
We can combine the contracting mapping theorem with other techniques to continue re-proving known results in easier ways. Here we take another look at our argument for the basic limit \(a^{1/n}\to 1\).
Exercise 11.2 (Proving \(a^{1/n}\to 1\) by Contraction Mapping) Fill in the following outline to prove \(a^{1/n}\to 1\) for all \(a>0\):
The sequence \(a^{1/n}\) is not generated by iterating a function, but it has many subsequences that are. For example the subsequence \[a, a^{1/2}, a^{1/4}, a^{1/8},\ldots\] Is generated by iterating \(f(x)=\sqrt{x}\).
Prove that this map is a contraction map on the domain \([1,\infty)\), which guarantees this subsequence converges for every \(a\geq 1\). Find the limiting value.
Now, for \(a>1\) prove the full sequence \(a^{1/n}\) is monotone decreasing and bounded below by 1. This ensures it converges (by monotone convergence).
Putting these two facts together, we see the entire sequence converges to 1, as if we have a convergent sequences, all subsequences have the same limit!
Finally, use the limit laws to argue the same holds even when \(a<1\).
For some applications, its useful to have around a slight generalization of the contraction mapping theorem: what can we say about the case where \(f\) is not a contraction map, but some number of iterations \(f^N\) is? The contraction mapping theorem applies to \(f^N\), showing this map has a fixed point. But that doesn’t directly imply \(f\) does: after all, being fixed by \(f^N\) just means that a point returns to itself after N applications: it could be a periodic point of \(f\). However, a little deeper thought shows this is not the case:
Theorem 11.2 (Root of a Contraction) Let \(f\colon X\to X\) be a map such that the \(N\)-fold composition \(f^N\) of \(f\) with itself is a contraction map. (That is, “\(f\) is the \(N\)th root of a contraction”). Then the conclusion of the Contraction Mapping Theorem applies to \(f\): namely \(f\) has a unique fixed point, and iterating \(f\) on any initial point \(x_0\in X\) produces a sequence which converges to this fixed point.
Proof. Since \(f^N\) is a contraction map it has a unique fixed point \(a\in X\). Thus one way of showing that something is equal to \(a\) is to show that its fixed by \(f^N\). But a clever trick lets us see that \(f^N\) fixes \(f(a)\): since we are just composing \(f\) with itself over and over,
\[f^N(f(a))=f^{N+1}(a)=f(f^N(a))=f(a)\]
where the last inequality follows as \(f^N(a)=a\) by definition. Thus \(f^N\) fixes \(f(a)\) so this must be the unique fixed point \(a\) itself! \(f(a)=a\) as desired. Additionally, this is the only fixed point of \(f\) as any points fixed by \(f\) are fixed by all repeated compositions of \(f\), so would be fixed points of \(f^N\).
It remains only to see that starting from an arbitrary \(x_0\in X\), repeated iteration produces a convergent sequence with limit \(a\). The trick is to break the sequence \(f^n(a)\) down into \(N\) subsequences, one for each \(r\in\{0,\ldots N-1\}\):
\[ \begin{matrix} x_0 & f^N(x_0) & f^{2N}(x_0) & \ldots, &f^{nN}(x_0) &\ldots\\ f(x_0)& f^{1+N}(x_0) & f^{1+2N}(x_0) & \ldots, &f^{1+nN}(x_0) &\ldots\\ f^2(x_0)& f^{2+N}(x_0) & f^{2+2N}(x_0) & \ldots, &f^{2+nN}(x_0) &\ldots\\ \vdots &\vdots &\vdots & \vdots &\vdots &\vdots\\ f^r(x_0)& f^{r+N}(x_0) & f^{r+2N}(x_0) & \ldots, &f^{r+nN}(x_0) &\ldots\\ \vdots &\vdots &\vdots & \vdots &\vdots &\vdots\\ f^{N-1}(x_0)& f^{(N-1)+N}(x_0) & f^{(N-1)+2N}(x_0) & \ldots, &f^{(N-1)+nN}(x_0) &\ldots\\ \end{matrix} \]
Each of these sequences is the iteration of \(f^N\) on some initial point \(f^r(x_0)\), and so by the contraction mapping theorem converges to the unqiue fixed point \(a\) of \(f^N\). Thus, we’ve written our sequence as the union of \(N\) sequences, each of which has the same limit! So the entire sequence converges to this limit, \(f^n(x_0)\to a\) as desired.