$$ \newcommand{\RR}{\mathbb{R}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\CC}{\mathbb{C}} \newcommand{\NN}{\mathbb{N}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\FF}{\mathbb{F}} \renewcommand{\epsilon}{\varepsilon} % ALTERNATE VERSIONS % \newcommand{\uppersum}[1]{{\textstyle\sum^+_{#1}}} % \newcommand{\lowersum}[1]{{\textstyle\sum^-_{#1}}} % \newcommand{\upperint}[1]{{\textstyle\smallint^+_{#1}}} % \newcommand{\lowerint}[1]{{\textstyle\smallint^-_{#1}}} % \newcommand{\rsum}[1]{{\textstyle\sum_{#1}}} \newcommand{\uppersum}[1]{U_{#1}} \newcommand{\lowersum}[1]{L_{#1}} \newcommand{\upperint}[1]{U_{#1}} \newcommand{\lowerint}[1]{L_{#1}} \newcommand{\rsum}[1]{{\textstyle\sum_{#1}}} % extra auxiliary and additional topic/proof \newcommand{\extopic}{\bigstar} \newcommand{\auxtopic}{\blacklozenge} \newcommand{\additional}{\oplus} \newcommand{\partitions}[1]{\mathcal{P}_{#1}} \newcommand{\sampleset}[1]{\mathcal{S}_{#1}} \newcommand{\erf}{\operatorname{erf}} $$

6  Convergence

Having formalized the number line, we can now get to work. If we want to rigorously understand any of the approximation efforts of the ancients, we must think about sequences.

Definition 6.1 (Sequence) A sequence is an infinite ordered list of numbers \[(s_1,s_2,s_3,\ldots,s_n,\ldots)\] Each individual element is a term of the sequence, with an subscript (the index) denoting its position in the list.

Most often, we take the set of indices to be \(1,2,3,\ldots\), but any infinite subset of the integers will do. For example, the sequence \(p_n\) of perimeters of inscribed \(n\)-gons starts with index \(3\) (the triangle), as this is the smallest polygon. And, the subsequence Archimedes used to calculate \(\pi\) started with the hexagon and then iterated doubling: \(P_6,P_{12}, P_{24},\ldots\) so has index set \[\{6,12,24,48,96,192,384,\ldots\}\]

Formally, we note all of this is captured using functions, though we will not need this perspective during our day-to-day usage of sequences.

Remark 6.1. Let \(I\subset\ZZ\) be any infinite set of indices. Then a sequence is a function \(s\colon I\to\RR\).

While sequence itself is just an infinite ordered list of numbers, to work with such an object we often require a way to compute its terms. Sometimes this is hard! For example, the sequence

\[\pi_n=\textrm{ the number of prime numbers $\leq n$ }\]

Is called the prime counting function, and being able to compute its exact values efficiently would be monumental progress in number theory. In practice, sequences that we can compute with efficiently are often presented to us in one of two ways:

\[a_n=\frac{n^2+1}{3n-2},\hspace{1cm}b_n=\sin\left(\frac{1}{n}\right)\hspace{1cm}c_n=\sqrt{1+\frac{\sqrt{n}}{n+1}}\]

Here’s some example sequences that are important both to us, and the history of analysis:

Example 6.1 (Babylonians and \(\sqrt{2}\)) Starting from rectangle of width and height \(w,h\), the Babylonians created a new rectangle whose width was the average of these, and whose height was whatever is required to keep the area \(2\): \[w_{\mathrm{new}}=\frac{w+h}{2}\hspace{1cm}h_{\mathrm{new}}=\frac{2}{w_{\mathrm{new}}}\]

This because we can solve for \(h\) in terms of \(w\), this induces a recursive sequence for the widths. Starting from some \((w_n,h_n)\) we have

\[w_{n+1}=\frac{w_n+h_n}{2}=\frac{w_n+\frac{2}{w_n}}{2}=\frac{w_n}{2}+\frac{1}{w_n}\]

Thus, in modern terminology the babylonian procedure defines a recursive sequence, given any starting rectangle. If we begin with the rectangle of wdith 2 and height 1, we get

\[w_0=2,\hspace{1cm}w_{n+1}=\frac{w_n}{2}+\frac{1}{w_n}\]

Exercise 6.1 (Babylonians and \(\sqrt{2}\)) Following the same type of reasoning as for width, use the babylonian procedure to produce a recursive formula for the sequence of heights \(h_n\), for a rectangle starting with \(h=1\).

Example 6.2 An infinite sum is a type of recursively defined sequence, built from another sequence called its terms. Assume that \(a_n\) is any sequence. Then we build a sequence \(s_n\) by

\[s_0=a_0\hspace{1cm}s_{n+1}=s_{n-1}+a_n\]

Unpacking this, we see that \(s_1=s_0+a_1=a_0+a_1\), and thus \(s_2=s_1+a_2=a_0+a_1+a_2\) etc.

6.1 Convergence

The reason to define a sequence precisely is that we are interested in making rigorous the idea of infinitely many steps, the way the Babylonians may have pictured running their procedure an infinite number of times to produce a perfect square, or Archimedes who ran his side-doubling procedure infinitely many times to produce a circle.

In both cases, there was some number \(L\) out there at infinity that they were probing with a sequence. We call such a number \(L\) the limit of the sequence.

Definition 6.2 (Convergent Sequence) A sequence \(s_n\) converges to a limit \(L\) if for all \(\epsilon>0\) there is some threshold \(N\) past which every further term of the sequence is within \(\epsilon\) of \(L\). Formally, this is the logic expression \[\forall \epsilon >0 \,\exists N\,\forall n>N\,\, |s_n-L|<\epsilon\] When a sequene converges to \(L\) we write \[\lim s_n = L\hspace{1cm}\textrm{or}\hspace{1cm}s_n\to L\]

A sequence is divergent if its not convergent. The definition of convergence formalizes the idea the ancients sought if you keep calculating terms, you’ll get as close as you like to the number you seek

That is, the definition sets up a challenge between you (the computer of the sequence) and the error tolerance. Once you set a certain amount of acceptable error \(\epsilon\), the definition furnishes an \(N\) and guarantees that if you compute the sequence out until \(N\) you’ll be within the tolerated error - and if you keep computing more terms, the approximation will never get worse. Its good to look at some specific examples, while getting comfortable with this:

Exercise 6.2 (Understanding Convergence) Consider the sequence \(a_n=\frac{1}{n^2+13}\). Feel free to use a calculator (even just the google search bar) to experiment and answer these questions.

  • What value \(L\) do you think this sequence converges to?
  • If \(\epsilon = 1/10\), what value of \(N\) ensures that \(a_n\) is always within \(\epsilon\) of \(L\) for, \(n>N\)?
  • If \(\epsilon = 1/100\), what value of \(N\) ensures that \(a_n\) is always within \(\epsilon\) of \(L\) for, \(n>N\)?

Exercise 6.3 (Convergence and \(\sqrt{2}\)) This problem concerns the babylonian sequence for \(\sqrt{2}\) in Example 6.1. Again, use a calculator to play around and answer the following

  • For which value of \(N\) are we guaranteed that \(w_n\) calculates the first two decimal places \(\sqrt{2}\) correctly, when \(n>N\)?
  • For which value of \(N\) are we guaranteed that \(w_n\) calculates the first eight decimal places \(\sqrt{2}\) correctly, when \(n>N\)?

6.1.1 The \(\epsilon-N\) Game

To prove a sequence converges, we need to work through the string of quantifiers \(\forall \epsilon\exists N\forall n\)… This sets up a sort of imagined battle between an imagined foe setting a value of \(\epsilon\), and you needing to come up with an \(N\) such that you can get the sequence within \(\epsilon\) of the limit.

Here’s one incredibly useful example, that will serve as the basis of many future calculations.

Proposition 6.1 (\(1/n\) converges to \(0\).) Prove that the sequence \(s_n=1/n\) of reciprocals of the natural numbers converges to \(0\).

Proof. Let \(\epsilon>0\). Then set \(N=1/\epsilon\), and choose arbitrary \(n>N\). Since \(n>1/\epsilon\) it follows that \(1/n<\epsilon\), and hence that \[\left|\frac{1}{n}-0\right|<\epsilon\]

Since \(n>N\) was arbitrary, this holds for all such \(n\), and we have proved for this \(\epsilon\), theres an \(N\) with \(n>N\) implying the sequence \(1/n\) is within \(\epsilon\) of the proposed limit \(0\). Since \(\epsilon\) was also arbitrary, we have in fact proved this for all positive epsilon, and thus we conclude \[\frac{1}{n}\to 0\]

Often when working out such a computation, the scratch work is backwards of the final proof. In a proof, you need to fix an arbitrary epsilon, then

Exercise 6.4 (\(\frac{n}{n+1}\) converges to \(1\))  

Sometimes the scratch work takes a bit more thinking or algebraic manipulation. Its OK if the scratch work isn’t fully rigorous or perfectly written, as long as the eventual proof is! Here’s an example of some scratch work taking a naive approach (just “solve for \(n\)”) that arrives at an easy bound, and a nice formal proof verifying it.

Example 6.3 (\(\frac{n}{n^2+1}\) converges to \(0\).)  

Proof (Scratch). We want \(\tfrac{n}{n^2+1}<\epsilon\), and we attempt to solve this inequality for \(n\) by multiplying through and using the quadratic formula: \[n<\epsilon n^2+\epsilon\,\implies 0<\epsilon n^2-n+\epsilon \] The zeroes satisfy \[n=\frac{1\pm\sqrt{1-4\epsilon^2}}{2\epsilon}\] The larger of these is the one with the \(+\) sign, so as long ad \(n\) is bigger than this the proof will work. This term as written is rather annoying to deal with as we have \(1-4\epsilon^2\) under the square root, and to do a formal proof using it as is, we’d need to ensure this wasn’t negative. But since we are only looking for a bound, we can use that \[\frac{1+\sqrt{1-4\epsilon^2}}{2\epsilon}<\frac{1+1}{2\epsilon}=\frac{1}{\epsilon}\] and just require \(n>\frac{1}{\epsilon}\).

Proof (Formal). Let \(\epsilon>0\) and set \(N=\frac{1}{\epsilon}\). For any \(n>N\) we see that \(\frac{1}{n^2}<\epsilon^2\), and \(\frac{1}{n^2+1}<\frac{1}{n^2}\) so \(\frac{1}{n^2+1}<\epsilon^2\). Multiplying by \(n\) gives \[\frac{n}{n^2+1}<n\epsilon^2<\frac{1}{\epsilon}\epsilon^2=\epsilon\] Thus for any \(n>N\) we have \(\left|\frac{n}{n^2+1}-0\right|<\epsilon\) so the sequence converges to \(0\) by definition.

Example 6.4 (\(\frac{1}{2^n}\to 0\)) Here’s a sketch of an argument: you should fill in the details. Let \(\epsilon>0\). Then we want to find an \(N\) where \(n>N\) implies \(1/2^n<\epsilon\). First, we prove by induction that \(2^n\geq n\) for all \(n\). Thus, \(1/2^n<1/n\), and so it suffices to find \(N\) where \(1/n<\epsilon\). But this is exactly what we did above in the proof that \(1/n\to 0\). So this is possible, and hence \(1/2^n\to 0\).

Exercise 6.5 Give an example of the following, or explain why no such example can exist.

  • A sequence with infinitely many terms equal to zero, but does not converge to zero.
  • A sequence with infinitely many terms equal to zero, which converges to a nonzero number.
  • A sequence of irrational numbers that converges to a rational number.
  • A convergent sequence where every term is an integer.

Exercise 6.6 Prove, directly from the definition of convergence, that \[\frac{2n-2}{5n+1}\to \frac{2}{5}\]

6.1.2 Divergence

The definition of convergence picks out a very nice class of sequences: those that get arbitrarily close to a fixed value, as their index grows. The rest of sequences - anything that does not have this nice property, are all lumped into the category of divergent.

Definition 6.3 (Divergence) A sequence diverges if its not true that for any \(\epsilon\) you can find an \(N\) where beyond that, all terms of the sequence differ from some constant (the limit) less than \(\epsilon\).

Phrasing this positively: a sequence \(a_n\) diverges if for every value of \(a\), there exists some \(\epsilon>0\) where no matter which \(N\) you pick, there’s always some \(n>N\) where \(|a_n-a|>\epsilon\). There’s a lot of quantifiers here! Written out in first order logic:

\[\forall a\in\RR\; \exists \epsilon>0\;\forall N\; \exists n>N\;\; |a_n-a|>\epsilon\]

Again, its easiest to illustrate with an example:

Example 6.5 (\((-1)^n\) Diverges) Here’s the idea: The sequence \(s_n=(-1)^n\) alternates back and forth from \(1\) to \(-1\) forever. Assume for the sake of contradiction that it in fact converges to some real number \(L\). Then (by definition) eventually all terms must get within \(\epsilon\) of \(L\), but this is impossible if \(\epsilon\) is small as every term differs from its successor by 2.

Proof. Note that for all \(n\), \(|s_n-s_{n+1}|=2\) as when \(n\) is even this is \(|1-(-1)|=|2|\) and when \(n\) is odd its \(|-1-(1)|=|-2|\). Assume for the sake of contradiction that \(s_n\to L\) for some \(L\in\RR\), and set \(\epsilon=\tfrac{1}{2}\). This implies there exists an \(N\) such that for all \(n>N\) we have \(|s_n-L|<\tfrac{1}{2}\). Choosing some \(n>N\) we use the triangle inequality to see \[2=|s_n-s_{n+1}|=|s_n-L+L-s_{n+1}|\leq |s_n-L|+|L-s+{n+1}|\leq \epsilon+\epsilon=1\]

Thus we’ve proven \(2<1\) which is a contradiction, so it must not be true that \(s_n\to L\) for any \(L\): the sequence diverges.

Definition 6.4 (Diverging to \(\pm\infty\)) A sequence \(s_n\) diverges to \(\infty\) if for all \(M>0\) there exists an threshold past which the sequence is always larger than \(M\). As a logic statement, \[\forall M>0\, \exists N\, \forall n>N\, s_n>M\]

Exercise 6.7 (\(n^2\) diverges to \(\infty\).)  

Exercise 6.8  

  • Give an example of two divergent sequences \(a_n,b_n\) where \(a_n+b_n\) is convergent.
  • Give an example of two divergent sequences \(a_n,b_n\) where \(a_nb_n\) is convergent.

6.2 Uniqueness

Theorem 6.1 (Limits are unique) Let \(a_n\) be a convergent sequence. Then there exists a unique \(a\in\RR\) with \(a_n\to a\).

Here’s a sketch of the idea, which uses several big ideas that can be recycled in similar arguments:

  • We prove uniqueness by showing that if \(x\) and \(y\) were both limits, then \(x=y\).
  • We prove \(x=y\) by showing that for every \(\epsilon>0\) the difference \(|x-y|<\epsilon\).
  • We prove \(|x-y|<\epsilon\) by an \(\epsilon/2\) argument:
    • We add zero in a clever way: \(|x-y|=|x-a_n+a_n-y|\)
    • We use the triangle inequality \(|x-a_n+a_n-y|\leq |x-a_n|+|a_n-y|\)
    • We use the fact that \(a_n\to x\) and \(a_n\to y\) to make each of \(|a_n-x|\) and \(|a_n-y|\) less than \(\epsilon/2\).

Proof. Assume that a sequence \(a_n\) converges to two limits \(a_n\to x\) and \(a_n\to y\). Then for any \(\epsilon\) we can find an \(N_1\) where \(n>N_1\) implies \(|a_n-x|<\epsilon/2\) and an \(N_2\) where \(n>N_2\) implies \(|a_n-y|<\epsilon/2\). Setting \(N=\max\{N_1,N_2\}\) we see for any \(n>N\) that \[|x-y|=|x-a_n+a_n-y|\leq |x-a_n|+|a_n-y|<\tfrac{\epsilon}{2}+\tfrac{\epsilon}{2}=\epsilon\]

Thus for any positive \(\epsilon\) we know \(|x-y|<\epsilon\), so in particular \(|x-y|\neq \epsilon\), and \(|x-y|\) can’t be positive. Since absolute values are always nonnegative, the only remaining option is that \(|x-y|=0\). But this means \(x-y=0\) and hence \(x=y\): the two limits are equal.

There’s one more uniqueness-type theorem about limits that’s useful to get a handle on. We just saw that the limit is uniquely determined by the sequence, but we can say something slightly stronger. Its uniquely determined by the end of the sequence: if you throw away the first finitely many terms, it won’t change the limit.

Definition 6.5 A shifted sequence the result of shifting the indices by a constant \(k\), deleting the first \(k\) terms. Precisely, given a sequence \(a_n\) and some \(k\in \NN\), the sequence \(s_n=a_{n+k}\) is the result of shifting \(a\) by \(k\). \[s_{0}=a_{k}, s_{1}=a_{k+1}, s_{2}=a_{k+2},\ldots\]

Proposition 6.2 Shifting a convergent sequence does not change its limit.

Proof (Scratch Work). Assume that \(a_n\) converges to \(a\), and define the sequence \(s_n\) by deleting the first \(k\) terms of \(a_n\), that is, \(s_n=a_{n+k}\). We claim that \(s_n\to a\).

Let \(\epsilon>0\) and choose an \(N\) such that if \(n>N\) we know that \(|a_n-a|<\epsilon\) (we know such an \(N\) exists by the assumption \(a_n\to a\)). Now consider \(|s_n-a|\). Since \(s_n=a_{n+k}\), we know \(|s_n-a|<\epsilon\) because we already knew \(|a_{n+k}-a|<\epsilon\): we knew this for every single index bigger than \(N\).

Thus, for all \(n>N\) we have \(|s_n-a|<\epsilon\), which is the definition of \(s_n\to a\).

This can be generalized, to show that any two sequences which are eventually the same have the same limit. Since the first finite part of any sequence is irrelevant to its limiting behavior, its nice to have a word for “the rest of the sequence, after throwing away an unspecified amount at the beginning”. This is called the tail.

Definition 6.6 (Tail of a Sequence) The tail of a sequence is what remains after chopping off an arbitrary (finite) number of terms from the beginning of the sequence. Two sequences have the same tail if they agree after some point: more precisely, \(a_n\) and \(b_n\) have the same tail if there is an \(N_a\) and \(N_b\) such that for all \(k\in\NN\) \[a_{N_a+k}=b_{N_b+k}\]

Example 6.6 (Tail of a Sequence) The following two sequences have the same tail: \[a_n = 1,1,4,3,1,5,1,3,1,4,7,8,9,10,11,12,13,14,\ldots\] \[b_n = -4,3,9,10,11,12,13,14,15,16,17,18\ldots\]

We can see this because \(a_{13}=b_3=9\), and \(a_{14}=b_4=10\), and \(a_{15}=b_5=11\)…for every \(k\) we have that \(a_{13+k}=b_{3+k}\) so they agree after chopping the first 12 terms off of \(a_n\) and the first two terms off of \(b_n\).

Exercise 6.9 (Convergence only depends on the tail) If two sequences have the same tail, then they either both converge or both diverge, and if they converge, they have the same limit.

6.3 Important Sequences

We will soon develop several theorems that let us calculate many limits without tediously chasing down an \(N\) for every \(\epsilon\). But there are still several ‘basic limits’ that we will need to know, that will prove useful as building blocks of more complicated limits, as well as foundations to further theory in analysis. We compute several of them here: you should not worry too hard about committing these to memory; but rather read the proofs as examples of how to play the \(\epsilon-N\) game in tricky situations.

The first and most important are familiar from above:

Example 6.7 As \(n\to\infty\) the sequence \(1/n\) converges to \(0\).

The next is useful in developing the theory of geometric series:

Example 6.8 Let \(|a|<1\), then the sequence \(a^n\) of repeated powers of \(a\) converges to \(0\).

Proof.

There’s a corresponding result for larger values of \(a\):

Example 6.9 If \(a>1\) then \(a^n\) diverges to infinity.

Proof.

This next is an essential building block of the theory of exponential functions:

Example 6.10 The sequence \(n^{1/n}\) converges to 1.

Proof. (Page 157 Amazing)

Example 6.11 Let \(a>0\). Then the sequence \(a^{1/n}\) converges to 1.

6.4 \(\bigstar\) Topology

With an eye to topology, everything about sequences and convergence can be rephrased in terms of open sets, instead of with talk about \(\epsilon\) and inequalities.

Definition 6.7 (Neighborhoods) A neighborhood of a point \(x\) is any open set \(U\) containing \(x\). The \(\epsilon\)-neighborhood of \(x\) is the neighborhood \(U_\epsilon = (x-\epsilon,x+\epsilon)\)

Definition 6.8 (Convergence and \(\epsilon\)-Neighborhoods) A sequence \(a_n\) converges to \(a\) if every \(\epsilon\) neighborhood contains all but finitely many terms of the sequence.

That this is equivalent to Definition 6.2, because the definition of epsilon neighborhood exactly captures the interval discussed in the original definition of convergence.

Exercise 6.10 (Convergence and \(\epsilon\)-Neighborhoods) The definition of convergence in terms of epsilon neighborhoods is equivalent to the usual definition in terms of absolute values and inequalities.

The definition of an epsilon neighborhood makes sense only somewhere like the real line, where we can talk about intervals. So, the general topological definition must dispense with this notion and talk just about open sets:

Definition 6.9 A sequence \(a_n\) converges to \(a\) if every neighborhood contains all but finitely many terms of the sequence.

Exercise 6.11 (Convergence and Neighborhoods) Prove this is equivalent to convergence using \(\epsilon\) neighborhoods. Hint: show that every neighborhood contains some epsilon neighborhood. Can you show that is enough?

6.5 Problems

Exercise 6.12 Come up with a recursive sequence that could be used to formally understand the infinite expression below: \[\sqrt{1+\sqrt{1+\sqrt{1+\sqrt{1+\sqrt{1+\sqrt{1+\cdots}}}}}}\]

Exercise 6.13 Given two sequences \(x_n,y_n\) that converge to \(a\), show the interleaved sequence \(x_1,y_1,x_2,y_2,\ldots\) converges to \(a\).

Exercise 6.14 Let \(a_n\to 0\) and let \(b_n\) be a sequence such that for all \(n\) you know \(|b_n-L|<a_n\). Prove that \(\lim b_n = L\).