$$ \newcommand{\RR}{\mathbb{R}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\CC}{\mathbb{C}} \newcommand{\NN}{\mathbb{N}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\FF}{\mathbb{F}} \renewcommand{\epsilon}{\varepsilon} % ALTERNATE VERSIONS % \newcommand{\uppersum}[1]{{\textstyle\sum^+_{#1}}} % \newcommand{\lowersum}[1]{{\textstyle\sum^-_{#1}}} % \newcommand{\upperint}[1]{{\textstyle\smallint^+_{#1}}} % \newcommand{\lowerint}[1]{{\textstyle\smallint^-_{#1}}} % \newcommand{\rsum}[1]{{\textstyle\sum_{#1}}} \newcommand{\uppersum}[1]{U_{#1}} \newcommand{\lowersum}[1]{L_{#1}} \newcommand{\upperint}[1]{U_{#1}} \newcommand{\lowerint}[1]{L_{#1}} \newcommand{\rsum}[1]{{\textstyle\sum_{#1}}} % extra auxiliary and additional topic/proof \newcommand{\extopic}{\bigstar} \newcommand{\auxtopic}{\blacklozenge} \newcommand{\additional}{\oplus} \newcommand{\partitions}[1]{\mathcal{P}_{#1}} \newcommand{\sampleset}[1]{\mathcal{S}_{#1}} \newcommand{\erf}{\operatorname{erf}} $$

24  Differentiable Functions

Highlights of this Chapter: we study the relationship between the behavior of a function and its derivative, proving several foundational results in the theory of differentiable functions:

  • Fermat’s Theorem: A differentiable function has derivative zero at an extremum.
  • Rolle’s Theorem: if a differentiable function takes the same value at two points, it must have zero derivative at some point in between.
  • The Mean Value Theorem: the average slope of a differentiable function on an interval is realized as the instantaneous slope at some point inside that interval.

The Mean Value theorem is really the star of the show, and we go on to study several of its prominent applications in the following chapter.

24.1 Extrema

That the derivative (rate of change) should be able to detect local extrema is an old idea, predating even the calculus of Newton and Leibniz. Though certainly realized earlier in special cases, it is Fermat who is credited with the first general theorem, and so the result below is often called Fermat’s theorem.

Theorem 24.1 (Finding Local Extrema (Fermat’s Theorem)) Let \(f\) be a function with a local extremum at \(m\). Then if \(f\) is differentiable at \(m\), we must have \(f^\prime(m)=0\).

Proof. Without loss of generality we will assume that \(m\) is the location of a local minimum (the same argument applies for local maxima, except the inequalities in the numerators reverse). As \(f\) is differentiable at \(m\), we know that both the right and left hand limits of the difference quotient exist, and are equal.

First, some preliminaries that apply to both right and left limits. Since we know the limit exists, its value can be computed via any appropriate sequence \(x_n\to m\). Choosing some such sequence, we investigate the difference quotient

\[\frac{f(x_n)-f(m)}{x_n-m}\]

Because \(m\) is a local minimum, there is some interval (say, of radius \(\epsilon\)) about \(m\) where \(f(x)\geq f(m)\). As \(x_n\to m\), we know the sequence eventually enters this interval (by the definition of convergence) thus for all sufficiently large \(n\) we know \[f(x_n)-f(m)\geq 0\]

Now, we separate out the limits from below and above, starting with \(\lim_{x\to m^-}\). If \(x_n\to m\) with \(x_n<m\), then \(x_n-m\) is negative for all \(n\), and so

\[\frac{f(x_n)-f(m)}{x_n-m}=\frac{\mathrm{nonneg}}{\mathrm{neg}}\leq 0\]

Thus, for all sufficiently large \(n\) the difference quotient is \(\leq 0\), and so the limit must be as well! That is, \[\lim_{x\to m^-}\frac{f(x)-f(m)}{x-m}\leq 0\]

Performing the analogous investigation for the limit from above, we now have a sequence \(x_n\to m\) with \(x_n>m\). This changes the sign of the denominator, so

\[\frac{f(x_n)-f(m)}{x_n-m}=\frac{\mathrm{nonneg}}{\mathrm{pos}}\geq 0\]

Again, since the difference quotient is \(\geq 0\) for all sufficiently large \(n\), the same is true of the limit:

\[\lim_{x\to m^+}\frac{f(x)-f(m)}{x-m}\geq 0\]

But by our assumption that \(f\) is differentiable at \(m\), these two one-sided limits must be equal! And if one is \(\geq 0\) and the other \(\leq 0\), the only possibility is that \(f^\prime(m)=0\).
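Though no substitute for the proof, the squeeze on the one-sided difference quotients is easy to see numerically. The following Python sketch (an illustration only, using the hypothetical example \(f(x)=x^2\) with minimum at \(m=0\)) checks that quotients from the left are \(\leq 0\) and quotients from the right are \(\geq 0\), forcing \(f^\prime(m)=0\) between them.

```python
# Numerical illustration of Fermat's theorem (hypothetical example, not part
# of the proof): at the minimum m = 0 of f(x) = x^2, one-sided difference
# quotients have opposite signs and both shrink toward zero.
def f(x):
    return x**2

m = 0.0
for h in [0.1, 0.01, 0.001]:
    left = (f(m - h) - f(m)) / ((m - h) - m)   # quotient from the left: <= 0
    right = (f(m + h) - f(m)) / ((m + h) - m)  # quotient from the right: >= 0
    assert left <= 0 <= right
    print(h, left, right)
```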

24.2 Mean Values

One of the most important theorems relating \(f\) and \(f^\prime\) is the mean value theorem. This is an excellent example of a theorem that is intuitively obvious (from our experience with reasonable functions) and yet requires careful proof (as we know by now, many functions have non-intuitive behavior). Indeed, when I teach Calculus I, I often paraphrase the mean value theorem as follows:

If you drove 60 miles in one hour, then at some point you must have been driving 60 miles per hour

How can we write this mathematically? Say you drove \(D\) miles in \(T\) hours. If \(f(t)\) is your position as a function of time, and you were driving between \(t=a\) and \(t=b\) (where \(b-a=T\)), your average speed was

\[\frac{D}{T}=\frac{f(b)-f(a)}{b-a}\]

To then say that at some point you were driving at this average speed implies that there exists some \(t^\star\) between \(a\) and \(b\) where the instantaneous rate of change (the derivative) is equal to this value. This is exactly the Mean Value Theorem:

Theorem 24.2 (The Mean Value Theorem) If \(f\) is a function which is continuous on the closed interval \([a,b]\) and differentiable on the open interval \((a,b)\), then there exists some \(x^\star\in(a,b)\) where \[f^\prime(x^\star)=\frac{f(b)-f(a)}{b-a}\]

Note: The reason we require differentiability only on the interior of the interval is that the two-sided limit defining the derivative may not exist at the endpoints (if, for example, the domain of \(f\) is only \([a,b]\)).

In this section we will prove the mean value theorem. It’s simplest to break the proof into two steps: first the special case where \(f(a)=f(b)\) (so we are seeking a point with \(f^\prime(x^\star)=0\)), and then deduce the general version from it. This special case is often useful in its own right and so has a name: Rolle’s Theorem.

Theorem 24.3 (Rolle’s Theorem) Let \(f\) be continuous on the closed interval \([a,b]\) and differentiable on \((a,b)\). Then if \(f(b)=f(a)\), there exists some \(x^\star\in (a,b)\) where \(f^\prime(x^\star)=0\).

Proof. Without loss of generality we may take \(f(b)=f(a)=0\) (if their common value is \(k\), consider instead the function \(f(x)-k\), and use the linearity of differentiation to see this yields the same result).

There are two cases: (1) \(f\) is constant, and (2) \(f\) is not. In the first case, \(f^\prime(x)=0\) for all \(x\in(a,b)\), so we may choose any such point. In the second case, since \(f\) is continuous, it achieves both a maximum and a minimum value on \([a,b]\) by the extreme value theorem. Because \(f\) is nonconstant, these values are distinct, so at least one of them is nonzero. Since \(f(a)=f(b)=0\), this nonzero extreme value cannot occur at an endpoint, and so is attained at some interior point: let \(c\in(a,b)\) denote the location of either a (positive) absolute max or a (negative) absolute min.

Then \(c\in(a,b)\), and for all \(x\in(a,b)\) we have \(f(x)\geq f(c)\) if \(c\) is the absolute min, and \(f(x)\leq f(c)\) if it is the max. In either case, \(c\) satisfies the definition of a local extremum. And, as \(f\) is differentiable on \((a,b)\), Fermat’s theorem implies \(f^\prime(c)=0\), as required.
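Rolle’s theorem is easy to sanity-check numerically. Here is a minimal Python sketch (illustrative only, using the hypothetical example \(f(x)=x(x-1)\) on \([0,1]\), where \(f(0)=f(1)=0\)); bisecting the increasing derivative \(f^\prime(x)=2x-1\) locates the promised point \(x^\star=1/2\).

```python
# Sanity check of Rolle's theorem on the hypothetical example
# f(x) = x(x - 1) over [0, 1]: f(0) = f(1) = 0, so f' should vanish inside.
def f(x):
    return x * (x - 1)

def fprime(x):
    return 2 * x - 1

assert f(0.0) == f(1.0) == 0.0  # hypotheses of Rolle's theorem

# f' is increasing and changes sign on (0, 1); bisect for its root.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if fprime(mid) < 0:
        lo = mid
    else:
        hi = mid

x_star = (lo + hi) / 2
assert abs(fprime(x_star)) < 1e-9  # f'(x*) = 0, with x* = 1/2
```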

Now, we return to the main theorem:

Proof (Of the Mean Value Theorem). Let \(f\) be a function satisfying the hypotheses of the mean value theorem, and let \(L\) be the secant line connecting \((a,f(a))\) to \((b,f(b))\). Computing this line, \[L(x)=f(a)+\frac{f(b)-f(a)}{b-a}(x-a)\]

Now define the auxiliary function \(g(x)=f(x)-L(x)\). Since \(L(a)=f(a)\) and \(L(b)=f(b)\), we see that \(g\) is zero at both endpoints. Further, since both \(L\) and \(f\) are continuous on \([a,b]\) and differentiable on \((a,b)\), so is \(g\). Thus, \(g\) satisfies the hypotheses of Rolle’s theorem, and so there exists some \(x^\star\in(a,b)\) with \[g^\prime(x^\star)=0\]

But differentiating \(g\) we find

\[\begin{align*}0&=f^\prime(x^\star)-L^\prime(x^\star)\\ &= f^\prime(x^\star)-\frac{f(b)-f(a)}{b-a} \end{align*}\]

Thus \(f^\prime(x^\star)=\frac{f(b)-f(a)}{b-a}\), as claimed.
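The construction in this proof is concrete enough to replay numerically. The sketch below (illustrative only, with the hypothetical example \(f(x)=x^3\) on \([0,2]\)) builds the auxiliary function \(g=f-L\), confirms it vanishes at both endpoints, and bisects \(g^\prime\) to locate \(x^\star\).

```python
import math

# Replaying the proof's construction on the hypothetical example
# f(x) = x^3 over [0, 2]: g(x) = f(x) - L(x), with L the secant line.
def f(x):
    return x**3

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)  # average slope, here 4

def g(x):
    return f(x) - (f(a) + slope * (x - a))

def gprime(x):
    return 3 * x**2 - slope  # f'(x) - L'(x)

assert g(a) == 0.0 and g(b) == 0.0  # hypotheses of Rolle's theorem

# g' changes sign on (a, b); bisect for x* with g'(x*) = 0.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if gprime(mid) < 0:
        lo = mid
    else:
        hi = mid

x_star = (lo + hi) / 2
# At x*, f'(x*) equals the average slope (f(b) - f(a))/(b - a)
assert math.isclose(3 * x_star**2, slope, rel_tol=1e-9)
```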

Exercise 24.1 Verify the mean value theorem holds for \(f(x)=x^2+x-1\) on the interval \([4,7]\).

24.2.1 \(\bigstar\) IVT for Derivatives

Theorem 24.4 Let \(f\) be a differentiable function on an interval \(I\). Then its derivative \(f^\prime\) satisfies the intermediate value property: for any \(a,b\in I\) and any \(y\) between \(f^\prime(a)\) and \(f^\prime(b)\), there is some \(c\in [a,b]\) with \(f^\prime(c)=y\).

Proof. Without loss of generality, assume \(f^\prime(a)<y<f^\prime(b)\) (otherwise apply the argument below to \(-f\)). Define the auxiliary function \(g(x)=f(x)-yx\), which is differentiable on \([a,b]\) with \(g^\prime(x)=f^\prime(x)-y\); thus \(g^\prime(a)<0\) and \(g^\prime(b)>0\). As \(g\) is continuous on the closed interval \([a,b]\), it attains a minimum at some \(c\in[a,b]\) by the extreme value theorem. Since \(g^\prime(a)<0\), points just to the right of \(a\) satisfy \(g(x)<g(a)\), so the minimum does not occur at \(a\); similarly, \(g^\prime(b)>0\) rules out \(b\). Thus \(c\in(a,b)\) is a local minimum of \(g\), and Fermat’s theorem (Theorem 24.1) gives \(g^\prime(c)=0\), that is, \(f^\prime(c)=y\).

24.3 \(\bigstar\) Infinite Sums

The crux of differentiating a function defined as a series is to be able to bring the derivative inside the sum. Because derivatives are limits, we can use dominated convergence to understand when we can switch sums and limits. One crucial step here is the Mean Value Theorem.

Theorem 24.5 (Dominated Convergence and Derivatives) Let \(f_k(x)\) be a sequence of functions on a domain \(D\) such that:

  • For each \(k\), \(f_k(x)\) is differentiable at all \(x\in D\).
  • For each \(x\in D\), \(\sum_kf_k(x)\) is convergent.
  • There is an \(M_k\) with \(|f_k^\prime(x)|<M_k\), for all \(x\in D\).
  • The sum \(\sum M_k\) is convergent.

Then, the sum \(\sum_k f^\prime_k(x)\) is convergent, and \[\left(\sum_k f_k(x)\right)^\prime=\sum_k f^\prime_k(x)\]

Proof. Recall the limit definition of the derivative (Definition 22.1): \[\left(\sum_k f_k(x)\right)^\prime=\lim_{y\to x}\frac{\sum_k f_k(y)-\sum_k f_k(x)}{y-x}\] Writing each sum as the limit of finite sums, we may use the limit theorems (Theorem 7.3, Theorem 7.2) to combine this into a single sum \[\lim_{y\to x}\frac{\lim_N\sum_{k=0}^N f_k(y)-\lim_N\sum_{k=0}^N f_k(x)}{y-x}=\lim_{y\to x}\lim_N\sum_{k=0}^N\frac{f_k(y)-f_k(x)}{y-x}\]

And now, rewriting the limit of partial sums as an infinite sum, we see \[\left(\sum_k f_k(x)\right)^\prime=\lim_{y\to x}\sum_k \frac{f_k(y)-f_k(x)}{y-x}\]

If we are justified in switching the limit and the sum via Theorem 17.5, this becomes

\[\sum_k\lim_{y\to x}\frac{f_k(y)-f_k(x)}{y-x}=\sum_k f^\prime_k(x)\]

which is exactly what we want. Thus, all we need to do is justify that the conditions of Theorem 17.5 are satisfied, for the terms \[g_k(y)=\frac{f_k(y)-f_k(x)}{y-x}\] with \(x\) a fixed constant and \(y\) the variable, as we take the limit \(y\to x\).

Step 1: Show \(\lim_{y\to x}g_k(y)\) exists. We have assumed that \(f_k\) is differentiable at each point of \(D\), which is exactly the assumption that \(\lim_{y\to x}g_k(y)\) exists.

Step 2: Show \(\sum_k g_k(y)\) is convergent. We have assumed that \(\sum_k f_k(t)\) converges for all \(t\in D\). Fix \(y\in D\) with \(y\neq x\). Then both \(\sum_k f_k(x)\) and \(\sum_k f_k(y)\) exist, and by the limit theorems, the following sum also converges: \[\frac{1}{y-x}\left(\sum_k f_k(y)-\sum_k f_k(x)\right)=\sum_k\frac{f_k(y)-f_k(x)}{y-x}=\sum_k g_k(y)\]

Step 3: Find an \(M_k\) with \(|g_k(y)|<M_k\) for all \(y\neq x\). We are given by assumption that there is such an \(M_k\) bounding the derivative \(f^\prime_k\) on \(D\); we need only show this suffices. If \(y\neq x\), then \(g_k(y)\) is the slope of the secant line of \(f_k\) between \(x\) and \(y\), so by the Mean Value Theorem (Theorem 24.2) there is some \(c\) between \(x\) and \(y\) with \[|g_k(y)|=\left|\frac{f_k(y)-f_k(x)}{y-x}\right|=\left|f^\prime_k(c)\right|\] Since \(|f^\prime_k(c)|\leq M_k\) by assumption (as \(c\in D\)), \(M_k\) is a bound for \(g_k\), as required.

Step 4: Show \(\sum M_k\) is convergent. This is an assumption: the \(M_k\)’s are the same as those originally given. Thus there’s nothing left to show, and dominated convergence applies!
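The hypotheses of the theorem are easy to exhibit concretely. As an illustrative sketch (not from the text), take \(f_k(x)=x^k\) on \(D=[-1/2,1/2]\): then \(\sum_k f_k(x)=\frac{1}{1-x}\), \(|f_k^\prime(x)|\leq k(1/2)^{k-1}=M_k\), and \(\sum_k M_k\) converges, so the theorem predicts \(\sum_k kx^{k-1}=\frac{1}{(1-x)^2}\). A numerical check in Python:

```python
# Term-by-term differentiation of the geometric series (hypothetical example):
# on D = [-1/2, 1/2], f_k(x) = x^k satisfies all four hypotheses, with
# M_k = k (1/2)^(k-1). The theorem says the derivative of sum_k x^k = 1/(1-x)
# is sum_k k x^(k-1), which should equal the closed form 1/(1-x)^2.
x = 0.3
N = 200  # enough terms for double precision at |x| <= 1/2

series_derivative = sum(k * x**(k - 1) for k in range(1, N))
closed_form = 1 / (1 - x)**2

assert abs(series_derivative - closed_form) < 1e-12
print(series_derivative, closed_form)
```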

24.4 \(\bigstar\) Order of Multiple Derivatives

WRITE THIS SECTION

Exchanging limits! Conditions on when you can do this: both partial derivatives exist and are continuous.