$$ \newcommand{\RR}{\mathbb{R}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\CC}{\mathbb{C}} \newcommand{\NN}{\mathbb{N}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\FF}{\mathbb{F}} \renewcommand{\epsilon}{\varepsilon} % ALTERNATE VERSIONS % \newcommand{\uppersum}[1]{{\textstyle\sum^+_{#1}}} % \newcommand{\lowersum}[1]{{\textstyle\sum^-_{#1}}} % \newcommand{\upperint}[1]{{\textstyle\smallint^+_{#1}}} % \newcommand{\lowerint}[1]{{\textstyle\smallint^-_{#1}}} % \newcommand{\rsum}[1]{{\textstyle\sum_{#1}}} \newcommand{\uppersum}[1]{U_{#1}} \newcommand{\lowersum}[1]{L_{#1}} \newcommand{\upperint}[1]{U_{#1}} \newcommand{\lowerint}[1]{L_{#1}} \newcommand{\rsum}[1]{{\textstyle\sum_{#1}}} % extra auxiliary and additional topic/proof \newcommand{\extopic}{\bigstar} \newcommand{\auxtopic}{\blacklozenge} \newcommand{\additional}{\oplus} \newcommand{\partitions}[1]{\mathcal{P}_{#1}} \newcommand{\sampleset}[1]{\mathcal{S}_{#1}} \newcommand{\erf}{\operatorname{erf}} $$

23  Working with Derivatives

Highlights of this Chapter: we prove many foundational theorems about the derivative that one sees in an early calculus course. We see how to take the derivative of scalar multiples, sums, products, quotients and compositions. We also compute - directly from the definition - the derivative of exponential functions. This leads to an important discovery: there is a unique simplest, or natural exponential, whose derivative is itself. This is the origin of \(e\) in Analysis.

From the definition, we move on to confirm the basic properties of the derivative well known and loved in introductory calculus courses. Most of these are straightforward; the one exception, whose proof requires more thought than is usually let on in Calculus I, is the chain rule.

23.1 Continuity

Before jumping in, we prove one small but frequently useful result that is often not mentioned in a calculus class, relating differentiability to continuity.

Theorem 23.1 (Differentiable implies Continuous) Let \(f\) be differentiable at \(a\in\RR\). Then \(f\) is continuous at \(a\).

Proof. Since \(f\) is differentiable at \(a\), we know the limit of the difference quotient exists and is finite: \[\lim_{x\to a}\frac{f(x)-f(a)}{x-a}=f^\prime(a)\] We also know that \(\lim_{x\to a}(x-a)=0\). So, using the limit theorems, we may multiply these together and get what we want. Precisely, let \(x_n\to a\) be any sequence with \(x_n\neq a\) for all \(n\). Then we have

\[\begin{align*} 0 &= (0)(f^\prime(a))\\ &=\left(\lim (x_n-a)\right)\left(\lim \frac{f(x_n)-f(a)}{x_n-a}\right)\\&=\lim\left((x_n-a)\frac{f(x_n)-f(a)}{x_n-a}\right)\\ &=\lim\left(f(x_n)-f(a)\right) \end{align*}\]

Thus \(\lim (f(x_n)-f(a))=0\), so by the limit theorems we see \(\lim f(x_n)=f(a)\). Since \(x_n\) was an arbitrary sequence converging to \(a\) with \(x_n\neq a\), this holds for every such sequence, and so \(f\) is continuous at \(a\) by the sequence definition.

Remark 23.1. There is a small gap at the end of the proof above that we should fill in now (to assure ourselves this style of reasoning always works). We proved that the property we want holds for sequences with \(x_n\neq a\), but continuity requires it for arbitrary sequences converging to \(a\). How do we bridge this gap? Let \(y_n\to a\) be an arbitrary sequence, and split it into the subsequence of terms \(\neq a\) and the subsequence of terms \(=a\). Along the first, the values of \(f\) converge to \(f(a)\) by the proof above; along the second, the values of \(f\) are constantly equal to \(f(a)\). If either subsequence is finite, we can truncate the original sequence at a point past which all terms belong to the other one, and we are done. If both are infinite, we have separated our sequence into a union of two subsequences along each of which \(f(y_n)\) has the same limit \(f(a)\). Thus the overall limit exists and \(\lim f(y_n)=f(a)\).

Thus differentiable functions must be continuous, but what can we say about the derivative itself? If a function is everywhere differentiable, must the derivative itself be continuous? In fact not, as the following example shows.

Example 23.1 While it's hard to imagine a function that is differentiable at every point but not continuously differentiable, such things exist. For example, \[f(x)=\begin{cases} x^2\sin\left(\frac{1}{x^2}\right)&x\neq 0\\ 0 & x=0 \end{cases} \]

It's possible to find a formula for \(f^\prime(x)\) when \(x\neq 0\), and show that \(\lim_{x\to 0}f^\prime(x)\) does not exist (we will do this later). However, one can also calculate the derivative at zero directly, and find \(f^\prime(0)=0\). This means \(\lim_{x\to 0}f^\prime(x)\neq f^\prime(\lim_{x\to 0}x)\), as one side does not exist and the other is zero: thus \(f^\prime\) is not continuous at \(0\).

Exercise 23.1 For \(f(x)\) as above in Example 23.1, calculate \(f^\prime(0)\) directly using the limit definition. (Perhaps surprisingly, all you need to know about the sine function here is that it is bounded between \(-1\) and \(1\)!)

23.2 Field Operations

Here we prove the ‘derivative laws’ of Calculus I:

23.2.1 Sums and Multiples

Theorem 23.2 (Differentiating Constant Multiples) Let \(f\) be a function and \(c\in\RR\). Then if \(f\) is differentiable at a point \(a\in\RR\) so is \(cf\), and \[(cf)^\prime(a)=c\left(f^\prime(a)\right)\]

Proof. Let’s use the difference quotient with \(a+h_n\) to change things up. Let \(h_n\to 0\) be an arbitrary sequence with \(h_n\neq 0\); we wish to compute the limit \[\lim \frac{cf(a+h_n)-cf(a)}{h_n}\] By the limit laws we can pull out the constant \(c\), and the remainder converges to \(f^\prime(a)\), as \(f\) is assumed to be differentiable at \(a\):

\[\lim \frac{cf(a+h_n)-cf(a)}{h_n}=c\lim \frac{f(a+h_n)-f(a)}{h_n}=cf^\prime(a)\]

Because this is true for all sequences \(h_n\to 0\) with \(h_n\neq 0\), the limit exists, and equals \(cf^\prime(a)\).
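
As a quick sanity check of Theorem 23.2, here is a worked instance; the choices \(f(x)=x^2\) and \(c=5\) are ours, made only for illustration. First, directly from the definition, for any \(h_n\to 0\) with \(h_n\neq 0\),

\[\lim\frac{(a+h_n)^2-a^2}{h_n}=\lim\frac{2ah_n+h_n^2}{h_n}=\lim\,(2a+h_n)=2a\]

so \(f^\prime(a)=2a\), and the theorem predicts \((5f)^\prime(a)=10a\). Computing the difference quotient of \(5f\) directly gives the same value:

\[\lim\frac{5(a+h_n)^2-5a^2}{h_n}=\lim\,(10a+5h_n)=10a\]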

Theorem 23.3 (Differentiating Sums) Let \(f,g\) be functions which are both differentiable at a point \(a\in\RR\). Then \(f+g\) is also differentiable at \(a\), and \[(f+g)^\prime(a)=f^\prime(a)+g^\prime(a)\]

Exercise 23.2 Prove the differentiability rule for sums.

23.2.2 Products and Quotients

Theorem 23.4 (Differentiating Products) Let \(f,g\) be functions which are both differentiable at a point \(a\in\RR\). Then \(fg\) is differentiable at \(a\) and

\[(fg)^\prime(a)=f^\prime(a)g(a)+f(a)g^\prime(a)\]

Proof. Let \(f,g\) be differentiable at \(a\in\RR\), and choose an arbitrary sequence \(a_n\to a\) with \(a_n\neq a\). Then we wish to compute

\[\lim\frac{f(a_n)g(a_n)-f(a)g(a)}{a_n-a}\]

To the numerator we add \(0=f(a_n)g(a)-f(a_n)g(a)\) and regroup with algebra:

\[=\lim \frac{f(a_n)g(a_n)-f(a_n)g(a)+f(a_n)g(a)-f(a)g(a)}{a_n-a}\] \[=\lim\left(\frac{f(a_n)g(a_n)-f(a_n)g(a)}{a_n-a}+\frac{f(a_n)g(a)-f(a)g(a)}{a_n-a}\right)\]

Using the limit laws, we can take each of these limits individually so long as they exist (which we will show they do). But even more, note that the first term has a common factor of \(f(a_n)\) in the numerator that can be factored out, and the second a common factor of \(g(a)\). Thus, by the limit laws, we see

\[=\left(\lim f(a_n)\right)\left(\lim\frac{g(a_n)-g(a)}{a_n-a}\right)+g(a)\left(\lim\frac{f(a_n)-f(a)}{a_n-a}\right)\]

Because \(f\) is differentiable at \(a\), it is continuous at \(a\) (Theorem 23.1), and so we know \(\lim f(a_n)=f(a)\). The other two limits above converge to the derivatives \(g^\prime(a)\) and \(f^\prime(a)\) respectively. Thus, altogether we find the resulting limit to be

\[f(a)g^\prime(a)+f^\prime(a)g(a)\]

As this was the result for an arbitrary sequence \(a_n\to a\) with \(a_n\neq a\), it must be the same for all sequences, meaning the limit exists, and

\[(f\cdot g)^\prime (a)=f(a)g^\prime(a)+f^\prime(a)g(a)\]
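
To see the rule in action, here is a small check; the choice \(f(x)=g(x)=x\) is ours, for illustration only. The difference quotient of \(x\) is identically \(1\), so \(f^\prime(a)=g^\prime(a)=1\), and Theorem 23.4 predicts

\[(x\cdot x)^\prime(a)=f^\prime(a)g(a)+f(a)g^\prime(a)=1\cdot a+a\cdot 1=2a\]

This agrees with the difference quotient of \(x^2\) computed directly: for \(a_n\to a\) with \(a_n\neq a\),

\[\lim\frac{a_n^2-a^2}{a_n-a}=\lim\frac{(a_n-a)(a_n+a)}{a_n-a}=\lim\,(a_n+a)=2a\]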

Exercise 23.3 Let \(f\) be a function and \(a\in\RR\) be a point such that \(f(a)\neq 0\) and \(f\) is differentiable at \(a\). Prove that \(1/f\) is also differentiable at \(a\) and \[\left(\frac{1}{f}\right)^\prime(a)=\frac{-f^\prime(a)}{f(a)^2}\]

Theorem 23.5 (Differentiating Quotients) Let \(f,g\) be functions which are differentiable at a point \(a\in\RR\) and assume \(g(a)\neq 0\). Then the function \(f/g\) is also differentiable at \(a\) and \[\left(\frac{f}{g}\right)^\prime(a)=\frac{f^\prime(a)g(a)-f(a)g^\prime(a)}{g(a)^2}\]

Exercise 23.4 Use the Reciprocal Rule and Product Rule to prove the quotient rule.
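
A small instance may help fix the signs in the quotient rule. The choices \(f(x)=1\) and \(g(x)=x\) are ours, for illustration, so that \(f^\prime(a)=0\) and \(g^\prime(a)=1\). For \(a\neq 0\), Theorem 23.5 gives

\[\left(\frac{1}{x}\right)^\prime(a)=\frac{0\cdot a-1\cdot 1}{a^2}=-\frac{1}{a^2}\]

which can be confirmed directly: for \(a_n\to a\) with \(a_n\neq a\) (and \(a_n\neq 0\), which holds for all large \(n\) since \(a\neq 0\)),

\[\lim\frac{\frac{1}{a_n}-\frac{1}{a}}{a_n-a}=\lim\frac{a-a_n}{a_n a\,(a_n-a)}=\lim\frac{-1}{a_n a}=-\frac{1}{a^2}\]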

23.3 Compositions and Inverses

23.3.1 The Chain Rule

Theorem 23.6 (The Chain Rule) If \(g(x)\) is differentiable at \(a\in\RR\) and \(f(x)\) is differentiable at \(g(a)\) then the composition \(f\circ g\) is differentiable at \(a\), with \[(f\circ g)^\prime(a)=f^\prime(g(a))g^\prime(a)\]

Proof (Wish this Worked!). We are taking the derivative at \(a\), so let \(x_n\to a\) with \(x_n\neq a\) be arbitrary. Then the limit defining \((f\circ g)^\prime(a)\) is

\[\lim \frac{f(g(x_n))-f(g(a))}{x_n-a}\]

We multiply the numerator and denominator of this fraction by \(g(x_n)-g(a)\) and regroup:

\[\begin{align*} \lim \frac{f(g(x_n))-f(g(a))}{x_n-a}&= \lim \frac{f(g(x_n))-f(g(a))}{x_n-a}\frac{g(x_n)-g(a)}{g(x_n)-g(a)}\\ &=\lim \frac{f(g(x_n))-f(g(a))}{g(x_n)-g(a)}\frac{g(x_n)-g(a)}{x_n-a} \end{align*}\]

Because \(g\) is differentiable at \(a\), it is continuous there, so we know \(g(x_n)\to g(a)\); and because \(f\) is differentiable at \(g(a)\), we recognize the first term here as the limit defining \(f^\prime\) at \(g(a)\)! Since the second term is the limit defining the derivative of \(g\), both of these exist by our assumptions, and so by the limit theorems we can compute

\[= \left(\lim \frac{f(g(x_n))-f(g(a))}{g(x_n)-g(a)}\right)\left(\lim \frac{g(x_n)-g(a)}{x_n-a}\right)\] \[ = f^\prime(g(a))g^\prime(a) \]

Unfortunately, this proof fails at one crucial step! While we do know that \(x_n-a\neq 0\) (in the definition of \(\lim_{x\to a}\), we only choose sequences \(x_n\to a\) with \(x_n\neq a\)), we do not know that the other denominator \(g(x_n)-g(a)\) is nonzero.

If this problem could only happen finitely many times it would be no trouble - we could just truncate the beginning of our sequence and rest assured we had not affected the value of the limit. But functions - even differentiable functions - can be pretty wild. The function \(x^2\sin(1/x^2)\) (from Example 23.1) ends up equaling zero infinitely often in any neighborhood of zero! So such things are a real concern.
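
Concretely, take the function of Example 23.1 as the inner function, calling it \(g\) here (our labeling, to match the proof above), and set \(a=0\). The points \(x_k=1/\sqrt{k\pi}\) for \(k=1,2,3,\ldots\) (our choice) satisfy \(x_k\to 0\) and \(x_k\neq 0\), yet

\[g(x_k)-g(0)=x_k^2\sin\left(\frac{1}{x_k^2}\right)-0=\frac{1}{k\pi}\sin(k\pi)=0\]

for every \(k\). So along this sequence the denominator \(g(x_k)-g(0)\) vanishes at every term, and no truncation can remove the problem.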

Happily the fix - while tedious - is straightforward. It’s given below.

Exercise 23.5 We define the auxiliary function \(d(y)\) as follows:

\[d(y)=\begin{cases} \frac{f(y)-f(g(a))}{y-g(a)} & y\neq g(a)\\ f^\prime(g(a))& y=g(a) \end{cases}\]

This function equals our problematic difference quotient most of the time, but equals the quantity we want it to be when the denominator is zero.

Prove that \(d\) is continuous at \(g(a)\) and that we may use \(d\) in place of the difference quotient in our computation: that is, for all \(x\neq a\), the following equality holds:

\[\frac{f(g(x))-f(g(a))}{x-a}=d(g(x))\frac{g(x)-g(a)}{x-a}\]

Given this, the original proof is rescued:

Proof. We are taking the derivative at \(a\), so let \(x_n\to a\) with \(x_n\neq a\) be arbitrary. Then the limit defining \((f\circ g)^\prime(a)\) is (by the exercise)

\[\lim \frac{f(g(x_n))-f(g(a))}{x_n-a}=\lim d(g(x_n))\frac{g(x_n)-g(a)}{x_n-a}\]

Because \(d\) is continuous at \(g(a)\) and \(g(x_n)\to g(a)\) we know \(d(g(x_n))\to d(g(a))=f^\prime(g(a))\). And, as \(g\) is differentiable at \(a\) we know the limit of the difference quotient exists. Thus, by the limit laws we can separate them and

\[=\left(\lim d(g(x_n))\right)\left(\lim\frac{g(x_n)-g(a)}{x_n-a}\right)=f^\prime(g(a))g^\prime(a)\]
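
As an illustration of Theorem 23.6, consider the following example, where the functions \(f(y)=y^2\) and \(g(x)=3x+1\) are our own choices. The product rule gives \(f^\prime(y)=2y\), and the rules for sums and constant multiples give \(g^\prime(x)=3\), so the chain rule predicts

\[(f\circ g)^\prime(a)=f^\prime(g(a))\,g^\prime(a)=2(3a+1)\cdot 3=18a+6\]

We can check this without the chain rule by expanding first: \((3x+1)^2=9x^2+6x+1\), and differentiating term by term with the sum and constant multiple rules gives

\[\left(9x^2+6x+1\right)^\prime(a)=9(2a)+6(1)+0=18a+6\]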

23.3.2 Differentiating Inverses

Theorem 23.7 (Differentiating Inverses) Let \(f\) be an invertible function and \(a\in\RR\) a point where \(f(a)=b\). Assume \(f\) is differentiable at \(a\) with \(f^\prime(a)\neq 0\). Then its inverse function \(f^{-1}\) is differentiable at \(b\), and \[(f^{-1})^\prime(b)=\frac{1}{f^\prime(a)}\]

One may be tempted to prove this using the chain rule, by the following argument: since \(f\circ f^{-1}(x)=x\) we differentiate to yield \((f\circ f^{-1}(x))^\prime = 1\) and apply the chain rule to the left hand side, resulting in \[f^\prime\left(f^{-1}(x)\right)(f^{-1})^\prime(x)=1\]

Solving for \((f^{-1})^\prime\) and plugging in \(x=b\) yields the result. However, a more careful review shows this argument doesn’t actually do what we think: in applying the chain rule, we’ve implicitly assumed that \(f^{-1}\) is differentiable, which is part of what we want to prove! (This proof does go through when we already know \(f^{-1}\) to be differentiable, but we are unfortunately not often already in possession of that knowledge.) Below we give a direct proof of the theorem from the limit definition, fixing this oversight:

Proof. We attempt to compute the limit defining the derivative for \(f^{-1}\): \(\lim_{y\to b }\frac{f^{-1}(y)-f^{-1}(b)}{y-b}\). To compute such a limit we choose an arbitrary sequence \(y_n\to b\) with \(y_n\neq b\) and evaluate \[\lim \frac{f^{-1}(y_n)-f^{-1}(b)}{y_n-b}\] By definition \(b=f(a)\), and for each \(n\) there is a unique \(x_n\) such that \(y_n=f(x_n)\): making these substitutions yields \[\lim \frac{f^{-1}(f(x_n))-f^{-1}(f(a))}{f(x_n)-f(a)}\] The composition \(f^{-1}\circ f\) is the identity since they are inverse functions, so \(f^{-1}(f(x_n))=x_n\) and \(f^{-1}(f(a))=a\). Making these additional substitutions, our limit statement becomes \[\lim \frac{x_n-a}{f(x_n)-f(a)}\]

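Since \(f\) is invertible and \(y_n\neq b\), we have \(x_n\neq a\). Provided also that \(x_n=f^{-1}(y_n)\to a\) (that is, that \(f^{-1}\) is continuous at \(b\), as happens for instance when \(f\) is continuous and invertible on an interval), the assumption that \(f\) is differentiable at \(a\) with \(f^\prime(a)\neq 0\) tells us that \[f^\prime(a)=\lim \frac{f(x_n)-f(a)}{x_n-a}\]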

The limit we are interested in is the reciprocal of this, and as the limit value is nonzero by assumption, the limit laws imply

\[\lim \frac{x_n-a}{f(x_n)-f(a)}=\lim\frac{1}{\frac{f(x_n)-f(a)}{x_n-a}}=\frac{1}{\lim \frac{f(x_n)-f(a)}{x_n-a}}=\frac{1}{f^\prime(a)}\]

Since the sequence \(y_n\) was arbitrary, this argument holds for any such sequence. Thus the limit defining \((f^{-1})^\prime(b)\) exists, and \((f^{-1})^\prime(b)=\frac{1}{f^\prime(a)}\).
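
Two short examples, both our own choices, show the theorem at work and why the hypothesis \(f^\prime(a)\neq 0\) matters. First, for \(f(x)=2x+1\) the inverse is \(f^{-1}(y)=\frac{y-1}{2}\), and indeed

\[(f^{-1})^\prime(b)=\frac{1}{2}=\frac{1}{f^\prime(a)}\]

at every point. Second, \(f(x)=x^3\) is invertible on \(\RR\) with \(f^{-1}(y)=y^{1/3}\), but \(f^\prime(0)=\lim_{x\to 0}\frac{x^3}{x}=0\). Taking \(a=b=0\) and the sequence \(y_n=1/n^3\to 0\), the difference quotient of the inverse is

\[\frac{f^{-1}(y_n)-f^{-1}(0)}{y_n-0}=\frac{1/n}{1/n^3}=n^2\]

which is unbounded, so \(f^{-1}\) is not differentiable at \(0\).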

Exercise 23.6 Compute the derivative of \(y=\sqrt{x}\) using this idea.

23.4 \(\blacklozenge\) The Power Rule

Perhaps the most memorable fact from Calculus I is the power rule, that \((x^n)^\prime = nx^{n-1}\). In this short section, we prove the power rule at various levels of generality, starting with natural number exponents and proceeding to arbitrary real exponents.

Proposition 23.1 (Power Rule: Natural Number Exponents) If \(n\) is a natural number, \(x^n\) is differentiable at all real numbers and \[(x^n)^\prime = nx^{n-1}\]

Proof. We prove this by induction on \(n\), starting from the base case \(x^\prime =1\). This holds because, if \(f(x)=x\) and \(a\in\RR\),

\[\lim_{x\to a}\frac{f(x)-f(a)}{x-a}=\lim_{x\to a}\frac{x-a}{x-a}=1\]

Now, assume \((x^n)^\prime = nx^{n-1}\) and consider \(x^{n+1}\). Using the product rule, we compute the derivative of \(x^{n+1}=xx^{n}\)

\[\begin{align*} (xx^n)^\prime &= (x)^\prime x^n+x (x^n)^\prime\\ &= 1 x^n + x (nx^{n-1})\\ &= x^n+n x^n\\ &=(n+1)x^{n} \end{align*}\]
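
To make the inductive step concrete, here it is carried out twice, starting from \((x^2)^\prime=2x\) (the exponents are our choice, for illustration):

\[\begin{align*} (x^3)^\prime &= (x\cdot x^2)^\prime = 1\cdot x^2+x\cdot 2x = 3x^2\\ (x^4)^\prime &= (x\cdot x^3)^\prime = 1\cdot x^3+x\cdot 3x^2 = 4x^3 \end{align*}\]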

Exercise 23.7 (Power Rule: Integer Exponents) Let \(n\in\ZZ\) and consider the function \(x^n\) (which is defined as \(1/x^{|n|}\) when \(n<0\)). Then \(x^n\) is differentiable at all \(x\neq 0\) and \[(x^n)^\prime = nx^{n-1}\]

Using this, we can extend what we know to rational exponents:

Proposition 23.2 (Power Rule: Rational Exponents) Let \(r=p/q\) be any rational number and \(f(x)=x^r\). Then \(f\) is differentiable for all \(x>0\) and \[f^\prime(x)=rx^{r-1}\]

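Proof. Let \(r=p/q\) where without loss of generality \(p\neq 0\) and \(q>1\) (as if \(q=1\) we are in the integer exponent case). Then let \(f(x)=x^{p/q}\), and note that \(f(x)^q = x^p\). The function \(f\) is differentiable for \(x>0\): the root \(x^{1/q}\) is the inverse of \(x^q\), hence differentiable by Theorem 23.7, and \(f(x)=(x^{1/q})^p\) is then differentiable by the chain rule. So we can differentiate both sides of this equality, using the chain rule on the left: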

\[\left [f(x)^q\right]^\prime=qf(x)^{q-1}f^\prime(x)\] \[\left[x^p\right]^\prime = px^{p-1}\]

Equating these gives \(qf(x)^{q-1}f^\prime(x)=px^{p-1}\), and solving for \(f^\prime\):

\[f^\prime(x)=\frac{px^{p-1}}{qf(x)^{q-1}}\]

Using that \(f(x)=x^{p/q}\) we can simplify the right hand side further:

\[f^\prime(x)=\frac{px^{p-1}}{q (x^{p/q})^{q-1}}=\frac{px^{p-1}}{q x^{p\frac{q-1}{q}}}=\frac{p}{q} x^{(p-1)-p\frac{q-1}{q}}\]

This exponent simplifies as expected, yielding \[f^\prime(x)=\tfrac{p}{q} x^{\frac{p}{q}-1}\]
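
For completeness, the exponent arithmetic used in the last step is

\[(p-1)-p\,\frac{q-1}{q}=p-1-p+\frac{p}{q}=\frac{p}{q}-1\]

As a concrete instance (the exponent \(r=3/2\) is our choice), taking \(p=3\) and \(q=2\) in the formula for \(f^\prime\) gives

\[f^\prime(x)=\frac{3x^{2}}{2\,(x^{3/2})^{1}}=\frac{3}{2}\,x^{2-\frac{3}{2}}=\frac{3}{2}\,x^{1/2}\]

in agreement with \(rx^{r-1}\).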

Now that we know the power rule for all rational exponents, it is time to consider arbitrary real exponents, recalling that we define \(x^a\) as a limit of rational exponents.

Theorem 23.8 (The Power Rule) Let \(a\in\RR\) and \(f(x)=x^a\). Then \(f\) is differentiable for all \(x>0\), and \[(x^a)^\prime = ax^{a-1}\]

Proof (Well Almost…). The function \(x^a\) is defined as a limit: for any sequence \(a_n\to a\) of rational numbers, we define \[f(x)=\lim_n x^{a_n}\]

To differentiate at \(x\), we need to choose a sequence \(x_k\to x\) and compute the limit \[\lim_k \frac{x_k^a-x^a}{x_k-x}\]

But, as \(x^a\) is itself defined as a limit, we have a limit of limits! \[=\lim_k \frac{\lim_n x_k^{a_n}-\lim_n x^{a_n}}{x_k-x}=\lim_k\lim_n\frac{x_k^{a_n}-x^{a_n}}{x_k-x}\]

We need to justify that we can switch the order of these limits: assuming temporarily that we are allowed to do so, this yields

\[=\lim_n\left(\lim_k \frac{x_k^{a_n}-x^{a_n}}{x_k-x}\right)\]

For a fixed \(n\), this inner limit now just describes the definition of the derivative of the function \(x^{a_n}\) as \(x_k\to x\). Since we know \(a_n\) is rational, we can apply our previous result to see

\[=\lim_n (x^{a_n})^\prime = \lim_n a_n x^{a_{n}-1}\]

Now we can take the \(n\) limit, using the limit laws and the definition of irrational exponents \[\lim_n a_n x^{a_n-1}=(\lim_n a_n)(\lim_n x^{a_n-1})=ax^{a-1}\]

This is exactly what we want! Thus, all that remains is a proof that switching the limits is actually justified.
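
That the order of two limits cannot be exchanged freely is shown by a standard double sequence, which is our illustration and is unrelated to the power functions above. Set \(s_{k,n}=\frac{n}{n+k}\) for \(k,n\geq 1\). Then

\[\lim_k\left(\lim_n \frac{n}{n+k}\right)=\lim_k 1=1 \qquad\text{but}\qquad \lim_n\left(\lim_k \frac{n}{n+k}\right)=\lim_n 0=0\]

so the two iterated limits both exist and yet disagree. Some additional argument is therefore needed before the interchange above can be trusted.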

The justification for switching the limits is not given here. Instead, we will give an alternative argument for the general power rule that sidesteps this issue, using exponentials and logarithms, after we have learned to differentiate them.

23.5 The \(dx\) Notation

Introduce the notation \(\frac{df}{dx}\) and its utility with the chain rule, and other calculations.
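
As a brief sketch of what this notation offers (our illustration; the variable names \(y\) and \(u\) are labels introduced here): writing \(u=g(x)\) and \(y=f(u)\), the chain rule of Theorem 23.6 takes the form

\[\frac{dy}{dx}=\frac{dy}{du}\,\frac{du}{dx}\]

where \(\frac{dy}{du}\) is evaluated at \(u=g(x)\). For instance, with \(u=x^2\) and \(y=u^3\), so that \(y=x^6\),

\[\frac{dy}{dx}=3u^2\cdot 2x=3x^4\cdot 2x=6x^5\]

which matches the power rule of Proposition 23.1 applied to \(x^6\) directly.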