21 Definition & Properties
Highlights of this Chapter: we prove many foundational theorems about the derivative that one sees in an early calculus course. We see how to take the derivative of scalar multiples, sums, products, quotients and compositions. We also compute - directly from the definition - the derivative of exponential functions. This leads to an important discovery: there is a unique simplest, or natural exponential, whose derivative is itself. This is the origin of \(e\) in Analysis.
Finally - on to some calculus! Here we will define the derivative and study its properties. This may sound daunting at first, remembering back to the days of calculus when it all seemed so new and advanced. But hopefully, after so much exposure to sequences and series during this course, the rigorous notion of a derivative will feel more like a nice application of what we’ve learned than a whole new theory.
21.1 Difference Quotients
The derivative is defined to capture the slope of a graph at a point. Elementary algebra tells us we can compute the slope of a line given two points as rise over run, and so we can compute the slope of the secant line of a function \(f\) between the points \(x=a\) and \(x=t\) as
\[\frac{f(t)-f(a)}{t-a}\]
The derivative is the limit of this, as \(t\to a\):
Definition 21.1 (The Derivative) Let \(f\) be a function defined on an open interval containing \(a\). Then \(f\) is differentiable at \(a\) if the following limit of difference quotients exists. In this case, we define the limiting value to be the derivative of \(f\) at \(a\). \[f^\prime(a)=D f(a)=\lim_{t\to a}\frac{f(t)-f(a)}{t-a}\]
Exercise 21.1 (Equivalent Formulation) Prove that we may alternatively use the following limit definition to calculate the derivative: \[f^\prime(a)=\lim_{h\to 0}\frac{f(a+h)-f(a)}{h}\]
Example 21.1 The function \(f(x)=x^2\) is differentiable at \(x=2\).
This is a classic problem from calculus 1, whose argument is already pretty much rigorous! We wish to compute the limit
\[\lim_{x\to 2}\frac{x^2-4}{x-2}\]
So, we choose an arbitrary sequence \(x_n\) with \(x_n\neq 2\) but \(x_n\to 2\) and compute
\[\lim \frac{x_n^2-4}{x_n-2}=\lim \frac{(x_n+2)(x_n-2)}{x_n-2}=\lim x_n+2\]
Where the arithmetic is justified since \(x_n\neq 2\) for all \(n\) by definition, so everything is defined. But now, as \(x_n\to 2\) we can just use the limit laws to see
\[\lim x_n+2=2+2=4\]
Since \(x_n\) was arbitrary, this holds for all such sequences, so the limit exists and equals 4. Because this limit defines the derivative, we have that \(f\) is differentiable at 2 and
\[f^\prime(2)=4\]
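Though the argument above is purely algebraic, it can be reassuring to watch the difference quotients converge numerically. Below is a minimal Python sketch (illustrative only, not part of the rigorous development) evaluating the quotient for \(f(x)=x^2\) at \(a=2\) along the particular sequence \(x_n=2+1/n\):

```python
# Illustrative numerics: difference quotients of f(x) = x^2 at a = 2
# along the sequence x_n = 2 + 1/n, which converges to 2 from above.

def f(x):
    return x**2

a = 2.0
for n in [1, 10, 100, 1000, 10000]:
    x_n = a + 1.0 / n
    quotient = (f(x_n) - f(a)) / (x_n - a)
    print(f"n = {n:>5}: difference quotient = {quotient:.6f}")

# The printed values are exactly 4 + 1/n, approaching f'(2) = 4.
```

Of course, checking a single sequence proves nothing; the proof above handles every sequence \(x_n\to 2\) with \(x_n\neq 2\) at once.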
Exercise 21.2 Compute the derivative of \(f(x)=x^3\) at an arbitrary point \(a\in\RR\), directly from the definition and show \(f^\prime(a)=3a^2\).
As defined above, the derivative is a limit as \(t\to a\), which depends on values of \(t\) both greater than and less than \(a\). But sometimes it’s useful to have a notion of the derivative that only cares about one-sided limits (for instance, when computing the slope at an endpoint of an interval). We give the analogous definitions below.
Definition 21.2 (One Sided Derivatives) Let \(f\) be a function defined at \(a\); then its one-sided derivatives are defined by the following limits, when they exist:
\[D_+f(a)=\lim_{t\to a^+}\frac{f(t)-f(a)}{t-a}\] \[D_-f(a)=\lim_{t\to a^-}\frac{f(t)-f(a)}{t-a}\]
This definition, together with our previous work on limits, implies that a function \(f\) is differentiable at \(a\) if and only if its two one-sided derivatives at \(a\) exist and are equal. This is useful in practice, for instance in showing the non-differentiability of the absolute value:
Exercise 21.3 Show that \(f(x)=|x|\) is not differentiable at \(x=0\).
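For contrast with Exercise 21.3, both one-sided derivatives can exist and agree even when a formula involves the absolute value: for \(f(x)=x|x|\) we have, directly from the definitions above,
\[D_+f(0)=\lim_{t\to 0^+}\frac{t|t|-0}{t-0}=\lim_{t\to 0^+}|t|=0,\qquad D_-f(0)=\lim_{t\to 0^-}\frac{t|t|-0}{t-0}=\lim_{t\to 0^-}|t|=0\]
so the two one-sided derivatives agree and \(f\) is differentiable at \(0\), with \(f^\prime(0)=0\).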
21.2 Derivative as a Function
So far we have been discussing the derivative at a point as a number: the result of a limiting process. But we can let this point vary, and produce a function taking in \(x\) and outputting the derivative at \(x\):
Definition 21.3 (The Function \(f^\prime\)) Let \(f\) be a function, and suppose that the derivative of \(f\) exists at each point of a set \(D\subset\RR\). Then we may define a function \(f^\prime\colon D\to \RR\) by
\[f^\prime\colon x\mapsto f^\prime(x)=\lim_{t\to x}\frac{f(t)-f(x)}{t-x}\]
If \(f^\prime\) is continuous, \(f\) is called continuously differentiable on \(D\).
For example, \(f(x)=x^3\) is continuously differentiable on \(\RR\) since by Exercise 21.2 we see its derivative is the function \(x\mapsto 3x^2\), and this is a polynomial: we proved all polynomials are continuous.
Since the derivative of a function yields another function, we can iterate this process to produce higher derivatives:
Definition 21.4 (nth Derivatives) Given a differentiable function \(f\), the second derivative \(f^{\prime\prime}\) is defined as the derivative of \(f^\prime\). A function is twice differentiable at \(a\) if \[\lim_{x\to a}\frac{f^\prime(x)-f^\prime(a)}{x-a}\] exists. Continuing inductively, we define the \(n^{th}\) derivative of a function at \(a\) as the derivative of the \((n-1)^{st}\) derivative of \(f\) at \(a\).
We will use the prime notation for small numbers of derivatives, like \(f^\prime(x)\), \(f^{\prime\prime}(x)\) and \(f^{\prime\prime\prime}(x)\). For higher derivatives it is traditional to denote via the number of derivatives in parentheses: \(f^{(2)}=f^{\prime\prime}\), \(f^{(3)}=f^{\prime\prime\prime}\) and so on; so \(f^{(47)}\) for the 47th derivative of \(f\).
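For instance, granting the derivative rules proven below (or repeating computations in the style of Exercise 21.2), the function \(f(x)=x^3\) has
\[f^\prime(x)=3x^2,\qquad f^{\prime\prime}(x)=6x,\qquad f^{\prime\prime\prime}(x)=6,\qquad f^{(n)}(x)=0\ \text{ for all }n\geq 4.\]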
Exercise 21.4 (A Difference Quotient for 2nd Derivative) If \(f\) is twice differentiable at \(a\), show that \[f^{\prime\prime}(a)=\lim_{h\to 0}\frac{f(a+2h)-2f(a+h)+f(a)}{h^2}\]
Then find a limit depending only on \(f\) (not \(f^\prime\) or \(f^{\prime\prime}\)) which computes the third derivative.
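Before attempting the proof, it may help to see the first identity numerically. Here is a small Python sketch (again illustrative only) evaluating the quotient from Exercise 21.4 for \(f(x)=x^3\) at \(a=1\), where the second derivative should be \(6a=6\):

```python
# Illustrative numerics: the second-difference quotient
#   (f(a + 2h) - 2 f(a + h) + f(a)) / h^2
# for f(x) = x^3 at a = 1, where we expect the value 6a = 6.

def f(x):
    return x**3

a = 1.0
for h in [0.1, 0.01, 0.001, 0.0001]:
    quotient = (f(a + 2*h) - 2*f(a + h) + f(a)) / h**2
    print(f"h = {h:<7}: quotient = {quotient:.6f}")

# For a cubic the quotient simplifies (by expanding the numerator) to
# exactly 6a + 6h, so the printed values approach 6 as h shrinks.
```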
It is useful to have a notation for functions which admit \(k\) derivatives: we say a function is \(C^k\) if it can be differentiated \(k\) times (but not necessarily \(k+1\) times). And we call a function smooth if it can be differentiated \(n\) times for every \(n\in\NN\). The set of smooth functions is denoted \(C^\infty\).
21.2.1 Continuity
Before jumping in we prove one small but oft-useful result, not usually mentioned in a calculus class, relating differentiability to continuity.
Theorem 21.1 (Differentiable implies Continuous) Let \(f\) be differentiable at \(a\in\RR\). Then \(f\) is continuous at \(a\).
Proof. Since \(f\) is differentiable at \(a\), we know the limit of the difference quotient is finite \[\lim_{x\to a}\frac{f(x)-f(a)}{x-a}=f^\prime(a)\] We also know that \(\lim_{x\to a}(x-a)=0\). So, using the limit theorems we may multiply these together and get what we want. Precisely, let \(x_n\to a\) be any sequence with \(x_n\neq a\) for all \(n\). Then we have
\[\begin{align*} 0 &= (0)(f^\prime(a))\\ &=\left(\lim (x_n-a)\right)\left(\lim \frac{f(x_n)-f(a)}{x_n-a}\right)\\&=\lim\left((x_n-a)\frac{f(x_n)-f(a)}{x_n-a}\right)\\ &=\lim\left(f(x_n)-f(a)\right) \end{align*}\]
Thus \(\lim (f(x_n)-f(a))=0\), so by the limit theorems we see \(\lim f(x_n)=f(a)\). Since \(x_n\) was an arbitrary sequence with \(x_n\neq a\), this holds for any such sequence, and we see that \(f\) is continuous at \(a\) using the sequence definition.
Remark 21.1. There is a little gap not explicitly spelled out at the end of the proof above, which we should fill in now (to assure ourselves this style of reasoning always works). We proved the property we want for sequences with \(x_n\neq a\), but continuity requires it for arbitrary sequences. How do we bridge this gap? Let \(y_n\to a\) be an arbitrary sequence: we split it into the subsequence of terms with \(y_n\neq a\) and the subsequence of terms equal to \(a\). If either of these is finite, we can just truncate the original sequence at a point past which all terms belong to the other one; along that remaining subsequence \(\lim f(y_n)=f(a)\) (by what we proved, or trivially, since \(f\) is constantly \(f(a)\) along terms equal to \(a\)), so we are done. In the case that both are infinite, we have separated our sequence into a union of two subsequences whose images under \(f\) converge to the same limit \(f(a)\): thus \(\lim f(y_n)\) exists and equals \(f(a)\).
Thus differentiable functions must be continuous, but what can we say about the derivative itself? If a function is everywhere differentiable, must the derivative itself be continuous? In fact no, as the following example shows:
Example 21.2 While it’s hard to imagine a function that is differentiable at every point but not continuously differentiable, such things exist. For example, \[f(x)=\begin{cases} x^2\sin\left(\frac{1}{x^2}\right)&x\neq 0\\ 0 & x=0 \end{cases} \]
It’s possible to find a formula for \(f^\prime(x)\) when \(x\neq 0\), and to show that \(\lim_{x\to 0}f^\prime(x)\) does not exist (we will do this later). However, one can also calculate the derivative at zero directly, and find \(f^\prime(0)=0\). This means \(\lim_{x\to 0}f^\prime(x)\neq f^\prime\left(\lim_{x\to 0}x\right)\), as one side does not exist and the other is zero: thus \(f^\prime\) is not continuous at \(0\).
Exercise 21.5 For \(f(x)\) as above in Example 21.2, calculate \(f^\prime(0)\) directly using the limit definition. (Perhaps surprisingly, all you need to know about the sine function here is that it is bounded between \(-1\) and \(1\)!)
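To get a feel for how badly \(f^\prime\) behaves near zero without yet computing a formula for it, here is a rough numerical sketch in Python (illustrative only; the finite-difference step and sample points are arbitrary choices). It compares the difference quotient at zero, \(f(x)/x\), with centered-difference estimates of the slope of \(f\) at points near zero.

```python
# Illustrative numerics for f(x) = x^2 sin(1/x^2), f(0) = 0.
# Column 1: the difference quotient AT zero, f(x)/x, which stays small.
# Column 2: a centered-difference estimate of f'(x) NEAR zero, which swings wildly.
import math

def f(x):
    return x**2 * math.sin(1.0 / x**2) if x != 0 else 0.0

h = 1e-8  # step size for the centered-difference estimate (arbitrary choice)
for x in [0.05, 0.02, 0.01, 0.005]:
    quotient_at_zero = f(x) / x
    slope_near_zero = (f(x + h) - f(x - h)) / (2 * h)
    print(f"x = {x:<6}: quotient at 0 = {quotient_at_zero:+.4f},"
          f"  estimated f'(x) = {slope_near_zero:+10.2f}")

# The first column shrinks toward 0 (consistent with f'(0) = 0), while the
# second column takes large values of both signs: f' does not settle down near 0.
```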
21.3 Field Operations
Here we prove the ‘derivative laws’ of Calculus I:
21.3.1 Sums and Multiples
Theorem 21.2 (Differentiating Constant Multiples) Let \(f\) be a function and \(c\in\RR\). Then if \(f\) is differentiable at a point \(a\in\RR\) so is \(cf\), and \[(cf)^\prime(a)=c\left(f^\prime(a)\right)\]
Proof. Let’s use the difference quotient with \(a+h_n\) to change things up: let \(h_n\to 0\) with \(h_n\neq 0\) be arbitrary, and we wish to compute the limit \[\lim \frac{cf(a+h_n)-cf(a)}{h_n}\] By the limit laws we can pull out the constant \(c\), and the remainder converges to \(f^\prime(a)\), as \(f\) is assumed to be differentiable at \(a\).
\[=c\lim \frac{f(a+h_n)-f(a)}{h_n}=cf^\prime(a)\]
Because this is true for all sequences \(h_n\to 0\) with \(h_n\neq 0\), the limit exists, and equals \(cf^\prime(a)\).
Theorem 21.3 (Differentiating Sums) Let \(f,g\) be functions which are both differentiable at a point \(a\in\RR\). Then \(f+g\) is also differentiable at \(a\), and \[(f+g)^\prime(a)=f^\prime(a)+g^\prime(a)\]
Exercise 21.6 Prove the differentiability rule for sums.
21.3.2 Products and Quotients
Theorem 21.4 (Differentiating Products) Let \(f,g\) be functions which are both differentiable at a point \(a\in\RR\). Then \(fg\) is differentiable at \(a\) and
\[(fg)^\prime(a)=f^\prime(a)g(a)+f(a)g^\prime(a)\]
Proof. Let \(f,g\) be differentiable at \(a\in\RR\), and choose an arbitrary sequence \(a_n\to a\) with \(a_n\neq a\). Then we wish to compute
\[\lim\frac{f(a_n)g(a_n)-f(a)g(a)}{a_n-a}\]
To the numerator we add \(0=f(a_n)g(a)-f(a_n)g(a)\) and regroup with algebra:
\[=\lim \frac{f(a_n)g(a_n)-f(a_n)g(a)+f(a_n)g(a)-f(a)g(a)}{a_n-a}\] \[=\lim\left(\frac{f(a_n)g(a_n)-f(a_n)g(a)}{a_n-a}+\frac{f(a_n)g(a)-f(a)g(a)}{a_n-a}\right)\]
Using the limit laws, we can take each of these limits individually so long as they exist (which we will show they do). But even more, note that the first term has a common factor of \(f(a_n)\) in the numerator that can be factored out, and the second a common factor of \(g(a)\). Thus, by the limit laws, we see
\[=\left(\lim f(a_n)\right)\left(\lim\frac{g(a_n)-g(a)}{a_n-a}\right)+g(a)\left(\lim\frac{f(a_n)-f(a)}{a_n-a}\right)\]
Because \(f\) is differentiable at \(a\), it is continuous at \(a\), and so we know \(\lim f(a_n)=f(a)\). The other two limits above converge to the derivatives \(g^\prime(a)\) and \(f^\prime(a)\) respectively. Thus, altogether we find the resulting limit to be
\[f(a)g^\prime(a)+f^\prime(a)g(a)\]
As this was the result for an arbitrary sequence \(a_n\to a\) with \(a_n\neq a\), it must be the same for all sequences, meaning the limit exists, and
\[(f\cdot g)^\prime (a)=f(a)g^\prime(a)+f^\prime(a)g(a)\]
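As a quick consistency check (granting \((x^2)^\prime=2x\), which follows from the same computation as Example 21.1 at an arbitrary point, and \((x^3)^\prime=3x^2\) from Exercise 21.2), the product rule applied to \(x^2\cdot x^3\) gives
\[(x^2\cdot x^3)^\prime = 2x\cdot x^3 + x^2\cdot 3x^2 = 5x^4\]
exactly the answer the power rule of Section 21.5 predicts for \(x^5\).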
Exercise 21.7 Let \(f\) be a function and \(a\in\RR\) be a point such that \(f(a)\neq 0\) and \(f\) is differentiable at \(a\). Prove that \(1/f\) is also differentiable at \(a\) and \[\left(\frac{1}{f}\right)^\prime(a)=\frac{-f^\prime(a)}{f(a)^2}\]
Theorem 21.5 (Differentiating Quotients) Let \(f,g\) be functions which are differentiable at a point \(a\in\RR\), and assume \(g(a)\neq 0\). Then the function \(f/g\) is also differentiable at \(a\) and \[\left(\frac{f}{g}\right)^\prime(a)=\frac{f^\prime(a)g(a)-f(a)g^\prime(a)}{g(a)^2}\]
Exercise 21.8 Use the Reciprocal Rule and Product Rule to prove the quotient rule.
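As an illustration of the statement (not a proof), apply the quotient rule to \(x^3/x\) away from \(x=0\):
\[\left(\frac{x^3}{x}\right)^\prime = \frac{3x^2\cdot x - x^3\cdot 1}{x^2}=\frac{2x^3}{x^2}=2x\]
which matches what we get by first simplifying \(x^3/x=x^2\) and then differentiating.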
21.4 Compositions and Inverses
21.4.1 The Chain Rule
Theorem 21.6 (The Chain Rule) If \(g(x)\) is differentiable at \(a\in\RR\) and \(f(x)\) is differentiable at \(g(a)\) then the composition \(f\circ g\) is differentiable at \(a\), with \[(f\circ g)^\prime(a)=f^\prime(g(a))g^\prime(a)\]
Proof (Wish this Worked!). We are taking the derivative at \(a\), so let \(x_n\to a\) with \(x_n\neq a\) be arbitrary. Then the limit defining \((f\circ g)^\prime(a)\) is
\[\lim \frac{f(g(x_n))-f(g(a))}{x_n-a}\]
We multiply the numerator and denominator of this fraction by \(g(x_n)-g(a)\) and regroup:
\[\begin{align*} \lim\frac{f(g(x_n))-f(g(a))}{x_n-a}&= \lim\frac{f(g(x_n))-f(g(a))}{x_n-a}\cdot\frac{g(x_n)-g(a)}{g(x_n)-g(a)}\\ &=\lim \frac{f(g(x_n))-f(g(a))}{g(x_n)-g(a)}\cdot\frac{g(x_n)-g(a)}{x_n-a} \end{align*}\]
Because \(g\) is continuous at \(a\), we know \(g(x_n)\to g(a)\), and because \(f\) is differentiable at \(g(a)\) we recognize the first term here as the limit defining \(f^\prime\) at \(g(a)\)! Since the second term is the limit defining the derivative of \(g\), both of these exist by our assumptions, and so by the limit theorems we can compute
\[= \left(\lim \frac{f(g(x_n))-f(g(a))}{g(x_n)-g(a)}\right)\left(\lim \frac{g(x_n)-g(a)}{x_n-a}\right)\] \[ = f^\prime(g(a))g^\prime(a) \]
Unfortunately, this proof fails at one crucial step! While we do know that \(x_n-a\neq 0\) (in the definition of \(\lim_{x\to a}\), we only choose sequences \(x_n\to a\) with \(x_n\neq a\)), we do not know that the other denominator \(g(x_n)-g(a)\) is nonzero.
If this problem could only happen finitely many times it would be no trouble - we could just truncate the beginning of our sequence and rest assured we had not affected the value of the limit. But functions - even differentiable functions - can be pretty wild. The function \(x^2\sin(1/x^2)\) (from Example 21.2) ends up equaling zero infinitely often in any neighborhood of zero! So such things are a real concern.
Happily the fix - while tedious - is straightforward. It’s given below.
Exercise 21.9 We define the auxiliary function \(d(y)\) as follows:
\[d(y)=\begin{cases} \frac{f(y)-f(g(a))}{y-g(a)} & y\neq g(a)\\ f^\prime(g(a))& y=g(a) \end{cases}\]
This function equals our problematic difference quotient most of the time, but equals the quantity we want it to be when the denominator is zero.
Prove that \(d\) is continuous at \(g(a)\), and that we may use \(d\) in place of the difference quotient in our computation: for all \(x\neq a\), the following equality holds:
\[\frac{f(g(x))-f(g(a))}{x-a}=d(g(x))\frac{g(x)-g(a)}{x-a}\]
Given this, the original proof is rescued:
Proof. We are taking the derivative at \(a\), so let \(x_n\to a\) with \(x_n\neq a\) be arbitrary. Then the limit defining \((f\circ g)^\prime(a)\) is (by the exercise)
\[\lim \frac{f(g(x_n))-f(g(a))}{x_n-a}=\lim d(g(x_n))\frac{g(x_n)-g(a)}{x_n-a}\]
Because \(d\) is continuous at \(g(a)\) and \(g(x_n)\to g(a)\) we know \(d(g(x_n))\to d(g(a))=f^\prime(g(a))\). And, as \(g\) is differentiable at \(a\) we know the limit of the difference quotient exists. Thus, by the limit laws we can separate them and
\[=\left(\lim d(g(x_n))\right)\left(\lim\frac{g(x_n)-g(a)}{x_n-a}\right)=f^\prime(g(a))g^\prime(a)\]
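As a sanity check of the chain rule, take \(g(x)=x^2\) and \(f(u)=u^3\), so that \((f\circ g)(x)=x^6\). The chain rule gives
\[(f\circ g)^\prime(x)=f^\prime(g(x))\,g^\prime(x)=3(x^2)^2\cdot 2x=6x^5\]
in agreement with the power rule for \(x^6\).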
21.4.2 Differentiating Inverses
Theorem 21.7 (Differentiating Inverses) Let \(f\) be an invertible function and \(a\in\RR\) a point where \(f(a)=b\). Assume \(f\) is differentiable at \(a\) with \(f^\prime(a)\neq 0\). Then its inverse function \(f^{-1}\) is differentiable at \(b\), and \[(f^{-1})^\prime(b)=\frac{1}{f^\prime(a)}\]
One may be tempted to prove this using the chain rule, by the following argument: since \(f\circ f^{-1}(x)=x\) we differentiate to yield \((f\circ f^{-1}(x))^\prime = 1\) and apply the chain rule to the left hand side, resulting in \[f^\prime\left(f^{-1}(x)\right)(f^{-1})^\prime(x)=1\]
Solving for \((f^{-1})^\prime\) and plugging in \(x=b\) yields the result. However, a more careful review shows this doesn’t actually do what we think: in applying the chain rule, we’ve implicitly assumed that \(f^{-1}\) is differentiable, which is part of what we want to prove! (This proof does go through when we already know \(f^{-1}\) to be differentiable, but we are unfortunately not often already in possession of that knowledge.) Below we give a direct proof of the theorem from the limit definition, fixing this oversight:
Proof. We attempt to compute the limit defining the derivative of \(f^{-1}\): \(\lim_{y\to b }\frac{f^{-1}(y)-f^{-1}(b)}{y-b}\). To compute such a limit we choose an arbitrary sequence \(y_n\to b\) with \(y_n\neq b\) and evaluate \[\lim \frac{f^{-1}(y_n)-f^{-1}(b)}{y_n-b}\] By definition \(b=f(a)\), and for each \(n\) there is a unique \(x_n\) such that \(y_n=f(x_n)\): making these substitutions yields \[\lim \frac{f^{-1}(f(x_n))-f^{-1}(f(a))}{f(x_n)-f(a)}\] The composition \(f^{-1}\circ f\) is the identity since they are inverse functions, so \(f^{-1}(f(x_n))=x_n\) and \(f^{-1}(f(a))=a\). Making these additional substitutions, our limit statement becomes \[\lim \frac{x_n-a}{f(x_n)-f(a)}\]
Since \(f\) is injective and \(y_n\neq b\), each \(x_n\neq a\); and \(x_n=f^{-1}(y_n)\to a\) (here we use the continuity of \(f^{-1}\) at \(b\)). By assumption \(f\) is differentiable at \(a\) and \(f^\prime(a)\neq 0\), so we know that \[f^\prime(a)=\lim \frac{f(x_n)-f(a)}{x_n-a}\]
The limit we are interested in is the reciprocal of this, and as the limit value is nonzero by assumption, the limit laws imply
\[\lim \frac{x_n-a}{f(x_n)-f(a)}=\lim\frac{1}{\frac{f(x_n)-f(a)}{x_n-a}}=\frac{1}{\lim \frac{f(x_n)-f(a)}{x_n-a}}=\frac{1}{f^\prime(a)}\]
Since the sequence \(y_n\) was arbitrary, this argument holds for any such sequence. Thus the limit defining \((f^{-1})^\prime(b)\) exists, and \((f^{-1})^\prime(b)=\frac{1}{f^\prime(a)}\).
Exercise 21.10 Compute the derivative of \(y=\sqrt{x}\) using this idea.
21.5 \(\blacklozenge\) The Power Rule
Perhaps the most memorable fact from Calculus I is the power rule, that \((x^n)^\prime = nx^{n-1}\). In this short section, we prove the power rule at various levels of generality, starting with natural number exponents and proceeding to arbitrary real exponents.
Exercise 21.11 (Power Rule: Integer Exponents) Prove by induction, using the product rule, that for every \(n\in\NN\) the function \(x^n\) is differentiable with \((x^n)^\prime=nx^{n-1}\). Then use the reciprocal rule (Exercise 21.7) to extend this to negative integer exponents: \((x^{-n})^\prime=-nx^{-n-1}\) for \(x\neq 0\).
We can use the chain rule, and the functional equation for roots, to differentiate \(n^{th}\) roots as well:
Proposition 21.1 If \(R(x)=x^{1/n}\) is the \(n^{th}\) root function, then \[R^\prime(x)=\frac{1}{n}x^{\frac{1}{n}-1}\]
Proof. The definition of the \(n^{th}\) root function is that \(R(x)^n=x\). Note that \(R\) is differentiable (for \(x>0\)) by Theorem 21.7, since it is the inverse of the function \(x\mapsto x^n\). Thus we may differentiate both sides of this equation with the chain rule, using the power rule for natural number exponents:
\[\left(R(x)^n\right)^\prime = nR(x)^{n-1}R^\prime(x)\] The other side was \(x\), whose derivative is \(1\). Thus, \[nR(x)^{n-1}R^\prime(x)=1\] and, solving for \(R^\prime\) yields
\[R^\prime(x)=\frac{1}{n R(x)^{n-1}}=\frac{1}{n(x^{1/n})^{n-1}}=\frac{1}{n x^{\frac{n-1}{n}}}\] \[=\frac{1}{n}x^{\frac{1}{n}-1}\]
Exercise 21.12 (Power Rule: Rational Exponents) Run a similar argument to the \(n^{th}\) root case to prove that if \(r>0\) is rational, then \(x^{r}\) is differentiable and \((x^r)^\prime = rx^{r-1}\).
When it comes to arbitrary real exponents, one can use their definition as limits of rational powers and work to differentiate such a limit. This is possible, but it requires an exchange of limits, so it needs care. Another method is to use the work we’ve already put into understanding exponentials and logarithms to help us out!
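To preview that second approach: assuming the facts about the exponential and logarithm developed elsewhere in this text (that they are inverse to one another, that the natural exponential is its own derivative, and consequently that \((\ln x)^\prime = 1/x\)), we may write \(x^r=e^{r\ln x}\) for \(x>0\) and apply the chain rule:
\[(x^r)^\prime = e^{r\ln x}\cdot\frac{r}{x} = x^r\cdot\frac{r}{x}=rx^{r-1}\]
We will carry this out carefully once those derivative facts are established.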
21.6 Problems
The pasting lemma has a differentiable analog, which shows exactly when gluing two pieces together (like the absolute value) is differentiable, and when it’s not.
Exercise 21.13 Let \(f,g\) be two continuous and differentiable functions with \(a\in\RR\) a point such that \(f(a)=g(a)\). Prove that the piecewise function \[h(x)=\begin{cases} f(x)&x\leq a\\ g(x)&x>a \end{cases}\] is differentiable at \(a\) if and only if \(f^\prime(a)=g^\prime(a)\). (recall we saw such a function is always continuous at \(a\) in ?exr-pasting-lemma).
Exercise 21.14 (Differentiable, but The Derivative is Not Continuous) While it’s hard to imagine a function that is differentiable at every point but not continuously differentiable, such things exist. For example, \[f(x)=\begin{cases} x^2\sin\left(\frac{1}{x^2}\right)&x\neq 0\\ 0 & x=0 \end{cases} \]
Assume for the sake of this problem that \(\sin(x)\) is a differentiable function on the entire real line, and prove that \(f(x)\) is differentiable at every nonzero point, using the product/chain rules.
At \(x=0\) this method fails, but we can compute \(f^\prime(0)\) directly using the limit definition. Do this, and show you get zero. (Perhaps surprisingly, all you need to know about the sine function here is that it is bounded between \(-1\) and \(1\)!)