Highlights of this Chapter: we study the relationship between the behavior of a function and its derivative, proving several foundational results in the theory of differentiable functions:
- Fermat’s Theorem: A differentiable function has derivative zero at an extremum.
- Rolle’s Theorem: if a differentiable function is equal at two points, it must have zero derivative at some point in-between.
- The Mean Value Theorem: the average slope of a differentiable function on an interval is realized as the instantaneous slope at some point inside that interval.
The Mean Value Theorem is really the star of the show, and we follow it with several important applications.
That the derivative (rate of change) should be able to detect local extrema is an old idea, even predating the calculus of Newton and Leibniz. Though certainly realized earlier in certain cases, it is Fermat who is credited with the first general theorem (so the result below is often called Fermat’s theorem). We will have more to say about extrema later in the chapter, but this theorem is so useful we prove it first, so it’s available for our use throughout.
Theorem 22.1 (Finding Local Extrema (Fermat’s Theorem)) Let $f$ be a function with a local extremum at $m$. Then if $f$ is differentiable at $m$, we must have $f'(m)=0$.
Proof. Without loss of generality we will assume that $m$ is the location of a local minimum (the same argument applies for local maxima, except the inequalities in the numerators reverse). As $f$ is differentiable at $m$, we know that both the right and left hand limits of the difference quotient exist, and are equal.
First, some preliminaries that apply to both right and left limits. Since we know the limit exists, its value can be computed via any appropriate sequence $x_n\to m$ with $x_n\neq m$. Choosing some such sequence we investigate the difference quotient
$$\frac{f(x_n)-f(m)}{x_n-m}$$
Because $m$ is a local minimum, there is some interval (say, of radius $\epsilon$) about $m$ where $f(x)\geq f(m)$. As $x_n\to m$, we know the sequence eventually enters this interval (by the definition of convergence), thus for all sufficiently large $n$ we know
$$f(x_n)-f(m)\geq 0$$
Now, we separate out the limits from above and below, starting with the limit from below, $\lim_{x\to m^-}$. If $x_n\to m$ but $x_n < m$ then we know $x_n-m$ is negative for all $n$, and so
$$\frac{f(x_n)-f(m)}{x_n-m}\leq 0$$
Thus, for all sufficiently large $n$ the difference quotient is $\leq 0$, and so the limit must be as well! That is,
$$\lim_{x\to m^-}\frac{f(x)-f(m)}{x-m}\leq 0$$
Performing the analogous investigation for the limit from above, we now have a sequence $x_n\to m$ with $x_n > m$. This changes the sign of the denominator, so
$$\frac{f(x_n)-f(m)}{x_n-m}\geq 0$$
Again, if the difference quotient is $\geq 0$ for all sufficiently large $n$, we know the same is true of the limit.
But, by our assumption that $f$ is differentiable at $m$ we know both of these must be equal! And if one is $\leq 0$ and the other $\geq 0$ the only possibility is that $f'(m)=0$.
Mean Values
One of the most important theorems relating $f$ and $f'$ is the mean value theorem. This is an excellent example of a theorem that is intuitively obvious (from our experience with reasonable functions) but yet requires careful proof (as we know by now many functions have non-intuitive behavior). Indeed, when I teach Calculus I, I often paraphrase the mean value theorem as follows:
If you drove 60 miles in one hour, then at some point you must have been driving 60 miles per hour
How can we write this mathematically? Say you drove $D$ miles in $T$ hours. If $f(t)$ is your position as a function of time, and you were driving between $t=a$ and $t=b$ (where $b-a=T$), your average speed was
$$\frac{D}{T}=\frac{f(b)-f(a)}{b-a}$$
To then say that at some point you were going $D/T$ miles per hour implies that there exists some $c$ between $a$ and $b$ where the instantaneous rate of change - the derivative - is equal to this value. This is exactly the Mean Value Theorem:
Theorem 22.2 (The Mean Value Theorem) If $f$ is a function which is continuous on the closed interval $[a,b]$ and differentiable on the open interval $(a,b)$, then there exists some $c\in(a,b)$ where
$$f'(c)=\frac{f(b)-f(a)}{b-a}$$
Note: The reason we require differentiability only on the interior of the interval is that the two-sided limit defining the derivative may not exist at the endpoints (if, for example, the domain of $f$ is only $[a,b]$).
In this section we will prove the mean value theorem. It’s simplest to break the proof into two steps: first the special case where $f(a)=f(b)$ (and so we are seeking $f'(c)=0$), and then apply this to the general version. This special case is often useful in its own right and so has a name: Rolle’s Theorem.
Theorem 22.3 (Rolle’s Theorem) Let $f$ be continuous on the closed interval $[a,b]$ and differentiable on $(a,b)$. Then if $f(a)=f(b)$, there exists some $c\in(a,b)$ where $f'(c)=0$.
Proof. Without loss of generality we may take $f(a)=f(b)=0$ (if their common value is $k$, consider instead the function $f(x)-k$, and use the linearity of differentiation to see this yields the same result).
There are two cases: (1) $f$ is constant, and (2) $f$ is not. In the first case, $f'(x)=0$ for all $x\in(a,b)$ so we may choose any such point. In the second case, since $f$ is continuous, it achieves both a maximum and minimum value on $[a,b]$ by the extreme value theorem. Because $f$ is nonconstant these values are distinct, and so at least one of them must be nonzero. Let $c$ denote the location of either a (positive) absolute max or (negative) absolute min.
Then, since $f(c)\neq 0=f(a)=f(b)$, the point $c$ is not an endpoint, so $c\in(a,b)$. For all $x\in(a,b)$ we have $f(x)\geq f(c)$ if $c$ is the absolute min, and $f(x)\leq f(c)$ if it’s the max. In both cases, $c$ satisfies the definition of a local extremum. And, as $f$ is differentiable on $(a,b)$, Fermat’s theorem implies $f'(c)=0$, as required.
Now, we return to the main theorem:
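Proof (Of the Mean Value Theorem). Let $f$ be a function satisfying the hypotheses of the mean value theorem, and $L$ be the secant line connecting $(a,f(a))$ to $(b,f(b))$. Computing this line,
$$L(x)=f(a)+\frac{f(b)-f(a)}{b-a}(x-a)$$
Now define the auxiliary function $g(x)=f(x)-L(x)$. Since $L(a)=f(a)$ and $L(b)=f(b)$, we see that $g$ is zero at both endpoints. Further, since both $f$ and $L$ are continuous on $[a,b]$ and differentiable on $(a,b)$, so is $g$. Thus, $g$ satisfies the hypotheses of Rolle’s theorem, and so there exists some $c\in(a,b)$ with
$$g'(c)=0$$
But differentiating $g$ we find
$$g'(x)=f'(x)-L'(x)=f'(x)-\frac{f(b)-f(a)}{b-a}$$
Thus, at $c$ we have $f'(c)=\frac{f(b)-f(a)}{b-a}$ as claimed.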
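As a numerical illustration of the theorem (not part of the proof), here is a minimal Python sketch that searches for a mean value point by bisection. The helper name `mean_value_point` and the test function $f(x)=x^3$ on $[0,2]$ are assumptions chosen for illustration; the search also assumes that $f'$ minus the secant slope changes sign on the interval, which the theorem itself does not promise.

```python
import math

def mean_value_point(f, df, a, b, tol=1e-12):
    """Find c in (a, b) with df(c) equal to the secant slope of f on [a, b].

    Uses bisection on df(x) - slope, so it assumes that quantity has
    opposite signs at a and b (an extra assumption made for this sketch).
    """
    slope = (f(b) - f(a)) / (b - a)
    g = lambda x: df(x) - slope
    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("df - slope does not change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid   # a sign change lies in [lo, mid]
        else:
            lo = mid   # otherwise it lies in [mid, hi]
    return (lo + hi) / 2

# Illustrative choice: f(x) = x^3 on [0, 2], where the secant slope is 4.
f = lambda x: x**3
df = lambda x: 3 * x**2
c = mean_value_point(f, df, 0.0, 2.0)
print(c, df(c))
```

For this choice the equation $3c^2=4$ can be solved by hand, giving $c=2/\sqrt{3}$, which is what the search approximates.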
Exercise 22.1 Verify the mean value theorem holds for on the interval .
MVT and Function Behavior
Proposition 22.1 (Zero Derivative implies Constant) If $f$ is a differentiable function where $f'(x)=0$ on an interval $I$, then $f$ is constant on that interval.
Proof. Let $a,b$ be any two points in the interval: we will show that $f(a)=f(b)$, so $f$ takes the same value at all points. If $a < b$ we can apply the mean value theorem to this pair, which furnishes a point $c\in(a,b)$ such that
$$f'(c)=\frac{f(b)-f(a)}{b-a}$$
But, $f'(c)=0$ by assumption! Thus $f(b)-f(a)=0$, so $f(b)=f(a)$.
Corollary 22.1 (Functions with the Same Derivative) If $f,g$ are two functions which are differentiable on an interval $I$ and $f'=g'$ on $I$, then there exists a $C\in\mathbb{R}$ with
$$f(x)=g(x)+C$$
Proof. Consider the function $h(x)=f(x)-g(x)$. Then by the differentiation laws, $h'(x)=f'(x)-g'(x)=0$ as we have assumed $f'=g'$. But now Proposition 22.1 implies that $h$ is constant, so $h(x)=C$ for some $C\in\mathbb{R}$. Substituting this in yields
$$f(x)-g(x)=C$$
Definition 22.1 Let $f$ be a function. If $F$ is a differentiable function with the same domain such that $F'=f$, we say $F$ is an antiderivative of $f$.
Corollary 22.2 (Antiderivatives differ by a Constant) Any two antiderivatives of a function $f$ defined on an interval differ by a constant. Thus, the collection of all possible antiderivatives is described by choosing any particular antiderivative $F$ as
$$\{F(x)+C\mid C\in\mathbb{R}\}$$
This is the familiar $+C$ from Calculus!
We can use the theory of derivatives to understand when a function is increasing / decreasing and convex/concave, which proves useful in classifying the extrema of functions among other things.
Proposition 22.2 (Monotonicity and the Derivative) If $f$ is continuous and differentiable on $[a,b]$, then $f$ is monotone increasing on $[a,b]$ if and only if $f'(x)\geq 0$ for all $x\in[a,b]$.
As this is an if and only if statement, we prove the two claims separately. First, we assume that $f'\geq 0$ and show $f$ is increasing:
Proof. Let $x < y$ be any two points in the interval $[a,b]$: we wish to show that $f(x)\leq f(y)$. By the Mean Value Theorem, we know there must be some point $c\in(x,y)$ such that
$$f'(c)=\frac{f(y)-f(x)}{y-x}$$
But, we’ve assumed that $f'\geq 0$ on the entire interval, so $f'(c)\geq 0$. Thus $\frac{f(y)-f(x)}{y-x}\geq 0$, and since $y-x$ is positive, this implies
$$f(y)-f(x)\geq 0$$
That is, $f(y)\geq f(x)$. Note that we can extract even more information here than claimed: if we know that $f'$ is strictly greater than 0 then following the argument we learn that $f(y) > f(x)$, so $f$ is strictly monotone increasing.
Next, we assume $f$ is increasing and show $f'\geq 0$:
Proof. Assume $f$ is increasing on $[a,b]$, and let $x\in(a,b)$ be arbitrary. Because we have assumed $f$ is differentiable, we know that the right and left limits both exist and are equal, and that either of them equals the value of the derivative. So, we consider the right limit
$$f'(x)=\lim_{t\to x^+}\frac{f(t)-f(x)}{t-x}$$
For any $t > x$ we know $f(t)\geq f(x)$ by the increasing hypothesis, and we know that $t-x > 0$ by definition. Thus, for all such $t$ this difference quotient is nonnegative, and hence remains so in the limit:
$$f'(x)=\lim_{t\to x^+}\frac{f(t)-f(x)}{t-x}\geq 0$$
Exercise 22.2 Prove the analogous statement for negative derivatives: $f'(x)\leq 0$ on $[a,b]$ if and only if $f$ is monotone decreasing on $[a,b]$.
Classifying Extrema
We can leverage our understanding of function behavior to classify the maxima and minima of a differentiable function. By Fermat’s theorem we know that if the derivative exists at such points it must be zero, motivating the following definition:
Definition 22.2 (Critical Points) A critical point of a function $f$ is a point where either (1) $f$ is not differentiable, or (2) $f$ is differentiable, and the derivative is zero.
Note that not all critical points are necessarily local extrema - Fermat’s theorem only claims that extrema are critical points - not the converse! There are many examples showing this is not an if and only if:
Example 22.1 The function $f(x)=x^3$ has a critical point at $x=0$ (as the derivative is zero), but does not have a local extremum there. A function such as $g(x)=2x+|x|$ has a critical point at $x=0$ (because it is not differentiable there) but also does not have a local extremum.
If one is only interested in the absolute max and min of the function over its entire domain, this already provides a reasonable strategy, which is one of the early highlights of Calculus I.
Theorem 22.4 (Finding Global Extrema) Let $f$ be a continuous function defined on a closed interval $I$ with finitely many critical points. Then the absolute maximum and minimum value of $f$ are explicitly findable via the following procedure:
- Find the value of $f$ at the endpoints of $I$.
- Find the value of $f$ at the points of non-differentiability.
- Find the value of $f$ at the points where $f'(x)=0$.
The absolute max of $f$ is the largest of these values, and the absolute min is the smallest.
Proof. Because $I$ is a closed interval and $f$ is continuous, we are guaranteed by the extreme value theorem that $f$ achieves both a maximum and minimum value. Let these be $M, m$ respectively, realized at points $x_M, x_m$ with
$$f(x_M)=M\qquad f(x_m)=m$$
Without loss of generality, we will consider $x_M$ (the same argument applies to $x_m$).
First, $x_M$ could be at one of the endpoints of $I$. If it is not, then $x_M$ lies in the interior of $I$, and there is some small interval $(x_M-\epsilon,x_M+\epsilon)$ containing $x_M$ totally contained in the domain $I$. Since $x_M$ is the location of the global max, we know for all $x\in I$, $f(x)\leq f(x_M)$. Thus, $f(x)\leq f(x_M)$ for all $x\in(x_M-\epsilon,x_M+\epsilon)$, so $x_M$ is the location of a local max.
But if $x_M$ is the location of a local maximum, if $f$ is differentiable there by Fermat’s theorem we know $f'(x_M)=0$. Thus, $x_M$ must be a critical point of $f$ (whether differentiable or not).
Thus, $x_M$ occurs in the list of critical points and endpoints, which are the points we checked.
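The procedure of Theorem 22.4 is a finite comparison, and can be sketched in a few lines of Python. The function name `global_extrema` and the example $f(x)=|x^2-1|$ on $[-2,3]$ are assumptions made for illustration, and the critical points are supplied by hand rather than computed.

```python
def global_extrema(f, endpoints, critical_points):
    """Compare f at the endpoints and at the interior critical points.

    Returns ((x_min, f_min), (x_max, f_max)). The caller is responsible for
    listing every critical point (zeros of f' and points of
    non-differentiability); this sketch does not find them.
    """
    a, b = endpoints
    candidates = [a, b] + [x for x in critical_points if a < x < b]
    values = [(f(x), x) for x in candidates]
    f_min, x_min = min(values)
    f_max, x_max = max(values)
    return (x_min, f_min), (x_max, f_max)

# Illustrative choice: f(x) = |x^2 - 1| on [-2, 3].
# f'(x) = 0 at x = 0, and f is not differentiable at x = -1 and x = 1.
f = lambda x: abs(x**2 - 1)
(x_min, f_min), (x_max, f_max) = global_extrema(f, (-2.0, 3.0), [-1.0, 0.0, 1.0])
print("min", f_min, "at", x_min)
print("max", f_max, "at", x_max)
```

Here the minimum value is attained at both non-differentiable points, so the sketch reports whichever of them sorts first; the maximum sits at an endpoint, which is why the endpoints must be on the list.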
Oftentimes, however, one is concerned with the more fine-grained information of trying to classify specific extrema as (local) maxes or mins. This requires some additional investigation of the behavior of $f$ near the critical point.
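Proposition 22.3 (Distinguishing Maxes and Mins) Let $f$ be a continuously differentiable function on $[a,b]$ and $c\in(a,b)$ be a critical point where $f'(x)\leq 0$ for $x < c$ and $f'(x)\geq 0$ if $x > c$, for all $x$ in some small interval about $c$.
Then $c$ is a local minimum of $f$.
Proof. By the above, we know that $f'(x)\leq 0$ for $x < c$ implies that $f$ is monotone decreasing for $x\leq c$: that is, $x\leq c\implies f(x)\geq f(c)$. Similarly, as $f'(x)\geq 0$ for $x > c$, we have that $f$ is increasing for $x\geq c$, and $c\leq x\implies f(c)\leq f(x)$.
Thus, for $x$ on either side of $c$ we have $f(x)\geq f(c)$, so $c$ is the location of a local minimum.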
This is even more simply phrased in terms of the second derivative, as is common in Calculus I.
Theorem 22.5 (The Second Derivative Test) Let $f$ be a twice continuously differentiable function on $[a,b]$, and $c\in(a,b)$ a critical point. Then if $f''(c) > 0$, the point $c$ is the location of a local minimum, and if $f''(c) < 0$ then $c$ is the location of a local maximum.
Proof. We consider the case that $f''(c) > 0$, the other is analogous. Since $f''$ is continuous and positive at $c$, we know that there exists a small interval $(c-\epsilon,c+\epsilon)$ about $c$ where $f''$ is positive (a continuous function which is positive at a point remains positive on a neighborhood of that point).
Thus, by Proposition 22.2 applied to $f'$, we know on this interval that $f'$ is a (strictly) increasing function. Since $f'(c)=0$, this means that if $x < c$ we have $f'(x) < 0$ and if $x > c$ we have $f'(x) > 0$. That is, $f'$ changes from negative to positive at $c$, so $c$ is the location of a local minimum by Proposition 22.3.
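The test of Theorem 22.5 is easy to mechanize once the second derivative is known. Below is a minimal Python sketch; the name `classify_critical_point` and the example $f(x)=x^3-3x$ (with critical points $\pm 1$ and $f''(x)=6x$) are assumptions chosen for illustration. Note that the theorem says nothing when $f''(c)=0$, so the sketch reports that case as inconclusive.

```python
def classify_critical_point(d2f, c):
    """Apply the second derivative test at a critical point c.

    d2f is the second derivative, supplied by the caller; c is assumed
    to already satisfy f'(c) = 0.
    """
    second = d2f(c)
    if second > 0:
        return "local minimum"
    if second < 0:
        return "local maximum"
    return "inconclusive"  # the test gives no information when f''(c) = 0

# Illustrative choice: f(x) = x^3 - 3x, so f'(x) = 3x^2 - 3 and f''(x) = 6x.
d2f = lambda x: 6 * x
for c in (-1.0, 1.0):
    print(c, classify_critical_point(d2f, c))
```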
Contraction Maps
We can use what we’ve learned about derivatives and the mean value theorem to also produce a simple test for finding contraction maps.
Proposition 22.4 (Contraction Mappings) If $f$ is continuously differentiable and $|f'| < 1$ on a closed interval $I$, then $f$ is a contraction map.
Proof. Let $f$ have a continuous derivative which satisfies $|f'(x)| < 1$ for all $x$ in a closed interval $I$. Because $f'$ is continuous and $|\cdot|$ is continuous, so is the composition $|f'|$, and thus it achieves a maximum value on $I$ (Theorem 18.2); call this maximum $M$, and note that $M < 1$ by our assumption.
Now let $x < y\in I$ be arbitrary. By the Mean Value Theorem there is some $c\in(x,y)$ such that
$$f'(c)=\frac{f(y)-f(x)}{y-x}$$
Taking absolute values and using that $|f'(c)|\leq M$ this implies
$$|f(y)-f(x)|\leq M|y-x|$$
Since $x,y$ were arbitrary this holds for all such pairs, and so the distance between $f(x)$ and $f(y)$ decreases by a factor of at least $M$, which is strictly less than 1. Thus $f$ is a contraction map!
We know contraction maps to be extremely useful as they have a unique fixed point, and iterating from any starting value produces a sequence which rapidly converges to that fixed point. Using this differential condition it’s easy to check if a function is a contraction mapping, and thus easy to rigorously establish the existence of certain convergent sequences.
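As a good example, we give a re-proof of the convergence of the Babylonian procedure to $\sqrt{2}$.
Example 22.2 The function $f(x)=\frac{1}{2}\left(x+\frac{2}{x}\right)$ is a contraction map on the interval $[1,2]$. The fixed point of this map is $\sqrt{2}$, thus the sequence $x_{n+1}=f(x_n)$ converges to $\sqrt{2}$.
To prove this, note that if $f(x)=x$ then $x^2=2$, whose only positive solution is $\sqrt{2}$, thus it remains only to check that $f$ is a contraction. Computing its derivative:
$$f'(x)=\frac{1}{2}-\frac{1}{x^2}$$
On the interval $[1,2]$ the function $x^2$ lies in $[1,4]$ and so $\frac{1}{x^2}$ lies in the interval $\left[\frac{1}{4},1\right]$, and $f'(x)$ lies in $\left[-\frac{1}{2},\frac{1}{4}\right]$: thus $|f'|$ is bounded above by $\frac{1}{2}$ and $f$ is a contraction map!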
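To see the contraction at work numerically, here is a small Python sketch. It assumes the Babylonian map for $\sqrt{2}$ is $f(x)=\frac{1}{2}(x+\frac{2}{x})$ on $[1,2]$; the names `babylonian_step` and `iterate` are hypothetical helpers. It samples $|f'|$ on a grid as a sanity check (not a proof) and records how the error shrinks under iteration.

```python
import math

def babylonian_step(x):
    """One step of the assumed Babylonian map for sqrt(2)."""
    return 0.5 * (x + 2.0 / x)

def iterate(g, x0, steps):
    """Return [x0, g(x0), g(g(x0)), ...] with `steps` applications of g."""
    xs = [x0]
    for _ in range(steps):
        xs.append(g(xs[-1]))
    return xs

# Sample |f'(x)| = |1/2 - 1/x^2| on a grid in [1, 2].
grid = [1.0 + k / 1000 for k in range(1001)]
max_slope = max(abs(0.5 - 1.0 / x**2) for x in grid)

xs = iterate(babylonian_step, 1.0, 6)
errors = [abs(x - math.sqrt(2)) for x in xs]
print("largest sampled |f'|:", max_slope)
for n, e in enumerate(errors):
    print(n, e)
```

Each error is at most half the previous one, as the bound on $|f'|$ predicts; in practice the errors shrink much faster than that.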
Newton’s Method
Newton’s method is a recipe for numerically finding zeroes of a function $f$. It works iteratively, by taking one guess for a zero and producing a (hopefully) better one, using the geometry of the derivative and linear approximations. The procedure is simple to derive: given a point $a$ we can calculate the tangent line to $f$ at $a$
$$\ell(x)=f(a)+f'(a)(x-a)$$
and since this tangent line should be a good approximation of $f$ near $a$, if $a$ is near a zero of $f$, we can approximate this zero by solving not for $f(x)=0$ (which is hard, if $f$ is a complicated function) but $\ell(x)=0$ (which is easy, as $\ell$ is linear). Doing so gives
$$x=a-\frac{f(a)}{f'(a)}$$
Definition 22.3 (Newton Iteration) Let $f$ be a differentiable function. Then Newton iteration is the recursive procedure
$$x\mapsto N(x)=x-\frac{f(x)}{f'(x)}$$
Starting from some $x_0$ this defines a recursive sequence
$$x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}$$
This is an extremely useful calculational procedure in practice, so long as you can cook up a function that is zero at whatever point you are interested in. To return to a familiar example, to calculate $\sqrt{2}$ one might consider the function $f(x)=x^2-2$, or to find a solution to $\cos(x)=x$, one may consider $f(x)=x-\cos(x)$.
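The iteration is short enough to write out directly. The following is a minimal Python sketch, assuming the derivative is supplied by hand; the function name `newton`, the stopping tolerance, and the two test functions $x^2-2$ and $x-\cos(x)$ with their starting points are choices made for illustration.

```python
import math

def newton(f, df, x0, steps=20, tol=1e-14):
    """Run Newton iteration x -> x - f(x)/df(x) starting from x0.

    Stops early once successive iterates agree to within tol. Convergence
    is not guaranteed for a poorly chosen x0.
    """
    x = x0
    for _ in range(steps):
        slope = df(x)
        if slope == 0:
            raise ZeroDivisionError("derivative vanished; Newton step undefined")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Assumed examples: a zero of x^2 - 2, and a solution of cos(x) = x.
root2 = newton(lambda x: x * x - 2, lambda x: 2 * x, 2.0)
fixed = newton(lambda x: x - math.cos(x), lambda x: 1 + math.sin(x), 1.0)
print(root2, fixed)
```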
Exercise 22.3 Show the sequence of approximates from Newton’s method for $f(x)=x^2-2$ starting at $x_0$ is precisely the Babylonian sequence.
We already have several proofs that this sequence for $\sqrt{2}$ converges, so we know that Newton’s method works as expected in at least one instance. But we need a general proof. Below we offer a proof of the special case of a simple zero: where $f$ crosses the axis like $x$ rather than running tangent to it like $x^2$:
Definition 22.4 (Simple Zero) A continuously differentiable function $f$ has a simple zero at $c$ if $f(c)=0$ but $f'(c)\neq 0$.
Theorem 22.6 (Newton’s Method) Let $f$ be a continuously twice-differentiable function with a simple zero at $c$. Then there is some $\delta > 0$ such that applying Newton iteration to any starting point in $(c-\delta,c+\delta)$ results in a sequence that converges to $c$.
Proof. Our strategy is to show that there is an interval about $c$ on which the Newton iteration is a contraction map.
Since $c$ is a simple zero we know $f'(c)\neq 0$ and without loss of generality we take $f'(c) > 0$. Since $f$ is continuously twice differentiable $f'$ is also continuous, meaning there is some $a > 0$ where $f'$ is positive on the entire interval $(c-a,c+a)$. On this interval we may compute the derivative of the Newton map
$$N'(x)=1-\frac{f'(x)^2-f(x)f''(x)}{f'(x)^2}=\frac{f(x)f''(x)}{f'(x)^2}$$
Since $f, f'$ and $f''$ are all continuous and $f'$ is nonzero on this interval, $N'$ is continuous. As $f(c)=0$ we see $N'(c)=0$, so using continuity for any $\epsilon > 0$ there is some $\delta > 0$ where $|x-c| < \delta$ implies $|N'(x)| < \epsilon$.
Thus, choosing any $\epsilon < 1$ and taking $b=\min\{a,\delta\}$ we’ve found an interval $(c-b,c+b)$ where the derivative of $N$ is strictly bounded away from $1$: thus by Proposition 22.4 $N$ is a contraction map on this interval, and so iterating from any starting point produces a sequence that converges to the unique fixed point $x^\star$ of $N$ (Theorem 11.1). This fixed point satisfies
$$x^\star=N(x^\star)=x^\star-\frac{f(x^\star)}{f'(x^\star)}$$
which after some algebra simplifies to
$$f(x^\star)=0$$
Since $f(c)=0$ and $f'$ is positive on the entire interval by construction, $f$ is increasing and so $f(x) < 0$ for $x < c$ and $f(x) > 0$ for $x > c$. That is, $f$ has a unique zero on this interval, so $x^\star=c$ and our sequence of Newton iterates converges to $c$ as desired.
The structure of this proof tells us that Newton’s method is actually quite efficient: a contraction map which contracts by $\epsilon$ creates a Cauchy sequence that converges exponentially fast (like $\epsilon^n$). And in our proof, we see continuity of $N'$ lets us set any $\epsilon > 0$ and get an interval about $c$ where convergence is exponential in $\epsilon$. These intervals are nested, and so as $x_n$ gets closer and closer to $c$ the convergence of Newton’s method gets better and better: it’s always exponentially fast but the base of the exponential improves as we close in.
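This improving contraction factor is visible numerically: the ratio of successive errors itself shrinks as the iterates approach the zero. The Python sketch below assumes the test function $f(x)=x^2-2$ with known zero $\sqrt{2}$ and starting point $2$; the name `newton_errors` is a hypothetical helper. Only a few steps are taken, since after that the error reaches the limits of floating point precision.

```python
import math

def newton_errors(f, df, x0, root, steps):
    """Return the distances |x_n - root| for n = 0, ..., steps."""
    errors = [abs(x0 - root)]
    x = x0
    for _ in range(steps):
        x = x - f(x) / df(x)
        errors.append(abs(x - root))
    return errors

# Assumed example: f(x) = x^2 - 2, started at x0 = 2.
errors = newton_errors(lambda x: x * x - 2, lambda x: 2 * x, 2.0, math.sqrt(2), 4)
# Ratio of each error to the previous one: the observed contraction factor.
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]
for e, r in zip(errors[1:], ratios):
    print(e, r)
```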
Exercise 22.4 Provide an alternative proof of Newton’s method when $f$ is convex: if $c$ is a simple zero and $x_0 > c$, show the sequence of Newton iterates is a monotone decreasing sequence which is bounded below, and converges to $c$ via Monotone Convergence.
L’Hospital’s Rule
L’Hospital’s rule is a very convenient trick for computing tricky limits in calculus: it tells us that when we are trying to evaluate the limit of a quotient of continuous functions $\lim_{x\to a}\frac{f(x)}{g(x)}$ and ‘plugging in’ yields the undefined expression $\frac{0}{0}$, we can attempt to find the limit’s value by differentiating the numerator and denominator, and trying again. Precisely:
Theorem 22.7 (L’Hospital’s Rule) Let $f$ and $g$ be continuous functions on an interval containing $a$, and assume that both $f$ and $g$ are differentiable on this interval, with the possible exception of the point $a$.
Then if $f(a)=g(a)=0$ and $g'(x)\neq 0$ for all $x\neq a$,
$$\lim_{x\to a}\frac{f'(x)}{g'(x)}=L\qquad\text{implies}\qquad\lim_{x\to a}\frac{f(x)}{g(x)}=L$$
Proof (Sketch).
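- Show that for any $x\neq a$, we have $\frac{f(x)}{g(x)}=\frac{f(x)-f(a)}{g(x)-g(a)}=\frac{\frac{f(x)-f(a)}{x-a}}{\frac{g(x)-g(a)}{x-a}}$.
- For any $x\neq a$, use the MVT to get points $c,k$ between $a$ and $x$ such that $f'(c)=\frac{f(x)-f(a)}{x-a}$ and $g'(k)=\frac{g(x)-g(a)}{x-a}$.
- Choose a sequence $x_n\to a$: for each $x_n$, the above furnishes points $c_n,k_n$: show these sequences converge to $a$ by squeezing.
- Use this to show that the sequence $\frac{f'(c_n)}{g'(k_n)}$ converges to $L$, using our assumption $\lim_{x\to a}\frac{f'(x)}{g'(x)}=L$.
- Conclude that the sequence $\frac{f(x_n)}{g(x_n)}\to L$, and that $\lim_{x\to a}\frac{f(x)}{g(x)}=L$ as claimed.
Hint: Use the $\epsilon$-$\delta$ definition of a functional limit and our assumption $\lim_{x\to a}\frac{f'(x)}{g'(x)}=L$ to help: for any $\epsilon$, there’s a $\delta$ where $0 < |x-a| < \delta$ implies this quotient is within $\epsilon$ of $L$. Since $c_n,k_n\to a$ can you find an $N$ beyond which $\frac{f'(c_n)}{g'(k_n)}$ is always within $\epsilon$ of $L$?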
Exercise 22.5 Fill in the details of the above proof sketch.
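Before filling in the details, it can help to watch the rule hold numerically in one case. The Python sketch below assumes the example $f(x)=e^x-1$ and $g(x)=\sin(x)$ at $a=0$, where $f(a)=g(a)=0$ and $g'(x)=\cos(x)$ is nonzero near $0$; the helper name `quotient_table` is hypothetical. It tabulates both quotients as $x\to a$, which is evidence for the shared limit and not a proof of it.

```python
import math

# Assumed example functions and their derivatives (supplied by hand).
f = lambda x: math.exp(x) - 1.0
g = lambda x: math.sin(x)
df = lambda x: math.exp(x)
dg = lambda x: math.cos(x)

def quotient_table(a, hs):
    """For each offset h, return (h, f/g, f'/g') evaluated at x = a + h."""
    rows = []
    for h in hs:
        x = a + h
        rows.append((h, f(x) / g(x), df(x) / dg(x)))
    return rows

rows = quotient_table(0.0, [10.0 ** (-k) for k in range(1, 7)])
for h, fg, dfdg in rows:
    print(h, fg, dfdg)
```

Both columns approach $1$, the value of $\frac{f'(0)}{g'(0)}$.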