This is the chain rule: if g is differentiable at a and f is differentiable at g(a), then (f ∘ g)′(a) = f′(g(a))g′(a).
Proofs of the chain rule
One proof of the chain rule begins with the definition of the derivative:

(f ∘ g)′(a) = lim_{x → a} [f(g(x)) − f(g(a))] / (x − a).
Assume for the moment that g(x) does not equal g(a) for any x near a. Then the previous expression is equal to the product of two factors:

lim_{x → a} [f(g(x)) − f(g(a))] / (g(x) − g(a)) · (g(x) − g(a)) / (x − a).
When g oscillates near a, then it might happen that no matter how close one gets to a, there is always an even closer x such that g(x) equals g(a). For example, this happens for g(x) = x² sin(1/x) near the point a = 0. Whenever this happens, the above expression is undefined because it involves division by zero. To work around this, introduce a function Q as follows:

Q(y) = [f(y) − f(g(a))] / (y − g(a))   if y ≠ g(a),
Q(y) = f′(g(a))   if y = g(a).
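The oscillation can be seen numerically; a minimal sketch (not part of the proof), checking that g(x) = x² sin(1/x) equals g(0) = 0 at points arbitrarily close to 0:

```python
import math

# g(x) = x^2 sin(1/x), extended by g(0) = 0, is differentiable at 0 with
# g'(0) = 0, yet it equals g(0) = 0 at every point x_n = 1/(n*pi),
# and these points approach 0.
def g(x):
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0

zeros = [1.0 / (n * math.pi) for n in range(1, 6)]
for x in zeros:
    print(f"g({x:.4f}) = {g(x):.2e}")  # ~0 up to floating-point rounding
```

At each x_n the difference quotient for f ∘ g would involve division by g(x_n) − g(0) = 0, which is why Q is needed.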
We will show that the difference quotient for f ∘ g is always equal to:

Q(g(x)) · (g(x) − g(a)) / (x − a).
Whenever g(x) is not equal to g(a), this is clear because the factors of g(x) - g(a) cancel. When g(x) equals g(a), then the difference quotient for f ∘ g is zero because f(g(x)) equals f(g(a)), and the above product is zero because it equals f′(g(a)) times zero. So the above product is always equal to the difference quotient, and to show that the derivative of f ∘ g at a exists and to determine its value, we need only show that the limit as x goes to a of the above product exists and determine its value.
To do this, recall that the limit of a product exists if the limits of its factors exist. When this happens, the limit of the product of these two factors will equal the product of the limits of the factors. The two factors are Q(g(x)) and (g(x) - g(a)) / (x - a). The latter is the difference quotient for g at a, and because g is differentiable at a by assumption, its limit as x tends to a exists and equals g′(a).
It remains to study Q(g(x)). Q is defined wherever f is. Furthermore, because f is differentiable at g(a) by assumption, Q is continuous at g(a). g is continuous at a because it is differentiable at a, and therefore Q ∘ g is continuous at a. So its limit as x goes to a exists and equals Q(g(a)), which is f′(g(a)).
This shows that the limits of both factors exist and that they equal f′(g(a)) and g′(a), respectively. Therefore the derivative of f ∘ g at a exists and equals f′(g(a))g′(a).
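The conclusion can be sanity-checked numerically; here is a minimal sketch with the illustrative choices f = sin and g(x) = x³ + 1 (not from the text):

```python
import math

# Compare the difference quotient of f∘g at a with f'(g(a)) * g'(a)
# for f = sin, g(x) = x**3 + 1 (illustrative choices).
f, fprime = math.sin, math.cos
g = lambda x: x**3 + 1
gprime = lambda x: 3 * x**2

a, h = 0.5, 1e-6
quotient = (f(g(a + h)) - f(g(a))) / h   # difference quotient of f∘g at a
chain_rule = fprime(g(a)) * gprime(a)    # value predicted by the chain rule
print(quotient, chain_rule)              # the two values agree closely
```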
Another way of proving the chain rule is to measure the error in the linear approximation determined by the derivative. This proof has the advantage that it generalizes to several variables. It relies on the following equivalent definition of differentiability at a point: A function g is differentiable at a if there exists a real number g′(a) and a function ε(h) that tends to zero as h tends to zero, and furthermore

g(a + h) − g(a) = g′(a)h + ε(h)h.
Here the left-hand side represents the true difference between the value of g at a and at a + h, whereas the right-hand side represents the approximation determined by the derivative plus an error term.
In the situation of the chain rule, such a function ε exists because g is assumed to be differentiable at a. Again by assumption, a similar function also exists for f at g(a). Calling this function η, we have

f(g(a) + k) − f(g(a)) = f′(g(a))k + η(k)k.
The above definition imposes no constraints on η(0), even though it is assumed that η(k) tends to zero as k tends to zero. If we set η(0) = 0, then η is continuous at 0.
Proving the theorem requires studying the difference f(g(a + h)) − f(g(a)) as h tends to zero. The first step is to substitute for g(a + h) using the definition of differentiability of g at a:
f(g(a + h)) − f(g(a)) = f(g(a) + g′(a)h + ε(h)h) − f(g(a)).
The next step is to use the definition of differentiability of f at g(a). This requires a term of the form f(g(a) + k) for some k. In the above equation, the correct k varies with h. Set kₕ = g′(a)h + ε(h)h and the right hand side becomes f(g(a) + kₕ) − f(g(a)). Applying the definition of the derivative gives:

f(g(a + h)) − f(g(a)) = f′(g(a))kₕ + η(kₕ)kₕ.
To study the behavior of this expression as h tends to zero, expand kₕ. After regrouping the terms, the right-hand side becomes:

f′(g(a))g′(a)h + [f′(g(a))ε(h) + η(kₕ)g′(a) + η(kₕ)ε(h)]h.
Because ε(h) and η(kₕ) tend to zero as h tends to zero, the bracketed terms tend to zero as h tends to zero. Because the above expression is equal to the difference f(g(a + h)) − f(g(a)), by the definition of the derivative f ∘ g is differentiable at a and its derivative is f′(g(a))g′(a).
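For concrete functions the regrouped identity is exact (up to floating-point rounding), which can be checked directly; a sketch with the illustrative choices f = exp, g = sin, a = 0.7:

```python
import math

# Verify the regrouped identity of the error-term proof numerically for
# f = exp, g = sin, a = 0.7 (illustrative choices, not from the text).
f, fprime = math.exp, math.exp
g, gprime = math.sin, math.cos
a, h = 0.7, 1e-3

eps = (g(a + h) - g(a)) / h - gprime(a)   # ε(h): error in g's linearization
k_h = gprime(a) * h + eps * h             # k_h = g(a + h) - g(a)
eta = (f(g(a) + k_h) - f(g(a))) / k_h - fprime(g(a))  # η(k_h)

lhs = f(g(a + h)) - f(g(a))
rhs = fprime(g(a)) * gprime(a) * h + (
    fprime(g(a)) * eps + eta * gprime(a) + eta * eps
) * h
print(lhs, rhs)  # equal up to floating-point rounding
```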
The role of Q in the first proof is played by η in this proof. They are related by the equation:

Q(y) = f′(g(a)) + η(y − g(a)).
The need to define Q at g(a) is analogous to the need to define η at zero. However, the proofs are not exactly equivalent. The first proof relies on a theorem about products of limits to show that the derivative exists. The second proof does not need this because showing that the error term vanishes proves the existence of the limit directly.
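The relation between Q and η can be checked pointwise; a minimal sketch with the illustrative choice f = exp and g(a) = 0.5:

```python
import math

# Check Q(y) = f'(g(a)) + η(y − g(a)) pointwise for f = exp and the value
# g(a) = 0.5 (illustrative; the specific g does not matter here).
f, fprime = math.exp, math.exp
ga = 0.5  # g(a)

def Q(y):
    # Q from the first proof: the difference quotient of f, extended at g(a)
    return (f(y) - f(ga)) / (y - ga) if y != ga else fprime(ga)

def eta(k):
    # η from the second proof, with η(0) = 0 as in the text
    return (f(ga + k) - f(ga)) / k - fprime(ga) if k != 0 else 0.0

for y in (0.2, 0.5, 0.9):
    print(Q(y), fprime(ga) + eta(y - ga))  # the two columns match
```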
Wikipedia is awesome XD nice one Aron
THAT'S why SOPA is silly XD
I don't understand the part (on http://web.mit.edu/wwmath/calculus/differentiation/chain-proof.html)
after it says "Differentiability implies continuity; therefore..."
Why does \(du \rightarrow 0\) as \(dx \rightarrow 0\)?
And how does that get inserted into the equations?
I don't know why, but I don't understand the wikipedia explanation.