**3.4: The Chain Rule**

Dun dun dunnnnnnnnnnnnnn.

The chain rule. Much like when I have sex with your Mom, the chain is very important.

The chain rule tells you how to deal with nested functions. What’s a nested function? Well, let’s remember what a function is.

Loosely speaking, a function is an operation you perform to move from one set of data to another. The operation might be adding 5, multiplying by 40, tetrating to 10, whatever. Each of those is one operation you do to get from a data point on the first set to a data point on the new set.

A nested function is when you do that move a couple times. So, say you have sets W, X, Y, and Z. You might have a function that takes you from W to X. Then, to get to Y you run another function. Then, to get to Z, you run yet another function.

Of course, what those functions are is defined by you. There will be some function that gets you from W to Z in one operation. It just may be much uglier than any of the individual functions you could’ve used to go from W to X to Y to Z.
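If it helps to see the W to X to Y to Z idea as code, here is a minimal sketch. The three functions are made up for illustration (add 5, multiply by 40, then square), and all four "sets" are assumed to just be ordinary numbers.

```python
# Hypothetical example functions, not from the book.
def w_to_x(w):
    return w + 5          # operation 1: add 5

def x_to_y(x):
    return x * 40         # operation 2: multiply by 40

def y_to_z(y):
    return y ** 2         # operation 3: square it

def w_to_z(w):
    # The single "straight from W to Z" function: same result, uglier formula.
    return (40 * (w + 5)) ** 2

w = 2
step_by_step = y_to_z(x_to_y(w_to_x(w)))   # nest the three small functions
print(step_by_step, w_to_z(w))             # the two values should match
```

The nested call `y_to_z(x_to_y(w_to_x(w)))` is the kind of thing the chain rule is built to handle.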

Now, imagine you need to take the derivative of the big ugly W-straight-to-Z function. Maybe it’s something like $\sin(\sqrt{\cos(x^2)})$. You probably see that this big mega-bitch of a function can be seen as 4 separate ones: square x, take the cosine of the result, take the square root of that result, take the sine of that result. But, how in the balls do you take the derivative?

First, I’m going to give you the tool, then we’ll solve that big ugly bitch there. Then, the proof.

You ready for me to go full frontal nerdity on you? Here goes: The way I think about the chain rule corresponds to a rule in the card game Magic: the Gathering. In this game (at least, when I played it around the Late Nineteen-Hundreds) there is a type of card called an “instant.” You’re allowed to play this card at any time. But, this creates a problem – what if I play an instant, then you play an instant on my instant, then Frank plays an instant on your instant? The various cards might affect one another, so you need an order of operations.

The term used in Magic is “last one in, first one out.” And, this is the same rule the chain rule exercises. To rephrase it for math: The most nested operation is the first one you deal with.

Let’s look at the book’s example:

$F(x) = \sqrt{x^2 + 1}$

The x on the right side has two operations performed on it.

Operation 1: Square it, then add 1.

Operation 2: Take the square root.

Before we go on, you might object and say “BAH! There are 3 operations!”

Operation 1: Square it

Operation 2: Add 1

Operation 3: Take the square root

Let me assure you that your objection is both ill-informed and stupid. We’re not looking to break up every operation. We’re just breaking the overall function into functions whose derivative is easy to take. And, if you wanna get technical, there are infinity operations being done on x:

Operation 1: Square it

Operation 2: Add 1

Operation 3: Take the square root

Operation 4: Add 0

Operation 5: Add 0

Operation 6: Add 0…

So, back to our example:

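The first function is $x^2 + 1$, whose derivative is simply 2x.

The second function is $\sqrt{x^2 + 1}$, which you should think of as $(x^2 + 1)^{1/2}$. The derivative, treating the whole $x^2 + 1$ chunk as a single variable, is simply $\frac{1}{2}(x^2 + 1)^{-1/2}$. (Note, if that’s at all confusing, look up the power rule for derivatives and the definition of a square root).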

So, now you have two derivatives:

1) 2x

2) $\frac{1}{2}(x^2 + 1)^{-1/2}$

Just multiply them together, and you have the derivative:

$2x \cdot \frac{1}{2}(x^2 + 1)^{-1/2} = \frac{x}{\sqrt{x^2 + 1}}$

BAM! That’s what is meant by last in first out.
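If you don’t trust the multiplication, you can sanity-check it numerically. The sketch below assumes the example function is the square root of x squared plus 1. The name `F`, the test point x = 3, and the step size `h` are my own choices. It compares the chain-rule product against a plain central-difference estimate of the slope.

```python
import math

def F(x):
    # The example function: square x, add 1, take the square root.
    return math.sqrt(x**2 + 1)

def chain_rule_derivative(x):
    inner = 2 * x                        # derivative of x^2 + 1
    outer = 0.5 * (x**2 + 1) ** -0.5     # derivative of u^(1/2), evaluated at u = x^2 + 1
    return inner * outer

def numerical_derivative(func, x, h=1e-6):
    # Central difference: slope of the line through two nearby points.
    return (func(x + h) - func(x - h)) / (2 * h)

x = 3.0
print(chain_rule_derivative(x))      # chain rule answer
print(numerical_derivative(F, x))    # numerical estimate, should agree closely
```

Both numbers should come out close to 3 divided by the square root of 10, roughly 0.9487.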

But why the balls does this work?

Let’s get a better sense of what you’re actually doing. In the example there, you were taking the derivative of a function of a function. Let’s call the inner function g(x) and the outer function f(x). Thus, the whole function is f(g(x)), pronounced “f of g of x.”

You’ll notice that each function is not operating on the same thing. g is operating on x. f is operating on g(x). But when you take the derivative, you want to know how f(g(x)) changes with x itself.

That is, you want to know $\frac{df}{dx}$. So, what you do is you come up with derivatives for the nested functions.

The derivative of g(x) is just $\frac{dg}{dx}$, i.e. the derivative of g(x) with respect to x. The derivative for f(g(x)) is $\frac{df}{dg}$, i.e. the derivative of f with respect to g(x). Let’s see how that all looks when multiplied as the chain rule prescribes.

$\frac{df}{dg} \cdot \frac{dg}{dx} = \frac{df}{dx}$

It sure looks like you can just multiply through for your answer. Kinda neat, huh? Unfortunately, this isn’t quite proof. Your spidey math-sense may be telling you it doesn’t quite seem legal to multiply derivatives that way. And, in fact, it’s not a legal move. For example, what if the change in g(x) is 0 at some point? Now you’re dividing by zero and ending the universe.

That said, the above notation is super useful as a reminder, and it’s a reasonably good hint at the truth. Plus, your physics professor will probably tell you to go ahead and use it as long as there aren’t any mathematicians looking, since it’s true for most uses.
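To see both the hint and the problem in action, here is a rough sketch using small finite changes instead of true derivatives. The helper `delta_ratios`, the step `dx`, and the constant function are all made up for illustration. With a normal inner function, the product of the two ratios matches the direct ratio. With an inner function that never changes, the change in g is zero and the division falls over.

```python
import math

def delta_ratios(f, g, x, dx=1e-6):
    dg = g(x + dx) - g(x)              # small change in the inner function
    df = f(g(x + dx)) - f(g(x))        # resulting small change in the outer function
    product = (df / dg) * (dg / dx)    # "multiply through" version
    direct = df / dx                   # what we actually want
    return product, direct

def g(x):
    return x**2 + 1

def constant_g(x):
    return 7.0                         # inner function that never changes

f = math.sqrt

product, direct = delta_ratios(f, g, 3.0)
print(product, direct)                 # these should agree

try:
    delta_ratios(f, constant_g, 3.0)
except ZeroDivisionError:
    print("the change in g was 0, so df/dg blew up")
```

This is only a finite-difference cartoon of the argument, not the proper proof, but it shows why the cancellation trick needs extra care.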

So, with that in mind, see if you can solve this mother fucker:

$\frac{d}{dx} \sin(\sqrt{\cos(x^2)})$

Give it a shot yourself before you read on.

Did you do so?

Okay.

Operation 1: square it

Operation 2: cos it

Operation 3: square root it

Operation 4: sin it

First out: 2x

Second out: $-\sin(x^2)$

Third out: $\frac{1}{2}(\cos(x^2))^{-1/2}$

Fourth out: $\cos(\sqrt{\cos(x^2)})$

(I just used more latex than a sailor on Christmas Eve, HEYOOOO!)

So, put it all together:

$2x \cdot (-\sin(x^2)) \cdot \frac{1}{2}(\cos(x^2))^{-1/2} \cdot \cos(\sqrt{\cos(x^2)})$

Fuck you, I’m not simplifying that.

Now, you should have a good sense of why it’s called the chain rule. You’re chaining together functions to put together a big ugly derivative.
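Here is a quick numerical check of the four-link chain, assuming the function is the sine of the square root of the cosine of x squared. The function names, the test point x = 0.5 (picked so the cosine stays positive and the square root is defined), and the step size are my own assumptions.

```python
import math

def big_ugly(x):
    # square it, cos it, square root it, sin it
    return math.sin(math.sqrt(math.cos(x**2)))

def chained_derivative(x):
    first = 2 * x                                   # derivative of x^2
    second = -math.sin(x**2)                        # derivative of cos, at x^2
    third = 0.5 * math.cos(x**2) ** -0.5            # derivative of square root, at cos(x^2)
    fourth = math.cos(math.sqrt(math.cos(x**2)))    # derivative of sin, at sqrt(cos(x^2))
    return first * second * third * fourth

def numerical_derivative(func, x, h=1e-6):
    # Central difference estimate of the slope at x.
    return (func(x + h) - func(x - h)) / (2 * h)

x = 0.5
print(chained_derivative(x))
print(numerical_derivative(big_ugly, x))   # should agree closely with the line above
```

Each link is the derivative of one operation, evaluated at whatever that operation was fed, and the product is the whole derivative.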

**Next up: Using the Chain for familiar functions and rules, and the proper proof of the chain rule.**

Good thing you didn’t try to simplify it – according to Wolfram Alpha, that’s about as pretty as it gets. The alternate forms all look worse.

http://www.wolframalpha.com/input/?i=Derive(sin(sqrt(cos(x^2)))

LaTeX tip: while writing something in math mode, if you want text with actual normal spacing in it, use $\mathit{Text with spaces here}$ which will give you properly spaced italicized text. Or if you just want normal roman text $\mathrm{Roman text here}$. Also, you were sort of inconsistent in doing trig functions properly. You want to precede them with a backslash: $\sin^2 (x) + \cos^2 (x) = 1$. Also, if you want to replace the ugly asterisk with the actual times symbol, use $\times$

This made me actually laugh out loud. I enjoyed it immensely, and I think it would have been useful if my Calc course didn’t cover it earlier. Still, any light you can shed prematurely on Calc is fantastic, as I like being top of the class with minimum effort.

I was just going to remark on that asterisk myself. I’m just going to add that you can also use $\cdot$ if you prefer the dot over the x as a multiplication sign.

Excellent site, dude. Wish this were around when my non-English-speaking professor was trying to teach Calc….