This approach is derived from the chain rule. Implicitly we're treating y as a function of x, so when we have a function of y (such as tan y) we have a function of a function of x. The chain rule tells us that when we have a function of a function, we take the derivative of the outer function with respect to the inner one (the inner one being y in this case) and multiply by the derivative of the inner one with respect to the independent variable (which is of course x).
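As a quick worked example, here is that rule applied to tan y, using only the standard fact that the derivative of tan y with respect to y is sec^2 y:

```latex
\[
\frac{d}{dx}\,\tan y
  \;=\; \frac{d}{dy}\bigl(\tan y\bigr)\cdot\frac{dy}{dx}
  \;=\; \sec^2 y \,\frac{dy}{dx}
\]
```

The first factor is the outer function differentiated with respect to y; the second is y differentiated with respect to x.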
One way to keep all this straight is to think of dy and dx as real quantities that we can multiply and divide. (You'll be doing this later in the course.) In the formulation given in this lecture (clip 2 in session 15), we have d/dy multiplied by dy/dx, so the dy's cancel out, leaving d/dx, which is what we want. Your version would give us d/dx times dy/dx, with nothing to cancel the dy, and dx squared in the denominator, which is something you definitely don't want to see.
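Written out symbolically, the bookkeeping looks roughly like this. Treat it as a memory aid under the "dy and dx are quantities" assumption, not a rigorous argument:

```latex
\[
\frac{d}{dy}\cdot\frac{dy}{dx} \;\longrightarrow\; \frac{d}{dx}
\qquad\text{(the $dy$'s cancel)}
\]
\[
\frac{d}{dx}\cdot\frac{dy}{dx} \;\longrightarrow\; \frac{d\,dy}{dx^{2}}
\qquad\text{(nothing cancels the $dy$)}
\]
```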