In lecture 16, around minute 27, the professor solves for the least squared error using calculus, by taking partial derivatives.
I can't understand why we set, for example, d=0 and proceed. What relation does this have to gradient descent?
MIT 18.06 Linear Algebra, Spring 2010
salehmamdouh1984, in effect we are treating the squared length of the error vector, ||e||^2, as a function f(C,D). Prof Strang shows on the board that in his example f(C,D) = (C+D-1)^2 + (C+2D-2)^2 + (C+3D-2)^2, and the intention is to choose C and D to minimise f(C,D). The way to find the minimum is to take the two partial derivatives df/dC and df/dD; a stationary point is wherever df/dC = df/dD = 0, which is why each derivative gets set to zero. Prof Strang glosses over the point, but it's not hard to show (by taking second partial derivatives) that the stationary point is a minimum. Of course, the whole point is to demonstrate that calculus gets you the same answer you can get much more quickly by requiring e to be orthogonal to the columns of A, which gives A'Ax = A'b. Josh.
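To make that concrete, here is a small sketch of my own (not from the lecture board) that solves the two equations you get from df/dC = 0 and df/dD = 0. The data points (1,1), (2,2), (3,2) are read off from the three terms of f(C,D); the variable names and the use of Cramer's rule are just my choices. Dividing each derivative by 2 leaves 3C + 6D = 5 and 6C + 14D = 11, which is exactly A'Ax = A'b.

```python
from fractions import Fraction

# Data read off from f(C,D): the fitted line is b = C + D*t
ts = [1, 2, 3]
bs = [1, 2, 2]

# Entries of A'A and A'b, where A has columns [1, 1, 1] and [1, 2, 3]
n = len(ts)                                   # 3
sum_t = sum(ts)                               # 6
sum_tt = sum(t * t for t in ts)               # 14
sum_b = sum(bs)                               # 5
sum_tb = sum(t * b for t, b in zip(ts, bs))   # 11

# Solve  n*C     + sum_t*D  = sum_b
#        sum_t*C + sum_tt*D = sum_tb   by Cramer's rule
det = n * sum_tt - sum_t * sum_t
C = Fraction(sum_b * sum_tt - sum_t * sum_tb, det)
D = Fraction(n * sum_tb - sum_t * sum_b, det)

# Error vector e = b - (C + D*t); it should be orthogonal to both columns of A
e = [b - (C + D * t) for t, b in zip(ts, bs)]
dot_ones = sum(e)
dot_ts = sum(t * ei for t, ei in zip(ts, e))

print(C, D)              # 2/3 1/2
print(dot_ones, dot_ts)  # 0 0
```

Both dot products come out to zero, so the calculus answer and the projection answer are the same C and D.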
What I am trying to understand is this: what is the difference between the projection method and the gradient descent algorithm?
salehmamdouh1984, isn't the gradient descent algorithm a numerical technique to use when you can't find a minimum any other way? I'm not sure it's called for when you can go directly to the solution, either by projection or by calculus. Josh.
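For comparison, here is a rough sketch of gradient descent applied to the same f(C,D) = (C+D-1)^2 + (C+2D-2)^2 + (C+3D-2)^2. The starting point, the step size of 0.05 and the 2000 iterations are my own assumptions, chosen so that the iteration settles for this particular function; none of them come from the lecture. Instead of solving df/dC = df/dD = 0 in one go, it keeps stepping against the gradient until the gradient is nearly zero.

```python
def gradient(C, D):
    """Partial derivatives of f(C,D) = (C+D-1)^2 + (C+2D-2)^2 + (C+3D-2)^2."""
    r1 = C + D - 1
    r2 = C + 2 * D - 2
    r3 = C + 3 * D - 2
    df_dC = 2 * (r1 + r2 + r3)
    df_dD = 2 * (r1 + 2 * r2 + 3 * r3)
    return df_dC, df_dD

# Assumed settings: start at the origin, fixed step size, fixed iteration count
C, D = 0.0, 0.0
step = 0.05
for _ in range(2000):
    gC, gD = gradient(C, D)
    C -= step * gC
    D -= step * gD

print(round(C, 6), round(D, 6))  # approaches C = 2/3, D = 1/2
```

It ends up at approximately the same C = 2/3, D = 1/2 that the normal equations give directly, just after many small steps instead of one solve.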