agdgdgdgwngo 3 years ago Can anyone help me transform this double sum into one big vector operation? $\frac{1}{m} \sum_{i=1}^{m}\sum_{k=1}^{K}\left[-y_{k}^{(i)}\log\left((h_{\theta}(x^{(i)}))_k\right) - (1 - y_{k}^{(i)})\log\left(1-(h_{\theta}(x^{(i)}))_k\right)\right]$

1. agdgdgdgwngo

Think of i as picking out the i'th row of a matrix, and k as picking out the k'th element of that row (a vector).

2. JamesJ

I suggest you think about exactly what all this notation means (it is not obvious to the rest of us without context). I imagine there is a way to convert this into an inner product calculation and/or a matrix operation.

3. agdgdgdgwngo

btw can we move the sum symbols around, so that the sum over k comes before the sum over i (the one that runs up to m)?
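
On swapping the sums: both are finite, so the order can be exchanged freely. A sketch of how that leads to a matrix form, assuming (these names are not from the original post) that $Y$ and $H$ are $m \times K$ matrices whose $(i,k)$ entries are $y_k^{(i)}$ and $(h_\theta(x^{(i)}))_k$, and writing $c_{ik}$ for the bracketed term:

$$c_{ik} = -y_k^{(i)} \log\left((h_\theta(x^{(i)}))_k\right) - (1 - y_k^{(i)}) \log\left(1 - (h_\theta(x^{(i)}))_k\right)$$

$$\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K} c_{ik} = \frac{1}{m}\sum_{k=1}^{K}\sum_{i=1}^{m} c_{ik} = -\frac{1}{m}\,\mathbf{1}_m^{T}\left[\,Y \odot \log H + (1 - Y) \odot \log(1 - H)\,\right]\mathbf{1}_K$$

Here $\odot$ is the element-wise product, $\log$ is applied element-wise, $1 - Y$ means an all-ones matrix minus $Y$, and $\mathbf{1}_m$, $\mathbf{1}_K$ are all-ones column vectors, so the outer products with them simply add up every entry of the bracketed matrix.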

4. agdgdgdgwngo

I had one like this (with only the sigma from one to m) and made it into an inner product calculation, but the new sigma symbol is giving me hell :(

5. agdgdgdgwngo

we're summing up all the vectors produced by that expression inside those brackets, and then we're summing up the elements of that one big vector.

6. dmancine

This is from Neural Networks in the Stanford Machine Learning class. I think I got the inner sum vectorized. I made yi a zero vector of size K, then set its y(i)th element to 1. Then I assigned the output activations to h (just to make it look more like the formula). Then the inner sum just became -yi' * log(h) - (1-yi') * log(1-h). I guess I could omit the transposes, turn the multiplications into ".*", and sum the result. But I'm having a devil of a time creating a Y matrix where row i is the yi I gave above. I can create a 5000x10 matrix of zeros (easy). Then I want to use y (the 5000x1 label vector) as indices into each row of Y, and set those entries to 1. Something like Y(y) = 1, but that doesn't work. I'm reading about indexing right now to try to figure it out. I was able to figure out a vectorized cost function for last week's homework, so once I get this Y figured out I think I'll have it.
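
A minimal Octave sketch of the idea in this post, not anyone's actual homework solution. The toy sizes, the values in `H`, and the names `Y`, `H`, `J`, and `J_loop` are all made up for illustration. It uses `sub2ind` to do what `Y(y) = 1` was reaching for: turn (row, column) pairs into linear indices so that row i gets a 1 in column y(i).

```octave
% Hypothetical toy data: m = 4 examples, K = 3 classes.
y = [2; 1; 3; 2];              % label vector, entries in 1..K
K = 3;
m = numel(y);
H = [0.1 0.8 0.1;
     0.7 0.2 0.1;
     0.2 0.2 0.6;
     0.3 0.5 0.2];             % made-up output activations, one row per example

% One-hot Y: row i has a 1 in column y(i).
% sub2ind converts the (row, column) pairs (i, y(i)) into linear indices.
Y = zeros(m, K);
Y(sub2ind(size(Y), (1:m)', y)) = 1;

% The double sum: element-wise products, then a sum over every entry.
J = -sum(sum(Y .* log(H) + (1 - Y) .* log(1 - H))) / m;

% Plain double loop over i and k, for comparison only.
J_loop = 0;
for i = 1:m
  for k = 1:K
    J_loop = J_loop - Y(i,k) * log(H(i,k)) - (1 - Y(i,k)) * log(1 - H(i,k));
  end
end
J_loop = J_loop / m;
```

`J` and `J_loop` should agree up to floating-point rounding; the vectorized line replaces both loops.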

7. ineedbiohelpandquick

How did the thing go with the girl?

8. agdgdgdgwngo

@dmancine Yeah I finally figured it out. I used a for loop to construct that new y matrix, and then used another for loop to loop over each training example, but I reused the vectorized cost computation for each example, so it's reasonably fast :-P

9. agdgdgdgwngo

I just set a new y matrix to be equal to a matrix of zeros of size m x num_labels and then I used a for-loop. can't show actual code or we will get flamed for honor code stuff :-D

10. agdgdgdgwngo

I do think the whole thing can be vectorized into a one-liner, but I just want to get this assignment done :-P

11. agdgdgdgwngo

Finally got this assignment done (just now, which is about 2 hours before the due time!) I struggled to implement a for-loop based backpropagation, but I became so frustrated with the 'nonconforming operands' errors that I just went with vectorization for the whole problem.

12. agdgdgdgwngo

here's one way to transform that y matrix into an array of classification vectors: yi = eye(num_labels)(y,:);
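
A small sketch of what that one-liner does, on made-up labels (the values of `num_labels` and `y` here are hypothetical). Indexing the identity matrix with `y` as the row index picks row y(i) of the identity for each example, which is exactly the one-hot vector for that label. The chained form `eye(num_labels)(y,:)` is Octave syntax; the two-step form below should also work in MATLAB.

```octave
num_labels = 4;
y = [3; 1; 4; 1];        % hypothetical labels in 1..num_labels

I = eye(num_labels);     % num_labels x num_labels identity
Y = I(y, :);             % row i of Y is row y(i) of the identity

disp(Y);
```

Each row of `Y` contains a single 1, in the column given by the corresponding label.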

13. dmancine

Whew! 20 minutes to spare!

14. dmancine

@agd, That's the formula I was trying to think of. I even thought about indexing into the eye matrix. I just couldn't get all the way to the end. Thanks! In the ex4.pdf he suggests using logical arrays might be helpful for this, and they were explained in the previous programming exercise. I haven't figured out how to use that, yet, either.

15. agdgdgdgwngo

I never really figured out the logical arrays part either :-P
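
One possible reading of the logical-array hint, sketched on made-up data (the names and values are hypothetical, and this may not be what the exercise had in mind). Comparing the label column against the row vector `1:num_labels` gives a logical matrix that is true exactly where column k equals the label of row i. `bsxfun` is used here so the comparison expands across rows and columns explicitly.

```octave
num_labels = 4;
y = [3; 1; 4; 1];                      % hypothetical labels in 1..num_labels

% Y_logical(i, k) is true when y(i) == k.
Y_logical = bsxfun(@eq, y, 1:num_labels);

% Convert to double if the matrix will be used in arithmetic.
Y = double(Y_logical);
```

The result matches the one-hot matrix produced by indexing into the identity matrix.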

16. agdgdgdgwngo

Time's up :(

17. agdgdgdgwngo

wait..... argh I forgot to submit my completed work!!!!

18. agdgdgdgwngo

oh the page was not refreshed :-P yeah it's there