Deep Learning

Mathematics for Deep Learning

My Mathematics for Deep Learning notes are intended to document, in easy-to-follow but rigorous steps, the multi-variable calculus behind backpropagation. The final result for the gradients with respect to the neural net's parameters is surprisingly simple, although, to be fair, the apparent simplicity depends on some well-chosen notation.
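For reference, here is the standard form of that result for a fully-connected net (the notation in the notes themselves may differ). With layers l = 1, ..., L, weights W^l, biases b^l, pre-activations z^l = W^l a^{l-1} + b^l, activations a^l = \sigma(z^l), and cost C:

\[
\delta^L = \nabla_{a^L} C \odot \sigma'(z^L),
\qquad
\delta^l = \left( (W^{l+1})^\top \delta^{l+1} \right) \odot \sigma'(z^l),
\]
\[
\frac{\partial C}{\partial W^l} = \delta^l \, (a^{l-1})^\top,
\qquad
\frac{\partial C}{\partial b^l} = \delta^l .
\]

The entire backward pass reduces to these four equations; the recursion for \delta^l is where the chain rule does its work.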

TODO: To make the general result more intuitive, I could add a realistic L = 2 (two-layer) example at the end of the notes.
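As a rough illustration of what such an L = 2 example might compute, here is a minimal NumPy sketch. The specific architecture (sigmoid hidden layer, linear output layer, squared-error loss) and all variable names are my own assumptions, not taken from the notes; one gradient entry is checked against a finite difference.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))                      # input
y = rng.normal(size=(2, 1))                      # target
W1, b1 = rng.normal(size=(4, 3)), np.zeros((4, 1))
W2, b2 = rng.normal(size=(2, 4)), np.zeros((2, 1))

# Forward pass.
z1 = W1 @ x + b1
a1 = sigmoid(z1)
z2 = W2 @ a1 + b2                                # linear output layer
loss = 0.5 * np.sum((z2 - y) ** 2)

# Backward pass: the "surprisingly simple" gradient formulas.
delta2 = z2 - y                                  # dL/dz2
dW2, db2 = delta2 @ a1.T, delta2
delta1 = (W2.T @ delta2) * a1 * (1 - a1)         # dL/dz1; sigmoid'(z1) = a1(1 - a1)
dW1, db1 = delta1 @ x.T, delta1

# Sanity check one entry of dW1 against a finite difference.
eps = 1e-6
W1[0, 0] += eps
loss_eps = 0.5 * np.sum((W2 @ sigmoid(W1 @ x + b1) + b2 - y) ** 2)
print(dW1[0, 0], (loss_eps - loss) / eps)

The two printed numbers should agree to about five decimal places, which is a quick way to confirm the analytic gradients.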

ANOTHER TODO: I could also add some motivation for the multi-variable calculus result required by the derivation, and even for the ordinary chain rule; the write-up would then have no prerequisites beyond good high school math.

Chollet & Watson Notes