General Matrix Multiplication in Assembly Part 1
It has been a while since Pete Warden’s post calling on assembly hackers to work on deep learning. I took the simplest implementation of GEMM in C and tried to dissect it with the infamous Compiler Explorer (by Matt Godbolt).
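The snippet from the original post is not reproduced here, so what follows is only a minimal sketch of what a simplest-possible GEMM in C might look like. The function name `gemm_naive`, the row-major layout, the `float` element type, and the `alpha`/`beta` scaling parameters are all assumptions made for illustration, not details taken from the post.

```c
#include <stddef.h>

/* Naive GEMM sketch (assumed signature): C = alpha * A * B + beta * C.
 * All matrices are assumed to be dense and stored in row-major order:
 *   A is m x k, B is k x n, C is m x n.
 */
void gemm_naive(size_t m, size_t n, size_t k,
                float alpha, const float *a, const float *b,
                float beta, float *c)
{
    for (size_t i = 0; i < m; ++i) {          /* rows of A and C */
        for (size_t j = 0; j < n; ++j) {      /* columns of B and C */
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p) {  /* dot product over the shared dimension */
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

Pasting a function like this into Compiler Explorer with different optimization levels (for example `-O0` versus `-O3`) is one way to see how much of the triple loop the compiler manages to vectorize on its own.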
Here begin my nights :)
Originally published at The Secret Guild of Silicon Valley.