It has been about a year since we started our (virtual) LightOn AI Meetups, and to mark the anniversary 🥳 we had Charles Martin as a guest. Charles is the Chief Scientist at Calculation Consulting, and he presented his work on WeightWatcher: a Diagnostic Tool for Deep Neural Networks, a Python package built around a series of papers.
The 📺 recording of the meetup is on LightOn’s YouTube channel. Subscribe to the channel and join our Meetup group to get notified of the next videos and events!
WeightWatcher is a Python package for analyzing trained models and inspecting models that are difficult to train 🏋️. It can be used to gauge improvements in model performance and predict test accuracies across different models 🔮 (without ever looking at the data!). It can also detect potential problems when compressing or fine-tuning pre-trained models 🗜️.
It is based on ideas from Random Matrix Theory, Statistical Mechanics, and Strongly Correlated Systems. The main idea is to fit a power law to the tail of the empirical spectral density (ESD) of the layer weights. The power-law exponent α is what helps us detect potential problems.
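To make the main idea concrete, here is a minimal NumPy sketch of it, not WeightWatcher's own implementation. It assumes the ESD is the set of eigenvalues of X = WᵀW / N for an N × M weight matrix W, and it uses the standard maximum-likelihood estimator for a continuous power law above a fixed cutoff `xmin`. The function names `layer_esd` and `fit_tail_alpha`, the random placeholder matrix, and the choice of the median as `xmin` are all illustrative assumptions; a careful fit would also select `xmin` from the data.

```python
import numpy as np


def layer_esd(W):
    """Eigenvalues of X = W^T W / N for an N x M weight matrix W (assumed normalization)."""
    n_rows = W.shape[0]
    # The squared singular values of W are the eigenvalues of W^T W.
    singular_values = np.linalg.svd(W, compute_uv=False)
    return np.sort(singular_values ** 2 / n_rows)


def fit_tail_alpha(eigs, xmin):
    """Maximum-likelihood exponent of a continuous power law p(x) ~ x^(-alpha) for x >= xmin."""
    tail = eigs[eigs >= xmin]
    return 1.0 + tail.size / np.sum(np.log(tail / xmin))


# Placeholder weights: a random Gaussian matrix standing in for a trained layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 256))

eigs = layer_esd(W)
xmin = np.quantile(eigs, 0.5)  # assumption: fit only the upper half of the spectrum
alpha = fit_tail_alpha(eigs, xmin)
print(f"fitted alpha = {alpha:.2f}")
```

The exponent does not depend on the overall scale of the eigenvalues, so the 1/N normalization only matters for quantities such as the largest eigenvalue. A random matrix like the placeholder above has no heavy tail, so the number it produces is only there to show the mechanics.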
Poorly trained models tend to have large layer α, as can be seen, for example, by comparing GPT and GPT-2: the same model trained on dirty versus well-curated data.
In particular, a weighted α can predict the test accuracy for models in the same architecture series, across varying depths, other architectural parameters, and regularization parameters 📉.
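As a rough illustration of such a metric, the sketch below assumes the weighted α is the average over layers of each layer's α multiplied by the base-10 logarithm of that layer's largest eigenvalue, which is our reading of the definition in the WeightWatcher papers. The function name `weighted_alpha` and the input values are hypothetical.

```python
import numpy as np


def weighted_alpha(alphas, lambda_maxes):
    """Average over layers of alpha_l * log10(lambda_max_l) (assumed definition)."""
    alphas = np.asarray(alphas, dtype=float)
    lambda_maxes = np.asarray(lambda_maxes, dtype=float)
    return float(np.mean(alphas * np.log10(lambda_maxes)))


# Made-up per-layer values, for illustration only.
layer_alphas = [2.0, 4.0]
layer_lambda_maxes = [10.0, 100.0]
print(weighted_alpha(layer_alphas, layer_lambda_maxes))
```

Under this definition, a layer contributes more when its largest eigenvalue is large, so the metric combines the shape of each layer's spectrum with its scale.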
Finally, there is some early research on extending this idea to decide when to stop training early 🛑, to set per-layer learning rates 🎛️, or to detect over-fitting 🔍. Quite a program! We look forward to even more insightful empirical metrics in Charles’ WeightWatcher in the future. The video of the meetup is here.
About Us
LightOn is a hardware company that develops new optical processors that considerably speed up Machine Learning computation. LightOn’s processors open new horizons in computing and engineering fields that are facing computational limits. Interested in speeding up your computations? Try out our solution on LightOn Cloud! If you want to try out your latest idea using our technology, you can register on LightOn Cloud for a Free Trial or apply to the LightOn Cloud for Research Program! 🌈
Follow us on Twitter at @LightOnIO, subscribe to our newsletter, and/or register for our workshop series. We live stream, so you can join from anywhere. 🌍
The author
Iacopo Poli, Lead Machine Learning Engineer at LightOn AI Research