WeightWatcher: a Diagnostic Tool for Deep Neural Networks

Fitting a power law to the tail of the empirical spectral density (ESD) on a log-log plot needs to be done carefully!
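To make the point concrete, here is a minimal sketch of such a fit. It is not the WeightWatcher implementation. It assumes the ESD is the set of eigenvalues of W^T W / N (with N the larger dimension of the weight matrix W) and it uses the standard continuous maximum-likelihood estimator for the exponent above a cutoff. The names `esd`, `fit_tail_alpha` and `xmin` are hypothetical and chosen for illustration only.

```python
import numpy as np


def esd(weight):
    """Return the sorted eigenvalues of W^T W / N for a 2-D weight matrix.

    Assumption: N is the larger dimension of the matrix.
    """
    weight = np.asarray(weight, dtype=float)
    n_rows = max(weight.shape)
    singular_values = np.linalg.svd(weight, compute_uv=False)
    return np.sort(singular_values ** 2 / n_rows)


def fit_tail_alpha(eigenvalues, xmin):
    """Estimate the exponent alpha of p(x) ~ x^(-alpha) for x >= xmin.

    Uses the continuous maximum-likelihood estimator
    alpha = 1 + n / sum(log(x_i / xmin)) over the n tail points.
    Returns (alpha, n).
    """
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    tail = eigenvalues[eigenvalues >= xmin]
    if tail.size < 2:
        raise ValueError("need at least two eigenvalues above xmin")
    log_sum = np.sum(np.log(tail / xmin))
    if log_sum <= 0:
        raise ValueError("tail is degenerate: all values equal xmin")
    return 1.0 + tail.size / log_sum, int(tail.size)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A random Gaussian matrix stands in for a layer; it is NOT heavy-tailed,
    # so the fitted alpha is only meaningful as a demonstration of sensitivity.
    eigs = esd(rng.normal(size=(512, 256)))
    for q in (0.5, 0.8, 0.95):
        xmin = np.quantile(eigs, q)
        alpha, n_tail = fit_tail_alpha(eigs, xmin)
        print(f"xmin at quantile {q}: alpha={alpha:.2f} from {n_tail} points")
```

Running the demo shows how the estimate moves as the cutoff changes, which is why the choice of where the tail starts deserves care.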
GPT is trained on dirtier data than GPT-2, and it shows in the unusually large α values for some of the layers.
The correlation between test accuracy and weighted alpha is remarkable.
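The definition of weighted alpha is not spelled out here. A minimal sketch, assuming it is the average over layers of each layer's α multiplied by the base-10 logarithm of that layer's largest ESD eigenvalue, could look as follows. The function name `weighted_alpha` and the input format are hypothetical.

```python
import math


def weighted_alpha(layers):
    """Average of alpha * log10(lambda_max) over layers.

    `layers` is an iterable of (alpha, lambda_max) pairs, one per layer.
    Assumption: this averaging convention matches the metric discussed above.
    """
    layers = list(layers)
    if not layers:
        raise ValueError("at least one layer is required")
    return sum(a * math.log10(lmax) for a, lmax in layers) / len(layers)


if __name__ == "__main__":
    # Two made-up layers, purely to show the call shape.
    print(weighted_alpha([(2.0, 10.0), (4.0, 100.0)]))
```

Under this assumed definition, layers with a large top eigenvalue contribute more to the metric than a plain average of α would give them.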

About Us

The author

We are a technology company developing Optical Computing for Machine Learning. Our tech harvests Computation from Nature. We are at lighton.ai
