
Stories @ Hugging Face

Photo by Henry & Co. on Unsplash

🚧 Simple considerations for simple people building fancy neural networks

7 min read · Sep 22, 2020


An article from The Guardian

Pro-tip: in my experience working with pre-trained language models, freezing the embedding modules at their pre-trained values barely affects fine-tuning performance while considerably speeding up training.
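In PyTorch, freezing comes down to turning off gradients for the embedding parameters and handing the optimizer only the parameters that still train. A minimal sketch, using a toy model as a stand-in for a real pre-trained network:

```python
import torch
import torch.nn as nn

# Toy stand-in for a pre-trained language model: an embedding layer
# followed by a small classifier head (sequence length fixed at 8).
model = nn.Sequential(
    nn.Embedding(num_embeddings=1000, embedding_dim=64),
    nn.Flatten(start_dim=1),
    nn.Linear(64 * 8, 2),
)

# Freeze the embedding weights: no gradients are computed for them.
for param in model[0].parameters():
    param.requires_grad = False

# Pass only the still-trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)
```

The same pattern applies to any sub-module you want to keep fixed; the speedup comes both from skipping those gradients and from the optimizer tracking fewer tensors.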

Pro-tip: when you work with language, take a serious look at the outputs of your tokenizers. I can’t count the hours I’ve lost trying to reproduce results (sometimes my own old results) because something went wrong in the tokenization. 🤦‍♂️
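A cheap way to catch many of these issues is a round-trip check: tokenize, detokenize, and compare with the original text. A mismatch often reveals silent normalization, lost whitespace, or unknown tokens. A minimal sketch, using a hypothetical toy tokenizer (real tokenizers expose analogous encode/decode functions):

```python
def sanity_check_tokenizer(tokenize, detokenize, samples):
    """Round-trip each sample and collect the ones that don't survive."""
    mismatches = []
    for text in samples:
        tokens = tokenize(text)
        restored = detokenize(tokens)
        if restored != text:
            mismatches.append((text, tokens, restored))
    return mismatches

# Hypothetical toy tokenizer: lowercasing is a lossy normalization step.
tokenize = lambda s: s.lower().split()
detokenize = lambda toks: " ".join(toks)

issues = sanity_check_tokenizer(
    tokenize, detokenize, ["Hello world", "hello world"]
)
# "Hello world" does not round-trip, exposing the lowercasing.
```

Printing a handful of tokenized examples from your actual dataset before training is a few seconds well spent.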

Some people report success with fancy hyperparameter-tuning methods such as Bayesian optimization, but in my experience, random search over a reasonable, manually defined grid is still a tough-to-beat baseline.
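The baseline fits in a few lines: define the grid by hand, sample configurations uniformly, keep the best. A sketch with a hypothetical search space and a stand-in objective (in practice the objective would train and evaluate a model):

```python
import random

# Hypothetical search space, defined manually from experience.
grid = {
    "learning_rate": [1e-5, 3e-5, 1e-4],
    "batch_size": [16, 32],
    "dropout": [0.0, 0.1, 0.3],
}

def random_search(grid, objective, n_trials=10, seed=0):
    """Sample random configurations from the grid, keep the best score."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {key: rng.choice(values) for key, values in grid.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Stand-in objective; replace with a real train-and-validate run.
def objective(config):
    return -abs(config["learning_rate"] - 3e-5) - config["dropout"]

best, score = random_search(grid, objective, n_trials=20)
```

Fixing the seed makes the search reproducible, and unlike sequential methods, the trials are embarrassingly parallel.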

On average, experts use fewer resources to find better solutions.



Written by Victor Sanh

Dog sitter by day, Scientist at @huggingface 🤗 by night | Into Natural Language Processing, started with Computer Vision
