Artificiality Bites 💊 Issue #14
Hello Human! This is a new issue of my weekly newsletter: a small compilation of interesting articles, projects, tutorials and tools from last week, all related to data, artificial intelligence and adjacent topics. Bon appétit!
📝 Interesting articles this week
- Introducing the First AI Model That Translates 100 Languages Without Relying on English
11'
Facebook open-sourced M2M-100, its multilingual machine translation model, along with its training and evaluation setup.
- Microsoft details T-ULRv2 model that can translate between 94 languages
4'
Microsoft announced that its T-ULRv2 model achieved top results on XTREME, an NLP benchmark created by Google. Read more on the Microsoft blog.
- Rethinking Attention with Performers
11'
Google introduced Performers, a generalized attention framework based on the Transformer architecture, which provides linearly scalable, low-variance and unbiased estimation of attention mechanisms.
💡 Projects
- The case for a learned sorting algorithm
8'
Machine learning applied to a classic computer science problem: sorting. Learned Sort outperforms the next best competitor by a factor of 1.49x on a 1 billion item dataset, including the model training time.
🔧 Tutorials
- Emerging Architectures for Modern Data Infrastructure
12'
This article shares the results from two years of conversations with data leaders, experts and practitioners on their current data stacks, in an attempt to collect emerging best practices.
- Train and use a NLP model in 10 min 📹
10'
Choose your favorite Twitter account and train a language model to write new fake tweets using Hugging Face.
- Fine-tuning a model on a text classification task 📝
Colab notebook showing how to fine-tune one of Hugging Face's Transformers models on a text classification task.
📦 Repositories
- google-research/multilingual-t5
Multilingual T5 (mT5) is a pretrained text-to-text transformer model covering 101 languages. It achieved state-of-the-art performance on many cross-lingual NLP tasks.
- airctic/icevision
An agnostic object detection framework that connects to different libraries such as fastai, PyTorch Lightning, and PyTorch.
- adamerose/pandasgui
An interactive GUI for managing and visualizing Pandas DataFrames
- emeryberger/scalene
A high-performance and high-precision CPU / memory profiler for Python
🎓 Courses / Events
- Stanford MLSys Seminar Series
Stanford researchers started a series of talks on how machine learning changes the modern programming stack. They're livestreaming each talk on YouTube, and all videos are available afterwards.
- Spatial Data Science Conference 2020
The SDSC2020 conference (October 19-23) held a series of presentations and workshops covering cutting-edge techniques in spatial modeling, machine learning, spatial statistics, geo-processing at scale, and novel uses of spatial data sets. Videos are now available.
🚀 Extra bits
👉 Newsletter in Spanish
👋 See you next week!