Artificiality Bites 💊 Issue #44
Hello Human! This is a new issue of my weekly newsletter: a small compilation of interesting articles from last week, plus projects, tutorials and tools, all related to Data, Artificial Intelligence and adjacent topics. Enjoy your meal!
📝 Interesting publications this week
- EleutherAI claims new NLP model approaches GPT-3-level performance
5'
EleutherAI has released GPT-J-6B (aka GPT-J), a model the group claims performs nearly on par with an equivalent-sized GPT-3 model on various tasks.
- Improving Language Model Behavior by Training on a Curated Dataset
6'
OpenAI researchers have found they can improve language model behavior with respect to specific behavioral values (reducing bias and toxicity) by fine-tuning on a curated dataset of <100 examples of those values.
- DeepMind scientists: Reinforcement learning is enough for general AI
13'
These researchers suggest that reward maximization and trial-and-error experience are enough to develop behavior that exhibits the kind of abilities associated with intelligence.
- How long before AI can ‘understand’ animals?
8'
AI might let us reliably translate animal communication within the next decade or so.
- What the Heck is a Data Mesh?!
10'
What a Data Mesh is and what it is not.
- Tabular Data: Deep Learning is Not All You Need
11p
This paper explores whether deep learning models should be a recommended option for tabular data, by rigorously comparing the new deep models to XGBoost on a variety of datasets.
🔧 Tutorials
- Python Data Viz Libraries Compared
14'
A notebook with 8 popular graphs made with pandas, matplotlib, seaborn, and plotly.express.
📦 Repositories
- amundsen-io/amundsen
Amundsen is a data discovery and metadata engine.
- IBM/UQ360
The Uncertainty Quantification 360 toolkit provides a diverse set of algorithms to quantify uncertainty.
- jina-ai/jina
Jina allows you to build deep learning-powered search-as-a-service in just minutes.
- wellecks/naturalproofs
NaturalProofs is a multi-domain corpus of mathematical statements and their proofs, written in natural mathematical language.
- facebookresearch/flores
Flores-101 is a many-to-many multilingual translation benchmark dataset for 101 languages, consisting of 3000 sentences extracted from English Wikipedia and carefully translated into the other languages.
- graph4ai/graph4nlp
Graph4NLP is a library for R&D at the intersection of Deep Learning on Graphs and Natural Language Processing.
- PrithivirajDamodaran/Gramformer
Gramformer is a framework for detecting, highlighting and correcting grammatical errors in natural language text.
🎓 Courses / Events / Books
- Data Science at the Command Line 📕
A public draft of the second edition of this book by Jeroen Janssens.
- OpenCV Python Course
3h
📹
Learn how to use OpenCV for Computer Vision and AI in this basic course for beginners.
- Hugging Face course 🤗
A completely free course made by Hugging Face engineers that will teach you about natural language processing using libraries from the Hugging Face ecosystem.
👋 See you next week!