Machine Learns — Newsletter #11

Eren Gölge · Machine Learns · Dec 20, 2023

AI: Latest News, Research, and Open-Source

👋 Hey all!

Welcome to the 11th edition of Machine Learns. This week’s issue is packed with good stuff again.

A notable trend is major tech companies' push to train Large Language Models (LLMs) beyond the scope of human-generated data. While public discourse ponders whether AI might replace human roles, these companies are actively exploring that frontier. The limits of training LLMs on human data are becoming apparent, particularly as AI-generated content increasingly populates the internet. This raises crucial questions about how to progress beyond human data while keeping AI aligned with human values.

The highlighted papers indicate that self-training LLMs could be not only feasible but also potentially more effective than reliance on human data. This marks a significant and intriguing shift in AI development.

Bookmarks

👩‍💻 Database Fundamentals 🔗Link. It’s been a nice refresher on the basics of databases.

📰 Gartner Predicts Mass Consumer Social Media Exodus by 2025 🔗Link. Users are losing trust in social media platforms and limiting their usage.

📰 TikTok requires users to “forever waive” rights to sue over past harms 🔗Link

👩‍💼 The Top 10 Things to Know Before Starting a SaaS Company 🔗Link

👩‍💼 10+ Signs of a Mediocre Hire 🔗Link

👩‍💼 Be impatient 🔗Link. On the importance of being fast and agile as a startup and a co-founder with quotes from successful founders.

🤖 All the big players are developing AI to run on laptops and phones 🔗Link. Running AI on-device will be big next year.

👩‍🔬 Lean back or lean in? Exploring social loafing in human–robot teams 🔗Link. Research on whether working alongside robots influences human motivation and performance.

📰 Import ban for Apple Watches 🔗Link.

👩‍🔬 Using CRISPR for treating Alzheimer’s 🔗Link.

📰 Everything you know about the podcast industry is a lie 🔗Link. This article argues that while Spotify’s struggles have created negative perceptions of podcasting, the industry remains viable for most creators outside of Spotify’s failures.

👩‍💻 Understanding PyTorch GPU Memory Management 🔗Link. Using PyTorch profiler tools to debug GPU RAM utilization.

👩‍💼 Paying Netflix $0.53/h, etc. 🔗Link. People pay $0.50-$2.00 for an hour of digital entertainment.

👩‍🔬 Scientists Contact Whales in World-First Communication Experiment 🔗Link. Scientists held a groundbreaking exchange with a humpback whale in her own call patterns, and the work could one day help humans communicate with extraterrestrial intelligence.

🤖 Deep dive: 4 NeurIPS 2023 best paper award papers — emergent ability, scaling, DPO, trustworthiness 🔗Link.

Papers

For more papers, you can check my list. Anything I read or plan to read is there.

LLM in a flash: Efficient Large Language Model Inference with Limited Memory — Apple

📎Paper

It proposes storing model parameters in flash memory and transferring them on-demand to DRAM. The approach optimizes for two key aspects: reducing the volume of data transferred and increasing data chunk size for efficient reading. Innovative techniques like “windowing” and “row-column bundling” are introduced. These methods collectively enable running models up to twice the size of the available DRAM, significantly increasing inference speed compared to traditional loading methods. The study showcases a novel convergence of hardware-aware strategies with machine learning, demonstrating a breakthrough for deploying advanced LLMs in resource-limited environments.
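The windowing idea can be sketched as a small cache simulation: keep only recently used parameter rows in fast memory and fetch the rest on demand from slow storage. This is a minimal illustration, not the paper's implementation; all names and the LRU policy here are stand-ins.

```python
# "Windowing" in miniature: a dict stands in for DRAM, another for
# flash. Only a window of recently used parameter rows stays resident;
# everything else is transferred on demand.
from collections import OrderedDict

class WindowedWeights:
    def __init__(self, flash_rows, window_size):
        self.flash = flash_rows          # full parameter rows, "in flash"
        self.window = window_size        # max rows resident "in DRAM"
        self.dram = OrderedDict()        # LRU cache of hot rows
        self.transfers = 0               # rows copied flash -> DRAM

    def get_row(self, idx):
        if idx in self.dram:
            self.dram.move_to_end(idx)   # mark as recently used
            return self.dram[idx]
        self.transfers += 1              # simulate a slow flash read
        row = self.flash[idx]
        self.dram[idx] = row
        if len(self.dram) > self.window:
            self.dram.popitem(last=False)  # evict least recently used
        return row

weights = WindowedWeights({i: [float(i)] * 4 for i in range(100)}, window_size=8)
# Repeated access to a small active set stays inside the window:
for _ in range(3):
    for i in range(8):
        weights.get_row(i)
print(weights.transfers)  # 8: only the first pass touches flash
```

The paper's insight is that sparse activations make the "hot" set small and slowly changing, so most reads hit DRAM even when the full model does not fit.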

ReST — Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models — DeepMind

📎Paper

The authors propose a method called Reinforced Self-Training (ReST), which involves self-training with scalar feedback. The process includes generating samples, filtering them using binary feedback, and fine-tuning the model on these samples. This approach, tested on advanced mathematical reasoning and coding benchmarks, shows that ReST scales well with model size and significantly surpasses fine-tuning solely on human data. The findings suggest that self-training with feedback can greatly reduce reliance on human-generated data.
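The loop above can be sketched in a few lines. This toy treats the "model" as a scalar mean and "fine-tuning" as nudging it toward the kept samples; it is purely illustrative of ReST's Grow/Improve structure, not the paper's actual training procedure.

```python
# Toy sketch of ReST's two phases: Grow (sample from the model) and
# Improve (filter with binary feedback, then fine-tune on survivors).
import random

def grow(mean, n=200):
    """Sample candidate outputs from the current 'model'."""
    return [random.gauss(mean, 1.0) for _ in range(n)]

def improve(mean, samples, reward_fn, lr=0.5):
    """Keep samples that pass the binary filter, fine-tune toward them."""
    kept = [s for s in samples if reward_fn(s)]
    if not kept:
        return mean
    target = sum(kept) / len(kept)
    return mean + lr * (target - mean)   # stand-in for gradient updates

random.seed(0)
reward = lambda s: s > 1.0               # stand-in for "answer is correct"
mean = 0.0
for _ in range(5):                       # a few ReST iterations
    mean = improve(mean, grow(mean), reward)
# Self-training moves the model toward the rewarded region:
assert mean > 0.5
```

The point the sketch preserves: no human-written targets appear anywhere; the model trains only on its own filtered generations.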

ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent — Google

📎Paper

This is a follow-up to the previous paper. It introduces a ReST-like self-improvement approach, where the agent is iteratively fine-tuned on its reasoning traces, using AI-generated feedback instead of human-labeled data. This process involves generating samples, filtering them, and fine-tuning the model on these refined samples. This enables the model to improve its performance and robustness in answering complex, multi-step questions. The study demonstrates that this method allows for efficient self-improvement of the agent, even enabling smaller models to achieve comparable performance to their larger counterparts. This approach represents a significant step in reducing reliance on human-generated data for training LMs, particularly in complex reasoning tasks.
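The distinguishing step here is the filter: a model-based judge scores the agent's own reasoning traces instead of comparing against human labels. The sketch below illustrates that pattern with toy traces and a toy judge; none of it is from the paper's implementation.

```python
# Sketch of AI-feedback filtering: only traces the judge rates highly
# become fine-tuning data for the next iteration of the agent.

def collect_training_traces(traces, judge, min_score=0.8):
    """Keep reasoning traces the AI judge scores above a threshold."""
    return [t for t in traces if judge(t) >= min_score]

# Toy traces: (question id, list of reasoning steps, answer).
traces = [
    ("q1", ["search", "read", "answer"], "42"),
    ("q2", ["answer"], "unknown"),
    ("q3", ["search", "verify", "answer"], "7"),
]

# Toy judge: rewards traces that search and verify before answering.
# A real judge would itself be an LLM scoring the full trace.
def judge(trace):
    _, steps, _ = trace
    score = 0.5
    if "search" in steps:
        score += 0.2
    if "verify" in steps:
        score += 0.3
    return score

good = collect_training_traces(traces, judge)
assert [t[0] for t in good] == ["q3"]
```

Because the filtered traces are then used for fine-tuning, each iteration distills the agent's best behavior back into itself, which is how smaller models close the gap to larger ones.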

Are Emergent Abilities of Large Language Models a Mirage? — Stanford

📎Paper

This work proposes that what appears as emergent abilities in LLMs may be artifacts of the chosen metrics rather than fundamental changes in the models’ capabilities. The authors argue that metrics that nonlinearly or discontinuously scale a model’s error rate can create the illusion of emergent abilities. They demonstrate this by applying different metrics to the same model outputs, showing that changes in metric choice can either induce or eliminate the appearance of emergent abilities. This perspective suggests that emergent abilities in LLMs might not be inherent qualities of model scaling, but rather a byproduct of the metrics used in their evaluation.
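The paper's core argument can be reproduced with synthetic numbers: suppose per-token accuracy improves smoothly with scale. A linear metric then also improves smoothly, but a discontinuous metric like exact match over an L-token answer scores p**L, which sits near zero and then shoots up, looking "emergent". The accuracies below are made up for illustration.

```python
# Same underlying capability, two metrics: smooth per-token accuracy
# vs. exact match over a 10-token answer (all tokens must be right).
L = 10  # answer length in tokens

def exact_match(p, length=L):
    return p ** length

per_token = [0.5, 0.7, 0.9, 0.95, 0.99]        # smooth improvement with scale
em = [exact_match(p) for p in per_token]       # nonlinear metric

print([round(x, 4) for x in em])
# [0.001, 0.0282, 0.3487, 0.5987, 0.9044] -- flat, then a sharp jump
```

Nothing about the model changed between the two views; only the metric's nonlinearity manufactures the apparent phase transition.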

Open-Source

DeepEval

👩‍💻Code

DeepEval is a simple-to-use, open-source evaluation framework for LLM applications. It is similar to Pytest but specialized for unit testing LLM applications. DeepEval evaluates performance based on metrics such as hallucination, answer relevancy, RAGAS, etc., using LLMs and various other NLP models locally on your machine.
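The pytest-like pattern looks roughly as follows. This is a self-contained mock, not DeepEval's actual API; the real library provides test-case and metric classes backed by LLMs and NLP models, so check its docs for the true class names and signatures.

```python
# Mock of the unit-testing pattern: each LLM interaction becomes a
# test case, scored by a metric with a pass/fail threshold.
from dataclasses import dataclass

@dataclass
class FakeTestCase:
    input: str
    actual_output: str
    expected_output: str

def overlap_metric(case, threshold=0.5):
    """Toy 'relevancy' metric: fraction of expected words present in
    the actual output. Real metrics use LLMs, not word overlap."""
    expected = set(case.expected_output.lower().split())
    actual = set(case.actual_output.lower().split())
    score = len(expected & actual) / max(len(expected), 1)
    return score >= threshold, score

case = FakeTestCase(
    input="What is the capital of France?",
    actual_output="The capital of France is Paris.",
    expected_output="Paris is the capital of France",
)
passed, score = overlap_metric(case)
assert passed  # in DeepEval this would be an assert over metric results
```

Because the assertion style mirrors Pytest, these checks slot directly into an existing CI pipeline.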

Coffee — UI development with AI

👩‍💻Code

Build and iterate on your UI 10x faster with AI — right from your own IDE! Coffee caffeinates your frontend development workflow with AI. This project is intended to be more than just a nice demo, but rather be an ergonomic tool that can write and interact with production-quality code.

superduperdb — AI in your database

👩‍💻Code

SuperDuperDB is an open-source framework for integrating AI directly with your existing databases, including streaming inference, scalable model training, and vector search.

SuperDuperDB is not a database. It transforms your favorite database into an AI development and deployment environment; think db = superduper(db).

SuperDuperDB eliminates complex MLOps pipelines, specialized vector databases — and the need to migrate and duplicate data by integrating AI at the data’s source, directly on top of your existing data infrastructure. This massively simplifies building and managing AI applications.

EAGLE: Lossless Acceleration of LLM Decoding by Feature Extrapolation

👩‍💻Code

EAGLE (Extrapolation Algorithm for Greater Language-model Efficiency) is a new baseline for fast decoding of Large Language Models (LLMs) with provable performance maintenance. This approach involves extrapolating the second-top-layer contextual feature vectors of LLMs, enabling a significant boost in generation efficiency.
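EAGLE builds on the draft-and-verify loop behind speculative decoding, which can be sketched with toy models. EAGLE's actual draft comes from extrapolated second-top-layer features; here a cheap stand-in drafter proposes tokens, and verification against the target model keeps the output identical to pure target decoding, which is what "lossless" means.

```python
# Draft-and-verify loop: the drafter proposes k tokens cheaply; the
# target model keeps the longest matching prefix and supplies one
# correction token on a mismatch. Output equals pure target decoding.

def speculative_decode(target_next, draft_next, prompt, steps, k=4):
    """target_next/draft_next map a token sequence to the next token."""
    out = list(prompt)
    limit = len(prompt) + steps
    while len(out) < limit:
        draft = []
        for _ in range(k):                       # cheap draft pass
            draft.append(draft_next(out + draft))
        for tok in draft:                        # verify with target
            if target_next(out) == tok and len(out) < limit:
                out.append(tok)                  # accepted draft token
            else:
                break
        else:
            continue                             # whole draft accepted
        if len(out) < limit:
            out.append(target_next(out))         # target's correction
    return out

# Toy models: target emits (last + 1) mod 5; the drafter agrees except
# when the last token is 3, so some drafts are rejected and corrected.
target = lambda seq: (seq[-1] + 1) % 5
drafter = lambda seq: 0 if seq[-1] == 3 else (seq[-1] + 1) % 5
result = speculative_decode(target, drafter, [0], steps=8)
assert result == [i % 5 for i in range(9)]       # identical to target-only
```

The speedup comes from drafting being much cheaper than a target forward pass while most draft tokens are accepted; EAGLE's contribution is a drafter accurate enough, via feature extrapolation, to push the acceptance rate up.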
