Machine Learns — Newsletter #8
News, papers and open-source about AI
Hello everyone! It’s been a while. I’ve finally managed to carve out time to compile all the pins I’ve collected recently. Just a heads-up: I used my little AI buddy 🤖ME_AI to turn my notes into this full-length issue, so you might spot some mistakes or awkward phrasing here and there. Sorry about that. Let’s begin…
Bookmarks
This new data poisoning tool lets artists fight back against generative AI 🔗Link
X announced its own GPT-style language model, “Grok” 🔗Link
Experts: 90% of Online Content Will Be AI-Generated by 2026 🔗Link
Google launches generative AI for product imagery to US advertisers and merchants 🔗Link
UK to invest $273 million in Turing AI supercomputer 🔗Link
What Is Apple Doing in AI? Revamping Siri, Search, Apple Music, and other Apps 🔗Link
GRWI 2023: top global remote work destinations. Top 3: Denmark, Netherlands, Germany. 🔗Link
TikTok testing 15-minute videos with select users 🔗Link
Surgery-free brain stimulation via “temporal interference” delivered through electrodes on the scalp, a potential new treatment for dementia. 🔗Link
Our brains store memories in two forms to **model our surrounding world** and strategize for the future (“Role of hippocampus in two functions of memory revealed”). 🔗Link
Stability AI’s Stable 3D to generate 3D models 🔗Link
Artists lose copyright case against AI art generators 🔗Link
Papers
Tell Your Model Where to Attend
PASTA (Post-hoc Attention STeering Approach) lets Large Language Models (LLMs) read text with user-specified emphasis marks, similar to how text styling guides a reader’s attention in human-written documents. By reweighting a selected set of attention heads, PASTA steers the model’s attention toward the user-specified details without changing any model parameters. The method demonstrates significant performance gains across a variety of tasks, including a 22% average accuracy improvement for LLaMA-7B.
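To make the mechanism concrete, here is a minimal, hypothetical sketch of post-hoc attention steering on a single head: attention to all non-emphasized tokens is scaled down and each row is renormalized. The function name and the coefficient `alpha` are assumptions for illustration; PASTA’s exact reweighting rule and head-selection procedure are described in the paper.

```python
import torch

def steer_attention(attn_weights, emphasized_idx, alpha=0.01):
    """Rescale one head's post-softmax attention so emphasized tokens get more mass.

    attn_weights: (seq_len, seq_len) attention matrix for a single head.
    emphasized_idx: indices of the tokens the user wants the model to focus on.
    alpha: downweighting coefficient for all other tokens (illustrative value).
    """
    steered = attn_weights.clone()
    mask = torch.ones(attn_weights.shape[-1], dtype=torch.bool)
    mask[emphasized_idx] = False
    steered[:, mask] *= alpha                     # shrink attention to everything else
    steered /= steered.sum(-1, keepdim=True)      # renormalize each row back to 1
    return steered
```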
MoLoRA: Extremely Parameter Efficient MoE for Instruction Tuning
This paper introduces a parameter-efficient Mixture of Experts (MoE) that combines the MoE architecture with lightweight experts, opening new ways to improve performance while keeping compute manageable. The architecture surpasses standard parameter-efficient fine-tuning methods and matches full fine-tuning while only updating the lightweight experts, which amount to less than 1% of an 11B-parameter model. The approach requires no prior task knowledge and can be applied to unseen tasks, highlighting the broader potential of the MoE architecture.
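As a rough illustration of the idea, here is a hypothetical sketch of a frozen linear layer augmented with a soft mixture of LoRA-style experts. All class and parameter names are assumptions, and the routing is simplified relative to the paper; only the experts and router are trainable.

```python
import torch
import torch.nn as nn

class MixtureOfLoRAExperts(nn.Module):
    """Sketch: frozen base linear layer + softly routed low-rank experts."""
    def __init__(self, base: nn.Linear, n_experts: int = 4, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False               # base weights stay frozen
        d_in, d_out = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(n_experts, d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_experts, rank, d_out))
        self.router = nn.Linear(d_in, n_experts)  # soft gating over experts

    def forward(self, x):                          # x: (batch, seq, d_in)
        gates = torch.softmax(self.router(x), dim=-1)                 # (b, s, E)
        delta = torch.einsum("bsd,edr,erk->bsek", x, self.A, self.B)  # per-expert update
        update = (gates.unsqueeze(-1) * delta).sum(dim=2)             # mix experts
        return self.base(x) + update
```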
Video Crafter: a text-to-video model with unprecedented quality
Video Crafter introduces two open-source diffusion models for high-quality video generation: text-to-video (T2V) and image-to-video (I2V). The T2V model synthesizes realistic, cinematic-quality videos from text at up to 1024x576 resolution and surpasses existing open-source counterparts in quality. The I2V model, the first open-source model of its kind, turns a given image into a video clip while strictly preserving the image’s content, structure, and style. Video Crafter is trained on 600 million captioned images and 10 million videos. With enough data, you can do anything with AI these days.
AlpaGasus: Training A Better Alpaca with Fewer Data
The paper proposes a data selection strategy that uses ChatGPT to enhance instruction fine-tuning (IFT) of large language models by filtering out low-quality examples from prevalent IFT datasets. A model fine-tuned on a smaller, higher-quality subset of the Alpaca dataset trains faster and outperforms the original Alpaca model in multiple test scenarios and in human evaluation, suggesting a data-centric paradigm that can be applied to other instruction-tuning setups.
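A hedged sketch of this kind of LLM-based data filtering might look like the following; the `grade` callback, the prompt wording, and the threshold are all assumptions for illustration, not AlpaGasus’s exact grading setup.

```python
def filter_ift_dataset(examples, grade, threshold=4.5):
    """Keep only instruction-tuning examples that a grader LLM rates highly.

    examples: list of dicts with "instruction" and "output" fields.
    grade: assumed helper that sends a prompt to an LLM and returns a 1-5 score.
    """
    kept = []
    for ex in examples:
        prompt = (
            "Rate the quality of this instruction/response pair from 1 to 5.\n"
            f"Instruction: {ex['instruction']}\n"
            f"Response: {ex['output']}\n"
            "Score:"
        )
        if grade(prompt) >= threshold:   # drop low-quality training pairs
            kept.append(ex)
    return kept
```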
Step-Back Prompting by DeepMind
The work presents a two-step prompting strategy built around abstraction-grounded reasoning. Instead of answering a query directly, the LLM is first prompted to “step back” and ask a broader, conceptual question that surfaces relevant higher-level concepts or principles. It then uses those principles to reason its way to a grounded answer to the original query.
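For intuition, here is a minimal sketch of the two-step flow. `ask_llm` is an assumed helper that sends a prompt to whatever model you use and returns its text completion; the prompt wording is illustrative rather than the paper’s exact templates.

```python
def step_back_answer(question, ask_llm):
    """Two-step prompting: abstract first, then answer grounded in the abstraction."""
    # Step 1: derive a broader, conceptual "step-back" question and gather principles.
    step_back_q = ask_llm(
        "What broader concept or principle lies behind this question?\n"
        f"Question: {question}\nStep-back question:"
    )
    principles = ask_llm(step_back_q)

    # Step 2: answer the original question using the retrieved principles.
    return ask_llm(
        f"Background principles:\n{principles}\n\n"
        f"Using the principles above, answer: {question}"
    )
```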
Mixture of Tokens: Efficient LLMs through Cross-Example Aggregation
Mixture of Experts (MoE) models, despite their potential to scale the parameter counts of Transformer models, suffer from significant drawbacks such as training instability and uneven expert utilization, both stemming from the discrete matching of tokens to experts. To sidestep these problems, the authors introduce Mixture of Tokens, a fully differentiable design that keeps the benefits of MoE architectures while avoiding their difficulties. Instead of routing tokens, it mixes tokens from different examples before sending them to the experts, letting the model learn from every token-expert combination, and it works with both masked and causal LLM training and inference.
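Here is a simplified, hypothetical sketch of the mixing step for one group of tokens drawn from different examples: a controller produces per-expert mixing weights, each expert processes its own soft mixture of the group, and outputs are redistributed with the same weights. Names, dimensions, and grouping are assumptions; the paper’s full scheme has more detail.

```python
import torch
import torch.nn as nn

class MixtureOfTokens(nn.Module):
    """Sketch: fully differentiable token mixing instead of discrete routing."""
    def __init__(self, d_model=512, n_experts=4, d_ff=1024):
        super().__init__()
        self.controller = nn.Linear(d_model, n_experts)   # per-token mixing weights
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                  # x: (group_size, d_model)
        w = torch.softmax(self.controller(x), dim=0)       # (group, experts), normalized over the group
        mixed = torch.einsum("ge,gd->ed", w, x)            # one mixed token per expert
        processed = torch.stack([f(mixed[e]) for e, f in enumerate(self.experts)])
        return torch.einsum("ge,ed->gd", w, processed)     # redistribute outputs to the original tokens
```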
Open-Source
XTTSv2 by Coqui — Even better Text-to-Speech
👩💻 Github
🤗 HF Demo
📎 Docs
📻 Audio Sample
We’ve unveiled XTTS v2, our most advanced text-to-speech model to date. It adds two new languages, Hungarian and Korean, bringing the total to 16, and the model architecture has been updated for better voice cloning. XTTS v2 delivers improvements across the board, including better prosody and audio quality.
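A quick usage sketch, assuming the Coqui TTS Python package (`pip install TTS`); the model id and file paths are illustrative and may differ between releases.

```python
from TTS.api import TTS

# Load the multilingual XTTS v2 checkpoint (downloads on first use).
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# Clone a voice from a short reference clip and synthesize in one of the 16 languages.
tts.tts_to_file(
    text="Hello! This sentence was spoken in a cloned voice.",
    speaker_wav="reference_voice.wav",   # a few seconds of the target speaker
    language="en",
    file_path="output.wav",
)
```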
DeepSpeed MII
👩💻Github
DeepSpeed-FastGen is a system built to improve large language model (LLM) serving throughput using the Dynamic SplitFuse technique, delivering up to 2.3x higher effective throughput than leading systems such as vLLM. By combining DeepSpeed-MII and DeepSpeed-Inference, it provides an easy-to-use serving stack that avoids the long-prompt processing stalls that can break service-level agreements.
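A minimal, hedged usage sketch of the non-persistent pipeline shown in the DeepSpeed-MII examples; the model name is illustrative and a CUDA-capable GPU is assumed.

```python
import mii

# Spin up a non-persistent text-generation pipeline (model name is illustrative).
pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")

# Batched generation; FastGen schedules the work via Dynamic SplitFuse under the hood.
responses = pipe(["DeepSpeed is", "Sparse attention lets models"], max_new_tokens=64)
print(responses)
```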
DeepSparse
👩💻Github
DeepSparse is a CPU inference runtime that takes advantage of sparsity to accelerate neural network inference. Coupled with SparseML, our optimization library for pruning and quantizing your models, DeepSparse delivers exceptional inference performance on CPU hardware.
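A short usage sketch with DeepSparse’s Pipeline API; the task name is real, but the model path below is a placeholder you would swap for a SparseZoo stub or a local sparsified ONNX model.

```python
from deepsparse import Pipeline

# Build a CPU inference pipeline for text classification.
# Replace the placeholder with a real SparseZoo stub or a local ONNX model path.
sentiment = Pipeline.create(
    task="sentiment-analysis",
    model_path="zoo:YOUR_SPARSEZOO_STUB_OR_LOCAL_ONNX_PATH",
)

print(sentiment(sequences=["DeepSparse makes CPU inference surprisingly fast"]))
```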
Giskard
👩💻Github
Giskard is a Python library that automatically detects vulnerabilities in AI models, from tabular models to LLMs, including performance biases, data leakage, spurious correlations, hallucination, toxicity, security issues, and more.
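A hedged sketch of the scan workflow from Giskard’s documentation; constructor arguments can vary between versions, and the toy prediction function below is purely illustrative.

```python
import pandas as pd
import giskard

# Toy data and a stand-in prediction function returning class probabilities.
df = pd.DataFrame({"text": ["great product", "terrible support"], "label": [1, 0]})

def predict(batch: pd.DataFrame):
    return [[0.1, 0.9] if "great" in t else [0.8, 0.2] for t in batch["text"]]

model = giskard.Model(
    model=predict,
    model_type="classification",
    classification_labels=[0, 1],
    feature_names=["text"],
)
dataset = giskard.Dataset(df, target="label")

report = giskard.scan(model, dataset)   # run the automated vulnerability scan
report.to_html("giskard_scan.html")     # export the findings
```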
HelixNet
👩💻Github
HelixNet is a Deep Learning architecture featuring 3 x Mistral-7B LLMs, consisting of an actor, a critic, and a regenerator. The actor creates an initial response, and the critic then evaluates this response, providing intelligent feedback, which is used by the regenerator to improve and adapt the response, ensuring it better addresses the initial question.
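A minimal sketch of that actor, critic, regenerator loop; `generate` is an assumed helper that runs one of the three Mistral-7B checkpoints on a prompt, and the prompt wording is illustrative.

```python
def helixnet_respond(question, generate):
    """Three-stage response: draft, critique, then regenerate an improved answer."""
    draft = generate("actor", question)                      # initial response
    feedback = generate(
        "critic",
        f"Question: {question}\nAnswer: {draft}\n"
        "Critique this answer and point out what should be improved."
    )
    return generate(
        "regenerator",
        f"Question: {question}\nDraft answer: {draft}\n"
        f"Critique: {feedback}\nWrite an improved final answer."
    )
```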
TigerLab
👩💻[Github](https://github.com/tigerlab-ai/tiger)
There’s a growing gap between Large Language Models (LLMs) and the data stores that feed them contextual information. To help bridge it, the open-source Tiger toolkit has been introduced so developers can build AI models and applications tailored to their needs. It provides tools for RAG, fine-tuning, data, and quality assurance (TigerRag, TigerTune, TigerDA, TigerArmor), letting organizations calibrate AI systems to their own intellectual property and safety requirements.