Reading Digest, August #23

Daniel Chen
Journey Into AI with Aili
9 min read · Sep 2, 2024


Hey there, my amazing readers! I hope you’re ready for another exciting edition of my daily reading digest. If you’re new here, get ready for a wild ride through the fascinating world of online content. And if you’re a regular, thank you for your continued support — it means the world to me!

Today’s digest is a true buffet of captivating topics, ranging from the first MLPerf benchmarks for Nvidia Blackwell, AMD, Google, and Untether AI to the rise and fall of OpenSea. We’ll explore how to use AI for technical SEO, straight from HubSpot’s tech SEO team, and dive into value innovation and how to win in crowded markets by ignoring competitors.

But that’s not all — we’ve got some intriguing pieces on the latest developments in AI and tech. From Cartesia to unifying AI and databases with TAG, and generative verifiers using reward modeling as next-token prediction, this digest has something for everyone. We’ll even explore how Amazon is using my grocery purchases to sell me prescription drugs and welcome you to the era of the gritty startup.

For the tech enthusiasts among us, we’ve got articles on how all your work could be gone and how using the one-word AI prompt “RUMINATE” can induce deeper reasoning and more accurate output from ChatGPT. We’ll also take a closer look at whether models can learn from each other in a machine learning Möbius and whether Japan will allow Circle-K to acquire 7-Eleven.

But that’s just the tip of the iceberg, my friends. From generative AI’s fatal flaw to a thorough analysis of OpenAI’s new leaked strategy, and the importance of having hard conversations as good parents, this digest covers a wide range of topics that are sure to pique your interest. We’ll even explore how major sites are saying no to Apple’s AI scraping and dive into UnifiedMLLM, which enables unified representation for multi-modal multi-tasks with large language models.

So, grab your favorite beverage, get comfortable, and join me on this thrilling journey through the world of online content. I can’t wait to hear your thoughts and reactions in the comments below!

Happy reading, my incredible friends!

UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

The paper proposes UnifiedMLLM, a comprehensive multi-modal large language model (MLLM) that can handle various multi-modal tasks using a unified representation. The key aspects are:

  • Unified Representation: The model generates task tokens and grounding tokens to represent different tasks and regions, enabling seamless integration of multiple tasks.
  • Task Router and Expert Integration: The task tokens and grounding tokens are used to activate corresponding expert models to execute the specified tasks.
  • Dataset Construction: The authors construct task-specific datasets and a 100k multi-task dataset with complex scenarios to train the model.
  • Three-stage Training Strategy: The model is trained in three stages — modality-perception pretraining, task adaptation tuning, and multi-task LoRAMoE tuning to improve its reasoning and task processing capabilities.

The experiments demonstrate the model’s impressive performance across a wide range of multi-modal tasks, including referring segmentation, reasoning editing, layout-based image generation, and multi-modality generation.
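To make the routing idea a bit more concrete, here's a toy sketch of how task tokens and grounding tokens could drive expert dispatch. To be clear, the token names, the `<region>` format, and the expert interfaces are my own invention for illustration, not the paper's actual design:

```python
# Toy sketch of the task-router idea: the MLLM emits special task and
# grounding tokens, and a router dispatches to the matching expert model.
# All token names and expert signatures here are hypothetical.
import re

EXPERTS = {
    "<seg>": lambda prompt, regions: f"segmentation({prompt}, {regions})",
    "<edit>": lambda prompt, regions: f"editing({prompt}, {regions})",
    "<gen>": lambda prompt, regions: f"generation({prompt}, {regions})",
}

def route(mllm_output: str):
    """Find the task token and grounding tokens, then call the expert."""
    task = next((t for t in EXPERTS if t in mllm_output), None)
    if task is None:
        return mllm_output  # plain text answer, no expert needed
    regions = re.findall(r"<region>(.*?)</region>", mllm_output)
    prompt = re.sub(r"<[^>]+>", "", mllm_output).strip()
    return EXPERTS[task](prompt, regions)

print(route("<seg> the dog on the left <region>0.1,0.2,0.4,0.9</region>"))
```

The point of the unified representation is that adding a new task is just a new token-to-expert entry, rather than a new model interface.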

Major Sites Are Saying No to Apple’s AI Scraping

The article discusses Apple’s new tool that allows publishers to opt out of having their data used to train Apple’s AI models. It mentions that major news outlets and social platforms have already taken advantage of this option.

Being Good Parents Means Having Hard Conversations

The article discusses the challenges of raising children in a complex world, and the need to prepare them for the realities they will face, while also preserving their childhood. It explores topics such as addressing racism, sexism, consumerism, and climate change with children, and provides guidance on how to have these difficult conversations in an age-appropriate manner.

Thorough Analysis of OpenAI’s New Leaked Strategy

The article provides insights into OpenAI’s strategy, which includes the development of two new model families — Strawberry and Orion. The Strawberry model is expected to have improved reasoning capabilities, while Orion is described as a more powerful model that may require the use of synthetic data generated by the Strawberry model. The article also discusses the implications of these models, including potential regulatory concerns and the challenges of achieving the desired level of AI intelligence.

This Is Generative AI’s Fatal Flaw

The article discusses the limitations and challenges of using Generative AI for sales outreach and content creation. It draws on the author’s past experiences with Automated Insights, a company that used AI to automate sports content, to illustrate how Generative AI is often used to replace low-quality human-generated content rather than to create high-quality, valuable content.

Will Japan Allow Circle-K to Acquire 7-Eleven?

The article discusses the potential acquisition of the Japanese convenience store chain 7-Eleven by the Canadian company Alimentation Couche-Tard (ACT), and the implications this would have on the unique convenience store culture in Japan.

A Machine Learning Möbius: Can Models Learn from Each Other?

The article explores the potential of machine learning models to learn from each other, drawing inspiration from human learning processes. It discusses various approaches, such as iterative refinement, distillation, and self-teaching, that aim to make AI more accessible and democratized. The article focuses on the intersection of human-machine learning, highlighting the challenges of data scarcity and the need for more efficient training methods.

My one-word AI prompt to induce deeper reasoning and more accurate output from ChatGPT: “RUMINATE”

The article discusses how current generative AI models, while impressive in their speed, often struggle with simple tasks like counting the number of ‘r’s in the word “strawberry”. The author proposes that by prompting the AI to “ruminate” on the task, it can be encouraged to slow down and engage in more deliberate, System 2 style reasoning, leading to more accurate results.
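What makes this probe question so handy is that it's trivial to verify outside the model, so you can score the AI's answer deterministically. A quick sanity check, with the article's one-word prompt included as a string:

```python
# The article's probe question is easy to verify outside the model,
# which is what makes it a good test of the "RUMINATE" prompt.
PROMPT = 'RUMINATE: How many times does the letter "r" appear in "strawberry"?'

def count_letter(word: str, letter: str) -> int:
    """Deterministic ground truth for the counting task."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # → 3
```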

All your work: gone.

The article discusses the risks and challenges faced by designers and freelancers when working on high-profile projects, particularly the potential for data loss and the legal consequences of not meeting deadlines.

Welcome to the Era of the Gritty Startup

The article discusses the changing landscape of the startup ecosystem, particularly in Silicon Valley, as it transitions from the “age of glut” to the “era of the gritty startup.” It highlights three key issues that characterized the previous era and how startups can adapt to the new reality.

Amazon is using my grocery purchases to sell me prescription drugs

The article discusses the author’s experience with Amazon’s recommendation of cholesterol treatments and the broader implications of Amazon’s growing presence in the healthcare industry.

Generative Verifiers: Reward Modeling as Next-Token Prediction

The paper proposes Generative Verifiers (GenRM), which recast verification as next-token prediction in large language model (LLM) reasoning domains. Key points:

  • GenRM is a more performant alternative to discriminative reward models, and unlocks the use of powerful tools like chain-of-thought reasoning and majority voting for better verification.
  • GenRM unifies generation and verification into a single LLM, and demonstrates that such unification benefits both generation and verification.
  • GenRM can effectively utilize synthetic model-generated rationales, which are noisy and sub-optimal, to identify reasoning errors in grade school math problems.
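The core trick is worth spelling out: once verification is framed as next-token prediction, the "reward" is just the probability the verifier assigns to a correctness token. Here's a minimal sketch of that scoring step, with toy logits standing in for a real LLM's output distribution (the "Yes"/"No" token choice is illustrative):

```python
# Minimal sketch of the GenRM scoring idea: the reward for a candidate
# solution is P("Yes" | question, solution, "Is the answer correct?").
# The logits below are toy numbers, not from a real model.
import math

def softmax(logits):
    m = max(logits.values())  # subtract max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

def genrm_score(next_token_logits):
    """Reward = probability mass on the correctness token."""
    return softmax(next_token_logits).get("Yes", 0.0)

score = genrm_score({"Yes": 2.0, "No": 0.5, "Maybe": -1.0})
print(round(score, 3))
```

Because the verifier is itself a generator, you can sample several chain-of-thought rationales before this final token and average the scores, which is where the majority-voting benefit comes from.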

Text2SQL is Not Enough: Unifying AI and Databases with TAG

The article proposes a unified paradigm called “Table-Augmented Generation” (TAG) for answering natural language questions over databases. It highlights the limitations of existing methods like Text2SQL and Retrieval-Augmented Generation (RAG) in handling queries that require semantic reasoning or world knowledge beyond what is directly available in the database. The TAG paradigm aims to combine the reasoning capabilities of language models (LMs) with the computational power of database management systems (DBMS) to answer a broader range of natural language queries over data.
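The shape of the pipeline is easy to sketch: an LM synthesizes a query, the DBMS executes it, and the LM generates the final answer from the returned rows. In this toy version the two LM steps are stubbed out (the table, the question, and the canned SQL are all my own examples); only the DBMS step, via sqlite3, is real:

```python
# Hedged sketch of the three TAG steps: (1) LM synthesizes a query,
# (2) DBMS executes it, (3) LM generates the answer over the rows.
# The LM steps are hardcoded stubs; a real system would prompt an LLM.
import sqlite3

def synthesize_query(question: str) -> str:
    # Stub for the LM's query-synthesis step.
    return "SELECT title FROM movies WHERE year = 1994"

def generate_answer(question: str, rows) -> str:
    # Stub for the LM's answer-generation step over the retrieved rows.
    return f"Rows relevant to {question!r}: {[r[0] for r in rows]}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, year INTEGER)")
conn.executemany("INSERT INTO movies VALUES (?, ?)",
                 [("Pulp Fiction", 1994), ("Heat", 1995)])

question = "Which 1994 movies are in the table?"
rows = conn.execute(synthesize_query(question)).fetchall()
print(generate_answer(question, rows))
```

The division of labor is the whole idea: the DBMS does exact computation over the full table, while the LM contributes the semantic reasoning that plain Text2SQL can't express.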

Cartesia

The article discusses Cartesia’s mission to build the next generation of ubiquitous, interactive AI that can run on any device. It introduces three key releases:

  • Edge: An open-source library for developing efficient on-device AI models using state space models (SSMs)
  • Rene: An open-source 1.3B parameter language model designed for efficient on-device inference
  • Sonic On-Device: A generative voice model that supports low-latency real-time streaming on-device

The article highlights the advantages of on-device AI over cloud-based approaches, such as reduced data transfer, lower latency, and increased privacy and security. It also discusses how new model architectures like SSMs are key to enabling powerful yet efficient AI models that can run on the edge.
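For readers wondering why SSMs are such a good fit for the edge: a (discretized) linear state space model updates a fixed-size state each step, so streaming inference costs constant memory per token. Here's a minimal scalar sketch of that recurrence, with illustrative coefficients that have nothing to do with any actual Cartesia model:

```python
# Minimal scalar sketch of the discretized linear SSM recurrence:
#   x_t = a * x_{t-1} + b * u_t,   y_t = c * x_t
# A fixed-size state per step is what makes on-device streaming cheap.
# Coefficients are illustrative only.

def ssm_scan(inputs, a=0.9, b=0.5, c=1.0, x0=0.0):
    x, outputs = x0, []
    for u in inputs:
        x = a * x + b * u      # state update: one multiply-add, O(1) memory
        outputs.append(c * x)  # readout
    return outputs

print(ssm_scan([1.0, 0.0, 0.0]))  # exponentially decaying impulse response
```

Contrast this with attention, where the per-step cost grows with the length of the context you've already seen.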

Value Innovation: How To Win In Crowded Markets By Ignoring Competitors — Frontera

The article discusses how Rolls-Royce, a smaller aircraft engine manufacturer, was able to gain a dominant position in the booming commercial aviation industry of the 1960s by innovating on value rather than competing on traditional factors like price, speed, or efficiency.

How to Use AI for Technical SEO, Straight from HubSpot’s Tech SEO Team

The article discusses the use of AI in technical SEO, covering various use cases and the overall value proposition of AI for SEO practitioners.

The rise and fall of OpenSea

The article discusses the rise and fall of the NFT (Non-Fungible Token) market, with a focus on the challenges faced by the leading NFT marketplace, OpenSea. It explores how OpenSea, a startup inspired by cat JPEGs, transformed into a complex company dealing with regulatory scrutiny, internal conflicts, and competition from emerging platforms like Blur.

First MLPerf benchmarks for Nvidia Blackwell, AMD, Google, Untether AI

The article discusses the latest developments in the AI inference chip market, with a focus on the recent MLPerf Inference v4.1 competition results. It highlights the performance and power efficiency of various chips from companies like Nvidia, AMD, Google, Untether AI, Cerebras, and Furiosa, and how they are challenging Nvidia’s dominance in the AI inference space.

Our website: https://aili.app

Follow us on X (Twitter): https://x.com/aili_app

Join our discord channel: https://discord.gg/CQtysdQfDM
