Reading Digest, July #11

Daniel Chen
Journey Into AI with Aili
17 min read · Jul 17, 2024


Hey there, my wonderful readers! I hope you’re ready for another exciting edition of my daily reading digest. If you’re new here, get ready to be amazed by the captivating world of online content. And if you’re a regular, thank you for your continued support — it truly means the world to me!

Today’s digest is an absolute treasure trove of intriguing topics, ranging from the ease with which cops can break into your phone to the shocking scale of sex trafficking allegations at Red Roof Inn hotels. We’ll explore the latest developments in AI, including ChatGPT’s ability to do data science, ex-Meta scientists debuting a gigantic AI protein design model, and the new “window.ai” API that will blow your mind.

But that’s not all — we’ve got some fascinating pieces on the world of politics, including Trump’s entrance into the Republican convention hall with a bandaged ear, his selection of Sen. JD Vance as his GOP running mate, and Biden’s combative NBC interview days after the Trump assassination attempt. We’ll also take a closer look at what we know about the gunman who shot at Trump at the Pennsylvania rally.

For the tech enthusiasts out there, we’ll dive into link-in-bio tools, Google AI Overviews, and the fastest object ever made by humans. We’ll also explore the art of efficient creativity with AI and why startups fall apart at 50 employees.

But that’s just the tip of the iceberg, my friends. From Thailand’s controversial $13.8 billion plan to hand out digital money to citizens to the question of why 80% of Mexico is nearly empty, this digest has something for everyone. We’ll even take a closer look at the financial frontier and how Microsoft’s Satya Nadella became tech’s steely-eyed A.I. gambler.

So, grab your favorite beverage, get comfortable, and join me on this thrilling journey through the world of online content. I can’t wait to hear your thoughts and reactions in the comments below!

Happy reading, my amazing friends!

It’s never been easier for the cops to break into your phone

The article discusses the FBI’s ability to access the phone of the shooter in the recent assassination attempt at former President Trump’s rally in Pennsylvania. It examines the tools and techniques law enforcement agencies use to extract data from phones, including devices like Cellebrite and GrayKey. The article also revisits the FBI’s past attempts to compel tech companies like Apple to help break into encrypted phones, and the ongoing tension between law enforcement needs and privacy and security concerns.

Thailand is set to roll out a controversial $13.8 billion handout plan in digital money to citizens

The article discusses Thailand’s plan to provide digital cash handouts to eligible businesses and individuals in an effort to boost the country’s lagging economy.

FBI is working to break into the phone of the Trump rally shooter

The article discusses the investigation into the shooting incident at a Pennsylvania rally, where the shooter’s phone is being examined by the FBI to uncover his motives.

The FBI says it has ‘gained access’ to the Trump rally shooter’s phone

The article discusses the FBI’s successful efforts to break into the phone of the man who shot at former President Donald Trump during a rally in Butler, Pennsylvania. The FBI has conducted extensive investigations, including searching the subject’s residence and vehicle, interviewing law enforcement personnel, event attendees, and other witnesses, and reviewing hundreds of digital media tips. The FBI has also offered assistance to the victims of the incident.

Donald Trump enters Republican convention hall with a bandaged ear and gets a hero’s welcome

The article discusses the opening night of the Republican National Convention, which took place just days after an apparent assassination attempt on former President Donald Trump. It covers the triumphant appearance of Trump at the convention, the party’s calls for unity and attacks on the Biden administration, as well as the nomination of J.D. Vance as Trump’s running mate.

Biden snaps repeatedly at Lester Holt in combative NBC interview days after Trump assassination attempt: ‘What’s with you guys?’

The article discusses a tense interview between President Biden and NBC anchor Lester Holt, where Biden repeatedly scolded Holt and defended his criticism of former President Trump. The article also covers Biden’s comments on the assassination attempt on Trump, his mental acuity, and the performance of the Secret Service.

What we know about the gunman who shot at Trump at Pennsylvania rally

The article discusses the attempted assassination of former U.S. President Donald Trump at a rally in Butler, Pennsylvania. The gunman, identified as 20-year-old Thomas Matthew Crooks, opened fire with a semi-automatic AR-15 rifle while Trump was speaking on stage. One bullet struck Trump in the right ear before the Secret Service surrounded him and rushed him to safety. Crooks was shot dead by a Secret Service sniper at the scene. The FBI is investigating the shooting as a potential act of domestic terrorism, but the motive remains unknown.

Trump picks Sen. JD Vance of Ohio, a once-fierce critic turned loyal ally, as his GOP running mate

The article discusses former President Donald Trump’s selection of Senator J.D. Vance of Ohio as his running mate for the 2024 presidential election. It provides details on Vance’s background, his relationship with Trump, and the potential impact of this pick on the Republican ticket.

Can ChatGPT do data science?

The article discusses the challenges data scientists face when using ChatGPT for data science tasks, and provides recommendations for designing AI-powered data science tools.

This Is The Fastest Object Ever Made by Humans, And It’s Not Slowing Down

The article discusses the record-breaking speeds achieved by NASA’s Parker Solar Probe, which is designed to study the Sun’s outer corona up close.

Ex-Meta scientists debut gigantic AI protein design model

The article discusses how AI tools are being used to design entirely new proteins that could transform medicine. It focuses on the work of EvolutionaryScale, a company that has developed a powerful protein language model called ESM3, which can be used to create new proteins to specifications provided by users.

What’s A Link In Bio Tool? And What Should You Look For In One?

The article discusses the problem of managing multiple links for brands and creators on social media platforms, and introduces link-in-bio tools as a solution. It provides an overview of the key features and benefits of 8 popular link-in-bio tools: Linktree, Flodesk, Shorby, Hopp by Wix, Linkin.bio by Later, Beacons, Campsite.bio, and ContactInBio.

Google AI Overviews only show for 7% of queries, a new low

The article discusses the recent changes and trends observed in Google’s AI Overviews, a search feature that provides summarized information directly on the search results page. The key findings from the analysis by BrightEdge, an enterprise SEO platform, are presented.

How annual pre-pay creates an infinite marketing budget

The article discusses techniques that can transform the cash-flow of a business, particularly for SaaS (Software-as-a-Service) companies. It explores how growth affects cash-flow and provides several techniques to improve the cash-flow metrics, such as:

  • Optimizing advertising and marketing efforts to reduce Customer Acquisition Cost (CAC)
  • Increasing Average Revenue per Customer (ARPC) by raising prices
  • Offering annual billing plans to improve cash-flow and marketing budget
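To make the annual billing point concrete, here is a minimal sketch of the cash-flow arithmetic. The numbers (a $300 CAC, a $50 monthly price, a 20% annual discount) and both function names are hypothetical illustrations, not figures or code from the article, and the model ignores churn and assumes a positive price.

```python
def cash_collected(month, monthly_price, annual_prepay=False, annual_discount=0.0):
    """Cumulative cash from one customer by the end of `month` (1-based), ignoring churn."""
    if annual_prepay:
        # A full year is billed up front at the start of each 12-month period.
        years_billed = (month - 1) // 12 + 1
        return years_billed * 12 * monthly_price * (1 - annual_discount)
    return month * monthly_price


def payback_month(cac, monthly_price, **billing):
    """First month in which cumulative cash covers the customer acquisition cost."""
    month = 1
    while cash_collected(month, monthly_price, **billing) < cac:
        month += 1
    return month


# Hypothetical numbers, chosen only for illustration.
CAC = 300
PRICE = 50

monthly_payback = payback_month(CAC, PRICE)
annual_payback = payback_month(CAC, PRICE, annual_prepay=True, annual_discount=0.2)
first_month_cash = cash_collected(1, PRICE, annual_prepay=True, annual_discount=0.2)

print(monthly_payback, annual_payback, first_month_cash)
```

With monthly billing, the ad spend for a customer comes back only after several months. With annual pre-pay, the first payment already exceeds CAC in this example, so the surplus can be reinvested in marketing right away.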

Taming the tail utilization of ads inference at Meta scale

The article discusses the challenges and solutions around improving tail utilization in Meta’s ads delivery system, which relies on sophisticated machine learning models. It covers the infrastructure requirements, system optimizations, and best practices implemented to address tail utilization and improve the overall performance and reliability of the ads inference service.

Gradient Boosting Reinforcement Learning

The paper introduces Gradient Boosting Reinforcement Learning (GBRL), a framework that extends the advantages of Gradient Boosting Trees (GBT) to the reinforcement learning (RL) domain. GBRL implements various actor-critic algorithms and compares their performance with neural network (NN) counterparts. The paper also presents a GPU-accelerated GBRL implementation that integrates with popular RL libraries.

OpenAI Revenue Report — FUTURESEARCH

The article discusses the methods used by FutureSearch, an AI tool, to estimate the Annual Recurring Revenue (ARR) of OpenAI’s products, including ChatGPT. It outlines the three-step process FutureSearch uses: primary and secondary research, modeling the domain, and estimating the unknown.

MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis

The article introduces MARS, a novel auto-regressive framework that retains the capabilities of pre-trained Large Language Models (LLMs) while incorporating exceptional text-to-image (T2I) generation abilities. The key contributions are:

  1. The design of the Semantic Vision-Language Integration Expert (SemVIE) module, which seamlessly integrates the pre-trained LLM with a trainable visual expert, preserving the NLP capabilities while endowing the model with advanced visual understanding.
  2. A multi-stage refinement training strategy that significantly enhances MARS’ robust instruction-following capability and its ability to generate high-quality images with rich details.
  3. MARS demonstrates strong performance on various benchmarks, including MS-COCO, T2I-CompBench, and human evaluations, while requiring only 9% of the training budget of Stable Diffusion v1.5.
  4. MARS possesses bilingual generation capabilities, handling both English and Chinese language prompts, and the flexibility to perform joint image and text generation tasks.

Meet the AI Agent Engineer

The article discusses the role of Agent Engineers at Sierra, a company that enables businesses to build their own branded, customer-facing AI agents for customer service, commerce, and more. It highlights the key responsibilities and skills required for this emerging engineering discipline, as well as the technical challenges involved in building and deploying AI agents at scale.

Remembering Shannen Doherty: A Gen X Icon Who Fought Like Hell to Live

The article is a tribute to the life and legacy of actress Shannen Doherty, who passed away at the age of 53. It highlights her iconic roles in shows like “Beverly Hills, 90210” and “Charmed,” as well as her personal struggles with cancer and the public scrutiny she faced as a young celebrity.

Trump rally shooting: what we know about the suspected gunman

The article provides a detailed account of the background and circumstances surrounding the attempted assassination of former U.S. President Donald Trump by a 20-year-old Pennsylvania man named Thomas Matthew Crooks.

BM25S: Orders of magnitude faster lexical search via eager sparse scoring

The article introduces BM25S, an efficient Python-based implementation of the BM25 lexical search algorithm that achieves significant speedups compared to existing Python-based and Java-based implementations. The key points are:

  • BM25S eagerly computes BM25 scores during indexing and stores them in sparse matrices, enabling faster slicing and summations at query time.
  • BM25S reproduces the exact implementation of five BM25 variants by extending the eager scoring approach to non-sparse variants using a novel score shifting method.
  • BM25S includes a fast Python-based tokenizer that combines Scikit-Learn’s text splitting, Elastic’s stopword list, and an optional C-based Snowball stemmer.
  • BM25S achieves up to 500x speedup compared to popular Python-based frameworks and considerable speedups compared to highly optimized Java-based implementations.
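The eager scoring idea in the first bullet can be sketched in a few lines of plain Python. This is not the BM25S library's code or API: the function names are hypothetical, a dict of dicts stands in for the sparse matrices, whitespace splitting stands in for the tokenizer, and the Lucene-style BM25 formula with k1 = 1.5 and b = 0.75 is an assumption. The corpus must be non-empty.

```python
import math
from collections import Counter, defaultdict


def build_index(corpus_tokens, k1=1.5, b=0.75):
    """Precompute the BM25 score of every (term, document) pair at indexing time."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))

    index = defaultdict(dict)  # term -> {doc_id: score}; only nonzero entries are stored
    for doc_id, doc in enumerate(corpus_tokens):
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        for term, tf in Counter(doc).items():
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            index[term][doc_id] = idf * tf / (tf + length_norm)
    return index


def search(index, query_tokens, n_docs, top_k=3):
    """Query time is just a lookup and a sum over the precomputed scores."""
    scores = [0.0] * n_docs
    for term in query_tokens:
        for doc_id, score in index.get(term, {}).items():
            scores[doc_id] += score
    ranked = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:top_k]]


corpus = ["a cat sat on the mat", "dogs chase cats", "the quick brown fox"]
tokens = [doc.split() for doc in corpus]
index = build_index(tokens)
results = search(index, "cat mat".split(), n_docs=len(corpus))
print(results)
```

All of the BM25 arithmetic happens once in `build_index`; `search` only adds stored numbers, which is where the query-time savings described above come from.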

Vision language models are blind

The article examines the limitations of large language models with vision capabilities (VLMs) in performing simple visual tasks that are trivial for humans. It introduces a new benchmark called “BlindTest” to systematically evaluate the visual perception of four state-of-the-art VLMs: GPT-4o, Gemini-1.5 Pro, Claude-3 Sonnet, and Claude-3.5 Sonnet.

Data, Data Everywhere: A Guide for Pretraining Dataset Construction

The article discusses the process of pretraining dataset construction for large language models. It highlights the lack of open information on how to develop effective pretraining sets and aims to provide insights across all steps of pretraining set development.

MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions

The article discusses the MiraData dataset, a new video dataset for video generation, and the potential societal impacts of advancements in video generation technology.

FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

The paper presents FlashAttention-3, a new algorithm for speeding up attention on Hopper GPUs. The key contributions are:

  1. Exploiting asynchrony of the Tensor Cores and TMA to overlap computation and data movement via warp-specialization.
  2. Overlapping block-wise matmul and softmax operations by pipelining the computations across iterations.
  3. Leveraging hardware support for FP8 low-precision computation through block quantization and incoherent processing to further improve performance.

The authors demonstrate that FlashAttention-3 achieves 1.5–2.0x speedup over the previous FlashAttention-2 algorithm on H100 GPUs, reaching up to 740 TFLOPs/s (75% utilization) in FP16 and close to 1.2 PFLOPs/s in FP8. They also show that FP8 FlashAttention-3 achieves 2.6x lower numerical error compared to a baseline FP8 attention implementation.

Inside the Pod: AI and the Art of Efficient Creativity

The article discusses how AI tools like ChatGPT can help writers and creative professionals automate tedious tasks, allowing them to focus on the parts of their work they enjoy the most. It explores how author Seth Stephens-Davidowitz used ChatGPT to write his latest book in just 30 days by outsourcing tasks he found boring, such as formatting citations, proofreading, and data analysis.

Fight!

The article discusses the current state of online discourse and the negative impacts of the incentive structures of social media platforms, which are fueling polarization, conspiracy theories, and a lack of empathy. It calls for individuals to take back control of their attention and choose to engage with more trustworthy communicators.

Why Is 80% of Mexico Nearly Empty?

The article explores the reasons behind the high population density in the central regions of Mexico, particularly the Mexican heartland, and how this geographic and environmental context has shaped the development of Mexican civilization, including the rise of the Aztec Empire and the construction of the famous Mexican pyramids.

Joe Biden Deserves to Be Elected

The article discusses President Biden’s performance at a press conference following the NATO summit, highlighting his ability to effectively communicate on domestic and foreign policy issues, in contrast to former President Trump’s incoherent and self-serving rhetoric. It also emphasizes Biden’s strong leadership in guiding the country out of a difficult period and his potential to defeat Trump in a future election.

Today’s violence

The article discusses the attempted assassination of Donald Trump at a rally in Pennsylvania and examines the connection between Trump’s political rhetoric and the rise in political violence and threats in the United States.

What is Old is New Again

The article discusses the sudden changes happening across the tech industry and what they mean for the next few years of software engineering. It covers topics such as the impact of interest rate changes, the smartphone and cloud revolutions, the new reality for software engineers, and how history may be repeating itself.

The shocking scale of sex trafficking allegations at Red Roof Inn hotels

The article discusses the widespread allegations of sex trafficking at Red Roof Inn hotels across the United States, with hundreds of lawsuits filed by victims against the hotel chain. The article delves into the details of the lawsuits, including testimony from victims, and examines the legal liability of the hotel chain for allegedly turning a blind eye to the trafficking activities taking place on their premises.

Speculative RAG By Google Research

The article discusses retrieval augmented generation (RAG), which combines the generative abilities of large language models (LLMs) with external knowledge sources to provide more accurate and up-to-date responses. It covers different approaches to RAG, including:

  • Standard RAG: Incorporates all documents into the prompt, with variations using summarization, re-ranking, and user feedback.
  • Self-Reflective RAG: Requires specialized instruction-tuning of the language model to generate specific tags for self-reflection.
  • Corrective RAG: Uses an external retrieval evaluator to refine document quality, focusing on contextual information without enhancing reasoning capabilities.
  • Speculative RAG: Leverages a larger generalist LM to efficiently verify multiple RAG drafts produced in parallel by a smaller, specialized LM, each based on a distinct subset of retrieved documents.
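The draft-then-verify flow in the last bullet can be sketched as follows. Everything here is a hypothetical stand-in rather than Google's implementation: `toy_drafter` and `toy_verifier` are plain functions in place of the smaller specialist LM and the larger generalist LM, and the round-robin split is an assumed simplification of how document subsets are formed. The document list must be non-empty.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subsets(documents, n_subsets):
    """Round-robin split so each draft sees a distinct subset of the retrieved documents."""
    return [documents[i::n_subsets] for i in range(n_subsets)]


def speculative_rag(question, documents, drafter, verifier, n_subsets=2):
    """Draft one answer per subset in parallel, then keep the draft the verifier scores highest."""
    subsets = [s for s in split_into_subsets(documents, n_subsets) if s]
    with ThreadPoolExecutor() as pool:
        drafts = list(pool.map(lambda subset: drafter(question, subset), subsets))
    return max(drafts, key=lambda draft: verifier(question, draft))


# Toy stand-ins for the two language models (assumptions for illustration only).
def toy_drafter(question, subset):
    return " ".join(subset)


def toy_verifier(question, draft):
    return len(set(question.lower().split()) & set(draft.lower().split()))


docs = [
    "Paris is the capital of France.",
    "Bananas are yellow.",
    "The Eiffel Tower is in Paris.",
    "Cats sleep a lot.",
]
answer = speculative_rag("What is the capital of France?", docs, toy_drafter, toy_verifier)
print(answer)
```

The point of the design is that the expensive model never reads every retrieved document; it only scores a handful of short drafts.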

Big Tech’s “attention rents”

The article discusses the concept of “the algorithm” and how it is used in various digital platforms and services. It explores the issues around algorithmic information sorting, the problems with “Surveillance Capitalism” theory, and the concept of “enshittification” — a theory of platform decay where platforms abuse both business customers and end-users. The article also covers topics like rent-seeking, digital rent extraction, and regulatory approaches to addressing platform abuses.

Why Startups Fall Apart at 50 Employees

The article discusses the challenges that startups face as they grow from a small team to around 50 employees, a phase the author refers to as the “teenage” stage of a startup. It outlines the common behavioral patterns observed among employees during this transition and provides strategies for managing the chaos that often arises.

Icebreaker

The article discusses the current state of the IPO market, highlighting the psychology and attention-based economy that shapes it. It focuses on the potential of Reddit as an undervalued company with a large user base and attention-capturing power, which could be the “icebreaker” to revive the IPO market. The article also touches on Reddit’s monetization challenges, governance structure, and its connection to the meme stock movement.

After Getting an IUD, I’m Convinced No One Cares About Women’s Health

The article discusses the author’s experience with getting a copper IUD (intrauterine device) as a form of long-term birth control, and the broader issues surrounding women’s healthcare and medical research. It highlights the lack of pain management and informed consent around IUD insertion, the gender bias in medical research, and the safety concerns around common women’s health products like tampons.

An Unsuccessful Video Game Turned Into a $26 Billion Startup…

The article discusses the remarkable success story of Slack, a B2B SaaS startup that became the fastest-growing in history, reaching a $1 billion valuation in just 8 months. It covers Slack’s origins as a video game company called Glitch, the team’s pivot to building an internal communication tool that eventually became Slack, and the key factors that enabled Slack’s rapid growth and success.

Starting with No: Why Most People Shouldn’t Be Managers

The article discusses the considerations and challenges around transitioning from an individual contributor role to a management role. It provides insights and advice on assessing one’s suitability and readiness for a management position.

The new “window.ai” API will blow your mind.

The article discusses Google’s introduction of Gemini Nano, an artificial intelligence model built into the Google Chrome browser. It explores the potential benefits and challenges of this new feature for startups and companies that rely on AI infrastructure.

The Financial Frontier

The article discusses the current state of the space industry, focusing on the growth and dominance of SpaceX and Starlink, as well as the emerging commercial opportunities and challenges in space exploration and utilization.

How Microsoft’s Satya Nadella Became Tech’s Steely Eyed A.I. Gambler

The article discusses Microsoft CEO Satya Nadella’s aggressive push into artificial intelligence (AI), including his risky bets on companies like OpenAI and Inflection AI. It explores Nadella’s vision for AI as a game-changer for Microsoft and the tech industry, as well as the challenges and setbacks the company has faced in its AI efforts.

Our website: https://aili.app

Follow us on X (Twitter): https://x.com/aili_app

Join our Discord channel: https://discord.gg/CQtysdQfDM
