AI Top-of-Mind for 7.1.24 — Hallucinations (or not)

dave ginsburg · Published in AI.society · 3 min read · Jul 1, 2024

Top of mind: some good insight by Ignacio de Gregorio into how LLMs actually predict text and the nature of hallucinations. One good takeaway is captured in the diagram below, where the next word is chosen by probability, so the LLM is not guaranteed to reply with ‘playground.’

Source: Ignacio de Gregorio
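The idea behind the diagram can be sketched in a few lines. This is a toy illustration, not a real model: the prompt, the candidate words, and their probabilities are made up for the example, and a real LLM would derive them from its logits.

```python
import random

# Toy next-token distribution for a prompt like "The kids went to the ..."
# (illustrative probabilities, not taken from a real model)
next_token_probs = {
    "playground": 0.46,
    "park": 0.31,
    "beach": 0.14,
    "moon": 0.09,
}

def sample_next_token(probs, temperature=1.0):
    """Sample one token; higher temperature flattens the distribution."""
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(list(probs.keys()), weights=weights, k=1)[0]

# 'playground' is the most likely completion, but any of the four can appear
print(sample_next_token(next_token_probs))
```

Because generation is sampling rather than lookup, a perfectly "correct" answer is only the most probable one, never a certainty, which is the crux of de Gregorio's point about hallucinations.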

From the post:

A hallucination implies an incorrect perception of the world that makes someone generate statements that are not grounded in reality. But that’s the thing: LLMs aren’t capable of perceiving reality.

And two approaches to maximize ‘truthfulness’:

  • In entropy minimization, the model has an inductive bias toward lower-entropy responses. In other words, it generates multiple responses and, to discriminate among them, takes the response that rests on the fewest assumptions, i.e. the simplest one, as the best answer, something some of you will find akin to Occam’s razor.
  • In test-time fine-tuning, Jack Cole and Mohamed Osman are actively pursuing a solution to the famous ARC-AGI benchmark (the hardest benchmark for LLMs) by fine-tuning the model at inference time.
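The entropy-minimization idea can be sketched with a toy scoring function: among several candidate responses, prefer the one whose per-token distributions are most peaked (lowest Shannon entropy), i.e. the answer the model is most confident about. The candidate sentences and their token probabilities below are invented for illustration; a real implementation would read them from the model's logits.

```python
import math

def sequence_entropy(token_probs):
    """Average per-token Shannon entropy (in bits) over a response.
    token_probs: one probability distribution per generated token."""
    total = 0.0
    for dist in token_probs:
        total += -sum(p * math.log2(p) for p in dist.values() if p > 0)
    return total / len(token_probs)

# Illustrative candidates with mocked per-token distributions
candidates = {
    "The capital of France is Paris.": [
        {"Paris": 0.92, "Lyon": 0.05, "Nice": 0.03},   # confident
    ],
    "The capital of France is, arguably, Lyon.": [
        {"Paris": 0.40, "Lyon": 0.35, "Nice": 0.25},   # uncertain
    ],
}

# Entropy minimization: keep the answer the model is most confident about
best = min(candidates, key=lambda c: sequence_entropy(candidates[c]))
print(best)  # the low-entropy candidate
```

The confident candidate scores roughly 0.48 bits/token versus about 1.56 for the hedged one, so the simple rule picks it, a mechanical stand-in for "fewest assumptions wins."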

Also on models: how does one detect AI-generated text? I’ve written on this topic previously, but there is a good update by Matthew MacDonald, writing in ‘Young Coder,’ showing how current tools generate both false positives and false negatives. There are also tools like ‘GPThero’ that will take AI-generated text and ‘humanize’ it to pass AI detectors. An arms race of the bots! Read his analysis for more.
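Why do detectors misfire in both directions? Many of them boil down to a predictability threshold: text a language model finds very predictable gets flagged as AI. The toy classifier below (with made-up per-token probabilities and a made-up threshold) shows how that single cut-off produces both failure modes MacDonald describes.

```python
import math

def mean_neg_log_prob(token_probs):
    """Pseudo-perplexity: average negative log2-probability per token.
    Low score = very predictable text, which detectors tend to flag as AI."""
    return sum(-math.log2(p) for p in token_probs) / len(token_probs)

# Mocked per-token probabilities under some scoring model (illustrative only)
samples = {
    "ai_generated":       [0.9, 0.8, 0.85, 0.9, 0.75],   # fluent, predictable
    "human_formulaic":    [0.85, 0.9, 0.8, 0.9, 0.85],   # boilerplate human prose
    "ai_after_humanizer": [0.4, 0.3, 0.5, 0.35, 0.45],   # rewritten to look surprising
}

THRESHOLD = 0.5  # bits per token; below this, flag as AI

for name, probs in samples.items():
    score = mean_neg_log_prob(probs)
    print(f"{name}: {score:.2f} bits/token, flagged_as_AI={score < THRESHOLD}")
```

Formulaic human writing lands below the threshold (a false positive), while "humanized" AI text lands above it (a false negative), which is exactly the gap tools like GPThero exploit.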

On the policy front, an update by ‘EE Times’ on the AI Alliance and how it is faring six months in. There are now over 100 members, with focus areas including:

  • Scalable data tooling and pipelines
  • Navigating the application design space: Fine tuning, RAG, and iterative reasoning
  • Application reference implementations
  • Industry-specific foundation models
  • Multi-HW model inference deployment
  • AI for Science, with a focus on materials and chemistry
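One bullet above, navigating the application design space across fine-tuning, RAG, and iterative reasoning, is worth a concrete sketch. Below is a minimal retrieval-augmented-generation (RAG) retrieval step using bag-of-words cosine similarity; the documents and query are hypothetical, and a production pipeline would use vector embeddings and an actual LLM call rather than this toy matcher.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical knowledge base (a real pipeline would use embeddings)
docs = [
    "Fine-tuning adapts model weights on domain data.",
    "RAG retrieves documents and adds them to the prompt.",
    "Iterative reasoning asks the model to refine its own answer.",
]

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return scored[:k]

query = "How does RAG add documents to the prompt?"
context = retrieve(query, k=1)
prompt = f"Context: {context[0]}\nQuestion: {query}"
print(prompt)
```

The retrieved passage is prepended to the prompt so the model answers from supplied context instead of parametric memory, the core trade-off versus fine-tuning that the Alliance's design-space work compares.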

Meanwhile, the AI Safety and Trust working group has its own set of focus areas on its plate; see the EE Times article for the full list.

And an update on copyright protection. Enrique Dans looks at the latest, this time involving AI music generation. Warner, Sony, and Universal have entered a joint lawsuit against Suno and Udio, the latter of which I covered a week or so ago. We’ll see how this plays out, but there must be some middle ground. From Reuters:

  • The lawsuits are the first to target music-generating AI following several cases brought by authors, news outlets and others over the alleged misuse of their work to train text-based AI models powering chatbots like OpenAI’s ChatGPT. AI companies have argued that their systems make fair use of copyrighted material.
  • “Unlicensed services like Suno and Udio that claim it’s ‘fair’ to copy an artist’s life’s work and exploit it for their own profit without consent or pay set back the promise of genuinely innovative AI for us all,” Mitch Glazier, CEO of the Recording Industry Association of America, said in a statement.

Then turning to corporate news, more from Jim Clyde Monge in ‘Generative AI’ on Stability AI’s struggles. One point he raises is the backlash the company is facing over its new, less open licensing model; a second is the quality of the images generated by Stable Diffusion 3 2B. He offers a good summary:

The situation at Stability AI reflects the broader tension in the AI industry between openness and commercialization. While profit-driven motives can fuel innovation and sustainability, they can also conflict with the collaborative spirit that has driven much of AI’s rapid progress.

Lastly, if you are looking for a good AI voice generation tool, Artturi Jalli offers his take. His detailed analysis of capabilities and access options covers three top picks (Murf AI, Play.HT, and Lovo AI) as well as a few others.


Lifelong technophile and author with background in networking, security, the cloud, IIoT, and AI. Father. Winemaker. Husband of @mariehattar.