AI Top-of-Mind for 7.9.24 — Hallucinations

dave ginsburg · Published in AI.society
3 min read · Jul 9, 2024

Today: Ending hallucinations, updates on Google Astra and Apple’s OpenELM, more on agents, and graph-based RAG

Top of mind today: LLM hallucinations, and whether they can be avoided. Ignacio de Gregorio in ‘DataDrivenInvestor’ thinks they can, and introduces ‘Lamini-1,’ along with the concepts of ‘overfitted’ and ‘underfitted’ models and what each means for correct responses. He offers a good summary as well:

  • Open-source models as fact retrievers and “chatting with data” use cases.
  • Private-source models for copilots and processes that require hefty human oversight and continuous iteration (like coding).
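The overfitting/underfitting distinction he draws can be made concrete with a small experiment (a generic illustration, not Lamini-1’s method): fit polynomials of increasing degree to noisy data and compare held-out error. The degrees and data here are invented for the sketch.

```python
# Sketch: underfit vs. overfit on synthetic data, judged by validation error.
# Degrees 1 / 5 / 15 stand in for underfitted, reasonable, and overfitted models.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 60))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 60)
X_train, X_val, y_train, y_val = train_test_split(
    x.reshape(-1, 1), y, random_state=0
)

errs = {}
for degree in (1, 5, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    errs[degree] = np.mean((model.predict(X_val) - y_val) ** 2)
    print(f"degree {degree}: validation MSE {errs[degree]:.3f}")
```

The underfitted degree-1 model misses the signal entirely, while the right-sized model tracks it; the same train-vs-validation gap is what the "overfitted model" discussion is about, just at LLM scale.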
Source: Ignacio de Gregorio

Next, an update on Google’s ‘Astra’ project by Sai Viswanth in ‘Towards AI,’ covering the Gemini 1.5 models. He discusses the Mixture of Experts architecture as well as performance versus other LLMs. From the post:

Source: Google
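At a high level, the Mixture of Experts design used in models like Gemini 1.5 routes each token to a small subset of expert networks, so only a fraction of the total parameters is active per token. A toy NumPy sketch of top-k routing (weights are random and dimensions invented; this is not Google’s implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# One tiny linear "expert" per slot; in a real model these are feed-forward nets.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
router = rng.normal(size=(d_model, n_experts))  # learned in a real model

def moe_layer(token):
    logits = token @ router
    top = np.argsort(logits)[-top_k:]  # pick the top-k scoring experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over them
    # Only top_k of n_experts run, so compute scales with k, not with n_experts.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.normal(size=d_model))
print(out.shape)
```

The payoff is the comment in the middle: total parameter count can grow with the number of experts while per-token compute stays roughly constant.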

Keeping with updates, where are we with AI agent deployments, and why are so many falling short? Logan Kilpatrick in ‘Around the Prompt’ looks at existing tools and their limitations.

Source: Logan Kilpatrick

And one more update, this time looking at Apple’s ‘OpenELM’ models and their suitability for running locally on a smartphone. Coming after the earlier release of ‘Ferret,’ they are part of Apple’s overall on-device strategy. Dylan Cooper, writing in ‘Stackademic,’ dives into the different models, including their performance and training datasets. From the post:

Companies like Qualcomm and MediaTek have introduced smartphone chipsets that can meet the processing power required for AI applications. Previously, many AI applications on devices were actually partially processed in the cloud and then downloaded to the phone. However, cloud-based models also have drawbacks, such as high inference costs, with some AI startups spending around $1 just to train + generate a single image. Advanced chips and edge-side models will drive more AI applications to run on smartphones, saving costs while providing users with better real-time computing power, thereby spawning new business models.

Link to the original publication at arXiv.
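To see why edge-side models and advanced chips matter, here is a back-of-the-envelope sketch of the memory needed just to hold model weights at different precisions. The parameter counts are the published OpenELM sizes (rounded); the precision options and the omission of activation/KV-cache memory are simplifying assumptions, not figures from the paper.

```python
# Rough weight-only memory footprint: params * bytes-per-weight.
# Ignores activations and KV cache, so real usage is higher.
openelm_params = {"270M": 270e6, "450M": 450e6, "1.1B": 1.1e9, "3B": 3e9}
bytes_per_weight = {"fp16": 2, "int8": 1, "int4": 0.5}

sizes = {}
for name, n in openelm_params.items():
    sizes[name] = {fmt: n * b / 2**30 for fmt, b in bytes_per_weight.items()}
    print(name, {fmt: f"{gib:.2f} GiB" for fmt, gib in sizes[name].items()})
```

Even the 3B variant fits comfortably in a modern phone’s RAM once quantized, which is what makes the shift from cloud inference to on-device inference plausible.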

Diving a bit deeper: are current vector-based RAG architectures ideal, or is there something more flexible and lower-cost? Aniket Hingane makes the case that there is, based on a graph-based approach. His summary:

The graph model’s strengths really shine for diverse enterprise datasets containing multimodal data. You can simply add new data entities as nodes and define their relationships, without reprocessing everything. This flexibility and efficiency makes it highly scalable.
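The incremental-update property Hingane describes can be sketched with a tiny property graph: new entities and relationships are appended, and nothing already ingested is reprocessed. A stdlib-only illustration with invented entity names (a real system would use a graph database and an entity-extraction step):

```python
from collections import defaultdict

# Minimal property graph: subject -> list of (relation, object) facts.
graph = defaultdict(list)

def add_fact(subject, relation, obj):
    graph[subject].append((relation, obj))

# Initial facts extracted from documents.
add_fact("Acme Corp", "manufactures", "Widget X")
add_fact("Widget X", "sold_in", "EU market")

# A new document arrives later: just append facts -- no re-embedding,
# no reprocessing of what is already in the graph.
add_fact("Acme Corp", "operates", "Berlin plant")

def retrieve(entity, depth=1):
    """Collect facts within `depth` hops of the query entity."""
    frontier, facts = {entity}, []
    for _ in range(depth):
        nxt = set()
        for s in frontier:
            for rel, o in graph[s]:
                facts.append((s, rel, o))
                nxt.add(o)
        frontier = nxt
    return facts

print(retrieve("Acme Corp"))
```

Contrast this with a vector index, where the new document would need chunking and embedding before it became retrievable; here the graph walk picks up the fresh fact immediately.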


Lifelong technophile and author with background in networking, security, the cloud, IIoT, and AI. Father. Winemaker. Husband of @mariehattar.