AI Top-of-Mind for 7.9.24 — Hallucinations
Today: Ending hallucinations, updates on Google Astra and Apple’s OpenELM, more on agents, and graph-based RAG
Top of mind: can LLM hallucinations be avoided? Ignacio de Gregorio in ‘DataDrivenInvestor’ thinks so, introducing ‘Lamini-1’ along with the concepts of ‘overfitted’ and ‘underfitted’ models and what each means for correct responses. He offers a good summary as well:
- Open-source models as fact retrievers and “chatting with data” use cases.
- Private-source models for copilots and processes that require hefty human oversight and continuous iteration (like coding).
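The overfitted/underfitted distinction boils down to comparing training and validation loss. As a rough illustration (the heuristic, thresholds, and function name below are my own, not from the article):

```python
def diagnose_fit(train_loss: float, val_loss: float,
                 gap_threshold: float = 0.5,
                 high_loss_threshold: float = 1.0) -> str:
    """Crude heuristic: a large train/validation gap suggests the model
    has overfitted (memorized its training data), while uniformly high
    loss on both sets suggests it has underfitted (learned too little)."""
    if val_loss - train_loss > gap_threshold:
        return "overfitted"
    if train_loss > high_loss_threshold and val_loss > high_loss_threshold:
        return "underfitted"
    return "reasonable"

print(diagnose_fit(0.2, 1.1))  # large gap -> overfitted
print(diagnose_fit(1.8, 1.9))  # both high -> underfitted
```

The interesting twist in the Lamini-1 argument is that for pure fact retrieval, deliberately overfitting on facts can be desirable: a model that has memorized a fact exactly cannot hallucinate a paraphrase of it.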
And an update on Google’s ‘Astra’ project by Sai Viswanth in ‘Towards AI,’ covering the Gemini 1.5 models. He discusses the Mixture of Experts architecture as well as performance against other LLMs.
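The Mixture of Experts idea behind Gemini 1.5 can be sketched in a few lines: a gating function scores every expert, but only the top-k experts actually run on a given input, and their outputs are mixed by gate weight. A toy sketch (the toy experts and gate weights are illustrative, not Gemini’s actual architecture):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Sparse MoE step: score all experts, run only the top-k,
    and combine their outputs weighted by (renormalized) gate scores."""
    scores = [sum(xi * wi for xi, wi in zip(x, w)) for w in gate_weights]
    probs = softmax(scores)
    top = sorted(range(len(experts)), key=probs.__getitem__, reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    out = [0.0] * len(x)
    for i in top:
        y = experts[i](x)  # only the selected experts compute
        for j, yj in enumerate(y):
            out[j] += (probs[i] / norm) * yj
    return out

# toy experts: each just scales its input by a constant
experts = [lambda v, s=s: [s * vi for vi in v] for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.1, 0.0], [0.9, 0.0], [0.0, 0.9], [0.2, 0.2]]
print(moe_forward([1.0, 1.0], experts, gate_weights, top_k=2))  # -> [2.5, 2.5]
```

The payoff is that compute per token scales with k, not with the total number of experts, which is how these models grow capacity without growing inference cost proportionally.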
Keeping to updates: where are we with AI agent deployments, and why are many not succeeding? Logan Kilpatrick in ‘Around the Prompt’ looks at existing tools and their limitations.
And one more update, this time on Apple’s ‘OpenELM’ models and their suitability for running locally on a smartphone. Following the earlier release of ‘Ferret,’ this fits Apple’s overall strategy. Dylan Cooper, writing in ‘Stackademic,’ dives into the different models, including performance and training datasets. From the post:
Companies like Qualcomm and MediaTek have introduced smartphone chipsets that can meet the processing power required for AI applications. Previously, many AI applications on devices were actually partially processed in the cloud and then downloaded to the phone. However, cloud-based models also have drawbacks, such as high inference costs, with some AI startups spending around $1 just to train + generate a single image. Advanced chips and edge-side models will drive more AI applications to run on smartphones, saving costs while providing users with better real-time computing power, thereby spawning new business models.
Link to the original publication at arXiv.
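Whether a model can run on-device largely comes down to memory arithmetic: weight footprint is parameter count times bytes per parameter. A back-of-envelope sketch using OpenELM’s published sizes (270M, 450M, 1.1B, 3B parameters); the precision choices are illustrative:

```python
def weight_footprint_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate RAM needed just to hold the model weights."""
    return n_params * bytes_per_param / 1e9

# fp16 = 2 bytes/param; int4 quantization = 0.5 bytes/param
for n in (270e6, 450e6, 1.1e9, 3e9):
    print(f"{n / 1e6:.0f}M params: "
          f"fp16 = {weight_footprint_gb(n, 2):.2f} GB, "
          f"int4 = {weight_footprint_gb(n, 0.5):.2f} GB")
```

Even the 3B model drops to about 1.5 GB at int4, which is why quantization plus the newer smartphone chipsets makes fully local inference plausible.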
Diving a bit deeper: are current vector-based RAG architectures ideal, or is there something more flexible and lower-cost? Aniket Hingane makes the case for a graph-based approach. His summary:
The graph model’s strengths really shine for diverse enterprise datasets containing multimodal data. You can simply add new data entities as nodes and define their relationships, without reprocessing everything. This flexibility and efficiency makes it highly scalable.
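The incremental-update point can be made concrete with a minimal graph store: new entities become nodes, relationships become labeled edges, and nothing already ingested gets reprocessed. A sketch (the class and example facts are my own illustration, not Hingane’s implementation):

```python
from collections import defaultdict

class KnowledgeGraph:
    """Minimal graph store: entities are nodes, relationships are
    labeled directed edges. Adding a fact only touches the new
    nodes/edges -- no re-embedding or re-indexing of existing data."""
    def __init__(self):
        self.nodes = {}                 # entity name -> attribute dict
        self.edges = defaultdict(list)  # entity name -> [(relation, target)]

    def add_entity(self, name, **attrs):
        self.nodes.setdefault(name, {}).update(attrs)

    def relate(self, src, relation, dst):
        self.add_entity(src)
        self.add_entity(dst)
        self.edges[src].append((relation, dst))

    def neighbors(self, name):
        """Retrieve directly related facts -- the graph analogue of a
        vector store's nearest-neighbor lookup."""
        return self.edges.get(name, [])

kg = KnowledgeGraph()
kg.relate("OpenELM", "developed_by", "Apple")
kg.relate("OpenELM", "runs_on", "smartphone")
# a new fact added later is purely additive
kg.relate("Gemini 1.5", "developed_by", "Google")
print(kg.neighbors("OpenELM"))
```

Contrast this with a vector index, where adding multimodal data often means re-chunking and re-embedding, and where relationships between entities are only implicit in embedding distance.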