LLMs are not all you need

Adhya Dagar
Published in 99P Labs
4 min read · Dec 19, 2023

Written by CSL Fellow Adhya Dagar
Edited by 99P Labs

The CSL Fellows program is part of the Corporate Startup Lab at Carnegie Mellon University’s Swartz Center for Entrepreneurship

Introduction: The Hypothesis and the Challenge

The 99P Labs team presented us with a challenging hypothesis to test this spring as a part of the Corporate Startup Lab Fellowship: ‘How may a Foundational/Large Language Model like ChatGPT be integrated into a sensor system network to provide an innovative interior navigation/wayfinding application for people?’

Airport Indoor Navigation (image produced by DALL·E)

This question is particularly relevant in today’s context, where there’s a significant emphasis on becoming ‘AI-powered’, with tools like ChatGPT at the forefront.

However, in addressing this hypothesis, especially in the context of indoor navigation — the task of navigating within large indoor spaces — it was crucial not to be swayed by the prevailing AI hype. Instead, we needed to take a step back and thoughtfully consider whether an LLM is truly capable of addressing the specific challenges of indoor navigation. These reflections led us to a critical evaluation, aiming to explore beyond the widespread acclaim of LLMs and understand their actual utility in practical applications.

LLMs in Indoor Navigation: Capabilities and Challenges

LLMs excel in tasks such as text generation, summarization, and creating engaging conversational agents. This linguistic prowess makes them invaluable in applications where human-like text interaction is essential.

However, the domain of wayfinding, which involves navigating through spaces like airports or shopping malls, presents unique challenges that stretch beyond the capabilities of LLMs.

Wayfinding requires a comprehensive understanding of spatial layouts, the ability to process and respond to real-time environmental changes, and interactive communication that adapts to the user’s context. These requirements are critical in ensuring accurate and efficient navigation in dynamically changing environments.

LLMs currently face significant limitations in wayfinding applications. They lack the ability to perceive and understand physical spaces, making spatial navigation challenging. Additionally, they cannot inherently process real-time data, which is crucial for dynamic navigation scenarios. Constantly updating context for real-world navigation is computationally costly. These limitations highlight the need for integrating LLMs with other technologies to effectively address the complexities of real-world navigation.

For instance, in a shopping mall, an LLM-based system might provide textual directions to a store, but it cannot update these directions in real-time if the store has been relocated or if certain pathways are temporarily closed. Similarly, in an airport, LLMs can offer basic guidance but might fail to account for last-minute gate changes or delays.

Tasks such as route planning, obstacle avoidance, or interpreting physical signs typically require technologies like Computer Vision (CV) and Geographic Information Systems (GIS) that can process sensory inputs and spatial data.
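To make the mall example concrete, here is a minimal sketch of the kind of spatial component that would sit next to an LLM. Everything in it is an assumption for illustration: the floor plan, the place names, and the helper functions `shortest_route` and `describe` are hypothetical and are not part of the project described here. The route is computed from a graph of walkable corridors with a plain breadth-first search, and closures are passed in as live data, so the language model would only need to phrase a route that has already been computed.

```python
from collections import deque

# Hypothetical floor plan: keys are places, values are directly reachable places.
FLOOR_PLAN = {
    "entrance": ["atrium"],
    "atrium": ["entrance", "food_court", "east_hall"],
    "food_court": ["atrium", "store_42"],
    "east_hall": ["atrium", "store_42"],
    "store_42": ["food_court", "east_hall"],
}


def shortest_route(graph, start, goal, closed=()):
    """Breadth-first search for the route with the fewest hops.

    `closed` is an iterable of (place_a, place_b) corridors that are
    temporarily unavailable; they are skipped in both directions.
    """
    blocked = {frozenset(edge) for edge in closed}
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor in visited or frozenset((node, neighbor)) in blocked:
                continue
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None  # no open route exists


def describe(path):
    """Turn a route into plain text; an LLM could rephrase this step."""
    if path is None:
        return "No open route is available right now."
    return "Go " + " -> ".join(path)


if __name__ == "__main__":
    # Normal conditions.
    print(describe(shortest_route(FLOOR_PLAN, "entrance", "store_42")))
    # The corridor between the food court and the store is closed.
    closures = [("food_court", "store_42")]
    print(describe(shortest_route(FLOOR_PLAN, "entrance", "store_42", closures)))
```

The design choice in this sketch is the split of responsibilities: the graph and the closure list hold the spatial and real-time state, while text generation is the last and smallest step.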

However, large pre-trained models that combine language and vision might come to the rescue here. The recent paper “LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action” explores an innovative approach to robotic navigation that uses Large Language Models (LLMs). It demonstrates how LLMs, when combined with vision-language models and navigation models, can interpret natural language instructions and navigate complex environments. This approach allows robots to follow instructions in real-world settings without the need for language-annotated robot data.

The research indicates potential future advancements in navigation technology, where LLMs could play a significant role in interpreting and executing complex navigational tasks in dynamic environments.

Identifying the Right Problems for LLMs and AI

In our exploration at 99P Labs, inspired by Google PM Jaclyn Konzelmann’s insights on validating “technology-first” product ideas, we set out to identify the most suitable application for LLMs.

We examined whether our idea already existed in the market, how unique it was, and whether we could build it better. This led us to focus on the conference industry, an area with a significant need for an intuitive navigation solution that had not yet been addressed.

We also considered factors such as how often the navigation problem recurs, monetization potential, magnitude of impact, availability of paying customers, reach, technical feasibility, financial viability, user experience, and ease of adoption.

This comprehensive approach helped us shortlist convention centers as the ideal problem space, leading to a data-driven solution that promised substantial commercial viability and user benefit.

For more details on our iterative solution journey, check out the article on ‘Solution Evolution: Navigating the Maze of Innovation’.

My Journey and Learnings

Working on this project with 99P Labs, I learned how to align technical solutions with creating value for users. In developing our final recommendation — a conference guidance conversational agent — we introduced critical perspectives on enhancing attendee engagement beyond just efficient navigation.

Our sponsors — Duane, Erin, Rajeev, Ryan, and Matt — guided us by stepping into attendees’ shoes, challenging assumptions, and sharing their personal insights. Their diverse viewpoints pushed us to create more impactful solutions focused not only on practical problem-solving but also on elevating the human experience.

Key learnings included:

  • Decide what AI can and cannot do, and then continue fulfilling the unmet needs
  • Identify the highest ROI opportunities
  • Realize that initial traction and stakeholder buy-in for AI projects are contingent on top-line revenue improvement or bottom-line savings

This experience provided invaluable perspectives on aligning capabilities with user-centric values — paving the way for AI solutions that are both technically sound and socially conscious.

Learn more about this project from Alex’s blog!

We hope you enjoyed wayfinding with the CSL Fellows. Follow 99P Labs here on Medium and on LinkedIn for more research projects and collaborations!

Adhya Dagar
99P Labs

Computer Science Engineer | AI for Social Good | Social Entrepreneurship | linkedin.com/in/adhya-dagar/