The Dawn of Agentic AI: Transforming Enterprise Automation and the Future of Work
At Hitachi Ventures, we’ve been exploring agentic AI infrastructure and applications, even experimenting with building our own task management and summarization agents. When we began last year, the market was largely dominated by ‘ChatGPT’ wrappers and open-source frameworks. However, we’ve seen more founders and operators assembling components to build fully automated workflows capable of handling unstructured data and dynamic contexts. We invite you to engage us in a dialog as we continue to expand our understanding.
Sam Altman’s prediction that AI can enable a one-person unicorn signals the tectonic shift in enterprise automation that we will witness in the years ahead. The groundwork has been laid over decades, starting with RPA and continuing with improvements across the stack, from infrastructure and AI/ML to foundation models, setting the stage for AI-powered agents.
So, what are AI agents? An AI agent is an intelligent system that can autonomously perform a task or make a decision by understanding context, interpreting user intent, and adapting to new tasks, much as a human would. Fine-tuned on domain-specific data, agents can further grasp industry-specific terminology and workflows. Agents have the potential to drive high-value end-to-end enterprise automation, operational efficiency, talent development, application development, and more. Tomorrow’s leading AI applications will be agentic at their core, requiring limited human involvement. Welcome to the age where companies sell results, not just tools!
The Rise of Agents
Throughout human history, technological advancements have aimed at overcoming the limits of human labor. The Industrial Revolution amplified human strength through steam power, the Second Industrial Revolution scaled labor with mass production, and the Digital Revolution enhanced labor productivity. Today, we are in the midst of a new revolution, one that seeks to address the global shortage of skilled labor and automate human work. As AI’s ability to perceive and navigate dynamic context improves, so do the degree of automation and the performance of high-value work. What began as rudimentary bots has evolved into sophisticated agents capable of managing complex workflows.
AI agents are emerging rapidly across both infrastructure and applications to meet the specific needs of enterprises and consumers. GitHub Copilot demonstrated early traction in coding and software development, and as coding agents proved their utility, attention shifted to more complex infrastructure such as web browser and OS-level agents. Following these developments, more application-specific agents are beginning to emerge, designed to work with enterprise applications like Google Drive, Microsoft Office, and custom systems to automate a variety of complex workflows in functions like customer support.
Broadly, we see two areas of activity (see Figure 1 for market map):
- Infrastructure: The surge in development of agent-based technologies, spanning areas like data, memory, tooling, evaluation, and models, is promising. However, as foundation models evolve to natively manage more complex tasks, we believe the demand for standalone agent infrastructure may be limited to a few players.
- Application: We anticipate a rise in investments focused on application-specific agents, as already seen in H1 2024 (Figure 2), particularly those targeting workflows with high levels of repeatable tasks and significant human labor. Early examples of this can be seen in coding and customer support, with expansion likely into areas like Sales, GTM, Finance, Legal, and Healthcare.
Core Components of Agents
As the AI agent landscape evolves, constructing these systems demands a deep understanding of both high-level goals and the underlying technologies. Today, agents require a high degree of vertical integration, incorporating components such as self-managed cloud hosting, prompt and tool instructions, databases, and connectors for seamless ingestion of both external context and internal data across various systems. These agents are typically hyper-specific, tailored to execute a defined set of tasks with precision. They are commonly built with frameworks such as LangChain, Chainlit, or AutoGen, which support database operations, error handling, context management, and tool integration. Additionally, for tasks requiring custom model development or fine-tuning, libraries like TensorFlow, PyTorch, and scikit-learn are often employed. Figure 3 offers a detailed view of the technological components used to build agents.
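To make these components concrete, here is a minimal sketch in plain Python of how tool integration, context management, and error handling might fit together in an agent loop. It is illustrative only and does not use the API of LangChain, Chainlit, or AutoGen. The names `Agent`, `scripted_planner`, and `lookup_order` are hypothetical, and the planner is a scripted stand-in for a model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    """Toy agent: a planner picks a tool, the agent runs it and records the result."""

    # Stand-in for a model call: (task, memory) -> (tool name, tool input).
    planner: Callable[[str, list[str]], tuple[str, str]]
    # Tool integration: named callables the planner is allowed to invoke.
    tools: dict[str, Callable[[str], str]]
    # Context management: observations carried from one step to the next.
    memory: list[str] = field(default_factory=list)
    max_steps: int = 5

    def run(self, task: str) -> str:
        for _ in range(self.max_steps):
            tool_name, tool_input = self.planner(task, self.memory)
            if tool_name == "finish":
                return tool_input
            tool = self.tools.get(tool_name)
            if tool is None:
                self.memory.append(f"error: unknown tool {tool_name!r}")
                continue
            # Error handling: a failing tool becomes an observation, not a crash.
            try:
                observation = tool(tool_input)
            except Exception as exc:
                observation = f"error: {exc}"
            self.memory.append(f"{tool_name}({tool_input!r}) -> {observation}")
        return "stopped: step limit reached"


def scripted_planner(task: str, memory: list[str]) -> tuple[str, str]:
    """Hypothetical planner: look the order up once, then answer."""
    if not memory:
        return "lookup_order", "A-1001"
    return "finish", f"Answer based on: {memory[-1]}"


def lookup_order(order_id: str) -> str:
    """Hypothetical connector to an internal order system."""
    orders = {"A-1001": "shipped"}
    return orders[order_id]


agent = Agent(planner=scripted_planner, tools={"lookup_order": lookup_order})
result = agent.run("Where is order A-1001?")
print(result)
```

In a real system the planner would be a model call and the tools would be connectors to enterprise systems. The step limit and the practice of feeding errors back as observations are design choices made for this sketch, not requirements of any particular framework.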
Over time, there is the potential to combine multiple models within a single agent, a milestone that could be realized as model capabilities strengthen and compute requirements decrease. This direction is further supported by offerings from industry leaders, including Google’s Project Astra, Microsoft’s Copilot, Amazon’s Bedrock, and OpenAI’s Function Calling, and by the anticipated launch of AI agents for phones or computers, paving the way for AI agents to execute tasks across devices with greater autonomy and reliability. Figure 4 below outlines the high-level components required to build agents, along with illustrative use cases.
However, the landscape is not without its challenges. Despite the advancements, there are still several limitations that need to be addressed:
- Models: Current models struggle with planning and executing tasks without heavy reliance on prompting techniques like ReAct or few-shot prompting combined with Reflection. Both open-source models (e.g., Llama 3) and closed-source models (e.g., GPT-4, Claude 3) are making strides, but breakthroughs in task-specific and multi-modal models are essential for further scalability.
- Memory: Advancements in reasoning capabilities and extended context windows, such as the 128K context window in Llama models, could reduce the reliance on vector databases for AI memory. This would significantly enhance agent functionality and reduce issues like hallucinations.
- Latency and cost: A multi-agent setup built on LLMs can trigger an explosion of calls to the underlying models, even to answer a single, simple question. The resulting token generation and coordination between agents can add significant cost and latency, at the expense of the user experience.
- Access to business data: While foundation models have shown promise with publicly available data, true enterprise value creation will require access to private data often stored in custom systems. This remains a significant hurdle for companies looking to deploy AI agents at scale.
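The latency and cost point can be illustrated with a back-of-the-envelope estimate. The sketch below assumes every agent makes one model call per coordination round and that calls run sequentially. The token counts, per-call latency, and prices are hypothetical placeholders, not figures for any real model or vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallProfile:
    """Assumed shape of one model call (all values are placeholders)."""
    tokens_in: int
    tokens_out: int
    latency_s: float


@dataclass(frozen=True)
class Estimate:
    calls: int
    cost_usd: float
    latency_s: float


def estimate(num_agents: int, rounds: int, profile: CallProfile,
             price_in_per_1k: float, price_out_per_1k: float) -> Estimate:
    """Estimate calls, cost, and latency for one user question."""
    calls = num_agents * rounds
    cost_per_call = (
        profile.tokens_in / 1000 * price_in_per_1k
        + profile.tokens_out / 1000 * price_out_per_1k
    )
    # Sequential execution is assumed, so latency adds up across calls.
    return Estimate(calls, calls * cost_per_call, calls * profile.latency_s)


# Hypothetical inputs: 2,000 prompt tokens, 500 completion tokens, 2 s per call.
PROFILE = CallProfile(tokens_in=2000, tokens_out=500, latency_s=2.0)
PRICES = {"price_in_per_1k": 0.01, "price_out_per_1k": 0.03}

single = estimate(1, 1, PROFILE, **PRICES)   # one agent, one call
multi = estimate(4, 3, PROFILE, **PRICES)    # four agents, three rounds

for label, est in (("single agent", single), ("multi-agent", multi)):
    print(f"{label}: {est.calls} calls, ${est.cost_usd:.3f}, {est.latency_s:.0f}s")
```

Under these assumptions, four agents coordinating over three rounds make twelve calls, so both cost and latency grow twelvefold for the same question. Running calls in parallel would reduce the latency but not the token cost.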
Investment Hypothesis
Our investment strategy focuses on companies developing platforms that automate high-volume, repeatable tasks traditionally handled by human labor. This shift is especially relevant in service-heavy industries, where automation can improve margins by taking over tasks usually performed by humans, a trend already supported by the rise of outcome-based pricing in the services industry.
We anticipate the following trends:
- RPA and AI Agents will co-exist, with each technology serving different automation needs. RPA (Robotic Process Automation) will be used for highly repetitive and simpler tasks, and AI agents for more complex workflows, such as integrating and linking databases, spreadsheets, and SaaS tools, as well as handling tasks that require a high degree of subjectivity, like RFPs or customer support. Established incumbents such as UiPath, Workato, ServiceNow, and Zapier are integrating agentic capabilities into their platforms, although it remains uncertain how well these will scale given their architectural design. Still, their potential as distribution partners should not be underestimated. Simultaneously, foundation model providers and hyperscalers are expanding their offerings to include agent-based functionalities.
- New business models will emerge as we increasingly outsource human-level tasks to digital co-workers, shifting the valuation and pricing of digital labor. The agentic business model will begin to command rates comparable to salaries rather than traditional software licensing fees. This shift is exemplified by Hippocratic AI’s agentic nurses, which cost $9 per hour, about one fifth of the median $43 hourly wage for human nurses.
- Vertical Application Agents, tailored for specific domains, will drive significant advancements in industries like healthcare, finance, insurance, and manufacturing. By leveraging domain-specific data, these agents will deliver highly accurate and efficient solutions, surpassing general-purpose models, boosting productivity and innovation. For example, imagine an automated invoice processing system where a data extraction agent pulls relevant information, an analysis agent verifies its accuracy, and a processing agent manages approvals and payments. Together, these specialized agents function as a ‘universal employee,’ seamlessly handling end-to-end tasks with precision and speed, transforming how businesses operate.
- Multi-Modal Agents will revolutionize human-computer interactions by seamlessly integrating and interpreting data from diverse sensory inputs, facilitating more natural and intuitive interactions and broadening applicability across industries. Recent developments such as the CRAB framework offer a step toward this reality.
- Alternative Architectures will emerge, moving us beyond transformer-based models. These new architectures, such as state-space models, will be designed to support real-time learning and dynamic adaptability without needing large-scale retraining. Such architectures will be crucial for advancing AI in fields like robotics, autonomous systems, and complex engineering tasks where real-time decision-making and adaptability are key.
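The invoice processing example from the vertical application agents trend can be sketched as a three-stage pipeline. This is a simplified illustration, not a production design: each "agent" is a plain function standing in for a model-backed component, and the invoice text format, field names, and approval limit are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Invoice:
    vendor: str
    subtotal: float
    tax: float
    total: float


def extraction_agent(text: str) -> Invoice:
    """Pull fields out of raw invoice text (regexes stand in for a model)."""
    def amount(label: str) -> float:
        match = re.search(rf"^{label}:\s*\$?([\d.]+)", text, re.MULTILINE)
        if match is None:
            raise ValueError(f"missing field: {label}")
        return float(match.group(1))

    vendor = re.search(r"^Vendor:\s*(.+)$", text, re.MULTILINE)
    if vendor is None:
        raise ValueError("missing field: Vendor")
    return Invoice(vendor.group(1).strip(), amount("Subtotal"),
                   amount("Tax"), amount("Total"))


def analysis_agent(invoice: Invoice) -> list[str]:
    """Verify internal consistency and return a list of issues found."""
    issues = []
    if abs(invoice.subtotal + invoice.tax - invoice.total) > 0.005:
        issues.append("total does not equal subtotal plus tax")
    if invoice.total <= 0:
        issues.append("total must be positive")
    return issues


def processing_agent(invoice: Invoice, issues: list[str],
                     approval_limit: float) -> str:
    """Approve clean invoices under the limit; escalate everything else."""
    if issues or invoice.total > approval_limit:
        return "needs human review"
    return "approved for payment"


def run_pipeline(text: str, approval_limit: float = 1000.0) -> str:
    invoice = extraction_agent(text)
    issues = analysis_agent(invoice)
    return processing_agent(invoice, issues, approval_limit)


# Hypothetical invoice text in an assumed "Label: value" format.
SAMPLE = "Vendor: Acme Supplies\nSubtotal: 100.00\nTax: 8.00\nTotal: 108.00"
print(run_pipeline(SAMPLE))
```

Keeping each stage behind a narrow interface (text in, `Invoice` out, issues out, decision out) is one way to let individual agents be swapped or audited independently, and the escalation path keeps a human in the loop for anything the pipeline cannot verify.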
Agents are not just tools; they are autonomous workers capable of augmenting and eventually replacing certain human roles. This unprecedented shift opens vast new markets, redefining the future of work and technology’s role in the global economy.
We’d love to hear about your challenges, use cases, and perspectives on this evolving trend. There is also an opportunity to be featured in our blog series, which will include interviews with startup CEOs and industry experts.