The Generative AI Ecosystem Part II

A tour of Generative AI tools and technologies for developers

Unigram Labs
8 min read · Jul 13, 2023

This blog overviews and highlights various Generative AI tools and technologies for application developers. Unlike the first blog, which focused on end-user applications, this one is oriented towards the companies and technologies relevant for developing and deploying Gen AI applications in an enterprise setting.

Source: unigramlabs.com

To develop this Gen AI blog series, we collaborated with our partners at Soko Solutions and Lio Chamorro to analyze ~435 companies, tools, and open-source projects. For this specific blog, we focused on application development. However, even within that segment, there exists a diverse range of companies and technologies that do not neatly fit into a single category. For clarity and simplicity, we have grouped them into the following four areas:

  • Application Development
  • Chat and Document Search
  • Coding
  • Vector Databases

If you want more detailed information about each company, you can request the Google Sheet that lists all companies and more specifics.

General Technical Introduction

Andreessen Horowitz (a16z) does a nice job describing the emerging LLM application stack in this article. They highlight a data layer with data pipelines, embedding models, and vector databases; an application layer that orchestrates the flow of the application across LLMs (such as LangChain); and several pieces around hosting and MLOps. This blog will focus on the data and application stack, whereas the next blog will cover some of the MLOps, hosting, and other ML-related pieces.

Source: a16z

Application Development

One of the most exciting aspects of the rise of Generative AI has been the rapid pace of innovation in application development. The large LLMs have proven so capable at such a wide variety of language tasks that it has been natural to quickly extend those capabilities to our data and our applications. We soon saw examples of people using GPT-based technologies to summarize a PDF, chat with a PDF, summarize YouTube videos, and more.

One of the most exciting frameworks that has enabled this fast-paced growth is LangChain. LangChain is an open-source project with a number of libraries and connectors that enables you to build LLM-based applications on top of LLM providers (like OpenAI and Google) and connect those LLMs with other applications and your data. In general, it provides a structure for interacting with data sources (like PDFs and YouTube videos) and for holding interim steps or information while interacting with LLMs like ChatGPT. There is also a related open-source project called Langflow, which provides a GUI editor for visually designing LangChain applications.
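LangChain's APIs evolve quickly, so rather than pin a specific version, here is a minimal plain-Python sketch of the orchestration pattern such frameworks formalize: fill a prompt template with user input and context, call an LLM, and parse the response for the next step. The `fake_llm` function is a hypothetical stand-in for a real provider call (e.g. OpenAI), not LangChain's actual API.

```python
# Toy sketch of the "chain" pattern that frameworks like LangChain formalize:
# template -> prompt -> LLM call -> output parsing.

PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n"
    "Context: {context}\n"
    "Question: {question}\n"
)

def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in: a real chain would call a hosted model here.
    return "ANSWER: 42"

def qa_chain(context: str, question: str) -> str:
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    raw = fake_llm(prompt)
    # Output-parsing step: strip the model's answer prefix.
    return raw.removeprefix("ANSWER:").strip()

print(qa_chain("The magic number is 42.", "What is the magic number?"))
```

In a real LangChain application, each of these steps (prompt template, model wrapper, output parser) is a composable component the framework supplies and wires together for you.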

The LangChain project is about 9 months old and already has 100+ contributors and 50k stars. It is the subject of numerous side projects and video tutorials, and only recently did OpenAI develop a capability for some similar applications with function calling in its API. While LangChain has an open-source component, in a sign of the fast-paced times it received $10M in seed funding from Benchmark and then $20M in funding from Sequoia a week later to commercialize its offerings!

Another open-source framework is LlamaIndex, a data framework that provides data connectors and the ability to structure, query, and retrieve data to work in conjunction with LLMs.

Beyond these application development frameworks, Hugging Face is a well-funded and notable player in the development of LLM-based technologies. Sometimes referred to as the GitHub of AI, Hugging Face offers a number of open-source models, tools, and frameworks in a straightforward, accessible set of packages that can be used and stitched together to build leading LLM-based applications.

Aviary, also open source and offered by Anyscale, is a different type of developer tool that offers the ability to evaluate, compare, and self-host multiple open-source LLM-based models.

Ability to Compare LLM Performance Side by Side. Source: AnyScale-Aviary

Chat and Document Search

While chat and conversational AI have been growing in popularity, ChatGPT demonstrated the truly transformational capability to “chat” with information: to ask broad questions, dig deeper, and have back-and-forth Q&A. Companies like Moveworks aim to provide an enterprise conversational interface for “all your employee needs” (HR, IT, support, enterprise search, etc.). Although not a new company (founded in 2016), they have raised $300M+ for this important use case.

Neural search differs from typical keyword-based search engines in that it uses language-model (neural network) matching to find semantic similarities between words, phrases, or questions. We’ll see more about the trend of word embeddings and neural search in the vector database section, but for now we can note that large foundational language models can provide word embeddings in a way that powers or enables neural search. Cohere, for example, provides an add-on API to enhance traditional search with neural search directly through its Rerank product.
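Products like Cohere's Rerank typically slot into a two-stage pattern: a fast keyword search retrieves candidate documents, then a neural model rescores them by semantic relevance. Here is a toy plain-Python illustration of that pattern; the `semantic_score` function is a hypothetical stand-in (real systems use a cross-encoder model or a hosted rerank API, not word overlap).

```python
def keyword_retrieve(query, docs):
    """Stage 1: cheap lexical filter -- keep docs sharing any query word."""
    terms = set(query.lower().split())
    return [d for d in docs if terms & set(d.lower().split())]

def semantic_score(query, doc):
    # Hypothetical stand-in for a neural reranker; here we just
    # reward word overlap (Jaccard similarity) for illustration.
    terms = set(query.lower().split())
    words = set(doc.lower().split())
    return len(terms & words) / len(terms | words)

def rerank(query, docs):
    """Stage 2: rescore the candidates with the (stand-in) neural model."""
    candidates = keyword_retrieve(query, docs)
    return sorted(candidates, key=lambda d: semantic_score(query, d), reverse=True)

docs = [
    "how to reset your password",
    "password reset steps for the HR portal",
    "quarterly sales report",
]
print(rerank("reset password", docs))
```

The appeal of the two-stage design is cost: the expensive neural model only scores the handful of candidates the cheap first stage surfaces, rather than the whole corpus.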

Companies specializing in search include Jina.ai and Vectara (“a developer-first API platform for building conversational search”). Additionally, Rasa, an open-source leader in conversational AI, continues to innovate alongside the GPT growth trend; RasaGPT, for example, is a headless (no user interface) chatbot built with Rasa and LangChain. The space is fast-moving: a recent upstart search player, Neeva, was recently acquired by Snowflake.

Coding

One super exciting sub-section of GPT and language model technologies is focused on writing and assisting in writing software. As we know from ChatGPT and elsewhere, GPT/LLM technologies have been trained on vast amounts of text and website content to learn about language and enable the models to generate similar text. In the case of software, or “code,” the concept is the same except that the models have been trained on large amounts of code from GitHub and other software repositories. The resulting tools are code generators that can produce simple template code or even functional applications from a prompt, or can look at the code you are writing and provide summary comments, tips, or suggestions.

Some of the more powerful such technologies are OpenAI’s Codex and GitHub Copilot (which is based on the same OpenAI technology), Amazon’s CodeWhisperer, Google’s Vertex AI Codey, and Replit’s Ghostwriter. In all cases, these technologies can be prompted to provide a code snippet that helps start a coding task, can be fed error messages to determine what is wrong, and can be given a section of code to explain or summarize. The use cases are similar to general text applications, but of course this text is understandable by machines as well and can be used to create whole applications. OpenAI has recently announced its new release, ChatGPT Code Interpreter; you can see examples of the alpha release here.

The expectations are very high, and for sure there’s a bit of hype (e.g., “ChatGPT will replace programmers in 10 years”). But according to recent news, programmers are indeed adopting these technologies (92% of developers use AI coding tools), and we have started to see, and will continue to see, some impact on the productivity, speed, and composition of software teams. This is the same trend, perhaps accelerated, that we have seen with the general low-code/no-code movement.

Some nice video examples can be seen here from OpenAI and here, where a user creates a customized Squarespace website.

GitHub Copilot. Source: Forbes

Vector Databases

As mentioned earlier, embeddings are a list of numbers (known as vectors) that encode and represent the language information about a word or a series of words. The process of training the language model produces these vectors based on all the language that the model learns from the various text inputs.

The vector (or embedding) amazingly gives a mathematical, computer-based way to compare words or sentences for their semantic similarity, and this concept of “embedding comparison” is the one used in neural search. The basic idea is that you “embed” the input or search string (computing a vector that represents the language of the query) and then compare that vector against a set of embeddings that already exist: the words, phrases, or passages in the target corpus you are searching. A great explanation of embeddings by Jay Alammar is here, and a short video course by Google is here.
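To make the comparison concrete, here is a small pure-Python sketch of cosine similarity, the standard measure used to compare embedding vectors. The three-dimensional vectors are made up for illustration; real embeddings typically have hundreds or thousands of dimensions and come from a model, not by hand.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Tiny made-up "embeddings" for illustration only.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.15, 0.05]
car = [0.1, 0.0, 0.95]

print(cosine_similarity(cat, kitten))  # close to 1.0 (similar meaning)
print(cosine_similarity(cat, car))     # much lower (unrelated)
```

Neural search is essentially this computation run between the embedded query and every candidate passage, returning the passages whose vectors point in the most similar direction.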

Source: The Illustrated Word2vec

Since embedding search has become important and powerful for neural search, the need to store, retrieve, and compute vector comparisons on embeddings has grown too. This set up the growing need for the vector database: a database that specializes in storing and retrieving vector embeddings. While this may sound quite niche, we were surprised to learn that there are at least a dozen or so vector databases, with hundreds of millions of dollars of investment to date. Some prominent ones include Pinecone, which has raised $130M+ to date; Weaviate, which has raised nearly $70M; and Chroma, which is open source but has also raised an $18M seed round for its hosted product.
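At their core, vector databases answer nearest-neighbor queries: given a query embedding, return the stored vectors most similar to it. A brute-force sketch in plain Python, with made-up vectors and hypothetical document IDs, shows the essential interface (production systems like Pinecone or Weaviate replace the linear scan with approximate indexes such as HNSW so queries stay fast at millions of vectors).

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

class ToyVectorStore:
    """Brute-force stand-in for a vector database."""

    def __init__(self):
        self.items = []  # (id, vector) pairs

    def add(self, item_id, vector):
        self.items.append((item_id, vector))

    def search(self, query, k=2):
        # Real vector DBs use approximate indexes (e.g. HNSW)
        # instead of scanning every stored vector like this.
        scored = [(cosine(query, vec), item_id) for item_id, vec in self.items]
        scored.sort(reverse=True)
        return [item_id for _, item_id in scored[:k]]

store = ToyVectorStore()
store.add("doc-cats", [0.9, 0.1, 0.0])
store.add("doc-dogs", [0.8, 0.2, 0.1])
store.add("doc-stocks", [0.0, 0.1, 0.9])
print(store.search([0.85, 0.15, 0.05], k=2))  # → ['doc-cats', 'doc-dogs']
```

The `add`/`search` pair mirrors the upsert-and-query interface that hosted vector databases expose, just without the indexing, persistence, and metadata filtering that justify a dedicated product.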

Conclusion

Just as end users have seen the impact of LLM and Gen AI technologies, so have software developers and enterprises. Accordingly, a number of technologies have emerged to make software developers more productive in coding and to provide a series of tools for building and deploying LLM-based applications in the enterprise.

The next and final blog in this generative AI series will focus on machine learning developers and MLOps — the training and hosting of custom Gen AI models.

If you are interested in learning more about the market landscape or want access to the full Google Sheet of 435 companies and technologies, please reach out to us here.

References

  • Reference architecture for the emerging LLM app stack (link)
  • Comparison of OpenAI function calling to other languages (link)
  • LangChain’s $10M seed round led by Benchmark (link)
  • The latest funding round scored LangChain a valuation of at least $200 million (link)
  • Moveworks raised $200 million in a series C funding (link)
  • Guide to embeddings in OpenAI’s GPT models (link)
  • Snowflake acquires Neeva to accelerate search in the Data Cloud (link)
  • Blog that discusses the future of programmers and AI (link)
  • Github survey about developers using AI coding tools (link)
  • ChatGPT Code Interpreter guide and examples (link)
  • Guide to embeddings in OpenAI’s GPT models and other deep learning concepts (link)
  • Pinecone Hits $750M Valuation (link)
  • An illustrated guide to Word2Vec, a neural network model for generating word embeddings (link)
  • Video lecture from Google’s Machine Learning Crash Course on embeddings (link)
  • Video Example Coding Application with GPT (link)


Unigram Labs

We are a small collection of like-minded individuals looking for breakthroughs in technology that will contribute to the advancement of society. AI, Education+