LangChain has a scalability problem right now and needs a proper multi-process and multi-threaded approach

Anthony Alcaraz
3 min read · Jul 6, 2023

I. Scalability issues

LangChain is a useful framework for building applications with Large Language Models (LLMs), but it is not sufficient for building scalable apps out of the box: several of its components are synchronous by design.

This means they can struggle to serve many simultaneous users while keeping latency low, because each blocking call ties up a worker until it completes.

To address this issue, LangChain has introduced initial asynchronous support by leveraging the asyncio library. Asyncio uses coroutines and an event loop to perform non-blocking I/O operations, allowing multiple tasks to run concurrently. This can help improve the scalability of LangChain applications, especially when integrated with async frameworks like FastAPI.
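A minimal sketch of the asyncio pattern described above, using `asyncio.sleep` to simulate a non-blocking I/O call (the function names and delays are illustrative, not part of LangChain's API). Three simulated requests run concurrently on one event loop, so the batch finishes in roughly the time of a single request:

```python
import asyncio
import time

async def fetch(name: str, delay: float) -> str:
    # Simulated non-blocking I/O call; stands in for an async LLM request.
    await asyncio.sleep(delay)
    return f"{name}: done"

async def main() -> list[str]:
    # Three 0.1 s "requests" run concurrently under asyncio.gather,
    # so the whole batch takes ~0.1 s instead of ~0.3 s sequentially.
    return await asyncio.gather(
        fetch("query-1", 0.1),
        fetch("query-2", 0.1),
        fetch("query-3", 0.1),
    )

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
```

This is the same mechanism an async web framework like FastAPI relies on: while one request awaits the LLM, the event loop is free to serve others.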

Not all components support async yet, however.

Custom tools, however, do support async:

from typing import Optional

from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)
from langchain.tools import BaseTool


class CustomSearchTool(BaseTool):
    name = "custom_search"
    description = "useful for when you need to answer questions about current events"

    def _run(
        self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None
    ) -> str:
        """Use the tool."""
        return search.run(query)  # `search` is a pre-built search wrapper

    async def _arun(
        self, query: str, run_manager: Optional[AsyncCallbackManagerForToolRun] = None
    ) -> str:
        """Use the tool asynchronously."""
        raise NotImplementedError("custom_search does not support async")
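When a component only has a synchronous implementation, the standard asyncio workaround is to offload the blocking call to a thread pool with `run_in_executor`, so the event loop stays responsive. The sketch below assumes a hypothetical `blocking_search` function standing in for a synchronous `_run`; it is an illustration of the pattern, not LangChain's internal fallback:

```python
import asyncio
import time

def blocking_search(query: str) -> str:
    # Stand-in for a synchronous tool call, e.g. a _run implementation
    # that wraps a blocking HTTP client.
    time.sleep(0.1)
    return f"results for {query}"

async def arun(query: str) -> str:
    # Offload the blocking call to the default thread pool so the
    # event loop remains free to serve other requests meanwhile.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_search, query)

async def main() -> list[str]:
    # The three blocking calls now overlap in worker threads.
    return await asyncio.gather(arun("a"), arun("b"), arun("c"))

results = asyncio.run(main())
```

This keeps sync-only components usable inside an async app, at the cost of holding a thread per in-flight call, which is why native async support in the components themselves scales better.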
