Best Generative AI Frameworks for Developers and Data Scientists

LangChain, LangGraph, DSPy, PandasAI, and others

Mehul Gupta
Data Science in your pocket

--

Since ChatGPT’s arrival, the AI arena has shifted completely. There has been a flurry of frameworks and software catering to different requirements. In this post, I’ve curated a list of some of the best frameworks for the different use cases that come up while building Generative AI applications. We will briefly discuss:

LangChain: a framework for building almost any Generative AI application

Langfuse: tracing and debugging GenAI applications

DSPy: an automatic prompt-tuning framework

PandasAI: perform operations on pandas DataFrames using an LLM

Unsloth: fine-tune LLMs faster on any custom data

LangGraph: create a team of LLM agents to complete a task

AutoGen: similar to LangGraph, but geared more toward code-execution tasks

LlamaIndex: a dedicated RAG framework for connecting external files to LLMs as context

GraphRAG: an advanced version of standard RAG that uses Knowledge Graphs for retrieval, giving more comprehensive answers

DeepEval: a framework to evaluate your GenAI applications

Ollama: software/Python package for running local LLMs

Diffusers: a library for image generation using diffusion models

My debut book: LangChain in your Pocket is out now

So let’s get started with this list

LangChain

LangChain is an open-source framework that makes it easier to build applications using LLMs. It helps developers create interactive tools like chatbots and question-answering systems by connecting these models to different data sources. With LangChain, you can mix and match different components to customize your application, making it simple to enhance how it understands and processes natural language. I can’t think of an application that can’t be built using LangChain.

You can find tutorials related to LangChain here (check the full playlist, top right):

DSPy

The DSPy package is a framework developed by Stanford University that simplifies the use of LLMs by focusing on programming rather than manual prompting. It allows users to create and optimize LLM-based applications by defining modules and compiling them to automatically adjust prompts and weights. This approach helps improve the performance and flexibility of applications by treating language models as programmable components, similar to how neural networks are trained and optimized. Find the code tutorials below

PandasAI

PandasAI is a Python library that enhances the capabilities of the Pandas data analysis tool by integrating generative AI. It allows users to interact with dataframes using natural language queries, which are then translated into Python or SQL code. This makes it easier for users to analyze and visualize data without needing to write complex code, as PandasAI can generate insights, create graphs, and clean data through simple text prompts. See the demo below.

LangGraph

LangGraph is a library designed to build complex, stateful applications using LLMs. It allows developers to create workflows involving multiple agents and cycles, enabling more dynamic and flexible interactions compared to traditional linear workflows. Built on top of LangChain, LangGraph provides features like persistence, human-in-the-loop capabilities, and streaming support, making it ideal for applications that require advanced control and coordination between different components or agents.

Find different coding tutorials on LangGraph here:

Langfuse

Langfuse is an open-source platform designed to help teams work with LLMs more effectively. It provides tools for observing, analyzing, and experimenting with LLM applications. Langfuse allows developers to track and debug model calls, manage and test prompts, and gather insights through analytics. By integrating with various frameworks and offering features like prompt management and performance evaluations, Langfuse simplifies the process of developing and optimizing LLM-based applications.

Get started with Langfuse using the tutorial below

Unsloth

The Unsloth package is a library designed to optimize the training and fine-tuning of LLMs. It focuses on making the process faster and more efficient, claiming to increase training speed by up to 30 times and reduce memory usage by 60%. Unsloth supports various hardware, including NVIDIA, Intel, and AMD GPUs, and is open-source, allowing users to easily implement and experiment with LLMs. Its features include improved performance with minimal loss in accuracy, making it a valuable tool for developers working with AI models.

Check out how I fine-tuned Llama 3.1 using Unsloth

AutoGen

The AutoGen package is a framework that helps create applications using multiple agents that can communicate and work together to solve tasks. It simplifies building workflows with LLMs by allowing these agents to collaborate, send messages, and perform actions like generating and executing code. This makes it easier to automate complex tasks and manage interactions between different components in an application.

Check out how to get started with AutoGen here

GraphRAG

The GraphRAG package is a tool that enhances Retrieval-Augmented Generation (RAG) by using knowledge graphs instead of traditional vector databases. This approach allows for more accurate and meaningful data retrieval from unstructured text. By building a structured knowledge graph, GraphRAG improves the ability of large language models to understand and reason about complex information, making it particularly useful for handling intricate queries and synthesizing insights from large datasets.

You can implement GraphRAG using LangChain as well

LlamaIndex

LlamaIndex is a tool that helps connect your custom data to large language models (LLMs) like GPT-4, following the RAG framework. It makes it easy to bring data from different sources, such as APIs, databases, and documents, into a format that LLMs can understand. LlamaIndex organizes this data, allowing users to ask questions and interact with it using natural language. This way, it enhances the capabilities of LLMs by enabling them to access and process specific information from your own datasets.

Ollama

The Ollama package is a tool that lets you run LLMs like Llama 3.1 and Mistral directly on your local computer. It simplifies the process by bundling everything needed — model weights, configurations, and data — into a single package. This makes it easy to set up and use these models locally, whether you’re using a CPU or GPU, and it supports various interaction methods, including command-line, SDK, and API access. This approach offers flexibility and control for developers looking to leverage AI models on their own devices.

DeepEval

The DeepEval package is an open-source framework designed for evaluating LLM outputs. It functions similarly to Pytest but is specialized for unit testing LLM applications. DeepEval provides a range of evaluation metrics, such as answer relevancy and hallucination detection, to help developers assess and improve the performance of their models. It supports real-time evaluations and can be integrated into existing workflows, making it a valuable tool for fine-tuning and optimizing LLM-based applications.

Diffusers

The Diffusers package is a library from Hugging Face that makes it easy to use diffusion models for generating images and audio. It provides pre-trained models and simple tools to create high-quality outputs from text prompts. With Diffusers, developers can quickly build applications that involve generating visuals or sounds, all while customizing the settings to get the desired results. It’s designed to be user-friendly and accessible for anyone working with AI-generated content.

With this, we will wrap up this post. There are many other Generative AI packages one can use that I haven’t discussed here. Do check them out here:
