The Pythoneers

Your home for innovative tech stories about Python and its limitless possibilities. Discover, learn, and get inspired.

Create a QA Chatbot from a PDF with the Gemini Pro LLM in Python

Amit Chauhan
Published in The Pythoneers
4 min read · May 16, 2024

In this article, we will implement a question-answering chatbot over a provided PDF file using one of Google's large language models. The text of the PDF is converted into embedding vectors, which are then indexed and searched with the help of the FAISS library.

Let’s start with the implementation.

PyPDF2 is a library for working with PDF files: extracting text, splitting and merging documents, and so on.

from PyPDF2 import PdfReader
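As a minimal sketch of how PdfReader is typically used, the helpers below read every page of a PDF and join the extracted text. The function names `pages_to_text` and `read_pdf_text` are hypothetical and are not taken from the article.

```python
def pages_to_text(pages):
    """Join the text of page-like objects that expose extract_text()."""
    # Scanned or image-only pages may yield no text, so fall back to "".
    return "\n".join((page.extract_text() or "") for page in pages)


def read_pdf_text(pdf_path):
    """Return the text of every page in the PDF at pdf_path (hypothetical helper)."""
    # Imported here so this module still loads where PyPDF2 is not installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    return pages_to_text(reader.pages)
```

A call such as `read_pdf_text("sample.pdf")` would return the document's text as one string; the file name is only a placeholder.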

RecursiveCharacterTextSplitter: It is useful for breaking large documents into smaller chunks for processing and analysis. As the name suggests, it splits the text recursively, trying coarser separators first and finer ones next, and continues until every piece of text fits within the specified chunk size limit.

from langchain.text_splitter import RecursiveCharacterTextSplitter
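To make the recursive idea concrete, here is a simplified pure-Python sketch. It is not LangChain's implementation: the function name `recursive_split` is hypothetical, and unlike the real splitter it neither merges small pieces back together nor adds overlap between chunks.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into pieces of at most chunk_size characters.

    Tries the coarsest separator first and recurses with finer separators
    only on pieces that are still too long.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed character offsets.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        chunks.extend(recursive_split(piece, chunk_size, finer))
    return chunks


chunks = recursive_split("aaa bbb\n\nccc", chunk_size=5)
# chunks == ["aaa", "bbb", "ccc"]
```

With LangChain itself, the usual pattern is to construct the splitter with `chunk_size` and `chunk_overlap` arguments and then call `split_text` on the extracted text. The values used in the article are not shown here, so none are assumed.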

The os module provides an interface to operating system functionality such as file manipulation, environment variables, and path operations.

import os
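One common reason to import os in a project like this is to read an API key from an environment variable instead of hard-coding it. The sketch below assumes a variable named `GOOGLE_API_KEY`; both that name and the helper `get_api_key` are assumptions, not details confirmed by the article.

```python
import os


def get_api_key(var_name="GOOGLE_API_KEY"):
    """Return the API key stored in an environment variable, or raise if unset."""
    # var_name defaults to an assumed name; adjust it to your own setup.
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the API client.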

As we are using Google’s LLM, we are converting text data to embeddings that…

Written by Amit Chauhan

Data Scientist, AI/ML/DL, Azure Cloud
