Create a PDF Question-Answering Chatbot in Python with the Gemini Pro LLM

Google’s large language models in artificial intelligence

Amit Chauhan
The Pythoneers



In this article, we will implement a question-answering chatbot that answers from a provided PDF file, using one of Google’s large language models. The text extracted from the PDF is converted into embedding vectors, which are then stored and searched with the help of the FAISS library.

Let’s start the implementation part.

PyPDF2 is a library for working with PDF files: extracting text, splitting, merging, and so on.

from PyPDF2 import PdfReader

RecursiveCharacterTextSplitter is useful for breaking a large document into smaller chunks for processing and analysis. As its name suggests, it splits the text recursively, and the process continues until every oversized piece of text has been reduced to chunks within the specified length limit.

from langchain.text_splitter import RecursiveCharacterTextSplitter
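To make the recursive idea concrete, here is a simplified, pure-Python illustration. It is not LangChain’s actual implementation: the function name recursive_split is hypothetical, and chunk overlap is left out. It tries the coarsest separator first and only falls back to finer ones for pieces that are still too long.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters.

    Simplified sketch: coarse separators are tried first, and any piece
    that is still too long is split again with the finer ones.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed character positions.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        if not part:
            continue  # skip empty pieces produced by repeated separators
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep growing the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # Still too long: recurse with the finer separators.
            chunks.extend(recursive_split(part, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


print(recursive_split("aaa bbb ccc\n\nddd eee", chunk_size=7))
# → ['aaa bbb', 'ccc', 'ddd eee']
```

With LangChain itself, the equivalent step is to create RecursiveCharacterTextSplitter(chunk_size=..., chunk_overlap=...) and call its split_text(text) method; the sizes to use depend on your document and model.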

The os module provides an interface to operating system functionality such as file manipulation, environment variables, and path operations.

import os
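In a project like this, os is commonly used to read an API key from an environment variable instead of hard-coding it. The sketch below assumes the key is stored under the name GOOGLE_API_KEY; both that variable name and the helper get_api_key are assumptions for illustration, not something fixed by the article.

```python
import os


def get_api_key(var_name="GOOGLE_API_KEY"):
    """Read an API key from the environment and fail loudly if it is missing.

    The default variable name is an assumption; use whatever name you set.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

Keeping the key out of the source code means the script can be shared or committed without exposing credentials.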

As we are using Google’s LLM, we are converting the text data into embeddings that…