YouTube Q&A System

Aaron Philip
Published in Django Unleashed
Nov 24, 2023

In an era defined by instant connectivity and information, this YouTube Q&A System is an attempt to change the creator-viewer dynamic by letting viewers ask questions about a video's content. This article walks through the platform's origins, functionality, and code.

Large Language Models (LLMs) have left a clear mark on many parts of the digital landscape. Models such as GPT-3, followed by others such as PaLM and Bard, have redefined natural language processing, enabling computers to understand and generate human-like text at an unprecedented scale. Their impact on content creation is substantial: writers, marketers, and developers now have tools that can produce coherent, contextually relevant text across diverse domains.

In the educational sphere, LLMs have emerged as invaluable learning companions, providing personalized tutoring, language translation, and even aiding in the creation of educational content.

Let’s dive into the software itself: its design and its development.

System Design

Understanding the Workflow and System Design:

The idea is that the user provides a YouTube URL as input and clicks a “Train” button. After a short processing time, the user can ask the system questions about the video.

For the tech stack, we will use React for the frontend and Django for the backend.

STEPS:

  1. Enter the YouTube URL
  2. Use the YouTube Transcript API to extract the captions
  3. Store them in text format
  4. Embed the caption chunks with an embedding model (PaLM 2 in our system, but any comparable model would work)
  5. Generate FAISS files (a pickle file is generated automatically alongside the index)
  6. Store these files along with the captions
  7. Use an LLM to take in the query, understand the context, and then give answers based on the information provided
  8. Also provide an option for getting references (which is a simple semantic search through the FAISS files)
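The steps above can be sketched end to end without any external services. This is only an illustration under stated assumptions: a bag-of-words counter stands in for the PaLM embeddings, a linear cosine-similarity scan stands in for the FAISS index, and a plain prompt template stands in for the question-answering chain. All function names here (`split_text`, `embed`, `cosine`, `search`, `build_prompt`) are hypothetical and do not appear in the project.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=700):
    """Pack whole words into chunks of at most chunk_size characters.

    A single word longer than chunk_size would still become its own chunk.
    """
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > chunk_size and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query, chunks, k=2):
    """Return the k chunks most similar to the query (the 'references')."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, context_chunks):
    """Place the retrieved chunks into one prompt for the LLM."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"


captions = (
    "Omega is the angular frequency of the oscillator. "
    "Beethoven wrote nine symphonies. "
    "The angular frequency omega equals two pi times the frequency."
)
chunks = split_text(captions, chunk_size=60)
top = search("what is omega", chunks, k=2)
prompt = build_prompt("what is omega", top)
```

In the real system the `search` step corresponds to the references endpoint, and the prompt built from the retrieved chunks is what the LLM answers from.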

Backend Code (Django)

We’ll look only at the view functions and then urls.py (hopefully you have a basic idea of Django). Please feel free to post any questions in the comments section.

import urllib.parse as urlparse

from django.http import HttpResponse, JsonResponse
from youtube_transcript_api import YouTubeTranscriptApi
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import GooglePalmEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.llms.huggingface_hub import HuggingFaceHub


# Function to extract the video id from a YouTube watch URL
def extract_video_id(url):
    # Parse the URL into its components
    url_data = urlparse.urlparse(url)
    # The video id is the value of the 'v' query parameter;
    # this raises KeyError if the URL has no 'v' parameter
    video_id = urlparse.parse_qs(url_data.query)['v'][0]
    return video_id

def say_hello(request):
    return HttpResponse("Hello")

def yt(request):
    # Get YouTube URL from request parameters
    url = request.GET.get('url', '')
    # Extract video id
    video_id = extract_video_id(url)

    # Load the FAISS index that was saved for this video
    embedding2 = GooglePalmEmbeddings(google_api_key="PALM-API-KEY")
    vdb_chunks_HF = FAISS.load_local("./query/vdb_chunks_HF/", embedding2, index_name=f"index{video_id}")
    query = request.GET.get('query', '')
    # Semantic search: return the caption chunks closest to the query
    ans = vdb_chunks_HF.as_retriever().get_relevant_documents(query)
    answers = [doc.page_content for doc in ans]

    # Add CORS headers to the response
    response = JsonResponse({'answers': answers})
    response['Access-Control-Allow-Origin'] = 'http://localhost:3000'
    return response

def llm_answering(request):
    url = request.GET.get('url', '')
    query = request.GET.get('query', '')
    # Validate that both 'url' and 'query' parameters are present
    if not url or not query:
        return HttpResponse("Both 'url' and 'query' parameters are required.", status=400)
    video_id = extract_video_id(url)
    embedding2 = GooglePalmEmbeddings(google_api_key="PALM-API-KEY")
    # Adjust the path to your FAISS index directory
    db = FAISS.load_local("./query/vdb_chunks_HF/", embedding2, index_name=f"index{video_id}")
    llm = HuggingFaceHub(
        repo_id="google/flan-t5-xxl",
        model_kwargs={"temperature": 0.1, "max_length": 65536, "min_length": 32768},
        huggingfacehub_api_token="HUGGING-FACE-API-TOKEN",
    )
    # The "stuff" chain places all retrieved chunks into a single prompt
    chain = load_qa_chain(llm, chain_type="stuff")
    docs = db.similarity_search(query)
    answer = chain.run(input_documents=docs, question=query)

    # Add CORS headers to the response
    response = JsonResponse({'response': answer})
    response['Access-Control-Allow-Origin'] = 'http://localhost:3000'
    return response


def process_youtube_video(request):
    # Get YouTube URL from request parameters
    # (note: this endpoint reads the URL from the 'query' parameter)
    url = request.GET.get('query', '')
    print("This is the url", url)
    # Extract video id
    video_id = extract_video_id(url)

    # Fetch the captions
    try:
        transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
        # Only auto-generated English captions are looked up here
        transcript = transcript_list.find_generated_transcript(['en'])
        captions = transcript.fetch()

        # Open a text file in write mode
        with open(f'./query/vdb_chunks_HF/{video_id}_captions.txt', 'w') as f:
            for caption in captions:
                # Write the caption to the text file
                f.write(caption['text'] + '\n')
    except Exception:
        return JsonResponse({'error': "An error occurred while fetching the captions."})

    # Load text file
    with open(f'./query/vdb_chunks_HF/{video_id}_captions.txt', 'r') as file:
        text = file.read()

    # Split text into chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=0, length_function=len)
    chunks = text_splitter.split_text(text)

    # Create documents from chunks
    docs = text_splitter.create_documents(chunks)

    # Convert chunks to embeddings and save as FAISS file
    embedding = GooglePalmEmbeddings(google_api_key="PALM-API-KEY")
    vdb_chunks_HF = FAISS.from_documents(docs, embedding=embedding)
    vdb_chunks_HF.save_local('./query/vdb_chunks_HF/', index_name=f"index{video_id}")
    return JsonResponse({'status': 'success'})

urls.py

from django.urls import path
from . import views

urlpatterns = [
    path('home/', views.say_hello, name='say_hello'),
    path('yt/', views.yt, name='yt'),
    path('llm/', views.llm_answering, name='llm'),
    path('ytvid/', views.process_youtube_video, name='ytvid'),
]

Example REST Requests:

http://127.0.0.1:8000/query/yt/?query=omega&url=https://www.youtube.com/watch?v=7kcWV6zlcRU&list=PLUl4u3cNGP62esZEwffjMAsEMW_YArxYC&index=5&ab_channel=MITOpenCourseWare
http://127.0.0.1:8000/query/ytvid/?query=https://www.youtube.com/watch?v=IcmzF1GT1Qw
http://127.0.0.1:8000/query/llm/?query=what+is+omega&url=https://www.youtube.com/watch?v=7kcWV6zlcRU&list=PLUl4u3cNGP62esZEwffjMAsEMW_YArxYC&index=5&ab_channel=MITOpenCourseWare
http://127.0.0.1:8000/query/llm/?query=what+does+he+say+about+beethoven+?&url=https://www.youtube.com/watch?v=IcmzF1GT1Qw&ab_channel=Vienna
https://www.youtube.com/watch?v=Tuw8hxrFBH8
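The example requests above embed a full YouTube URL inside another URL's query string, so any `&` in the video URL is read as the start of a new parameter of the outer request. A minimal sketch of building such requests with percent-encoding follows; `build_request` and `BASE` are hypothetical names used only for illustration, and the host and endpoint names are taken from the examples above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://127.0.0.1:8000/query"  # local dev server, as in the examples


def build_request(endpoint, **params):
    """Return a request URL whose parameter values are percent-encoded."""
    return f"{BASE}/{endpoint}/?{urlencode(params)}"


video = "https://www.youtube.com/watch?v=IcmzF1GT1Qw&ab_channel=Vienna"
request_url = build_request(
    "llm", query="what does he say about beethoven?", url=video
)

# Decoding the query string recovers both values intact, including the
# '&ab_channel=...' part that would otherwise split off as its own parameter.
decoded = parse_qs(urlsplit(request_url).query)
```

Django decodes percent-encoded values automatically, so `request.GET.get('url')` would receive the original video URL unchanged.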

Functions and their uses:

  1. extract_video_id(url): Returns only the video id
  2. yt(request): Returns the results of a simple semantic search over the stored captions (References)
  3. llm_answering(request): Returns an answer by understanding the context
  4. process_youtube_video(request): Takes in the request from the frontend and converts the extracted captions into text files, FAISS files, and pickle files
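The video-id helper listed above reads only the `v` query parameter, so it fails on short links or on URLs without that parameter. A more defensive variant might look like the sketch below. The name `extract_video_id_safe` and the set of supported URL shapes are assumptions made for illustration, not part of the original project.

```python
from urllib.parse import urlparse, parse_qs

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id_safe(url):
    """Return the video id from common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in YOUTUBE_HOSTS:
        # Standard watch links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        # Embed and Shorts links: /embed/<id>, /shorts/<id>
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None

    return None
```

Returning `None` instead of raising lets a view respond with a clear error message when the URL is not a recognizable YouTube link.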

Frontend Code (React)

import React, { useState } from 'react';
import axios from 'axios';

const YTComponent = () => {
  const [url, setUrl] = useState('');
  const [answer, setAnswer] = useState('');
  const [references, setReferences] = useState('');
  const [trainingUrl, setTrainingUrl] = useState('');
  const [loading, setLoading] = useState(false);

  const handleTrainingURLChange = (e) => {
    // Remove everything from the first '&' onward, e.g.
    // https://www.youtube.com/watch?v=yRmOWcWdQAo&ab_channel=OverSimplified
    // becomes https://www.youtube.com/watch?v=yRmOWcWdQAo
    const rawUrl = e.target.value;
    const regex_url = rawUrl.replace(/&.*/g, '');
    console.log(regex_url)
    setTrainingUrl(regex_url);
  }

  // Note: the `url` state holds the user's question, not a URL.
  const handleAnswerClick = async () => {
    try {
      // Passing `params` lets axios percent-encode the question and the video URL
      const response = await axios.get('http://127.0.0.1:8000/query/llm/', {
        params: { query: url, url: trainingUrl },
      });
      // Extract relevant data from the response and set it to the state
      setAnswer(response.data.response);
    } catch (error) {
      console.error('Error fetching answer:', error);
    }
  };

  const handleReferencesClick = async () => {
    try {
      const response = await axios.get('http://127.0.0.1:8000/query/yt/', {
        params: { query: url, url: trainingUrl },
      });
      // Show only the closest matching caption chunk
      setReferences(response.data.answers[0]);
    } catch (error) {
      console.error('Error fetching references:', error);
    }
  };

  const handleTrainClick = async () => {
    try {
      setLoading(true);
      // The training endpoint reads the video URL from the 'query' parameter
      const response = await axios.get('http://127.0.0.1:8000/query/ytvid/', {
        params: { query: trainingUrl },
      });

      console.log('Training Response:', response.data);

      // The backend responds with {status: 'success'} on success
      if (response.data.status === 'success') {
        window.alert('Training successful! You can now ask questions.');
      }
    } catch (error) {
      console.error('Error during training:', error);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="max-w-2xl mx-auto mt-8 p-4 bg-gray-100 rounded-md">
      {/* Training Section */}
      <div className="mb-4">
        <label htmlFor="trainingUrl" className="block text-sm font-medium text-gray-700">
          Enter URL for Training:
        </label>
        <div className="flex">
          <input
            type="text"
            id="trainingUrl"
            className="mt-1 p-2 border rounded-l-md w-full"
            placeholder="Paste YouTube URL here..."
            value={trainingUrl}
            onChange={(e) => {
              handleTrainingURLChange(e)
            }}
          />
          <button
            className="bg-purple-500 text-white px-4 py-2 rounded-r-md"
            onClick={handleTrainClick}
          >
            {loading ? 'Loading...' : 'Train'}
          </button>
        </div>
      </div>

      {/* Answer and References Section */}
      <div className="mb-4">
        <label htmlFor="url" className="block text-sm font-medium text-gray-700">
          Enter Question:
        </label>
        <input
          type="text"
          id="url"
          className="mt-1 p-2 border rounded-md w-full"
          placeholder="Write your Question here..."
          value={url}
          onChange={(e) => setUrl(e.target.value)}
        />
      </div>

      <div className="mb-4">
        <button
          className="bg-blue-500 text-white px-4 py-2 rounded-md mr-2"
          onClick={handleAnswerClick}
        >
          Answer
        </button>
        <button
          className="bg-green-500 text-white px-4 py-2 rounded-md"
          onClick={handleReferencesClick}
        >
          References
        </button>
      </div>

      {answer && (
        <div className="bg-green-100 p-4 rounded-md mb-4">
          <strong className="block text-green-700 mb-2">Answer:</strong>
          <p className="text-green-800">{answer}</p>
        </div>
      )}

      {references && (
        <div className="bg-blue-100 p-4 rounded-md">
          <strong className="block text-blue-700 mb-2">References:</strong>
          <p className="text-blue-800">{references}</p>
        </div>
      )}
    </div>
  );
};

export default YTComponent;

In the frontend code we have written a small function that removes the “&ab_channel=CHANNEL_NAME” suffix from the URL, as it interferes with training on the video data.

Hopefully, by the end of this article, you have an idea of how to make your own YouTube Q&A system!
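For readers who want to apply the same cleanup on the backend, the frontend's regex can be mirrored in Python. The sketch below also shows one limitation of cutting at the first `&`: if `v` is not the first parameter, the video id is lost. Both function names are hypothetical, and `canonical_watch_url` is an alternative offered for illustration, not the approach used in the article.

```python
import re
from urllib.parse import urlsplit, parse_qs


def strip_after_ampersand(url):
    """Python equivalent of the frontend's url.replace(/&.*/g, '')."""
    return re.sub(r"&.*", "", url)


def canonical_watch_url(url):
    """Hypothetical alternative: rebuild the URL from the 'v' parameter alone."""
    video_ids = parse_qs(urlsplit(url).query).get("v")
    if not video_ids:
        return None
    return f"https://www.youtube.com/watch?v={video_ids[0]}"


# 'v' comes first, so cutting at the first '&' keeps the video id
first = "https://www.youtube.com/watch?v=yRmOWcWdQAo&ab_channel=OverSimplified"
# 'v' comes second, so cutting at the first '&' drops the video id
later = "https://www.youtube.com/watch?list=PL123&v=yRmOWcWdQAo"

stripped_first = strip_after_ampersand(first)
stripped_later = strip_after_ampersand(later)
rebuilt_later = canonical_watch_url(later)
```

Rebuilding the URL from the parsed `v` value does not depend on parameter order, at the cost of accepting only watch-style URLs.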

CONCLUSION:

In conclusion, building this YouTube Q&A System has been both a technological adventure and a demonstration of what can be assembled from readily available tools. From its inception to the working product, each step reflected the dedication and forward-thinking vision that drove its development.

Any suggestions or thoughts on how to improve this system are welcome in the comments section. Happy coding!

If you are interested in accessing the entire source code, please don’t hesitate to reach out by sending an email to aaronphilip2003@gmail.com. I’ll be happy to provide it upon request.
