An application that turns YouTube videos into interactive conversations using a simple Retrieval-Augmented Generation (RAG) architecture. Ask questions about a YouTube video and get answers grounded in its transcript.
- Video Processing: Extract and analyze content from YouTube videos that have transcripts available
- Multi-Language Support: Process videos with transcripts in multiple languages
- Simple RAG Pipeline: Accurately retrieve and contextualize relevant information from videos
- Interactive Chat: Ask questions about the video content and receive detailed answers
- Conversation History: Save and export your conversation history in JSON or Markdown format
- Modern UI: Clean and intuitive Streamlit interface
The project consists of two main components:
- FastAPI Backend: The core RAG system that processes videos and answers questions
- Streamlit Frontend: User-friendly interface for interacting with the RAG system
The application uses a simple RAG pipeline:
- Video Processing:
  - Extract the transcript from the YouTube video
  - Split the transcript into meaningful chunks
  - Generate embeddings using Cohere's embed-english-v3.0 model
  - Store the embeddings in a Pinecone vector database
- Question Answering:
  - Process the user's question through MultiQueryRetriever
  - Retrieve the most relevant transcript chunks from Pinecone
  - Generate a comprehensive answer using Google's Gemini 2.5 Flash model
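The two stages above can be illustrated with a toy, dependency-free sketch. It is not the project's implementation: the word-count `embed` function stands in for Cohere's embedding model, an in-memory list stands in for Pinecone, and query expansion (MultiQueryRetriever) and the Gemini call are omitted. The names `chunk_text`, `embed`, `retrieve`, and `build_prompt`, as well as the chunk sizes, are assumptions chosen for illustration.

```python
import math
from collections import Counter


def chunk_text(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this transcript context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    transcript = (
        "Welcome to the channel. Today we cover vector databases. "
        "A vector database stores embeddings for similarity search. "
        "Later we discuss how to bake sourdough bread at home."
    )
    question = "How do I bake bread?"
    top_chunks = retrieve(question, chunk_text(transcript), k=1)
    print(build_prompt(question, top_chunks))
```

Overlapping windows are one common way to keep a sentence that straddles a chunk boundary retrievable from either side; the real pipeline may split the transcript differently.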
- Python 3.10+
- API keys for:
  - Pinecone
  - Cohere
  - Google AI (Gemini)
1. Clone the repository.
2. Create and activate a virtual environment:
   - Create it: `python -m venv youtube_rag_env`
   - Activate on Windows: `.\youtube_rag_env\Scripts\activate`
   - Activate on Unix or macOS: `source youtube_rag_env/bin/activate`
3. Install the required packages: `pip install -r requirements.txt`
4. Create a `.env` file in the root directory with your API keys, one per line:
   - `GOOGLE_API_KEY="your_google_api_key"`
   - `COHERE_API_KEY="your_cohere_api_key"`
   - `PINECONE_API_KEY="your_pinecone_api_key"`
5. Start the FastAPI backend: `uvicorn youtube_rag_api:app --reload`
6. In a separate terminal, start the Streamlit frontend: `streamlit run streamlit_video_chat.py`
7. Open your browser and navigate to http://localhost:8501
| Endpoint | Method | Purpose | Request | Response |
|---|---|---|---|---|
| `/` | GET | Health check | None | Status message |
| `/process_video` | POST | Process video transcript | `video_url`, `language` | Processing status |
| `/chat` | POST | Answer questions about a video | `video_id`, `question` | Answer with sources |
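As a rough illustration of calling these endpoints, the sketch below builds the two POST requests with Python's standard library. It assumes the backend listens on uvicorn's default address `http://127.0.0.1:8000` and accepts the listed fields as a JSON body. The helper names and the `"en"` default language code are hypothetical, and the response schema is not specified here, so `send` simply returns the decoded JSON.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumption: uvicorn's default host and port


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the given endpoint path."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_video_request(video_url: str, language: str = "en") -> urllib.request.Request:
    """Request for /process_video; the "en" default is an illustrative assumption."""
    return build_post("/process_video", {"video_url": video_url, "language": language})


def chat_request(video_id: str, question: str) -> urllib.request.Request:
    """Request for /chat using the fields listed in the endpoint table."""
    return build_post("/chat", {"video_id": video_id, "question": question})


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `send(chat_request(video_id, "What is the main topic?"))` would return whatever JSON the `/chat` endpoint produces for a video that has already been processed.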
- Paste a YouTube URL in the input field
- Select the language of the video's transcript
- Click "Process Video" to extract and analyze the content
- Once processing is complete, ask questions about the video in the chat interface
- Use the suggested questions or type your own
- Download your conversation history as JSON or Markdown from the sidebar
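The JSON and Markdown exports could be produced along the following lines. This is a minimal sketch, not the frontend's actual code: it assumes the history is a list of messages with `role` and `content` keys, and the function names and Markdown layout are hypothetical.

```python
import json


def to_json(history: list[dict]) -> str:
    """Serialize the conversation as pretty-printed JSON."""
    return json.dumps(history, indent=2, ensure_ascii=False)


def to_markdown(history: list[dict], title: str = "Conversation History") -> str:
    """Render the conversation as a Markdown document, one paragraph per message."""
    lines = [f"# {title}", ""]
    for message in history:
        speaker = "You" if message["role"] == "user" else "Assistant"
        lines.append(f"**{speaker}:** {message['content']}")
        lines.append("")  # blank line between messages
    return "\n".join(lines).rstrip() + "\n"
```

Either string can then be offered as a file download; `ensure_ascii=False` keeps non-English transcript text readable in the JSON export.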
- `youtube_rag_api.py` - FastAPI backend implementing the RAG system
- `streamlit_video_chat.py` - Streamlit frontend for user interaction
- `requirements.txt` - Required Python packages
- `notebooks/YouTubeRAG.ipynb` - Development notebook with detailed explanations
- `img/technical_architecture.png` - Technical architecture diagram
The application requires the following environment variables:
- `GOOGLE_API_KEY` - API key for Google's Gemini model
- `COHERE_API_KEY` - API key for Cohere's embedding model
- `PINECONE_API_KEY` - API key for Pinecone vector database
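A startup check for these variables might look like the sketch below. It assumes the `.env` values have already been loaded into the process environment; the names `REQUIRED_KEYS`, `missing_keys`, and `require_keys` are hypothetical and not taken from the project.

```python
import os
from typing import Mapping, Optional

REQUIRED_KEYS = ("GOOGLE_API_KEY", "COHERE_API_KEY", "PINECONE_API_KEY")


def missing_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


def require_keys(env: Optional[Mapping[str, str]] = None) -> None:
    """Raise a clear error if any required variable is missing."""
    missing = missing_keys(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Failing early with the list of missing names is usually easier to diagnose than an authentication error raised later by one of the API clients.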
- Only works with YouTube videos that have available transcripts
- Quality of answers depends on the quality and accuracy of video transcripts
- Processing very long videos may take more time and resources
- Backend: FastAPI, LangChain, Pinecone
- Frontend: Streamlit
- AI/ML:
  - Cohere Embeddings (embed-english-v3.0)
  - Google Gemini 2.5 Flash
- Data: YouTube Transcript API
- Data Validation: Pydantic