
Mmaneesh007/FRIDAY-Voice-Assistant


🤖 F.R.I.D.A.Y. - Advanced Voice Assistant

Python 3.11+ LiveKit Groq Sarvam FastMCP

A Tony Stark-inspired multimodal voice agent that hears, thinks, and commands your digital world.

Features · Architecture · Installation · Usage


🌟 Overview

F.R.I.D.A.Y. (Female Replacement Intelligent Digital Assistant Youth) is a high-performance, real-time voice assistant. It combines a Groq-hosted Llama 3.3 model with Sarvam AI speech processing to understand context, fetch real-time global news, monitor financial markets, and execute automated workflows through the Model Context Protocol (MCP).

✨ Features

  • ⚡ Blazing Fast Intelligence: Powered by Groq (llama-3.3-70b-versatile) for sub-second logical reasoning.
  • 🎙️ Real-time STT & TTS: Multilingual speech-to-text and rich text-to-speech utilizing Sarvam AI.
  • 🛠️ Extensible Tooling: Built on FastMCP, allowing F.R.I.D.A.Y. to fetch live news feeds, trigger system web browsers, and perform web searches.
  • 🌐 Cloudflared Tunneling: Securely expose your local toolsets to external cloud environments.
  • ⏱️ Contextual Awareness: Greets you dynamically based on the time of day and maintains a dry, professional, Stark-esque persona.
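
As a rough illustration of the time-of-day greeting mentioned above, the following sketch maps the current hour to a greeting. The function name `greeting_for`, the hour boundaries, and the phrases are all assumptions made for illustration; the repository's actual logic may differ.

```python
from datetime import datetime


def greeting_for(now: datetime) -> str:
    """Return a greeting for the given time (hypothetical helper, not from the repository)."""
    hour = now.hour
    if 5 <= hour < 12:
        return "Good morning"
    if 12 <= hour < 17:
        return "Good afternoon"
    if 17 <= hour < 22:
        return "Good evening"
    # Anything between 22:00 and 04:59 falls through to the late-night line.
    return "Working late, I see"


if __name__ == "__main__":
    print(greeting_for(datetime.now()))
```

In a real agent, a string like this would typically be passed into the opening prompt or first spoken line of the session.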

🏗️ Architecture

graph TD
    User((User)) <-->|Voice/Audio| LK[LiveKit Cloud]
    LK <-->|WebRTC| Agent[F.R.I.D.A.Y Agent]
    
    subgraph Local Environment
        Agent <-->|MCP Protocol| FastMCP[FastMCP Server]
        FastMCP --> Tools[Local Tools]
        Tools -->|Browser| Web[World Monitor App]
    end
    
    subgraph Cloud APIs
        Agent -->|Transcription| SarvamSTT[Sarvam STT]
        Agent -->|Reasoning| Groq[Groq Llama 3.3]
        Agent -->|Synthesis| SarvamTTS[Sarvam TTS]
    end
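
The diagram's cloud path (transcription, then reasoning, then synthesis) can be pictured as three stages chained per conversational turn. The sketch below is a simplified, non-streaming model with stand-in stages so it runs without API keys; `VoicePipeline` and `handle_turn` are hypothetical names, and the real agent relies on LiveKit's streaming pipeline rather than plain function calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical model of one voice turn: audio in, audio out."""
    transcribe: Callable[[bytes], str]   # audio -> text (Sarvam STT in the diagram)
    reason: Callable[[str], str]         # text -> reply (Groq Llama 3.3 in the diagram)
    synthesize: Callable[[str], bytes]   # reply -> audio (Sarvam TTS in the diagram)

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)
        reply = self.reason(text)
        return self.synthesize(reply)


# Stand-in stages: bytes are treated as UTF-8 text instead of real audio.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    reason=lambda text: f"You said: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)

if __name__ == "__main__":
    print(pipeline.handle_turn(b"status report"))
```

Keeping each stage behind a callable mirrors the provider switches in the configuration (STT, LLM, TTS), since any stage can be swapped without touching the others.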

🚀 Installation

Prerequisites

  • Python 3.11 or higher
  • uv (Extremely fast Python package installer)
  • Accounts/API Keys for LiveKit, Groq, and Sarvam AI.

1. Clone the Repository

git clone https://github.com/YOUR_USERNAME/Project-FRIDAY.git
cd Project-FRIDAY

2. Set Up Environment Variables

Create a .env file in the root directory and add your credentials:

LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret

GROQ_API_KEY=your_groq_api_key
SARVAM_API_KEY=your_sarvam_api_key

STT_PROVIDER=sarvam
LLM_PROVIDER=groq
TTS_PROVIDER=sarvam
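
Before starting the agent, it can help to confirm that the credentials above are actually present in the environment. The sketch below is one way to do that with the standard library only; `missing_vars` is a hypothetical helper, and it assumes the values from .env have already been loaded into the process environment (for example by the shell or a dotenv loader).

```python
import os

# Credential names taken from the .env example above.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "GROQ_API_KEY",
    "SARVAM_API_KEY",
)


def missing_vars(env=None):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```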

🎮 Usage

F.R.I.D.A.Y. needs two processes running at the same time: the Tool Server and the Voice Agent.

Step 1: Start the Brain (Tools)

Open a terminal and launch the FastMCP server:

uv run server.py
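
For a sense of what a tool behind this server might look like, here is a plain-Python sketch of a web-search helper that builds a search URL and optionally opens it in the system browser. The function names and the DuckDuckGo URL are assumptions for illustration, not the contents of server.py, and registering the function with FastMCP is deliberately left out.

```python
import webbrowser
from urllib.parse import quote_plus

# Assumed search endpoint; the repository may use a different provider.
SEARCH_BASE_URL = "https://duckduckgo.com/?q="


def build_search_url(query: str) -> str:
    """URL-encode the query and append it to the search endpoint."""
    return SEARCH_BASE_URL + quote_plus(query)


def open_search(query: str, launch: bool = True) -> str:
    """Open the search in the default browser and return the URL (hypothetical tool body)."""
    url = build_search_url(query)
    if launch:
        webbrowser.open(url)
    return url


if __name__ == "__main__":
    # launch=False keeps the example side-effect free.
    print(open_search("world news", launch=False))
```

Returning the URL as a string gives the language model something concrete to report back to the user after the tool call.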

Step 2: Start the Voice Agent

Open a second terminal and connect to LiveKit:

uv run agent_friday.py dev

Step 3: Connect

  1. Navigate to the LiveKit Agents Playground.
  2. Click Connect and start speaking! Try asking: "Friday, what's happening around the world?"

🌐 Cloudflared Tunnel (Optional)

If you need to expose your local MCP server to the web, use the included PowerShell script:

.\start_tunnel.ps1

This will automatically download Cloudflared and open a tunnel on port 8000.
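
A tunnel is only useful if something is listening on the local port it forwards. The sketch below checks whether a TCP connection to port 8000 on localhost succeeds; `port_is_open` is a hypothetical helper, and it assumes the MCP server serves over TCP on that port, as the tunnel description suggests.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print(f"Local server on port 8000 is {state}.")
```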

⚠️ Known Limitations

  • Sarvam TTS Codec: The LiveKit Sarvam plugin currently expects WAV audio streams by default, while the Sarvam API has transitioned to MP3. This codec mismatch prevents TTS audio from playing properly. The issue should be temporary: it is pending an upstream update to livekit-plugins-sarvam that handles the new format.
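
To confirm which format a TTS response actually contains, the leading bytes of the payload can be inspected. The sketch below is a heuristic check, not part of the plugin or the Sarvam API: `sniff_audio_format` is a hypothetical helper that looks for the RIFF/WAVE header used by WAV files and for either an ID3 tag or an MPEG frame-sync pattern for MP3.

```python
def sniff_audio_format(data: bytes) -> str:
    """Guess 'wav', 'mp3', or 'unknown' from the first bytes (heuristic, hypothetical helper)."""
    # WAV: "RIFF" at offset 0 and "WAVE" at offset 8.
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3 with an ID3v2 tag at the start of the file.
    if data[:3] == b"ID3":
        return "mp3"
    # Bare MPEG audio frame: 11 sync bits set (0xFF, then top three bits of the next byte).
    # Other MPEG audio streams share this pattern, so treat the result as a hint only.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


if __name__ == "__main__":
    print(sniff_audio_format(b"RIFF\x24\x00\x00\x00WAVEfmt "))
    print(sniff_audio_format(b"ID3\x04\x00\x00\x00\x00\x00\x00"))
```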

📄 License

This project is licensed under the MIT License.
