🎙️ Voice Agentic Workflow

A voice-powered AI assistant that listens to your speech, translates it to English (if needed), processes it through an AI agent, and responds with synthesized speech, all in real time.

✨ Features

🎤 Voice Input — Captures speech using your microphone with Google Speech Recognition
🌍 Auto Translation — Automatically translates non-English speech to English
🤖 AI Agent — Processes queries using OpenRouter's free LLM models
🔊 Voice Output — Responds with natural text-to-speech (offline, no API needed)
⚡ Async Architecture — Built with asyncio for efficient, non-blocking operations

🏗️ Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Microphone    │────▶│  Speech-to-Text  │────▶│   Translator    │
│   (Voice In)    │     │  (Google API)    │     │   (to English)  │
└─────────────────┘     └──────────────────┘     └────────┬────────┘
                                                          │
                                                          ▼
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│    Speaker      │◀────│  Text-to-Speech  │◀────│    AI Agent     │
│   (Voice Out)   │     │    (pyttsx3)     │     │  (OpenRouter)   │
└─────────────────┘     └──────────────────┘     └─────────────────┘
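The microphone and speaker stages in this diagram usually rely on blocking calls (for example, pyttsx3's `runAndWait()` does not return until playback finishes). One common way to keep an asyncio event loop responsive around such calls is `asyncio.to_thread`. The sketch below is an illustration only: `blocking_speak` and `heartbeat` are hypothetical stand-ins, not functions from this repository, and the project's actual code may handle this differently.

```python
import asyncio
import time
from typing import List, Tuple


def blocking_speak(text: str) -> str:
    """Hypothetical stand-in for a blocking call such as pyttsx3's runAndWait()."""
    time.sleep(0.2)  # simulate the time spent playing audio
    return f"spoke: {text}"


async def heartbeat(ticks: List[int]) -> None:
    """Records ticks to show the event loop keeps running during the blocking call."""
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0.05)


async def main() -> Tuple[str, List[int]]:
    ticks: List[int] = []
    # to_thread runs the blocking function in a worker thread, so the
    # heartbeat coroutine can make progress on the event loop meanwhile.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_speak, "hello"),
        heartbeat(ticks),
    )
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

`asyncio.to_thread` requires Python 3.9 or newer, which is covered by the Python 3.12+ prerequisite.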

📁 Project Structure

voice-agentic-workflow/
├── src/
│   ├── ai_agents/
│   │   └── all_agents.py      # Main agent configuration & runner
│   ├── agents_tools/
│   │   ├── speech_to_text.py  # Voice capture & translation
│   │   └── text_to_speech.py  # AI response vocalization
│   └── .env                   # API keys (not tracked in git)
├── pyproject.toml             # Project dependencies
├── README.md
└── .gitignore

🚀 Quick Start

Prerequisites

Python 3.12+
uv (recommended) or pip
Working microphone
Internet connection (for speech recognition & AI)

Installation

Clone the repository
```
git clone https://github.com/MYounus-Codes/voice-ai-agent.git
cd voice-ai-agent
```

Install dependencies with uv
```
uv sync
```
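Or with pip (the project declares its dependencies in pyproject.toml, so install the packages listed under Dependencies directly):
```
pip install openai-agents SpeechRecognition translate pyttsx3 pyaudio python-dotenv
```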
Set up environment variables

Create a .env file in the src/ directory:
```
OPENROUTER_API_KEY=your_openrouter_api_key_here
```
Get your free API key from OpenRouter

Running the Assistant

```
uv run src/ai_agents/all_agents.py
```

When you see `Listening for voice input... Speak clearly.`, start speaking!

🔧 Configuration

Changing the AI Model

Edit src/ai_agents/all_agents.py to use a different model:

```
model = OpenAIChatCompletionsModel(
    model="xiaomi/mimo-v2-flash:free",  # Change to any OpenRouter model
    openai_client=external_client
)
```
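If you switch models often, one option is to read the model ID from an environment variable instead of editing the source each time. This is a minimal sketch under stated assumptions: `OPENROUTER_MODEL` and `resolve_model_name` are hypothetical names introduced here for illustration and are not part of the project.

```python
import os
from typing import Mapping, Optional

DEFAULT_MODEL = "xiaomi/mimo-v2-flash:free"  # the model named in this README


def resolve_model_name(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the model ID from OPENROUTER_MODEL (hypothetical variable), else the default."""
    source = os.environ if env is None else env
    name = source.get("OPENROUTER_MODEL", "").strip()
    # An unset or blank variable falls back to the default model.
    return name or DEFAULT_MODEL


if __name__ == "__main__":
    print(resolve_model_name())
```

The returned string could then be passed as the `model=` argument shown above.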

Browse available models at OpenRouter Models

Customizing Voice Output

Edit src/agents_tools/text_to_speech.py to adjust:

```
# Speech rate (words per minute)
engine.setProperty('rate', 150)

# Volume (0.0 to 1.0)
engine.setProperty('volume', 1.0)

# Voice selection (male/female depends on system)
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)  # Try different indices
```
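The number of installed voices varies by system, so a fixed index such as `voices[1]` can raise an `IndexError` on a machine with a single voice. The helper below is a hedged sketch of a safer selection: `pick_voice_id` is a hypothetical function, not part of pyttsx3 or this project, and it only assumes that each voice object has an `id` attribute, as pyttsx3 voices do.

```python
from types import SimpleNamespace
from typing import Any, Sequence


def pick_voice_id(voices: Sequence[Any], preferred_index: int = 1) -> str:
    """Return voices[preferred_index].id, falling back to the first voice."""
    if not voices:
        raise ValueError("no text-to-speech voices are installed")
    if 0 <= preferred_index < len(voices):
        return voices[preferred_index].id
    # Preferred index is out of range: use the first available voice.
    return voices[0].id


if __name__ == "__main__":
    # Fake voice list standing in for engine.getProperty('voices').
    fake_voices = [SimpleNamespace(id="voice.a")]
    print(pick_voice_id(fake_voices, preferred_index=1))
```

In the project it might be used as `engine.setProperty('voice', pick_voice_id(voices, 1))`.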

📦 Dependencies

| Package | Purpose |
| --- | --- |
| `openai-agents` | AI agent framework with OpenAI-compatible API |
| `speechrecognition` | Voice capture & Google Speech-to-Text |
| `translate` | Automatic language translation |
| `pyttsx3` | Offline text-to-speech synthesis |
| `pyaudio` | Audio I/O for microphone access |
| `python-dotenv` | Environment variable management |

🔍 How It Works

Voice Capture — The microphone listens for your voice input
Speech Recognition — Google's Speech-to-Text API converts audio to text
Translation — If the text isn't in English, it's automatically translated
AI Processing — The translated text is sent to an AI agent via OpenRouter
Voice Response — The AI's response is spoken aloud using pyttsx3
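The steps above can be sketched as a chain of awaitable stages. This is an illustration of the flow, not the project's implementation: `VoicePipeline` and the stub stages in `demo` are hypothetical, and the stubs replace the real microphone, translator, OpenRouter agent, and speaker so the example runs anywhere.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Tuple


@dataclass
class VoicePipeline:
    """Hypothetical container wiring the stages together in order."""
    listen: Callable[[], Awaitable[str]]
    translate: Callable[[str], Awaitable[str]]
    ask_agent: Callable[[str], Awaitable[str]]
    speak: Callable[[str], Awaitable[None]]

    async def run_once(self) -> str:
        heard = await self.listen()            # voice capture + speech recognition
        english = await self.translate(heard)  # translation to English
        reply = await self.ask_agent(english)  # AI processing
        await self.speak(reply)                # voice response
        return reply


async def demo() -> Tuple[str, List[str]]:
    spoken: List[str] = []

    async def listen() -> str:
        return "hola"  # pretend the user spoke Spanish

    async def translate(text: str) -> str:
        return {"hola": "hello"}.get(text, text)

    async def ask_agent(text: str) -> str:
        return f"You said: {text}"

    async def speak(text: str) -> None:
        spoken.append(text)  # record instead of playing audio

    pipeline = VoicePipeline(listen, translate, ask_agent, speak)
    reply = await pipeline.run_once()
    return reply, spoken


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Passing the stages in as callables keeps each one replaceable, which also makes the flow testable without audio hardware or network access.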

🐛 Troubleshooting

"Could not understand the audio"

Speak clearly and closer to the microphone
Reduce background noise
Check if your microphone is working
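In the SpeechRecognition library, `recognize_google` raises `UnknownValueError` when the audio is unintelligible, so a simple retry loop is one way to recover. The sketch below assumes that pattern but uses a stand-in exception, `UnintelligibleSpeech`, so it runs without a microphone; `listen_with_retries` is a hypothetical helper and not part of this project.

```python
from typing import Callable, Optional


class UnintelligibleSpeech(Exception):
    """Stand-in for speech_recognition.UnknownValueError in this sketch."""


def listen_with_retries(recognize: Callable[[], str], attempts: int = 3) -> Optional[str]:
    """Call recognize() up to `attempts` times; return None if every try fails."""
    for attempt in range(1, attempts + 1):
        try:
            return recognize()
        except UnintelligibleSpeech:
            print(f"Attempt {attempt}/{attempts}: could not understand the audio.")
    return None


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_recognizer() -> str:
        # Fails twice, then succeeds, to simulate a noisy room.
        calls["count"] += 1
        if calls["count"] < 3:
            raise UnintelligibleSpeech()
        return "hello"

    print(listen_with_retries(flaky_recognizer))
```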

"401 - User not found" Error

Verify your OpenRouter API key is valid
Generate a new key at openrouter.ai/keys
Ensure no quotes around the key in .env
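The checklist above can be turned into a small diagnostic that inspects the contents of `.env` before the assistant starts. This is a stdlib-only sketch and not the project's loading code (which uses python-dotenv): `read_env_value` and `key_problems` are hypothetical helpers, and the parser handles only simple `NAME=value` lines.

```python
from typing import List, Optional

PLACEHOLDER = "your_openrouter_api_key_here"  # placeholder text from this README


def read_env_value(env_text: str, name: str) -> Optional[str]:
    """Return the raw value assigned to `name` in .env-style text, or None."""
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip()
    return None


def key_problems(value: Optional[str]) -> List[str]:
    """List likely causes of a 401 error for the given raw key value."""
    if not value:
        return ["key is missing or empty"]
    problems: List[str] = []
    if value[0] in "\"'" or value[-1] in "\"'":
        problems.append("key is wrapped in quotes")
    if value == PLACEHOLDER:
        problems.append("key is still the placeholder")
    return problems


if __name__ == "__main__":
    sample = 'OPENROUTER_API_KEY="abc123"\n'
    print(key_problems(read_env_value(sample, "OPENROUTER_API_KEY")))
```

An empty list means none of these checks fired; the key may still be invalid or revoked on the OpenRouter side.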

"No module named 'pyaudio'"

On Windows:
```
pip install pipwin
pipwin install pyaudio
```
On Linux:
```
sudo apt-get install portaudio19-dev
pip install pyaudio
```
On macOS:
```
brew install portaudio
pip install pyaudio
```

🤝 Contributing

Contributions are welcome! Feel free to:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

OpenRouter for providing access to various LLM models
OpenAI Agents SDK for the agent framework
SpeechRecognition for voice capture capabilities

Made with ❤️ by MYounus-Codes
