Speech-to-text and text-to-speech plugin for OpenCode.
Record voice prompts with local whisper transcription and hear assistant responses spoken aloud via Piper TTS. Both directions use an LLM to normalize text for natural speech (fixing homophones, splitting camelCase identifiers, summarizing code-heavy responses, etc.).
Add to your tui.json (create at ~/.config/opencode/tui.json if it doesn't exist).
You must configure at least endpoint and model:
{
"$schema": "https://opencode.ai/tui.json",
"plugin": [
[
"@renjfk/opencode-voice",
{
"endpoint": "https://api.anthropic.com/v1",
"model": "claude-haiku-4-5",
"apiKeyEnv": "ANTHROPIC_API_KEY"
}
]
]
}

Install whisper-cpp and sox with Homebrew:

brew install whisper-cpp sox

Download a whisper model to ~/.local/share/whisper-cpp/:
mkdir -p ~/.local/share/whisper-cpp
curl -L -o ~/.local/share/whisper-cpp/ggml-large-v3-turbo-q5_0.bin \
https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo-q5_0.bin

Install Piper:

uv tool install piper-tts

Or with pip:

pip install piper-tts

Download a voice model to ~/.local/share/piper-voices/:
mkdir -p ~/.local/share/piper-voices
curl -L -o ~/.local/share/piper-voices/en_US-ryan-high.onnx \
https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/ryan/high/en_US-ryan-high.onnx
curl -L -o ~/.local/share/piper-voices/en_US-ryan-high.onnx.json \
https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/ryan/high/en_US-ryan-high.onnx.json

An OpenAI-compatible LLM endpoint is required for text normalization. For speech-to-text it cleans up whisper output (punctuation, filler words, software engineering homophones). For text-to-speech it converts markdown into natural spoken text.
Configure your endpoint in tui.json via plugin options. Any OpenAI-compatible
endpoint works (Anthropic, OpenAI, Ollama, vLLM, LM Studio, etc.). The apiKeyEnv
option is optional - omit it for unauthenticated endpoints like Ollama.
{
"plugin": [
[
"@renjfk/opencode-voice",
{
"endpoint": "https://api.anthropic.com/v1",
"model": "claude-haiku-4-5",
"apiKeyEnv": "ANTHROPIC_API_KEY"
}
]
]
}

For unauthenticated local endpoints (e.g. Ollama):
{
"plugin": [
[
"@renjfk/opencode-voice",
{
"endpoint": "http://localhost:11434/v1",
"model": "llama3.2"
}
]
]
}

- endpoint (required) - OpenAI-compatible base URL
- model (required) - model name sent to /chat/completions
- apiKeyEnv (optional) - environment variable containing the API key
- maxTokens (optional) - maximum completion tokens for normalization calls
- reasoningEffort (optional) - reasoning level for models that support it
- chatTemplateKwargs (optional) - extra keyword arguments passed to the model's chat template (e.g. {"enable_thinking": false} for Qwen models to disable chain-of-thought)
- retries (optional) - number of retry attempts for transient LLM failures

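To make the options above concrete, here is a minimal sketch of how they could be turned into an OpenAI-style chat completion request. It is not the plugin's actual code: `LlmOptions`, `ChatRequest`, and `buildChatRequest` are hypothetical names, and the snake_case body fields (`max_tokens`, `reasoning_effort`, `chat_template_kwargs`) are an assumption about how the options are forwarded. The environment is passed in as a plain object so the function stays free of side effects.

```typescript
// Hypothetical option shape mirroring the plugin options listed above.
interface LlmOptions {
  endpoint: string;
  model: string;
  apiKeyEnv?: string;
  maxTokens?: number;
  reasoningEffort?: string;
  chatTemplateKwargs?: Record<string, unknown>;
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: Record<string, unknown>;
}

// Builds the URL, headers, and JSON body for a normalization call.
function buildChatRequest(
  options: LlmOptions,
  systemPrompt: string,
  text: string,
  env: Record<string, string | undefined>,
): ChatRequest {
  if (!options.endpoint || !options.model) {
    throw new Error("endpoint and model are required");
  }

  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // apiKeyEnv is optional: no Authorization header is sent when it is unset
  // or when the named variable is empty (e.g. a local Ollama endpoint).
  const apiKey = options.apiKeyEnv ? env[options.apiKeyEnv] : undefined;
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;

  const body: Record<string, unknown> = {
    model: options.model,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: text },
    ],
  };
  // Assumed field names; only set when the option is configured.
  if (options.maxTokens !== undefined) body["max_tokens"] = options.maxTokens;
  if (options.reasoningEffort !== undefined) body["reasoning_effort"] = options.reasoningEffort;
  if (options.chatTemplateKwargs !== undefined) body["chat_template_kwargs"] = options.chatTemplateKwargs;

  // Strip trailing slashes so "…/v1/" and "…/v1" produce the same URL.
  const url = options.endpoint.replace(/\/+$/, "") + "/chat/completions";
  return { url, headers, body };
}
```

With the Ollama example, `buildChatRequest({ endpoint: "http://localhost:11434/v1", model: "llama3.2" }, prompt, text, {})` yields a request to `http://localhost:11434/v1/chat/completions` with no Authorization header.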
Instead of local whisper-cli, you can use an OpenAI-compatible speech-to-text
API (e.g. serving a Whisper model). This is useful when you want to run the
plugin on a machine without whisper-cpp installed.
{
"plugin": [
[
"@renjfk/opencode-voice",
{
"sttEndpoint": "http://127.0.0.1:8000/v1",
"sttModel": "whisper-large-v3-turbo",
"sttApiKeyEnv": "MY_STT_API_KEY"
}
]
]
}

- sttEndpoint (optional) - OpenAI-compatible base URL with /audio/transcriptions support
- sttModel (optional) - whisper model name to pass to the API (default: whisper-large-v3-turbo). Can be changed at runtime via /stt-model, which fetches available whisper models from the endpoint's /models listing
- sttApiKeyEnv (optional) - environment variable containing the API key

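The choice between local whisper-cli and a remote speech-to-text API can be sketched as a small decision function. This is an illustration under assumptions, not the plugin's implementation: `SttOptions`, `SttBackend`, and `chooseSttBackend` are hypothetical names. The default model name is the one documented above.

```typescript
// Hypothetical option shape mirroring the STT options listed above.
interface SttOptions {
  sttEndpoint?: string;
  sttModel?: string;
  sttApiKeyEnv?: string;
}

type SttBackend =
  | { kind: "local" } // transcribe with whisper-cli on this machine
  | { kind: "remote"; url: string; model: string };

// Default taken from the option description above.
const DEFAULT_STT_MODEL = "whisper-large-v3-turbo";

function chooseSttBackend(options: SttOptions): SttBackend {
  // No sttEndpoint configured: fall back to local whisper-cli.
  if (!options.sttEndpoint) return { kind: "local" };
  return {
    kind: "remote",
    // Strip trailing slashes before appending the transcription path.
    url: options.sttEndpoint.replace(/\/+$/, "") + "/audio/transcriptions",
    model: options.sttModel ?? DEFAULT_STT_MODEL,
  };
}
```

Modeling the result as a tagged union keeps the two paths explicit, so calling code has to handle both the local and the remote case.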
The LLM system prompts used for normalization can be fully replaced by pointing to your own prompt files. This lets you fine-tune how transcriptions are cleaned up or how responses are spoken.
{
"plugin": [
[
"@renjfk/opencode-voice",
{
"sttPrompt": "~/.config/opencode/stt-prompt.md",
"ttsAutoPrompt": "~/.config/opencode/tts-auto-prompt.md",
"ttsManualPrompt": "~/.config/opencode/tts-manual-prompt.md"
}
]
]
}

- sttPrompt (optional) - system prompt for cleaning up whisper transcriptions
- ttsAutoPrompt (optional) - system prompt for auto-speaking assistant responses
- ttsManualPrompt (optional) - system prompt for manually reading responses aloud

If a path is not set, the built-in default prompt is used.
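The prompt lookup described above (custom file if configured, built-in default otherwise) could look roughly like the following. This is a sketch only: `expandHome` and `resolvePrompt` are hypothetical helpers, the file reader is injected so the example runs without touching the filesystem, and what the plugin does when a configured file is missing is not specified here (this sketch simply lets the reader's error propagate).

```typescript
// Expands a leading "~" using the given home directory.
function expandHome(path: string, home: string): string {
  if (path === "~") return home;
  return path.startsWith("~/") ? home + path.slice(1) : path;
}

// Returns the custom prompt text when a path is configured,
// otherwise the built-in default prompt.
function resolvePrompt(
  configuredPath: string | undefined,
  builtinPrompt: string,
  home: string,
  readFile: (path: string) => string,
): string {
  if (!configuredPath) return builtinPrompt;
  return readFile(expandHome(configuredPath, home));
}
```

In a real plugin, `readFile` would typically wrap a synchronous or asynchronous filesystem read, and `home` would come from the operating system.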

| Command | Keybind | Description |
|---|---|---|
| /stt-record | ctrl+r | Start/stop recording + transcribe |
| /stt-stop | | Cancel recording |
| /stt-model | | Select whisper model |
| /stt-mic | | Select microphone |

The leader key in OpenCode is ctrl+x. So leader+s means press ctrl+x
then s.

| Command | Keybind | Description |
|---|---|---|
| /tts-speak | leader+s | Read last response aloud |
| /tts-mode | leader+v | Toggle auto TTS on/off |
| /tts-stop | escape | Stop playback |
| /tts-voice | | Select TTS voice |

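The leader notation used for these keybinds expands to a two-step key sequence. A tiny sketch, where `expandKeybind` is a hypothetical helper for illustration and not part of OpenCode:

```typescript
// Leader key as stated above.
const LEADER = "ctrl+x";

// Expands "leader+<key>" into the sequence of key presses it stands for.
// Keybinds without the leader prefix are returned unchanged.
function expandKeybind(keybind: string, leader: string = LEADER): string[] {
  const prefix = "leader+";
  return keybind.startsWith(prefix)
    ? [leader, keybind.slice(prefix.length)]
    : [keybind];
}

// expandKeybind("leader+s") returns ["ctrl+x", "s"]
// expandKeybind("escape")   returns ["escape"]
```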
- sox records audio from your microphone
- whisper-cli transcribes locally using a ggml model, or an OpenAI-compatible API endpoint if sttEndpoint is configured
- LLM normalizes the transcription: fixes punctuation, removes filler words, corrects software engineering homophones ("Jason" to "JSON", "bullion" to "boolean", etc.)
- Cleaned text is appended to the OpenCode prompt. If normalization fails (e.g. LLM endpoint unreachable), the raw transcription is used as a fallback so you never lose your input
- When the assistant finishes responding (or on manual trigger), the response text is sent to the LLM for speech normalization
- The LLM decides how to handle it: narrate simple answers, summarize code-heavy responses, or briefly notify for confirmations
- Piper synthesizes speech locally, piped through sox for playback
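The fallback in the speech-to-text steps above (use the raw transcription when normalization fails) might look roughly like this. It is a sketch, not the plugin's actual code: `normalizeOrFallback` and the `normalize` callback are hypothetical names, and retrying on every error rather than only on transient ones is a simplification.

```typescript
// normalize stands in for the LLM call; it is a hypothetical callback.
async function normalizeOrFallback(
  raw: string,
  normalize: (text: string) => Promise<string>,
  retries: number = 0,
): Promise<string> {
  // One initial attempt plus up to `retries` further attempts.
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await normalize(raw);
    } catch {
      // Ignore the failure and try again; after the last attempt
      // the loop ends and the raw text is returned below.
    }
  }
  // Normalization never succeeded: keep the user's input as-is.
  return raw;
}
```

Returning the raw text instead of throwing means a failed LLM call degrades the quality of the prompt text but never discards what was said.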
When enabled (/tts-mode), the plugin automatically speaks:
- Assistant responses when a session goes idle after work
- Permission requests
- Questions that need your answer
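The auto-speak rules above can be read as a simple predicate over incoming events. The event names and shapes below are hypothetical and chosen only for illustration; the real OpenCode event types may differ.

```typescript
// Hypothetical event shapes; not OpenCode's actual event types.
type PluginEvent =
  | { type: "session-idle"; assistantResponded: boolean }
  | { type: "permission-request" }
  | { type: "question" }
  | { type: "other" };

// Decides whether an event should be spoken when auto TTS is toggled on.
function shouldAutoSpeak(autoTtsEnabled: boolean, event: PluginEvent): boolean {
  if (!autoTtsEnabled) return false;
  switch (event.type) {
    case "session-idle":
      // Only speak if the session went idle after the assistant did work.
      return event.assistantResponded;
    case "permission-request":
    case "question":
      return true;
    default:
      return false;
  }
}
```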
opencode-voice is open to contributions and ideas!
Format: type: brief description
- feat: new features or functionality
- fix: bug fixes
- enhance: improvements to existing features
- chore: maintenance tasks, dependencies, cleanup
- docs: documentation updates
- build: build system, CI/CD changes

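A commit subject in the format above can be checked with a short regular expression. This is an illustrative sketch, not a hook shipped with the project: `isValidCommitMessage` is a hypothetical helper, and requiring exactly one space after the colon is an assumption.

```typescript
// Commit types listed above.
const COMMIT_TYPES = ["feat", "fix", "enhance", "chore", "docs", "build"] as const;

// "<type>: <description>", with a non-empty description on a single line.
const COMMIT_PATTERN = new RegExp(`^(${COMMIT_TYPES.join("|")}): \\S.*$`);

function isValidCommitMessage(subject: string): boolean {
  return COMMIT_PATTERN.test(subject);
}

// isValidCommitMessage("feat: add microphone picker") returns true
// isValidCommitMessage("feature: add microphone picker") returns false
```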
npm run check # lint + fmt
npm run lint # oxlint
npm run fmt # oxfmt --check
npm run fmt:fix # oxfmt --write

To test unpublished changes in the OpenCode TUI, point ~/.config/opencode/tui.json
at the local repo path, not the npm package name:
{
"$schema": "https://opencode.ai/tui.json",
"plugin": ["/Users/your-user/opencode-voice"]
}

Releases are done manually via opencode; see RELEASE_PROCESS.md.
This project is licensed under the MIT License.