fix: launch-readiness pass — bug fixes, hardening, CI, and tests by changhsiuwei · Pull Request #2 · sancliffe/ollama-STT-TTS

changhsiuwei · 2026-05-04T02:16:52Z

Summary

A four-commit launch-readiness pass on the codebase:

9653a81 P0 bug fixes — broken tests, default config path mismatch, Docker build, exit-on-voice-command
d0a9f7d P1 hardening — dynamic TTS sample rate, transcriber thread leak fix, thread-safe retry, LLM rollback safety, extracted WakewordDetector with 7 unit tests, dependency upper bounds
18229e9 polish — CI workflow (ruff + pytest on Python 3.10/3.11/3.12), --version flag, README troubleshooting, repo cleanup
000fca7 test repairs — fixed four flaky tests uncovered by full pytest run

Test plan

python -m pytest — 25 passed, 4 skipped (Ollama integration), 0 failed
ruff check src tests — all checks pass
python -m py_compile — all source files compile
Manual end-to-end voice pipeline test (requires hardware + Ollama)

- tests/test_transcriber.py: fix `bad_logprob_.pysegment` typo that made the entire test suite NameError on import. - audio_utils.py: replace `\n` literals in device-listing output with real newlines so `--list-devices` is actually readable. - audio_utils.py: correct DEFAULT_SETTINGS wakeword path (`hey_jarvis_v2.onnx` -> `jarvis_v2.onnx`) to match the file shipped in models/, so the assistant can start without a config.ini. - Dockerfile: copy `src/` and `README.md` before `pip install .` so the package is actually installable in the image; move runtime assets (config + models) after the install layer. - voice_assistant.py: replace bare `exit(0)` on the spoken "exit" / "goodbye" command with `raise KeyboardInterrupt`, so the cleanup path in assistant.py's finally block runs and closes audio/TTS threads. - voice_assistant.py: raise a clear RuntimeError if openWakeWord loads zero models, instead of an opaque IndexError. - tests/test_synthesizer.py: stop relying on string-path patches against modules that were already substituted with MagicMock; patch synth_module.PiperVoice and synth_module.sd.OutputStream directly and drain the queue with queue.join() instead of time.sleep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ctor Quality fixes uncovered during the launch-readiness review. - synthesizer.py: query the configured output device's default sample rate at worker startup instead of hard-coding 48000Hz; fall back to the model's native rate if the device cannot be queried, so machines without 48kHz output stop crashing on TTS init. - transcriber.py: - Re-use a single ThreadPoolExecutor for the whole Transcriber lifetime instead of constructing one per call (was leaking workers on long sessions). - `transcribe()` now accepts explicit threshold args and no longer reads them from `self.args`, so the retry loop in voice_assistant.py can pass per-attempt thresholds without monkey-patching shared state. - close() shuts the executor down with cancel_futures. - voice_assistant.py: `_transcribe_with_retry` no longer mutates `self.args.whisper_*`; it computes per-attempt thresholds and passes them through. Eliminates a subtle race where an exception or early-return left thresholds permanently relaxed. - llm_handler.py: snapshot the message list before mutating it, and restore the snapshot on streaming failure. The previous "pop the last user message" rollback was unsafe because `_prune_history()` may already have discarded older turns. - audio_input.py: drop the orphan `task_done()` call. Nothing joins on the buffer queue and pairing them invites future deadlocks; replaced with a comment documenting the choice. - wakeword.py (new) + test_wakeword.py (new): extract the cooldown/consecutive/moving-average debouncing into a pure WakewordDetector class with 7 unit tests. The main loop in voice_assistant.py is now a thin adapter over this class and is far easier to reason about. - config_manager.py: only treat `system_prompt` as a file path when it *looks* like one (short, no newlines, .txt/.md/.prompt suffix or a path separator). Previously, prompts containing characters that sanitize_file_path() rejected (e.g. literal `..`) caused the assistant to refuse to start. - pyproject.toml: add upper bounds to every dependency so a future major release of numpy/scipy/etc. cannot silently break installs. Add ruff to the [test] extra. - tests/__init__.py: refresh the stale `app/`-era comments and add the `src/` directory to sys.path so modern import patterns work. - cleanup.py: stop trying to delete the long-removed `app/` directory and `requirements.txt`; clean `.ruff_cache` / `.mypy_cache` instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lower-priority improvements identified during the launch-readiness review. These are not bug fixes; they are productisation work. - .github/workflows/ci.yml (new): run ruff and pytest on Python 3.10/3.11/3.12 for every push and PR. Installs portaudio + ffmpeg so the native audio deps actually compile in CI. - pyproject.toml: - configure ruff (E/F/I/B/UP rules, line length 110, py310 target) with per-file-ignores for tests/* (stubbing sys.modules before imports is intentional). - configure pytest's testpaths. - add ruff to the [test] extra. - src/voice_assistant/__init__.py: expose `__version__ = "0.1.0"`. - config_manager.py: add `--version` to argparse, sourced from the package metadata. - README.md: add a Troubleshooting section covering portaudio, Ollama connectivity, mic permissions, wake-word tuning, clipping, Whisper accuracy, and the C-toolchain requirement for webrtcvad. Also document the new `--version` flag and the test command. - sample-conversation.txt -> docs/sample-conversation.log: move the 567-line debug capture out of the project root. - ruff sweep across the codebase: import sorting, replace `IOError` alias with `OSError`, `type == X` -> `type is X`, raise-from in except clauses, drop unused locals. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Running the full suite locally surfaced four pre-existing fragile spots in the tests that were not visible from static analysis. None of the production code is wrong; the tests were lying. - test_transcriber.py: Stop stuffing a MagicMock into `sys.modules['torch']` at import time. faster-whisper now pulls in transformers transitively, and transformers inspects `torch.__spec__` during import — a naive MagicMock fails that check and the entire test module fails to load. Switch to per-test `patch.object(transcriber_module, 'torch', ...)`. Also give the mock segments numeric `start`/`end` attributes so the new f-string segment log (`{segment.start:.2f}s-...`) doesn't blow up formatting MagicMock. - test_config_manager.py: The mocked `parse_args()` Namespace was missing every attribute the function reads after parse_args returns (wakeword_model_path, piper_model_path, …). It only worked previously because some of those attributes were never accessed in the code path. Build a complete Namespace from DEFAULT_SETTINGS so the test matches what argparse actually produces. - test_synthesizer.py: `_resolve_target_sample_rate` calls `sd.query_devices()` which in these tests returns a MagicMock; converting that to int yields a garbage rate that then triggers a divide-by-zero inside scipy's `resample`. Patch `_resolve_target_sample_rate` to return a known 16000Hz so the worker exercises the real write path. Also use real zero-filled int16 bytes (not `b''`) for test_interrupt so `stream.write` actually has something to write before the interrupt. - test_ollama_connection.py: Skip the entire integration class via `setUpClass` when the Ollama HTTP probe fails, instead of failing offline runs. Online runs still exercise the real assertions. Result: 25 passed, 4 skipped (Ollama integration), 0 failed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

For users who don't want to remember terminal commands every time, ship three click-to-run launchers in the project root: - start.bat The main launcher. cd's to the project directory, picks an interpreter (preferring a project-local .venv\ or venv\, falling back to `python` on PATH), pings http://localhost:11434/api/tags to warn early if Ollama is not running, then exec's `run.py`. The window pauses on exit so users can read errors instead of the window slamming shut. - start-debug.bat Same flow with --debug appended. For wake-word / VAD tuning. - list-devices.bat --list-devices + --list-output-devices, so users can copy the right indices into config.ini. README gains a "double-click" section explaining each launcher and how to pin one to the desktop via Send to -> Desktop (create shortcut). The pre-existing Python and Docker run sections are renamed B/C. We deliberately did NOT ship a frozen .exe (PyInstaller / Nuitka): the project pulls in faster-whisper, piper-tts, sounddevice and torch via ctranslate2, which makes single-file freezes 800MB+ and brittle on Windows. A .bat that drives the existing entry point is faster, easier to debug, and trivial to update. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

t3andy1025 and others added 5 commits May 4, 2026 09:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: launch-readiness pass — bug fixes, hardening, CI, and tests#2

fix: launch-readiness pass — bug fixes, hardening, CI, and tests#2
changhsiuwei wants to merge 5 commits intosancliffe:mainfrom
changhsiuwei:claude/peaceful-goldwasser-420c25

changhsiuwei commented May 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

changhsiuwei commented May 4, 2026

Summary

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants