Add FluidAudio to Speech Processing section #60

Open
Alex-Wengg wants to merge 1 commit into likedan:master from Alex-Wengg:add-fluidaudio

Conversation

@Alex-Wengg

Summary

Adds FluidAudio to the Speech Processing section.

FluidAudio is a Swift SDK that brings frontier audio AI models to Apple devices through Core ML integration. It provides:

  • Automatic Speech Recognition (ASR) - Using NVIDIA's Parakeet TDT model, supporting 25 European languages
  • Speaker Diarization - Streaming and offline modes for identifying multiple speakers
  • Voice Activity Detection (VAD) - Using Silero models
  • Text-to-Speech (TTS) - Using the Kokoro model

All models run fully on-device using the Apple Neural Engine (ANE) for low-latency, privacy-preserving audio AI.
