Local Speech-to-Speech AI Assistant Technologies and Recommendations
By
dsrtslnd23
Summary
The article discusses local/open speech-to-speech setups for AI assistants, focusing on technologies that run entirely locally in browsers without cloud dependencies. The author shares their experience building a local assistant using web-first technologies that fits small language models in memory and handles speech-to-text and text-to-speech without stuttering. Recommendations include vosk-browser for speech recognition, vits-web for text-to-speech, and KittenTTS for its size/performance ratio, though the latter requires custom JavaScript integration since it's a Python project.
Key quotes
I have a great local assistant that works end-to-end with voice. It's built on local, web-first technologies, it fits small LLMs in memory and manages inference and TTS/STT without stuttering.
If you want something simple that runs in browser, look at vosk-browser[0] and vits-web[1].
I'd also recommend checking out KittenTTS[2], I use it and it's great for the size/performance.
However, you'd need to implement a custom JavaScript harness for the model since it's a python project.
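The quotes describe a voice loop made of three stages: speech-to-text, a small LLM, and text-to-speech. A minimal sketch of how such a loop might be wired together is shown here. Every name in it (SpeechToText, LanguageModel, TextToSpeech, runTurn) is hypothetical and chosen for illustration; none of them is the actual API of vosk-browser, vits-web, or KittenTTS, and the author's own implementation is not described in the source.

```typescript
// Hypothetical interfaces for the three stages of a local voice assistant.
// A real adapter would wrap a concrete engine behind each of these; the
// shapes here are assumptions, not any library's API.
interface SpeechToText {
  transcribe(audio: Float32Array, sampleRate: number): Promise<string>;
}

interface LanguageModel {
  complete(prompt: string): Promise<string>;
}

interface TextToSpeech {
  synthesize(text: string): Promise<Float32Array>;
}

interface TurnResult {
  heard: string;        // transcript of the user's audio
  reply: string;        // text produced by the language model
  speech: Float32Array; // synthesized audio samples for the reply
}

// Runs one conversational turn: audio in, audio out.
// If nothing was recognized, the model and synthesizer are skipped.
async function runTurn(
  stt: SpeechToText,
  llm: LanguageModel,
  tts: TextToSpeech,
  audio: Float32Array,
  sampleRate: number
): Promise<TurnResult> {
  const heard = (await stt.transcribe(audio, sampleRate)).trim();
  if (heard.length === 0) {
    return { heard, reply: "", speech: new Float32Array(0) };
  }
  const reply = await llm.complete(heard);
  const speech = await tts.synthesize(reply);
  return { heard, reply, speech };
}
```

Keeping each stage behind its own interface means a recognizer or synthesizer can be swapped without touching the loop, which is also where a custom JavaScript harness for a Python-based model would plug in.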