Parlor: Open-Source On-Device Multimodal AI for Real-Time Voice and Vision Conversations
By karimf
Summary
Parlor is an open-source, on-device multimodal AI system for real-time voice and vision conversations that run entirely on the user's machine. It uses Gemma 4 E2B to understand speech and vision and Kokoro for text-to-speech, so users can talk to the AI and show it their camera without any cloud dependency. The project is a research preview, and its author warns of rough edges and bugs. The developer also self-hosts a free voice AI for practicing spoken English, which has hundreds of monthly active users.
Key quotes
On-device, real-time multimodal AI. Have natural voice and vision conversations with an AI that runs entirely on your machine.
Parlor uses Gemma 4 E2B for understanding speech and vision, and Kokoro for text-to-speech. You talk, show your camera, and it talks back, all locally.
Research preview. This is an early experiment. Expect rough edges and bugs.
I'm self-hosting a totally free voice AI on my home server to help people learn speaking English. It has hundreds of monthly active users.
