Interaction Models: Native Real-Time Multimodal AI Collaboration
By Thinking Machines Lab
Summary
The article introduces "interaction models," a new approach to human-AI collaboration where AI systems handle interaction natively—continuously processing audio, video, and text in real time—rather than relying on external scaffolding or turn-based interfaces. These models are trained from scratch with a multi-stream, micro-turn design to ensure real-time responsiveness, aiming to make AI collaboration feel as natural as human-to-human interaction.
Key quotes
We think interactivity should scale alongside intelligence; the way we work with AI should not be treated as an afterthought.
Interaction models let people collaborate with AI the way we naturally collaborate with each other—they continuously take in audio, video, and text, and think, respond, and act in real time.
To ensure real-time responsiveness, we adopt a multi-stream, micro-turn design.
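The summary and quotes name a multi-stream, micro-turn design without giving implementation details. As a rough illustration only, the sketch below assumes that a "micro-turn" is a short, fixed-length time slice and that chunks from every stream falling in the same slice are processed together. The `Chunk` type, the `group_into_micro_turns` function, and the `turn_ms` parameter are hypothetical names introduced here; they do not come from the article and may not reflect how the actual models work.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One piece of input from a single stream (hypothetical type)."""
    stream: str    # e.g. "audio", "video", "text"
    start_ms: int  # timestamp at which the chunk begins
    payload: str   # stand-in for encoded tokens


def group_into_micro_turns(chunks, turn_ms):
    """Bucket chunks from all streams into consecutive time slices.

    Assumption: a micro-turn is a fixed-length slice of `turn_ms`
    milliseconds. Slices that contain no chunks are skipped.
    """
    if turn_ms <= 0:
        raise ValueError("turn_ms must be positive")
    buckets = defaultdict(list)
    for chunk in chunks:
        buckets[chunk.start_ms // turn_ms].append(chunk)
    # Emit slices in time order; inside a slice, sort by timestamp,
    # then by stream name so the ordering is deterministic.
    return [
        sorted(buckets[index], key=lambda c: (c.start_ms, c.stream))
        for index in sorted(buckets)
    ]


example = [
    Chunk("audio", 0, "a0"),
    Chunk("video", 50, "v0"),
    Chunk("audio", 200, "a1"),
    Chunk("text", 210, "t0"),
    Chunk("audio", 450, "a2"),
]
turns = group_into_micro_turns(example, turn_ms=200)
```

With `turn_ms=200`, the five example chunks fall into three slices, so audio, video, and text that arrive close together are handled in the same step instead of waiting for a full conversational turn to end.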