Inworld launches TTS-2 with cross-lingual synthesis and natural language voice control

Aleksey Tikhonov

1mo ago · 3 min read · en · Product

Score: 85 · Type: press release · Sentiment: positive

Summary

Inworld has announced TTS-2, the successor to TTS 1.5, its text-to-speech model ranked #1 on Artificial Analysis. TTS-2 adds six major upgrades, including natural language voice direction (tone, emotion, speed, and pitch), text-based voice design, cross-lingual synthesis across 100+ languages that preserves speaker identity, IPA phonetic control, and improved pronunciation. Inworld also offers a unified API platform that combines speech-to-text, an LLM router, and TTS for developers building voice agents, AI companions, and conversational applications.
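
To make the "one platform" idea concrete, here is a minimal local sketch of a voice turn in which a single context object travels through speech-to-text, model routing, and speech synthesis. It is not Inworld's API: every name here (TurnContext, transcribe, route_and_reply, synthesize, handle_turn) is hypothetical, and each stage is a stub standing in for a real service.

```python
from dataclasses import dataclass, field


@dataclass
class TurnContext:
    """Shared state handed from layer to layer (hypothetical shape)."""
    language: str = "en"
    voice_direction: str = ""  # free-text direction, e.g. "calm, slightly slower"
    history: list[str] = field(default_factory=list)


def transcribe(audio: bytes, ctx: TurnContext) -> str:
    # Stand-in for speech-to-text: the "audio" here is just UTF-8 text.
    text = audio.decode("utf-8")
    ctx.history.append(f"user: {text}")
    return text


def route_and_reply(text: str, ctx: TurnContext) -> str:
    # Stand-in for an LLM router: pick a model label by input length.
    model = "small-model" if len(text) < 40 else "large-model"
    reply = f"[{model}] You said: {text}"
    ctx.history.append(f"agent: {reply}")
    return reply


def synthesize(reply: str, ctx: TurnContext) -> dict:
    # Stand-in for text-to-speech: return the request a real engine would get.
    return {
        "text": reply,
        "language": ctx.language,
        "direction": ctx.voice_direction,
    }


def handle_turn(audio: bytes, ctx: TurnContext) -> dict:
    # The same ctx object is passed to every layer, so nothing is re-sent.
    return synthesize(route_and_reply(transcribe(audio, ctx), ctx), ctx)
```

Calling handle_turn(b"hola", TurnContext(language="es", voice_direction="warm and unhurried")) produces a synthesis request that carries the language and direction set once at the start of the turn, which is the property the single-API claim describes.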

Key quotes · 5 pulled

Realtime TTS 1.5 is #1 on Artificial Analysis, voted best in blind tests by thousands of real users.

TTS-2 builds on that with six major upgrades: natural language voice direction for tone, emotion, speed, and pitch.

Cross-lingual synthesis across 100+ languages preserving speaker identity.

One platform with speech-to-text, an LLM router, and the top-ranked text-to-speech, all connected on a single API so context flows between every layer.

Used by developers building voice agents, AI companions, and conversational apps.

Snippet from the RSS feed

Inworld builds the infrastructure for production voice AI. One platform with speech-to-text, an LLM router, and the top-ranked text-to-speech, all connected on a single API so context flows between every layer. Used by developers building voice agents, AI
