Meta Launches Omnilingual ASR Supporting Over 1,600 Languages
By
jean-
The kind of bagel that ruins lesser bagels for you.
Summary
Meta has introduced Omnilingual Automatic Speech Recognition (ASR), a suite of models that provides speech recognition for over 1,600 languages. Most current ASR systems focus on a limited set of high-resource languages with abundant labeled data, which leaves speakers of less widely represented or low-resource languages without high-quality transcriptions and widens the digital divide. The suite aims to make spoken language universally accessible by transcribing speech into text that can be searched, analyzed, and shared.
Key quotes
Automatic speech recognition (ASR) systems aim to make spoken language universally accessible by transcribing speech into text that can be searched, analyzed, and shared.
Currently, most automatic speech recognition systems focus on a limited set of high-resource languages that are well represented on the internet, often relying on large amounts of labeled data and human-generated metadata to achieve good performance.
This means high-quality transcriptions are often unavailable for speakers of less widely represented or low-resource languages, furthering the digital divide.
We're introducing Meta Omnilingual Automatic Speech Recognition, a suite of models providing automatic speech recognition capabilities for over 1,600 languages.