Polish-founded ElevenLabs releases its own AI speech recognition model Scribe

01 March, 2025, 14:20

ElevenLabs, a Polish-founded AI company, has introduced Scribe, its first model for converting audio and video to text. The company claims it is "the world's most accurate speech to text model." Scribe is available both through the API for developers and directly through the control panel.

The tool is intended for subtitling, content analysis, and other applications.

The company says the model outperformed Google Gemini 2.0 Flash and Whisper Large V3 on the FLEURS and Common Voice benchmarks across many languages.

Scribe supports more than 99 languages, including most European languages, and the company claims high accuracy for some of them, such as Italian (98.7%) and English (96.7%).

Scribe tool demonstration. Image: ElevenLabs

In addition, the company says Scribe significantly improves automatic recognition for less widely supported languages such as Serbian, Cantonese, and Malayalam, where other models often have error rates above 40%.
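
For readers unfamiliar with how such error figures are computed: speech recognition quality is commonly reported as word error rate (WER), the number of word substitutions, deletions, and insertions needed to turn the model's output into the reference transcript, divided by the number of reference words. The article does not say which metric ElevenLabs used, so treating the 40% figure as WER is an assumption. A minimal sketch of the calculation, with a hypothetical function name and made-up example sentences:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programme).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word and one dropped word
# against a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {wer:.1%}")
```

Under this definition, a 40% error rate means roughly four word-level mistakes for every ten words spoken. Real evaluations usually normalise case and punctuation first, which this sketch does not do.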

Developers can integrate Scribe through the API and receive transcriptions as structured JSON. Users can already upload audio and video files through the ElevenLabs platform.
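
To illustrate why structured JSON output matters for a use case like subtitling, here is a minimal sketch that turns a word-level transcript into SubRip (SRT) subtitle cues. The JSON shape used here (a `words` list with `text`, `start`, and `end` in seconds) is an assumption made for illustration, not a confirmed description of Scribe's response format, and the function names are hypothetical.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group timed words into numbered SRT cues of at most max_words each."""
    cues = []
    for number, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])
        end = srt_timestamp(chunk[-1]["end"])
        text = " ".join(w["text"] for w in chunk)
        cues.append(f"{number}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Hypothetical transcript payload; the field names are assumptions.
SAMPLE = """
{
  "text": "Scribe turns speech into text",
  "words": [
    {"text": "Scribe", "start": 0.0, "end": 0.4},
    {"text": "turns",  "start": 0.5, "end": 0.8},
    {"text": "speech", "start": 0.9, "end": 1.3},
    {"text": "into",   "start": 1.4, "end": 1.6},
    {"text": "text",   "start": 1.7, "end": 2.1}
  ]
}
"""

transcript = json.loads(SAMPLE)
print(words_to_srt(transcript["words"], max_words=3))
```

Grouping by a fixed word count keeps the sketch short; a production subtitler would more likely split cues on pauses or punctuation.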

AIN notes that ElevenLabs recently raised $180 million in investment at a $3.3 billion valuation.
