Speechmatics
last release: October 11, 2024
powered by: Ursa 2
goblin vibe check:
worth a look if you need speech-to-text and text-to-speech from one invoice and accuracy actually matters to you
enterprise speech stack powered by proprietary Ursa engines, trained with self-supervised learning across a broad range of accents, dialects, and domains.
speed
<1s
realtime
key features
Ursa 2 powers the full product line
18% lower WER than original Ursa across 50+ languages
Real-time transcription under one second
Medical model tuned on clinical language
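For readers unfamiliar with the metric, here is a minimal sketch of how word error rate (WER) is commonly computed: word-level edit distance divided by the number of reference words. It also shows what an 18% drop looks like if it is read as a relative reduction, which is an assumption; the listing does not say whether the figure is relative or absolute. The function names and the sample WER values (0.10 and 0.082) are hypothetical and are not Speechmatics benchmark numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Fractional improvement of new_wer over old_wer."""
    return (old_wer - new_wer) / old_wer


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
    # Illustrative values only: 10% WER falling to 8.2% WER.
    print(relative_reduction(0.10, 0.082))
```

Under the relative reading, a system that previously scored 10% WER would land near 8.2%, not at a WER 18 percentage points lower.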
spec & usage
Uses self-supervised learning and transformer-based acoustic models trained on millions of hours of unlabeled audio
Scales to roughly 2B parameters with NVIDIA-accelerated inference for high-throughput transcription
Maps acoustic representations into phoneme probabilities before an LLM handles sequence prediction
scope:
audio, voice, api, cloud, paid, real-time, benchmark-strong