Velma Transcribe by Modulate vs. Deepgram

A head-to-head comparison between Velma Transcribe, Modulate’s speech-to-text API, and Deepgram’s speech recognition API.

Why Teams Choose Velma Transcribe Over Deepgram

Accuracy Without Multi-Week Tuning

Velma Transcribe gets accuracy right from the start, with no need for weeks of tuning to get reliable results. Built on more than 500 million hours of real-world conversations, it just works.

Costs Up to 90% Less with Simple Pricing

Save up to 90% compared with competitors. Pricing is simple and usage-based, and you get 1,500+ hours in free credits. You’ll know exactly what you’re paying for – no tricky conversions or unexpected fees.

Clean Output for Downstream AI

Velma Transcribe is engineered to provide better output for your downstream AI tools. Because it focuses on natural speech and not just clean text, you can expect better summaries, analytics, and more.

Stable Performance Across Long Conversations

Velma Transcribe handles long recordings, such as meetings or conversations with multiple speakers, without compromising accuracy.

Transcription Benchmark (Accuracy vs. Price)
Average Word Error Rate (WER) across Earnings-22 and VoxPopuli datasets
[Chart: average word error rate (%) plotted against cost per 1,000 minutes of audio ($). Chart annotation: "Lowest WER, lowest cost". Models shown: modulate-velma-2, scribe-v2, gemini-2.5-pro, universal, speechmatics-enhanced, solaria-1, gpt-4o-transcribe, chirp-2, speechmatics-standard, whisper-large-v3, nova-3.]
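
For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The function name `word_error_rate` is hypothetical, and the whitespace tokenization is an assumption; the text normalization and averaging used for the benchmark above are not stated here, so this will not necessarily reproduce its figures.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    # Assumption: plain whitespace tokenization, no case or punctuation normalization.
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of four reference words.
    print(word_error_rate("the quick brown fox", "the quick fox"))
```

A lower value is better, and because insertions count as errors, the rate can exceed 100% when a transcript contains many extra words.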