Modulate Launches Velma Deepfake Detect: A Paradigm Shift in the Economics of Fraud Prevention

Boston, MA – March 31, 2026 – Modulate, the frontier conversational voice intelligence company, today announced the launch of Velma Deepfake Detect, a synthetic voice detection API that makes continuous, full-call monitoring economically viable at scale for the first time. Ranked #1 on the Hugging Face Speech Deepfake Arena leaderboard, Velma Deepfake Detect combines state-of-the-art accuracy with 578x lower cost vs. the next-best model, enabling detection of AI-generated audio across entire conversations in both batch and real-time streaming environments.

“Voice is one of the most vulnerable attack surfaces for modern enterprises,” said Mike Pappas, CEO and Cofounder of Modulate. “The problem isn’t just that synthetic audio is getting better; it’s that it’s incredibly cheap to create, while detection has historically been too expensive to deploy at scale. That’s left real gaps in how companies defend themselves. Velma Deepfake Detect changes that by creating true cost parity with the scammers who produce fraudulent voice deepfakes. It’s a paradigm shift that gives enterprises and developers a fraud prevention solution at the low cost required to catch the huge proliferation in deepfake fraud.”

The volume of synthetic content is growing at an unprecedented pace. AI-generated voice fraud increased by over 1,200% in 2025, costing organizations an average of $14 million annually, and that number keeps going up. The financial impact is significant, with individual incidents averaging over $500,000 in 2024. The result is a growing gap between how easily these attacks can be executed and how effectively they can be stopped.

Built using Modulate’s Ensemble Listening Model (ELM) architecture, Velma Deepfake Detect combines insights from short vocal tones and more complex rhythm or pronunciation patterns to deliver the precision and cost efficiency required for end-to-end, real-time detection of deepfake fraud across retail, banking, and IT helpdesks – any call center, content-sharing platform, or other audio-rich environment.

Reduce the Operational Costs of Deepfake Detection by up to 99%

With pricing starting at $0.25 per hour of audio, Velma Deepfake Detect is over 100x less expensive than competing solutions, making large-scale deployments across entire voice pipelines economically viable for the first time.
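
To make the economics concrete, the short sketch below estimates what monitoring every minute of every call would cost at the $0.25 per audio hour starting price. Only that price comes from this announcement; the call volume, the average call length, and the function name are illustrative assumptions, and actual pricing may vary by plan and volume.

```python
# Illustrative estimate only: the $0.25/hour starting price is from the
# announcement; the call volume and call length below are assumptions.
PRICE_PER_AUDIO_HOUR_USD = 0.25


def monthly_detection_cost(calls_per_month: int,
                           avg_call_minutes: float,
                           price_per_hour: float = PRICE_PER_AUDIO_HOUR_USD) -> float:
    """Cost of analyzing the full audio of every call at a flat hourly rate."""
    audio_hours = calls_per_month * avg_call_minutes / 60.0
    return audio_hours * price_per_hour


# Hypothetical contact center: 1,000,000 calls per month, 6 minutes each.
cost = monthly_detection_cost(1_000_000, 6.0)
print(f"${cost:,.2f}")  # 100,000 audio hours at $0.25 -> $25,000.00
```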

Pappas elaborates, “Historically, cost has shaped how deepfake detection is used in practice. When detection is expensive, organizations are forced to sample only a small portion of each interaction. But as fraud tactics evolve, those partial approaches leave exploitable blind spots. Velma changes the economics, making it possible to monitor entire conversations and voice pipelines, closing those gaps in real time.”

Beyond risk mitigation, continuous fraud detection with Modulate Velma Deepfake Detect improves the overall efficiency and cost-effectiveness of voice operations. By identifying fraudulent or suspicious interactions earlier, organizations can route calls more effectively, reduce time spent on bad actors, and allow agents to focus on legitimate customer needs – reducing unnecessary strain on, and potential churn among, frontline teams.

#1 Accuracy, Independently Validated by Hugging Face

Velma Deepfake Detect is ranked the top-performing model on the independently validated Hugging Face Speech Deepfake Arena leaderboard, achieving an equal error rate (EER) of 1.1% – catching 60% of the deepfakes the #2 provider missed, while generating less than half the number of false positives – and significantly outperforming competing models across a broad range of evaluation datasets.

This benchmark reflects the model’s ability to reliably distinguish genuine human speech from AI-generated audio under diverse conditions, including noisy environments and compressed audio formats.
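
For readers unfamiliar with the metric, the equal error rate is the operating point at which a detector misses synthetic clips at the same rate that it falsely flags genuine ones; lower is better. The sketch below shows one simple way to approximate it from scored clips. It is a generic illustration on made-up toy data, not the leaderboard's evaluation code, and the function name and score convention (higher means more likely synthetic) are assumptions.

```python
def equal_error_rate(scores, labels):
    """Approximate EER: the threshold where miss rate and false-alarm rate are closest.

    scores: higher means "more likely synthetic" (assumed convention).
    labels: 1 = synthetic clip, 0 = genuine clip.
    """
    n_fake = sum(labels)
    n_real = len(labels) - n_fake
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(scores)):
        # A clip is flagged as synthetic when its score is at or above the threshold.
        misses = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
        false_alarms = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
        miss_rate = misses / n_fake
        false_alarm_rate = false_alarms / n_real
        gap = abs(miss_rate - false_alarm_rate)
        if gap < best_gap:
            best_gap, eer = gap, (miss_rate + false_alarm_rate) / 2
    return eer


# Toy data (made up): four genuine clips followed by four synthetic clips.
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(equal_error_rate(toy_scores, toy_labels))  # 0.25 on this toy set
```

On this toy set one genuine clip and one synthetic clip overlap, so both error rates meet at 25%. An EER of 1.1% means the two rates meet at roughly one error per ninety clips of each kind.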

Built for Real-World Voice Systems

Velma Deepfake Detect is already being applied in high-risk enterprise workflows, including preventing account takeover during customer support calls, flagging synthetic voices during high-value transaction verification, and identifying scam callers in real time in contact center environments. These use cases enable organizations to stop fraud as it happens, rather than after losses occur.

Now available as an API for developers building production systems that rely on voice input, Velma Deepfake Detect enables:

  • Batch and real-time streaming detection endpoints
  • Probability-based scoring for flexible decision thresholds
  • Segment-level analysis for identifying partial manipulation
  • Accurate results with as little as 2-3 seconds of audio, compared to 5-30 seconds for competing solutions
  • Robust performance across noisy, multi-speaker, and compressed audio
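
As an illustration of how probability-based scoring and segment-level analysis might be consumed together, the sketch below applies a decision threshold to per-segment scores to locate a partially manipulated stretch of audio. The SegmentScore structure, its field names, the 0.8 threshold, and the sample scores are all hypothetical; they are not the Velma Deepfake Detect API's documented response format.

```python
from dataclasses import dataclass


@dataclass
class SegmentScore:
    """Hypothetical per-segment result; field names are assumptions."""
    start_s: float
    end_s: float
    synthetic_prob: float  # probability that the segment is AI-generated


def flag_segments(segments, threshold=0.8):
    """Return the segments whose synthetic probability meets the threshold."""
    return [seg for seg in segments if seg.synthetic_prob >= threshold]


# Made-up scores for a 9-second clip in which only the middle segment is synthetic.
segments = [
    SegmentScore(0.0, 3.0, 0.05),
    SegmentScore(3.0, 6.0, 0.92),
    SegmentScore(6.0, 9.0, 0.10),
]
flagged = flag_segments(segments)
print([(seg.start_s, seg.end_s) for seg in flagged])  # [(3.0, 6.0)]
```

Lowering the threshold catches more manipulated segments at the cost of more false flags, which is the flexibility a probability score offers over a fixed yes/no verdict.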

The Velma Deepfake Detect API enables enterprises and developers to incorporate detection into fraud prevention, contact centers, voice agents, and identity verification workflows. Because alerts and scores can be routed into existing systems, organizations can use Velma Deepfake Detect to support real-time decisions such as escalation, rerouting, secondary verification, or post-call review.
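
The routing described above could be sketched as a small policy function that maps a call-level score to one of the named actions. The thresholds, action labels, and the high_value_transaction flag are illustrative assumptions; each organization would tune them to its own risk tolerance and existing systems.

```python
def choose_action(synthetic_prob: float, high_value_transaction: bool = False) -> str:
    """Map a synthetic-voice probability to a next step (hypothetical policy)."""
    if not 0.0 <= synthetic_prob <= 1.0:
        raise ValueError("synthetic_prob must be between 0 and 1")
    if synthetic_prob >= 0.9:
        return "escalate"  # hand off to a fraud team immediately
    if synthetic_prob >= 0.6 or (high_value_transaction and synthetic_prob >= 0.3):
        return "secondary_verification"  # ask for an additional identity check
    if synthetic_prob >= 0.3:
        return "post_call_review"  # let the call proceed, queue it for review
    return "continue"


print(choose_action(0.95))                               # escalate
print(choose_action(0.4))                                # post_call_review
print(choose_action(0.4, high_value_transaction=True))   # secondary_verification
```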

Modulate: The Comprehensive Voice Intelligence Platform

As part of the broader Velma platform, detection can be combined with additional capabilities, including transcription, emotion detection, PII redaction, and conversational analytics – allowing organizations to move from simply identifying synthetic audio to fully understanding voice interactions.

Pricing and Availability

Velma Deepfake Detect is available today via API. Modulate pricing is usage-based and optimized for high-volume workloads: https://www.modulate.ai/pricing

Download the Modulate Deepfake Detect press kit here.

About Modulate

Modulate is a voice intelligence company building AI models and APIs designed to understand real-world conversational audio at scale. Its technology combines speech recognition, acoustic analysis, and conversational context to deliver reliable, explainable, and cost-effective voice intelligence for developers and enterprises.

For more information or to get started, visit modulate.ai

Media Contact

Kristin Canders

Grithaus Agency

(e) kristin@grithaus.agency

###