AI Music Detection
Detect AI-generated music - vocals, instrumentals, and everything in between.
Clip-level verdicts built from per-window evidence.
Batch API
Streaming API (WebSocket)
Structured output
Tunable thresholds
Two detection paths. One API.
Most AI music detectors break down on hybrid tracks, multi-part productions, and anything where AI was used for only part of the composition. Modulate runs two independent models, one for vocals and one for instrumentals, and scores each 4-second window separately, so you get a confident result grounded in per-window evidence.
Vocal detection
vocal_ai_percentage and vocal_ai_confidence.
Instrumental detection
A side-by-side comparison for teams evaluating AI music detection solutions.
It's time to get ahead of the AI music challenges your team is already dealing with.
AI-generated music is evolving faster than manual review can handle. AI Music Detect by Modulate gives you a scalable, self-serve detection layer, so you can enforce policies, protect rights, and stay compliant.
Reduce false positives
Protect royalty payouts
Enforce platform policies at scale
Know what you're licensing
Screen content before it reaches platforms
Stay ahead of AI disclosure regulations
Start building with Modulate.
Explore Modulate's other leading voice models
Frequently Asked Questions
What is AI Music Detect by Modulate?
AI Music Detect is Modulate's API for identifying AI-generated music in audio. It analyzes both vocal and instrumental content across 4-second windows, returning per-segment scores and a clip-level verdict of ai-vocal-music, ai-instrumental, or not-ai-music.
What does the API actually return?
For each clip, the API returns a primary_verdict, clip-level vocal_ai_percentage and instrumental_ai_percentage with confidence scores, and a per-window breakdown showing where in the track AI content was detected.
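To make that shape concrete, here is a minimal Python sketch of how a client might model and read such a result. Only primary_verdict, vocal_ai_percentage, vocal_ai_confidence, instrumental_ai_percentage, and the three verdict strings come from this page. The instrumental_ai_confidence name, the windows key, its start_s and vocal_ai fields, and every value in the sample payload are assumptions made up for illustration; check the API reference for the real schema.

```python
from dataclasses import dataclass
from typing import List

# Verdict strings named on this page.
VERDICTS = {"ai-vocal-music", "ai-instrumental", "not-ai-music"}


@dataclass
class WindowResult:
    start_s: float   # hypothetical field: window start time in seconds
    vocal_ai: bool   # hypothetical field: window flagged as AI vocals


@dataclass
class ClipResult:
    primary_verdict: str
    vocal_ai_percentage: float
    vocal_ai_confidence: float
    instrumental_ai_percentage: float
    instrumental_ai_confidence: float  # assumed name, by analogy with the vocal field
    windows: List[WindowResult]        # assumed name for the per-window breakdown


def parse_clip_result(payload: dict) -> ClipResult:
    """Validate the verdict and convert a decoded JSON payload into ClipResult."""
    verdict = payload["primary_verdict"]
    if verdict not in VERDICTS:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    windows = [
        WindowResult(start_s=float(w["start_s"]), vocal_ai=bool(w["vocal_ai"]))
        for w in payload.get("windows", [])
    ]
    return ClipResult(
        primary_verdict=verdict,
        vocal_ai_percentage=float(payload["vocal_ai_percentage"]),
        vocal_ai_confidence=float(payload["vocal_ai_confidence"]),
        instrumental_ai_percentage=float(payload["instrumental_ai_percentage"]),
        instrumental_ai_confidence=float(payload["instrumental_ai_confidence"]),
        windows=windows,
    )


# Illustrative payload only; these numbers are invented, not real API output.
sample = {
    "primary_verdict": "ai-vocal-music",
    "vocal_ai_percentage": 66.7,
    "vocal_ai_confidence": 0.9,
    "instrumental_ai_percentage": 0.0,
    "instrumental_ai_confidence": 0.8,
    "windows": [
        {"start_s": 0.0, "vocal_ai": False},
        {"start_s": 4.0, "vocal_ai": True},
        {"start_s": 8.0, "vocal_ai": True},
    ],
}

result = parse_clip_result(sample)
# Start times of the windows where AI vocals were flagged.
flagged_starts = [w.start_s for w in result.windows if w.vocal_ai]
```

Keeping the per-window list next to the clip-level fields lets a reviewer jump straight to the flagged sections instead of re-listening to the whole track.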
How is this different from a single-score detector?
Most detectors return one probability score for an entire track. Modulate's API scores every 4-second segment independently and separates vocal AI detection from instrumental AI detection. This matters for hybrid tracks, where only part of the content is AI-generated.
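The difference can be illustrated with a small Python sketch. It is not Modulate's aggregation method, which this page does not describe; the threshold of 0.5, the handling of a shorter final window, and the example scores are all assumptions chosen only to show why per-window evidence matters on a hybrid track.

```python
from typing import List, Tuple

WINDOW_SECONDS = 4.0  # window length stated on this page


def window_bounds(duration_s: float, window_s: float = WINDOW_SECONDS) -> List[Tuple[float, float]]:
    """Split a clip into consecutive windows; the last one may be shorter (assumption)."""
    bounds = []
    start = 0.0
    while start < duration_s:
        bounds.append((start, min(start + window_s, duration_s)))
        start += window_s
    return bounds


def percent_flagged(scores: List[float], threshold: float = 0.5) -> float:
    """Percentage of windows whose score meets a hypothetical threshold."""
    if not scores:
        return 0.0
    flagged = sum(1 for s in scores if s >= threshold)
    return 100.0 * flagged / len(scores)


# A 20-second hybrid clip: invented scores, with AI content only in the middle.
scores = [0.1, 0.2, 0.9, 0.95, 0.15]
bounds = window_bounds(20.0)

track_mean = sum(scores) / len(scores)   # what a single-score detector might report
flagged_pct = percent_flagged(scores)    # share of windows above the threshold
flagged_spans = [b for b, s in zip(bounds, scores) if s >= 0.5]
```

In this made-up example the track-level mean falls below 0.5, so a single-score detector using that cutoff would pass the clip, while the per-window view shows that the 8 to 16 second span is flagged.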
What can Modulate's AI Music Detection API reliably identify?
The API reliably detects fully AI-generated songs (AI vocals + AI instrumentals), AI vocals over human/organic music, and AI-only instrumentals. Known current limitations include AI choral or background vocals, and AI backing tracks underneath a live human vocal performance.
Does the API support streaming?
Yes. In addition to the batch API, Velma AI Music Detect supports real-time WebSocket streaming, returning per-window vocal AI results as audio arrives and a final clip-level verdict at end of stream.
How much does it cost?
Velma AI Music Detect is priced at $0.07/hr of audio. For current pricing details, see the API Pricing page.
What audio formats are supported?
Supported formats for batch: .aac, .flac, .m4a, .mp3, .mp4, .ogg, .opus, .wav. Maximum file size is 100 MB. For streaming, container formats need only an audio_format query parameter; raw PCM requires sample_rate and num_channels as well.
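These constraints are easy to check on the client before sending anything. The Python sketch below is one way to do that, under stated assumptions: the extension list, the 100 MB limit, and the audio_format, sample_rate, and num_channels parameter names come from this page, while the function names, the reading of 100 MB as 100,000,000 bytes, the raw_pcm flag, and the example format values "opus" and "pcm" are hypothetical.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlencode

# Extensions listed on this page for the batch API.
BATCH_EXTENSIONS = {".aac", ".flac", ".m4a", ".mp3", ".mp4", ".ogg", ".opus", ".wav"}
# Assumption: "100 MB" read conservatively as 100,000,000 bytes.
MAX_BATCH_BYTES = 100 * 1000 * 1000


def check_batch_file(name: str, size_bytes: int) -> None:
    """Raise ValueError if the file would be rejected under the stated batch limits."""
    ext = Path(name).suffix.lower()
    if ext not in BATCH_EXTENSIONS:
        raise ValueError(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BATCH_BYTES:
        raise ValueError(f"file too large: {size_bytes} bytes")


def streaming_query(
    audio_format: str,
    sample_rate: Optional[int] = None,
    num_channels: Optional[int] = None,
    raw_pcm: bool = False,
) -> str:
    """Build the query string for a streaming connection.

    Container formats need only audio_format; raw PCM also needs
    sample_rate and num_channels, as described on this page.
    """
    params = {"audio_format": audio_format}
    if raw_pcm:
        if sample_rate is None or num_channels is None:
            raise ValueError("raw PCM requires sample_rate and num_channels")
        params["sample_rate"] = sample_rate
        params["num_channels"] = num_channels
    return urlencode(params)


# Placeholder format values; the accepted audio_format strings are not given here.
container_query = streaming_query("opus")
pcm_query = streaming_query("pcm", sample_rate=16000, num_channels=1, raw_pcm=True)
```

The resulting query string would be appended to the streaming endpoint URL, which this page does not specify.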
Is Modulate ISO 27001 certified?
Yes. Modulate maintains ISO 27001 certification as part of its organization-wide security program.