Fair, flexible pricing
Pay only for what you use. Token-based pricing built to scale with you.
$5 free credit for new users
Every new account gets $5 in free credit. No credit card is required, and the credit never expires. Start building immediately.
Real-time Audio-to-Text API
All API costs are calculated based on tokens.
This works out to about $0.126 per hour of real-time (streaming) transcription.
Input audio tokens: the duration of the audio or streaming session
Input text tokens: custom instructions or context you provide
Output text tokens: the transcription and any other text returned by the model
Usage reference: 1 hour of audio is ~30,000 input audio tokens. 1 hour of speech is ~15,000 output text tokens. 1 character of output is ~0.3 tokens.
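The usage reference above can be turned into a rough estimator. The following is a minimal sketch, not an official calculator: the function names are hypothetical, the conversion factors are the approximate figures quoted above, and the per-hour rate is the "about $0.126/hour" streaming equivalent rather than an exact per-token price.

```python
# Approximate figures taken from the usage reference on this page.
AUDIO_TOKENS_PER_HOUR = 30_000          # ~30,000 input audio tokens per hour of audio
OUTPUT_TOKENS_PER_SPEECH_HOUR = 15_000  # ~15,000 output text tokens per hour of speech
TOKENS_PER_OUTPUT_CHAR = 0.3            # ~0.3 tokens per character of output
STREAMING_USD_PER_HOUR = 0.126          # approximate hourly equivalent for streaming
FREE_CREDIT_USD = 5.00                  # free credit for new accounts


def estimate_audio_tokens(audio_seconds: float) -> int:
    """Estimate input audio tokens from session duration."""
    return round(audio_seconds / 3600 * AUDIO_TOKENS_PER_HOUR)


def estimate_output_tokens_from_speech(speech_seconds: float) -> int:
    """Estimate output text tokens from the amount of speech."""
    return round(speech_seconds / 3600 * OUTPUT_TOKENS_PER_SPEECH_HOUR)


def estimate_output_tokens_from_chars(num_chars: int) -> int:
    """Estimate output text tokens from the length of returned text."""
    return round(num_chars * TOKENS_PER_OUTPUT_CHAR)


def estimate_streaming_cost_usd(audio_seconds: float) -> float:
    """Estimate streaming cost using the approximate hourly equivalent."""
    return audio_seconds / 3600 * STREAMING_USD_PER_HOUR


def free_credit_streaming_hours() -> float:
    """Approximate hours of streaming covered by the free credit."""
    return FREE_CREDIT_USD / STREAMING_USD_PER_HOUR


if __name__ == "__main__":
    print(estimate_audio_tokens(90))            # tokens for a 90-second session
    print(estimate_streaming_cost_usd(1800))    # cost of a 30-minute session
    print(free_credit_streaming_hours())        # hours covered by the $5 credit
```

Under these approximations, the $5 free credit covers a little under 40 hours of streaming transcription. Actual charges depend on the per-token rates and on how much input text and output text a session uses.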
Offline / Batch Transcription API
Upload audio files for high-accuracy batch transcription. Ideal for podcasts, meetings, and archival content.
Standard: high-accuracy batch transcription with speaker diarization and punctuation.
Enhanced: premium accuracy with advanced noise reduction, multi-speaker diarization, and word-level timestamps.
The Offline API is billed per hour of audio processed, with a minimum billing unit of 1 second. Files up to 4 hours long are supported.
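The billing rule above can be sketched as a small calculation. This is an illustration under stated assumptions: the function names are hypothetical, the hourly rate is a parameter because the Standard and Enhanced rates are not listed in this section, and rounding partial seconds up to the next whole second is an assumption about how the 1-second billing unit is applied.

```python
import math

MAX_FILE_SECONDS = 4 * 3600  # files up to 4 hours are supported


def billable_seconds(duration_seconds: float) -> int:
    """Return whole billable seconds for a file (assumes rounding up)."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds > MAX_FILE_SECONDS:
        raise ValueError("file exceeds the 4-hour limit")
    # Minimum billing unit is 1 second; partial seconds are rounded up (assumption).
    return max(1, math.ceil(duration_seconds))


def batch_cost_usd(duration_seconds: float, usd_per_hour: float) -> float:
    """Prorate a per-hour rate over the billable seconds of one file."""
    return billable_seconds(duration_seconds) * usd_per_hour / 3600


if __name__ == "__main__":
    # Placeholder rate of $1.00/hour, purely for illustration.
    print(batch_cost_usd(90.2, 1.00))
```

Prorating by the second means a 10-minute file costs one sixth of the hourly rate rather than a full hour.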