by OpenAI· 2 years ago
Gold standard speech recognition model supporting 99+ languages. 1.55B parameter encoder-decoder architecture.
Input Price
$0.0060/M tokens
Output Price
$0.0060/M tokens
Performance Profile
Reliable audio processing with strong multi-language support
Process audio in real-time with support for dozens of languages
1.55B parameter encoder-decoder for accurate transcription
vs similar-tier models
| Model | Input | Output | Context | Avg Score |
|---|---|---|---|---|
Whisper Large V3Current OpenAI | $0.0060 | $0.0060 | N/A | 0.0 |
DeepSeek-R1 DeepSeek | $0.55 | $2.19 | 128K | 87.0 |
Claude Sonnet 4 Anthropic | $3.00 | $15.00 | 200K | 84.6 |
Transcribe a 1-minute clip
<$0.001Short voice memo → text
1,500 in · 200 out
Transcribe a 30-min meeting
<$0.001Full meeting → transcript with speakers
45,000 in · 6,000 out
Process 1 hour of audio
<$0.001Podcast episode → transcript + summary
90,000 in · 12,000 out
Transcribe 8 hours (full day)
$0.0049Call center daily volume
720,000 in · 96,000 out
Voice memos
$0.31/mo
$0.01/day
Meeting transcripts
$9/mo
$0.31/day
Podcast processing
$18/mo
$0.61/day
No ratings yet. Be the first to rate this model!
Sign in to rate this model and share your experience.
Sign in to leave a comment and join the discussion.
OpenAI
OpenAI's most advanced multimodal model. Excels at text, vision, and audio tasks with fast response times.
Input
$2.50/M
Output
$10.00/M
Context
128K
OpenAI
OpenAI's reasoning model with chain-of-thought capabilities for complex problem solving.
Input
$15.00/M
Output
$60.00/M
Context
200K
OpenAI
OpenAI's efficient reasoning model, optimized for speed while maintaining strong analytical capabilities.
Input
$1.10/M
Output
$4.40/M
Context
200K
DeepSeek
DeepSeek's reasoning model with transparent chain-of-thought. Open-source and highly competitive.
Input
$0.55/M
Output
$2.19/M
Context
128K
Anthropic
Anthropic's best balance of intelligence and speed. Excellent for production workloads.
Input
$3.00/M
Output
$15.00/M
Context
200K
Meta
Meta's open-source model matching GPT-4 class performance at 70B parameters.
Input
$0.18/M
Output
$0.18/M
Context
128K