by NVIDIA · 12 months ago
Speed-optimized ASR model delivering 1000+ RTFx on Open ASR Leaderboard. Exceptional accuracy.
Input Price
$0.0040/M tokens
Output Price
$0.0040/M tokens
Performance Profile
Budget-friendly audio processing at $0.0040/M tokens
Process audio in real time with support for four languages (English, German, French, Spanish)
1B-parameter encoder-decoder for accurate transcription
vs similar-tier models
| Model | Provider | Input (per M tokens) | Output (per M tokens) | Context | Avg Score |
|---|---|---|---|---|---|
| Canary-1B-Flash (current) | NVIDIA | $0.0040 | $0.0040 | N/A | 0.0 |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | 200K | 77.0 |
| Mistral Small | Mistral AI | $0.10 | $0.30 | 32K | 69.8 |
Transcribe a 1-minute clip
<$0.001 · Short voice memo → text
1,500 in · 200 out
Transcribe a 30-min meeting
<$0.001 · Full meeting → transcript with speakers
45,000 in · 6,000 out
Process 1 hour of audio
<$0.001 · Podcast episode → transcript + summary
90,000 in · 12,000 out
Transcribe 8 hours (full day)
$0.0033 · Call center daily volume
720,000 in · 96,000 out
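The per-scenario figures above can be reproduced from the listed per-token prices. The following is a minimal sketch of that arithmetic, assuming input and output tokens are billed independently at the listed $0.0040 per million tokens. The helper name `estimate_cost` and the `SCENARIOS` table are illustrative only and are not part of any NVIDIA or provider API.

```python
# Listed prices in USD per million tokens (taken from the pricing on this page).
INPUT_PRICE_PER_M = 0.0040
OUTPUT_PRICE_PER_M = 0.0040


def estimate_cost(tokens_in: int, tokens_out: int,
                  input_price: float = INPUT_PRICE_PER_M,
                  output_price: float = OUTPUT_PRICE_PER_M) -> float:
    """Estimated cost in USD for one job (hypothetical helper, not a real API)."""
    return (tokens_in * input_price + tokens_out * output_price) / 1_000_000


# Token counts copied from the scenarios listed on this page.
SCENARIOS = {
    "1-minute clip": (1_500, 200),
    "30-minute meeting": (45_000, 6_000),
    "1 hour of audio": (90_000, 12_000),
    "8 hours (full day)": (720_000, 96_000),
}

for name, (tokens_in, tokens_out) in SCENARIOS.items():
    print(f"{name}: ${estimate_cost(tokens_in, tokens_out):.6f}")
```

Under these assumptions the 8-hour scenario comes to roughly $0.0033, in line with the listed figure, and the three shorter scenarios all fall below $0.001.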
Voice memos
$0.20/mo
$0.01/day
Meeting transcripts
$6/mo
$0.20/day
Podcast processing
$12/mo
$0.41/day
NVIDIA
NVIDIA's optimized Llama 3.1 variant with custom reward model training.
Input
$0.18/M
Output
$0.18/M
Context
128K
NVIDIA
NVIDIA's large open-source model trained for synthetic data generation.
Input
$1.20/M
Output
$1.20/M
Context
4K
NVIDIA
Input
Free
Output
Free
Anthropic
Anthropic's fastest and most affordable model. Great for high-volume, low-latency tasks.
Input
$0.80/M
Output
$4.00/M
Context
200K
Mistral AI
Mistral's efficient model for everyday tasks. Fast and cost-effective.
Input
$0.10/M
Output
$0.30/M
Context
32K
OpenAI
A fast, affordable variant of GPT-4.1 for high-volume workloads.
Input
$0.40/M
Output
$1.60/M
Context
1.0M