GPTCrunch

Compare Models

Select up to 4 models to compare benchmarks, pricing, and capabilities side by side.

o3-mini (OpenAI)

Claude Sonnet 4 (Anthropic)

DeepSeek-R1-Distill-Qwen-32B (DeepSeek)

Benchmark      o3-mini  Claude Sonnet 4  DeepSeek-R1-Distill-Qwen-32B
MMLU           86.9     88.7             86.0
HumanEval      92.9     93.7             85.0
GSM8K          97.9     96.4             96.0
GPQA           77.0     68.2             62.0
MGSM           89.5     91.6             0.0
ARC-Challenge  96.0     96.7             0.0
HellaSwag      92.5     93.2             0.0
MATH           97.0     78.0             94.0
SWE-bench      49.3     53.6             0.0
MMMLU          83.5     86.0             0.0
AIME 2025      0.0      0.0              72.0
Model                         Input   Output  Blended*
o3-mini                       $1.10   $4.40   $2.75
Claude Sonnet 4               $3.00   $15.00  $9.00
DeepSeek-R1-Distill-Qwen-32B  $0.12   $0.18   $0.15

*Blended = average of input and output price
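
The blended figure can be reproduced from the input and output columns. The following is a minimal sketch, assuming the average is a simple unweighted mean as the footnote states; the function name `blended_price` and the `prices` dictionary are illustrative and not part of the site.

```python
from decimal import Decimal


def blended_price(input_price: str, output_price: str) -> Decimal:
    """Return the unweighted mean of the input and output price."""
    return (Decimal(input_price) + Decimal(output_price)) / 2


# Input and output prices copied from the pricing table.
prices = {
    "o3-mini": ("1.10", "4.40"),
    "Claude Sonnet 4": ("3.00", "15.00"),
    "DeepSeek-R1-Distill-Qwen-32B": ("0.12", "0.18"),
}

blended = {model: blended_price(i, o) for model, (i, o) in prices.items()}

for model, price in blended.items():
    print(f"{model}: ${price:.2f}")
```

An unweighted mean implicitly assumes equal input and output token volumes. A workload that reads far more than it writes (or the reverse) would see a different effective price.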

Spec            o3-mini            Claude Sonnet 4  DeepSeek-R1-Distill-Qwen-32B
Context Window  200K               200K             128K
Max Output      100K               16K              8K
TTFT            800 ms             280 ms           300 ms
Speed           75 tok/s           100 tok/s        100 tok/s
Parameters      N/A                N/A              32B
Architecture    Transformer + CoT  Transformer      Transformer + CoT (distilled)
Open Source     No                 No               Yes
Tier            mid                mid              mid
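
TTFT (time to first token) and speed can be combined into a rough end-to-end estimate for a single response. The sketch below assumes the common approximation latency ≈ TTFT + output tokens / speed; the 500-token response length is a hypothetical value chosen for illustration, and this is not necessarily how the site ranks models.

```python
def estimated_latency_s(ttft_ms: float, tokens_per_s: float, output_tokens: int) -> float:
    """Approximate seconds to finish a response: wait for the first token, then stream the rest."""
    return ttft_ms / 1000 + output_tokens / tokens_per_s


# (TTFT in ms, speed in tok/s) copied from the spec table.
specs = {
    "o3-mini": (800, 75),
    "Claude Sonnet 4": (280, 100),
    "DeepSeek-R1-Distill-Qwen-32B": (300, 100),
}

OUTPUT_TOKENS = 500  # hypothetical response length, not from the source

latency = {
    model: estimated_latency_s(ttft, speed, OUTPUT_TOKENS)
    for model, (ttft, speed) in specs.items()
}

# Print models from fastest to slowest under this estimate.
for model, seconds in sorted(latency.items(), key=lambda kv: kv[1]):
    print(f"{model}: {seconds:.2f} s")
```

Under these assumptions, Claude Sonnet 4 and DeepSeek-R1-Distill-Qwen-32B finish within about 0.02 s of each other, since they share the same throughput and differ only in TTFT.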

Quick Verdict

Best Performance: o3-mini

Best Value: DeepSeek-R1-Distill-Qwen-32B

Fastest: Claude Sonnet 4