Nemotron 3 Nano

Name: Nemotron 3 Nano
Price: 0.04 USD per million input tokens
Author: NVIDIA

budget

by NVIDIA · 8 months ago

Hybrid Mamba-Transformer mixture-of-experts (MoE) model with 4x higher throughput than its predecessor. Open weights and training data.

Context Window

1.0M

Text · Open Source

Input Price

$0.04/M tokens

Output Price

$0.08/M tokens

Why Choose Nemotron 3 Nano

Budget-friendly at just $0.04/M input tokens

Massive 1.0M token context window for entire codebases and long documents

Fully open source — self-host, fine-tune, and customize without restrictions

31.6B total / 3.2B active parameter architecture for deep reasoning

Strengths & Limitations

Strengths

+ Solid benchmark performance
+ Large context window for complex tasks
+ Very affordable pricing
+ Open source: can self-host and fine-tune

Limitations

− Text only: no image or audio support

Benchmark Results

MMLU: 64.0

HumanEval: 60.0

HellaSwag: 72.0

GSM8K: 72.0

Quick Comparison

vs similar-tier models

Model	Provider	Input ($/M tokens)	Output ($/M tokens)	Context	Avg Score
Nemotron 3 Nano (current)	NVIDIA	$0.04	$0.08	1.0M	67.0
Claude Haiku 3.5	Anthropic	$0.80	$4.00	200K	77.0
Mistral Small	Mistral AI	$0.10	$0.30	32K	69.8
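
The Avg Score shown for Nemotron 3 Nano matches the plain arithmetic mean of its four benchmark results. The page does not state how the score is computed, so the sketch below assumes an unweighted mean; the variable names are illustrative only.

```python
from statistics import mean

# Benchmark results for Nemotron 3 Nano as listed on this page.
benchmarks = {
    "MMLU": 64.0,
    "HumanEval": 60.0,
    "HellaSwag": 72.0,
    "GSM8K": 72.0,
}

# Assumption: "Avg Score" is an unweighted mean of the four benchmarks.
avg_score = round(mean(benchmarks.values()), 1)
print(f"Avg Score: {avg_score}")
```

Under that assumption the result agrees with the 67.0 in the table. The scores for the other models cannot be checked this way because their individual benchmark results are not listed here.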

Pricing Calculator

How pricing works: A token is roughly ¾ of a word. A 1,000-word article is about 1,333 tokens. You pay separately for input (what you send) and output (what the model replies).

Summarize an email

<$0.001

~300 word email → short summary

400 in · 100 out

Analyze a 1,000-word article

<$0.001

Blog post or news article → detailed analysis

1,333 in · 500 out

Chatbot conversation (10 turns)

<$0.001

Full customer support interaction

4,000 in · 2,000 out

Summarize a 50-page report

$0.0017

Legal contract or research paper → key points

37,500 in · 2,000 out

Review a 5,000-line codebase

$0.0012

Full code review with suggestions

25,000 in · 3,000 out

Process a full novel

$0.0052

~90,000 words → detailed summary & analysis

120,000 in · 5,000 out
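
The per-request figures above follow from the listed prices and token counts. A minimal sketch of that arithmetic, assuming the page's rule of thumb of one token per ¾ word and simple linear billing; the function names are hypothetical and not part of any NVIDIA API.

```python
INPUT_PRICE_PER_M = 0.04   # USD per million input tokens (from this page)
OUTPUT_PRICE_PER_M = 0.08  # USD per million output tokens (from this page)


def words_to_tokens(words: int) -> int:
    """Estimate tokens from a word count (1 token is roughly 3/4 of a word)."""
    return round(words * 4 / 3)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request; input and output are billed separately."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Token counts copied from the scenarios listed above.
scenarios = {
    "Summarize an email": (400, 100),
    "Analyze a 1,000-word article": (1_333, 500),
    "Chatbot conversation (10 turns)": (4_000, 2_000),
    "Summarize a 50-page report": (37_500, 2_000),
    "Review a 5,000-line codebase": (25_000, 3_000),
    "Process a full novel": (120_000, 5_000),
}

for name, (tokens_in, tokens_out) in scenarios.items():
    print(f"{name}: ${request_cost(tokens_in, tokens_out):.4f}")
```

Rounded to four decimals, the last three scenarios come out at the listed $0.0017, $0.0012, and $0.0052, and the first three stay below $0.001. Real bills depend on the provider's tokenizer, so treat the word-to-token conversion as an estimate.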

At scale: 1,000 requests/day

Email summaries

$0.72/mo

$0.02/day

Chat conversations

$10/mo

$0.32/day

Document analysis

$50/mo

$2/day
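
The at-scale figures can be approximated by multiplying a per-request cost by the request volume. The sketch below assumes a 30-day month and that the three workloads map to the email, chatbot, and 50-page-report scenarios from the pricing calculator; the page does not confirm either assumption, and the helper names are hypothetical.

```python
INPUT_PRICE_PER_M = 0.04   # USD per million input tokens (from this page)
OUTPUT_PRICE_PER_M = 0.08  # USD per million output tokens (from this page)
REQUESTS_PER_DAY = 1_000   # volume stated in the heading above
DAYS_PER_MONTH = 30        # assumption: the page does not say how a month is counted


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def projected_costs(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (daily, monthly) cost in USD at REQUESTS_PER_DAY."""
    daily = request_cost(input_tokens, output_tokens) * REQUESTS_PER_DAY
    return daily, daily * DAYS_PER_MONTH


# Assumed mapping from workload to (input tokens, output tokens) per request.
workloads = {
    "Email summaries": (400, 100),
    "Chat conversations": (4_000, 2_000),
    "Document analysis": (37_500, 2_000),
}

for name, (tokens_in, tokens_out) in workloads.items():
    daily, monthly = projected_costs(tokens_in, tokens_out)
    print(f"{name}: ${daily:.2f}/day, ${monthly:.2f}/mo")
```

With these assumptions the unrounded results are close to the listed values, which appear to be rounded for display (for example, roughly $9.60 per month for chat versus the listed $10).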

Technical Specifications

Provider: NVIDIA

Architecture: Hybrid Mamba-Transformer MoE

Parameters: 31.6B total / 3.2B active

Context Window: 1.0M tokens

Modalities: text

Open Source: Yes

Release Date: November 1, 2025
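
To gauge whether a document fits in the context window, the word-to-token rule of thumb from the pricing section can be applied to the 1.0M-token limit. This is a rough sketch: it reads "1.0M" as 1,000,000 tokens, and both the reserved output budget and the function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # assumption: "1.0M tokens" read as 1,000,000


def estimate_tokens(words: int) -> int:
    """Estimate tokens from a word count (1 token is roughly 3/4 of a word)."""
    return round(words * 4 / 3)


def fits_in_context(words: int, reserved_output_tokens: int = 5_000) -> bool:
    """Check whether the input plus a reserved output budget fits in the window."""
    needed = estimate_tokens(words) + reserved_output_tokens
    return needed <= CONTEXT_WINDOW_TOKENS


# The ~90,000-word novel from the pricing examples, then a much larger corpus.
print(fits_in_context(90_000))
print(fits_in_context(800_000))
```

Actual token counts depend on the model's tokenizer and on the content (code usually tokenizes less compactly than prose), so a real check should count tokens with the tokenizer itself.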

Similar Budget Models

Claude Haiku 3.5

Anthropic

budget

Anthropic's fastest and most affordable model. Great for high-volume, low-latency tasks.

text, image

Input

$0.80/M

Output

$4.00/M

Context

200K

Mistral Small

Mistral AI

budget

Mistral's efficient model for everyday tasks. Fast and cost-effective.

text

Input

$0.10/M

Output

$0.30/M

Context

32K

GPT-4.1 Mini

OpenAI

budget

A fast, affordable variant of GPT-4.1 for high-volume workloads.

text, image

Input

$0.40/M

Output

$1.60/M

Context

1.0M

More from NVIDIA

Nemotron 70B

Nemotron-4 340B

PersonaPlex 7B v1