NVIDIA has just unveiled the Nemotron 3 family of open AI models, designed for building multi-agent systems that can handle complex tasks across industries. These fully open models come in three sizes (Nano, Super, and Ultra) and promise gains in speed and efficiency, with the Nano version delivering up to 4x higher throughput than Nemotron 2 Nano.
The Nemotron 3 lineup uses a new hybrid mixture-of-experts (MoE) architecture that combines Mamba and Transformer layers for better performance on long contexts and agentic AI, meaning AI that can plan, reason, and act like a team of helpers. Because an MoE model activates only a subset of its parameters for each token, this setup helps cut compute costs while keeping accuracy high, making it easier for developers to build reliable systems without huge expenses.
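To make the mixture-of-experts idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a generic illustration of the technique, not NVIDIA's implementation: the expert functions, the router scores, the choice of `k=2`, and every name in the code are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs by weight."""
    chosen = route_top_k(router_scores, k)
    return sum(weight * experts[i](x) for i, weight in chosen)


# Hypothetical toy setup: 8 experts, each just scales its input.
experts = [lambda x, scale=i + 1: x * scale for i in range(8)]

# Hypothetical router scores for one token (a real router computes these
# from the token's hidden state).
router_scores = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]

chosen = route_top_k(router_scores, k=2)
output = moe_layer(1.0, experts, router_scores, k=2)
print("experts used:", [i for i, _ in chosen], "of", len(experts))
print("layer output:", round(output, 4))
```

Only two of the eight toy experts run for this token, which is the source of the cost savings: total parameter count can grow while per-token compute stays roughly fixed.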
Here’s a quick look at the three models:
- Nemotron 3 Nano: Around 30 to 31 billion parameters (3 to 3.6 billion active per token). It's the smallest and most efficient, great for tasks like summarizing text, debugging code, or running AI assistants. It supports context lengths of up to 1 million tokens and is available right now on Hugging Face and through inference providers like Baseten, DeepInfra, and Together AI.
- Nemotron 3 Super: About 100 billion parameters (10 billion active). Optimized for high-volume workloads with multiple agents working together, like automating IT support or collaborative tools.
- Nemotron 3 Ultra: Roughly 500 billion parameters (50 billion active). The powerhouse for deep reasoning and complex planning in advanced AI workflows.
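As a quick back-of-the-envelope check on the figures in this list, the short sketch below computes what share of each model's parameters is active per token. The parameter counts are the approximate numbers quoted above; treating the Nano ranges as independent lower and upper bounds is an assumption, and the names `MODELS` and `active_fraction_bounds` are made up for illustration.

```python
# Approximate figures from the list above, in billions of parameters.
# Each entry is a (low, high) range; Super and Ultra have a single value.
MODELS = {
    "Nano": {"total": (30.0, 31.0), "active": (3.0, 3.6)},
    "Super": {"total": (100.0, 100.0), "active": (10.0, 10.0)},
    "Ultra": {"total": (500.0, 500.0), "active": (50.0, 50.0)},
}


def active_fraction_bounds(total_range, active_range):
    """Return the (lowest, highest) possible share of active parameters.

    Assumes the ends of the two ranges can pair up in any combination.
    """
    low = active_range[0] / total_range[1]
    high = active_range[1] / total_range[0]
    return low, high


for name, spec in MODELS.items():
    low, high = active_fraction_bounds(spec["total"], spec["active"])
    if abs(high - low) < 1e-12:
        print(f"{name}: about {low:.0%} of parameters active per token")
    else:
        print(f"{name}: about {low:.1%} to {high:.1%} of parameters active per token")
```

On these numbers, each model activates roughly a tenth of its parameters for any given token.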
NVIDIA says the family shines in multi-agent setups, where agents collaborate without losing track of context or wasting compute. Nemotron 3 Nano, for example, reduces reasoning tokens by up to 60% and handles long inputs smoothly. Independent benchmarks from Artificial Analysis rank it as the most efficient open model in its class, with leading accuracy.
Along with the models, NVIDIA released open datasets (3 trillion tokens' worth), training recipes, and new libraries like NeMo Gym for reinforcement learning. This lets anyone customize agents for their needs, from manufacturing to cybersecurity.
Jensen Huang, NVIDIA’s CEO, called it a step for “open innovation” in AI, helping build transparent systems aligned with real-world rules and data.
Nemotron 3 Nano is out today. Super and Ultra arrive in the first half of 2026. Developers can start experimenting now: head to NVIDIA's site or Hugging Face for downloads and docs. This could shake up the open-model landscape for agentic apps, especially with NVIDIA's hardware backing. Exciting times ahead!
