Introduction
Today marks a pivotal moment in open-source artificial intelligence: the official announcement of Mistral 3. This next generation is not a single release but a comprehensive family of models designed to cover every computing need, from massive data centers to local devices. The lineup pairs three state-of-the-art small, dense models (the Ministral series) with the formidable Mistral Large 3, a mixture-of-experts (MoE) model built for frontier performance.
For official details and the full announcement, you can visit the Mistral 3 news page.
Mistral Large 3: The Open-Source Giant
The centerpiece of this release is undoubtedly Mistral Large 3. Featuring a sparse mixture-of-experts architecture with 41 billion active parameters out of a total of 675 billion, this model was trained from scratch using 3,000 NVIDIA H200 GPUs. It stands as Mistral's most capable model to date and ranks among the best permissive open-weight models globally.
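To make the MoE figures concrete, here is a back-of-envelope calculation of how sparse the model is per token, using only the parameter counts quoted above (the numbers come from the announcement; the script itself is just illustrative arithmetic):

```python
# Sparsity of Mistral Large 3's mixture-of-experts design,
# from the announced figures: 41B active out of 675B total parameters.
TOTAL_PARAMS = 675e9
ACTIVE_PARAMS = 41e9

# Fraction of the network that participates in any single forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active parameters per token: {active_fraction:.1%}")  # roughly 6.1%
```

In other words, each token activates only about 6% of the network, which is how an MoE model of this total size keeps inference cost closer to that of a much smaller dense model.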
Key capabilities of Mistral Large 3 include:
- Market-Leading Parity: Matches the performance of the best instruction-tuned open-weight models available.
- Multimodal Understanding: Native capabilities to understand and process images.
- Multilingual Excellence: Best-in-class performance for non-English and non-Chinese conversations.
On the LMArena leaderboard, the model debuted at #2 in the OSS non-reasoning category, validating its technical prowess.
Ministral 3: Intelligence at the Edge
Not every application requires massive server infrastructure. For local and edge use cases, Mistral has introduced the Ministral 3 series, available in three sizes: 3B, 8B, and 14B parameters. These models are engineered to deliver the best performance-to-cost ratio in their category.
Each size comes in three variants: base, instruct, and reasoning, all equipped with image understanding capabilities. A standout feature of the instruct models is their efficiency: they often produce an order of magnitude fewer tokens than comparable models while matching or exceeding response quality.
Accessibility and Strategic Partnerships
A critical aspect of this launch is the collaboration with NVIDIA, the vLLM project, and Red Hat to ensure Mistral 3 is both accessible and optimized.
Hardware Optimization
All models were trained on NVIDIA Hopper GPUs to take advantage of high-bandwidth HBM3e memory. In addition, an optimized checkpoint in NVFP4 format has been released, allowing Mistral Large 3 to run efficiently via vLLM on Blackwell NVL72 systems or on a single 8×A100 or 8×H100 node.
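Once a vLLM server is up, it exposes an OpenAI-compatible chat completions endpoint that can be queried with plain HTTP. The sketch below builds such a request using only the standard library; the server URL and the model identifier are placeholders for this example, not official names from the release:

```python
import json
import urllib.request

# Placeholders: adjust to your actual vLLM deployment and model id.
VLLM_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "mistral-large-3"  # hypothetical identifier, not an official name


def build_chat_request(prompt: str, model: str = MODEL_ID) -> dict:
    """Build an OpenAI-compatible chat completion payload for vLLM."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.7,
    }


payload = build_chat_request("Summarize the Mistral 3 release in one sentence.")
body = json.dumps(payload).encode("utf-8")

# Sending the request (requires a running vLLM server):
# req = urllib.request.Request(VLLM_URL, data=body,
#                              headers={"Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint follows the OpenAI schema, the same payload works unchanged with any OpenAI-compatible client library pointed at the vLLM server.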
Conclusion
By releasing these models under the Apache 2.0 license, Mistral is democratizing access to frontier-grade AI technology. Whether deploying complex cloud solutions with Large 3 or bringing intelligence to laptops and robots with Ministral, the Mistral 3 family offers unprecedented flexibility to developers and enterprises alike.
FAQ
What is Mistral 3?
Mistral 3 is the next generation of AI models from Mistral, featuring the powerful Mistral Large 3 and the compact Ministral 3 series (3B, 8B, 14B) for edge computing.
Is Mistral 3 open source?
Yes, all models in the Mistral 3 family, including base and instruct variants, are released under the Apache 2.0 license, allowing for broad commercial and research use.
What are the specs of Mistral Large 3?
It is a Mixture-of-Experts (MoE) model with 675 billion total parameters (41B active), trained on NVIDIA H200 GPUs, featuring multilingual and image understanding capabilities.
Where can I use or download Mistral 3?
The models are available on Mistral AI Studio, Hugging Face, and via cloud providers like Azure, AWS, Google Cloud, as well as platforms like NVIDIA NIM and OpenRouter.
What is the Ministral 3 series?
Ministral 3 consists of dense, small models designed for efficiency and local (edge) use, available in base, instruct, and reasoning variants with image support.