Skip to main content
All model briefs
2026.07.03Open-Weights LLM
10 min read

Mistral Large 3Open-Weights LLM

Europe's frontier model went Apache 2.0: what that unlocks for sovereignty-constrained deployments, and what it costs to actually run 675B parameters.

Hamad Pervaiz
Hamad Pervaiz
Founder & CEO, BearPlex
Share
Reference
Parameters
675B MoE (41B active) incl. 2.5B vision encoder
Base model
-
License
Apache 2.0
Publisher
Mistral AI
Paper date
2025.12.02

Mistral Large 3 is the model you evaluate when the first requirement in the RFP is not a benchmark, it is a jurisdiction. Plenty of models are capable; very few combine frontier-class scale, a genuinely permissive license, and a European vendor. As of July 2026 this is the only place where all three meet, and that intersection, not the leaderboard, is where Large 3 wins engagements. This brief covers what shipped, what Apache 2.0 actually settles, the sovereignty argument stated precisely, and the deployment math.

What it actually is

Mistral Large 3 shipped on December 2, 2025 as the flagship of the Mistral 3 family, and Mistral's own description is the headline: "a state-of-the-art, open-weight, general-purpose multimodal model." The verified specs from the model card: a granular Mixture-of-Experts design totaling 675B parameters with 41B active, composed of a 673B-parameter language model (39B active) plus a 2.5B vision encoder, with a 256K context window.

The family context matters for architecture: the same Apache 2.0 release included Ministral 3 at 14B, 8B, and 3B, giving the Large 3 stack a same-vendor, same-license distill ladder for volume traffic. Above it in Mistral's catalog (verified at docs.mistral.ai, July 2026) sit newer specialist releases, with Mistral Medium 3.5 (v26.04) positioned as the frontier-class agentic and coding model; the flagship *open-weight* large model remains Large 3 (v25.12).

The MoE math is the same lesson as every model in this class: 41B active parameters means per-token compute like a mid-size dense model, while 675B total means memory sized to the full parameter count. Cheap to run per token, expensive to hold.

The license, and what commercial use really permits

Apache 2.0, full stop. The release post states "all models are released under the Apache 2.0 license," and the model card confirms it. In the taxonomy of open-weight licensing this is the clean end of the spectrum, alongside Qwen 3 and MIT-licensed DeepSeek, and it is a meaningful break from Mistral's own earlier era of research-only licenses on flagship weights: no MAU gates, no attribution badges, no derivative naming rules (the Llama 4 contrast), fine-tuning and redistribution permitted, plus Apache 2.0's explicit patent grant, which some counsel prefer even over MIT.

One caution from our verification pass: Mistral's own pricing-page FAQ still carries stale generic language about commercial deployments requiring a separate Mistral license. For the Mistral 3 family, the release post and the Hugging Face model card are the controlling documents, and both say Apache 2.0. Have counsel cite those, not the FAQ.

The EU-sovereignty angle, stated precisely

"Sovereign AI" is mostly a marketing word; here is the engineering substance. A sovereignty-constrained deployment (EU public sector, healthcare, financial services under EU data-protection regimes) typically needs three properties: data never leaves a controlled boundary, the vendor relationship survives geopolitical friction, and the stack can be audited.

Mistral Large 3 is the strongest combined answer in the July 2026 market because each property has a concrete mechanism:

  • Data boundary: Apache 2.0 weights run in your EU datacenter or EU-region VPC. No inference traffic to any vendor, no third-country data transfer to analyze in the DPIA. This is a property of self-hosted open weights generally; what Large 3 adds is frontier scale under that posture.
  • Vendor jurisdiction: Mistral AI is a French company. For procurement teams whose legal review stalls on US or Chinese vendor jurisdiction, an EU-headquartered publisher removes the hardest questions, and does so even in the hosted-API scenario.
  • Auditability: weights are inspectable artifacts; your red team evaluates the actual deployed model, not a vendor attestation.

The honest caveat: if you self-host on Azure, AWS, or GCP EU regions, your *infrastructure* provider is still a US hyperscaler. Full sovereignty arguments end at EU-owned infrastructure, and that is an infrastructure decision Large 3 enables but does not make for you. We work through exactly this layering in sovereign cloud engagements.

Real cost: hosted API versus self-host

Hosted path. Large 3 serves as mistral-large-latest on Mistral's own platform, and the pricing page lists Mistral Large at $2 per million input tokens and $6 per million output tokens (as of July 2026). That is an aggressive flagship price: below Sonnet-class $3/$15 and far below gpt-5.5's $5/$30 on list. Availability is unusually broad for day one: Amazon Bedrock, Azure Foundry, Hugging Face, IBM WatsonX, and the major GPU clouds, per the release post.

Self-host path. Derived from the published parameter counts (weights only, before KV cache):

  • BF16: roughly 1.35TB of weights, a multi-node deployment. Rarely the right call.
  • FP8: roughly 675GB, an 8-GPU node in the 96GB-141GB-per-card class, the same footprint class as DeepSeek V3.
  • ~4-bit: roughly 340GB, several datacenter GPUs, with mandatory quality evals at that quantization.
  • The Ministral ladder (3B/8B/14B, Apache 2.0) covers workstation-to-single-GPU territory for the traffic that does not need the flagship.

The pattern that survives contact with a budget: hosted API or Ministral-class self-host first, and the 675B flagship on your own metal only once volume, sovereignty requirements, or unit economics prove out the cluster.

When to use it, and when not

Use Mistral Large 3 when:

  • Sovereignty or data-residency constraints are load-bearing and you want frontier scale without a US or Chinese vendor in the loop.
  • You want one license (Apache 2.0) and one vendor family from 3B to 675B, with the same toolchain from pilot to flagship.
  • Multimodal input (the built-in vision encoder) matters inside a self-hosted boundary, where open-weight options are scarce.

Do not use it when:

  • You would only ever use the hosted API and have no sovereignty constraint; then it competes purely on price and your task evals against the frontier APIs, and the verdict is workload-specific.
  • Your workload is agentic coding and Mistral's own catalog points you at Medium 3.5 instead; run that comparison rather than assuming the flagship wins.
  • You need a warranty. Apache 2.0 weights carry none; contractual accountability comes from the hosted platforms, not the license.

How we would architect it for a client

For an EU-regulated deployment, the reference shape we use:

  1. Two-tier serving inside the boundary: Ministral 3 14B on a single GPU for volume traffic, Large 3 at FP8 on an 8-GPU node for the requests that need the flagship, one Apache 2.0 license file covering both.
  2. A routing gateway enforcing the tier split and logging escalation rates, the same discipline as our DeepSeek two-lane pattern.
  3. Hosted-API burst lane, jurisdiction permitting: mistral-large-latest at $2/$6 absorbs load spikes so the on-prem cluster is sized for the median, not the peak. Where the DPIA forbids it, the gateway simply queues instead.
  4. Eval-gated everything: quantization level, Ministral-versus-Large routing thresholds, and any migration to newer Mistral releases all move only when task-level evals say so, the standard we apply in every model engineering engagement.

The strategic read: Large 3 made "European frontier model" a real procurement category instead of a wish. If your constraints point there, it is not one option among many; as of July 2026 it is essentially the only complete answer.

Frequently asked

Yes. The December 2, 2025 release post states all Mistral 3 family models are released under Apache 2.0, and the Hugging Face model card for Mistral-Large-3-675B-Instruct-2512 confirms it. That means commercial use, modification, fine-tuning, and redistribution with no MAU gates, attribution badges, or naming rules, plus Apache 2.0's patent grant. Note that Mistral's pricing-page FAQ still contains stale generic language about commercial licensing; the release post and model card are the controlling documents.

Shipping open-weights llm in production?

BearPlex engineers AI systems for regulated enterprises. If you're evaluating a model like Mistral Large 3 for production, we'd like to talk.