Is Mistral Large 3 really Apache 2.0, including commercial use?

Yes. The December 2, 2025 release post states all Mistral 3 family models are released under Apache 2.0, and the Hugging Face model card for Mistral-Large-3-675B-Instruct-2512 confirms it. That means commercial use, modification, fine-tuning, and redistribution with no MAU gates, attribution badges, or naming rules, plus Apache 2.0's patent grant. Note that Mistral's pricing-page FAQ still contains stale generic language about commercial licensing; the release post and model card are the controlling documents.

What hardware does Mistral Large 3 need to self-host?

It is a 675B-parameter Mixture-of-Experts model (41B active), so weights alone are roughly 1.35TB at BF16, about 675GB at FP8, and around 340GB at 4-bit quantization: an 8-GPU datacenter node at FP8, before KV cache. Per-token compute behaves like a 41B model once memory is provisioned. Most deployments should start on the hosted API or the Apache 2.0 Ministral 3 models (3B/8B/14B) and promote to self-hosted Large 3 only when volume or sovereignty requirements justify the cluster.

What does the Mistral Large 3 API cost?

Mistral's pricing page lists Mistral Large at $2 per million input tokens and $6 per million output tokens as of July 2026, served as mistral-large-latest on Mistral's platform. It is also available through Amazon Bedrock, Azure Foundry, IBM WatsonX, and major GPU clouds per the release announcement. On list price that undercuts the US frontier APIs, but we always compare cost per completed task on the client's own workload rather than per-token rates.

Why does Mistral Large 3 matter for EU sovereignty?

It is the only July 2026 combination of frontier scale, a permissive Apache 2.0 license, and an EU-headquartered publisher. Self-hosted, the weights run entirely inside your EU boundary with no vendor in the inference path and no third-country transfer to analyze; and the vendor-jurisdiction questions that stall EU procurement on US or Chinese providers largely disappear. The honest caveat: hosting on a US hyperscaler's EU region still leaves a US infrastructure provider in the stack, so full sovereignty is an infrastructure decision the model enables but cannot make for you.

How does Mistral Large 3 compare to DeepSeek V3 as an open-weights flagship?

They occupy the same hardware class (roughly 670GB at FP8, 8-GPU nodes) with similar MoE economics, and both carry clean permissive licenses: MIT for DeepSeek from V3-0324 onward, Apache 2.0 for Large 3. The differentiators are jurisdiction and modality: Mistral is a French publisher, which matters enormously in EU procurement, and Large 3 ships with a built-in vision encoder. Where sovereignty is not a constraint, we let task-level evals decide; where it is, the jurisdiction usually decides first.

Should we use Mistral Large 3 or a smaller Mistral model?

Route by difficulty, not by brand tier. The Apache 2.0 Ministral 3 line (3B, 8B, 14B) handles classification, extraction, and routine generation on single-GPU or smaller footprints, and in our architectures that lane carries most traffic. Reserve Large 3 for the requests that measurably need frontier capability, and note Mistral's own catalog positions Medium 3.5 as its agentic coding specialist. The right split is an eval result on your workload, then enforced by a routing gateway with logged escalation rates.

Does BearPlex deploy Mistral models in client work?

We evaluate the Mistral 3 family as the primary candidate wherever EU sovereignty or data-residency constraints are load-bearing: a two-tier Apache 2.0 stack (Ministral for volume, Large 3 at FP8 for the hard cases) inside the client's boundary, optionally with a hosted-API burst lane where the data-protection assessment permits it. As with every model we deploy, quantization levels, routing thresholds, and upgrade decisions are gated on the client's own task evals, not on vendor announcements.

Mistral Large 3: The Engineering Decision Brief

Mistral Large 3 is the model you evaluate when the first requirement in the RFP is not a benchmark, it is a jurisdiction. Plenty of models are capable; very few combine frontier-class scale, a genuinely permissive license, and a European vendor. As of July 2026 this is the only place where all three meet, and that intersection, not the leaderboard, is where Large 3 wins engagements. This brief covers what shipped, what Apache 2.0 actually settles, the sovereignty argument stated precisely, and the deployment math.

What it actually is

Mistral Large 3 shipped on December 2, 2025 as the flagship of the Mistral 3 family, and Mistral's own description is the headline: "a state-of-the-art, open-weight, general-purpose multimodal model." The verified specs from the model card: a granular Mixture-of-Experts design totaling 675B parameters with 41B active, composed of a 673B-parameter language model (39B active) plus a 2.5B vision encoder, with a 256K context window.

The family context matters for architecture: the same Apache 2.0 release included Ministral 3 at 14B, 8B, and 3B, giving the Large 3 stack a same-vendor, same-license distill ladder for volume traffic. Above it in Mistral's catalog (verified at docs.mistral.ai, July 2026) sit newer specialist releases, with Mistral Medium 3.5 (v26.04) positioned as the frontier-class agentic and coding model; the flagship *open-weight* large model remains Large 3 (v25.12).

The MoE math is the same lesson as every model in this class: 41B active parameters means per-token compute like a mid-size dense model, while 675B total means memory sized to the full parameter count. Cheap to run per token, expensive to hold.

The license, and what commercial use really permits

Apache 2.0, full stop. The release post states "all models are released under the Apache 2.0 license," and the model card confirms it. In the taxonomy of open-weight licensing this is the clean end of the spectrum, alongside Qwen 3 and MIT-licensed DeepSeek, and it is a meaningful break from Mistral's own earlier era of research-only licenses on flagship weights: no MAU gates, no attribution badges, no derivative naming rules (the Llama 4 contrast), fine-tuning and redistribution permitted, plus Apache 2.0's explicit patent grant, which some counsel prefer even over MIT.

One caution from our verification pass: Mistral's own pricing-page FAQ still carries stale generic language about commercial deployments requiring a separate Mistral license. For the Mistral 3 family, the release post and the Hugging Face model card are the controlling documents, and both say Apache 2.0. Have counsel cite those, not the FAQ.

The EU-sovereignty angle, stated precisely

"Sovereign AI" is mostly a marketing word; here is the engineering substance. A sovereignty-constrained deployment (EU public sector, healthcare, financial services under EU data-protection regimes) typically needs three properties: data never leaves a controlled boundary, the vendor relationship survives geopolitical friction, and the stack can be audited.

Mistral Large 3 is the strongest combined answer in the July 2026 market because each property has a concrete mechanism:

Data boundary: Apache 2.0 weights run in your EU datacenter or EU-region VPC. No inference traffic to any vendor, no third-country data transfer to analyze in the DPIA. This is a property of self-hosted open weights generally; what Large 3 adds is frontier scale under that posture.
Vendor jurisdiction: Mistral AI is a French company. For procurement teams whose legal review stalls on US or Chinese vendor jurisdiction, an EU-headquartered publisher removes the hardest questions, and does so even in the hosted-API scenario.
Auditability: weights are inspectable artifacts; your red team evaluates the actual deployed model, not a vendor attestation.

The honest caveat: if you self-host on Azure, AWS, or GCP EU regions, your *infrastructure* provider is still a US hyperscaler. Full sovereignty arguments end at EU-owned infrastructure, and that is an infrastructure decision Large 3 enables but does not make for you. We work through exactly this layering in sovereign cloud engagements.

Real cost: hosted API versus self-host

Hosted path. Large 3 serves as mistral-large-latest on Mistral's own platform, and the pricing page lists Mistral Large at $2 per million input tokens and $6 per million output tokens (as of July 2026). That is an aggressive flagship price: below Sonnet-class $3/$15 and far below gpt-5.5's $5/$30 on list. Availability is unusually broad for day one: Amazon Bedrock, Azure Foundry, Hugging Face, IBM WatsonX, and the major GPU clouds, per the release post.

Self-host path. Derived from the published parameter counts (weights only, before KV cache):

BF16: roughly 1.35TB of weights, a multi-node deployment. Rarely the right call.
FP8: roughly 675GB, an 8-GPU node in the 96GB-141GB-per-card class, the same footprint class as DeepSeek V3.
~4-bit: roughly 340GB, several datacenter GPUs, with mandatory quality evals at that quantization.
The Ministral ladder (3B/8B/14B, Apache 2.0) covers workstation-to-single-GPU territory for the traffic that does not need the flagship.

The pattern that survives contact with a budget: hosted API or Ministral-class self-host first, and the 675B flagship on your own metal only once volume, sovereignty requirements, or unit economics prove out the cluster.

When to use it, and when not

Use Mistral Large 3 when:

Sovereignty or data-residency constraints are load-bearing and you want frontier scale without a US or Chinese vendor in the loop.
You want one license (Apache 2.0) and one vendor family from 3B to 675B, with the same toolchain from pilot to flagship.
Multimodal input (the built-in vision encoder) matters inside a self-hosted boundary, where open-weight options are scarce.

Do not use it when:

You would only ever use the hosted API and have no sovereignty constraint; then it competes purely on price and your task evals against the frontier APIs, and the verdict is workload-specific.
Your workload is agentic coding and Mistral's own catalog points you at Medium 3.5 instead; run that comparison rather than assuming the flagship wins.
You need a warranty. Apache 2.0 weights carry none; contractual accountability comes from the hosted platforms, not the license.

How we would architect it for a client

For an EU-regulated deployment, the reference shape we use:

Two-tier serving inside the boundary: Ministral 3 14B on a single GPU for volume traffic, Large 3 at FP8 on an 8-GPU node for the requests that need the flagship, one Apache 2.0 license file covering both.
A routing gateway enforcing the tier split and logging escalation rates, the same discipline as our DeepSeek two-lane pattern.
Hosted-API burst lane, jurisdiction permitting: mistral-large-latest at $2/$6 absorbs load spikes so the on-prem cluster is sized for the median, not the peak. Where the DPIA forbids it, the gateway simply queues instead.
Eval-gated everything: quantization level, Ministral-versus-Large routing thresholds, and any migration to newer Mistral releases all move only when task-level evals say so, the standard we apply in every model engineering engagement.

The strategic read: Large 3 made "European frontier model" a real procurement category instead of a wish. If your constraints point there, it is not one option among many; as of July 2026 it is essentially the only complete answer.

Mistral Large 3Open-Weights LLM

What it actually is

The license, and what commercial use really permits

The EU-sovereignty angle, stated precisely

Real cost: hosted API versus self-host

When to use it, and when not

How we would architect it for a client

Frequently asked

Related work

Related reading

Shipping open-weights llm in production?