What does Grok 4.3 cost via the xAI API?

Per xAI's models documentation as of July 2026: $1.25 per million input tokens and $2.50 per million output tokens, with a 1M-token context window and no published long-context surcharge. The grok-4.20 variants (reasoning, non-reasoning, multi-agent) share the same price, and grok-build-0.1 runs $1.00/$2.00 at 256K context. Cached-input and search-tool pricing were not published on the page we verified, so budget those conservatively.

Is Grok 4.3 really cheaper than GPT-5.5 and Claude?

On list price, dramatically: $2.50 per million output tokens versus $30.00 for gpt-5.5 and $15.00 for Sonnet-class Claude models as of July 2026, a 6-12x gap on the token type that dominates agentic workloads. Whether that survives contact with your workload depends on quality: a model that needs more retries to produce a verified result gives back part of its price advantage. We compare cost per completed, verified task rather than per-token rates, and recommend you do the same before migrating anything.

What happened to the original Grok 4?

It no longer appears on xAI's current models page as of July 2026, roughly a year after its mid-2025 launch; the current lineup is Grok 4.3, the grok-4.20 variant family, and grok-build. The takeaway for buyers is cadence: xAI ships and retires models quickly, and we could not verify a published minimum-notice deprecation policy comparable to OpenAI's six-month commitment. Pin dated snapshots, route through a gateway, and keep migration evals ready.

What is Grok 4.3's knowledge cutoff?

xAI's docs state a November 2024 cutoff for Grok 3 and Grok 4, and publish no cutoff for Grok 4.3 on the models page we verified in July 2026. Engineering-wise the answer is to make the question irrelevant: ground any knowledge-dependent workload with retrieval or search tooling and treat parametric knowledge as a convenience, not a source of record. That is our standard practice on every vendor; the unverifiable cutoff here just makes it mandatory.

Is the xAI API enterprise-ready?

The integration surface is genuinely easy: published pricing, simple model list, aliased names for automatic updates. What we could not verify on the public docs is the enterprise packaging around it: data-residency options, compliance attestations, and capacity reservation terms are documented far more thinly than the Azure OpenAI, Bedrock, or Vertex paths. That is a procurement work item rather than a disqualifier: if your regime needs specific guarantees, obtain them from xAI in writing during the pilot phase, not after the architecture is committed.

Where does Grok 4.3 fit in a multi-model architecture?

Two places. First, as the volume lane for output-heavy agentic workloads where its $2.50 output price changes unit economics, provided it wins a cost-per-verified-task bake-off on your workload. Second, as portfolio pressure: even where Grok does not win a lane, having a credible 4-12x cheaper quote in your gateway disciplines pricing conversations with incumbent vendors. In both roles it sits behind a model-agnostic gateway with pinned snapshots, never as a hard dependency.

Does BearPlex deploy Grok in client work?

We evaluate it wherever cost-dominated, output-heavy workloads meet a client risk profile that can absorb a fast-moving vendor: gateway-wrapped, snapshot-pinned, grounded with retrieval, and gated on a cost-per-verified-task bake-off plus tone and refusal-behavior testing against the client's content policies. For compliance-constrained deployments we require written answers from xAI on residency and attestations before it enters the architecture. Whether it ships is decided by those evals and that paperwork, not by the price sheet.

Grok 4.3: The Engineering Decision Brief

Grok 4.3 is the cheapest flagship-tier API of mid-2026 by a wide margin, and that single fact is why it lands on evaluation shortlists that would not otherwise include xAI. The engineering job is to work out what the price does and does not buy you. This brief covers the verified numbers, the state of the xAI API as an integration target, and a candid account of where Grok fits in a production portfolio and where it does not.

What it actually is

Grok 4.3 went live on the xAI API in the spring of 2026 (xAI's own announcement that it was live on the API is dated May 5, 2026), and xAI's models documentation is unambiguous about its positioning: "It is the most intelligent and fastest model we've built," recommended for chat and coding. The verified spec: a 1M-token context window at $1.25 per million input tokens and $2.50 per million output tokens (as of July 2026).

The current lineup around it, verified on the same page, tells you how xAI thinks about model surface:

grok-4.20 (the March 2026 generation, snapshot-dated -0309) ships as three explicit variants: reasoning, non-reasoning, and multi-agent, all at 1M context and the same $1.25/$2.50 price.
grok-build-0.1: a 256K-context builder-focused model at $1.00/$2.00.
The docs recommend the aliased names (<modelname> or <modelname>-latest) so you pick up point releases automatically; in production we pin snapshots instead, on every vendor, for the same reason we pin dependencies.

Note what is *not* on that page as of July 2026: the original Grok 4 and Grok 3. Barely a year after Grok 4's mid-2025 launch, the pre-4.2 lineup has been retired from the current model table entirely. Hold that thought for the risk section.

Commercial terms

Hosted API under xAI's terms; no weights, no self-host path. Pricing is published and simple, which is genuinely to xAI's credit: one flagship price, no long-context surcharge published for the 1M window, no priority-tier multiplication table. Two commercial observations from our verification pass:

The price is the strategy. $1.25/$2.50 against gpt-5.5's $5/$30 and Sonnet-class $3/$15 is a 4-12x list-price gap on output tokens. xAI is buying market share, and buyers should enjoy it while pricing it as promotional rather than structural.
The enterprise story is thinner than the price sheet. The public model docs we verified say little about data residency, compliance attestations, or capacity reservations compared to the documentation depth of Azure OpenAI, Bedrock, or Vertex paths. That is not an accusation, it is a procurement work item: if your compliance regime needs specific guarantees, get them in writing from xAI before committing an architecture, because the docs alone will not answer your security questionnaire.

Real API cost

Per the official models page, as of July 2026, per million tokens:

| Model | Context | Input | Output | |---|---|---|---| | grok-4.3 | 1M | $1.25 | $2.50 | | grok-4.20 (reasoning / non-reasoning / multi-agent) | 1M | $1.25 | $2.50 | | grok-build-0.1 | 256K | $1.00 | $2.00 |

The output price is the story. Agentic workloads are output-heavy (tool calls, drafts, retries, reasoning tokens), and at $2.50 per million output tokens, Grok 4.3's cost per completed agent task can undercut rivals even if it needs more attempts. That "even if" is measurable, and you should measure it: our standard is cost per *completed, verified* task, not cost per token, and models with higher retry rates lose more of their price advantage than teams expect.

What we could not verify on the public page: cached-input pricing and per-call pricing for live search or web-search tooling. Budget conservatively until your own invoices tell you otherwise.

API access reality

Integration is deliberately low-friction, and the docs' recommended defaults reveal the intended audience: aliased model names, automatic feature pickup, one price. For an engineering team, the realities to plan around:

Model churn is fast and real. The original Grok 4 went from flagship launch to absent-from-the-docs in roughly a year, and the 4.20 generation shipped with date-stamped snapshots barely two months before 4.3 superseded it. That cadence is OpenAI-speed, without the published minimum-notice deprecation policy we could verify on OpenAI's side. Pin snapshots, wrap the vendor behind your gateway, and keep your eval suite ready for forced migrations.
The stale-cutoff trap. The docs state a November 2024 knowledge cutoff for Grok 3 and Grok 4 and publish no cutoff for 4.3 on the models page we verified. Treat parametric knowledge as unreliable for anything recent and ground the model with retrieval or search tooling, which is good practice on every vendor and mandatory here.
The variant surface is unusual. Explicit reasoning versus non-reasoning versus multi-agent SKUs at identical prices means routing is by capability rather than by budget: pick the variant per lane the same way you would set reasoning_effort elsewhere.

When to use it, and when not

Use Grok 4.3 when:

Output-token economics dominate: high-volume agentic or generation workloads where $2.50 output changes the unit economics, validated by your own cost-per-completed-task evals.
You need a 1M-token window without a long-context surcharge and your inputs genuinely are that long.
It serves as the price-pressure lane in a multi-vendor gateway: even when Grok does not win a lane, its quote disciplines your negotiation with the vendors that do.

Do not use it when:

Your compliance regime needs documented residency, attestations, or contractual capacity that you have not yet obtained from xAI in writing. Price does not answer a security questionnaire.
Brand-sensitivity review is part of your deployment gate and you have not run it here. Grok's consumer persona is distinctive by design; your evaluation should include tone and refusal-behavior testing against your own content policies, exactly as we recommend for every vendor, and with extra care here.
You need weights, a self-host path, or multi-year model stability. None of the three is on offer; for the first two see the open-weights conversation.

How we would architect it for a client

Grok enters our architectures the same way every frontier API does: as a lane behind a gateway, never as a hard dependency.

Gateway-wrapped, snapshot-pinned. Aliased names are convenient and we do not use them in production. Model IDs live in config; xAI's churn record makes this non-negotiable.
Cost-per-verified-task bake-off against the incumbent lane on the client's own workload, with retry and failure rates in the denominator. If Grok's list-price advantage survives that math, it earns the volume lane it is priced for.
Grounding by default: retrieval or search tooling in front of any knowledge-dependent workload, given the unverifiable cutoff situation.
Procurement in parallel with engineering: the compliance questions (residency, attestations, capacity terms) go to xAI in writing during the pilot, not after it, so a technical win cannot be stranded by an unanswerable questionnaire. This is standard model engineering discipline; Grok just makes it visibly necessary.

The one-line verdict: Grok 4.3 is a serious, aggressively priced option for cost-dominated agentic workloads, and a portfolio instrument even where it does not win. It is not yet the model you build a compliance-constrained system around on documentation alone.

Grok 4.3Frontier LLM

What it actually is

Commercial terms

Real API cost

API access reality

When to use it, and when not

How we would architect it for a client

Frequently asked

Related work

Related reading

Shipping frontier llm in production?