Should our agent prompts explicitly say 'Thought:' and 'Action:'?

Not necessarily. The original ReAct paper used those literal tags, but modern function calling and structured output make the structure implicit: the model produces reasoning followed by tool calls without needing explicit prefixes. Use explicit tags when working with older models or when you need strict format control; otherwise, modern agent frameworks handle the structure for you.
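For the explicit-tag case, strict format control usually means parsing the model's text yourself. The following is a minimal sketch, assuming the model emits one 'Thought:' line followed by an optional 'Action: tool_name(arguments)' line; the `Step` type and `parse_step` helper are hypothetical names used only for illustration, not part of any framework.

```python
import re
from typing import NamedTuple, Optional


class Step(NamedTuple):
    thought: str
    action: Optional[str]    # tool name, or None when no Action line is present
    argument: Optional[str]  # raw argument text between the parentheses


def parse_step(text: str) -> Step:
    """Parse one 'Thought: ... / Action: name(args)' step from raw model text."""
    # Capture the thought up to the Action line (or the end of the text).
    thought_match = re.search(r"Thought:\s*(.*?)(?=\n\s*Action:|\Z)", text, re.DOTALL)
    # Capture the tool name and the raw text inside the parentheses.
    action_match = re.search(r"Action:\s*(\w+)\((.*?)\)", text, re.DOTALL)
    thought = thought_match.group(1).strip() if thought_match else ""
    if action_match is None:
        return Step(thought, None, None)
    return Step(thought, action_match.group(1), action_match.group(2).strip())


example = parse_step(
    "Thought: I need to find the user's order status.\n"
    "Action: lookup_order(user_id=12345)"
)
```

With function calling or structured output, this parsing step disappears because the tool name and arguments arrive as structured fields rather than free text.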

How does ReAct affect token cost?

ReAct uses more tokens per task than direct action because of the explicit reasoning. For high-volume workloads this is a real cost: typically 30-100% more tokens than direct action. The benefit (a much higher task completion rate) usually justifies the cost, but for cost-sensitive applications we sometimes use shorter reasoning ('briefly: I need to...') or hybrid patterns where simple cases skip reasoning.
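To make the overhead concrete, here is a rough back-of-the-envelope sketch. The 30-100% range comes from the answer above; the 2,000-token baseline, the 10,000 tasks per day, and the price of $2 per million tokens are illustrative assumptions, not measured figures or real pricing.

```python
from typing import Tuple


def react_token_range(
    direct_tokens: int,
    low_overhead: float = 0.30,
    high_overhead: float = 1.00,
) -> Tuple[int, int]:
    """Estimated ReAct tokens per task, given direct-action tokens per task."""
    return (
        round(direct_tokens * (1 + low_overhead)),
        round(direct_tokens * (1 + high_overhead)),
    )


def daily_cost_usd(tokens_per_task: int, tasks_per_day: int, usd_per_million_tokens: float) -> float:
    """Daily spend for a workload at a flat per-token price (an assumed pricing model)."""
    return tokens_per_task * tasks_per_day * usd_per_million_tokens / 1_000_000


# Assumed workload: 2,000 tokens per direct-action task, 10,000 tasks per day.
low_tokens, high_tokens = react_token_range(2_000)
direct_cost = daily_cost_usd(2_000, 10_000, 2.0)
react_low_cost = daily_cost_usd(low_tokens, 10_000, 2.0)
react_high_cost = daily_cost_usd(high_tokens, 10_000, 2.0)
```

Under these assumptions, the ReAct version needs between 2,600 and 4,000 tokens per task, so the question becomes whether the improved completion rate is worth the extra daily spend.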

What's the difference between ReAct and chain-of-thought (CoT)?

Both involve explicit reasoning, but for different purposes. CoT is reasoning before producing a final answer (a single output). ReAct is reasoning before each action in an iterative loop where the model takes multiple actions interspersed with reasoning. CoT is for one-shot tasks; ReAct is for multi-step tasks involving tool use.

What is the ReAct Pattern (Reasoning + Acting)?

ReAct is an LLM agent design pattern where the model alternates between Reasoning steps (thinking through what to do next) and Acting steps (calling tools or taking actions), producing an interpretable trace of the agent's decision-making process and dramatically improving task completion compared to direct action without reasoning.

Last updated 2026-04-29 | BearPlex AI Engineering Team

Overview

ReAct was introduced by Yao et al. (Princeton, 2022) as one of the foundational LLM agent design patterns. The core insight: LLMs perform much better on multi-step tasks when they explicitly reason about what to do before doing it, rather than directly emitting actions. The pattern produces traces like: 'Thought: I need to find the user's order status. Action: lookup_order(user_id=12345). Observation: Order is shipped, tracking #ABC123. Thought: Now I should explain this to the user. Action: send_response(...).' This explicit reasoning-action loop is now the foundation of essentially every production agent system. ReAct's intellectual descendants include the modern agent loop in LangGraph, Claude's tool use behavior, OpenAI's function calling patterns, and the autonomous agent frameworks that have emerged since 2023.

How ReAct works

The agent loop: (1) The LLM receives the user's request plus available tools; (2) It generates a reasoning step ('Thought:') explaining what it needs to do; (3) It generates an action step ('Action:') invoking a tool with arguments; (4) The tool executes and returns a result ('Observation:'); (5) The LLM uses the observation to inform the next reasoning step; (6) Loop until the task is complete (signaled by a final answer instead of another action). Modern implementations (LangGraph, Claude tool use) often use structured output instead of literal 'Thought:' / 'Action:' / 'Observation:' tags, but the underlying pattern is the same. The key benefit: explicit reasoning improves task completion rates significantly; Yao et al.'s original paper showed 10-30% improvement on complex multi-step tasks compared to direct action prompting.
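The six steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not a reference implementation: `ModelStep`, `run_react`, and `scripted_model` are hypothetical names, and the scripted model stands in for a real LLM call so the example runs without any API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ModelStep:
    thought: str
    tool: Optional[str] = None                  # tool to call, or None when finished
    args: Dict[str, object] = field(default_factory=dict)
    final_answer: Optional[str] = None


def run_react(
    model: Callable[[List[str]], ModelStep],
    tools: Dict[str, Callable[..., str]],
    request: str,
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate Thought -> Action -> Observation until the model gives a final answer."""
    transcript = [f"User: {request}"]
    for _ in range(max_steps):                  # bound the loop to avoid infinite reasoning
        step = model(transcript)
        transcript.append(f"Thought: {step.thought}")
        if step.tool is None:                   # no action requested: the task is complete
            return step.final_answer or "", transcript
        transcript.append(f"Action: {step.tool}({step.args})")
        try:
            observation = tools[step.tool](**step.args)
        except Exception as exc:                # feed errors back as observations
            observation = f"error: {exc}"
        transcript.append(f"Observation: {observation}")
    raise RuntimeError("max_steps reached without a final answer")


def scripted_model(transcript: List[str]) -> ModelStep:
    """Stand-in for an LLM: acts once, then answers after seeing an observation."""
    if not any(line.startswith("Observation:") for line in transcript):
        return ModelStep("I need to find the user's order status.",
                         tool="lookup_order", args={"user_id": 12345})
    return ModelStep("Now I should explain this to the user.",
                     final_answer="Your order has shipped (tracking #ABC123).")


def lookup_order(user_id: int) -> str:
    return "Order is shipped, tracking #ABC123"


answer, trace = run_react(scripted_model, {"lookup_order": lookup_order}, "Where is my order?")
```

In a real system the `model` callable would wrap an LLM request that receives the transcript and tool descriptions. The loop itself stays the same whether the steps arrive as literal tags or as structured tool-call output.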

ReAct vs newer patterns

ReAct kicked off a generation of agent design patterns: (1) ReAct (original, 2022): alternates reasoning and acting, and is the foundation for the rest; (2) ReWOO (Reasoning Without Observation, 2023): pre-plans the full action sequence before executing, which separates planning from execution and reduces token cost; (3) Reflexion (2023): adds explicit self-reflection after failures, learning from past mistakes within the same task; (4) Tree of Thoughts (2023): explores multiple reasoning paths in parallel and picks the best; (5) Modern structured agents (LangGraph, Claude Agent SDK): combine ReAct's reasoning-action loop with explicit state management, conditional branching, and human-in-the-loop checkpoints. The lineage matters: most modern agent patterns are evolutions of ReAct's basic insight that LLMs benefit from explicit reasoning before action.
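To show how pre-planning differs from ReAct's interleaved loop, here is a simplified plan-then-execute sketch in the spirit of ReWOO. It is an illustration under assumptions, not the paper's implementation: the plan is hard-coded where a planner LLM call would normally produce it, and `execute_plan`, the `#E1`-style placeholders, and both tools are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Each plan step: (evidence id, tool name, argument template).
# Templates may reference earlier evidence ids such as "#E1".
Plan = List[Tuple[str, str, str]]


def execute_plan(plan: Plan, tools: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Run a pre-made plan in order, with no model call between tool calls."""
    evidence: Dict[str, str] = {}
    for evidence_id, tool_name, template in plan:
        argument = template
        # Substitute results of earlier steps into this step's argument.
        # (Naive string replacement: fine for a few steps, ambiguous past "#E9".)
        for key, value in evidence.items():
            argument = argument.replace(key, value)
        evidence[evidence_id] = tools[tool_name](argument)
    return evidence


# Hypothetical tools for illustration only.
tools = {
    "lookup_tracking": lambda user_id: "ABC123",
    "track_shipment": lambda tracking: f"Shipment {tracking} is in transit",
}

# In a real system this plan would come from a single planner call.
plan: Plan = [
    ("#E1", "lookup_tracking", "12345"),
    ("#E2", "track_shipment", "#E1"),
]

evidence = execute_plan(plan, tools)
```

The token saving comes from calling the model once to plan and once to summarize the collected evidence, rather than once per step. The trade-off is that the plan cannot adapt to surprising observations mid-run.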

Production ReAct patterns

Practical considerations from BearPlex production agent work: (1) Token cost: ReAct's explicit reasoning consumes more tokens per task than direct action, which matters for cost-sensitive workloads; (2) Latency: multiple LLM calls per task means total latency is N × per-call latency, mitigated by parallel tool calls when reasoning permits; (3) Debugging: the explicit reasoning trace is invaluable for debugging agent failures, so we capture all reasoning steps in audit logs; (4) Tool design: ReAct works best with well-designed tools that have clear descriptions, predictable behavior, and informative error messages, because vague tools cause vague reasoning; (5) Safety: explicit reasoning makes it easier to validate agent intent before destructive actions execute (we sometimes pause between reasoning and action for human approval on high-stakes operations).
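Points (3) and (5) can be combined in one small wrapper around tool execution. The sketch below is a hypothetical illustration, not BearPlex's actual implementation: `gated_execute`, the `needs_approval` set, the `approve` callback, and the `issue_refund` tool are all assumed names.

```python
import json
from typing import Callable, Dict, List, Set


def gated_execute(
    thought: str,
    tool_name: str,
    args: Dict[str, object],
    tools: Dict[str, Callable[..., str]],
    needs_approval: Set[str],
    approve: Callable[[str, str, Dict[str, object]], bool],
    audit_log: List[str],
) -> str:
    """Log the reasoning behind an action and pause for approval on high-stakes tools."""
    record = {"thought": thought, "tool": tool_name, "args": args}
    # Ask a human only for tools flagged as high-stakes.
    if tool_name in needs_approval and not approve(thought, tool_name, args):
        record["status"] = "rejected"
        audit_log.append(json.dumps(record))
        return "Action rejected by reviewer"
    result = tools[tool_name](**args)
    record["status"] = "executed"
    audit_log.append(json.dumps(record))
    return result


def issue_refund(order_id: str) -> str:
    return f"Refunded order {order_id}"


audit_log: List[str] = []
outcome = gated_execute(
    thought="The customer was double-charged, so I should refund the order.",
    tool_name="issue_refund",
    args={"order_id": "A-100"},
    tools={"issue_refund": issue_refund},
    needs_approval={"issue_refund"},
    approve=lambda thought, tool, args: False,  # the reviewer declines in this example
    audit_log=audit_log,
)
```

Returning the rejection as an observation, rather than raising, lets the agent reason about the refusal on its next step. The logged thought gives the reviewer and later debuggers the stated intent behind each action.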

Use cases

Foundation pattern for nearly every modern LLM agent system
Customer support agents that handle multi-step ticket resolution
Autonomous research agents that combine search, reading, and synthesis
Code generation agents that plan, execute, and verify their work
Data analysis agents that query, analyze, and report findings

Examples in production

Princeton (2022)

Yao et al.'s 'ReAct: Synergizing Reasoning and Acting in Language Models' introduced the foundational pattern that defined modern LLM agent design.

LangChain / LangGraph

Both frameworks implement ReAct-pattern agents as core capabilities; LangGraph's modern agent design extends ReAct with explicit state management.

Anthropic Claude

Claude's tool use behavior follows ReAct-like patterns by default; the model naturally produces reasoning before action when given access to tools.

ReAct Pattern compared to alternatives

Alternative	Choose ReAct Pattern when	Choose alternative when
Direct action (no reasoning): LLM emits actions without explicit reasoning steps	The task is non-trivial and multi-step; task completion improves significantly	The task is single-step and reasoning would be wasteful
Tree of Thoughts: explores multiple reasoning paths in parallel	The task needs sequential reasoning and acting; this is the production default	The task benefits from exploring alternatives (puzzles, optimization)

Common pitfalls

Skipping explicit reasoning to save tokens: usually hurts task completion more than it saves cost
Designing vague tools: vague tools cause vague reasoning and worse decisions
Not capturing reasoning traces in audit logs: losing the most valuable debugging artifact
Allowing infinite reasoning loops: bound the maximum number of iterations per task
Treating ReAct as a complete agent design: modern production agents add state management, error handling, and HITL on top

FAQ

Questions about ReAct Pattern.

Yes: fundamentally. Almost every production agent system uses some form of the ReAct pattern (reasoning before action, alternating with observations from tool results). Modern frameworks like LangGraph and the Claude Agent SDK build on ReAct with state management, conditional flow, and HITL, but the underlying reasoning-action loop is unchanged.
