What is the ReAct Pattern (Reasoning + Acting)?
ReAct is an LLM agent design pattern in which the model alternates between reasoning steps (thinking through what to do next) and acting steps (calling tools or taking actions). The result is an interpretable trace of the agent's decision-making process and, on multi-step tasks, markedly better task completion than emitting actions directly without reasoning.
Overview
ReAct was introduced by Yao et al. (Princeton, 2022) as one of the foundational LLM agent design patterns. The core insight is that LLMs perform much better on multi-step tasks when they explicitly reason about what to do before doing it, rather than directly emitting actions. The pattern produces traces like: 'Thought: I need to find the user's order status. Action: lookup_order(user_id=12345). Observation: Order is shipped, tracking #ABC123. Thought: Now I should explain this to the user. Action: send_response(...).' This explicit reasoning-action loop now underlies most production agent systems. ReAct's intellectual descendants include the modern agent loop in LangGraph, Claude's tool use behavior, OpenAI's function calling patterns, and the autonomous agent frameworks that have emerged since 2023.
How ReAct works
The agent loop: (1) The LLM receives the user's request plus the list of available tools; (2) It generates a reasoning step ('Thought:') explaining what it needs to do; (3) It generates an action step ('Action:') invoking a tool with arguments; (4) The tool executes and returns a result ('Observation:'); (5) The LLM uses the observation to inform the next reasoning step; (6) The loop repeats until the task is complete, which the model signals by producing a final answer instead of another action. Modern implementations (LangGraph, Claude tool use) often use structured output instead of literal 'Thought:' / 'Action:' / 'Observation:' tags, but the underlying pattern is the same. The key benefit is that explicit reasoning improves task completion rates: Yao et al.'s original paper reported improvements of roughly 10-30% over action-only prompting on interactive multi-step benchmarks such as ALFWorld and WebShop.
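The six steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any framework's implementation: `scripted_model` is a hypothetical stand-in for a real LLM call, `Step` and `run_react` are names invented here, and `lookup_order` is a stub borrowed from the example trace.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


def lookup_order(user_id: int) -> str:
    """Hypothetical tool; a real one would query an order system."""
    return "Order is shipped, tracking #ABC123"


# Hypothetical tool registry: tool name -> callable.
TOOLS: Dict[str, Callable[..., str]] = {"lookup_order": lookup_order}


@dataclass
class Step:
    thought: str
    action: Optional[str] = None          # tool name, or None when finishing
    args: dict = field(default_factory=dict)
    final_answer: Optional[str] = None    # set only on the last step


def scripted_model(history: List[str]) -> Step:
    """Stand-in for an LLM call (assumption): acts once, then answers."""
    if not any(line.startswith("Observation:") for line in history):
        return Step("I need to find the user's order status.",
                    "lookup_order", {"user_id": 12345})
    return Step("Now I should explain this to the user.",
                final_answer="Your order has shipped (tracking #ABC123).")


def run_react(request: str,
              model: Callable[[List[str]], Step] = scripted_model,
              max_steps: int = 5) -> Tuple[str, List[str]]:
    history = [f"Request: {request}"]                # (1) request; tools are in TOOLS
    for _ in range(max_steps):                       # bounded loop, never infinite
        step = model(history)
        history.append(f"Thought: {step.thought}")   # (2) reasoning step
        if step.final_answer is not None:            # (6) final answer ends the loop
            return step.final_answer, history
        history.append(f"Action: {step.action}({step.args})")  # (3) action step
        try:
            observation = TOOLS[step.action](**step.args)      # (4) tool executes
        except Exception as exc:
            # Feed failures back to the model instead of crashing the loop.
            observation = f"error: {exc!r}"
        history.append(f"Observation: {observation}")          # (5) informs next thought
    raise RuntimeError(f"no final answer within {max_steps} steps")
```

Calling `run_react("Where is my order?")` returns the final answer together with the full Thought / Action / Observation trace, which is the artifact worth logging. With a real model, `scripted_model` would be replaced by a function that sends `history` to the LLM and parses its reply into a `Step`.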
ReAct vs newer patterns
ReAct kicked off a generation of agent design patterns: (1) ReAct (original, 2022): alternates reasoning and acting; the foundation; (2) ReWOO (Reasoning Without Observation, 2023): pre-plans the full action sequence before executing, separating planning from execution; reduces token cost; (3) Reflexion (2023): adds explicit self-reflection after failures, so the agent learns from past mistakes across retries of the same task; (4) Tree of Thoughts (2023): explores multiple candidate reasoning paths and keeps the most promising one; (5) Modern structured agents (LangGraph, Claude Agent SDK): combine ReAct's reasoning-action loop with explicit state management, conditional branching, and human-in-the-loop checkpoints. The lineage matters: most modern agent patterns are evolutions of ReAct's basic insight that LLMs benefit from explicit reasoning before action.
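To make the difference between ReAct and a ReWOO-style flow concrete, here is a simplified plan-then-execute sketch. It is an illustration of the idea only, not the ReWOO paper's implementation: the `plan` function stands in for a single planning LLM call, the `#E1`-style evidence placeholders and the `search` / `summarize` tools are hypothetical, and the final answer-writing step is omitted.

```python
from typing import Callable, Dict, List, Tuple


def search(query: str) -> str:
    """Hypothetical tool."""
    return f"results for [{query}]"


def summarize(text: str) -> str:
    """Hypothetical tool."""
    return f"summary of ({text})"


TOOLS: Dict[str, Callable[[str], str]] = {"search": search, "summarize": summarize}

PlanStep = Tuple[str, str, str]  # (evidence id, tool name, argument)


def plan(task: str) -> List[PlanStep]:
    """Stand-in for ONE planning LLM call that emits the whole action sequence.

    Later steps refer to earlier results through placeholders such as #E1.
    """
    return [("#E1", "search", task), ("#E2", "summarize", "#E1")]


def execute(steps: List[PlanStep]) -> Dict[str, str]:
    """Run the plan without calling the model between steps."""
    evidence: Dict[str, str] = {}
    for evidence_id, tool, argument in steps:
        # Substitute earlier evidence into the argument before calling the tool.
        for key, value in evidence.items():
            argument = argument.replace(key, value)
        evidence[evidence_id] = TOOLS[tool](argument)
    return evidence
```

In a ReAct loop the model would be called again after every observation; here it is called once up front, which is where the token savings come from. The trade-off is that the plan cannot adapt to a surprising tool result midway.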
Production ReAct patterns
Practical considerations from BearPlex production agent work: (1) Token cost: ReAct's explicit reasoning consumes more tokens per task than direct action; for cost-sensitive workloads this matters; (2) Latency: a task that needs N LLM calls takes roughly N × per-call latency plus tool execution time; this is mitigated by issuing tool calls in parallel when the steps are independent; (3) Debugging: the explicit reasoning trace is invaluable for debugging agent failures; we capture all reasoning steps in audit logs; (4) Tool design: ReAct works best with well-designed tools that have clear descriptions, predictable behavior, and informative error messages; vague tools cause vague reasoning; (5) Safety: explicit reasoning makes it easier to validate agent intent before destructive actions execute (we sometimes pause between reasoning and action for human approval on high-stakes operations).
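Points (3) and (5) can be combined in one small wrapper around tool execution. The sketch below is one possible shape, not a description of any specific production system: `execute_with_gate`, the `DESTRUCTIVE_TOOLS` set, and the tool names are all hypothetical, and the `approve` callback stands in for whatever human-approval mechanism is in use.

```python
import json
import time
from typing import Callable, Dict, List

# Hypothetical list of tools that need human sign-off before running.
DESTRUCTIVE_TOOLS = {"delete_account", "issue_refund"}


def execute_with_gate(thought: str,
                      tool: str,
                      args: dict,
                      tools: Dict[str, Callable[..., str]],
                      approve: Callable[[str, str, dict], bool],
                      audit_log: List[str]) -> str:
    """Log the reasoning step, then run the tool unless a reviewer rejects it."""
    entry = {"ts": time.time(), "thought": thought, "tool": tool, "args": args}
    if tool in DESTRUCTIVE_TOOLS and not approve(thought, tool, args):
        entry["status"] = "rejected"
        audit_log.append(json.dumps(entry))
        # Returned as an observation so the agent can reason about the refusal.
        return "Action rejected by human reviewer."
    result = tools[tool](**args)
    entry["status"] = "executed"
    entry["observation"] = result
    audit_log.append(json.dumps(entry))
    return result
```

Because the reviewer sees the thought alongside the proposed action, they can judge intent rather than just the tool name. Each audit entry is a JSON line, so traces can be searched later when debugging a failed task.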
Use cases
- Foundation pattern for nearly every modern LLM agent system
- Customer support agents that handle multi-step ticket resolution
- Autonomous research agents that combine search, reading, and synthesis
- Code generation agents that plan, execute, and verify their work
- Data analysis agents that query, analyze, and report findings
Examples in production
Princeton (2022)
Yao et al.'s 'ReAct: Synergizing Reasoning and Acting in Language Models' introduced the foundational pattern that defined modern LLM agent design.
LangChain / LangGraph
Both frameworks implement ReAct-pattern agents as core capabilities; LangGraph's modern agent design extends ReAct with explicit state management.
Anthropic Claude
Claude's tool use behavior follows ReAct-like patterns by default; the model naturally produces reasoning before action when given access to tools.
ReAct Pattern compared to alternatives
| Alternative | Choose ReAct Pattern when | Choose alternative when |
|---|---|---|
| Direct action (no reasoning): LLM emits actions without explicit reasoning steps | Use ReAct for any non-trivial multi-step task: task completion improves significantly | Direct action only for single-step tasks where reasoning would be wasteful |
| Tree of Thoughts: explore multiple reasoning paths in parallel | Use ReAct for sequential reasoning + acting: the production default | Use Tree of Thoughts when the task benefits from exploring alternatives (puzzles, optimization) |
Common pitfalls
- Skipping explicit reasoning to save tokens: this usually hurts task completion more than it saves in cost
- Designing vague tools: vague tools cause vague reasoning and worse decisions
- Not capturing reasoning traces in audit logs: losing the most valuable debugging artifact
- Allowing infinite reasoning loops: bound the maximum number of iterations per task
- Treating ReAct as a complete agent design: modern production agents add state management, error handling, and human-in-the-loop (HITL) review on top
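The vague-tools pitfall is easiest to see with a concrete tool definition. The sketch below pairs a JSON-Schema-style description with an implementation whose errors tell the model what to do next. Everything here is illustrative: `LOOKUP_ORDER_SPEC`, the `ORDERS` data, and the error wording are assumptions, and the exact schema format varies between providers and frameworks.

```python
from typing import Dict

# Hypothetical in-memory data standing in for an order system.
ORDERS: Dict[int, str] = {12345: "shipped, tracking #ABC123"}

# Tool description in a JSON-Schema-like shape (assumed; check your provider's format).
LOOKUP_ORDER_SPEC = {
    "name": "lookup_order",
    "description": (
        "Return the shipping status of a user's most recent order. "
        "Use this before answering any question about delivery or tracking."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "user_id": {
                "type": "integer",
                "description": "Numeric customer id, e.g. 12345.",
            }
        },
        "required": ["user_id"],
    },
}


def lookup_order(user_id) -> str:
    """Return a status string, or an error message the model can act on."""
    if not isinstance(user_id, int):
        return (f"error: user_id must be an integer, got {type(user_id).__name__}; "
                "ask the user for their numeric customer id")
    if user_id not in ORDERS:
        return (f"error: no orders found for user_id={user_id}; "
                "confirm the id with the user before retrying")
    return ORDERS[user_id]
```

Returning errors as observations, rather than raising, keeps the loop running and gives the next reasoning step something specific to work with.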
Questions about ReAct Pattern.
Literal 'Thought:' / 'Action:' / 'Observation:' tags are not strictly necessary. The original ReAct paper used those literal tags, but modern function calling and structured output make this implicit: the model naturally produces a reasoning chain followed by tool calls without needing the explicit prefix. Use explicit tags when working with older models or when you need strict format control; modern agent frameworks handle the structure for you.
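When explicit tags are used, the model's text output has to be parsed before a tool can run. A minimal sketch of that parsing step follows, assuming the action line looks like the earlier example, `Action: lookup_order(user_id=12345)`, with keyword arguments only. The `parse_action` helper is hypothetical; real model output is messier and needs more defensive handling.

```python
import ast
import re
from typing import Optional, Tuple

# Matches lines such as: Action: lookup_order(user_id=12345)
ACTION_RE = re.compile(r"^Action:\s*(\w+)\((.*)\)\s*$")


def parse_action(line: str) -> Optional[Tuple[str, dict]]:
    """Return (tool_name, keyword_args), or None if the line is not an action."""
    match = ACTION_RE.match(line.strip())
    if match is None:
        return None
    name, arg_text = match.groups()
    # Parse the argument list as a Python call, without evaluating any code.
    call = ast.parse(f"f({arg_text})", mode="eval").body
    # Only literal keyword arguments are accepted; literal_eval rejects the rest.
    args = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return name, args
```

Malformed argument text raises `SyntaxError` or `ValueError` here; in an agent loop those would typically be caught and returned to the model as an observation. Structured function calling removes this step entirely, which is one reason modern frameworks prefer it.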
ReAct uses more tokens per task than direct action because of the explicit reasoning. For high-volume workloads this is a real cost: typically 30-100% more tokens than direct action. The benefit (a much higher task completion rate) usually justifies the cost, but for cost-sensitive applications we sometimes use shorter reasoning ('briefly: I need to...') or hybrid patterns where simple cases skip reasoning.
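As a rough worked example of that 30-100% overhead, the helper below turns a direct-action token budget into a cost range. The token count, price, and task volume in the usage line are made-up illustration values, not measurements or real pricing.

```python
from typing import Tuple


def react_cost_range(direct_tokens_per_task: float,
                     price_per_1k_tokens: float,
                     tasks: int,
                     overhead: Tuple[float, float] = (0.30, 1.00)
                     ) -> Tuple[float, float, float]:
    """Return (direct cost, low ReAct cost, high ReAct cost) in currency units.

    overhead is the extra token fraction from explicit reasoning (30% to 100%).
    """
    base = direct_tokens_per_task / 1000 * price_per_1k_tokens * tasks
    low = base * (1 + overhead[0])
    high = base * (1 + overhead[1])
    return base, low, high


# Hypothetical workload: 2,000 tokens per task, $0.01 per 1k tokens, 100,000 tasks.
base, low, high = react_cost_range(2000, 0.01, 100_000)
print(f"direct: ${base:,.0f}  ReAct: ${low:,.0f} to ${high:,.0f}")
```

Under these assumed numbers a $2,000 direct-action bill becomes roughly $2,600 to $4,000 with ReAct, which is the figure to weigh against the gain in task completion.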
ReAct and chain-of-thought (CoT) prompting both involve explicit reasoning, but for different purposes. CoT is reasoning before producing a final answer (a single output). ReAct is reasoning before each action in an iterative loop where the model takes multiple actions interspersed with reasoning. CoT is for one-shot tasks; ReAct is for multi-step tasks involving tool use.