Reflexion Actor

Date: 06.14.2025

The Reflexion Actor is a cognitive AI architecture that enhances an agent's ability to learn from its mistakes through self-reflection. It combines a base reasoning model (the Actor) with a memory mechanism and a critic loop that evaluates performance after each task attempt.

Why Reflexion?

Most AI agents retry a failed task without changing how they reason about it. Reflexion addresses this by adding an explicit self-reflection step between attempts. It takes inspiration from human metacognition, the ability to think about one’s own thinking.

Architecture Overview

The Reflexion Actor consists of:

🔥Actor: The reasoning LLM that performs the task
🔥Memory: Stores task history, actions, and thoughts
🔥Critic: Evaluates each attempt and, when it fails, suggests improvements
🔥Reflexion Loop: Uses the critic's feedback to update the actor's strategy

Reflexion Flow

🔥The agent attempts a task using the Actor.
🔥It logs internal reasoning and outputs to Memory.
🔥The Critic reviews the attempt, highlighting mistakes.
🔥The Reflexion Loop feeds the Critic's feedback back to the Actor, which revises its approach.
🔥The agent re-attempts the task with an improved strategy.
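
The five steps above can be sketched as a small loop. This is a minimal illustration, not a reference implementation: the names `ActorFn`, `CriticFn`, and `runReflexion` are made up for this example, and the toy actor and critic stand in for real LLM calls.

```typescript
// Illustrative types only; none of these names come from a library.
interface Attempt {
  answer: string;
  reasoning: string;
}

interface Critique {
  passed: boolean;
  feedback: string;
}

// The Actor produces an attempt from the task plus earlier reflections.
type ActorFn = (task: string, reflections: string[]) => Attempt;
// The Critic reviews an attempt and reports what went wrong.
type CriticFn = (task: string, attempt: Attempt) => Critique;

interface ReflexionResult {
  attempt: Attempt;
  solved: boolean;
  trials: number;
  reflections: string[];
}

function runReflexion(
  task: string,
  actor: ActorFn,
  critic: CriticFn,
  maxTrials: number
): ReflexionResult {
  const attemptLog: Attempt[] = []; // Memory: raw attempts
  const reflections: string[] = []; // Memory: lessons carried forward
  let last: Attempt = { answer: "", reasoning: "" };

  for (let trial = 1; trial <= maxTrials; trial++) {
    // Step 1: attempt the task, conditioned on earlier reflections.
    last = actor(task, reflections);
    // Step 2: log reasoning and output to memory.
    attemptLog.push(last);
    // Step 3: the critic reviews the attempt.
    const critique = critic(task, last);
    if (critique.passed) {
      return { attempt: last, solved: true, trials: trial, reflections };
    }
    // Step 4: store the feedback so the next attempt can use it.
    reflections.push(critique.feedback);
    // Step 5: the loop continues with a re-attempt.
  }
  return { attempt: last, solved: false, trials: maxTrials, reflections };
}

// Toy stand-ins: the actor answers wrongly until it has seen one reflection.
const toyActor: ActorFn = (_task, reflections) =>
  reflections.length === 0
    ? { answer: "5", reasoning: "guessed" }
    : { answer: "4", reasoning: `revised after: ${reflections[0]}` };

const toyCritic: CriticFn = (_task, attempt) =>
  attempt.answer === "4"
    ? { passed: true, feedback: "" }
    : { passed: false, feedback: "The sum is off by one; recompute 2 + 2." };

const demoResult = runReflexion("What is 2 + 2?", toyActor, toyCritic, 3);
console.log(demoResult.solved, demoResult.trials); // prints: true 2
```

The retry limit (`maxTrials`) keeps the loop bounded when the critic never passes an attempt. That matters because each extra trial costs another Actor call and another Critic call.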

Comparison with Other Architectures

Here’s how Reflexion stacks up:

🔥ReAct: Encourages explicit reasoning but lacks self-evaluation
🔥AutoGPT: Can loop through tasks, but often lacks reflection and critique
🔥LangGraph Agents: Can implement Reflexion-style workflows with more control over flow logic

Applications

🔥Multi-step planning agents (coding, reasoning, research)
🔥Customer support bots that learn from failed responses
🔥AI game agents adapting over levels
🔥Scientific research agents revising strategies over iterations

Sample LangChain Setup

In LangChain, you could implement a Reflexion-style agent using an LLM chain, a memory buffer, and a custom evaluator/critic:


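import { AgentExecutor } from "langchain/agents";
import { BufferMemory } from "langchain/memory";

// baseAgent, searchTool, and codeTool are assumed to be defined elsewhere.
const executor = new AgentExecutor({
  agent: baseAgent,
  memory: new BufferMemory(),
  tools: [searchTool, codeTool],
  // AgentExecutor has no built-in reflection switch; the critic and
  // retry loop have to be written around the executor.
});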

// Loop until task is solved or retry limit is reached
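
One way to write that retry loop is sketched below. It makes several assumptions. `ExecutorLike` is a hypothetical interface standing in for whatever agent runner you use, so a real executor may need a thin adapter to match it. `Evaluator` is a hypothetical critic that returns `null` when the output is acceptable. The stubs at the bottom replace real LLM calls so the example runs on its own.

```typescript
// Hypothetical shape for an agent runner; not a LangChain type.
interface ExecutorLike {
  invoke(args: { input: string }): Promise<{ output: string }>;
}

// Hypothetical critic: resolves to null when the output is acceptable,
// otherwise to a short critique string.
type Evaluator = (output: string) => Promise<string | null>;

interface RetryResult {
  output: string;
  solved: boolean;
  attempts: number;
}

async function runWithReflection(
  executor: ExecutorLike,
  evaluate: Evaluator,
  task: string,
  maxRetries: number
): Promise<RetryResult> {
  const critiques: string[] = [];
  let output = "";

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    // Append earlier critiques to the task so the agent can revise its approach.
    const input =
      critiques.length === 0
        ? task
        : `${task}\n\nFeedback on previous attempts:\n- ${critiques.join("\n- ")}`;

    const result = await executor.invoke({ input });
    output = result.output;

    const critique = await evaluate(output);
    if (critique === null) {
      return { output, solved: true, attempts: attempt };
    }
    critiques.push(critique);
  }
  // Retry limit reached without an accepted output.
  return { output, solved: false, attempts: maxRetries };
}

// Stubs for demonstration: the executor only "finishes" once it sees feedback.
const stubExecutor: ExecutorLike = {
  async invoke({ input }) {
    return { output: input.includes("Feedback") ? "final answer" : "draft" };
  },
};

const stubEvaluator: Evaluator = async (output) =>
  output === "final answer" ? null : "Answer is only a draft; finish it.";

runWithReflection(stubExecutor, stubEvaluator, "Summarize the report", 3).then(
  (r) => console.log(r.solved, r.attempts, r.output)
);
```

Passing critiques back through the input string is the simplest option. A real setup might instead write them to the agent's memory, at the cost of tighter coupling to the framework.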

Challenges

🔥Maintaining concise but useful memory logs
🔥Ensuring critic feedback is accurate and actionable
🔥Avoiding overfitting to specific error types
🔥Increased latency due to reflection steps
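
For the first challenge, one simple option is to cap how many reflections are kept and to collapse repeats. The sketch below assumes that exact-match deduplication and a fixed capacity are good enough. `BoundedReflectionLog` is a name invented for this example, and a real system might summarize old entries instead of dropping them.

```typescript
// Keeps only the most recent distinct reflections, up to a fixed capacity.
class BoundedReflectionLog {
  private entries: string[] = [];
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  add(reflection: string): void {
    const normalized = reflection.trim();
    if (normalized.length === 0) return;
    // Remove an earlier identical entry so repeats move to the end
    // instead of piling up.
    this.entries = this.entries.filter((e) => e !== normalized);
    this.entries.push(normalized);
    // Evict the oldest entries once capacity is exceeded.
    if (this.entries.length > this.maxEntries) {
      this.entries = this.entries.slice(this.entries.length - this.maxEntries);
    }
  }

  // Renders the log as a numbered list suitable for a prompt.
  toPromptSection(): string {
    return this.entries.map((e, i) => `${i + 1}. ${e}`).join("\n");
  }

  get size(): number {
    return this.entries.length;
  }
}

const reflectionLog = new BoundedReflectionLog(2);
reflectionLog.add("Check units before answering.");
reflectionLog.add("Cite the source document.");
reflectionLog.add("Check units before answering."); // duplicate: moved to the end
reflectionLog.add("Keep the answer under 100 words.");
console.log(reflectionLog.toPromptSection());
// 1. Check units before answering.
// 2. Keep the answer under 100 words.
```

A small cap also limits the prompt growth that feeds the latency challenge, since fewer reflections are sent with each re-attempt.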