Anatomy of an AI agent

Introduction
This page names the parts most LLM agents share: what comes in (Sense), how decisions are made (Think / reason), what happens in the world (Act), what comes back (Observe), and how a run ends (Finish). On top of that you can add optional pieces — Evaluate, Memory, Description, Plan, chain-of-thought, Ask — and use RAG to ground answers in your data.
Sense
Sense is every input the agent receives: not only chat text, but also sensors, webhooks, queues, and other events.
| Category | Examples |
|---|---|
| Text | Keyboard input, chat messages, prompts, documents, search queries |
| Sensor | Camera, microphone, IoT devices, GPS, temperature, motion |
| API / events | Webhooks, HTTP requests, DB triggers, queue messages, calendar events, notifications |
Anything that gives the agent information flows through Sense. A minimal mental model: [User] → [Sense] → [Agent].
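One way to see why everything "flows through Sense" is to normalize heterogeneous inputs into one shape before they reach the reasoning step. A minimal sketch, assuming a hypothetical `Event` dataclass and `sense()` function (not from any particular framework):

```python
from dataclasses import dataclass

@dataclass
class Event:
    """One normalized input to the agent, whatever its source."""
    source: str   # "text", "sensor", or "api"
    payload: str

def sense(raw: dict) -> Event:
    # Map heterogeneous inputs (chat message, sensor reading, webhook body)
    # to one shape so the Think step only ever sees Events.
    if "message" in raw:
        return Event("text", raw["message"])
    if "reading" in raw:
        return Event("sensor", str(raw["reading"]))
    return Event("api", str(raw))

print(sense({"message": "book a train"}))
```

The point of the normalization is that Think never has to care whether the input was a keyboard, a webhook, or a thermometer.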
Think / reason
Think is how the agent chooses what to do next — rules, an LLM, or both.
- 🔥 Deterministic: if-then rules, decision trees, state machines — same input, same output; no LLM.
- 🔥 LLM-based: the model reasons (chain-of-thought, planning, ReAct-style thoughts). May be non-deterministic when temperature > 0.
- 🔥 Hybrid: deterministic routing or safety gates, LLM for open-ended reasoning (e.g. if the user asks X → call tool A; else let the LLM decide).
Act
Act is what the agent does with a decision: generate text, speech, or alerts; or invoke tools — read/write databases, call HTTP APIs, control systems.
Observe
Observe is what comes back after an action: tool output, HTTP response, query result. The agent can loop back to Think with that observation or move toward Finish.
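Act and Observe together form the inner loop: Think picks a tool, Act invokes it, the observation is appended to history, and Think runs again with that context. A toy sketch with a deterministic stand-in policy (a real agent would call an LLM in `think`):

```python
def think(history: list[str]) -> str:
    # Toy policy: search once, then finish. In practice an LLM decides here.
    return "finish" if any(h.startswith("obs:") for h in history) else "search"

def act(decision: str) -> str:
    # Invoke the chosen tool and return its raw output.
    tools = {"search": lambda: "Paris trains: 08:13, 09:13"}
    return tools[decision]()

history: list[str] = []
while True:
    decision = think(history)           # Think
    if decision == "finish":
        break                           # move toward Finish
    observation = act(decision)         # Act
    history.append("obs: " + observation)  # Observe feeds back into Think

print(history)
```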
Finish
Finish is when the agent stops the loop and returns a final result. It is not really another pluggable component like a tool or a memory store — it is the termination condition your runtime applies on top of the loop.
Typical reasons a run ends:
- 🔥 Task done: goal met, final answer or artifact returned.
- 🔥 Max steps: step or token budget hit — safety and cost caps.
- 🔥 Error: tool or environment failure that you treat as unrecoverable in this session.
- 🔥 User stop: cancel, timeout, or leaving the session.
- 🔥 Give up: the model decides it cannot complete the task and exits explicitly instead of looping forever.
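Because Finish is a condition the runtime applies on top of the loop, it is natural to implement it as a wrapper around the step function. A minimal sketch covering three of the stop reasons above (task done, max steps, error), with hypothetical names:

```python
MAX_STEPS = 5

def run(agent_step, max_steps: int = MAX_STEPS):
    """Apply termination conditions on top of the agent loop."""
    for step in range(max_steps):
        try:
            result = agent_step(step)
        except RuntimeError as err:
            return ("error", str(err))   # treated as unrecoverable this session
        if result is not None:
            return ("done", result)      # task done: final answer produced
    return ("max_steps", None)           # budget hit: safety and cost cap

# A step that never returns a final answer hits the step budget.
print(run(lambda step: None))  # ('max_steps', None)
```

User stop and explicit give-up fit the same shape: they are just two more branches that return a labeled terminal state instead of looping again.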
Evaluate
Evaluate is the umbrella for feedback that improves behavior over time. RLHF (reinforcement learning from human feedback) is one training-time example. Reflect is one runtime mechanism: the agent critiques its own attempt and retries.
| | Evaluate | Reflect |
|---|---|---|
| Scope | Broad feedback loop | Self-critique in the loop |
| When | Training-time (RLHF) or runtime | Runtime only |
| Who | Humans, environment, or agent | Agent (or a critic) |
Evaluate can take several forms:
- 🔥 Human feedback → reward model → fine-tuning
- 🔥 Self-evaluation: reflection, critic, Constitutional AI
- 🔥 Environment feedback: task success/failure, scores → RL-style rewards
- 🔥 Preference learning: DPO and similar methods on preferred vs non-preferred outputs
A minimal reflection trace: the agent looks at actions and observations, writes a short critique, then may start a new Reason → Act → Observe cycle. Example: Reason — search for a train time; Act — search({"q": "Paris"}); Observe — too many results, not specific enough.
For reflection-heavy runs, general-purpose models such as Claude 3.5 Sonnet and GPT-4o are common; reasoning-focused models (o1, o3, R1) suit harder critique; fine-tuning and RLHF require models that expose those training modes; lightweight critique can run on something like GPT-4o-mini.
Memory
Memory is how the agent stores and reuses information across turns or sessions — not the raw context window alone, but anything you persist and pull back into the next prompt or tool plan.
| Type | What it stores |
|---|---|
| Episodic | Past events, interactions, and tool results from this session (short-lived working state). |
| Long-term | Facts, preferences, and learnings you keep across sessions (profiles, playbooks, vector memory). |
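The two memory types in the table can be sketched as one small class: an episodic list that lives for the session, a long-term store that survives it, and a `recall_for_prompt` method that pulls both back into the next prompt. All names here are hypothetical, not a particular framework's API:

```python
class Memory:
    def __init__(self):
        self.episodic: list[str] = []        # this session's events and tool results
        self.long_term: dict[str, str] = {}  # facts kept across sessions

    def remember_event(self, event: str) -> None:
        self.episodic.append(event)

    def learn(self, key: str, value: str) -> None:
        self.long_term[key] = value

    def recall_for_prompt(self) -> str:
        # Persisted state is only useful if it gets back into the prompt.
        facts = "; ".join(f"{k}={v}" for k, v in self.long_term.items())
        recent = " | ".join(self.episodic[-3:])
        return f"facts: {facts}\nrecent: {recent}"

mem = Memory()
mem.learn("preferred_class", "second")
mem.remember_event("searched Paris trains")
print(mem.recall_for_prompt())
```

In production the long-term store is usually a database or vector index rather than a dict, but the shape is the same: persist, then inject back.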
Description
Description is when the agent emits user-visible text about what it is doing or did — status and transparency. Common in conversational patterns; many agents skip it when latency matters.
Plan
Plan sits inside Think / reasoning — it is a different way to reason: instead of picking the next tool immediately, the LLM first returns a short list of steps that decompose the task.
Example outline the model might emit:
- Search for London–Paris train duration.
- Compute cost at 50€/h.
- Return both answers to the user.
Execute comes next: for each planned step you run the usual Think → Act → Observe loop (ReAct-style). Plan-and-execute uses that outline as the map and the per-step loop as the engine. Often the outline is produced by one LLM call at the start; replanning can add more calls later if the task drifts.
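Plan-and-execute can be sketched as two functions: `plan` returns the outline (one up-front LLM call in practice), and `execute_step` runs each step's own loop (collapsed here to one canned result per step; all values are illustrative):

```python
def plan(task: str) -> list[str]:
    # One up-front LLM call in practice; a fixed outline here.
    return [
        "search London-Paris train duration",
        "compute cost at 50 EUR/h",
        "return both answers",
    ]

def execute_step(step: str) -> str:
    # Each step would run its own Think -> Act -> Observe loop (ReAct-style);
    # collapsed to one canned result per step in this sketch.
    results = {
        "search London-Paris train duration": "2.5 h",
        "compute cost at 50 EUR/h": f"{2.5 * 50:.0f} EUR",
        "return both answers": "2.5 h, 125 EUR",
    }
    return results[step]

outline = plan("train cost")
observations = [execute_step(s) for s in outline]
print(observations[-1])  # 2.5 h, 125 EUR
```

Replanning would re-invoke `plan` with the observations so far when a step fails or the task drifts.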
Chain of thought
Chain of thought (CoT) is also part of Think: the model writes intermediate reasoning in text before choosing a tool — e.g. “get duration first, then multiply by cost,” then search(...).
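In ReAct-style agents, the CoT text and the tool call typically arrive in one completion, so the runtime has to split them apart. A minimal parser, assuming a hypothetical `Thought:` / `Action:` output format (the exact labels vary by prompt template):

```python
def parse_react_output(text: str) -> dict[str, str]:
    # Split a CoT-style completion into its thought and the chosen action.
    out: dict[str, str] = {}
    for line in text.splitlines():
        if line.startswith("Thought:"):
            out["thought"] = line[len("Thought:"):].strip()
        elif line.startswith("Action:"):
            out["action"] = line[len("Action:"):].strip()
    return out

completion = (
    "Thought: get duration first, then multiply by cost\n"
    'Action: search({"q": "London Paris train duration"})'
)
print(parse_react_output(completion)["action"])
```

The thought can be logged or discarded; only the action is executed.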
Description, plan, and CoT compared
Description, plan, and chain of thought all show up as text, but they answer different questions. Description is aimed at people (or audit logs): “I will search the pricing API next.” Plan is a task outline the runtime can follow step by step. CoT is working reasoning inside one Think turn to pick the next action — often verbose, sometimes hidden from the user.
| | Primary audience | What it encodes |
|---|---|---|
| Description | User, operator, or log reader | What the agent is doing or did — transparency and status, not the full reasoning graph. |
| Plan | Executor (your loop / planner) | Ordered subgoals for the whole task — a map before or between execution phases. |
| Chain of thought | The model (and optionally you, if you expose it) | Intermediate rationale for the current decision — which tool, which argument, why. |
You can mix them: a plan for the journey, CoT at each stoplight, optional description at each leg so users see progress without reading every hidden thought. Skipping description saves latency; skipping CoT often hurts accuracy on hard tools; a bad plan wastes effort until you replan.
Ask
Ask is when the model outputs a question to the user; the user’s reply is appended to history and the run continues. Same idea as conversational ReAct: user-in-the-loop clarification.
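Mechanically, Ask is just another decision type the loop can return: the runtime surfaces the question, appends the reply to history, and re-enters Think. A sketch with a deterministic stand-in policy (a real agent would let the LLM decide when to ask):

```python
def think(history: list[str]) -> tuple[str, str]:
    # If the destination is still unknown, ask; otherwise finish.
    if not any(h.startswith("user:") and "Paris" in h for h in history):
        return ("ask", "Which city are you travelling to?")
    return ("finish", "Next Paris train: 08:13")

history = ["user: book me a train"]
kind, text = think(history)
if kind == "ask":
    history.append("agent: " + text)
    history.append("user: Paris")   # the reply is appended; the run continues
kind, text = think(history)
print(text)
```

In an interactive system the hard-coded `"user: Paris"` line would of course be a real reply awaited from the user.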
Grounding with RAG
Retrieval-augmented generation (RAG) does not replace Sense or Act by itself — it supplies grounded passages so the model’s answers and tool use can align with your documents instead of hallucinating facts. For more on wiring retrieval into agents, see RAG architectures.
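The retrieve-then-prompt shape of RAG can be sketched without any model at all: rank passages against the query, then inject the top hits into the prompt as context. This toy scorer uses keyword overlap purely for illustration; real systems use embeddings and a vector index:

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Toy keyword-overlap ranking; stand-in for embedding similarity search.
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    # Ground the model: answers should come from these passages, not memory.
    context = "\n".join("- " + p for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Refunds are processed within 14 days.",
    "Our office is in Berlin.",
    "Train tickets are refundable up to 24 hours before departure.",
]
query = "when are refunds processed"
print(build_prompt(query, retrieve(query, docs)))
```

The grounded prompt is then what the agent's Think step actually sees, which is how retrieval constrains both answers and tool use.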
Anatomy of AI Agents: Inside LLMs, RAG Systems, & Generative AI (video)
Reflection
Reflection in the narrow sense: a process where the agent or a separate critic reviews an output or plan and revises it before the next action or answer. A Reflexion-style actor pairs a reasoning actor with memory and a critic so failed attempts produce critiques and updated strategies across tries — see also Reflexion actor.
Conclusion
Agent frameworks differ by vendor, but the same skeleton shows up everywhere: bound inputs, a reasoning step, actions with observable results, and explicit stop conditions. Optional pieces — memory, evaluation, narration, planning, user questions, RAG — are where products diverge: latency, safety, and traceability trade off against each other. Naming these parts clearly makes it easier to design prompts, choose models, and debug runs when something loops too long or goes off the rails. For ReAct-style agent loop patterns (think–act–observe and extensions), see Agent loop patterns.