Core Concepts of an AI Agent#
1.1 What Is an AI Agent#
An AI Agent is the “hands and feet” of an AI Model — a layer of code sitting between the user and the Model, responsible for executing the things the Model wants to do but cannot (read files, run commands, browse the web, edit code…).
AI Model (also called LLM, Large Language Model) refers to AI models like Claude, ChatGPT, or Gemini that you can talk to in natural language. At its core it is a pure text generator — give it input, it produces output, no hands or feet. Throughout the following chapters we use the word Model to mean AI Model / LLM.
Here is an example. Suppose you want Claude to do this for you: “Read README.md and summarize it for me”. What happens if you just throw it at the Model directly?
sequenceDiagram
actor User
participant Model
User->>Model: "Read README.md and summarize it for me"
Model-->>User: "I cannot directly access your file system,<br/>please paste the content and I can summarize it..."The Model knows what you want, but it cannot do it — it can only output text. It cannot actively read files, run commands, browse the web, or edit code — those “real actions”. Without “hands and feet” to act on its behalf, the Model can only stop at “I can’t do it, please paste it for me”.
Once you add an AI Agent, someone is there to execute what the Model wants to do:
sequenceDiagram
actor User
participant Agent as AI Agent
participant Model
User->>Agent: "Read README.md and summarize it for me"
Agent->>Model: Send the conversation to the Model
Model-->>Agent: "tool_use: read_file"
Note over Agent: Execute read_file to read the file
Agent->>Model: Send tool_result back so the Model can keep reasoning
Model-->>Agent: "This project is..."
Agent-->>User: "This project is..."After we add an AI Agent, the Model still only outputs text — but this time what it writes is a tool_use marker that tells the AI Agent which tool to call. The AI Agent stands in the middle: when it sees tool_use, it actually reads the file, wraps the read result into a tool_result message and sends it back to the Model for further reasoning, and finally relays the Model’s summary to the User.
What are
tool_use/tool_result? They are two fields inside the Model API’s message structure — the Model never actually executes a tool; it just writes atool_usemarker in its reply telling the AI Agent which tool to call. After the AI Agent runs the tool, it wraps the result into atool_resultmessage and sends it back to the Model. You will see these two terms over and over in 1.3 / 1.5.
The “hands and feet” layer is internally composed of four things, which the next four sections unpack in order:
- 1.2 Executor — the body of the AI Agent, the code that actually performs actions
- 1.3 Message and Role — the format the Model and Executor use to communicate
- 1.4 Memory — the agent’s memory = a
messageslist, the whole history is resent every turn - 1.5 Agent Loop — why a single task requires the Model and Executor to bounce back and forth
1.2 Executor#
The diagram in 1.1 treats the AI Agent as a black box. If we open it up, the core component of the AI Agent is called the Executor — in later chapters we use “Executor” to refer to the layer of code inside the AI Agent that “actually does the work”. (Other names you will see in the industry: harness, scaffolding, agent runner — they all mean the same thing.)
The Model only does one thing: look at the conversation history → output the next piece of text — including a tool_use marker like “I want to call read_file”, which is also just text. The Executor’s job is to translate that marker into a real execution (read a file, run a shell command, call an API), wrap the result into a tool_result message, and send it back to the Model for continued reasoning.
sequenceDiagram
actor User
box AI Agent
participant Executor
end
participant Model
User->>Executor: "Read the README for me"
Executor->>Model: Send the conversation to the Model
Model-->>Executor: "tool_use: read_file"
Note over Executor: Execute read_file to read the file
Executor->>Model: Send tool_result back to the Model
Model-->>Executor: "This project is..."
Executor-->>User: "This project is..."| Role | Location | Responsibility |
|---|---|---|
| User | External | Submits requests and reads results. The start and end of the whole flow |
| Executor | Inside the AI Agent | Receives the user’s message → bounces back and forth with the Model → performs real actions (file system, network, shell) → returns the result to the user |
| Model | External service (API) | Decides what to do — answer directly? Or call a tool first? Which tool? With what arguments? |
The single most important sentence: the model executes no tools; it only writes tool_use markers in its reply text. What actually reads files, runs shell commands, and calls APIs is the executor. This separation of duties is the root of every design that follows.
The Model does not know on its own which tools the executor has — it is the executor that proactively sends a “tool list” along with the conversation history to the Model. What the tool list looks like, how it is sent, and how to implement it are discussed in detail in CH02.
1.3 Message and Role#
The Executor and the Model converse through Messages.
Every Message carries at least two fields:
role— who is speaking (user / assistant / system)content— what was said (text, a tool_use request, a tool_result…)
| |
Only Three Roles#
In the Anthropic protocol, role only has three values:
| Role | Written by | Content |
|---|---|---|
user | The real user ⊕ the executor | Questions; or tool execution results |
assistant | The model | Text replies ⊕ tool call requests |
system | You (the system prompt) | Behavior configuration for the model |
Note: There is no
toolrole. Tool execution results also use theuserrole.
The benefit of this design is that the message protocol is extremely simple: it always alternates user / assistant, and the executor is just the “ghostwriter for user messages”.
user "Read the README for me" ← real user
assistant [I want to call read_file] ← model decision
user [tool_result: "..."] ← executor uses the user role to return tool_result
assistant "This project is..." ← model summary1.4 Memory#
An agent’s memory = Message history = a Messages list.
The model itself has no memory — because it does not store the Messages from any past conversation.
This is different from the ChatGPT web UI that “remembers what you asked” — the web front end already stuffs every past Message back into the model for you. Claude / GPT / any Model API by itself has no memory.
To make the model “remember” things, the executor appends every Message in the conversation (the user’s questions, tool_use, tool_result, etc.) into its own recorded messages list, and every turn sends the entire messages list to the model — not just the latest message. Only when the model sees the complete Message history does it know what to do next.
| |
The cost: the longer the conversation, the more tokens are sent each turn — more expensive and slower. Coming up, CH05 / CH06 / CH07 will expand this basic memory layer into a three-tier short / medium / long term memory system to address this problem.
1.5 Agent Loop#
The diagram in 1.2 only demonstrated a single round of tool calling (read one file and then summarize). Real tasks usually require chaining several tool calls in a row:
- “List the Python files under src/ and what each one does” → first
ls src/→ then runread_fileon every.py - “Find which piece of code handles auth” → first
grep "auth"→ thenread_fileon the matching files - “Fix this bug” →
read_fileto locate the issue →write_fileto edit →run_shellto run the tests for verification
After each tool runs, the Model has to look at the result before deciding the next step. So the Executor must bounce back and forth between the Model and “real tool execution” until the Model returns end_turn:
%%{init: {'sequence': {'noteAlign': 'left'}}}%%
sequenceDiagram
actor User
box AI Agent
participant Executor
end
participant Model
User->>Executor: (1) Submit a request
loop Agent loop
Executor->>Model: (2) Send the messages list
alt stop_reason = tool_use
Model-->>Executor: (3) Wants to call some tool
Note over Executor: (4) Execute the tool<br/>(5) Append tool_result to the messages list, go back to (2)
else stop_reason = end_turn
Model-->>Executor: (3) Final text answer
Note over Executor: (4) Exit [Agent loop]
end
end
Executor-->>User: (5) Relay the final answer to the userThe two branches in the diagram above come from the stop_reason field in the Model’s response — a “status flag” the Model attaches to every reply, telling the executor whether the current turn is “still wants to call a tool” or “already has a final answer”. The Executor looks at stop_reason to decide which path to take next:
- The Executor sends the entire accumulated conversation history to the Model (as 1.4 explained: every turn resends the whole
messageslist) stop_reason = tool_use(wants to call a tool) → the executor runs the tool the Model specified → wraps the result into atool_resultand appends it to the conversation → goes back to step 1stop_reason = end_turn(already has a final answer) → the executor extracts the Model’s text answer and returns it to the user → the loop ends
Without this loop, an AI Agent could only “call one tool and then stop” and could not handle multi-step tasks. There is only one exit condition: stop_reason = end_turn, meaning the Model believes it has enough information to answer the user.
The Full List of stop_reason Values#
Besides the tool_use and end_turn used by the loop above, stop_reason actually has other values:
| stop_reason | Meaning | What to do |
|---|---|---|
end_turn | The model is done speaking | Extract the text and return |
tool_use | The model wants to call a tool | Execute + append the result + continue the loop |
max_tokens | The output got truncated | Raise max_tokens and retry |
stop_sequence | Hit a custom stop string | Handle as the situation requires |
max_turns: Avoiding an Infinite Conversation#
Because the Executor keeps bouncing back and forth with the Model until end_turn, this loop will run forever if it never converges. Common runaway scenarios:
- The Model gets stuck in a “call a tool → look at the result → call the same tool again” dead loop
- The tool keeps returning errors, and the Model keeps trying new parameters but none succeed
To prevent infinite back-and-forth, the Agent Loop must add a maximum turn count limit:
| |
Practical reference values:
- Ordinary Q&A tasks: 10 is enough
- Multi-step tasks: 30
- Agentic coding (read many files, edit many places): 100+ is common
Recap#
By this point you should understand:
- AI Agent = Model + Executor + Loop — three parts make up the minimal architecture, each with its own role
- The Executor is the hands and feet — everything the Model wants done relies on the Executor to execute, including tool calls and state management
- Message is the protocol — the Model and Executor communicate through a messages list, with three roles user / assistant / system, each with its own purpose
- The Model has no memory — “remembering” is the illusion created by accumulating the messages list and resending it every turn
- The Agent Loop is flow-controlled by
stop_reason—tool_usecontinues,end_turnwraps up;max_turnsguards against infinite loops
The next chapter CH02 Providing Tools puts this chapter’s “Tool calling” architecture into practice — writing the tool definitions and implementations for the trio (run_shell / read_file / write_file) so the agent can really get to work.