Providing Tools#

CH01 explained that the Executor is the AI Agent’s “hands and feet” — it carries out what the Model wants to do. But before the Executor can execute anything, it has one more thing to do first: tell the Model what tools (capabilities) it has and what it can do. Otherwise the Model has no idea what tools it can use to finish the task.

So the Executor must hand the Model a “tool list” — a JSON spec that lists each tool’s name, when to use it, and what parameters it needs. Once the Model has read the list, it knows which tool it can call in the current conversation and what arguments to pass.

This chapter writes out the tool definitions and Python implementations for the three-piece set (run_shell / read_file / write_file), producing tools that can actually do work:

ToolCapabilityReplaces
run_shellRun any shell commandgrep, find, git, curl, compile, test… no need to write each one separately
read_fileRead file contentsGives the model “eyes” for any text file
write_fileOverwrite a fileGives the model “hands” to edit code or write config files

This three-piece set is surprisingly powerful — because run_shell alone covers most of the “doing things” capability.
First we’ll look at the workflow between Executor and Model around the tool list, then walk through tool definitions (2.2) and implementations (2.3) separately.


2.1 The Four-Step Tool Interaction Flow#

No matter which tool, the interaction between executor and Model always follows the same four steps:

%%{init: {'sequence': {'noteAlign': 'left'}}}%%
sequenceDiagram
    box AI Agent
        participant Executor
    end
    participant Model
    Note over Executor: (1) Send the tool list<br/>along with the messages list to the Model
    Executor->>Model: messages list + tool list
    Note over Model: (2) Read description / input_schema<br/>and decide which tool to call
    Model-->>Executor: tool_use: read_file({"path": "..."})
    Note over Executor: (3) Run the corresponding<br/>Python implementation to get the result
    Note over Executor: (4) Append tool_result to the messages list

For clarity, this diagram only shows one round of interaction and omits the outer Agent Loop. In practice, after (4) it loops back to (1) for the next round, until the Model returns end_turn.

Every tool has two parts corresponding to the four steps above:

  • A “tool list” for the Model to read — made up of multiple tool definitions (2.2, corresponds to (1)(2))
  • An “implementation” for the Executor to run — the actual Python function (2.3, corresponds to (3)(4))

The two have to line up: the tool name / parameters in the definition must match the function name / signature in the implementation, otherwise the call the Model produces won’t match and things will break.


2.2 Providing the Tool List to the Model#

What does each tool definition in the tool list look like? The simplest example (a single tool):

1
2
3
4
5
6
7
8
9
{
    "name":         "read_file",                                 # "tool name"
    "description":  "Read the entire contents of a text file.",  # "when should I be used"
    "input_schema": {                                            # "what parameters do I need"
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

description is the most critical field — the Model picks tools almost entirely based on this. Write it vaguely and it will guess wrong; write it precisely and it will call the right tool at the right time.

Write one for each of the three-piece set and combine them into the full tool list:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
TOOLS = [                            # <- tool list (the whole list)
    {                                # <- one tool definition
        "name": "run_shell",
        "description": "Execute a shell command and return stdout/stderr/returncode.",
        "input_schema": {            # <- describe parameters with JSON Schema
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "The shell command to run."},
            },
            "required": ["command"],
        },
    },
    {
        "name": "read_file",
        "description": "Read the entire contents of a text file.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Absolute or relative file path."},
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Overwrite a file with the given content. Creates the file if it doesn't exist.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path":    {"type": "string"},
                "content": {"type": "string"},
            },
            "required": ["path", "content"],
        },
    },
]

Pass this list in every time you call the Model API:

1
2
3
4
5
6
response = client.messages.create(
    model="claude-...",
    messages=messages,
    tools=TOOLS,           # <- tool list goes here
    ...
)

Once the Model sees TOOLS, it knows which tools it can call in this conversation. As mentioned at the start of the section, description is the most critical field; input_schema describes parameters in JSON Schema format — properties lists each field’s type, and required marks the required fields. The Model generates parameter JSON following this format.

Up to this point, what the Model sees is just a JSON spec — no Python function, and no “execution” has happened yet. It only knows these options exist for it to call.


2.3 Tool Implementation: How the Executor Actually Runs Things#

What 2.2 showed the Model was only a JSON spec — no code gets executed. Corresponding to steps (3)(4) of the four-step flow: after the Model returns tool_use, the executor has to extract name and input, find the matching Python implementation, run it to get a result, and append tool_result to the messages list.

The Model writes in its reply message:

tool_use: read_file({"path": "README.md"})

The Executor takes the tool_use (read_file), matches if name == "read_file", and runs the corresponding Python:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
def execute_native_tool(name: str, args: dict) -> str:
    if name == "run_shell":
        proc = subprocess.run(args["command"], shell=True, ...)
        return json.dumps({...})

    if name == "read_file":
        return Path(args["path"]).read_text(encoding="utf-8")

    if name == "write_file":
        Path(args["path"]).write_text(args["content"], encoding="utf-8")
        return f"Wrote {len(args['content'])} chars to {args['path']}"

    return f"ERROR: unknown native tool {name}"

Each if name == "..." corresponds to one tool. The final fallback return f"ERROR: unknown native tool {name}" handles “the Model called a tool name that doesn’t exist” — don’t raise; following Principle 2, return the error as data and let the Model handle it.

The tool string in name == "..." must match the name in the tool definition exactly — that’s the correspondence between definition and implementation. After execution, wrap the returned string as tool_result and append it to the messages list to reply to the Model.

Below are the three principles for the implementation. Violate any of them and the agent will break in some scenario.

Principle 1: Always Return a String#

API spec: a tool’s return value must be a string. Objects, dicts, and numbers must be serialized first.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
def read_file(path: str) -> str:
    return Path(path).read_text()           # already str, ok

def run_shell(command: str) -> str:
    result = subprocess.run(command, ...)
    return json.dumps({                     # object -> str
        "stdout": result.stdout,
        "stderr": result.stderr,
        "returncode": result.returncode,
    })

run_shell naturally has three fields — stdout / stderr / returncode — so we use json.dumps to wrap them into a single JSON string. The Model can parse JSON and understands the difference between the three fields.

Principle 2: Return Errors as Data, Don’t Raise#

Bad:   read fails -> raise FileNotFoundError -> the whole agent breaks
Good:  read fails -> return "ERROR reading X: file not found"
                  -> the Model can decide to retry / try a different path / give up

Treat errors as just another kind of normal output, and the Model can “understand” the failure and adapt. The Agent can then self-correct — trial and error is its mode of operation.

1
2
3
4
5
6
7
8
9
def read_file(path: str) -> str:
    try:
        return Path(path).read_text()
    except FileNotFoundError:
        return f"ERROR: file not found: {path}"
    except PermissionError:
        return f"ERROR: permission denied: {path}"
    except Exception as e:
        return f"ERROR: {type(e).__name__}: {e}"

This principle runs through the entire minimal-agent — when we later add user rejection (the safety gate), the user’s rejection is wrapped as tool_result and returned to the Model in exactly the same way, an extension of this same principle.

Principle 3: Anything That Can Block Needs a Timeout#

If run_shell gets stuck mid-run (waiting on stdin, infinite loop, network hang), it freezes the whole agent. Every tool that can block must have a timeout, and a timeout should be returned to the model as a failure so it can decide what to do next:

1
2
3
4
5
6
7
8
9
def run_shell(command: str, timeout: int = 30) -> str:
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True,
            text=True, timeout=timeout,
        )
        return result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        return f"ERROR: command timed out after {timeout}s"

2.4 Try It Out: The Model Chains the Steps Itself#

With the tool list and implementations ready, run a request that needs multiple steps:

     you: List the current directory, find README, read it, and tell me what this project does
   model: I'll start with run_shell("ls")        <- first step it decides on its own
executor: ["minimal_agent.py", "README.md", ...]
   model: I see README.md, read_file("README.md")
executor: "# Minimal Agent\n\nA tiny..."
   model: end_turn — "This project is a minimal agent, with the feature ..."

All you wrote was three tool definitions + three Python functions. Chaining the steps is 100% the Model’s job — it reads description to know what each tool does and reads each round’s tool_result to decide the next step.


Recap#

By this point you should understand:

  • A Tool has two sides — a tool definition (JSON) for the Model to read, and a Python implementation for the Executor to run. They line up via the tool name / parameters.
  • description is the most critical field — the Model picks tools almost entirely based on this. Write it precisely and it will call the right tool at the right time.
  • The three-piece set covers most “doing things” capability — because run_shell itself covers a huge range (grep / find / git / curl / running tests…).
  • Three implementation principles — always return a string, return errors as data instead of raising, and add a timeout to anything that can block.
  • The Model chains the flow itself — you write the tool definitions + functions, and the Model uses description + tool_result to decide the next step. You don’t have to write flow control.

The next chapter, CH03 MCP Tool Integration, solves the “once you have too many tools, maintenance becomes unmanageable” problem — bringing in tools written by the external community through a standard protocol, coexisting with native tools.


References#