提供工具#

CH01 講過，Executor 是 AI Agent 的「手腳」 — 負責執行 Model 想做的事。但 Executor 在執行之前還得先做一件事：告訴 Model 它有哪些工具（能力）、可以做到哪些事。否則 Model 連自己能使用什麼工具來完成任務都不知道。

因此 Executor 必須提供一份 「工具清單」 給 Model — 一份 JSON 規格，列出每個工具的名稱、什麼時候該用、需要什麼參數。 Model 讀過清單後，才知道在當前對話可以叫哪個工具、要傳什麼參數。

這一章寫出三件套（run_shell / read_file / write_file）的工具定義跟 Python 實作，做出能動手做事的工具：

工具	能力	取代什麼
`run_shell`	跑任意 shell 命令	grep、find、git、curl、編譯、測試⋯⋯你不用個別寫
`read_file`	讀檔內容	給模型「眼睛」看任何文字檔
`write_file`	覆寫檔案	給模型「手」改 code、寫設定檔

這三件套的功能驚人 — 因為 run_shell 本身就涵蓋了一大半「做事」的能力。
下面先看 Executor / Model 之間圍繞工具清單的互動 workflow，再分別走過工具定義（2.2）跟實作（2.3）。

2.1 Tool 互動的四步流程#

不管哪個工具，executor 跟 Model 之間的互動都跑同一個四步：

%%{init: {'sequence': {'noteAlign': 'left'}}}%%
sequenceDiagram
    box AI Agent
        participant Executor
    end
    participant Model
    Note over Executor: ① 把工具清單<br/>跟 Messages list 一起送給 Model
    Executor->>Model: messages list + 工具清單
    Note over Model: ② 看完 description / input_schema<br/>決定該叫哪個工具
    Model-->>Executor: tool_use: read_file({"path": "..."})
    Note over Executor: ③ 跑該工具對應的<br/>Python 實作得到結果
    Note over Executor: ④ 把 tool_result 加到 Messages list

為了方便說明，這張圖只畫一輪互動，省略了外層的 Agent Loop。實務上 ④ 之後會回到 ① 繼續下一輪，直到 Model 回 end_turn。

每個工具都有 兩個部分 對應到上面這四步：

一份給 Model 看的「工具清單」 — 多份工具定義組成（2.2，對應 ①②）
一份給 Executor 跑的「實作」 — 真正的 Python 函數（2.3，對應 ③④）

兩者要對得起來：工具定義裡的工具名 / 參數要能對到實作的函數名 / 簽名，否則 Model 給的呼叫對不上、會出錯。

2.2 提供工具清單給 Model#

工具清單裡每一份工具定義是怎麼長的？最簡單的例子（單一工具）：

1
2
3
4
5
6
7
8
9
{
    "name":         "read_file",                                 # 「工具名稱」
    "description":  "Read the entire contents of a text file.",  # 「什麼時候該用我」
    "input_schema": {                                            # 「我需要什麼參數」
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

description 是最關鍵的一欄 — Model 挑工具的判準幾乎完全來自這裡。寫得模糊它會猜錯，寫得精準它會在對的時機叫對的工具。

把三件套各寫一份合在一起，就是完整的工具清單：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
TOOLS = [                            # ← 工具清單（整份 list）
    {                                # ← 一份工具定義
        "name": "run_shell",
        "description": "Execute a shell command and return stdout/stderr/returncode.",
        "input_schema": {            # ← 用 JSON Schema 規格描述參數
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "The shell command to run."},
            },
            "required": ["command"],
        },
    },
    {
        "name": "read_file",
        "description": "Read the entire contents of a text file.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Absolute or relative file path."},
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Overwrite a file with the given content. Creates the file if it doesn't exist.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path":    {"type": "string"},
                "content": {"type": "string"},
            },
            "required": ["path", "content"],
        },
    },
]

每次呼叫 Model API 把這份清單傳進去：

1
2
3
4
5
6
response = client.messages.create(
    model="claude-...",
    messages=messages,
    tools=TOOLS,           # ← 工具清單在這
    ...
)

Model 看到 TOOLS 之後，就知道這次對話有哪些工具可叫。其中 description 章節開頭已經講過是最關鍵的一欄；input_schema 則用 JSON Schema 格式描述參數 — properties 列出每個欄位的型別、required 標出必填欄位。 Model 會照著這個格式產生參數 JSON。

到這裡為止，Model 看到的只是一份 JSON 規格 — 沒有 Python 函數，也還沒有任何「執行」發生，它只是知道有這些選項可以叫。

2.3 工具實作：Executor 怎麼真的去執行#

2.2 給 Model 看的只是 JSON 規格 — 沒有任何程式會被執行。對應到四步流程的第 ③④ 步：Model 回了 tool_use 之後，executor 要把 name 跟 input 拆出來，找到對應的 Python 實作跑出結果，再把 tool_result 加到 messages list。

Model 在回覆的 message 寫：

tool_use: read_file({"path": "README.md"})

Executor 拿到 tool_use（read_file）比對 if name == "read_file"，跑對應的那段 Python：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
def execute_native_tool(name: str, args: dict) -> str:
    if name == "run_shell":
        proc = subprocess.run(args["command"], shell=True, ...)
        return json.dumps({...})

    if name == "read_file":
        return Path(args["path"]).read_text(encoding="utf-8")

    if name == "write_file":
        Path(args["path"]).write_text(args["content"], encoding="utf-8")
        return f"Wrote {len(args['content'])} chars to {args['path']}"

    return f"ERROR: unknown native tool {name}"

每個 if name == "..." 對應一個工具，最後一條保底 return f"ERROR: unknown native tool {name}" 是給「Model 叫了一個不存在的工具名」用的 — 不要 raise，依照原則 2 把錯誤當資料回給 Model 自己處理。

name == "..." 的工具字串必須跟工具定義裡的 name 一字不差，這就是定義 → 實作之間的對應。執行完之後把 return 的字串包成 tool_result 加到 messages list 回覆給 Model。

下面是實作的三個原則，違反任何一條，agent 都會在某些情境下中斷。

原則 1：回傳一律是 string#

API 規範：tool 的回傳值必須是字串。物件、字典、數字都要先序列化。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
def read_file(path: str) -> str:
    return Path(path).read_text()           # 已經是 str ✓

def run_shell(command: str) -> str:
    result = subprocess.run(command, ...)
    return json.dumps({                     # 物件 → str
        "stdout": result.stdout,
        "stderr": result.stderr,
        "returncode": result.returncode,
    })

run_shell 天生就有 stdout / stderr / returncode 三個欄位，這時候用 json.dumps 包成一個 JSON 字串回去 — Model 有能力解析 JSON，看得懂三個欄位的差別。

原則 2：錯誤當資料回，不要 raise#

壞：     讀檔失敗 → raise FileNotFoundError → 整個 agent 中斷
好：     讀檔失敗 → return "ERROR reading X: file not found"
                  → Model 看到後可以決定 retry / 換路徑 / 放棄

把 error 當成 normal output 的一種，Model 就能「理解」失敗並適應。Agent 因此能自我修正 — 試錯是它的工作模式。

1
2
3
4
5
6
7
8
9
def read_file(path: str) -> str:
    try:
        return Path(path).read_text()
    except FileNotFoundError:
        return f"ERROR: file not found: {path}"
    except PermissionError:
        return f"ERROR: permission denied: {path}"
    except Exception as e:
        return f"ERROR: {type(e).__name__}: {e}"

這個原則貫穿整個 minimal-agent — 後續加 user 拒絕（safety gate）時，user 的拒絕也會以同樣的方式包成 tool_result 回給 Model，是同一個原則的延伸。

原則 3：能阻塞的都要 timeout#

run_shell 跑到一半卡住（等 stdin、無窮迴圈、卡網路）就把整個 agent 凍結。所有可能阻塞的工具都必須有 timeout，超時當失敗回給模型，讓它決定下一步：

1
2
3
4
5
6
7
8
9
def run_shell(command: str, timeout: int = 30) -> str:
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True,
            text=True, timeout=timeout,
        )
        return result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        return f"ERROR: command timed out after {timeout}s"

2.4 試一下：模型自己串流程#

工具清單和實作都備好了，跑一個需要多步驟的請求：

     you: 列出當前資料夾，找出 README，讀內容後告訴我這專案做什麼
   model: 我先 run_shell("ls")        ← 自己決定的第一步
executor: ["minimal_agent.py", "README.md", ...]
   model: 看到 README.md 了，read_file("README.md")
executor: "# Minimal Agent\n\nA tiny..."
   model: end_turn — 「這個專案是 minimal agent，特色是 ...」

你寫的只有三份工具定義 + 三個 Python 函數。串流程的工作 100% 是 Model 自己完成 — 它看 description 知道每個工具能幹嘛，看每一輪的 tool_result 決定下一步。

階段檢查點#

到這裡你應該理解：

Tool 有兩面 — 一份給 Model 看的工具定義（JSON）、一份給 Executor 跑的 Python 實作，兩邊靠工具名 / 參數對得起來
description 是最關鍵的欄位 — Model 挑工具的判準幾乎完全來自這裡，寫得精準它就在對的時機叫對的工具
三件套涵蓋大半「做事」能力 — 因為 run_shell 本身就包山包海（grep / find / git / curl / 跑 test⋯⋯）
實作三原則 — 回傳一律 string、錯誤當資料回不要 raise、能阻塞的都要 timeout
流程是 Model 自己串的 — 你寫工具定義 + 函數，Model 看 description + tool_result 決定下一步，你不用寫流程控制

下一章 CH03 MCP 工具整合解決「工具一多就維護不過來」的問題 — 把外部社群寫好的工具透過標準協議接進來、跟原生工具並存。

參考資源#

Anthropic Tool Use 文件
完整程式碼：github.com/codereindeer-dev/minimal-agent