cheat sheet

Claude Tool Use (Function Calling)

Define tools, handle tool calls, run agentic loops, use parallel tools, and manage errors with the Claude API.

updated 05-25-2026

What it is

Tool use (function calling) lets Claude request the execution of developer-defined functions during a conversation. You define each tool with a name, a description, and a JSON Schema for its parameters. When Claude decides to invoke a tool it emits a tool_use content block; your code runs the function and returns a tool_result, and Claude continues reasoning with that result to produce its final response. This pattern underpins agentic applications in which Claude must interact with databases, APIs, filesystems, or other external systems to answer a question or complete a task.

Define a tool

Each tool is a dict with a name, a description, and an input_schema (a JSON Schema object describing the input parameters).

python

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location. Call this when the user asks about weather.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country, e.g. 'Toronto, Canada'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit. Default celsius."
                }
            },
            "required": ["location"]
        }
    }
]

Write the description from Claude's perspective: explain when to call the tool, not just what it does. Claude uses descriptions to decide whether to call the tool at all.

First API call

python

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Toronto?"}]
)

print(response.stop_reason)   # "tool_use"
print(response.content)       # list of TextBlock and/or ToolUseBlock

Output:

text

tool_use
[ToolUseBlock(id='toolu_01XVn...', input={'location': 'Toronto, Canada'}, name='get_weather', type='tool_use')]

Handle the tool call

python

import json

def handle_tool_call(name: str, inputs: dict) -> str:
    if name == "get_weather":
        location = inputs["location"]
        unit = inputs.get("unit", "celsius")
        # Call your real weather API here
        return json.dumps({"temp": 12, "condition": "cloudy", "unit": unit})
    raise ValueError(f"Unknown tool: {name}")

tool_use = next(b for b in response.content if b.type == "tool_use")
result = handle_tool_call(tool_use.name, tool_use.input)

Continue the conversation

Append the assistant's response and the tool result, then call again to get the final answer.

python

messages = [
    {"role": "user", "content": "What's the weather in Toronto?"},
    {"role": "assistant", "content": response.content},    # include full content list
    {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": result          # string or list of content blocks
            }
        ]
    }
]

final = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=messages
)
print(final.content[0].text)

Output:

text

The current weather in Toronto, Canada is 12°C and cloudy.

Full agentic loop

The loop below keeps calling the API until Claude returns stop_reason == "end_turn". Each iteration dispatches any tool_use blocks to your handler, appends the results as a single user turn of tool_result blocks, and calls the API again. The loop is bounded by max_turns so that it cannot run indefinitely.

python

def run_agent(user_message: str, tools: list, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": user_message}]

    for turn in range(max_turns):
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason == "end_turn":
            text = [b.text for b in response.content if b.type == "text"]
            return text[-1] if text else ""

        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type != "tool_use":
                    continue
                try:
                    result_content = handle_tool_call(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result_content,
                    })
                except Exception as exc:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"Error: {exc}",
                        "is_error": True,    # tells Claude the tool failed
                    })
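            messages.append({"role": "user", "content": tool_results})
        else:
            # Any other stop_reason (e.g. "max_tokens"): stop instead of calling again
            # with an assistant turn at the end of the transcript.
            return f"Stopped early: {response.stop_reason}"

    return "Max turns reached"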

Always set a max_turns ceiling. Without one, a bug in your tool handler or an unexpected Claude response can loop indefinitely. 10 is a safe default for most tasks; complex agentic pipelines may need 20–50.

Parallel tool use

Claude may call multiple tools in a single response. Handle all ToolUseBlock items in the content list.

python

import anthropic

tools = [
    {
        "name": "get_stock_price",
        "description": "Get current stock price for a ticker symbol.",
        "input_schema": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"]
        }
    },
    {
        "name": "get_company_news",
        "description": "Get recent news headlines for a company.",
        "input_schema": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"]
        }
    }
]

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's AAPL's price and latest news?"}]
)

# Claude may return TWO tool_use blocks in one response
tool_calls = [b for b in response.content if b.type == "tool_use"]
print(f"Tool calls requested: {len(tool_calls)}")

# Handle all of them and return all results in one user turn.
# (handle_tool_call needs a branch for each tool defined above.)
tool_results = []
for tc in tool_calls:
    result = handle_tool_call(tc.name, tc.input)
    tool_results.append({
        "type": "tool_result",
        "tool_use_id": tc.id,
        "content": result,
    })

Output:

text

Tool calls requested: 2
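
Every tool_use block in an assistant turn needs a matching tool_result in the next user turn. The following is a minimal sketch of a pre-send check for that. The helper name unanswered_tool_calls is hypothetical, and SimpleNamespace objects stand in for the SDK's content blocks so the example runs without an API call.

```python
from types import SimpleNamespace

def unanswered_tool_calls(assistant_content, tool_results):
    """Return the ids of tool_use blocks that have no matching tool_result."""
    requested = [b.id for b in assistant_content if b.type == "tool_use"]
    answered = {r["tool_use_id"] for r in tool_results}
    return [tool_id for tool_id in requested if tool_id not in answered]

# Stand-ins for response.content; real blocks come from the SDK.
content = [
    SimpleNamespace(type="text", text="Checking both."),
    SimpleNamespace(type="tool_use", id="toolu_a", name="get_stock_price", input={"ticker": "AAPL"}),
    SimpleNamespace(type="tool_use", id="toolu_b", name="get_company_news", input={"ticker": "AAPL"}),
]
results = [{"type": "tool_result", "tool_use_id": "toolu_a", "content": "189.30"}]

print(unanswered_tool_calls(content, results))  # ['toolu_b']
```

An empty list means the user turn is complete and can be appended to messages.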

Disable parallel tool use

python

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "auto", "disable_parallel_tool_use": True},
    messages=messages
)

Error handling with is_error

When a tool call fails, return is_error: true instead of raising an exception. Claude will acknowledge the failure and decide whether to retry or respond differently.

python

def safe_tool_call(name: str, inputs: dict) -> dict:
    try:
        content = handle_tool_call(name, inputs)
        return {"content": content}
    except TimeoutError:
        return {"content": "Tool timed out after 10s.", "is_error": True}
    except Exception as exc:
        return {"content": f"Tool error: {type(exc).__name__}: {exc}", "is_error": True}

# Then in your loop:
for block in response.content:
    if block.type == "tool_use":
        result = safe_tool_call(block.name, block.input)
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            **result,
        })
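
The TimeoutError branch above only fires if something enforces a deadline. One possible way to do that with the standard library is sketched below. call_with_timeout, slow_tool, and fast_tool are hypothetical names, and the approach assumes your tool is a blocking function that is safe to run in a worker thread.

```python
import concurrent.futures
import time

def call_with_timeout(fn, inputs: dict, timeout_s: float) -> dict:
    """Run fn(inputs) in a worker thread and stop waiting after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, inputs)
    try:
        return {"content": future.result(timeout=timeout_s)}
    except concurrent.futures.TimeoutError:
        # The worker thread is not killed; it keeps running in the background.
        return {"content": f"Tool timed out after {timeout_s}s.", "is_error": True}
    except Exception as exc:
        return {"content": f"Tool error: {type(exc).__name__}: {exc}", "is_error": True}
    finally:
        pool.shutdown(wait=False)  # do not block on a still-running worker

def slow_tool(inputs: dict) -> str:  # hypothetical tool that takes too long
    time.sleep(0.3)
    return "done"

def fast_tool(inputs: dict) -> str:  # hypothetical tool that returns immediately
    return "ok"

print(call_with_timeout(fast_tool, {}, timeout_s=1.0))
print(call_with_timeout(slow_tool, {}, timeout_s=0.05))
```

Because a thread cannot be cancelled from outside, this only bounds how long the agent waits. A tool with side effects may still complete after the timeout has been reported.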

Tool choice control

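The tool_choice parameter overrides Claude's default decision about whether and which tool to call. Use "tool" with a specific name to force a call to that tool (useful for structured extraction), "any" to require a call to one of the provided tools, "auto" (the default when tools are supplied) to let Claude decide, or "none" to suppress tool calls entirely and get a plain text response. Forcing a tool guarantees that the tool is called; it does not by itself guarantee that the input validates against your schema, so validate the input in your code.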

python

# Force Claude to call a specific tool (useful for structured extraction)
tool_choice={"type": "tool", "name": "extract_fields"}

# Force any tool call (not end_turn)
tool_choice={"type": "any"}

# Claude decides (default)
tool_choice={"type": "auto"}

# Never use tools — return text only
tool_choice={"type": "none"}

Prompt caching with tools

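Mark your tool definitions as cacheable when they are large and reused across many calls. A cache_control marker caches the prompt prefix up to and including the block that carries it, so placing it on the last tool covers the whole tools array. The default TTL for an "ephemeral" cache entry is 5 minutes.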

python

tools_with_cache = [
    {
        "name": "search_docs",
        "description": "Search the documentation database...",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
                "max_results": {"type": "integer", "description": "Maximum number of results. Default 5."}
            },
            "required": ["query"]
        },
        "cache_control": {"type": "ephemeral"}   # cache this tool definition
    }
]

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools_with_cache,
    system=[
        {
            "type": "text",
            "text": "You are a documentation assistant with access to search.",
            "cache_control": {"type": "ephemeral"}   # also cache system prompt
        }
    ],
    messages=messages
)

Output (usage block when cached):

text

Usage(cache_creation_input_tokens=0, cache_read_input_tokens=1024, input_tokens=52, output_tokens=80)
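
Because a cache breakpoint covers everything up to and including the block that carries it, a tools array needs the marker only on its last entry. A small sketch of a helper that adds it without mutating the original list follows. The name with_cache_breakpoint and the two example tools are hypothetical.

```python
import copy

def with_cache_breakpoint(tools: list) -> list:
    """Return a copy of tools with cache_control set on the last definition."""
    if not tools:
        return []
    cached = copy.deepcopy(tools)  # leave the caller's definitions untouched
    cached[-1]["cache_control"] = {"type": "ephemeral"}
    return cached

base_tools = [
    {"name": "search_docs", "description": "Search the docs.",
     "input_schema": {"type": "object", "properties": {}}},
    {"name": "fetch_page", "description": "Fetch one page.",
     "input_schema": {"type": "object", "properties": {}}},
]

cached_tools = with_cache_breakpoint(base_tools)
print([("cache_control" in t) for t in cached_tools])  # [False, True]
print("cache_control" in base_tools[-1])               # False
```

Keeping the tool order stable between calls matters here, since any change to the prefix produces a different cache key.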

Tool schema best practices

Practice	Why
Keep descriptions short but precise	Token efficiency; Claude reads every description every turn
Name parameters unambiguously	`city_name` not `name` when there could be other names
Mark truly required fields as `required`	Prevents Claude from omitting fields you always need
Use `enum` for fixed choices	Avoids hallucinated values; validation is free
Add `default` in description, not schema	JSON Schema `default` is informational; Claude reads descriptions
Keep tool count under ~20	Beyond ~20 tools, Claude struggles to choose; group by domain
Write description from Claude's POV	"Call this when the user asks about weather" not "Gets weather"
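
Some of the practices in this table can be checked mechanically. The sketch below is one hypothetical linter (lint_tools is not part of any SDK) that flags an oversized tool list, missing descriptions, required fields absent from properties, and schema defaults that the description does not mention.

```python
def lint_tools(tools: list, max_tools: int = 20) -> list:
    """Return human-readable warnings for a list of tool definitions."""
    warnings = []
    if len(tools) > max_tools:
        warnings.append(f"{len(tools)} tools defined; consider grouping (limit {max_tools})")
    for tool in tools:
        name = tool.get("name", "(unnamed)")
        schema = tool.get("input_schema", {})
        props = schema.get("properties", {})
        if not tool.get("description"):
            warnings.append(f"{name}: missing description")
        # Every required field should actually be declared.
        for field in schema.get("required", []):
            if field not in props:
                warnings.append(f"{name}: required field '{field}' is not in properties")
        # A schema-level default should also be stated in the description.
        for field, spec in props.items():
            if "default" in spec and "default" not in spec.get("description", "").lower():
                warnings.append(f"{name}.{field}: schema default is not mentioned in the description")
    return warnings

example_tools = [{
    "name": "search_docs",
    "description": "Search the documentation. Call this when the user asks about the docs.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search query"},
            "max_results": {"type": "integer", "default": 5},
        },
        "required": ["query", "limit"],
    },
}]

for warning in lint_tools(example_tools):
    print(warning)
```

Running a check like this in CI is one way to keep tool definitions from drifting as a team adds tools.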

Tool result content types

The content field in a tool_result can be a string, or a list of content blocks (text + images):

python

# String (simple)
{"type": "tool_result", "tool_use_id": tc.id, "content": "12°C, cloudy"}

# List with image (e.g. a chart tool that returns a plot)
{
    "type": "tool_result",
    "tool_use_id": tc.id,
    "content": [
        {"type": "text", "text": "Chart generated:"},
        {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/png",
                "data": base64_png_data
            }
        }
    ]
}
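
Since content must be either a string or a list of content blocks, it can help to normalize whatever a handler returns before building the tool_result. A minimal sketch, assuming handlers may return strings, dicts, numbers, or ready-made block lists; to_tool_result_content is a hypothetical helper name.

```python
import json

def to_tool_result_content(value):
    """Coerce a handler's return value into valid tool_result content."""
    if isinstance(value, str):
        return value
    is_block_list = (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(b, dict) and "type" in b for b in value)
    )
    if is_block_list:
        return value  # already a list of content blocks
    # Dicts, numbers, None, and lists of plain values become a JSON string.
    return json.dumps(value, default=str)

print(to_tool_result_content("12°C, cloudy"))
print(to_tool_result_content({"temp": 12}))
print(to_tool_result_content([{"type": "text", "text": "Chart generated:"}]))
```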

Streaming with tool use

When streaming, tool input JSON arrives as a series of input_json_delta events that you must concatenate. The SDK's stream.get_final_message() collects the whole reconstruction for you — use the events directly only when you want to surface a partial tool call in the UI as it streams in (e.g. progressively rendering a planned action).

python

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=messages,
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            if event.content_block.type == "tool_use":
                print(f"\nTool call: {event.content_block.name}")
        elif event.type == "content_block_delta":
            if event.delta.type == "input_json_delta":
                print(event.delta.partial_json, end="", flush=True)
        elif event.type == "message_stop":
            print()

    # Get the final message for the full tool use input
    final_message = stream.get_final_message()

Output:

text

Tool call: get_weather
{"location":"Toronto, Canada","unit":"celsius"}
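
If you do consume the raw events, the partial_json fragments have to be joined per content block before they can be parsed. The sketch below shows that accumulation on simulated events. collect_tool_inputs is a hypothetical helper, and the SimpleNamespace objects only mimic the fields used here (type, index, content_block, delta).

```python
import json
from types import SimpleNamespace

def collect_tool_inputs(events) -> list:
    """Rebuild (tool_name, input_dict) pairs from streamed JSON fragments."""
    names, fragments = {}, {}
    for event in events:
        if event.type == "content_block_start" and event.content_block.type == "tool_use":
            names[event.index] = event.content_block.name
            fragments[event.index] = []
        elif event.type == "content_block_delta" and event.delta.type == "input_json_delta":
            fragments[event.index].append(event.delta.partial_json)
    # A tool called with no arguments streams an empty string, so fall back to "{}".
    return [(names[i], json.loads("".join(fragments[i]) or "{}")) for i in sorted(fragments)]

def ev(type_, index, **fields):
    return SimpleNamespace(type=type_, index=index, **fields)

events = [
    ev("content_block_start", 0, content_block=SimpleNamespace(type="tool_use", name="get_weather")),
    ev("content_block_delta", 0, delta=SimpleNamespace(type="input_json_delta", partial_json='{"location":"Tor')),
    ev("content_block_delta", 0, delta=SimpleNamespace(type="input_json_delta", partial_json='onto, Canada"}')),
    ev("content_block_stop", 0),
]

print(collect_tool_inputs(events))  # [('get_weather', {'location': 'Toronto, Canada'})]
```

The fragments are not valid JSON on their own, so parsing should wait until the block has finished streaming.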

Structured extraction with forced tool

Forcing a single tool with tool_choice={"type": "tool", "name": "..."} is a reliable way to get structured JSON from Claude: the model must call the tool, and it shapes the input to follow input_schema. This setting alone does not enforce the schema as a hard guarantee, so pair it with Pydantic (Python) or zod (TypeScript) to validate the output at runtime before the rest of your code relies on it.

python

from pydantic import BaseModel
import anthropic

class Invoice(BaseModel):
    invoice_id: str
    total_usd: float
    line_items: list[str]

tools = [{
    "name": "extract_invoice",
    "description": "Extract structured fields from an invoice.",
    "input_schema": Invoice.model_json_schema(),
}]

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=512,
    tools=tools,
    tool_choice={"type": "tool", "name": "extract_invoice"},
    messages=[{
        "role": "user",
        "content": "Invoice INV-2026-014 for $1,240.50 — items: hosting (annual), domain renewal."
    }],
)

tool_use = next(b for b in response.content if b.type == "tool_use")
invoice = Invoice.model_validate(tool_use.input)
print(invoice)

Output:

text

invoice_id='INV-2026-014' total_usd=1240.5 line_items=['hosting (annual)', 'domain renewal']

Multiple tool result media

A tool_result can return text, image, or document blocks together — useful for tools that produce a chart and a textual summary, or that read a file and a screenshot. Claude reasons over all of them as if they were part of the same user turn.

python

import base64

with open("plot.png", "rb") as f:
    plot_b64 = base64.standard_b64encode(f.read()).decode("ascii")

tool_results = [{
    "type": "tool_result",
    "tool_use_id": tool_use.id,
    "content": [
        {"type": "text", "text": "Generated plot of revenue by quarter."},
        {
            "type": "image",
            "source": {"type": "base64", "media_type": "image/png", "data": plot_b64},
        },
    ],
}]

Built-in tools

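The API also provides tools whose schemas are defined by Anthropic, so you do not write an input_schema for them. Each is enabled by adding an entry to the tools array with its versioned type and a name. Server tools (web search, code execution) run on Anthropic's side. Client tools (computer use, text editor, bash) still emit tool_use blocks that your code must execute and answer with a tool_result. Some of these tools require a beta header, and the supported version string varies by model, so check the documentation for the model you use.

Tool	Runs on	Purpose
`web_search_20250305`	Server	Claude searches the public web mid-turn
`computer_20250124`	Client	Mouse, keyboard, and screenshot control
`text_editor_20250124`	Client	View and edit files in your environment
`bash_20250124`	Client	Run shell commands in your environment
`code_execution_20250522`	Server	Run Python in an isolated container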

python

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    tools=[
        {"type": "web_search_20250305", "name": "web_search", "max_uses": 3},
    ],
    messages=[{"role": "user", "content": "What did Anthropic announce this week?"}],
)
print(response.content[-1].text)

Output:

text

This week Anthropic announced two major updates: a new prompt caching SDK
helper and expanded availability of the Files API in the EU region…

Built-in tools incur their own usage fees (e.g. per-search for web_search). Check pricing before enabling on a high-volume endpoint.

Streaming-friendly result rendering

For long-running tools (database queries, web scrapes, model calls) you can stream back a "thinking out loud" status while the tool executes. The cleanest pattern is to start the tool in a background task as soon as you see the tool_use block, render progress to the user via your own SSE channel, and only feed the final tool_result back into Claude when it is complete.

python

import asyncio

async def long_running_tool(args: dict, on_progress) -> str:
    for i in range(5):
        await on_progress(f"step {i+1}/5")
        await asyncio.sleep(0.5)
    return "completed"

async def handle_with_progress(block, send_to_ui):
    if block.type != "tool_use":
        return None
    result = await long_running_tool(block.input, send_to_ui)
    return {
        "type": "tool_result",
        "tool_use_id": block.id,
        "content": result,
    }

TypeScript example

The same patterns translate one-to-one to the TypeScript SDK. The key API differences: Anthropic from @anthropic-ai/sdk, content blocks typed as Anthropic.ContentBlock, and tool input arrives as input: unknown so you typically validate with zod.

typescript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools: Anthropic.Tool[] = [{
  name: "get_weather",
  description: "Get current weather. Call this when the user asks about weather.",
  input_schema: {
    type: "object",
    properties: { location: { type: "string" } },
    required: ["location"],
  },
}];

const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  tools,
  messages: [{ role: "user", content: "Weather in Toronto?" }],
});

for (const block of response.content) {
  if (block.type === "tool_use") {
    console.log(block.name, block.input);
  }
}

Output:

text

get_weather { location: 'Toronto, Canada' }

See TypeScript SDK for full SDK coverage.

Agentic loop with retry budget

A production agentic loop tracks not just max_turns but also a per-tool retry budget. The model can otherwise loop on a flaky tool forever, eating tokens. Combine a turn cap with per-tool failure counts and surface Error: tool repeatedly failed after N retries so the model can recover or escalate.

python

from collections import Counter

def run_agent(user_message: str, tools: list, *, max_turns: int = 20, max_tool_failures: int = 3) -> str:
    messages = [{"role": "user", "content": user_message}]
    failures: Counter[str] = Counter()

    for turn in range(max_turns):
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason == "end_turn":
            return "".join(b.text for b in response.content if b.type == "text")

        if response.stop_reason != "tool_use":
            return f"Unexpected stop_reason: {response.stop_reason}"

        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            if failures[block.name] >= max_tool_failures:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"Tool {block.name} has failed {max_tool_failures} times — stop calling it.",
                    "is_error": True,
                })
                continue
            try:
                content = handle_tool_call(block.name, block.input)
                tool_results.append({"type": "tool_result", "tool_use_id": block.id, "content": content})
            except Exception as exc:
                failures[block.name] += 1
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"Error: {exc}",
                    "is_error": True,
                })
        messages.append({"role": "user", "content": tool_results})

    return "Max turns reached"

Token accounting for tools

Tool definitions count against your input tokens on every call. Large tool arrays (10+ verbose schemas) can add hundreds of tokens per turn even when no tool is called. Measure with count_tokens and prune descriptions or split into specialised endpoints when the budget hurts.

python

count = client.messages.count_tokens(
    model="claude-opus-4-7",
    tools=tools,
    messages=[{"role": "user", "content": "Hi"}],
)
print(f"Input tokens with tools: {count.input_tokens}")

Output:

text

Input tokens with tools: 412

Common pitfalls

Pitfall	Symptom	Fix
Returning Python object instead of string	`BadRequestError: content must be str`	`json.dumps()` before returning
Missing `tool_use_id`	`BadRequestError: no matching tool_use`	Copy `block.id` verbatim into the result
Placing text before tool_result blocks in the same user turn	Request rejected or confused model	Put all tool_result blocks first in the content list; add any follow-up text after them
Forgetting `is_error: true` on failures	Claude may treat the error text as a valid result	Always set `is_error` when the tool raised
Forcing a tool then not handling it	Model gets stuck	After `tool_choice={"type": "tool", ...}`, always dispatch and reply
Tool descriptions written for humans	Model fails to choose	Rewrite from Claude's POV: "Call this when …"
No `max_turns` ceiling	Infinite loop on a buggy tool	Cap loops; track per-tool failure budgets

Common recipes

Tool registry with decorators

python

from typing import Callable

_TOOLS: dict[str, dict] = {}
_HANDLERS: dict[str, Callable] = {}

def tool(name: str, description: str, schema: dict):
    def decorator(fn: Callable) -> Callable:
        _TOOLS[name] = {"name": name, "description": description, "input_schema": schema}
        _HANDLERS[name] = fn
        return fn
    return decorator

@tool(
    name="get_weather",
    description="Get current weather. Call this when the user asks about weather.",
    schema={
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
)
def get_weather(location: str) -> str:
    return f"15C in {location}"

def dispatch(name: str, inputs: dict) -> str:
    return _HANDLERS[name](**inputs)

tools_list = list(_TOOLS.values())
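
Writing the schema by hand for every handler gets repetitive. One possible extension is to derive a basic input_schema from the function signature, as sketched below. schema_from_signature is a hypothetical helper that only understands str, int, float, and bool annotations, and it assumes annotations are real types (not strings, as they are under from __future__ import annotations).

```python
import inspect

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_signature(fn) -> dict:
    """Build a JSON Schema object from a function's annotated parameters."""
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        json_type = _JSON_TYPES.get(param.annotation)
        if json_type is None:
            raise TypeError(f"{fn.__name__}: unsupported annotation on '{name}'")
        properties[name] = {"type": json_type}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}

def get_weather(location: str, unit: str = "celsius") -> str:
    return f"15 degrees {unit} in {location}"

print(schema_from_signature(get_weather))
```

The generated schema has no per-parameter descriptions, so hand-written descriptions are still worth adding for anything Claude could misread.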

Async parallel tool execution

python

import asyncio

async def dispatch_all(blocks: list) -> list[dict]:
    """Execute multiple tool_use blocks concurrently.

    async_handle_tool is your own async handler (not defined here).
    """
    async def one(block):
        try:
            content = await async_handle_tool(block.name, block.input)
            return {"type": "tool_result", "tool_use_id": block.id, "content": content}
        except Exception as exc:
            return {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"Error: {exc}",
                "is_error": True,
            }
    tool_blocks = [b for b in blocks if b.type == "tool_use"]
    return await asyncio.gather(*(one(b) for b in tool_blocks))

Mocking tools in tests

python

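from myapp.agent import run_agent, tools  # hypothetical module layout, matching the patch target below

# Only the tool handler is mocked here; run_agent still calls the live API,
# so the assertions depend on real model output.
def test_agent_handles_weather_query(monkeypatch):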
    calls = []
    def fake_handle(name, inputs):
        calls.append((name, inputs))
        return '{"temp": 20, "condition": "sunny"}'
    monkeypatch.setattr("myapp.agent.handle_tool_call", fake_handle)

    result = run_agent("What's the weather in Berlin?", tools)
    assert "20" in result
    assert calls == [("get_weather", {"location": "Berlin, Germany"})]
