cheat sheet
Claude Tool Use (Function Calling)
Define tools, handle tool calls, run agentic loops, use parallel tools, and manage errors with the Claude API.
What it is
Tool use (function calling) is Claude's ability to request the execution of developer-defined functions during a conversation. You define tools as JSON Schema objects describing their name, purpose, and parameters; Claude emits a tool_use content block when it decides to invoke one, your code executes the function and returns a tool_result, and Claude continues reasoning with the result to produce a final response. This pattern is the foundation of agentic applications where Claude must interact with databases, APIs, filesystems, or any external system to answer a question or complete a task.
Tool use lets Claude call external functions during a conversation. Claude decides when to call a tool, sends a structured request, receives the result, and incorporates it into its response.
Define a tool
Tools are JSON Schema objects describing name, purpose, and input parameters.
tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location. Call this when the user asks about weather.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country, e.g. 'Toronto, Canada'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit. Default celsius."
                }
            },
            "required": ["location"]
        }
    }
]
Write the description from Claude's perspective: explain when to call the tool, not just what it does. Claude uses descriptions to decide whether to call the tool at all.
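A small lint pass can catch malformed definitions before they reach the API. The sketch below is illustrative only: check_tool_definition and NAME_PATTERN are hypothetical names, and the naming rule (letters, digits, underscores, and hyphens, up to 64 characters) is an assumption to verify against the current API reference.

```python
import re

# Assumed naming rule: letters, digits, underscores, hyphens, at most 64 characters.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")


def check_tool_definition(tool: dict) -> list:
    """Return a list of human-readable problems found in one tool definition."""
    problems = []
    name = tool.get("name", "")
    if not NAME_PATTERN.match(name):
        problems.append(f"invalid tool name: {name!r}")
    if not tool.get("description", "").strip():
        problems.append("missing description")
    schema = tool.get("input_schema", {})
    if schema.get("type") != "object":
        problems.append("input_schema.type must be 'object'")
    properties = schema.get("properties", {})
    # Every required field must actually be declared under properties.
    for field in schema.get("required", []):
        if field not in properties:
            problems.append(f"required field {field!r} is not in properties")
    # Undescribed parameters give Claude nothing to go on.
    for field, spec in properties.items():
        if "description" not in spec:
            problems.append(f"property {field!r} has no description")
    return problems


weather_tool = {
    "name": "get_weather",
    "description": "Get current weather for a location. Call this when the user asks about weather.",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City and country, e.g. 'Toronto, Canada'"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit. Default celsius."},
        },
        "required": ["location"],
    },
}

print(check_tool_definition(weather_tool))  # → []
```

Running a check like this in CI is one way to keep a growing tool list consistent; the rules themselves are a matter of taste.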
First API call
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Toronto?"}]
)
print(response.stop_reason) # "tool_use"
print(response.content) # list of TextBlock and/or ToolUseBlock
Output:
tool_use
[ToolUseBlock(id='toolu_01XVn...', input={'location': 'Toronto, Canada'}, name='get_weather', type='tool_use')]
Handle the tool call
import json
def handle_tool_call(name: str, inputs: dict) -> str:
    if name == "get_weather":
        location = inputs["location"]
        unit = inputs.get("unit", "celsius")
        # Call your real weather API here
        return json.dumps({"temp": 12, "condition": "cloudy", "unit": unit})
    raise ValueError(f"Unknown tool: {name}")

tool_use = next(b for b in response.content if b.type == "tool_use")
result = handle_tool_call(tool_use.name, tool_use.input)
Continue the conversation
Append the assistant's response and the tool result, then call again to get the final answer.
messages = [
    {"role": "user", "content": "What's the weather in Toronto?"},
    {"role": "assistant", "content": response.content},  # include full content list
    {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": result  # string or list of content blocks
            }
        ]
    }
]
final = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=messages
)
print(final.content[0].text)
Output:
The current weather in Toronto, Canada is 12°C and cloudy.
Full agentic loop
A loop that keeps calling the API until Claude returns stop_reason == "end_turn". Each iteration dispatches any tool_use blocks to your handler, appends the results as a tool_result turn, and calls the API again. Always cap the loop with max_turns: a handler that keeps failing, or a model that keeps requesting tools, can otherwise spin indefinitely.
def run_agent(user_message: str, tools: list, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": user_message}]
    for turn in range(max_turns):
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            text = [b.text for b in response.content if b.type == "text"]
            return text[-1] if text else ""
        if response.stop_reason != "tool_use":
            # e.g. "max_tokens": stop instead of re-sending a transcript that ends on an assistant turn
            return f"Unexpected stop_reason: {response.stop_reason}"
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            try:
                result_content = handle_tool_call(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result_content,
                })
            except Exception as exc:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"Error: {exc}",
                    "is_error": True,  # tells Claude the tool failed
                })
        messages.append({"role": "user", "content": tool_results})
    return "Max turns reached"
Always set a max_turns ceiling. Without one, a bug in your tool handler or an unexpected Claude response can loop indefinitely. 10 is a safe default for most tasks; complex agentic pipelines may need 20–50.
Parallel tool use
Claude may call multiple tools in a single response. Handle all ToolUseBlock items in the content list.
import anthropic
tools = [
    {
        "name": "get_stock_price",
        "description": "Get current stock price for a ticker symbol.",
        "input_schema": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"]
        }
    },
    {
        "name": "get_company_news",
        "description": "Get recent news headlines for a company.",
        "input_schema": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"]
        }
    }
]
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's AAPL's price and latest news?"}]
)
# Claude may return TWO tool_use blocks in one response
tool_calls = [b for b in response.content if b.type == "tool_use"]
print(f"Tool calls requested: {len(tool_calls)}")
# Handle all of them and return all results in one user turn
tool_results = []
for tc in tool_calls:
    result = handle_tool_call(tc.name, tc.input)
    tool_results.append({
        "type": "tool_result",
        "tool_use_id": tc.id,
        "content": result,
    })
Output:
Tool calls requested: 2
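Because every tool_use block needs a matching tool_result in the next user turn, it can help to check for gaps before sending. The sketch below is a minimal illustration: unanswered_tool_calls is a hypothetical helper, and the SimpleNamespace objects and ids are stand-ins for the SDK's content blocks.

```python
from types import SimpleNamespace


def unanswered_tool_calls(content: list, tool_results: list) -> list:
    """Return ids of tool_use blocks that have no matching tool_result."""
    requested = [b.id for b in content if b.type == "tool_use"]
    answered = {r["tool_use_id"] for r in tool_results}
    return [tool_id for tool_id in requested if tool_id not in answered]


# Stand-ins for the SDK's content blocks (hypothetical ids and values).
content = [
    SimpleNamespace(type="text", text="Let me check both."),
    SimpleNamespace(type="tool_use", id="toolu_a", name="get_stock_price", input={"ticker": "AAPL"}),
    SimpleNamespace(type="tool_use", id="toolu_b", name="get_company_news", input={"ticker": "AAPL"}),
]
tool_results = [{"type": "tool_result", "tool_use_id": "toolu_a", "content": "189.30"}]

print(unanswered_tool_calls(content, tool_results))  # → ['toolu_b']
```

An empty list means the user turn is complete and safe to append to the transcript.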
Disable parallel tool use
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "auto", "disable_parallel_tool_use": True},
    messages=messages
)
Error handling with is_error
When a tool call fails, return is_error: true instead of raising an exception. Claude will acknowledge the failure and decide whether to retry or respond differently.
def safe_tool_call(name: str, inputs: dict) -> dict:
    try:
        content = handle_tool_call(name, inputs)
        return {"content": content}
    except TimeoutError:
        return {"content": "Tool timed out after 10s.", "is_error": True}
    except Exception as exc:
        return {"content": f"Tool error: {type(exc).__name__}: {exc}", "is_error": True}

# Then in your loop:
for block in response.content:
    if block.type == "tool_use":
        result = safe_tool_call(block.name, block.input)
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            **result,
        })
Tool choice control
The tool_choice parameter overrides Claude's default decision about whether and which tool to call. Use "tool" with a specific name to force a call to that tool (useful for structured extraction, though you should still validate the input against your schema), "any" to ensure at least one tool is called, "auto" (the default when tools are provided) to let Claude decide, or "none" to suppress tool calls entirely and get a plain text response.
# Force Claude to call a specific tool (useful for structured extraction)
tool_choice={"type": "tool", "name": "extract_fields"}
# Force any tool call (not end_turn)
tool_choice={"type": "any"}
# Claude decides (default)
tool_choice={"type": "auto"}
# Never use tools — return text only
tool_choice={"type": "none"}
Prompt caching with tools
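Mark your tool definitions as cacheable when they are large and reused across many calls. The default cache TTL for ephemeral entries is 5 minutes. A cache_control marker on the last tool caches the whole tools array up to and including that entry.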
tools_with_cache = [
    {
        "name": "search_docs",
        "description": "Search the documentation database...",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
                "max_results": {"type": "integer", "default": 5}
            },
            "required": ["query"]
        },
        "cache_control": {"type": "ephemeral"}  # cache this tool definition
    }
]
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools_with_cache,
    system=[
        {
            "type": "text",
            "text": "You are a documentation assistant with access to search.",
            "cache_control": {"type": "ephemeral"}  # also cache system prompt
        }
    ],
    messages=messages
)
Output (usage block when cached):
Usage(cache_creation_input_tokens=1024, cache_read_input_tokens=1024, input_tokens=52, output_tokens=80)
Tool schema best practices
| Practice | Why |
|---|---|
| Keep descriptions short but precise | Token efficiency; Claude reads every description every turn |
| Name parameters unambiguously | city_name not name when there could be other names |
| Mark truly required fields as required | Prevents Claude from omitting fields you always need |
| Use enum for fixed choices | Avoids hallucinated values; validation is free |
| Add default in description, not schema | JSON Schema default is informational; Claude reads descriptions |
| Keep tool count under ~20 | As a rule of thumb, tool selection gets less reliable as the list grows; group by domain |
| Write description from Claude's POV | "Call this when the user asks about weather" not "Gets weather" |
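Since required and enum are only as useful as the checks behind them, a handler can re-validate inputs before dispatch. The sketch below covers a small subset of JSON Schema (required, flat type, enum) and is an illustration, not a full validator; validate_input and JSON_TYPES are hypothetical names, and a library such as jsonschema would be the more complete option.

```python
# Maps the JSON Schema type names used in this cheat sheet to Python types.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def _matches(value, json_type) -> bool:
    """Check a value against one JSON Schema type name."""
    if json_type in ("integer", "number") and isinstance(value, bool):
        return False  # bool is a subclass of int in Python, so reject it explicitly
    expected = JSON_TYPES.get(json_type)
    return expected is None or isinstance(value, expected)


def validate_input(schema: dict, inputs: dict) -> list:
    """Return a list of validation errors; an empty list means the input passed."""
    errors = []
    properties = schema.get("properties", {})
    for field in schema.get("required", []):
        if field not in inputs:
            errors.append(f"missing required field: {field}")
    for field, value in inputs.items():
        spec = properties.get(field)
        if spec is None:
            errors.append(f"unexpected field: {field}")
            continue
        if not _matches(value, spec.get("type")):
            errors.append(f"{field} should be of type {spec.get('type')}")
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{field} must be one of {spec['enum']}, got {value!r}")
    return errors


weather_schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}

print(validate_input(weather_schema, {"location": "Toronto, Canada"}))  # → []
print(validate_input(weather_schema, {"unit": "kelvin"}))
```

Returning the error list to Claude as a tool_result with is_error set gives the model a chance to correct the call.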
Tool result content types
The content field in a tool_result can be a string, or a list of content blocks (text + images):
# String (simple)
{"type": "tool_result", "tool_use_id": tc.id, "content": "12°C, cloudy"}
# List with image (e.g. a chart tool that returns a plot)
{
    "type": "tool_result",
    "tool_use_id": tc.id,
    "content": [
        {"type": "text", "text": "Chart generated:"},
        {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/png",
                "data": base64_png_data
            }
        }
    ]
}
Streaming with tool use
When streaming, tool input JSON arrives as a series of input_json_delta events that you must concatenate. The SDK's stream.get_final_message() collects the whole reconstruction for you — use the events directly only when you want to surface a partial tool call in the UI as it streams in (e.g. progressively rendering a planned action).
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=messages,
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            if event.content_block.type == "tool_use":
                print(f"\nTool call: {event.content_block.name}")
        elif event.type == "content_block_delta":
            if event.delta.type == "input_json_delta":
                print(event.delta.partial_json, end="", flush=True)
        elif event.type == "message_stop":
            print()
    # Get the final message for the full tool use input (while the stream is still open)
    final_message = stream.get_final_message()
Output:
Tool call: get_weather
{"location":"Toronto, Canada","unit":"celsius"}
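If you do handle the deltas yourself, the fragments are only valid JSON once concatenated. The sketch below shows that accumulation under simplifying assumptions: assemble_tool_inputs is a hypothetical helper, and the flattened event dicts are stand-ins for the SDK's event objects, not their real shape.

```python
import json


def assemble_tool_inputs(events: list) -> dict:
    """Concatenate partial_json fragments per content-block index, then parse once."""
    buffers = {}
    for event in events:
        if event["type"] == "content_block_start" and event["block_type"] == "tool_use":
            buffers[event["index"]] = ""  # open a buffer for this tool call
        elif event["type"] == "input_json_delta" and event["index"] in buffers:
            buffers[event["index"]] += event["partial_json"]
    # An empty buffer is treated as a call with no arguments.
    return {index: json.loads(raw or "{}") for index, raw in buffers.items()}


# Simplified stand-in events (hypothetical shape) for one streamed tool call.
events = [
    {"type": "content_block_start", "index": 0, "block_type": "tool_use"},
    {"type": "input_json_delta", "index": 0, "partial_json": '{"location":'},
    {"type": "input_json_delta", "index": 0, "partial_json": ' "Toronto, Canada",'},
    {"type": "input_json_delta", "index": 0, "partial_json": ' "unit": "celsius"}'},
]

print(assemble_tool_inputs(events))
# → {0: {'location': 'Toronto, Canada', 'unit': 'celsius'}}
```

Parsing only after the block ends avoids json.loads errors on half-delivered fragments.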
Structured extraction with forced tool
Forcing a single tool with tool_choice={"type": "tool", "name": "..."} is a reliable way to get structured JSON from Claude, because the response must be a call to that tool and input_schema describes the expected shape. The input can still occasionally deviate from the schema, so pair it with Pydantic (Python) or zod (TypeScript) on your side to validate at runtime that the model output matches the types the rest of your code expects.
from pydantic import BaseModel
import anthropic
class Invoice(BaseModel):
    invoice_id: str
    total_usd: float
    line_items: list[str]

tools = [{
    "name": "extract_invoice",
    "description": "Extract structured fields from an invoice.",
    "input_schema": Invoice.model_json_schema(),
}]
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=512,
    tools=tools,
    tool_choice={"type": "tool", "name": "extract_invoice"},
    messages=[{
        "role": "user",
        "content": "Invoice INV-2026-014 for $1,240.50, items: hosting (annual), domain renewal."
    }],
)
tool_use = next(b for b in response.content if b.type == "tool_use")
invoice = Invoice.model_validate(tool_use.input)
print(invoice)
Output:
invoice_id='INV-2026-014' total_usd=1240.5 line_items=['hosting (annual)', 'domain renewal']
Multiple tool result media
A tool_result can return text, image, or document blocks together — useful for tools that produce a chart and a textual summary, or that read a file and a screenshot. Claude reasons over all of them as if they were part of the same user turn.
import base64
with open("plot.png", "rb") as f:
    plot_b64 = base64.standard_b64encode(f.read()).decode("ascii")

tool_results = [{
    "type": "tool_result",
    "tool_use_id": tool_use.id,
    "content": [
        {"type": "text", "text": "Generated plot of revenue by quarter."},
        {
            "type": "image",
            "source": {"type": "base64", "media_type": "image/png", "data": plot_b64},
        },
    ],
}]
Built-in tools
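The API also exposes tools whose schemas are defined by Anthropic, so you do not write an input_schema for them. Each is enabled by adding an entry to the tools array with a versioned type such as web_search_20250305. Server tools (web search, code execution) run on Anthropic's infrastructure; client tools (computer use, text editor, bash) still emit tool_use blocks that your own code must execute and answer with a tool_result.
| Tool | Purpose |
|---|---|
| web_search_20250305 | Claude searches the public web mid-turn (server-side) |
| computer_20250124 | Mouse, keyboard, and screenshot control (executed by your code) |
| text_editor_20250124 | Read/edit files in an environment you provide (executed by your code) |
| bash_20250124 | Run shell commands in an environment you provide (executed by your code) |
| code_execution_20250522 | Run Python in an isolated container (server-side) |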
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    tools=[
        {"type": "web_search_20250305", "name": "web_search", "max_uses": 3},
    ],
    messages=[{"role": "user", "content": "What did Anthropic announce this week?"}],
)
print(response.content[-1].text)
Output:
This week Anthropic announced two major updates: a new prompt caching SDK
helper and expanded availability of the Files API in the EU region…
Built-in tools incur their own usage fees (e.g. per-search for web_search). Check pricing before enabling on a high-volume endpoint.
Streaming-friendly result rendering
For long-running tools (database queries, web scrapes, model calls) you can stream back a "thinking out loud" status while the tool executes. The cleanest pattern is to start the tool in a background task as soon as you see the tool_use block, render progress to the user via your own SSE channel, and only feed the final tool_result back into Claude when it is complete.
import asyncio
async def long_running_tool(args: dict, on_progress) -> str:
    for i in range(5):
        await on_progress(f"step {i+1}/5")
        await asyncio.sleep(0.5)
    return "completed"

async def handle_with_progress(block, send_to_ui):
    if block.type != "tool_use":
        return None
    result = await long_running_tool(block.input, send_to_ui)
    return {
        "type": "tool_result",
        "tool_use_id": block.id,
        "content": result,
    }
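The background-task part of this pattern can be sketched with asyncio.create_task plus a timeout. Everything named here (slow_lookup, run_in_background, the in-memory send_to_ui, the 5-second timeout) is a hypothetical stand-in; a real application would push updates over its own SSE channel.

```python
import asyncio
from types import SimpleNamespace


async def slow_lookup(args: dict, on_progress) -> str:
    """Hypothetical long-running tool that reports progress as it goes."""
    for step in range(3):
        await on_progress(f"step {step + 1}/3")
        await asyncio.sleep(0.01)
    return f"done: {args['query']}"


async def run_in_background(block, send_to_ui) -> dict:
    """Start the tool as a task, then wrap its final value as a tool_result."""
    task = asyncio.create_task(slow_lookup(block.input, send_to_ui))
    await send_to_ui(f"started {block.name}")  # the UI hears about the call right away
    try:
        content = await asyncio.wait_for(task, timeout=5)
        return {"type": "tool_result", "tool_use_id": block.id, "content": content}
    except asyncio.TimeoutError:
        # wait_for cancels the task on timeout; report the failure to Claude
        return {"type": "tool_result", "tool_use_id": block.id,
                "content": "Tool timed out.", "is_error": True}


async def demo():
    updates = []

    async def send_to_ui(message: str) -> None:
        updates.append(message)  # stand-in for an SSE push

    block = SimpleNamespace(type="tool_use", id="toolu_demo",
                            name="slow_lookup", input={"query": "orders"})
    result = await run_in_background(block, send_to_ui)
    return updates, result


updates, result = asyncio.run(demo())
print(updates)  # → ['started slow_lookup', 'step 1/3', 'step 2/3', 'step 3/3']
print(result["content"])  # → done: orders
```

Only the final dict goes back to Claude; the progress messages stay between your server and the UI.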
TypeScript example
The same patterns translate one-to-one to the TypeScript SDK. The key API differences: Anthropic from @anthropic-ai/sdk, content blocks typed as Anthropic.ContentBlock, and tool input arrives as input: unknown so you typically validate with zod.
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const tools: Anthropic.Tool[] = [{
  name: "get_weather",
  description: "Get current weather. Call this when the user asks about weather.",
  input_schema: {
    type: "object",
    properties: { location: { type: "string" } },
    required: ["location"],
  },
}];

const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  tools,
  messages: [{ role: "user", content: "Weather in Toronto?" }],
});

for (const block of response.content) {
  if (block.type === "tool_use") {
    console.log(block.name, block.input);
  }
}
Output:
get_weather { location: 'Toronto, Canada' }
See TypeScript SDK for full SDK coverage.
Agentic loop with retry budget
A production agentic loop tracks not just max_turns but also a per-tool retry budget. The model can otherwise loop on a flaky tool forever, eating tokens. Combine a turn cap with per-tool failure counts and surface Error: tool repeatedly failed after N retries so the model can recover or escalate.
from collections import Counter
def run_agent(user_message: str, tools: list, *, max_turns: int = 20, max_tool_failures: int = 3) -> str:
    messages = [{"role": "user", "content": user_message}]
    failures: Counter[str] = Counter()
    for turn in range(max_turns):
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            return "".join(b.text for b in response.content if b.type == "text")
        if response.stop_reason != "tool_use":
            return f"Unexpected stop_reason: {response.stop_reason}"
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            if failures[block.name] >= max_tool_failures:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"Tool {block.name} has failed {max_tool_failures} times; stop calling it.",
                    "is_error": True,
                })
                continue
            try:
                content = handle_tool_call(block.name, block.input)
                tool_results.append({"type": "tool_result", "tool_use_id": block.id, "content": content})
            except Exception as exc:
                failures[block.name] += 1
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"Error: {exc}",
                    "is_error": True,
                })
        messages.append({"role": "user", "content": tool_results})
    return "Max turns reached"
Token accounting for tools
Tool definitions count against your input tokens on every call. Large tool arrays (10+ verbose schemas) can add hundreds of tokens per turn even when no tool is called. Measure with count_tokens and prune descriptions or split into specialised endpoints when the budget hurts.
count = client.messages.count_tokens(
    model="claude-opus-4-7",
    tools=tools,
    messages=[{"role": "user", "content": "Hi"}],
)
print(f"Input tokens with tools: {count.input_tokens}")
Output:
Input tokens with tools: 412
Common pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Returning Python object instead of string | BadRequestError: content must be str | json.dumps() before returning |
| Missing tool_use_id | BadRequestError: no matching tool_use | Copy block.id verbatim into the result |
| Putting text before tool_result blocks in the same user turn | Request rejected or confused model | Put tool_result blocks first; add any follow-up text after them or in the next turn |
| Forgetting is_error: true on failures | Claude may treat the error text as a successful result | Always set is_error when the tool raised |
| Forcing a tool then not handling it | Model gets stuck | After tool_choice={"type": "tool", ...}, always dispatch and reply |
| Tool descriptions written for humans | Model fails to choose | Rewrite from Claude's POV: "Call this when …" |
| No max_turns ceiling | Infinite loop on a buggy tool | Cap loops; track per-tool failure budgets |
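Several of these pitfalls can be avoided by building the tool_result turn in one place. The sketch below is a minimal illustration: build_tool_result_turn is a hypothetical helper that serialises non-string content, copies ids verbatim, carries is_error through, and places any optional follow-up text after the results.

```python
import json
from typing import Optional


def build_tool_result_turn(results: list, follow_up: Optional[str] = None) -> dict:
    """Build the user turn that answers a batch of tool calls.

    Each entry in `results` needs "tool_use_id" and "content"; "is_error" is optional.
    """
    blocks = []
    for result in results:
        content = result["content"]
        if not isinstance(content, (str, list)):
            content = json.dumps(content)  # serialise dicts, numbers, and so on
        block = {"type": "tool_result", "tool_use_id": result["tool_use_id"], "content": content}
        if result.get("is_error"):
            block["is_error"] = True
        blocks.append(block)
    if follow_up:
        blocks.append({"type": "text", "text": follow_up})  # text goes after the results
    return {"role": "user", "content": blocks}


turn = build_tool_result_turn(
    [
        {"tool_use_id": "toolu_1", "content": {"temp": 12}},
        {"tool_use_id": "toolu_2", "content": "upstream timeout", "is_error": True},
    ],
    follow_up="Summarise what worked.",
)
print(turn["content"][0]["content"])  # → {"temp": 12}
print([block["type"] for block in turn["content"]])  # → ['tool_result', 'tool_result', 'text']
```

The ids and payloads in the usage example are made up; in a real loop they come from the tool_use blocks in the previous assistant turn.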
Common recipes
Tool registry with decorators
from typing import Callable
_TOOLS: dict[str, dict] = {}
_HANDLERS: dict[str, Callable] = {}
def tool(name: str, description: str, schema: dict):
    def decorator(fn: Callable) -> Callable:
        _TOOLS[name] = {"name": name, "description": description, "input_schema": schema}
        _HANDLERS[name] = fn
        return fn
    return decorator

@tool(
    name="get_weather",
    description="Get current weather. Call this when the user asks about weather.",
    schema={
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
)
def get_weather(location: str) -> str:
    return f"15C in {location}"

def dispatch(name: str, inputs: dict) -> str:
    return _HANDLERS[name](**inputs)

tools_list = list(_TOOLS.values())
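One possible extension of the registry is to derive a basic input_schema from the handler's signature instead of writing it by hand. The sketch below is an assumption-laden illustration: schema_from_signature and PY_TO_JSON are hypothetical, only a few scalar annotations are mapped, and parameter descriptions would still need to be added separately.

```python
import inspect

# Minimal mapping from Python annotations to JSON Schema type names.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_signature(fn) -> dict:
    """Build a minimal input_schema from a function's annotated parameters."""
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unmapped or missing annotations fall back to "string".
        properties[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
    return {"type": "object", "properties": properties, "required": required}


def get_forecast(location: str, days: int = 3) -> str:
    return f"{days}-day forecast for {location}"


print(schema_from_signature(get_forecast))
# → {'type': 'object', 'properties': {'location': {'type': 'string'}, 'days': {'type': 'integer'}}, 'required': ['location']}
```

Hand-written schemas remain the better choice when parameters need enum values or descriptions.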
Async parallel tool execution
import asyncio
async def dispatch_all(blocks: list) -> list[dict]:
    """Execute multiple tool_use blocks in parallel."""
    async def one(block):
        try:
            # async_handle_tool is assumed to be your async counterpart of handle_tool_call
            content = await async_handle_tool(block.name, block.input)
            return {"type": "tool_result", "tool_use_id": block.id, "content": content}
        except Exception as exc:
            return {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"Error: {exc}",
                "is_error": True,
            }

    tool_blocks = [b for b in blocks if b.type == "tool_use"]
    # gather preserves input order, so results line up with tool_blocks
    return await asyncio.gather(*(one(b) for b in tool_blocks))
Mocking tools in tests
def test_agent_handles_weather_query(monkeypatch):
    calls = []

    def fake_handle(name, inputs):
        calls.append((name, inputs))
        return '{"temp": 20, "condition": "sunny"}'

    monkeypatch.setattr("myapp.agent.handle_tool_call", fake_handle)
    result = run_agent("What's the weather in Berlin?", tools)
    assert "20" in result
    assert calls == [("get_weather", {"location": "Berlin, Germany"})]
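Mocking only the handler still leaves the test calling the real API. One way to make it fully offline, sketched below, is to script the model's responses with a fake client. FakeMessages and run_agent_with are hypothetical names, the SimpleNamespace objects are stand-ins for SDK responses, and the loop is a compact variant of the agent loop that takes its client and handler as arguments.

```python
from types import SimpleNamespace


class FakeMessages:
    """Returns scripted responses in order and records every request."""

    def __init__(self, scripted):
        self._scripted = list(scripted)
        self.requests = []

    def create(self, **kwargs):
        # Snapshot the message list so later appends by the caller do not alter the record.
        self.requests.append({**kwargs, "messages": list(kwargs["messages"])})
        return self._scripted.pop(0)


def run_agent_with(client, user_message, tools, handle_tool_call, max_turns=5):
    """Compact agent loop with the client and tool handler injected."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-opus-4-7", max_tokens=1024, tools=tools, messages=messages
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": handle_tool_call(b.name, b.input)}
            for b in response.content if b.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    return "Max turns reached"


# Scripted conversation: one tool call, then a final text answer.
scripted = [
    SimpleNamespace(stop_reason="tool_use", content=[
        SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather",
                        input={"location": "Berlin, Germany"}),
    ]),
    SimpleNamespace(stop_reason="end_turn", content=[
        SimpleNamespace(type="text", text="It is 20°C and sunny in Berlin."),
    ]),
]
fake_client = SimpleNamespace(messages=FakeMessages(scripted))
calls = []


def fake_handle(name, inputs):
    calls.append((name, inputs))
    return '{"temp": 20, "condition": "sunny"}'


answer = run_agent_with(fake_client, "What's the weather in Berlin?", [], fake_handle)
print(answer)  # → It is 20°C and sunny in Berlin.
```

Injecting the client this way trades a little verbosity for tests that run without network access or API keys.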
See also
- Python SDK — message API, streaming, vision.
- TypeScript SDK — same API in TS.
- Streaming — SSE events for tool input deltas.
- Prompt caching — cache large tool definitions.