The Anthropic Files API — upload PDFs, images, and text once and reference them by `file_id` across multiple messages, with citations, lifecycle management, and Workbench integration.
Claude API — Files
What it is
The Files API lets you upload PDFs, images, and text once and reference them by file_id in any subsequent messages.create call, with no base64 re-encoding and no per-request upload bandwidth. Token costs are unchanged: the document's tokens are billed on every reference unless you add prompt caching. It is the right pattern for RAG over a fixed corpus, repeated questions against the same large PDF, multi-step agent workflows that need to revisit a document, and any workload where the same file is sent to Claude more than once. Files have a 32 MB per-file size cap; PDFs may be up to 100 pages. Uploads count against a per-organization storage budget (multi-GB tier; see your dashboard for current limits).
The Files API is in beta. Send the header `anthropic-beta: files-api-2025-04-14` on every call (upload, list, retrieve, delete, and any `messages.create` that references a `file_id`). The SDK helpers `client.beta.files.*` set this for you automatically.
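For raw HTTP clients that bypass the SDK, the header requirement above can be captured in one place. The following is a minimal sketch: `files_api_headers` is a hypothetical helper name, and the header values are copied from the curl example in this cheat sheet.

```python
import os
from typing import Dict, Optional

# Values copied from the curl example in this cheat sheet.
FILES_BETA = "files-api-2025-04-14"
API_VERSION = "2023-06-01"


def files_api_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build the headers a raw HTTP call to the Files API needs.

    Hypothetical helper, not part of any SDK.
    """
    key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
    if not key:
        raise ValueError("no API key given and ANTHROPIC_API_KEY is unset")
    return {
        "x-api-key": key,
        "anthropic-version": API_VERSION,
        "anthropic-beta": FILES_BETA,
    }
```

Building the headers in a single function makes it harder to forget the beta header on one of the five call types.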
When to use it
| Scenario | Files API? |
|---|---|
| Same PDF referenced in 5+ requests | Yes |
| Persistent document store for an agent | Yes |
| RAG over a fixed corpus | Yes |
| Large image library | Yes |
| One-shot question on a brand-new PDF | No — inline base64 is simpler |
| Per-user upload with strict privacy isolation | Yes (one file per user) |
| Files > 32 MB | Pre-chunk; Files API will reject |
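The bandwidth side of the table above can be checked with simple arithmetic. This sketch assumes standard padded base64 (4 output bytes per 3 input bytes) and ignores JSON and multipart framing overhead; the function names are illustrative only.

```python
def base64_len(n_bytes: int) -> int:
    """Length of the padded base64 encoding of n_bytes of data."""
    return 4 * ((n_bytes + 2) // 3)


def inline_upload_bytes(n_bytes: int, requests: int) -> int:
    """Payload bytes sent if the file is inlined as base64 on every request."""
    return base64_len(n_bytes) * requests


size = 4_271_088  # size of manual.pdf in this cheat sheet's examples
print(base64_len(size))              # 5694784
print(inline_upload_bytes(size, 5))  # 28473920
```

Five inline requests move roughly 28 MB of document payload, while a single upload moves the 4.3 MB file once and each later request carries only the file_id.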
Supported file types
| Type | MIME | Notes |
|---|---|---|
| PDF | application/pdf | Up to 100 pages, 32 MB |
| Image | image/jpeg, image/png, image/gif, image/webp | 32 MB; same limits as inline images |
| Text | text/plain, text/markdown | UTF-8 encoded |
Limits
| Limit | Value |
|---|---|
| Max file size | 32 MB |
| PDF max pages | 100 |
| Org storage budget | Tier-dependent (see dashboard) |
| File expiration | None — files persist until you delete them |
| Allowed in batch API | Yes |
| Counts against context window | Yes — same as inline; tokens for the doc are billed every reference |
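A local pre-flight check can catch limit violations before any bytes leave the machine. The sketch below encodes the size cap and MIME types from the tables above; `check_uploadable` is a hypothetical helper, "32 MB" is assumed to mean 32 MiB, and the PDF page count is not checked because that would need a PDF parser.

```python
import mimetypes
from pathlib import Path

MAX_FILE_BYTES = 32 * 1024 * 1024  # assumption: "32 MB" is read as 32 MiB
SUPPORTED_MIME = {
    "application/pdf",
    "image/jpeg", "image/png", "image/gif", "image/webp",
    "text/plain", "text/markdown",
}


def check_uploadable(path: Path) -> str:
    """Return the guessed MIME type, or raise ValueError on a limit violation."""
    mime, _ = mimetypes.guess_type(path.name)
    if path.suffix.lower() in {".md", ".markdown"}:
        mime = "text/markdown"  # not every Python version maps .md
    if mime not in SUPPORTED_MIME:
        raise ValueError(f"{path.name}: unsupported type {mime!r}")
    size = path.stat().st_size
    if size > MAX_FILE_BYTES:
        raise ValueError(f"{path.name}: {size} bytes exceeds the 32 MB cap")
    return mime
```

The MIME type is guessed from the file extension only, so a mislabelled file still passes this check and is rejected, if at all, by the API.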
Upload — Python
client.beta.files.upload returns a FileObject with an id you reference elsewhere. Pass either an open binary file handle or a tuple (filename, bytes, mime_type).
import anthropic
client = anthropic.Anthropic()
uploaded = client.beta.files.upload(
file=open("manual.pdf", "rb"),
)
print(uploaded.id)
print(uploaded.filename)
print(uploaded.size_bytes)
print(uploaded.mime_type)
print(uploaded.created_at)
Output:
file_01XVnKzQp8mN7vF4LqJ2cR3Z
manual.pdf
4271088
application/pdf
2026-05-25T13:21:08Z
Upload with explicit metadata
When the source is bytes (e.g. generated PDF in memory), pass a (filename, bytes, mime) tuple.
from pathlib import Path
pdf_bytes = Path("manual.pdf").read_bytes()
uploaded = client.beta.files.upload(
file=("manual.pdf", pdf_bytes, "application/pdf"),
)
print(uploaded.id)
Output:
file_01XVnKzQp8mN7vF4LqJ2cR3Z
Upload — TypeScript
The TypeScript SDK accepts File, Blob, ReadStream, or the toFile() helper (which wraps a Buffer / Uint8Array).
import Anthropic, { toFile } from "@anthropic-ai/sdk";
import fs from "node:fs";
const client = new Anthropic();
const uploaded = await client.beta.files.upload({
file: fs.createReadStream("manual.pdf"),
});
console.log(uploaded.id, uploaded.filename, uploaded.size_bytes);
Output:
file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088
With an in-memory buffer:
const buffer = await fs.promises.readFile("report.pdf");
const uploaded = await client.beta.files.upload({
file: await toFile(buffer, "report.pdf", { type: "application/pdf" }),
});
Upload via curl
For ad-hoc uploads or scripts in languages without an SDK, use a multipart POST.
curl https://api.anthropic.com/v1/files \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "anthropic-beta: files-api-2025-04-14" \
-F "file=@manual.pdf"
Output:
{
"id": "file_01XVnKzQp8mN7vF4LqJ2cR3Z",
"filename": "manual.pdf",
"mime_type": "application/pdf",
"size_bytes": 4271088,
"created_at": "2026-05-25T13:21:08Z",
"downloadable": false
}
Reference a file in a message
Once uploaded, reference the file by file_id in any document or image content block. The model treats it identically to an inline document — same context budget, same visual reading for PDFs.
response = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=2048,
messages=[{
"role": "user",
"content": [
{
"type": "document",
"source": {"type": "file", "file_id": uploaded.id},
},
{"type": "text", "text": "Summarise section 4.2 about reset behaviour."},
],
}],
extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.content[0].text)
Output:
Section 4.2 describes two reset modes. A soft reset clears the I/O queue and
in-progress transactions but preserves the serial number, calibration data,
and firmware. A hard reset clears everything, reverts firmware to factory
defaults, and requires re-pairing the device.
Reference an image
img = client.beta.files.upload(file=open("chart.png", "rb"))
response = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[{
"role": "user",
"content": [
{
"type": "image",
"source": {"type": "file", "file_id": img.id},
},
{"type": "text", "text": "What does this chart show?"},
],
}],
extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.content[0].text)
Output:
The chart shows monthly revenue from January through December, with steady
growth from $1.2M to $2.1M and a notable spike in Q4 to $2.8M.
Reference from TypeScript
const response = await client.beta.messages.create({
model: "claude-opus-4-7",
max_tokens: 2048,
messages: [{
role: "user",
content: [
{ type: "document", source: { type: "file", file_id: uploaded.id } },
{ type: "text", text: "What is the warranty period?" },
],
}],
}, {
headers: { "anthropic-beta": "files-api-2025-04-14" },
});
const first = response.content[0];
if (first.type === "text") console.log(first.text);
Output:
The standard warranty period is 24 months from the date of purchase, with an
optional extended-warranty plan that adds another 36 months.
Citations
Enable citations per-document with citations: { enabled: true }. Each text block in Claude's response carries a citations array linking back to the exact span in the cited file — page numbers for PDFs, character ranges for text.
response = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=2048,
messages=[{
"role": "user",
"content": [
{
"type": "document",
"source": {"type": "file", "file_id": uploaded.id},
"title": "Device Manual",
"citations": {"enabled": True},
},
{"type": "text", "text": "List the three supported reset modes with the exact wording from the manual."},
],
}],
extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
for block in response.content:
    if block.type != "text":
        continue
    print(block.text)
    for cite in block.citations or []:
        print(f" -> p.{cite.start_page_number}-{cite.end_page_number}: {cite.cited_text!r}")
Output:
The manual lists three reset modes:
1. **Soft reset** — "clears the I/O queue but preserves calibration data."
-> p.12-12: 'clears the I/O queue but preserves calibration data'
2. **Hard reset** — "reverts firmware to factory defaults."
-> p.13-13: 'reverts firmware to factory defaults'
3. **Recovery reset** — "loaded from the recovery partition."
-> p.13-13: 'loaded from the recovery partition'
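When the same span is cited by several text blocks, it can help to flatten the citations into a deduplicated source list. The sketch below works on any objects shaped like the response blocks shown above. The page fields come from the example above; `start_char_index` and `end_char_index` are assumed names for the character ranges of plain-text citations, and both helper functions are hypothetical.

```python
from types import SimpleNamespace
from typing import Iterable, List


def format_citation(cite) -> str:
    """Render one citation as a short source pointer."""
    if getattr(cite, "start_page_number", None) is not None:
        where = f"p.{cite.start_page_number}-{cite.end_page_number}"
    elif getattr(cite, "start_char_index", None) is not None:
        # Assumed field names for plain-text citations.
        where = f"chars {cite.start_char_index}-{cite.end_char_index}"
    else:
        where = "unknown location"
    return f"{where}: {cite.cited_text!r}"


def collect_citations(content: Iterable) -> List[str]:
    """Flatten citations from all text blocks, dropping exact duplicates in order."""
    seen = {}
    for block in content:
        if getattr(block, "type", None) != "text":
            continue
        for cite in getattr(block, "citations", None) or []:
            seen.setdefault(format_citation(cite), None)
    return list(seen)


# Stand-in objects shaped like response.content, for a dry run without the API.
cite = SimpleNamespace(start_page_number=12, end_page_number=12,
                       cited_text="clears the I/O queue")
demo = [SimpleNamespace(type="text", text="Soft reset", citations=[cite, cite])]
print(collect_citations(demo))
```

Against a live response, `collect_citations(response.content)` would produce the same kind of list.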
List files
client.beta.files.list pages through uploaded files, with optional after_id / before_id cursors.
for f in client.beta.files.list(limit=20):
    print(f.id, f.filename, f.size_bytes, f.created_at)
Output:
file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088 2026-05-25T13:21:08Z
file_01XVnJyPo7lM6uE3KpI1bQ2Y chart.png 284910 2026-05-25T11:04:33Z
file_01XVnHxNm6kL5tD2JoH0aP1X spec.md 12480 2026-05-24T19:51:02Z
const files = await client.beta.files.list({ limit: 20 });
for (const f of files.data) {
console.log(f.id, f.filename, f.size_bytes);
}
Retrieve metadata
Fetch a single file's metadata by ID.
f = client.beta.files.retrieve(uploaded.id)
print(f.filename, f.size_bytes, f.mime_type, f.created_at)
Output:
manual.pdf 4271088 application/pdf 2026-05-25T13:21:08Z
Delete
Files are billable storage — clean them up when no longer needed.
client.beta.files.delete(uploaded.id)
print("deleted")
Output:
deleted
await client.beta.files.delete(uploaded.id);
console.log("deleted");
Output:
deleted
Combined with prompt caching
When you reference a file by file_id, the file's tokens still count toward context — but they cache the same way inline documents do. Attach cache_control to the document block to make the file's content (which is large and stable) a cached prefix.
response = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=2048,
messages=[{
"role": "user",
"content": [
{
"type": "document",
"source": {"type": "file", "file_id": uploaded.id},
"cache_control": {"type": "ephemeral"},
},
{"type": "text", "text": "What does the warranty say?"},
],
}],
extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.usage)
Output (first call — writes cache):
Usage(input_tokens=24, output_tokens=78, cache_creation_input_tokens=18420, cache_read_input_tokens=0)
Output (second call within 5 min on the same file):
Usage(input_tokens=24, output_tokens=82, cache_creation_input_tokens=0, cache_read_input_tokens=18420)
See Prompt caching for breakpoint placement.
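To confirm that caching is working, the usage fields shown above can be reduced to a single hit ratio. This is a small sketch: `cache_hit_ratio` is a hypothetical helper, and the stand-in objects reuse the numbers from the two example outputs.

```python
from types import SimpleNamespace


def cache_hit_ratio(usage) -> float:
    """Fraction of prompt tokens served from cache for one response."""
    read = getattr(usage, "cache_read_input_tokens", 0) or 0
    written = getattr(usage, "cache_creation_input_tokens", 0) or 0
    fresh = getattr(usage, "input_tokens", 0) or 0
    total = read + written + fresh
    return read / total if total else 0.0


first = SimpleNamespace(input_tokens=24, cache_creation_input_tokens=18420,
                        cache_read_input_tokens=0)
second = SimpleNamespace(input_tokens=24, cache_creation_input_tokens=0,
                         cache_read_input_tokens=18420)
print(cache_hit_ratio(first))             # 0.0
print(round(cache_hit_ratio(second), 4))  # 0.9987
```

A ratio that stays near zero on repeat calls suggests the cached prefix is changing between requests or the cache entry has expired.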
Using files in the Batch API
A batched request may reference uploaded files exactly as a synchronous one would. The file is read once per request; combined with prompt caching, the per-request cost drops dramatically across a 10K-item batch.
# QUESTIONS is your own list of prompt strings.
batch = client.beta.messages.batches.create(
    requests=[
        {
            "custom_id": f"q-{i}",
            "params": {
                "model": "claude-opus-4-7",
                "max_tokens": 256,
                "messages": [{
                    "role": "user",
                    "content": [
                        {
                            "type": "document",
                            "source": {"type": "file", "file_id": uploaded.id},
                            "cache_control": {"type": "ephemeral"},
                        },
                        {"type": "text", "text": q},
                    ],
                }],
            },
        }
        for i, q in enumerate(QUESTIONS)
    ],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(batch.id)
Output:
msgbatch_01XVnKzQpZ8mN7vF4LqJ2cR3
Multi-file RAG pattern
Upload your entire corpus once, then attach the relevant subset (selected by an external retriever — vector DB, BM25, full-text) to each user turn.
import anthropic, json, pathlib
client = anthropic.Anthropic()
# 1. One-time ingest
corpus = pathlib.Path("docs").glob("*.pdf")
manifest: dict[str, str] = {}
for path in corpus:
    with path.open("rb") as fh:  # close the handle once the upload finishes
        f = client.beta.files.upload(file=fh)
    manifest[path.stem] = f.id
pathlib.Path("manifest.json").write_text(json.dumps(manifest, indent=2))
# 2. Per-query: retrieve relevant file_ids however you like
def answer(question: str, file_ids: list[str]) -> str:
    docs = [
        {"type": "document", "source": {"type": "file", "file_id": fid}}
        for fid in file_ids
    ]
    if docs:
        # One breakpoint on the last document caches the whole document prefix;
        # a request accepts at most 4 cache_control breakpoints.
        docs[-1]["cache_control"] = {"type": "ephemeral"}
    resp = client.beta.messages.create(
        model="claude-opus-4-7",
        max_tokens=2048,
        messages=[{"role": "user", "content": [*docs, {"type": "text", "text": question}]}],
        extra_headers={"anthropic-beta": "files-api-2025-04-14"},
    )
    return resp.content[0].text
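The retriever itself is left open above. As a placeholder while wiring things up, the sketch below ranks manifest entries by word overlap between the question and each file stem. It is a toy stand-in for a real vector or BM25 retriever, `select_file_ids` is a hypothetical name, and the file IDs in the usage note are made up.

```python
import re
from typing import Dict, List


def select_file_ids(question: str, manifest: Dict[str, str], k: int = 3) -> List[str]:
    """Rank manifest entries by word overlap between the question and the file stem."""
    q_words = set(re.findall(r"[a-z0-9]+", question.lower()))
    scored = []
    for stem, file_id in manifest.items():
        stem_words = set(re.findall(r"[a-z0-9]+", stem.lower()))
        overlap = len(q_words & stem_words)
        if overlap:
            scored.append((overlap, stem, file_id))
    # Highest overlap first; ties broken alphabetically by stem for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [file_id for _, _, file_id in scored[:k]]
```

With a manifest such as `{"reset-guide": "file_b", "warranty-terms": "file_a"}`, the question "How do I reset the device?" selects only `file_b`, which can then be passed as `file_ids` to `answer`.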
Tool result content with a file
A tool whose result is a generated PDF or chart can return it as a document/image block with the file's id — Claude reads it before continuing.
def render_chart_tool(args: dict) -> dict:
    # ... generate plot.png ...
    f = client.beta.files.upload(file=open("plot.png", "rb"))
    return {
        "type": "tool_result",
        "tool_use_id": args["tool_use_id"],
        "content": [
            {"type": "text", "text": "Chart generated for Q4 metrics."},
            {"type": "image", "source": {"type": "file", "file_id": f.id}},
        ],
    }
Privacy and lifecycle
| Property | Behaviour |
|---|---|
| Visibility | Files are scoped to your workspace — not shared across orgs |
| Retention | Files persist until you delete them or until the workspace is wound down |
| Training | Per Anthropic policy, Files API data is not used for model training |
| Encryption | Encrypted at rest and in transit |
| Auditing | All file operations are logged in your org audit log |
Files are workspace-scoped — multiple workspaces in the same org each have their own file store. Plan your workspace structure with this in mind for per-customer isolation.
Common pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Missing beta header | 400 on upload or message reference | Add anthropic-beta: files-api-2025-04-14 on every call |
| Mixing client.messages and client.beta.messages | File reference rejected | Use client.beta.messages.create while the API is in beta |
| Uploading a file > 32 MB | 413 Payload Too Large | Split or compress before upload |
| Referencing a deleted file | 404 mid-message | Track file lifetime; do not delete until all references are settled |
| Counting on storage being free | Surprise bill | Files count against your storage tier — delete unused ones |
| Skipping cache_control on the document | Re-pay for the file's tokens every call | Mark the document block cacheable; reuse hits the cache |
| Treating files as ephemeral | Files persist until deleted, so storage keeps growing | Delete from a daily cleanup job for transient files |
| Re-uploading the same PDF | Wasted bandwidth and storage | Hash before upload; reuse the existing file_id |
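For the deleted-file pitfall in the table above, one way to "track file lifetime" is a reference counter around each request. This is a single-process sketch under the assumption that all requests go through the same Python process; `FileRefTracker` is a hypothetical class and is not thread-safe as written.

```python
from collections import Counter


class FileRefTracker:
    """Count in-flight requests per file_id so deletes wait until references settle."""

    def __init__(self) -> None:
        self._refs = Counter()

    def acquire(self, file_id: str) -> None:
        """Record that a request referencing file_id is about to start."""
        self._refs[file_id] += 1

    def release(self, file_id: str) -> None:
        """Record that a request referencing file_id has finished."""
        if self._refs[file_id] <= 0:
            raise ValueError(f"{file_id} released more times than acquired")
        self._refs[file_id] -= 1

    def safe_to_delete(self, file_id: str) -> bool:
        """True when no tracked request still references file_id."""
        return self._refs[file_id] == 0
```

A cleanup job would call `safe_to_delete(file_id)` before `client.beta.files.delete(file_id)` and skip any file that still has open references.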
Common recipes
Idempotent upload with content-hash dedupe
import hashlib, json, pathlib
INDEX = pathlib.Path("file_index.json")
def upload_once(path: pathlib.Path) -> str:
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    index = json.loads(INDEX.read_text()) if INDEX.exists() else {}
    if digest in index:
        return index[digest]
    f = client.beta.files.upload(file=path.open("rb"))
    index[digest] = f.id
    INDEX.write_text(json.dumps(index, indent=2))
    return f.id
Bulk ingest with rate limiting
import pathlib, time
def ingest(paths: list[pathlib.Path], pause: float = 0.1) -> dict[str, str]:
    mapping: dict[str, str] = {}
    for p in paths:
        f = client.beta.files.upload(file=p.open("rb"))
        mapping[p.stem] = f.id
        print(f"uploaded {p.name} -> {f.id}")
        time.sleep(pause)  # fixed pause between uploads to stay under rate limits
    return mapping
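A fixed pause does not help once an upload actually fails. A retry wrapper with exponential backoff is one common complement, sketched below. `with_backoff` is a hypothetical helper that retries on any exception for brevity; a production version would retry only on rate-limit or transient errors.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(
    fn: Callable[[], T],
    retries: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with delays of base_delay * 2**attempt."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")
```

Inside a loop like `ingest`, the upload call would be wrapped as `with_backoff(lambda: client.beta.files.upload(file=p.open("rb")))`. The `sleep` parameter exists so tests can inject a fake and avoid real delays.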
TTL cleanup job
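Delete files older than 30 days. This sketch checks only created_at; it cannot tell whether a file was referenced recently, so track last use yourself if that matters.
import datetime as dt
CUTOFF = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=30)
# Collect first, then delete, so deletions cannot interfere with pagination.
stale = []
for f in client.beta.files.list(limit=200):
    created = f.created_at
    # Assumption: depending on SDK version, created_at may be a datetime or an ISO 8601 string.
    if isinstance(created, str):
        created = dt.datetime.fromisoformat(created.replace("Z", "+00:00"))
    if created < CUTOFF:
        stale.append(f)
for f in stale:
    client.beta.files.delete(f.id)
    print(f"deleted {f.id} ({f.filename})")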
Show what is in storage
total = 0
by_type: dict[str, int] = {}
for f in client.beta.files.list(limit=200):
    total += f.size_bytes
    by_type[f.mime_type] = by_type.get(f.mime_type, 0) + f.size_bytes
print(f"total: {total / 1_000_000:.1f} MB")
for mime, size in sorted(by_type.items(), key=lambda x: -x[1]):
    print(f"  {mime}: {size / 1_000_000:.1f} MB")
Output:
total: 412.7 MB
application/pdf: 380.2 MB
image/png: 18.4 MB
text/markdown: 14.1 MB
See also
- Python SDK — base messages, vision, PDFs inline.
- TypeScript SDK — files in TS.
- Prompt caching — stack with files for cheap RAG.
- Batch API — reference files from batched requests.
- Tool use — return files as tool results.