cheat sheet

Claude API

The Anthropic Files API — upload PDFs, images, and text once and reference them by `file_id` across multiple messages, with citations, lifecycle management, and Workbench integration.

updated 05-25-2026

Claude API — Files

What it is

The Files API lets you upload PDFs, images, and text once and reference them by file_id in any subsequent messages.create call: no base64 re-encoding and no re-sending the file's bytes on each request. It is the right pattern for RAG over a fixed corpus, repeated questions against the same large PDF, multi-step agent workflows that need to revisit a document, and any workload where the same file is sent to Claude more than once. Uploading does not make a file free to reference: its tokens still count toward context and are billed on every call, at full price unless the document block is cached with cache_control. Files have a 32 MB per-file size cap; PDFs may be up to 100 pages. Uploads count against a per-organization storage budget (multi-GB tier; see your dashboard for current limits).

The Files API is in beta. Send the header anthropic-beta: files-api-2025-04-14 on every call (upload, list, retrieve, delete, and any messages.create that references a file_id). The SDK helpers client.beta.files.* set this for you automatically.
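For raw HTTP calls made without the SDK, it can help to keep the required headers in one place. The following is a minimal sketch that reuses the header names from the curl example in this sheet; `files_api_headers` and `FILES_BETA` are hypothetical names, not part of any SDK.

```python
# Hypothetical helper: builds the headers for a raw Files API request.
FILES_BETA = "files-api-2025-04-14"


def files_api_headers(api_key: str) -> dict[str, str]:
    """Return the headers every Files API call needs, including the beta flag."""
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": FILES_BETA,
    }


headers = files_api_headers("sk-ant-example")  # placeholder key, not a real one
print(headers["anthropic-beta"])
```

Any HTTP client can then send the same dictionary with upload, list, retrieve, delete, and message requests.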

When to use it

Scenario	Files API?
Same PDF referenced in 5+ requests	Yes
Persistent document store for an agent	Yes
RAG over a fixed corpus	Yes
Large image library	Yes
One-shot question on a brand-new PDF	No — inline base64 is simpler
Per-user upload with strict privacy isolation	Yes (one file per user)
Files > 32 MB	Pre-chunk; Files API will reject

Supported file types

Type	MIME	Notes
PDF	`application/pdf`	Up to 100 pages, 32 MB
Image	`image/jpeg`, `image/png`, `image/gif`, `image/webp`	32 MB; same limits as inline images
Text	`text/plain`, `text/markdown`	UTF-8 encoded

Limits

Limit	Value
Max file size	32 MB
PDF max pages	100
Org storage budget	Tier-dependent (see dashboard)
File expiration	None — files persist until you delete them
Allowed in batch API	Yes
Counts against context window	Yes — same as inline; tokens for the doc are billed every reference
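A pre-flight check can catch unsupported or oversized files before any bytes are sent. This is a sketch built only from the tables above: `preflight` and `SUPPORTED_MIME` are hypothetical names, the extension map is an illustration, and it assumes "32 MB" means 32 * 1024 * 1024 bytes. It does not check the PDF page count, which would need a PDF parser.

```python
from pathlib import PurePath

# Assumption: the 32 MB cap is measured in binary megabytes.
MAX_FILE_BYTES = 32 * 1024 * 1024

# Extension-to-MIME map covering the supported types listed above.
SUPPORTED_MIME = {
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".txt": "text/plain",
    ".md": "text/markdown",
}


def preflight(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (ok, detail): detail is the MIME type on success, else the reason."""
    suffix = PurePath(filename).suffix.lower()
    mime = SUPPORTED_MIME.get(suffix)
    if mime is None:
        return False, f"unsupported extension {suffix!r}"
    if size_bytes > MAX_FILE_BYTES:
        return False, f"{size_bytes} bytes exceeds the {MAX_FILE_BYTES}-byte cap"
    return True, mime


print(preflight("manual.pdf", 4_271_088))
print(preflight("archive.zip", 1_000))
```

Taking the name and size as plain arguments keeps the check usable for in-memory data as well as files on disk.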

Upload — Python

client.beta.files.upload returns a FileObject with an id you reference elsewhere. Pass either a file handle opened in binary mode or a (filename, bytes, mime_type) tuple.

python

import anthropic

client = anthropic.Anthropic()

uploaded = client.beta.files.upload(
    file=open("manual.pdf", "rb"),
)

print(uploaded.id)
print(uploaded.filename)
print(uploaded.size_bytes)
print(uploaded.mime_type)
print(uploaded.created_at)

Output:

text

file_01XVnKzQp8mN7vF4LqJ2cR3Z
manual.pdf
4271088
application/pdf
2026-05-25T13:21:08Z

Upload with explicit metadata

When the source is bytes (e.g. generated PDF in memory), pass a (filename, bytes, mime) tuple.

python

from pathlib import Path

pdf_bytes = Path("manual.pdf").read_bytes()
uploaded = client.beta.files.upload(
    file=("manual.pdf", pdf_bytes, "application/pdf"),
)
print(uploaded.id)

Output:

text

file_01XVnKzQp8mN7vF4LqJ2cR3Z
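The same tuple form should work for text produced in memory. A small sketch, assuming the text is uploaded as text/plain and encoded as UTF-8 as the supported-types table requires; `text_upload_tuple` is a hypothetical helper.

```python
def text_upload_tuple(filename: str, text: str) -> tuple[str, bytes, str]:
    """Build a (filename, bytes, mime) tuple for text generated in memory."""
    return (filename, text.encode("utf-8"), "text/plain")


payload = text_upload_tuple("notes.txt", "Reset procedure: hold the button for 5 s.")
print(payload[0], payload[2])
# Passing it on would look like: client.beta.files.upload(file=payload)
```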

Upload — TypeScript

The TypeScript SDK accepts File, Blob, ReadStream, or the toFile() helper (which wraps a Buffer / Uint8Array).

typescript

import Anthropic, { toFile } from "@anthropic-ai/sdk";
import fs from "node:fs";

const client = new Anthropic();

const uploaded = await client.beta.files.upload({
  file: fs.createReadStream("manual.pdf"),
});

console.log(uploaded.id, uploaded.filename, uploaded.size_bytes);

Output:

text

file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088

With an in-memory buffer:

typescript

const buffer = await fs.promises.readFile("report.pdf");
const uploaded = await client.beta.files.upload({
  file: await toFile(buffer, "report.pdf", { type: "application/pdf" }),
});

Upload via curl

For ad-hoc uploads or scripts in languages without an SDK, use a multipart POST.

bash

curl https://api.anthropic.com/v1/files \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -H "anthropic-beta: files-api-2025-04-14" \
    -F "file=@manual.pdf"

Output:

text

{
  "id": "file_01XVnKzQp8mN7vF4LqJ2cR3Z",
  "filename": "manual.pdf",
  "mime_type": "application/pdf",
  "size_bytes": 4271088,
  "created_at": "2026-05-25T13:21:08Z",
  "downloadable": false
}

Reference a file in a message

Once uploaded, reference the file by file_id in any document or image content block. The model treats it identically to an inline document — same context budget, same visual reading for PDFs.

python

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
            },
            {"type": "text", "text": "Summarise section 4.2 about reset behaviour."},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)

print(response.content[0].text)

Output:

text

Section 4.2 describes two reset modes. A soft reset clears the I/O queue and
in-progress transactions but preserves the serial number, calibration data,
and firmware. A hard reset clears everything, reverts firmware to factory
defaults, and requires re-pairing the device.

Reference an image

python

img = client.beta.files.upload(file=open("chart.png", "rb"))

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "file", "file_id": img.id},
            },
            {"type": "text", "text": "What does this chart show?"},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.content[0].text)

Output:

text

The chart shows monthly revenue from January through December, with steady
growth from $1.2M to $2.1M and a notable spike in Q4 to $2.8M.
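The document and image examples differ only in the block type, so one helper can build either. This is a sketch, assuming that image MIME types map to image blocks and everything else (PDF, text) maps to document blocks; `file_block` is a hypothetical name.

```python
def file_block(file_id: str, mime_type: str) -> dict:
    """Build a content block that references an uploaded file by id."""
    # Assumption: images use "image" blocks; PDFs and text use "document" blocks.
    block_type = "image" if mime_type.startswith("image/") else "document"
    return {"type": block_type, "source": {"type": "file", "file_id": file_id}}


print(file_block("file_abc", "image/png"))
print(file_block("file_def", "application/pdf"))
```

The mime_type value can come straight from the upload response, so callers do not need to remember which kind of file an id refers to.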

Reference from TypeScript

typescript

const response = await client.beta.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 2048,
  messages: [{
    role: "user",
    content: [
      { type: "document", source: { type: "file", file_id: uploaded.id } },
      { type: "text", text: "What is the warranty period?" },
    ],
  }],
}, {
  headers: { "anthropic-beta": "files-api-2025-04-14" },
});

const first = response.content[0];
if (first.type === "text") console.log(first.text);

Output:

text

The standard warranty period is 24 months from the date of purchase, with an
optional extended-warranty plan that adds another 36 months.

Citations

Enable citations per-document with citations: { enabled: true }. Each text block in Claude's response carries a citations array linking back to the exact span in the cited file — page numbers for PDFs, character ranges for text.

python

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
                "title": "Device Manual",
                "citations": {"enabled": True},
            },
            {"type": "text", "text": "List the three supported reset modes with the exact wording from the manual."},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)

for block in response.content:
    if block.type != "text":
        continue
    print(block.text)
    for cite in block.citations or []:
        print(f"  -> p.{cite.start_page_number}-{cite.end_page_number}: {cite.cited_text!r}")

Output:

text

The manual lists three reset modes:

1. **Soft reset** — "clears the I/O queue but preserves calibration data."
  -> p.12-12: 'clears the I/O queue but preserves calibration data'
2. **Hard reset** — "reverts firmware to factory defaults."
  -> p.13-13: 'reverts firmware to factory defaults'
3. **Recovery reset** — "loaded from the recovery partition."
  -> p.13-13: 'loaded from the recovery partition'
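The loop above reads page fields, which suits PDFs; citations into plain text carry character ranges instead. The sketch below handles both. The type names `page_location` and `char_location` and the `start_char_index` / `end_char_index` fields are assumptions based on the citations documentation, and the stand-in objects exist only so the example runs without an API call.

```python
from types import SimpleNamespace


def describe_citation(cite) -> str:
    """Format one citation, whichever location type it carries."""
    kind = getattr(cite, "type", None)
    if kind == "page_location":
        where = f"p.{cite.start_page_number}-{cite.end_page_number}"
    elif kind == "char_location":
        where = f"chars {cite.start_char_index}-{cite.end_char_index}"
    else:
        where = "unknown location"
    return f"{where}: {cite.cited_text!r}"


# Stand-in objects for illustration; real ones come from block.citations.
pdf_cite = SimpleNamespace(type="page_location", start_page_number=12,
                           end_page_number=12, cited_text="clears the I/O queue")
txt_cite = SimpleNamespace(type="char_location", start_char_index=40,
                           end_char_index=60, cited_text="reverts firmware")
print(describe_citation(pdf_cite))
print(describe_citation(txt_cite))
```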

List files

client.beta.files.list pages through your uploaded files; pass the optional after_id or before_id cursor to move between pages.

python

for f in client.beta.files.list(limit=20):
    print(f.id, f.filename, f.size_bytes, f.created_at)

Output:

text

file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088 2026-05-25T13:21:08Z
file_01XVnJyPo7lM6uE3KpI1bQ2Y chart.png 284910 2026-05-25T11:04:33Z
file_01XVnHxNm6kL5tD2JoH0aP1X spec.md 12480 2026-05-24T19:51:02Z

typescript

const files = await client.beta.files.list({ limit: 20 });
for (const f of files.data) {
  console.log(f.id, f.filename, f.size_bytes);
}
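The Python loop above relies on the SDK fetching further pages as you iterate, while the TypeScript snippet reads only the first page. With raw HTTP you follow the cursor yourself. The sketch below shows that loop against a fake backend; `iter_all_files` and `fetch_page` are hypothetical, and the `has_more` response field is an assumption about the list response shape.

```python
from typing import Callable, Iterator, Optional


def iter_all_files(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every file by following the after_id cursor until has_more is false."""
    after_id: Optional[str] = None
    while True:
        page = fetch_page(after_id)
        yield from page["data"]
        if not page.get("has_more") or not page["data"]:
            return
        after_id = page["data"][-1]["id"]  # cursor: id of the last item seen


# Fake two-page backend for illustration only.
PAGES = {
    None: {"data": [{"id": "file_a"}, {"id": "file_b"}], "has_more": True},
    "file_b": {"data": [{"id": "file_c"}], "has_more": False},
}
ids = [f["id"] for f in iter_all_files(lambda after: PAGES[after])]
print(ids)
```

In real use, fetch_page would issue a GET to the files endpoint with the after_id query parameter and return the parsed JSON.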

Retrieve metadata

Fetch a single file's metadata by ID.

python

f = client.beta.files.retrieve(uploaded.id)
print(f.filename, f.size_bytes, f.mime_type, f.created_at)

Output:

text

manual.pdf 4271088 application/pdf 2026-05-25T13:21:08Z

Delete

Files are billable storage — clean them up when no longer needed.

python

client.beta.files.delete(uploaded.id)
print("deleted")

Output:

text

deleted

typescript

await client.beta.files.delete(uploaded.id);
console.log("deleted");

Output:

text

deleted

Combined with prompt caching

When you reference a file by file_id, the file's tokens still count toward context — but they cache the same way inline documents do. Attach cache_control to the document block to make the file's content (which is large and stable) a cached prefix.

python

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
                "cache_control": {"type": "ephemeral"},
            },
            {"type": "text", "text": "What does the warranty say?"},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.usage)

Output (first call — writes cache):

text

Usage(input_tokens=24, output_tokens=78, cache_creation_input_tokens=18420, cache_read_input_tokens=0)

Output (second call within 5 min on the same file):

text

Usage(input_tokens=24, output_tokens=82, cache_creation_input_tokens=0, cache_read_input_tokens=18420)

See Prompt caching for breakpoint placement.
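The two usage records above can be turned into a rough cost comparison. This is a worked sketch, not a pricing reference: it assumes a 5-minute cache write is billed at 1.25x the base input price and a cache read at 0.1x, so check current pricing before relying on the numbers. `input_cost_units` is a hypothetical helper.

```python
# Assumed multipliers, in percent of the base input-token price.
WRITE_PCT, READ_PCT, BASE_PCT = 125, 10, 100


def input_cost_units(input_tokens: int, cache_write: int, cache_read: int) -> float:
    """Input-side cost in base-token units (1.0 = one uncached input token)."""
    total_pct = input_tokens * BASE_PCT + cache_write * WRITE_PCT + cache_read * READ_PCT
    return total_pct / 100


first = input_cost_units(24, cache_write=18420, cache_read=0)
second = input_cost_units(24, cache_write=0, cache_read=18420)
uncached = input_cost_units(24 + 18420, cache_write=0, cache_read=0)
print(first, second, uncached)  # 23049.0 1866.0 18444.0
```

Under these assumptions the first call costs about 25% more than an uncached one, and each later cache hit costs roughly a tenth, so caching pays off from the second reference onward.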

Using files in the Batch API

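A batched request may reference uploaded files exactly as a synchronous one would. Each request in the batch reads the file and pays for its tokens; marking the document block with cache_control lets later requests hit the cache, which can cut input cost substantially across a 10K-item batch. Cache hits inside a batch are best-effort, because requests may be processed concurrently.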

python

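# QUESTIONS is assumed to be a list[str] defined elsewhere.
# Use the beta namespace here too, as with client.beta.messages.create.
batch = client.beta.messages.batches.create(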
    requests=[
        {
            "custom_id": f"q-{i}",
            "params": {
                "model": "claude-opus-4-7",
                "max_tokens": 256,
                "messages": [{
                    "role": "user",
                    "content": [
                        {
                            "type": "document",
                            "source": {"type": "file", "file_id": uploaded.id},
                            "cache_control": {"type": "ephemeral"},
                        },
                        {"type": "text", "text": q},
                    ],
                }],
            },
        }
        for i, q in enumerate(QUESTIONS)
    ],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(batch.id)

Output:

text

msgbatch_01XVnKzQpZ8mN7vF4LqJ2cR3

Multi-file RAG pattern

Upload your entire corpus once, then attach the relevant subset (selected by an external retriever — vector DB, BM25, full-text) to each user turn.

python

import anthropic, json, pathlib

client = anthropic.Anthropic()

# 1. One-time ingest
corpus = pathlib.Path("docs").glob("*.pdf")
manifest: dict[str, str] = {}
for path in corpus:
    with path.open("rb") as fh:  # close each handle once its upload finishes
        f = client.beta.files.upload(file=fh)
    manifest[path.stem] = f.id
pathlib.Path("manifest.json").write_text(json.dumps(manifest, indent=2))

# 2. Per-query: retrieve relevant file_ids however you like
def answer(question: str, file_ids: list[str]) -> str:
    docs = [
        {
            "type": "document",
            "source": {"type": "file", "file_id": fid},
            "cache_control": {"type": "ephemeral"},
        }
        for fid in file_ids
    ]
    resp = client.beta.messages.create(
        model="claude-opus-4-7",
        max_tokens=2048,
        messages=[{"role": "user", "content": [*docs, {"type": "text", "text": question}]}],
        extra_headers={"anthropic-beta": "files-api-2025-04-14"},
    )
    return resp.content[0].text
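The step between the retriever and answer() is a lookup from retriever hits to file_ids. A minimal sketch, assuming the retriever returns document stems (the manifest keys) ordered best first; `resolve_file_ids` and `max_docs` are hypothetical names.

```python
def resolve_file_ids(manifest: dict[str, str], hits: list[str], max_docs: int = 3) -> list[str]:
    """Map retriever hits (document stems, best first) to file_ids, skipping unknowns."""
    file_ids: list[str] = []
    for stem in hits:
        if len(file_ids) >= max_docs:
            break  # cap how many documents are attached to one turn
        fid = manifest.get(stem)
        if fid is not None and fid not in file_ids:
            file_ids.append(fid)
    return file_ids


# Illustrative manifest; real ids come from manifest.json written during ingest.
manifest = {"manual": "file_aaa", "warranty": "file_bbb", "spec": "file_ccc"}
selected = resolve_file_ids(manifest, ["warranty", "missing", "manual", "spec"], max_docs=2)
print(selected)  # ['file_bbb', 'file_aaa']
```

Capping the count matters because every attached document consumes context, whether or not it is cached.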

Tool result content with a file

A tool whose result is a generated PDF or chart can return it as a document/image block with the file's id — Claude reads it before continuing.

python

def render_chart_tool(tool_use_id: str, args: dict) -> dict:
    # tool_use_id is the id of the tool_use block that requested this call;
    # it is not part of the tool's own input arguments.
    # ... generate plot.png from args ...
    with open("plot.png", "rb") as fh:
        f = client.beta.files.upload(file=fh)
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": [
            {"type": "text", "text": "Chart generated for Q4 metrics."},
            {"type": "image", "source": {"type": "file", "file_id": f.id}},
        ],
    }

Privacy and lifecycle

Property	Behaviour
Visibility	Files are scoped to your workspace — not shared across orgs
Retention	Files persist until you delete them or until the workspace is wound down
Training	Per Anthropic policy, Files API data is not used for model training
Encryption	Encrypted at rest and in transit
Auditing	All file operations are logged in your org audit log

Files are workspace-scoped — multiple workspaces in the same org each have their own file store. Plan your workspace structure with this in mind for per-customer isolation.

Common pitfalls

Pitfall	Symptom	Fix
Missing beta header	`400` on upload or message reference	Add `anthropic-beta: files-api-2025-04-14` on every call
Mixing `client.messages` and `client.beta.messages`	File reference rejected	Use `client.beta.messages.create` while the API is in beta
Uploading a file > 32 MB	`413 Payload Too Large`	Split or compress before upload
Referencing a deleted file	`404` mid-message	Track file lifetime; do not delete until all references are settled
Counting on storage being free	Surprise bill	Files count against your storage tier — delete unused ones
Skipping `cache_control` on the document	Re-pay for the file's tokens every call	Mark the document block cacheable; reuse hits the cache
Treating files as ephemeral	Storage grows unnoticed; files persist until deleted	Delete from a daily cleanup job for transient files
Re-uploading the same PDF	Wasted bandwidth and storage	Hash before upload; reuse the existing `file_id`
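The "track file lifetime" fix in the table can be as simple as a reference count per file_id. The class below is a single-process sketch under that assumption; `FileRefTracker` is a hypothetical name, and a multi-worker deployment would need shared storage instead of an in-memory counter.

```python
from collections import Counter


class FileRefTracker:
    """Counts in-flight requests per file_id so deletes wait until none remain."""

    def __init__(self) -> None:
        self._refs = Counter()

    def acquire(self, file_id: str) -> None:
        """Record that a request referencing file_id has started."""
        self._refs[file_id] += 1

    def release(self, file_id: str) -> None:
        """Record that a request referencing file_id has finished."""
        if self._refs[file_id] <= 0:
            raise ValueError(f"release without acquire for {file_id}")
        self._refs[file_id] -= 1

    def safe_to_delete(self, file_id: str) -> bool:
        """True when no tracked request still references file_id."""
        return self._refs[file_id] == 0


tracker = FileRefTracker()
tracker.acquire("file_abc")
busy = tracker.safe_to_delete("file_abc")   # False while a request is in flight
tracker.release("file_abc")
idle = tracker.safe_to_delete("file_abc")   # True once it has finished
print(busy, idle)
```

Call acquire before sending a message that references the file and release in a finally block, then delete only when safe_to_delete returns True.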

Common recipes

Idempotent upload with content-hash dedupe

python

import hashlib, json, pathlib

INDEX = pathlib.Path("file_index.json")

def upload_once(path: pathlib.Path) -> str:
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    index = json.loads(INDEX.read_text()) if INDEX.exists() else {}
    if digest in index:
        return index[digest]
    f = client.beta.files.upload(file=path.open("rb"))
    index[digest] = f.id
    INDEX.write_text(json.dumps(index, indent=2))
    return f.id

Bulk ingest with rate-limit

python

import time

def ingest(paths: list[pathlib.Path], pause: float = 0.1) -> dict[str, str]:
    mapping: dict[str, str] = {}
    for p in paths:
        with p.open("rb") as fh:  # close each handle once its upload finishes
            f = client.beta.files.upload(file=fh)
        mapping[p.stem] = f.id
        print(f"uploaded {p.name} -> {f.id}")
        time.sleep(pause)
    return mapping
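A fixed pause spaces uploads out but does not recover from a rate-limit error once one occurs. One common complement is retrying with exponential backoff, sketched below. `with_backoff` is a hypothetical helper and the flaky function only simulates failures; in real use you would likely pass retry_on=(anthropic.RateLimitError,) and wrap the upload call. The SDK also performs some retries of its own, so treat this as an optional outer layer.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(fn: Callable[[], T], retries: int = 4, base_delay: float = 0.5,
                 retry_on: tuple[type[BaseException], ...] = (Exception,),
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on retry_on with delays of base_delay * 2**attempt."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable when retries >= 0")


calls = {"n": 0}


def flaky() -> str:
    """Simulated upload that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated transient failure")
    return "file_ok"


delays: list[float] = []
result = with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)  # file_ok [0.5, 1.0]
```

Injecting the sleep function keeps the helper easy to test without real waiting.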

TTL cleanup job

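Delete files older than 30 days. The loop checks created_at only; it cannot tell whether a file has been referenced recently, so track that in your own records if it matters.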

python

import datetime as dt

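CUTOFF = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=30)

def as_datetime(value) -> dt.datetime:
    # The SDK may return created_at as a datetime; raw JSON gives an ISO 8601 string.
    if isinstance(value, dt.datetime):
        return value
    return dt.datetime.fromisoformat(value.replace("Z", "+00:00"))

# Collect first, then delete, so deletions cannot disturb pagination.
stale = [f for f in client.beta.files.list(limit=200) if as_datetime(f.created_at) < CUTOFF]
for f in stale:
    client.beta.files.delete(f.id)
    print(f"deleted {f.id} ({f.filename})")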

Show what is in storage

python

total = 0
by_type: dict[str, int] = {}
for f in client.beta.files.list(limit=200):
    total += f.size_bytes
    by_type[f.mime_type] = by_type.get(f.mime_type, 0) + f.size_bytes

print(f"total: {total / 1_000_000:.1f} MB")
for mime, size in sorted(by_type.items(), key=lambda x: -x[1]):
    print(f"  {mime}: {size / 1_000_000:.1f} MB")

Output:

text

total: 412.7 MB
  application/pdf: 380.2 MB
  image/png: 18.4 MB
  text/markdown: 14.1 MB
