The Anthropic Files API — upload PDFs, images, and text once and reference them by `file_id` across multiple messages, with citations, lifecycle management, and Workbench integration.

Claude API — Files

What it is

The Files API lets you upload PDFs, images, and text once and reference them by file_id in any subsequent messages.create call. You skip the base64 re-upload and the per-request upload bandwidth, and the document block can carry cache_control so that repeated requests hit the prompt cache. It is the right pattern for RAG over a fixed corpus, repeated questions against the same large PDF, multi-step agent workflows that need to revisit a document, and any workload where the same file is sent to Claude more than once. Each file is capped at 32 MB, and PDFs may have up to 100 pages. Uploads count against a per-organization storage budget (multi-GB tier; see your dashboard for current limits).

The Files API is in beta. Send the header anthropic-beta: files-api-2025-04-14 on every call (upload, list, retrieve, delete, and any messages.create that references a file_id). The SDK helpers client.beta.files.* set this for you automatically.
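For raw HTTP clients it can help to build the required headers in one place. The following is a minimal sketch, assuming the header names and values used in this cheat sheet's curl example; `files_api_headers` and `FILES_BETA` are hypothetical names, not part of the SDK.

```python
import os
from typing import Optional

# Beta flag as named in this cheat sheet; confirm it against the current docs.
FILES_BETA = "files-api-2025-04-14"


def files_api_headers(api_key: Optional[str] = None) -> dict:
    """Return the headers each Files API call needs (hypothetical helper)."""
    key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
    if not key:
        raise ValueError("no API key supplied and ANTHROPIC_API_KEY is unset")
    return {
        "x-api-key": key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": FILES_BETA,
    }


headers = files_api_headers("sk-example")  # placeholder key for illustration
print(sorted(headers))
```

Passing the same dictionary to upload, list, retrieve, delete, and message calls makes it harder to forget the beta flag on one of them.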

When to use it

| Scenario | Files API? |
| --- | --- |
| Same PDF referenced in 5+ requests | Yes |
| Persistent document store for an agent | Yes |
| RAG over a fixed corpus | Yes |
| Large image library | Yes |
| One-shot question on a brand-new PDF | No; inline base64 is simpler |
| Per-user upload with strict privacy isolation | Yes (one file per user) |
| Files > 32 MB | Pre-chunk; the Files API will reject them |

Supported file types

| Type | MIME | Notes |
| --- | --- | --- |
| PDF | application/pdf | Up to 100 pages, 32 MB |
| Image | image/jpeg, image/png, image/gif, image/webp | 32 MB; same limits as inline images |
| Text | text/plain, text/markdown | UTF-8 encoded |

Limits

| Limit | Value |
| --- | --- |
| Max file size | 32 MB |
| PDF max pages | 100 |
| Org storage budget | Tier-dependent (see dashboard) |
| File expiration | None; files persist until you delete them |
| Allowed in batch API | Yes |
| Counts against context window | Yes, same as inline; the document's tokens are billed on every reference |
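The size and type limits above can be checked locally before spending an upload. This is a minimal sketch, assuming the limits and MIME types listed in this cheat sheet are current and reading "32 MB" as 32 MiB; `preflight` and `MIME_BY_SUFFIX` are hypothetical names. It does not check the PDF page count, which would need a PDF parser.

```python
from pathlib import PurePath

# Assumption: "32 MB" from the limits table, interpreted as 32 MiB.
MAX_FILE_BYTES = 32 * 1024 * 1024

# Suffix-to-MIME map built from the supported file types table.
MIME_BY_SUFFIX = {
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".txt": "text/plain",
    ".md": "text/markdown",
}


def preflight(filename: str, size_bytes: int) -> str:
    """Return the MIME type to upload with, or raise ValueError."""
    suffix = PurePath(filename).suffix.lower()
    mime = MIME_BY_SUFFIX.get(suffix)
    if mime is None:
        raise ValueError(f"unsupported file type: {suffix!r}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError(f"{filename} is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")
    return mime


print(preflight("manual.pdf", 4_271_088))
```

For a file on disk, call it as `preflight(path.name, path.stat().st_size)` and pass the returned MIME type in the upload tuple.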

Upload — Python

client.beta.files.upload returns a metadata object whose id you reference elsewhere. Pass either an open binary file handle or a tuple (filename, bytes, mime_type).

python
import anthropic

client = anthropic.Anthropic()

uploaded = client.beta.files.upload(
    file=open("manual.pdf", "rb"),
)

print(uploaded.id)
print(uploaded.filename)
print(uploaded.size_bytes)
print(uploaded.mime_type)
print(uploaded.created_at)

Output:

text
file_01XVnKzQp8mN7vF4LqJ2cR3Z
manual.pdf
4271088
application/pdf
2026-05-25T13:21:08Z

Upload with explicit metadata

When the source is bytes (e.g. a PDF generated in memory), pass a (filename, bytes, mime_type) tuple.

python
from pathlib import Path

pdf_bytes = Path("manual.pdf").read_bytes()
uploaded = client.beta.files.upload(
    file=("manual.pdf", pdf_bytes, "application/pdf"),
)
print(uploaded.id)

Output:

text
file_01XVnKzQp8mN7vF4LqJ2cR3Z

Upload — TypeScript

The TypeScript SDK accepts File, Blob, ReadStream, or the toFile() helper (which wraps a Buffer / Uint8Array).

typescript
import Anthropic, { toFile } from "@anthropic-ai/sdk";
import fs from "node:fs";

const client = new Anthropic();

const uploaded = await client.beta.files.upload({
  file: fs.createReadStream("manual.pdf"),
});

console.log(uploaded.id, uploaded.filename, uploaded.size_bytes);

Output:

text
file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088

With an in-memory buffer:

typescript
const buffer = await fs.promises.readFile("report.pdf");
const uploaded = await client.beta.files.upload({
  file: await toFile(buffer, "report.pdf", { type: "application/pdf" }),
});

Upload via curl

For ad-hoc uploads or scripts in languages without an SDK, use a multipart POST.

bash
curl https://api.anthropic.com/v1/files \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -H "anthropic-beta: files-api-2025-04-14" \
    -F "file=@manual.pdf"

Output:

text
{
  "id": "file_01XVnKzQp8mN7vF4LqJ2cR3Z",
  "filename": "manual.pdf",
  "mime_type": "application/pdf",
  "size_bytes": 4271088,
  "created_at": "2026-05-25T13:21:08Z",
  "downloadable": false
}

Reference a file in a message

Once uploaded, reference the file by file_id in any document or image content block. The model treats it identically to an inline document — same context budget, same visual reading for PDFs.

python
response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
            },
            {"type": "text", "text": "Summarise section 4.2 about reset behaviour."},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)

print(response.content[0].text)

Output:

text
Section 4.2 describes two reset modes. A soft reset clears the I/O queue and
in-progress transactions but preserves the serial number, calibration data,
and firmware. A hard reset clears everything, reverts firmware to factory
defaults, and requires re-pairing the device.

Reference an image

python
img = client.beta.files.upload(file=open("chart.png", "rb"))

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "file", "file_id": img.id},
            },
            {"type": "text", "text": "What does this chart show?"},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.content[0].text)

Output:

text
The chart shows monthly revenue from January through December, with steady
growth from $1.2M to $2.1M and a notable spike in Q4 to $2.8M.
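The document and image examples differ only in the block type, so the choice can be derived from the file's MIME type. A minimal sketch under that assumption; `file_block` is a hypothetical helper, and the block shapes are copied from the two examples above.

```python
def file_block(file_id: str, mime_type: str, **extra) -> dict:
    """Build a content block that references an uploaded file.

    Images become "image" blocks; PDFs and text become "document" blocks,
    mirroring the two examples above. Extra keys (title, citations,
    cache_control) are copied onto the block unchanged.
    """
    block_type = "image" if mime_type.startswith("image/") else "document"
    block = {"type": block_type, "source": {"type": "file", "file_id": file_id}}
    block.update(extra)
    return block


# The id below is the illustrative one used throughout this cheat sheet.
print(file_block("file_01XVnKzQp8mN7vF4LqJ2cR3Z", "application/pdf", title="Device Manual"))
```

The mime_type argument can come straight from the metadata returned at upload time, so callers need not remember which kind of file each id refers to.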

Reference from TypeScript

typescript
const response = await client.beta.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 2048,
  messages: [{
    role: "user",
    content: [
      { type: "document", source: { type: "file", file_id: uploaded.id } },
      { type: "text", text: "What is the warranty period?" },
    ],
  }],
}, {
  headers: { "anthropic-beta": "files-api-2025-04-14" },
});

const first = response.content[0];
if (first.type === "text") console.log(first.text);

Output:

text
The standard warranty period is 24 months from the date of purchase, with an
optional extended-warranty plan that adds another 36 months.

Citations

Enable citations per-document with citations: { enabled: true }. Each text block in Claude's response carries a citations array linking back to the exact span in the cited file — page numbers for PDFs, character ranges for text.

python
response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
                "title": "Device Manual",
                "citations": {"enabled": True},
            },
            {"type": "text", "text": "List the three supported reset modes with the exact wording from the manual."},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)

for block in response.content:
    if block.type != "text":
        continue
    print(block.text)
    for cite in block.citations or []:
        print(f"  -> p.{cite.start_page_number}-{cite.end_page_number}: {cite.cited_text!r}")

Output:

text
The manual lists three reset modes:

1. **Soft reset** — "clears the I/O queue but preserves calibration data."
  -> p.12-12: 'clears the I/O queue but preserves calibration data'
2. **Hard reset** — "reverts firmware to factory defaults."
  -> p.13-13: 'reverts firmware to factory defaults'
3. **Recovery reset** — "loaded from the recovery partition."
  -> p.13-13: 'loaded from the recovery partition'
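When the quotes need to be stored or shown separately from the answer text, the loop above can be turned into a function that returns structured data. A minimal sketch, assuming the citation fields used above (cited_text, start_page_number, end_page_number); `page_citations` is a hypothetical helper, and the SimpleNamespace objects are offline stand-ins for a real response.

```python
from types import SimpleNamespace


def page_citations(content) -> list:
    """Collect (start_page, end_page, cited_text) tuples from text blocks.

    Citations without page numbers (e.g. character ranges for plain-text
    files) are skipped.
    """
    found = []
    for block in content:
        if getattr(block, "type", None) != "text":
            continue
        for cite in getattr(block, "citations", None) or []:
            start = getattr(cite, "start_page_number", None)
            if start is None:
                continue
            found.append((start, cite.end_page_number, cite.cited_text))
    return found


# Offline stand-in shaped like the response above, for demonstration only.
demo = [
    SimpleNamespace(type="text", text="The manual lists three reset modes:", citations=None),
    SimpleNamespace(
        type="text",
        text="Hard reset",
        citations=[SimpleNamespace(start_page_number=13, end_page_number=13,
                                   cited_text="reverts firmware to factory defaults")],
    ),
]
print(page_citations(demo))
```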

List files

client.beta.files.list paginates over uploaded files, with optional after_id / before_id cursors.

python
for f in client.beta.files.list(limit=20):
    print(f.id, f.filename, f.size_bytes, f.created_at)

Output:

text
file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088 2026-05-25T13:21:08Z
file_01XVnJyPo7lM6uE3KpI1bQ2Y chart.png 284910 2026-05-25T11:04:33Z
file_01XVnHxNm6kL5tD2JoH0aP1X spec.md 12480 2026-05-24T19:51:02Z

typescript
const files = await client.beta.files.list({ limit: 20 });
for (const f of files.data) {
  console.log(f.id, f.filename, f.size_bytes);
}
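The listing examples above show no filename filter, so finding a file by name means scanning the listing on the client. A minimal sketch, assuming that iterating over `client.beta.files.list(...)` walks every page as the Python example above does; `find_by_filename` is a hypothetical helper, and the SimpleNamespace client is an offline stand-in so the snippet runs without network access.

```python
from types import SimpleNamespace


def find_by_filename(client, filename: str, page_size: int = 100) -> list:
    """Return metadata for every uploaded file whose filename matches."""
    return [f for f in client.beta.files.list(limit=page_size) if f.filename == filename]


# Offline stand-in for a real client, for demonstration only.
_files = [
    SimpleNamespace(id="file_1", filename="manual.pdf"),
    SimpleNamespace(id="file_2", filename="chart.png"),
    SimpleNamespace(id="file_3", filename="manual.pdf"),
]
fake_client = SimpleNamespace(
    beta=SimpleNamespace(files=SimpleNamespace(list=lambda limit=20: iter(_files)))
)

print([f.id for f in find_by_filename(fake_client, "manual.pdf")])
```

More than one match means the same name was uploaded more than once, which is a cue to dedupe by content hash before the next upload.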

Retrieve metadata

Fetch a single file's metadata by ID.

python
f = client.beta.files.retrieve(uploaded.id)
print(f.filename, f.size_bytes, f.mime_type, f.created_at)

Output:

text
manual.pdf 4271088 application/pdf 2026-05-25T13:21:08Z

Delete

Files are billable storage — clean them up when no longer needed.

python
client.beta.files.delete(uploaded.id)
print("deleted")

Output:

text
deleted

typescript
await client.beta.files.delete(uploaded.id);
console.log("deleted");

Output:

text
deleted

Combined with prompt caching

When you reference a file by file_id, the file's tokens still count toward context — but they cache the same way inline documents do. Attach cache_control to the document block to make the file's content (which is large and stable) a cached prefix.

python
response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
                "cache_control": {"type": "ephemeral"},
            },
            {"type": "text", "text": "What does the warranty say?"},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.usage)

Output (first call — writes cache):

text
Usage(input_tokens=24, output_tokens=78, cache_creation_input_tokens=18420, cache_read_input_tokens=0)

Output (second call within 5 min on the same file):

text
Usage(input_tokens=24, output_tokens=82, cache_creation_input_tokens=0, cache_read_input_tokens=18420)

See Prompt caching for breakpoint placement.
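To confirm that repeated calls are hitting the cache, the usage fields shown above can be reduced to a single ratio. A minimal sketch, assuming the usage object exposes the four fields printed above; `cache_read_fraction` is a hypothetical helper, and the SimpleNamespace values copy the two sample outputs.

```python
from types import SimpleNamespace


def cache_read_fraction(usage) -> float:
    """Share of prompt tokens served from the cache (0.0 to 1.0)."""
    uncached = getattr(usage, "input_tokens", 0) or 0
    written = getattr(usage, "cache_creation_input_tokens", 0) or 0
    read = getattr(usage, "cache_read_input_tokens", 0) or 0
    total = uncached + written + read
    return read / total if total else 0.0


# Values copied from the two sample outputs above.
first = SimpleNamespace(input_tokens=24, output_tokens=78,
                        cache_creation_input_tokens=18420, cache_read_input_tokens=0)
second = SimpleNamespace(input_tokens=24, output_tokens=82,
                         cache_creation_input_tokens=0, cache_read_input_tokens=18420)

print(cache_read_fraction(first), round(cache_read_fraction(second), 3))
```

A ratio that stays near zero across repeated calls on the same file suggests the cache_control marker is missing or the prefix before it is changing between requests.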

Using files in the Batch API

A batched request may reference uploaded files exactly as a synchronous one would. Each request still reads the file, so mark the document block with cache_control: requests that hit the cache are billed for cache reads rather than full input tokens, which adds up across a 10K-item batch.

python
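# QUESTIONS is your own list of prompt strings; each becomes one batch request.
# Use the beta namespace, as for every other call that references a file_id.
batch = client.beta.messages.batches.create(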
    requests=[
        {
            "custom_id": f"q-{i}",
            "params": {
                "model": "claude-opus-4-7",
                "max_tokens": 256,
                "messages": [{
                    "role": "user",
                    "content": [
                        {
                            "type": "document",
                            "source": {"type": "file", "file_id": uploaded.id},
                            "cache_control": {"type": "ephemeral"},
                        },
                        {"type": "text", "text": q},
                    ],
                }],
            },
        }
        for i, q in enumerate(QUESTIONS)
    ],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(batch.id)

Output:

text
msgbatch_01XVnKzQpZ8mN7vF4LqJ2cR3
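Building the request list separately from submitting it makes the batch easy to inspect or log before it is sent. A minimal sketch that produces the same request shape as the example above; `build_batch_requests` is a hypothetical helper, and the model name is passed in by the caller rather than assumed.

```python
def build_batch_requests(file_id: str, questions: list, model: str,
                         max_tokens: int = 256) -> list:
    """Return one batch request per question, all referencing the same file."""
    requests = []
    for i, question in enumerate(questions):
        requests.append({
            "custom_id": f"q-{i}",
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{
                    "role": "user",
                    "content": [
                        {
                            "type": "document",
                            "source": {"type": "file", "file_id": file_id},
                            "cache_control": {"type": "ephemeral"},
                        },
                        {"type": "text", "text": question},
                    ],
                }],
            },
        })
    return requests


reqs = build_batch_requests(
    "file_01XVnKzQp8mN7vF4LqJ2cR3Z",  # illustrative id from this cheat sheet
    ["What is the warranty period?", "How do I reset the device?"],
    model="claude-opus-4-7",
)
print([r["custom_id"] for r in reqs])
```

The custom_id values are what tie each result back to its question once the batch completes, so they need to be unique within the batch.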

Multi-file RAG pattern

Upload your entire corpus once, then attach the relevant subset (selected by an external retriever — vector DB, BM25, full-text) to each user turn.

python
import anthropic, json, pathlib

client = anthropic.Anthropic()

# 1. One-time ingest
corpus = pathlib.Path("docs").glob("*.pdf")
manifest: dict[str, str] = {}
for path in corpus:
    with path.open("rb") as fh:  # close the handle once the upload finishes
        f = client.beta.files.upload(file=fh)
    manifest[path.stem] = f.id
pathlib.Path("manifest.json").write_text(json.dumps(manifest, indent=2))

# 2. Per-query: retrieve relevant file_ids however you like
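def answer(question: str, file_ids: list[str]) -> str:
    docs = [
        {"type": "document", "source": {"type": "file", "file_id": fid}}
        for fid in file_ids
    ]
    # A request allows at most 4 cache breakpoints, so mark only the last
    # document; the cached prefix then covers every document before it.
    if docs:
        docs[-1]["cache_control"] = {"type": "ephemeral"}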
    resp = client.beta.messages.create(
        model="claude-opus-4-7",
        max_tokens=2048,
        messages=[{"role": "user", "content": [*docs, {"type": "text", "text": question}]}],
        extra_headers={"anthropic-beta": "files-api-2025-04-14"},
    )
    return resp.content[0].text
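The retriever that picks file_ids is left open above. As a placeholder while a vector or BM25 index is being built, a keyword match over the manifest keys is enough to exercise the flow end to end. This is a deliberately naive sketch, not a recommended retriever; `pick_file_ids` is a hypothetical helper, and it assumes the manifest maps descriptive file stems to file ids as in the ingest step.

```python
import re


def pick_file_ids(question: str, manifest: dict, k: int = 3) -> list:
    """Rank manifest entries by word overlap between the question and the stem."""
    q_words = set(re.findall(r"[a-z0-9]+", question.lower()))
    scored = []
    for stem, file_id in manifest.items():
        stem_words = set(re.findall(r"[a-z0-9]+", stem.lower()))
        overlap = len(q_words & stem_words)
        if overlap:
            scored.append((overlap, stem, file_id))
    # Highest overlap first; ties broken alphabetically by stem for stable output.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [file_id for _, _, file_id in scored[:k]]


# Illustrative manifest; the stems and ids are made up for this example.
manifest = {
    "warranty-terms": "file_a",
    "reset-guide": "file_b",
    "install_manual": "file_c",
}
print(pick_file_ids("What are the reset modes in the guide?", manifest))
```

The returned list can be passed directly as the file_ids argument of answer.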

Tool result content with a file

A tool whose result is a generated PDF or chart can return it as a document or image block that references the uploaded file's id; Claude reads it before continuing.

python
def render_chart_tool(tool_use_id: str, args: dict) -> dict:
    # tool_use_id comes from the tool_use block in Claude's response,
    # not from the tool's own input arguments.
    # ... generate plot.png from args ...
    with open("plot.png", "rb") as fh:
        f = client.beta.files.upload(file=fh)
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": [
            {"type": "text", "text": "Chart generated for Q4 metrics."},
            {"type": "image", "source": {"type": "file", "file_id": f.id}},
        ],
    }

Privacy and lifecycle

| Property | Behaviour |
| --- | --- |
| Visibility | Files are scoped to your workspace and are not shared across orgs |
| Retention | Files persist until you delete them or until the workspace is wound down |
| Training | Per Anthropic policy, Files API data is not used for model training |
| Encryption | Encrypted at rest and in transit |
| Auditing | All file operations are logged in your org audit log |

Files are workspace-scoped — multiple workspaces in the same org each have their own file store. Plan your workspace structure with this in mind for per-customer isolation.

Common pitfalls

| Pitfall | Symptom | Fix |
| --- | --- | --- |
| Missing beta header | 400 on upload or message reference | Add anthropic-beta: files-api-2025-04-14 on every call |
| Mixing client.messages and client.beta.messages | File reference rejected | Use client.beta.messages.create while the API is in beta |
| Uploading a file > 32 MB | 413 Payload Too Large | Split or compress before upload |
| Referencing a deleted file | 404 mid-message | Track file lifetime; do not delete until all references are settled |
| Counting on storage being free | Surprise bill | Files count against your storage tier; delete unused ones |
| Skipping cache_control on the document | Re-pay for the file's tokens every call | Mark the document block cacheable; reuse hits the cache |
| Treating files as ephemeral | Storage grows, because files persist until deleted | Delete transient files from a daily cleanup job |
| Re-uploading the same PDF | Wasted bandwidth and storage | Hash before upload; reuse the existing file_id |
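The status codes in the table can be mapped to their fixes so that error handlers log something actionable. A minimal sketch that encodes only the three codes listed above; `hint_for` and `HINTS` are hypothetical names, and other status codes fall through to a generic message.

```python
# Status-code hints taken from the pitfalls table above.
HINTS = {
    400: "Check that anthropic-beta: files-api-2025-04-14 is sent on this call.",
    404: "The file_id may have been deleted; re-upload or fix the reference.",
    413: "The file exceeds the size cap; split or compress it before upload.",
}


def hint_for(status_code: int) -> str:
    """Return a troubleshooting hint for a Files API HTTP status code."""
    return HINTS.get(status_code, f"No documented hint for HTTP {status_code}.")


for code in (400, 404, 413, 500):
    print(code, hint_for(code))
```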

Common recipes

Idempotent upload with content-hash dedupe

python
import hashlib, json, pathlib

INDEX = pathlib.Path("file_index.json")

def upload_once(path: pathlib.Path) -> str:
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    index = json.loads(INDEX.read_text()) if INDEX.exists() else {}
    if digest in index:
        return index[digest]
    f = client.beta.files.upload(file=path.open("rb"))
    index[digest] = f.id
    INDEX.write_text(json.dumps(index, indent=2))
    return f.id

Bulk ingest with rate limiting

python
import pathlib
import time

def ingest(paths: list[pathlib.Path], pause: float = 0.1) -> dict[str, str]:
    mapping: dict[str, str] = {}
    for p in paths:
        f = client.beta.files.upload(file=p.open("rb"))
        mapping[p.stem] = f.id
        print(f"uploaded {p.name} -> {f.id}")
        time.sleep(pause)
    return mapping
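A fixed pause slows every upload, including the ones that would have succeeded immediately. An alternative is to retry only when a call fails, with exponential backoff and jitter. The sketch below is generic and assumes nothing about the SDK; `with_backoff` is a hypothetical helper, and which exception types to retry on is left to the caller (the Python SDK also applies its own limited retries by default).

```python
import random
import time


def with_backoff(fn, *, retries: int = 5, base_delay: float = 0.5,
                 retry_on: tuple = (Exception,), sleep=time.sleep):
    """Call fn(); on a listed exception, wait and retry with doubling delays."""
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff plus jitter so parallel workers spread out.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))


# Demonstration with a function that fails twice before succeeding.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


waits = []
print(with_backoff(flaky, sleep=waits.append), len(waits))
```

In the ingest loop this would wrap the upload call, for example `with_backoff(lambda: client.beta.files.upload(file=p.open("rb")))`, with retry_on narrowed to the rate-limit error your client raises.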

TTL cleanup job

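Delete files created more than 30 days ago. The check uses only created_at; to spare files that were referenced recently, track last use in your own index.

python
import datetime as dt

CUTOFF = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=30)

# Collect first, then delete, so deletions cannot disturb pagination.
stale = []
for f in client.beta.files.list(limit=200):
    created = f.created_at
    if isinstance(created, str):  # raw JSON carries an ISO-8601 string
        created = dt.datetime.fromisoformat(created.replace("Z", "+00:00"))
    if created < CUTOFF:
        stale.append(f)

for f in stale:
    client.beta.files.delete(f.id)
    print(f"deleted {f.id} ({f.filename})")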

Show what is in storage

python
total = 0
by_type: dict[str, int] = {}
for f in client.beta.files.list(limit=200):
    total += f.size_bytes
    by_type[f.mime_type] = by_type.get(f.mime_type, 0) + f.size_bytes

print(f"total: {total / 1_000_000:.1f} MB")
for mime, size in sorted(by_type.items(), key=lambda x: -x[1]):
    print(f"  {mime}: {size / 1_000_000:.1f} MB")

Output:

text
total: 412.7 MB
  application/pdf: 380.2 MB
  image/png: 18.4 MB
  text/markdown: 14.1 MB
