Complete TypeScript SDK reference for the Anthropic Claude API — install, messages.create, streaming, tool use, vision, types, and Node/Deno/Bun integration.
Claude API — TypeScript SDK
What it is
@anthropic-ai/sdk is the official TypeScript client for the Anthropic API — a thin, fully typed wrapper around fetch that exposes messages.create, messages.stream, the Batch API, the Files API, and the rest of the Claude surface. It runs on Node 18+, Deno, Bun, Cloudflare Workers, and modern browsers (with a server-side proxy). Reach for it when your stack is TypeScript/JavaScript instead of Python; the request/response shapes are identical, only the language idioms differ.
Install
npm install @anthropic-ai/sdk
Output:
added 1 package, and audited 12 packages in 1s
Or with pnpm / bun / yarn:
pnpm add @anthropic-ai/sdk
Output:
+ @anthropic-ai/sdk 0.49.0
Setup
The client reads ANTHROPIC_API_KEY from process.env by default. Pass apiKey explicitly in environments where process.env is unavailable or secrets arrive another way (Cloudflare Workers bindings, Lambda layers, browsers behind a proxy).
export ANTHROPIC_API_KEY="sk-ant-api03-…REDACTED…"
Output: (none — exits 0 on success)
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
// or: new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY })
Basic message
The minimum call. Use await client.messages.create({ ... }) with model, max_tokens, and a messages array. The response is typed as Anthropic.Message; access the first text block via response.content[0].
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: "Explain TypeScript generics in one paragraph." }],
});
const first = response.content[0];
if (first.type === "text") {
console.log(first.text);
}
console.log(response.usage);
Output:
TypeScript generics let a function or type accept other types as parameters, so the
same code can work with different shapes while keeping full static type safety.
Inside the function, the placeholder type behaves like any other type; at the call
site, TypeScript infers the concrete type from the arguments.
{ input_tokens: 14, output_tokens: 65, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 }
Discriminated unions for content blocks
response.content is Array<TextBlock | ToolUseBlock | ThinkingBlock | ...>. Use block.type as a discriminant to narrow safely.
for (const block of response.content) {
switch (block.type) {
case "text":
console.log("TEXT:", block.text);
break;
case "tool_use":
console.log("TOOL:", block.name, block.input);
break;
case "thinking":
console.log("THINK:", block.thinking.length, "chars");
break;
}
}
Output:
TEXT: Sure — here is the answer.
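When a response mixes several block kinds, the same narrowing can be wrapped in a small helper that keeps only the text. The sketch below is illustrative: the block types are local stand-ins that mirror the shapes described above (in real code you would use Anthropic.ContentBlock), and collectText is a hypothetical helper name, not part of the SDK.

```typescript
// Local stand-ins for the SDK's content block union (assumed shapes, trimmed
// to the fields used here). In real code, use Anthropic.ContentBlock instead.
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ThinkingBlock = { type: "thinking"; thinking: string };
type ContentBlock = TextBlock | ToolUseBlock | ThinkingBlock;

// Hypothetical helper: join every text block, skipping tool_use and thinking.
function collectText(content: ContentBlock[]): string {
  return content
    .filter((block): block is TextBlock => block.type === "text")
    .map((block) => block.text)
    .join("");
}

const sample: ContentBlock[] = [
  { type: "thinking", thinking: "The user wants a greeting." },
  { type: "text", text: "Hello, " },
  { type: "tool_use", id: "toolu_01", name: "get_weather", input: { location: "Toronto" } },
  { type: "text", text: "world." },
];

console.log(collectText(sample)); // prints "Hello, world."
```

The type-guard predicate in filter is what lets TypeScript treat the remaining elements as TextBlock, so block.text needs no cast.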
System prompt
system accepts a string or an array of typed content blocks. The array form is required when you want to attach cache_control to portions of the system prompt.
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 512,
system: "You are a concise TypeScript reviewer. Reply in bullet points only.",
messages: [{ role: "user", content: "How should I model a result that can fail?" }],
});
const first = response.content[0];
if (first.type === "text") console.log(first.text);
Output:
- Prefer a discriminated union: `{ ok: true; value: T } | { ok: false; error: E }`.
- Avoid throwing for predictable failures — make them part of the return type.
- Use a small `Result<T, E>` helper and `match`/exhaustive switch to consume it.
Multi-turn conversation
The API is stateless. Maintain your own message array; append each assistant turn verbatim before the next user message.
const messages: Anthropic.MessageParam[] = [
{ role: "user", content: "What is 2 + 2?" },
{ role: "assistant", content: "4" },
{ role: "user", content: "Multiply that by 10." },
];
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 256,
messages,
});
const first = response.content[0];
if (first.type === "text") console.log(first.text);
Output:
40
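Because the history lives on your side, it can help to keep it behind a small wrapper. This is a minimal sketch, not an SDK feature: Conversation and ChatMessage are hypothetical names, and the message shape is a simplified stand-in for Anthropic.MessageParam (which also accepts an array of content blocks).

```typescript
// Assumed minimal message shape; Anthropic.MessageParam is the real type.
type Role = "user" | "assistant";
type ChatMessage = { role: Role; content: string };

// Hypothetical helper that owns the running message array.
class Conversation {
  private readonly turns: ChatMessage[] = [];

  user(content: string): this {
    this.turns.push({ role: "user", content });
    return this;
  }

  assistant(content: string): this {
    this.turns.push({ role: "assistant", content });
    return this;
  }

  // Return a copy so callers cannot mutate the stored history by accident.
  toMessages(): ChatMessage[] {
    return this.turns.map((turn) => ({ ...turn }));
  }
}

const chat = new Conversation()
  .user("What is 2 + 2?")
  .assistant("4")
  .user("Multiply that by 10.");

console.log(chat.toMessages().length); // prints 3
```

The array returned by toMessages() would be passed as the messages field of the next request, and the assistant's reply appended with assistant(...) afterwards.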
Streaming with helpers
client.messages.stream(...) returns a MessageStream. Register .on("text", handler) to receive each text delta as it arrives, then await .finalMessage() for the assembled result. Iterating the stream directly with for await yields raw events rather than strings (see Low-level events).
const stream = client.messages.stream({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: "Count to five slowly." }],
});
// The "text" event delivers each text delta as a string.
stream.on("text", (text) => {
process.stdout.write(text);
});
// Resolves once the stream has ended, with the fully assembled message.
const finalMessage = await stream.finalMessage();
console.log();
console.log("tokens:", finalMessage.usage.output_tokens);
Output:
One... two... three... four... five.
tokens: 19
Low-level events
For full control — handling content_block_start, input_json_delta for streaming tool input, or rendering thinking blocks — iterate the raw event stream.
for await (const event of stream) {
if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
process.stdout.write(event.delta.text);
}
if (event.type === "content_block_delta" && event.delta.type === "input_json_delta") {
process.stdout.write(event.delta.partial_json);
}
}
See Streaming for the complete event reference.
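Streamed tool input arrives as JSON fragments that only parse once the block has ended. The sketch below shows one way to buffer input_json_delta fragments per block index and parse them at content_block_stop. The event types are trimmed local stand-ins (the real union is Anthropic.MessageStreamEvent), and collectToolCalls is a hypothetical helper that works on an already collected list of events so it can run without a network call.

```typescript
// Assumed, trimmed-down event shapes; the SDK's real events carry more fields.
type StreamEvent =
  | {
      type: "content_block_start";
      index: number;
      content_block: { type: "tool_use"; id: string; name: string } | { type: "text" };
    }
  | {
      type: "content_block_delta";
      index: number;
      delta: { type: "input_json_delta"; partial_json: string } | { type: "text_delta"; text: string };
    }
  | { type: "content_block_stop"; index: number };

type ToolCall = { id: string; name: string; input: unknown };

// Hypothetical helper: buffer JSON fragments per block, parse when the block ends.
function collectToolCalls(events: Iterable<StreamEvent>): ToolCall[] {
  const pending = new Map<number, { id: string; name: string; json: string }>();
  const done: ToolCall[] = [];
  for (const event of events) {
    if (event.type === "content_block_start") {
      if (event.content_block.type === "tool_use") {
        const { id, name } = event.content_block;
        pending.set(event.index, { id, name, json: "" });
      }
    } else if (event.type === "content_block_delta") {
      const entry = pending.get(event.index);
      if (entry && event.delta.type === "input_json_delta") {
        entry.json += event.delta.partial_json;
      }
    } else {
      const entry = pending.get(event.index);
      if (entry) {
        // An empty buffer is treated as a tool call with no arguments.
        done.push({ id: entry.id, name: entry.name, input: JSON.parse(entry.json || "{}") });
        pending.delete(event.index);
      }
    }
  }
  return done;
}

const events: StreamEvent[] = [
  { type: "content_block_start", index: 0, content_block: { type: "text" } },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "Checking." } },
  { type: "content_block_stop", index: 0 },
  { type: "content_block_start", index: 1, content_block: { type: "tool_use", id: "toolu_01", name: "get_weather" } },
  { type: "content_block_delta", index: 1, delta: { type: "input_json_delta", partial_json: '{"location": "Tor' } },
  { type: "content_block_delta", index: 1, delta: { type: "input_json_delta", partial_json: 'onto, Canada"}' } },
  { type: "content_block_stop", index: 1 },
];

const calls = collectToolCalls(events);
console.log(calls[0].name, calls[0].input);
```

Parsing only at content_block_stop avoids calling JSON.parse on a half-written fragment, which would throw.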
Tool use
Tools are typed as Anthropic.Tool[]. input_schema is a JSON Schema object. The SDK does not check the model's tool input against that schema at runtime, so validate block.input with zod or another validator before you use it.
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const tools: Anthropic.Tool[] = [{
name: "get_weather",
description: "Get current weather. Call this when the user asks about weather.",
input_schema: {
type: "object",
properties: {
location: { type: "string", description: "City and country" },
unit: { type: "string", enum: ["celsius", "fahrenheit"] },
},
required: ["location"],
},
}];
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
tools,
messages: [{ role: "user", content: "What's the weather in Toronto?" }],
});
console.log(response.stop_reason);
for (const block of response.content) {
if (block.type === "tool_use") {
console.log(block.name, block.input);
}
}
Output:
tool_use
get_weather { location: 'Toronto, Canada' }
Continue the loop
Append the assistant turn and a tool_result user turn, then call again.
const toolUse = response.content.find(b => b.type === "tool_use");
if (toolUse && toolUse.type === "tool_use") {
const result = JSON.stringify({ temp: 12, condition: "cloudy" });
const followup = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
tools,
messages: [
{ role: "user", content: "What's the weather in Toronto?" },
{ role: "assistant", content: response.content },
{
role: "user",
content: [{ type: "tool_result", tool_use_id: toolUse.id, content: result }],
},
],
});
const first = followup.content[0];
if (first.type === "text") console.log(first.text);
}
Output:
The current weather in Toronto, Canada is 12°C and cloudy.
See Tool use for the full agentic loop, parallel tools, and tool_choice reference.
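When a response contains more than one tool_use block, each one needs a matching tool_result in the next user turn. The following is a rough sketch of that dispatch step under a few assumptions: handlers are synchronous, the block types are local stand-ins for the SDK's types, and runTools and ToolHandler are hypothetical names.

```typescript
// Assumed shapes mirroring the tool_use and tool_result blocks shown above.
type ToolUse = { type: "tool_use"; id: string; name: string; input: unknown };
type TextPart = { type: "text"; text: string };
type ToolResult = { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean };
type ToolHandler = (input: unknown) => string;

// Hypothetical dispatcher: run each requested tool and build one result per call.
function runTools(blocks: Array<ToolUse | TextPart>, handlers: Map<string, ToolHandler>): ToolResult[] {
  const results: ToolResult[] = [];
  for (const block of blocks) {
    if (block.type !== "tool_use") continue;
    try {
      const handler = handlers.get(block.name);
      if (!handler) throw new Error(`Unknown tool: ${block.name}`);
      results.push({ type: "tool_result", tool_use_id: block.id, content: handler(block.input) });
    } catch (err) {
      // Report failures back to the model instead of aborting the loop.
      const message = err instanceof Error ? err.message : String(err);
      results.push({ type: "tool_result", tool_use_id: block.id, content: message, is_error: true });
    }
  }
  return results;
}

const handlers = new Map<string, ToolHandler>([
  ["get_weather", () => JSON.stringify({ temp: 12, condition: "cloudy" })],
]);

const results = runTools(
  [
    { type: "text", text: "Let me check." },
    { type: "tool_use", id: "toolu_01", name: "get_weather", input: { location: "Toronto, Canada" } },
    { type: "tool_use", id: "toolu_02", name: "get_time", input: {} },
  ],
  handlers,
);

console.log(results);
```

The returned array would become the content of the next user message. Returning an is_error result for an unknown or failing tool gives the model a chance to recover.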
Zod-validated tool input
block.input is typed as unknown. Validate it with zod to get a typed object and friendly errors when the model goes off-schema.
import { z } from "zod";
const WeatherInput = z.object({
location: z.string(),
unit: z.enum(["celsius", "fahrenheit"]).optional(),
});
function handleWeather(raw: unknown): string {
const args = WeatherInput.parse(raw); // throws on bad input
return JSON.stringify({ temp: 12, condition: "cloudy", unit: args.unit ?? "celsius" });
}
Vision — image input
Images attach as content blocks alongside text. Pass base64 for local data or url for a public link.
import fs from "node:fs/promises";
const imageData = (await fs.readFile("chart.png")).toString("base64");
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{
role: "user",
content: [
{
type: "image",
source: { type: "base64", media_type: "image/png", data: imageData },
},
{ type: "text", text: "What trend does this chart show?" },
],
}],
});
const first = response.content[0];
if (first.type === "text") console.log(first.text);
Output:
The chart shows a steady upward trend in monthly active users, with the
sharpest growth between August and October.
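The media_type has to match the actual file, so it is handy to derive it from the file extension. Here is a small sketch, assuming the four image formats JPEG, PNG, GIF and WebP. imageBlock is a hypothetical helper, and the block type is a local stand-in for the SDK's image block param.

```typescript
// Assumed set of supported image media types.
type ImageMediaType = "image/jpeg" | "image/png" | "image/gif" | "image/webp";
type ImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: ImageMediaType; data: string };
};

const MEDIA_TYPES = new Map<string, ImageMediaType>([
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["png", "image/png"],
  ["gif", "image/gif"],
  ["webp", "image/webp"],
]);

// Hypothetical helper: build an image content block from a filename and base64 data.
function imageBlock(filename: string, base64Data: string): ImageBlock {
  const extension = filename.slice(filename.lastIndexOf(".") + 1).toLowerCase();
  const mediaType = MEDIA_TYPES.get(extension);
  if (!mediaType) throw new Error(`Unsupported image type: ${filename}`);
  return { type: "image", source: { type: "base64", media_type: mediaType, data: base64Data } };
}

const chartBlock = imageBlock("Chart.PNG", "iVBORw0KGgo=");
console.log(chartBlock.source.media_type); // prints "image/png"
```

Failing early on an unknown extension is usually easier to debug than an API error about a mismatched media type.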
PDF input
PDFs work like images — Claude reads them visually with layout intact. Up to 100 pages per document, 32 MB per file.
import fs from "node:fs/promises";
const pdf = (await fs.readFile("report.pdf")).toString("base64");
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 2048,
messages: [{
role: "user",
content: [
{
type: "document",
source: { type: "base64", media_type: "application/pdf", data: pdf },
cache_control: { type: "ephemeral" },
},
{ type: "text", text: "Summarise in five bullet points." },
],
}],
});
For very large or reused PDFs, upload once via the Files API and reference by file_id.
Extended thinking
Enable with thinking: { type: "enabled", budget_tokens: N }. The budget must be at least 1024 and smaller than max_tokens, and temperature must stay at its default of 1.
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 16000,
thinking: { type: "enabled", budget_tokens: 10000 },
messages: [{
role: "user",
content: "Twelve coins, one is heavier or lighter. Find it in 3 weighings.",
}],
});
for (const block of response.content) {
if (block.type === "thinking") console.log(`[Thinking: ${block.thinking.length} chars]`);
if (block.type === "text") console.log(block.text);
}
Output:
[Thinking: 4218 chars]
Yes — a classic decision tree solves it in three weighings…
Prompt caching
Mark expensive prefixes with cache_control: { type: "ephemeral" }. The first call writes the cache; subsequent calls within 5 minutes read it at ~10% input cost.
const docs = await fs.readFile("manual.txt", "utf8");
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
system: [
{ type: "text", text: "You answer questions about the manual." },
{ type: "text", text: docs, cache_control: { type: "ephemeral" } },
],
messages: [{ role: "user", content: "What does section 4.2 say about resets?" }],
});
console.log(response.usage);
Output (first call):
{ input_tokens: 30, output_tokens: 142, cache_creation_input_tokens: 24512, cache_read_input_tokens: 0 }
Output (second call, same prefix, within 5 min):
{ input_tokens: 30, output_tokens: 138, cache_creation_input_tokens: 0, cache_read_input_tokens: 24512 }
See Prompt caching for breakpoint placement and cost math.
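To see what the two usage objects above mean for the bill, the arithmetic can be written out. This is a sketch under assumptions: the 0.1x read multiplier follows the ~10% figure above, while the 1.25x write multiplier is an assumed value for the 5-minute cache that should be checked against current pricing. billableInputTokens and CacheUsage are hypothetical names.

```typescript
// Assumed shape, matching the usage objects printed above.
type CacheUsage = {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens: number | null;
  cache_read_input_tokens: number | null;
};

// Assumed multipliers relative to the base input price (verify before relying on them).
const CACHE_WRITE_MULTIPLIER = 1.25;
const CACHE_READ_MULTIPLIER = 0.1;

// Hypothetical helper: input cost expressed in "uncached input token" equivalents.
function billableInputTokens(usage: CacheUsage): number {
  const written = (usage.cache_creation_input_tokens ?? 0) * CACHE_WRITE_MULTIPLIER;
  const read = (usage.cache_read_input_tokens ?? 0) * CACHE_READ_MULTIPLIER;
  return Math.round(usage.input_tokens + written + read);
}

const firstCall: CacheUsage = {
  input_tokens: 30, output_tokens: 142,
  cache_creation_input_tokens: 24512, cache_read_input_tokens: 0,
};
const secondCall: CacheUsage = {
  input_tokens: 30, output_tokens: 138,
  cache_creation_input_tokens: 0, cache_read_input_tokens: 24512,
};

console.log(billableInputTokens(firstCall)); // prints 30670
console.log(billableInputTokens(secondCall)); // prints 2481
```

Under these assumptions the first call costs slightly more than an uncached request (24542 tokens), and every cache hit afterwards costs roughly a tenth of it.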
Token counting
Estimate input tokens (including system, tools, images) before sending.
const count = await client.messages.countTokens({
model: "claude-opus-4-7",
messages: [{ role: "user", content: "Explain quantum entanglement." }],
});
console.log(count.input_tokens);
Output:
8
Error handling
The SDK throws typed errors. Catch the specific class to differentiate retryable from permanent failures.
import Anthropic, { APIError, RateLimitError, AuthenticationError } from "@anthropic-ai/sdk";
const client = new Anthropic();
try {
await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: "Hello" }],
});
} catch (err) {
if (err instanceof AuthenticationError) {
console.error("Bad API key");
} else if (err instanceof RateLimitError) {
console.error("Rate limited; back off and retry");
} else if (err instanceof APIError) {
console.error(`API error ${err.status}: ${err.message}`);
} else {
throw err;
}
}
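If you need retries beyond what the client performs itself, the decision and the wait time can be kept in two small pure functions. This is an illustrative sketch, not the SDK's internal policy: the set of retryable statuses (408, 409, 429, 5xx, and errors with no status) is an assumption, and isRetryableStatus and backoffDelayMs are hypothetical names. The random source is injectable so the delay can be tested deterministically.

```typescript
// Assumed policy: retry connection failures (no status), timeouts, conflicts,
// rate limits, and server errors. Everything else is treated as permanent.
function isRetryableStatus(status: number | undefined): boolean {
  if (status === undefined) return true;
  return status === 408 || status === 409 || status === 429 || status >= 500;
}

// Exponential backoff with full jitter: a random delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs = 500,
  capMs = 8_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

const fixedRandom = () => 0.5; // deterministic stand-in for Math.random
console.log(backoffDelayMs(0, 500, 8_000, fixedRandom)); // prints 250
console.log(backoffDelayMs(3, 500, 8_000, fixedRandom)); // prints 2000
console.log(backoffDelayMs(10, 500, 8_000, fixedRandom)); // prints 4000
console.log(isRetryableStatus(429), isRetryableStatus(401)); // prints true false
```

In a catch block, err.status from an APIError would be passed to isRetryableStatus, and the attempt counter to backoffDelayMs before sleeping and retrying.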
Configuring retries and timeouts
Set maxRetries (default 2) and timeout (default 10 minutes) on the client, or override them for a single call by passing request options as the second argument to the method.
const client = new Anthropic({
maxRetries: 5,
timeout: 60_000, // ms
});
// Override for one slow request
const slow = await client.messages.create(
{
model: "claude-opus-4-7",
max_tokens: 8000,
messages: [{ role: "user", content: "Write a long essay." }],
},
{ timeout: 180_000 }, // per-request options
);
Raw HTTP response
Wrap a request with .withResponse() to access the underlying Response (Fetch API). Useful for reading rate-limit headers.
const { data, response } = await client.messages
.create({
model: "claude-opus-4-7",
max_tokens: 64,
messages: [{ role: "user", content: "ping" }],
})
.withResponse();
console.log(response.status);
console.log(response.headers.get("anthropic-ratelimit-tokens-remaining"));
console.log(data.content[0]);
Output:
200
399136
{ type: 'text', text: 'pong' }
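Header values come back as strings, or null when absent, so a small parser keeps the call sites tidy. In this sketch the header names other than anthropic-ratelimit-tokens-remaining (shown above) are assumptions, HeaderReader is a minimal interface that the Fetch Headers object satisfies, and readRateLimits is a hypothetical helper.

```typescript
// Minimal interface satisfied by the Fetch API's Headers object.
type HeaderReader = { get(name: string): string | null };

type RateLimitInfo = {
  requestsRemaining: number | null;
  tokensRemaining: number | null;
  retryAfterSeconds: number | null;
};

// Convert a header value to a number, or null when missing or not numeric.
function toNumber(value: string | null): number | null {
  if (value === null || value.trim() === "") return null;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : null;
}

// Hypothetical helper; the requests-remaining and retry-after names are assumptions.
function readRateLimits(headers: HeaderReader): RateLimitInfo {
  return {
    requestsRemaining: toNumber(headers.get("anthropic-ratelimit-requests-remaining")),
    tokensRemaining: toNumber(headers.get("anthropic-ratelimit-tokens-remaining")),
    retryAfterSeconds: toNumber(headers.get("retry-after")),
  };
}

// Stub standing in for response.headers so the example runs offline.
const fakeHeaders = new Map<string, string>([["anthropic-ratelimit-tokens-remaining", "399136"]]);
const limits = readRateLimits({ get: (name) => fakeHeaders.get(name) ?? null });

console.log(limits.tokensRemaining); // prints 399136
```

With the real client, response.headers from .withResponse() can be passed straight to readRateLimits.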
Cloudflare Workers
The SDK runs natively on Workers — no Node-specific imports. Pass the API key from a secret binding.
import Anthropic from "@anthropic-ai/sdk";
export interface Env {
ANTHROPIC_API_KEY: string;
}
export default {
async fetch(req: Request, env: Env): Promise<Response> {
const client = new Anthropic({ apiKey: env.ANTHROPIC_API_KEY });
const body = await req.json<{ message: string }>();
const result = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: body.message }],
});
const first = result.content[0];
return new Response(first.type === "text" ? first.text : "");
},
};
Stream tokens back to a browser using a ReadableStream:
export default {
async fetch(req: Request, env: Env): Promise<Response> {
const client = new Anthropic({ apiKey: env.ANTHROPIC_API_KEY });
const { message } = await req.json<{ message: string }>();
const encoder = new TextEncoder();
const stream = new ReadableStream({
async start(controller) {
const llm = client.messages.stream({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: message }],
});
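// "text" events carry each text delta as a string.
llm.on("text", (text) => controller.enqueue(encoder.encode(text)));
// Wait for the stream to finish before closing the response body.
await llm.finalMessage();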
controller.close();
},
});
return new Response(stream, { headers: { "Content-Type": "text/plain" } });
},
};
Bun and Deno
Both runtimes support the SDK with no extra config.
bun add @anthropic-ai/sdk
Output:
installed @anthropic-ai/sdk@0.49.0
// deno_app.ts — Deno 1.40+
import Anthropic from "npm:@anthropic-ai/sdk";
const client = new Anthropic({ apiKey: Deno.env.get("ANTHROPIC_API_KEY") });
const resp = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 256,
messages: [{ role: "user", content: "Hello from Deno" }],
});
console.log(resp.content[0]);
deno run --allow-net --allow-env deno_app.ts
Output:
{ type: "text", text: "Hello! How can I help you from Deno today?" }
Express streaming endpoint
Mirror the Python FastAPI streaming pattern — forward Claude tokens straight to the HTTP response body.
import express from "express";
import Anthropic from "@anthropic-ai/sdk";
const app = express();
app.use(express.json());
const client = new Anthropic();
app.post("/chat", async (req, res) => {
res.setHeader("Content-Type", "text/plain; charset=utf-8");
const stream = client.messages.stream({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: req.body.message }],
});
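// "text" events carry each text delta as a string.
stream.on("text", (text) => res.write(text));
// Wait until the model has finished before ending the response.
await stream.finalMessage();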
res.end();
});
app.listen(3000);
Test it:
curl -N -X POST http://localhost:3000/chat \
-H "Content-Type: application/json" \
-d '{"message": "Count to three."}'
Output:
One. Two. Three.
Type re-exports
Useful types you'll reach for repeatedly. Import as members of the default namespace.
import Anthropic from "@anthropic-ai/sdk";
type Msg = Anthropic.MessageParam;
type Tool = Anthropic.Tool;
type ToolUse = Anthropic.ToolUseBlock;
type ToolResult = Anthropic.ToolResultBlockParam;
type Message = Anthropic.Message;
type Usage = Anthropic.Usage;
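// Derived from the method's return type, so it works whether or not the class is re-exported on the namespace.
type Stream = ReturnType<Anthropic["messages"]["stream"]>;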
Bedrock and Vertex variants
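Claude on AWS Bedrock and Google Vertex is served by separate companion packages, @anthropic-ai/bedrock-sdk and @anthropic-ai/vertex-sdk, which expose the same messages interface.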
import { AnthropicBedrock } from "@anthropic-ai/bedrock-sdk";
const client = new AnthropicBedrock({ awsRegion: "us-east-1" });
const resp = await client.messages.create({
model: "anthropic.claude-opus-4-7-v1:0",
max_tokens: 256,
messages: [{ role: "user", content: "Hello from Bedrock" }],
});
import { AnthropicVertex } from "@anthropic-ai/vertex-sdk";
const client = new AnthropicVertex({ region: "us-central1", projectId: "my-gcp-project" });
Common pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Accessing content[0].text without narrowing | TS2339: Property 'text' does not exist | Check block.type === "text" first |
| Missing await on a request | You get a pending promise instead of a Message, and downstream code breaks | Request methods such as messages.create return promises; await them |
| Hardcoded key in source | Leaks via git | Use ANTHROPIC_API_KEY or a secret binding |
| Streaming without finalisation | Connection hangs | await stream.finalMessage() once the stream has been consumed, and always call res.end() |
| Browser bundle includes SDK | API key exposed in JS | Run the SDK server-side; only ship the proxy endpoint to the browser |
| JSON.stringify on a Map or class instance | Tool result arrives as {} | Convert to a plain object before stringifying |
Common recipes
Typed message builder
import Anthropic from "@anthropic-ai/sdk";
function userTurn(text: string): Anthropic.MessageParam {
return { role: "user", content: text };
}
function assistantTurn(blocks: Anthropic.ContentBlock[]): Anthropic.MessageParam {
return { role: "assistant", content: blocks };
}
function toolResult(id: string, content: string, isError = false): Anthropic.ToolResultBlockParam {
return { type: "tool_result", tool_use_id: id, content, is_error: isError };
}
Result helper
type Result<T, E = Error> =
| { ok: true; value: T }
| { ok: false; error: E };
async function tryClaude(args: Anthropic.MessageCreateParamsNonStreaming): Promise<Result<Anthropic.Message>> {
try {
return { ok: true, value: await client.messages.create(args) };
} catch (err) {
return { ok: false, error: err as Error };
}
}
Cost estimator
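// USD per million tokens. Illustrative values; confirm against the current pricing page.
const PRICES = {
"claude-opus-4-7": { in: 15.0, out: 75.0 },
"claude-sonnet-4-6": { in: 3.0, out: 15.0 },
"claude-haiku-4-5": { in: 0.8, out: 4.0 },
} as const;
// Covers plain input and output tokens only; cache writes and reads are billed at different rates.
function estimateCost(model: keyof typeof PRICES, usage: Anthropic.Usage): number {
const p = PRICES[model];
return (usage.input_tokens * p.in + usage.output_tokens * p.out) / 1_000_000;
}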
See also
- Python SDK — same API in Python.
- Streaming — SSE event reference.
- Tool use — schema, agentic loops, tool_choice.
- Batch API — bulk processing at 50% cost.
- Prompt caching — TTL and cost math.
- Files API — upload once, reference many.