cheat sheet
ai
Package-level reference for the Vercel AI SDK — streamText, generateObject, tool calling, structured output, and the multi-provider model interface.
What it is
ai is Vercel's AI SDK — a provider-agnostic toolkit for calling LLMs from TypeScript with a uniform API for streaming text, generating structured objects, and tool calling. Provider packages (@ai-sdk/openai, @ai-sdk/anthropic, @ai-sdk/google, @ai-sdk/mistral, @ai-sdk/groq, @ai-sdk/xai, …) plug into the same streamText / generateText / generateObject interface.
Reach for ai when you want one SDK that targets every major LLM provider with first-class streaming and tool calling, and especially when building Next.js / Hono / Express apps where the SDK's React/Vue/Svelte client hooks save boilerplate. Reach for provider SDKs directly (openai, @anthropic-ai/sdk) when you need provider-specific features that haven't landed in the AI SDK yet.
Install
You always install ai plus at least one provider:
npm install ai @ai-sdk/openai
Output: added ai and @ai-sdk/openai to dependencies
pnpm add ai @ai-sdk/anthropic
Output: added ai and @ai-sdk/anthropic
yarn add ai @ai-sdk/google
Output: added ai and @ai-sdk/google
bun add ai @ai-sdk/openai @ai-sdk/anthropic
Output: installed ai and both provider packages
For the React UI hooks:
npm install @ai-sdk/react
Output: added @ai-sdk/react to dependencies
Versioning & Node support
The current line is ai@5.x (released 2025) — a significant API consolidation after the ai@4.x and earlier ai@3.x lines. The SDK moves quickly; minor releases ship frequently.
- Node 18+ recommended. Runs in Cloudflare Workers, Vercel Edge, Deno, and Bun.
- ESM-first; CJS supported via Node's interop.
- TypeScript types bundled.
- Major releases (3 → 4 → 5) have renamed and reshaped public APIs. Migration guides on ai-sdk.dev cover each step.
Package metadata
- Maintainer: Vercel
- Project home: github.com/vercel/ai
- Docs: ai-sdk.dev
- npm: npmjs.com/package/ai
- License: Apache 2.0
- First released: 2023
- Downloads: millions weekly — one of the fastest-growing AI client libraries.
Peer dependencies & extras
ai has minimal peer dependencies. Provider, framework, and companion packages install alongside it:
- @ai-sdk/openai: OpenAI, including reasoning models
- @ai-sdk/anthropic: Claude models
- @ai-sdk/google, @ai-sdk/google-vertex: Gemini
- @ai-sdk/mistral, @ai-sdk/groq, @ai-sdk/xai, @ai-sdk/cohere, @ai-sdk/perplexity, @ai-sdk/openai-compatible: additional providers and OpenAI-compatible endpoints
- @ai-sdk/react, @ai-sdk/vue, @ai-sdk/svelte: client framework hooks
- zod: used by generateObject and tool definitions (peer)
- @vercel/blob, @vercel/kv: common companions for storing chat history
Alternatives
| Package | Trade-off |
|---|---|
| openai (official SDK) | Provider-specific. Best when you only need OpenAI features. |
| @anthropic-ai/sdk | Provider-specific. Best Claude support. |
| langchain | Higher-level orchestration; larger surface. Vercel AI SDK is more focused. |
| llamaindex | RAG-focused. Different layer of the stack. |
| agents (Cloudflare) | Workers-native agent framework, integrates with the AI SDK. |
| Raw fetch | Zero deps, you re-implement streaming and tool-call parsing yourself. |
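To make the "Raw fetch" row concrete, here is a minimal sketch of the stream parsing you would own without the SDK. It assumes an OpenAI-style server-sent-event body in which each event is a single `data: <json>` line and the stream ends with `data: [DONE]`. The helper name `extractDeltas` and the `ChatChunk` shape are illustrative, and tool-call deltas are not handled.

```typescript
// Hypothetical helper: pulls text deltas out of an OpenAI-style SSE buffer.
// Assumes one `data: <json>` line per event and `data: [DONE]` as the terminator.
interface ChatChunk {
  choices: { delta: { content?: string } }[];
}

function extractDeltas(buffer: string): { deltas: string[]; rest: string; done: boolean } {
  const lines = buffer.split("\n");
  // The last element may be a partial line; hand it back for the next network chunk.
  const rest = lines.pop() ?? "";
  const deltas: string[] = [];
  let done = false;

  for (const raw of lines) {
    const line = raw.trim();
    if (!line.startsWith("data:")) continue; // skip blank separators and comments
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload) as ChatChunk;
    const content = chunk.choices[0]?.delta.content;
    if (content) deltas.push(content);
  }
  return { deltas, rest, done };
}
```

A caller would append each decoded network chunk to `rest`, call `extractDeltas` again, and stop once `done` is true. Error events, retries, and tool-call assembly would all be extra code on top of this.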
Real-world recipes
generateText — single response
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const { text } = await generateText({
  model: openai("gpt-4o-mini"),
  prompt: "Write a haiku about TypeScript.",
});
console.log(text);
Output: prints the model's haiku as a single string.
streamText — streaming response
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

const result = streamText({
  model: openai("gpt-4o-mini"),
  prompt: "Explain TCP slow-start in three short paragraphs.",
});
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
Output: prints tokens as they arrive; backpressure-aware async iteration.
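The loop above works because textStream can be consumed as an async iterable of string chunks. The sketch below shows the same consumption pattern against a stand-in generator, so it runs without a provider or API key. `fakeTextStream` and `collectText` are hypothetical names, not SDK exports.

```typescript
// Stand-in for result.textStream: yields chunks the way a model stream would.
async function* fakeTextStream(chunks: string[]): AsyncGenerator<string> {
  for (const chunk of chunks) {
    // Yield to the event loop between chunks to mimic network arrival.
    await Promise.resolve();
    yield chunk;
  }
}

// Consumes any AsyncIterable<string>, forwarding each chunk and returning the full text.
async function collectText(
  stream: AsyncIterable<string>,
  onChunk: (chunk: string) => void = () => {},
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    onChunk(chunk); // e.g. write to stdout or push to a UI
    full += chunk; // the next chunk is only pulled once this iteration finishes
  }
  return full;
}
```

Because `for await` pulls one chunk at a time, a slow consumer naturally slows the producer, which is the backpressure behaviour the output note refers to.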
generateObject — structured output
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const { object } = await generateObject({
  model: openai("gpt-4o-mini"),
  schema: z.object({
    title: z.string(),
    tags: z.array(z.string()).max(5),
  }),
  prompt: "Suggest a title and tags for a blog post about Postgres indexing.",
});
console.log(object.title, object.tags);
Output: returns a typed object validated against the schema. If the model's reply cannot be parsed or fails validation, the call throws NoObjectGeneratedError; maxRetries covers failed API requests, not schema mismatches.
Tool calling
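import { generateText, tool, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const result = await generateText({
  model: anthropic("claude-3-5-sonnet-latest"),
  prompt: "What is the weather in London?",
  // Allow a second step so the model can answer after the tool runs (ai@4: maxSteps: 2).
  stopWhen: stepCountIs(2),
  tools: {
    getWeather: tool({
      description: "Get current weather for a city",
      inputSchema: z.object({ city: z.string() }), // ai@4 called this `parameters`
      // Stubbed result; a real tool would look up `city`.
      execute: async ({ city }) => ({ city, tempC: 12, conditions: "rain" }),
    }),
  },
});
console.log(result.text);
Output: the model calls getWeather({ city: "London" }), the SDK executes the callback, and the model continues with the tool result. With the default single step, the call ends once the tool has run and result.text is usually empty.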
Chat with message history (Next.js + React hook)
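// app/api/chat/route.ts
import { streamText, convertToModelMessages, type UIMessage } from "ai";
import { openai } from "@ai-sdk/openai";

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();
  const result = streamText({
    model: openai("gpt-4o-mini"),
    // ai@4 named this helper convertToCoreMessages.
    messages: convertToModelMessages(messages),
  });
  // ai@4 named this toDataStreamResponse().
  return result.toUIMessageStreamResponse();
}

// app/chat/page.tsx
"use client";
import { useState } from "react";
import { useChat } from "@ai-sdk/react";

export default function Chat() {
  // In ai@5 the hook no longer owns the input field; keep it in local state.
  const [input, setInput] = useState("");
  const { messages, sendMessage } = useChat();
  return (
    <form
      onSubmit={(e) => {
        e.preventDefault();
        sendMessage({ text: input });
        setInput("");
      }}
    >
      <ul>
        {messages.map((m) => (
          <li key={m.id}>
            {m.role}: {m.parts.map((p) => (p.type === "text" ? p.text : "")).join("")}
          </li>
        ))}
      </ul>
      <input value={input} onChange={(e) => setInput(e.target.value)} />
    </form>
  );
}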
Output: the API streams tokens; the React hook updates the message list incrementally.
Multi-step tool use
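stopWhen lets the model chain multiple tool calls in a single conversation turn; stepCountIs(5) caps the loop at five steps (ai@4 expressed this as maxSteps: 5).
import { generateText, tool, stepCountIs } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const { text } = await generateText({
  model: openai("gpt-4o"),
  prompt: "Find the population of France and divide by 1,000,000.",
  stopWhen: stepCountIs(5),
  tools: {
    search: tool({
      description: "Look up a fact (stubbed for the example)",
      inputSchema: z.object({ q: z.string() }),
      execute: async () => "67 million",
    }),
    calculate: tool({
      description: "Divide a by b",
      // Structured numbers instead of eval on a model-supplied string.
      inputSchema: z.object({ a: z.number(), b: z.number() }),
      execute: async ({ a, b }) => a / b,
    }),
  },
});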
Output: model calls search, then calculate, then produces the final answer.
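The step limit is easier to reason about with the loop written out. What follows is a conceptual sketch only, not the SDK's implementation: a scripted stand-in plays the model, and every name in it (`ModelTurn`, `runSteps`, `scriptedModel`, `demoTools`) is hypothetical.

```typescript
// Conceptual sketch of a step-capped tool loop. NOT the SDK's internals.
type ModelTurn =
  | { type: "tool-call"; toolName: string; input: Record<string, unknown> }
  | { type: "text"; text: string };

type Tool = (input: Record<string, unknown>) => Promise<unknown>;
type Model = (history: string[]) => Promise<ModelTurn>;

async function runSteps(
  model: Model,
  tools: Record<string, Tool>,
  prompt: string,
  maxSteps: number,
): Promise<{ text: string; steps: number }> {
  const history: string[] = [`user: ${prompt}`];
  for (let step = 1; step <= maxSteps; step++) {
    const turn = await model(history);
    if (turn.type === "text") {
      return { text: turn.text, steps: step }; // model chose to stop calling tools
    }
    const tool = tools[turn.toolName];
    if (!tool) throw new Error(`Unknown tool: ${turn.toolName}`);
    const output = await tool(turn.input);
    // Feed the tool result back so the next step can use it.
    history.push(`tool ${turn.toolName}: ${JSON.stringify(output)}`);
  }
  return { text: "", steps: maxSteps }; // step budget exhausted before a final answer
}

// Scripted stand-in for a model: search, then calculate, then answer.
const scriptedModel: Model = async (history) => {
  const toolResults = history.filter((h) => h.startsWith("tool ")).length;
  if (toolResults === 0) {
    return { type: "tool-call", toolName: "search", input: { q: "population of France" } };
  }
  if (toolResults === 1) {
    return { type: "tool-call", toolName: "calculate", input: { a: 67000000, b: 1000000 } };
  }
  return { type: "text", text: "About 67." };
};

const demoTools: Record<string, Tool> = {
  search: async () => "67 million",
  calculate: async (input) => Number(input.a) / Number(input.b),
};
```

With a budget of five steps the scripted run finishes on step three. With a budget of two it stops after the second tool call and returns empty text, which is the symptom to look for when a step cap is set too low.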
Reasoning model output
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

// ai@5 exposes the reasoning as text via reasoningText (ai@4: reasoning).
const { text, reasoningText } = await generateText({
  model: openai("o3-mini"),
  prompt: "Solve: if x + 5 = 12, what is x?",
});
console.log("Answer:", text);
console.log("Reasoning:", reasoningText);
Output: prints the answer and, when the provider returns one, the model's reasoning text; reasoningText is undefined for models or configurations that do not expose reasoning.
Production deployment
The AI SDK runs anywhere modern JavaScript runs.
- Vercel. First-class. The SDK and @ai-sdk/react are designed for Next.js API routes.
- Cloudflare Workers / Pages. Works directly. Long-running streams need c.executionCtx.waitUntil (Hono) or equivalent to keep the worker alive.
- Bun / Deno / Node. Plain ESM: install and run.
- Edge runtime. Streams use Web ReadableStream, so they are fully compatible with Cloudflare and Vercel Edge.
Key deployment concerns:
- API keys. Provider keys go in environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY). Never embed them in client bundles.
- Timeouts. Streaming responses can run for minutes. Configure your platform's max duration (Vercel: maxDuration export; Cloudflare: paid plans allow >30 s wall clock).
- Rate limits. Put a rate limiter (Upstash Ratelimit, hono-rate-limiter) in front of AI endpoints, since every call costs real money.
- Logging. Capture token counts via result.usage for cost tracking.
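As an illustration of the rate-limit point, here is a minimal fixed-window limiter. It is a sketch under stated assumptions: `FixedWindowLimiter` is a hypothetical class, state lives in one process's memory, and a real deployment across several instances would need shared storage such as the hosted limiters named above.

```typescript
// Hypothetical in-memory fixed-window limiter; single-process only.
class FixedWindowLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the caller identified by `key` may proceed at time `now` (ms).
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 }); // start a fresh window
      return true;
    }
    if (entry.count >= this.limit) return false; // over budget for this window
    entry.count += 1;
    return true;
  }
}
```

In a route handler the check would run before any model call, keyed by user id or IP, and a `false` result would map to an HTTP 429 response.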
Performance tuning
- Stream by default. Use streamText over generateText whenever the user sees output incrementally; perceived latency drops because the first tokens render while the rest are still generating.
- Output-token cap. Set maxOutputTokens (maxTokens in ai@4) on every call to bound cost and latency.
- Caching. Use prompt caching where the provider supports it. Anthropic cache control is set through providerOptions, while OpenAI caches long repeated prompt prefixes automatically. Repeated system prompts cache cheaply.
- Parallel tool calls. The SDK can dispatch multiple tools in parallel within one step, so keep tools independent so they parallelise.
- Pick the right model. gpt-4o-mini, claude-haiku, gemini-flash are far cheaper than the flagship models and often sufficient.
- Reuse the model factory. openai("gpt-4o") is cheap, but pulling the provider import inside a hot handler still costs. Hoist to module scope.
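The cost side of these tips comes down to simple arithmetic. The sketch below assumes usage is reported as input and output token counts (ai@5 names them inputTokens and outputTokens; ai@4 used promptTokens and completionTokens). The helper names are hypothetical and prices are supplied by the caller, since none are given here.

```typescript
// Hypothetical helpers. Prices are caller-supplied in USD per million tokens;
// look up current values from your provider, none are hard-coded here.
interface PricePerMillion {
  input: number;
  output: number;
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Actual cost of a finished call, computed from reported token usage.
function costUsd(usage: Usage, price: PricePerMillion): number {
  return (usage.inputTokens * price.input + usage.outputTokens * price.output) / 1_000_000;
}

// Upper bound on a call's cost before it runs, given an output-token cap.
function maxCostUsd(inputTokens: number, maxOutputTokens: number, price: PricePerMillion): number {
  return costUsd({ inputTokens, outputTokens: maxOutputTokens }, price);
}
```

Comparing `maxCostUsd` across two candidate models for the same prompt size is a quick way to check whether the cheaper model is worth trying first.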
Version migration guide
The AI SDK has had several majors. The most consequential recent jumps:
| From | To | Key changes |
|---|---|---|
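| ai@3 | ai@4 | Tool calling redesigned; introduction of convertToCoreMessages; cleaner streaming response objects. |
| ai@4 | ai@5 | Tool definitions, message types, and provider interfaces consolidated (parameters → inputSchema, maxSteps → stopWhen, maxTokens → maxOutputTokens, convertToCoreMessages → convertToModelMessages). useChat reworked around sendMessage and message parts. Reasoning model support. |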
Before (older API):
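import { OpenAIStream, StreamingTextResponse } from "ai";
import OpenAI from "openai";

const openai = new OpenAI();
// Inside a route handler; `messages` comes from the request body.
const stream = OpenAIStream(
  await openai.chat.completions.create({ model: "gpt-4", stream: true, messages }),
);
return new StreamingTextResponse(stream);
After (current ai@5):
import { streamText, convertToModelMessages } from "ai";
import { openai } from "@ai-sdk/openai";

// Inside the same handler; `messages` are the UI messages sent by useChat.
const result = streamText({ model: openai("gpt-4o"), messages: convertToModelMessages(messages) });
return result.toUIMessageStreamResponse(); // ai@4: toDataStreamResponse()
Output: the same incremental streaming through the modern, provider-agnostic surface. The wire format differs from the old text stream, so upgrade the client hook in the same change.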
Migration checklist:
- Read the release notes for each major you skip — there is no codemod that covers every transition.
- Replace provider-SDK-direct calls with @ai-sdk/<provider> factories.
- Replace OpenAIStream/AnthropicStream helpers with streamText + toUIMessageStreamResponse (toDataStreamResponse in ai@4).
- Update useChat/useCompletion consumers; the message shape evolved (ai@5 messages carry parts instead of a content string).
- Update tool definitions to match the current tool({ inputSchema, execute }) shape (parameters in ai@4).
Security considerations
- API keys never reach the client. The SDK is designed to be called from the server; client hooks (useChat) talk to your own API route, which holds the key.
- Prompt injection. Any tool that reads user-controlled input is a prompt-injection vector. Treat tool outputs that include user data as untrusted; sanitise before showing to other users.
- Tool execution is real code. A tool that calls eval or shells out is RCE-by-design if the model decides to call it on attacker input. Validate parameters with the schema and constrain the action.
- PII in prompts. Logs of model calls often include the full prompt. Strip PII before logging; respect provider data-use policies.
- Token budget abuse. A user submitting a 100k-token prompt costs real money. Cap messages length on the server.
- System-prompt leakage. Models leak system prompts when asked. Don't put secrets in system prompts.
- CORS. Streaming endpoints called from cross-origin clients need explicit CORS headers; the SDK's response helpers accept custom headers, but verify the result from the browser.
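For the token-budget item, a server-side guard can be as small as the sketch below. The message shape, the limits, and the name `capMessages` are all illustrative assumptions. Character count stands in as a rough proxy for tokens, and dropping client-supplied system messages is a design choice of this sketch rather than something the SDK does.

```typescript
// Hypothetical server-side guard; the limits and the message shape are illustrative.
interface IncomingMessage {
  role: "user" | "assistant" | "system";
  text: string;
}

interface CapOptions {
  maxMessages: number; // keep only the most recent N messages (must be >= 1)
  maxTotalChars: number; // rough proxy for a token budget
}

function capMessages(messages: IncomingMessage[], opts: CapOptions): IncomingMessage[] {
  // Drop client-supplied system messages; the server is expected to add its own.
  const recent = messages.filter((m) => m.role !== "system").slice(-opts.maxMessages);

  const total = recent.reduce((sum, m) => sum + m.text.length, 0);
  if (total > opts.maxTotalChars) {
    throw new RangeError(`Conversation too large: ${total} chars > ${opts.maxTotalChars}`);
  }
  return recent;
}
```

Calling this on the parsed request body before any model call means an oversized conversation is rejected without spending tokens.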
Testing & CI integration
Mock the model in unit tests
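// route.test.ts
// Uses the ai@5 mock model from "ai/test" (ai@4 ships MockLanguageModelV1 with a different chunk shape).
import { it, expect } from "vitest";
import { streamText, simulateReadableStream } from "ai";
import { MockLanguageModelV2 } from "ai/test";

it("returns a stream", async () => {
  // The mock model replays fixed stream parts instead of calling a provider.
  const model = new MockLanguageModelV2({
    doStream: async () => ({
      stream: simulateReadableStream({
        chunks: [
          { type: "text-start", id: "t1" },
          { type: "text-delta", id: "t1", delta: "hello" },
          { type: "text-end", id: "t1" },
          { type: "finish", finishReason: "stop", usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 } },
        ],
      }),
    }),
  });

  const result = streamText({ model, prompt: "x" });
  let text = "";
  for await (const chunk of result.textStream) text += chunk;
  expect(text).toBe("hello");
});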
Output: test passes without hitting any real provider.
Snapshot the structured output
import { generateObject } from "ai";
// In CI, prefer recording fixtures or using a fake model rather than calling the live API.
Output: integration tests against live models belong behind a feature flag — they cost real money per run.
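One way to follow the "fake model" advice without any SDK-specific mocking is to inject the model call. The sketch below is a hedged illustration: `suggestMeta`, `GenerateFn`, and `makeFakeModel` are hypothetical application-level names, and in production `generate` would wrap whichever SDK call the app uses.

```typescript
// Hypothetical application code: the model call is injected so tests can replace it.
type GenerateFn = (prompt: string) => Promise<string>;

interface PostMeta {
  title: string;
  tags: string[];
}

async function suggestMeta(topic: string, generate: GenerateFn): Promise<PostMeta> {
  const raw = await generate(`Suggest a title and tags for a blog post about ${topic}. Reply as JSON.`);
  const parsed = JSON.parse(raw) as Partial<PostMeta>;
  if (typeof parsed.title !== "string" || !Array.isArray(parsed.tags)) {
    throw new Error("Model reply did not match the expected shape");
  }
  return { title: parsed.title, tags: parsed.tags.slice(0, 5) };
}

// Fake model for tests: returns a recorded fixture and remembers the prompts it saw.
function makeFakeModel(fixture: string): { generate: GenerateFn; prompts: string[] } {
  const prompts: string[] = [];
  return {
    prompts,
    generate: async (prompt) => {
      prompts.push(prompt);
      return fixture;
    },
  };
}
```

A snapshot test can then assert on the returned object and on the recorded prompts, and the same function runs against a live model only when the feature flag for paid tests is on.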
Ecosystem integrations
| Package | Role |
|---|---|
| @ai-sdk/<provider> | Provider adapters (OpenAI, Anthropic, Google, Mistral, Groq, xAI, Cohere, Perplexity, …) |
| @ai-sdk/react / @ai-sdk/vue / @ai-sdk/svelte | Framework client hooks |
| zod | Schema for structured output and tool params |
| @vercel/blob, @vercel/kv | Storage for chat history, attachments |
| langfuse, helicone, braintrust | Observability / evaluation |
| agents (Cloudflare Agents SDK) | Durable agents that wrap the AI SDK |
Troubleshooting common errors
Invalid API key: the environment variable is not set, or is not exposed to the runtime. On Vercel, set it in project settings; on Cloudflare, run wrangler secret put OPENAI_API_KEY.
Tool call failed validation: the model produced JSON that doesn't match the zod schema. Tighten the schema or simplify the tool description.
Stream hangs at start: the model is rate-limited or the API endpoint is wrong. Check provider status and inspect HTTP-level errors with onError.
Cannot find module '@ai-sdk/openai': installed ai but not the provider. Install the provider package.
Step limit not reached (maxSteps in ai@4, stopWhen in ai@5): the model decided no more tools were needed. Inspect result.steps to confirm.
The model returned object that does not match the schema: the model produced JSON that fails parsing or schema validation. Simplify the schema or switch to a stronger model; maxRetries covers failed API requests, not schema mismatches.
React hook re-renders excessively: keys on messages must be stable. Use the message id as the key; using the array index causes re-render churn during streaming.
When NOT to use this
- You need a provider-specific feature not yet in the SDK. Use the provider SDK directly until the AI SDK catches up.
- Bundle-size-critical browser code. The SDK + a provider adapter adds ~20–30 KB gzipped. For tiny widgets, hand-roll the fetch call.
- You need full LangChain-style chains / agents. LangChain has more orchestration primitives; the AI SDK is intentionally smaller-surface.
- Vendor-neutrality is a hard requirement. Even though the SDK is open source and provider-agnostic, the streaming response format is opinionated; verify it fits your stack.
See also
- Concept: api — request/response patterns and OpenAI-style chat APIs
- Concept: agents — LLM agents, tool use, multi-step orchestration