cheat sheet

ai

Package-level reference for the Vercel AI SDK — streamText, generateObject, tool calling, structured output, and the multi-provider model interface.

#npm #package #ai #llm #vercel
Updated 05-31-2026

What it is

ai is Vercel's AI SDK — a provider-agnostic toolkit for calling LLMs from TypeScript with a uniform API for streaming text, generating structured objects, and tool calling. Provider packages (@ai-sdk/openai, @ai-sdk/anthropic, @ai-sdk/google, @ai-sdk/mistral, @ai-sdk/groq, @ai-sdk/xai, …) plug into the same streamText / generateText / generateObject interface.

Reach for ai when you want one SDK that targets every major LLM provider with first-class streaming and tool calling, and especially when building Next.js / Hono / Express apps where the SDK's React/Vue/Svelte client hooks save boilerplate. Reach for provider SDKs directly (openai, @anthropic-ai/sdk) when you need provider-specific features that haven't landed in the AI SDK yet.

Install

You always install ai plus at least one provider:

bash
npm install ai @ai-sdk/openai

Output: added ai and @ai-sdk/openai to dependencies

bash
pnpm add ai @ai-sdk/anthropic

Output: added ai and @ai-sdk/anthropic

bash
yarn add ai @ai-sdk/google

Output: added ai and @ai-sdk/google

bash
bun add ai @ai-sdk/openai @ai-sdk/anthropic

Output: installed ai and both provider packages

For the React UI hooks:

bash
npm install @ai-sdk/react

Output: added @ai-sdk/react to dependencies

Versioning & Node support

The current line is ai@5.x (released 2025) — a significant API consolidation after the ai@4.x and earlier ai@3.x lines. The SDK moves quickly; minor releases ship frequently.

  • Node 18+ recommended. Runs in Cloudflare Workers, Vercel Edge, Deno, and Bun.
  • ESM-first; CJS supported via Node's interop.
  • TypeScript types bundled.
  • Major releases (3 → 4 → 5) have renamed and restructured APIs. Migration guides on ai-sdk.dev cover each step.

Package metadata

  • Maintainer: Vercel
  • Project home: github.com/vercel/ai
  • Docs: ai-sdk.dev
  • npm: npmjs.com/package/ai
  • License: Apache 2.0
  • First released: 2023
  • Downloads: millions weekly — one of the fastest-growing AI client libraries.

Peer dependencies & extras

ai has minimal peer deps. These packages are commonly installed alongside it:

  • @ai-sdk/openai — OpenAI, including reasoning models
  • @ai-sdk/anthropic — Claude models
  • @ai-sdk/google, @ai-sdk/google-vertex — Gemini
  • @ai-sdk/mistral, @ai-sdk/groq, @ai-sdk/xai, @ai-sdk/cohere, @ai-sdk/perplexity, @ai-sdk/openai-compatible
  • @ai-sdk/react, @ai-sdk/vue, @ai-sdk/svelte — client framework hooks
  • zod — used by generateObject and tool definitions (peer)
  • @vercel/blob, @vercel/kv — common companions for storing chat history

Alternatives

| Package | Trade-off |
| --- | --- |
| openai (official SDK) | Provider-specific. Best when you only need OpenAI features. |
| @anthropic-ai/sdk | Provider-specific. Best Claude support. |
| langchain | Higher-level orchestration; larger surface. Vercel AI SDK is more focused. |
| llamaindex | RAG-focused. Different layer of the stack. |
| agents (Cloudflare) | Workers-native agent framework, integrates with the AI SDK. |
| Raw fetch | Zero deps, you re-implement streaming and tool-call parsing yourself. |
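
To make the "Raw fetch" trade-off concrete, here is a minimal sketch of the stream parsing you would own. It assumes an OpenAI-style chat completions stream (server-sent events, one JSON payload per data: line, terminated by data: [DONE]) that has already been read into a string. extractTextDeltas is a hypothetical helper, not an SDK function.

```typescript
// Extracts text deltas from a buffered OpenAI-style SSE body.
// Assumes each event is a single "data: <json>" line and the stream ends
// with "data: [DONE]"; real code must also handle events split across chunks.
function extractTextDeltas(sseBody: string): string[] {
  const deltas: string[] = [];
  for (const line of sseBody.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank lines and comments
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") break;
    const event = JSON.parse(payload) as {
      choices?: { delta?: { content?: string | null } }[];
    };
    const content = event.choices?.[0]?.delta?.content;
    if (content) deltas.push(content);
  }
  return deltas;
}

// Hand-written sample stream for illustration.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

console.log(extractTextDeltas(sample).join("")); // prints "Hello"
```

A production parser would also buffer partial lines across network chunks and assemble tool-call deltas, which is the work the SDK takes on.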

Real-world recipes

generateText — single response

typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const { text } = await generateText({
  model: openai("gpt-4o-mini"),
  prompt: "Write a haiku about TypeScript.",
});
console.log(text);

Output: prints the model's haiku as a single string.

streamText — streaming response

typescript
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

const result = streamText({
  model: openai("gpt-4o-mini"),
  prompt: "Explain TCP slow-start in three short paragraphs.",
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

Output: prints tokens as they arrive; backpressure-aware async iteration.

generateObject — structured output

typescript
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const { object } = await generateObject({
  model: openai("gpt-4o-mini"),
  schema: z.object({
    title: z.string(),
    tags: z.array(z.string()).max(5),
  }),
  prompt: "Suggest a title and tags for a blog post about Postgres indexing.",
});
console.log(object.title, object.tags);

Output: returns a typed object validated against the schema. If the model's output cannot be parsed or fails validation, the call throws instead of returning a partial object.

Tool calling

typescript
import { generateText, tool } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const result = await generateText({
  model: anthropic("claude-3-5-sonnet-latest"),
  prompt: "What is the weather in London?",
  tools: {
    getWeather: tool({
      description: "Get current weather for a city",
      inputSchema: z.object({ city: z.string() }), // named "parameters" in ai@4
      execute: async ({ city }) => ({ tempC: 12, conditions: "rain" }),
    }),
  },
});
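console.log(result.toolResults);

Output: the model calls getWeather({ city: "London" }) and the SDK executes the callback. By default generation stops after that step, so result.text can be empty; read result.toolResults, or allow further steps (see Multi-step tool use) so the model continues with the tool result.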

Chat with message history (Next.js + React hook)

typescript
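// app/api/chat/route.ts
import { streamText, convertToModelMessages, type UIMessage } from "ai";
import { openai } from "@ai-sdk/openai";

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();
  const result = streamText({
    model: openai("gpt-4o-mini"),
    // Convert the UI messages sent by useChat into model messages (ai@5).
    messages: convertToModelMessages(messages),
  });
  return result.toUIMessageStreamResponse();
}

tsx
// app/chat/page.tsx
"use client";
import { useState } from "react";
import { useChat } from "@ai-sdk/react";

export default function Chat() {
  // In ai@5 the hook no longer manages input state, so keep it locally.
  const [input, setInput] = useState("");
  const { messages, sendMessage } = useChat(); // posts to /api/chat by default
  return (
    <form
      onSubmit={(e) => {
        e.preventDefault();
        sendMessage({ text: input });
        setInput("");
      }}
    >
      <ul>{messages.map((m) => (
        <li key={m.id}>{m.role}: {m.parts.map((p) => (p.type === "text" ? p.text : "")).join("")}</li>
      ))}</ul>
      <input value={input} onChange={(e) => setInput(e.target.value)} />
    </form>
  );
}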

Output: the API streams tokens; the React hook updates the message list incrementally.

Multi-step tool use

stopWhen lets the model chain multiple tool calls in a single generateText call. In ai@5 it replaces the ai@4 maxSteps option.

typescript
import { generateText, tool, stepCountIs } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const { text } = await generateText({
  model: openai("gpt-4o"),
  prompt: "Find the population of France and divide by 1,000,000.",
  stopWhen: stepCountIs(5), // stop after at most 5 steps
  tools: {
    search: tool({
      description: "Look up a fact",
      inputSchema: z.object({ q: z.string() }),
      execute: async () => "67 million", // stubbed result
    }),
    calculate: tool({
      description: "Divide a by b",
      inputSchema: z.object({ a: z.number(), b: z.number() }),
      execute: async ({ a, b }) => a / b, // typed arithmetic instead of eval
    }),
  },
});
console.log(text);

Output: model calls search, then calculate, then produces the final answer.
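
A calculator tool that accepts free-form expressions should not hand model-supplied text to eval. The sketch below is one possible restricted evaluator, assuming the tool only needs +, -, *, /, parentheses, and decimal numbers. safeCalculate is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper: evaluates + - * / and parentheses without eval.
function safeCalculate(expr: string): number {
  // Reject anything that is not a digit, decimal point, operator, parenthesis, or space.
  if (!/^[\d+\-*/().\s]+$/.test(expr)) {
    throw new Error("Unsupported characters in expression");
  }
  const tokens: string[] = expr.match(/\d+(?:\.\d+)?|[+\-*/()]/g) ?? [];
  // Every non-space character must belong to a token (catches stray dots).
  if (tokens.join("") !== expr.replace(/\s+/g, "")) {
    throw new Error("Malformed number in expression");
  }
  let pos = 0;

  // expression := term (("+" | "-") term)*
  function parseExpression(): number {
    let value = parseTerm();
    while (tokens[pos] === "+" || tokens[pos] === "-") {
      const op = tokens[pos++];
      const rhs = parseTerm();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  // term := factor (("*" | "/") factor)*
  function parseTerm(): number {
    let value = parseFactor();
    while (tokens[pos] === "*" || tokens[pos] === "/") {
      const op = tokens[pos++];
      const rhs = parseFactor();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }

  // factor := number | "-" factor | "(" expression ")"
  function parseFactor(): number {
    const token = tokens[pos++];
    if (token === undefined) throw new Error("Unexpected end of expression");
    if (token === "-") return -parseFactor();
    if (token === "(") {
      const value = parseExpression();
      if (tokens[pos++] !== ")") throw new Error("Missing closing parenthesis");
      return value;
    }
    const value = Number(token);
    if (Number.isNaN(value)) throw new Error(`Unexpected token: ${token}`);
    return value;
  }

  const result = parseExpression();
  if (pos !== tokens.length) throw new Error("Unexpected trailing input");
  if (!Number.isFinite(result)) throw new Error("Result is not a finite number");
  return result;
}

console.log(safeCalculate("67000000 / 1000000")); // prints 67
```

A tool's execute callback could call such a helper and return the thrown message as the tool result, so the model can correct its input rather than crash the request.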

Reasoning model output

typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const { text, reasoningText } = await generateText({
  model: openai("o3-mini"),
  prompt: "Solve: if x + 5 = 12, what is x?",
});
console.log("Answer:", text);
console.log("Reasoning:", reasoningText); // undefined when no reasoning is returned

Output: prints the answer and, when the provider returns one, the model's reasoning text (reasoningText in ai@5; reasoning in ai@4). Models or configurations that emit no reasoning leave it undefined.

Production deployment

The AI SDK runs anywhere modern JavaScript runs.

  • Vercel. First-class. The SDK and @ai-sdk/react are designed for Next.js API routes.
  • Cloudflare Workers / Pages. Works directly. Long-running streams need c.executionCtx.waitUntil (Hono) or equivalent to keep the worker alive.
  • Bun / Deno / Node. Plain ESM — install and run.
  • Edge runtime. Streams use Web ReadableStream — fully compatible with Cloudflare and Vercel Edge.

Key deployment concerns:

  • API keys. Provider keys go in environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY). Never embed in client bundles.
  • Timeouts. Streaming responses can run for minutes. Configure your platform's max-duration (Vercel: maxDuration export; Cloudflare: paid plans allow >30 s wall clock).
  • Rate limits. Use a rate limiter (Upstash Ratelimit, hono-rate-limiter) in front of AI endpoints — they cost real money per call.
  • Logging. Capture token counts via result.usage for cost tracking.
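
Cost tracking from result.usage can be sketched as a small pure function. This assumes the ai@5 usage shape with inputTokens and outputTokens (ai@4 used promptTokens and completionTokens). estimateCostUsd and the price table are hypothetical, and the prices are placeholders rather than real provider rates.

```typescript
// Token usage as reported on a result; fields may be undefined
// if the provider omits them.
interface TokenUsage {
  inputTokens?: number;
  outputTokens?: number;
}

// Hypothetical price table in USD per one million tokens.
// These numbers are placeholders, not real provider prices.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

function estimateCostUsd(usage: TokenUsage, pricing: ModelPricing): number {
  const input = usage.inputTokens ?? 0;
  const output = usage.outputTokens ?? 0;
  return (input * pricing.inputPerMillion + output * pricing.outputPerMillion) / 1_000_000;
}

const cost = estimateCostUsd(
  { inputTokens: 1_200, outputTokens: 300 },
  { inputPerMillion: 1, outputPerMillion: 4 },
);
console.log(cost.toFixed(6)); // prints "0.002400"
```

Logging this value next to a request ID makes it possible to attribute spend per endpoint or per user.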

Performance tuning

  • Stream by default. Use streamText over generateText whenever the user sees output incrementally — perceived latency drops dramatically.
  • maxOutputTokens cap. Set maxOutputTokens (maxTokens in ai@4) on every call to bound cost and latency.
  • Caching. Use prompt caching where the provider supports it. Anthropic cache breakpoints are set through providerOptions on messages; OpenAI caches eligible long prompts automatically. Repeated system prompts cache cheaply.
  • Parallel tool calls. The SDK can dispatch multiple tools in parallel within one step — keep tools independent so they parallelise.
  • Pick the right model. gpt-4o-mini, claude-haiku, gemini-flash are dramatically cheaper than the flagship models and often sufficient.
  • Reuse the model instance. Calling openai("gpt-4o") is cheap, but importing or configuring the provider inside a hot handler repeats work on every request. Create it once at module scope.

Version migration guide

The AI SDK has had several majors. The most consequential recent jumps:

| From | To | Key changes |
| --- | --- | --- |
| ai@3 | ai@4 | Tool calling redesigned; introduction of convertToCoreMessages; cleaner streaming response objects. |
| ai@4 | ai@5 | Tool definitions, message types, and provider interfaces consolidated. useChat API improvements. Reasoning model support. |

Before (older API):

typescript
import { OpenAIStream, StreamingTextResponse } from "ai";
import OpenAI from "openai";
const openai = new OpenAI();
const stream = OpenAIStream(await openai.chat.completions.create({ model: "gpt-4", stream: true, messages }));
return new StreamingTextResponse(stream);

After (current ai@5):

typescript
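import { streamText, convertToModelMessages } from "ai";
import { openai } from "@ai-sdk/openai";
// messages: the UIMessage[] sent by useChat
const result = streamText({ model: openai("gpt-4o"), messages: convertToModelMessages(messages) });
return result.toUIMessageStreamResponse();

Output: the same token-by-token streaming through the provider-agnostic surface; the response uses the SDK's UI message stream format, which useChat consumes.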

Migration checklist:

  1. Read the release notes for each major you skip — there is no codemod that covers every transition.
  2. Replace provider-SDK-direct calls with @ai-sdk/<provider> factories.
  3. Replace OpenAIStream / AnthropicStream helpers with streamText + toUIMessageStreamResponse (toDataStreamResponse in ai@4).
  4. Update useChat / useCompletion consumers — message shape evolved.
  5. Update tool definitions to the current tool({ inputSchema, execute }) shape (parameters in ai@4).

Security considerations

  • API keys never reach the client. The SDK is designed to be called from the server; client hooks (useChat) talk to your own API route, which holds the key.
  • Prompt injection. Any tool that reads user-controlled input is a prompt-injection vector. Treat tool outputs that include user data as untrusted; sanitise before showing to other users.
  • Tool execution is real code. A tool that calls eval or shells out is RCE-by-design if the model decides to call it on attacker input. Validate parameters with the schema and constrain the action.
  • PII in prompts. Logs of model calls often include the full prompt. Strip PII before logging; respect provider data-use policies.
  • Token budget abuse. A user submitting a 100k-token prompt costs real money. Cap messages length on the server.
  • System-prompt leakage. Models leak system prompts when asked. Don't put secrets in system prompts.
  • CORS. Streaming endpoints called from cross-origin clients need explicit CORS headers. The SDK's response helpers accept a headers option, but you must supply the CORS headers yourself and verify them in the browser.
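
One way to cap message length on the server, sketched under simple assumptions: messages are plain { role, content } objects, character count stands in for tokens, and the limits are arbitrary. capMessages is a hypothetical helper, not an SDK function.

```typescript
// Minimal message shape for illustration; real chat messages carry more fields.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical limits; tune them to your model's context window and budget.
const MAX_MESSAGES = 20;
const MAX_TOTAL_CHARS = 8_000;

// Keeps the most recent messages that fit within both limits.
// Throws if the latest message alone exceeds the character budget.
function capMessages(
  messages: ChatMessage[],
  maxMessages: number = MAX_MESSAGES,
  maxTotalChars: number = MAX_TOTAL_CHARS,
): ChatMessage[] {
  const recent = messages.slice(-maxMessages);
  const kept: ChatMessage[] = [];
  let total = 0;
  // Walk from newest to oldest so the latest turns survive truncation.
  for (let i = recent.length - 1; i >= 0; i--) {
    total += recent[i].content.length;
    if (total > maxTotalChars) break;
    kept.unshift(recent[i]);
  }
  if (messages.length > 0 && kept.length === 0) {
    throw new Error("Latest message exceeds the size limit");
  }
  return kept;
}
```

Calling this in the API route before the model call turns an oversized request into a cheap rejection or truncation instead of a billed prompt.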

Testing & CI integration

Mock the model in unit tests

typescript
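// route.test.ts
import { it, expect } from "vitest";
import { streamText, simulateReadableStream } from "ai";
import { MockLanguageModelV2 } from "ai/test"; // MockLanguageModelV1 in ai@4

// Fake model that streams a single text delta; no network access.
const model = new MockLanguageModelV2({
  doStream: async () => ({
    stream: simulateReadableStream({
      chunks: [
        { type: "text-start", id: "t1" },
        { type: "text-delta", id: "t1", delta: "hello" },
        { type: "text-end", id: "t1" },
        {
          type: "finish",
          finishReason: "stop",
          usage: { inputTokens: 1, outputTokens: 1, totalTokens: 2 },
        },
      ],
    }),
  }),
});

it("returns a stream", async () => {
  const result = streamText({ model, prompt: "x" });
  let text = "";
  // Collect every streamed chunk before asserting.
  for await (const chunk of result.textStream) text += chunk;
  expect(text).toBe("hello");
});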

Output: test passes without hitting any real provider.

Snapshot the structured output

typescript
import { generateObject } from "ai";
// In CI, prefer recording fixtures or using a fake model rather than calling the live API.

Output: integration tests against live models belong behind a feature flag — they cost real money per run.
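
Gating live-model tests behind a flag might look like the sketch below. The flag name RUN_LIVE_AI_TESTS and the helper shouldRunLiveTests are assumptions for illustration; OPENAI_API_KEY is the provider key variable named under Production deployment.

```typescript
// Hypothetical flag name; pick whatever your CI configuration uses.
const LIVE_TEST_FLAG = "RUN_LIVE_AI_TESTS";

// Live tests run only when the flag is explicitly "1" or "true"
// and a provider key is present.
function shouldRunLiveTests(env: Record<string, string | undefined>): boolean {
  const flag = (env[LIVE_TEST_FLAG] ?? "").toLowerCase();
  const enabled = flag === "1" || flag === "true";
  return enabled && Boolean(env["OPENAI_API_KEY"]);
}
```

With Vitest, one option is to wrap the live test in it.skipIf(!shouldRunLiveTests(process.env)) so CI skips it unless the flag is set.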

Ecosystem integrations

| Package | Role |
| --- | --- |
| @ai-sdk/<provider> | Provider adapters (OpenAI, Anthropic, Google, Mistral, Groq, xAI, Cohere, Perplexity, …) |
| @ai-sdk/react / @ai-sdk/vue / @ai-sdk/svelte | Framework client hooks |
| zod | Schema for structured output and tool params |
| @vercel/blob, @vercel/kv | Storage for chat history, attachments |
| langfuse, helicone, braintrust | Observability / evaluation |
| agents (Cloudflare Agents SDK) | Durable agents that wrap the AI SDK |

Troubleshooting common errors

Invalid API key: the environment variable is not set, or is not available in the runtime. On Vercel, set it in project settings; on Cloudflare, run wrangler secret put OPENAI_API_KEY.

Tool call failed validation: the model produced JSON that doesn't match the zod schema. Tighten the schema or simplify the tool description.

Stream hangs at start: the model is rate-limited or the API endpoint is wrong. Check provider status and inspect HTTP-level errors with onError.

Cannot find module '@ai-sdk/openai': you installed ai but not the provider. Install the provider package.

Step limit not reached: the model decided no more tools were needed before hitting the limit set with stopWhen: stepCountIs(n) (maxSteps in ai@4). Inspect result.steps to confirm.

The model returned an object that does not match the schema: the output failed validation and generateObject threw. maxRetries only retries failed API calls, so simplify the schema, add field descriptions, or switch to a stronger model.

React hook re-renders excessively: keys on messages must be stable; using the array index causes re-render churn during streaming.

When NOT to use this

  • You need a provider-specific feature not yet in the SDK. Use the provider SDK directly until the AI SDK catches up.
  • Bundle-size-critical browser code. The SDK + a provider adapter adds ~20–30 KB gzipped. For tiny widgets, hand-roll the fetch call.
  • You need full LangChain-style chains / agents. LangChain has more orchestration primitives; the AI SDK is intentionally smaller-surface.
  • Vendor-neutrality is a hard requirement. Even though the SDK is open source and provider-agnostic, the streaming response format is opinionated; verify it fits your stack.

See also

  • Concept: api — request/response patterns and OpenAI-style chat APIs
  • Concept: agents — LLM agents, tool use, multi-step orchestration