APIs
A versioned contract between two pieces of software — endpoints, verbs, payload shapes, errors, and auth — that decouples a caller from an implementation.
Definition
An API (Application Programming Interface) is a stable, versioned contract that lets one program call another without depending on its internals. In the engineering sense used across this site, an API is almost always a networked contract — REST/JSON over HTTP, gRPC over HTTP/2, GraphQL over HTTP, or a typed RPC layer like tRPC — that fixes the endpoints, verbs, payload schemas, error envelope, authentication, and rate-limit semantics that both sides have to honour. It is the thing that does not change while the implementation behind it churns.
Why it matters
The API is the boundary where teams, processes, languages, and trust domains meet. Get it right and clients can be written by anyone, in any language, against documentation alone; get it wrong and every release becomes a coordinated rollout. The choice of style is where most of the design tradeoffs live:
- REST over HTTP/JSON — universal client support, cache-friendly (CDNs cache GET responses by URL), and standard HTTP semantics for verbs, status codes, and headers. The default for public APIs (Stripe, GitHub, SendGrid) because every language already speaks it. Cost: over- and under-fetching is common; clients often need multiple round-trips.
- GraphQL — one endpoint, a query language that lets the client ask for exactly the fields it wants. Wins for complex UIs that aggregate many resources. Cost: HTTP caching is broken (everything is POST to /graphql), and N+1 query problems on the server are the rule, not the exception.
- gRPC (protobuf over HTTP/2) — binary payloads ~4× smaller than JSON, streaming in both directions, code-generated clients in 10+ languages. The default for service-to-service traffic inside a datacenter. Cost: browsers cannot speak it directly (gRPC-Web is a separate transport), and protobuf changes need disciplined versioning.
- tRPC — end-to-end TypeScript types with no schema language; the server's function signature is the client's type. Excellent for full-stack TypeScript monorepos (Next.js, Remix). Cost: TypeScript-only — useless as a public API.
- OpenAPI / AsyncAPI — not API styles but description languages: OpenAPI 3.1 describes synchronous HTTP APIs (now fully aligned with JSON Schema 2020-12), AsyncAPI 3.0 describes message-driven systems (Kafka, MQTT, WebSocket, SSE). Most production systems publish both.
Beyond style, the same five concerns recur on every API regardless of transport: idempotency, versioning, errors, rate limiting, and auth. Skipping any of them is what turns a working integration into a 3 a.m. page.
How it works
A well-behaved HTTP API is a small handful of disciplines that compose. RFC 9110 (the current HTTP semantics spec, published June 2022) is the foundation; everything below either implements it or extends it.
Verbs carry meaning. GET, HEAD, OPTIONS, TRACE, PUT, and DELETE are idempotent — repeating the same request has the same net effect as a single one, which is what makes safe retry-on-timeout possible. POST and PATCH are not idempotent; if a client retries a POST after a network blip it may create two resources. The standard fix is an Idempotency-Key header (popularised by Stripe): the server stores the result of the first request keyed by the header and replays it on subsequent retries within a TTL.
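As a rough illustration of the replay behaviour described above, here is a minimal in-memory sketch. The class name `IdempotencyStore`, the handler `create_charge`, and the 60-second TTL are assumptions made for the example, not any particular provider's implementation. A production version would need a shared durable store and would also have to deal with concurrent requests that carry the same key.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class IdempotencyStore:
    """In-memory replay cache keyed by the Idempotency-Key header value (illustrative)."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._results: Dict[str, Tuple[float, Any]] = {}

    def run(self, key: Optional[str], handler: Callable[[], Any]) -> Any:
        # No key supplied: behave like a plain, non-idempotent POST.
        if key is None:
            return handler()
        now = self._clock()
        cached = self._results.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            # Replay the stored response; the handler is not executed again.
            return cached[1]
        result = handler()
        self._results[key] = (now, result)
        return result


created = []  # stands in for the database table a real handler would write to


def create_charge() -> Dict[str, Any]:
    """Hypothetical POST handler with a visible side effect."""
    created.append("charge")
    return {"status": 201, "id": f"ch_{len(created)}"}


store = IdempotencyStore(ttl_seconds=60)
first = store.run("key-123", create_charge)
retry = store.run("key-123", create_charge)  # same key: replayed, nothing new created
```

After the retry, `created` still holds a single entry and `retry` is the same response as `first`, which is the property that makes retry-on-timeout safe for POST.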
Status codes are part of the contract. 2xx for success, 3xx for redirection, 4xx for client mistakes the client must fix, 5xx for server faults the client may retry, ideally with backoff. 429 Too Many Requests specifically signals rate-limiting and should carry a Retry-After header (seconds, or an HTTP-date). 409 Conflict and 412 Precondition Failed are the right answers for optimistic-concurrency disputes; do not return 500.
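Because Retry-After can arrive either as a number of seconds or as an HTTP-date, a client needs to handle both forms. The helper below is a sketch using only the standard library; the function name `retry_after_seconds` is an assumption for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return the delay in seconds described by a Retry-After value, or None if unparseable."""
    value = value.strip()
    # Form 1: delay-seconds, a non-negative decimal integer.
    if value.isascii() and value.isdigit():
        return float(value)
    # Form 2: an HTTP-date such as "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", never a negative wait.
    return max(0.0, (when - now).total_seconds())


reference = datetime(2015, 10, 21, 7, 27, 0, tzinfo=timezone.utc)
delay_from_seconds = retry_after_seconds("120")
delay_from_date = retry_after_seconds("Wed, 21 Oct 2015 07:28:00 GMT", now=reference)
```

Passing `now` explicitly keeps the function deterministic in tests; a caller would normally omit it.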
Errors get an envelope. RFC 9457 Problem Details for HTTP APIs (July 2023, obsoletes RFC 7807) defines a single JSON shape — type, title, status, detail, instance — served as application/problem+json. Extension members carry domain-specific fields (validation errors, retry hints, trace IDs). One parser handles every API that follows it; ad-hoc {error: "..."} envelopes do not compose.
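A sketch of what such an envelope might look like when built by hand. The five standard members come from the RFC; the helper name `problem_details`, the example.com type URI, and the `trace_id` and `errors` extension members are illustrative assumptions.

```python
import json
from typing import Any, Dict

PROBLEM_JSON = "application/problem+json"  # media type for the response's Content-Type


def problem_details(status: int, title: str, detail: str,
                    type_uri: str = "about:blank", instance: str = "",
                    **extensions: Any) -> Dict[str, Any]:
    """Build a Problem Details body; extra keyword arguments become extension members."""
    body: Dict[str, Any] = {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
    }
    if instance:
        body["instance"] = instance
    clash = body.keys() & extensions.keys()
    if clash:
        raise ValueError(f"extension members shadow standard members: {sorted(clash)}")
    body.update(extensions)
    return body


body = problem_details(
    422,
    "Validation failed",
    "email is not a valid address",
    type_uri="https://api.example.com/problems/validation",  # hypothetical URI
    instance="/v1/users",
    trace_id="req_abc123",                                    # hypothetical extension
    errors=[{"field": "email", "reason": "format"}],          # hypothetical extension
)
payload = json.dumps(body)
```

A generic client can branch on `type` and `status` without knowing anything about the extension members, which is the composability argument made above.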
Rate limits are advertised, not just enforced. Return X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset (or the standard RateLimit / RateLimit-Policy headers from the IETF draft) on every response so clients can self-regulate. Token-bucket is the most common server-side algorithm — it absorbs short bursts while capping the long-term rate.
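The token-bucket idea can be sketched in a few lines. This is a single-process illustration with an injectable clock so the behaviour is easy to follow; the class name, the rate, and the capacity are assumptions, and a real deployment would usually keep the bucket state in a shared store.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows bursts up to `capacity` while refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full so an initial burst is allowed
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; False means the caller should answer 429."""
        self._refill()
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False

    def headers(self) -> Dict[str, str]:
        """Advertise the budget to the client on every response."""
        self._refill()
        return {
            "X-RateLimit-Limit": str(int(self.capacity)),
            "X-RateLimit-Remaining": str(int(self._tokens)),
        }


fake_now = [0.0]  # manual clock so the example is deterministic
bucket = TokenBucket(rate_per_sec=1.0, capacity=3, clock=lambda: fake_now[0])
burst = [bucket.try_acquire() for _ in range(4)]       # [True, True, True, False]
fake_now[0] += 2.0                                     # two seconds pass, two tokens return
after_wait = [bucket.try_acquire() for _ in range(3)]  # [True, True, False]
```

The first three calls succeed immediately (the burst), the fourth is refused, and after two seconds exactly two more requests fit, which is the long-term rate cap in action.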
Versioning is forever. Public APIs almost universally pin the major version in the URL (/v1/users, /v2/users) and roll minor/patch changes in place under the same path. Inside the org, SemVer (MAJOR.MINOR.PATCH) is the right mental model: bug fixes bump patch, additive changes bump minor, breaking changes force a new major and a deprecation window. Header-based versioning (Accept: application/vnd.acme.v2+json) keeps URLs clean but hurts discoverability — most teams find the URL approach simpler.
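To make the two versioning styles concrete, here is a small sketch that extracts the requested major version from either a URL prefix or a vendor media type. The function name `requested_major` is an assumption, and the `acme` vendor name simply follows the example in the paragraph above.

```python
import re
from typing import Optional

_PATH_VERSION = re.compile(r"^/v(\d+)(?:/|$)")
_MEDIA_VERSION = re.compile(r"application/vnd\.acme\.v(\d+)\+json")


def requested_major(path: str, accept: str = "") -> Optional[int]:
    """Return the major version from the URL prefix, else from the Accept header, else None."""
    match = _PATH_VERSION.match(path) or _MEDIA_VERSION.search(accept)
    return int(match.group(1)) if match else None


from_url = requested_major("/v1/users")
from_header = requested_major("/users", accept="application/vnd.acme.v2+json")
unversioned = requested_major("/users")
```

The URL form is visible in every log line and browser address bar, while the header form needs the caller to inspect request headers, which is the discoverability cost mentioned above.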
Streaming is a first-class response shape. When the result is incremental (LLM tokens, log tails, progress updates), Server-Sent Events (SSE) over a single HTTP response is the simplest pattern: the server keeps the connection open and writes data: frames, and in the browser an EventSource can consume a GET stream directly. The Anthropic and OpenAI APIs both use SSE for token streaming, although they return the event stream in response to a POST, so clients read it with a streaming HTTP client rather than EventSource. OpenAPI 3.2 (Sept 2025) added native media-type support for SSE, JSON Lines, and JSON Sequences. For bidirectional streaming, WebSockets or gRPC streams are the alternatives.
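The framing itself is simple enough to parse by hand. The sketch below handles only the `event` and `data` fields and comment lines, and ignores `id` and `retry`; the event name `delta` and the `[DONE]` sentinel in the sample stream are illustrative assumptions rather than any specific vendor's format.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event = "message"          # default event type when no event: field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue           # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


stream = [
    "event: delta\n", "data: Hel\n", "\n",
    ": keep-alive\n",
    "event: delta\n", "data: lo\n", "\n",
    "data: [DONE]\n", "\n",
]
events = list(parse_sse(stream))
text = "".join(e["data"] for e in events if e["event"] == "delta")  # "Hello"
```

In practice `lines` would come from a streaming HTTP response iterated line by line; the parser does not care where the lines originate.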
Auth is layered onto, not into, the URL. Bearer tokens in Authorization: Bearer <token> are the modern default; never put secrets in query strings (they leak into logs, referrers, and browser history). For machine-to-machine, OAuth 2.0 client credentials or short-lived signed tokens (JWT with rotating keys) are standard; for interactive flows, OAuth 2.1 with PKCE is the current baseline. Cloud-vendor APIs commonly require a signed-request scheme on top of the bearer model — AWS Signature V4 is the canonical example, and modern CLIs now sign natively (curl --aws-sigv4 … from curl 7.75+, OAuth2 bearer via curl --oauth2-bearer … from 7.33+) so scripts no longer need a separate signing helper. For browser-facing APIs that handle PII, post-quantum hybrid key exchange (X25519MLKEM768) is being rolled out at the TLS layer by Cloudflare, Google, AWS, and Anthropic during 2025–2026 — the change is transparent to API code but does mean curl/openssl builds need to be modern (curl ≥ 8.10, OpenSSL ≥ 3.5) for clients to negotiate it.
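A small sketch of the "header, not query string" rule using only the standard library. The function name `build_request`, the list of suspicious parameter names, and the example.com URL are assumptions for illustration; the request object is built but never sent.

```python
from urllib.parse import parse_qs, urlsplit
from urllib.request import Request

# Parameter names that suggest a credential was placed in the URL (illustrative list).
SENSITIVE_PARAMS = {"token", "access_token", "api_key", "apikey", "secret"}


def build_request(url: str, token: str) -> Request:
    """Attach a bearer token as a header and refuse URLs that carry credentials."""
    query = parse_qs(urlsplit(url).query)
    leaked = SENSITIVE_PARAMS & {name.lower() for name in query}
    if leaked:
        raise ValueError(f"credential-like query parameters: {sorted(leaked)}")
    return Request(url, headers={"Authorization": f"Bearer {token}"})


request = build_request("https://api.example.com/v1/users?limit=10", "tok_123")
auth_header = request.get_header("Authorization")  # "Bearer tok_123"
```

Rejecting the URL outright is a deliberately strict choice for the example; a client library might instead log a warning or strip the parameter.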
Supply-chain integrity is part of the API surface. Once an API ships SDKs, container images, or downloadable binaries, the integrity of those artifacts is part of the contract: a tampered SDK can exfiltrate the very bearer tokens the API issues. Sigstore-backed build attestations are now the platform default — gh attestation verify oci://ghcr.io/owner/repo:tag --owner owner enforces the SLSA v1 provenance predicate and confirms the artifact really came from the GitHub repo, commit, and workflow it claims. For high-trust APIs, publishing attestations alongside SDK releases (and verifying them in install scripts) closes the gap between "the API is authenticated" and "the code calling it has not been swapped out".
HTTP/3 in production
HTTP/3 over QUIC is now the default at the major edges: Cloudflare, Google, and Meta all advertise h3 in Alt-Svc and serve HTTP/3 to any client that acts on it. QUIC avoids HTTP/2's transport-level head-of-line blocking by multiplexing independent streams over a single UDP connection, so a lost packet stalls only the streams whose data it carried rather than the whole connection. For an API in front of a CDN, this means fewer tail-latency spikes on mobile networks with packet loss and faster recovery from network changes (the QUIC connection ID survives an IP change, so a phone switching from Wi-Fi to LTE keeps its connection without a new handshake).
The client side has caught up too. curl --http3 and curl --http3-only are supported in modern builds (ngtcp2 backend is non-experimental; quiche and OpenSSL-QUIC are still flagged experimental as of curl 8.20); wget2 --compression=br,zstd adds HTTP/2 multiplexing and modern compression for batch jobs even when HTTP/3 isn't an option. For an API team, the practical implications are: (1) advertise HTTP/3 via Alt-Svc: h3=":443"; ma=86400; (2) test with curl --http3-only to catch regressions where the edge silently downgrades; and (3) confirm any proxy or WAF in front of the API actually speaks QUIC end-to-end rather than terminating it at the edge.
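Checking what an edge actually advertises can be automated. The sketch below parses an Alt-Svc header value and reports the max-age of the first h3 entry; the function name `h3_max_age` is an assumption, and the parser is deliberately simplified (it splits on commas and does not handle commas inside quoted authorities).

```python
from typing import Optional

DEFAULT_MAX_AGE = 86400  # Alt-Svc entries without ma= are fresh for 24 hours


def h3_max_age(alt_svc: str) -> Optional[int]:
    """Return the ma= value (seconds) of the first h3 entry, or None if h3 is not advertised."""
    if alt_svc.strip().lower() == "clear":
        return None  # the server is withdrawing all alternatives
    for entry in alt_svc.split(","):
        parts = [part.strip() for part in entry.split(";")]
        protocol, _, _authority = parts[0].partition("=")
        if protocol.strip() != "h3":
            continue
        params = dict(part.split("=", 1) for part in parts[1:] if "=" in part)
        return int(params.get("ma", DEFAULT_MAX_AGE))
    return None


advertised = h3_max_age('h3=":443"; ma=86400')
mixed = h3_max_age('h2=":443"; ma=60, h3=":443"; ma=3600')
absent = h3_max_age('h2=":443"')
```

A header check like this only confirms the advertisement; whether the connection really negotiates HTTP/3 still has to be verified with a client that can be forced onto it.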
AI / LLM APIs as a new shape
LLM APIs (Anthropic, OpenAI, Google Gemini, GitHub Models) share a stable shape that has settled in 2025–2026: a small REST surface for inference, SSE for token streaming, JSON-schema-typed tool use (function calling) as the canonical agentic primitive, and an artifact-like resource for long-lived context (uploaded files, persistent memory, prompt caches). The contract is REST in form but the semantics are different — requests are expensive (seconds to minutes, not milliseconds), idempotency keys deduplicate billable retries, and a single response can contain interleaved text, tool-call requests, and citations.
Two shifts are worth calling out:
- Agentic API surfaces. Tool use, MCP (Model Context Protocol), and CLI integrations like gh copilot and gh models run <model> "<prompt>" mean an API is increasingly consumed by an agent loop rather than a single client request. The agent reads the OpenAPI/JSON-schema description, decides which endpoint to call, executes, and feeds the response back into the next LLM turn. APIs designed for this loop document tool names and parameter schemas as carefully as the REST surface itself.
- Inference-platform CLIs as a new caller class. GitHub Models (gh models list, gh models eval ./prompts/foo.prompt.yml) brings model evaluation into the same CLI used for issues and PRs, authenticated with the same GITHUB_TOKEN. The pattern — one CLI, many model providers, one authentication context — is becoming the standard developer surface and is what API teams should benchmark their own clients against.
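The tool-use half of that agent loop can be sketched without any model in the picture. Everything named here is hypothetical: the `get_weather` tool, its schema, and the shape of the `tool_call` dict are assumptions chosen for illustration and do not reproduce any vendor's wire format.

```python
from typing import Any, Dict


def get_weather(city: str) -> Dict[str, Any]:
    """Hypothetical tool implementation; a real one would call a weather service."""
    return {"city": city, "temp_c": 21}


# Tool registry: a JSON-schema description the model sees, plus the local handler.
TOOLS: Dict[str, Dict[str, Any]] = {
    "get_weather": {
        "description": "Look up the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "handler": get_weather,
    },
}


def dispatch(tool_call: Dict[str, Any]) -> Dict[str, Any]:
    """Validate a model-issued tool call against the registry and execute it."""
    name = tool_call.get("name", "")
    spec = TOOLS.get(name)
    if spec is None:
        return {"error": f"unknown tool: {name}"}
    args = tool_call.get("arguments", {})
    missing = [key for key in spec["parameters"].get("required", []) if key not in args]
    if missing:
        return {"error": f"missing arguments: {', '.join(missing)}"}
    return spec["handler"](**args)


result = dispatch({"name": "get_weather", "arguments": {"city": "Oslo"}})
rejected = dispatch({"name": "get_weather", "arguments": {}})
```

Returning errors as data rather than raising lets the loop feed the failure back to the model on the next turn, which is usually what an agent needs in order to correct itself.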
Common pitfalls
- Retrying non-idempotent POST on timeout — you do not know whether the server saw the request. Use an Idempotency-Key so the retry is safe, or convert the operation to PUT against a client-chosen resource ID.
- Returning 200 OK with an error body — breaks every generic HTTP client and CDN. Use the right status code and a Problem Details envelope.
- No versioning until v1.1 — once a single external client exists, breaking changes cost money and trust. Ship /v1 on day one even if v2 feels years away.
- Caching POST responses — browsers and CDNs will not, which is one reason GraphQL-over-POST loses REST's caching for free. For read-heavy GraphQL, persisted queries + GET is the workaround.
- Hiding rate limits from the client — clients that cannot see X-RateLimit-Remaining will hammer until they hit 429, then back off blindly. Publish the budget.
- No Retry-After on 429 or 503 — without it, every client invents its own backoff, usually badly. Set the header.
- OpenAPI as an afterthought — spec generated from code is fine; spec hand-written to match approximate behaviour is a lie. Either drive the spec from a contract-test suite or generate it from typed handlers (FastAPI, Litestar, NestJS all do this).
- gRPC for browser clients — browsers cannot speak gRPC directly. Use gRPC-Web, ConnectRPC, or an HTTP/JSON gateway in front.
- Leaking internal error detail — stack traces, SQL fragments, and internal hostnames in 5xx responses are an information-disclosure vulnerability. The detail field should be safe for end users; debugging info goes to logs keyed by a trace ID returned to the client.
- PATCH without a media type — JSON Merge Patch (RFC 7396) and JSON Patch (RFC 6902) are different formats with different semantics. Pick one and advertise it in Content-Type.
- Assuming HTTP/3 is end-to-end — the edge may advertise Alt-Svc: h3=… while the WAF, load balancer, or origin behind it speak only HTTP/2. Test with curl --http3-only and check %{http_version} via --write-out instead of trusting the headline rollout.
- Shipping SDKs without build attestations — a tampered SDK is an authenticated client by definition. Publish Sigstore-backed attestations alongside every release and document gh attestation verify (or the language-specific equivalent) in the install instructions.
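The PATCH pitfall is easier to see with the two formats side by side. Below is a sketch of the JSON Merge Patch algorithm from RFC 7396, with the equivalent change written as a JSON Patch (RFC 6902) operation list for contrast; the function name `merge_patch` and the sample documents are illustrative.

```python
import copy
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply a JSON Merge Patch (RFC 7396) and return the result without mutating the inputs."""
    if not isinstance(patch, dict):
        return copy.deepcopy(patch)        # a non-object patch replaces the target wholesale
    result = dict(target) if isinstance(target, dict) else {}
    for name, value in patch.items():
        if value is None:
            result.pop(name, None)         # null means "remove this member"
        else:
            result[name] = merge_patch(result.get(name), value)
    return result


document = {"a": "b", "c": {"d": "e", "f": "g"}}

# Sent with Content-Type: application/merge-patch+json
as_merge_patch = {"a": "z", "c": {"f": None}}

# The same change sent with Content-Type: application/json-patch+json
as_json_patch = [
    {"op": "replace", "path": "/a", "value": "z"},
    {"op": "remove", "path": "/c/f"},
]

patched = merge_patch(document, as_merge_patch)  # {"a": "z", "c": {"d": "e"}}
```

The two bodies are not interchangeable: a server that expects one and receives the other will either reject the request or, worse, apply it with the wrong semantics, which is why the media type has to be explicit.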
Where to go next
- /sections/javascript/fetch — the browser/Node built-in HTTP client; how to actually call an API from JS/TS.
- /sections/python/requests and /sections/python/httpx — synchronous and async HTTP clients in Python.
- /sections/python/fastapi, /sections/python/litestar, /sections/python/flask — building HTTP APIs in Python, from micro-framework to async ASGI.
- /sections/linux/curl — the universal CLI HTTP client; --http3, --aws-sigv4, --oauth2-bearer, and --json make it the right tool for modern API probes.
- /sections/linux/gh — GitHub CLI: gh api for REST/GraphQL, gh models for inference, gh attestation verify for supply-chain verification.
- /sections/linux/wget — non-interactive downloader; wget2 adds HTTP/2 multiplexing for batch API/static-asset jobs.
- /sections/linux/jq — shape and filter the JSON responses that come back.
- /sections/claude-api/python and /sections/claude-api/typescript-sdk — a concrete worked example of a modern REST + SSE API.
- /sections/claude-api/streaming — SSE event types and async iteration patterns.
- /sections/claude-api/tool-use — function-calling as an API-shaped contract between an LLM and your code.
Sources
References consulted while writing this concept page.
- RFC 9110 — HTTP Semantics — Authoritative spec (June 2022) for HTTP methods, status codes, headers, and the formal definitions of safe and idempotent methods that anchor the "How it works" section.
- RFC 9457 — Problem Details for HTTP APIs — The standard JSON error envelope (type, title, status, detail, instance); obsoletes RFC 7807 and is now the recommended error format.
- curl — HTTP/3 in curl — Backend matrix (ngtcp2 non-experimental; quiche / OpenSSL-QUIC experimental) and --http3 / --http3-only semantics used in the HTTP/3 production subsection.
- Cloudflare — Post-quantum TLS rollout — X25519MLKEM768 hybrid key exchange now default at the Cloudflare edge; underpins the post-quantum note in the auth subsection.
- gh attestation verify — GitHub CLI manual — SLSA-v1-by-default Sigstore verification of release artifacts and OCI images cited in the supply-chain paragraph and pitfall #12.
- SLSA v1 Provenance specification — Predicate format the gh attestation flow enforces; ground truth for the "what an attestation actually contains" framing.
- OpenAPI release notes (Speakeasy) — Source for OpenAPI 3.1 / 3.2 facts (JSON Schema 2020-12 alignment; SSE / JSON Lines media-type support added Sept 2025).
- Stripe — Rate limits — Industry-standard reference for 429, Retry-After, and token-bucket-with-burst behaviour cited in the rate-limit guidance and pitfalls list.