cheat sheet
numpy
Package-level reference for numpy — install, versioning, ABI breaks, extras, and gotchas. The bedrock of the Python scientific stack.
What it is
numpy is the canonical N-dimensional array library for Python. It provides the ndarray type, vectorised math, broadcasting, fancy indexing, and a C API whose ABI is stable within each major version and which most of the scientific ecosystem (pandas, scipy, scikit-learn, matplotlib, torch, jax) depends on at build time.
On PyPI numpy is a top-10 download by volume; on import-graph centrality, it is the single most-depended-on package in the data stack. Reach for the numpy PyPI package whenever you need raw numerical arrays, image/audio buffers, or a fast substitute for nested Python lists.
Install
pip install numpy
Output: (none — exits 0 on success)
uv add numpy
Output: dependency resolved, lockfile updated
poetry add numpy
Output: installed into the project venv
pip install "numpy<2"
Output: pins to the 1.x ABI line — needed when a downstream wheel has not yet rebuilt against numpy 2.x
Versioning & Python support
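numpy follows the NEP-29 / SPEC 0 support policy: each Python minor version is supported for roughly three years after its release (in practice the latest three or four minor versions), and downstream projects are advised to support the NumPy releases of roughly the last two years. Major releases may break the C ABI; minor releases preserve it.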
| NumPy line | Python support | ABI |
|---|---|---|
| 1.26.x | 3.9 – 3.12 | last 1.x; long-term fallback |
| 2.0.x | 3.9 – 3.12 | first 2.x; C-API break |
| 2.1.x | 3.10 – 3.13 | adds free-threaded build support |
| 2.2.x | 3.10 – 3.13 | last line supporting Python 3.10 |
| 2.3.x | 3.11 – 3.13 | current stable line as of late 2025 |
The 2.0 break (mid-2024) removed deprecated aliases (np.float_, np.complex_; note that np.int_ is still available), reorganised the public namespace, and changed the C-API. Most major downstream wheels rebuilt within a quarter, but long-tail packages can still need numpy<2.
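A small runtime guard can make the 1.x / 2.x split explicit in application code. The sketch below is one possible way to do it, not an official NumPy recipe; `numpy_major` is a hypothetical helper name introduced only for illustration.

```python
import numpy as np


def numpy_major(version: str = np.__version__) -> int:
    """Return the major component of a version string such as '2.1.3'."""
    return int(version.split(".")[0])


if numpy_major() >= 2:
    print("NumPy 2.x: compiled extensions must have been built against the 2.x C-API")
else:
    print("NumPy 1.x: legacy ABI line")
```

Parsing only the first dotted component keeps the helper tolerant of release-candidate and dev suffixes.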
Package metadata
- Maintainer: NumPy steering council under NumFOCUS sponsorship
- Project home: github.com/numpy/numpy
- Docs: numpy.org/doc
- License: BSD-3-Clause
- PyPI: pypi.org/project/numpy
- Governance: NumPy Enhancement Proposals (NEPs); steering council
- First released: 2006 (Numeric → NumPy merge); descendant of 1995 Numeric / Numarray
- Downloads: > 300 M / month on PyPI
Optional dependencies & extras
numpy itself ships no pip extras — installing it pulls only the compiled core. The "extras" are companion packages installed alongside.
pip install numpy scipy matplotlib pandas pyarrow jupyter
Output: installs the canonical scientific Python stack
Typical add-ons by domain:
| Companion | Use |
|---|---|
| scipy | algorithms (stats, optimization, signal, sparse) |
| pandas | tabular API on top of numpy |
| matplotlib | plotting; works directly on ndarrays |
| scikit-learn | classical ML; estimators are numpy in / numpy out |
| numba | JIT-compile numpy-flavoured Python to LLVM |
| numexpr | fast numerical expressions for large arrays |
| pillow | image arrays interoperating with numpy |
| h5py / zarr | persistent ndarrays on disk |
| jax / torch | tensor libraries that interop with numpy |
A numpy[test] dependency group may also appear in upstream packaging, but it is intended for upstream contributors, not end users.
Alternatives
| Package | One-line trade-off |
|---|---|
| jax.numpy | numpy API with autograd + GPU/TPU — different execution model |
| torch | tensor library with autograd, leans GPU-first |
| cupy | numpy-compatible API running on NVIDIA GPUs |
| dask.array | distributed/lazy numpy across a cluster |
| xarray | labelled N-D arrays on top of numpy — adds DataFrame-style metadata |
| polars / pandas | tabular-only; the right answer for 2-D labelled data |
Common gotchas
- 2.0 ABI break. Wheels built against numpy 1.x and not rebuilt for 2.x produce numpy.dtype size changed errors at import. Either pin numpy<2 or upgrade the offender. Running pip check after installing flags declared version conflicts, though it cannot see an ABI mismatch that the package metadata does not declare.
- Removed dtype aliases. np.int, np.float, np.bool, np.object were deprecated in 1.20 and removed in 1.24 (np.bool returned in 2.0 as an alias of np.bool_). Use the built-in Python types (int, float, bool) or the explicit sized aliases (np.int64, np.float32).
- Apple Silicon SIMD. Wheels are arm64-native and use Apple's Accelerate or OpenBLAS. Stale x86_64 wheels under Rosetta run slowly without any warning. Verify with np.show_config().
- Views vs copies. Slicing returns a view; assignments to it mutate the parent. Call .copy() when you want independence.
- Default int dtype is platform-dependent. np.array([1, 2, 3]).dtype is int64 on Linux/macOS; on Windows it was int32 before 2.0 and is int64 from 2.0 onward. Be explicit (dtype=np.int32) for portable code.
- np.random.seed(...) is legacy. Prefer rng = np.random.default_rng(42) and call rng.normal(...), etc. The Generator API avoids hidden global state and defaults to the PCG64 bit generator, which has better statistical properties than the legacy Mersenne Twister.
- Free-threaded (no-GIL) Python 3.13t. numpy 2.1+ has experimental support; many downstream packages do not support it yet. Test before deploying to a free-threaded build.
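Two of the gotchas above, views versus copies and the Generator API, are easy to check interactively. The following is a minimal sketch; the variable names are illustrative only.

```python
import numpy as np

parent = np.arange(6)
view = parent[1:4]                # basic slicing returns a view
view[0] = 99                      # writes through to parent
independent = parent[1:4].copy()  # explicit copy owns its own buffer
independent[1] = -1               # does not touch parent

print(parent)
print(np.shares_memory(parent, view))
print(np.shares_memory(parent, independent))

# Two Generators built from the same seed produce the same stream,
# without touching any global state.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
same = np.array_equal(rng_a.normal(size=3), rng_b.normal(size=3))
print(same)
```

np.shares_memory is the reliable way to tell a view from a copy; comparing values alone cannot distinguish them.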
Real-world recipes
numpy is the substrate for almost every other library in the data stack; the recipes below are the packaging-level patterns — install footprint, BLAS choice, dtype memory budget — rather than re-teaching the ndarray API (the companion sections/python/numpy covers that).
Memory-mapped large arrays: work with a multi-gigabyte binary file without loading it into RAM. The OS pages chunks in on demand, and because the data lives in a file, other processes can open the same array.
import numpy as np
# Create a 10-GB memmap (sparse on most filesystems)
arr = np.memmap("scratch.bin", dtype=np.float32, mode="w+", shape=(2_500_000_000,))
arr[:1000] = np.arange(1000, dtype=np.float32)
arr.flush()
del arr
# Re-open in read mode in another process
arr2 = np.memmap("scratch.bin", dtype=np.float32, mode="r", shape=(2_500_000_000,))
print(arr2[:5])
Output: [0. 1. 2. 3. 4.] — the disk-backed array reads only the requested pages
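A common follow-on pattern is reducing over a memmap in fixed-size chunks so that only one chunk is resident at a time. The sketch below assumes a flat file of float32 values; `chunked_mean` is a hypothetical helper, and the small demo file exists only so the example runs anywhere.

```python
import os
import tempfile

import numpy as np


def chunked_mean(path, n_items, dtype=np.float32, chunk=1_000_000):
    """Mean of a flat binary file, read through a memmap one chunk at a time."""
    arr = np.memmap(path, dtype=dtype, mode="r", shape=(n_items,))
    total = 0.0
    for start in range(0, n_items, chunk):
        # Accumulate in float64 so the running sum does not lose precision
        total += arr[start:start + chunk].sum(dtype=np.float64)
    del arr  # drop the reference so the mapping can be closed
    return total / n_items


# Small demo file (10,000 float32 values) in a temporary directory
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo.bin")
np.arange(10_000, dtype=np.float32).tofile(path)

demo_mean = chunked_mean(path, 10_000, chunk=1_024)
print(demo_mean)
```

The chunk size trades peak memory against loop overhead; a few megabytes per chunk is usually a reasonable starting point.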
Vectorised feature engineering — the canonical numpy speed pattern. Compute a derived feature for a million rows in a single C-level pass:
import numpy as np
x = np.random.default_rng(42).normal(size=1_000_000)
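# Signed log transform: sign(x) * log1p(|x|). np.where would evaluate both branches
# and emit invalid-value warnings; taking abs first keeps log1p's argument non-negative.
z = np.sign(x) * np.log1p(np.abs(x))
print(z[:5], z.mean())
Output: the transformed first 5 values and the mean; runs in milliseconds where an equivalent Python loop would take on the order of a second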
Structured array as a lightweight DataFrame — useful when pulling in pandas is overkill:
import numpy as np
dt = np.dtype([("name", "U16"), ("age", "i4"), ("revenue", "f8")])
people = np.array(
[("alice", 32, 1500.0), ("bob", 28, 2300.0), ("carol", 41, 4100.0)],
dtype=dt,
)
print(people["revenue"].mean())
print(people[people["age"] > 30])
Output: 2633.33... then the filtered records (alice and carol). Column access plus boolean masking; each record occupies 76 bytes, 228 bytes in total
Linear algebra and FFT — the BLAS-backed surface area:
import numpy as np
# Solve Ax = b with a 1000x1000 system
rng = np.random.default_rng(0)
A = rng.normal(size=(1000, 1000))
b = rng.normal(size=1000)
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))
# Real FFT on a 1-second 48 kHz signal
sig = rng.normal(size=48_000)
spectrum = np.fft.rfft(sig)
freqs = np.fft.rfftfreq(48_000, d=1 / 48_000)
print(spectrum.shape, freqs[:3])
Output: True for the linear solve check, then the spectrum shape (24001,) and the first frequencies, starting at 0 Hz in 1 Hz steps; the solve dispatches to LAPACK through the configured BLAS backend, while the FFT uses NumPy's bundled pocketfft implementation
Broadcasting recipe — outer product without np.outer:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([10, 20])
prod = a[:, None] * b[None, :]
print(prod.shape, prod)
Output: shape (3, 2) and the outer-product matrix; no intermediate allocations beyond the result
Performance tuning
numpy's runtime is dominated by (a) the BLAS the wheel links against and (b) memory layout. Tuning these two is worth far more than micro-optimising NumPy code itself.
import numpy as np
np.show_config()
Output: the BLAS / LAPACK info for the installed wheel: typically OpenBLAS on Linux and Windows wheels (reported as openblas64_ or scipy-openblas depending on the version), and Accelerate on macOS 14+ arm64 wheels from 2.0 onward
Tuning levers, ordered by impact:
| Lever | Mechanism | When it helps |
|---|---|---|
| Match dtype to data | float32 halves memory vs float64 | most large-array workloads |
| C-contiguous slices | predictable cache patterns | nested loops, large arrays |
| out= parameter | in-place op, no allocation | hot reduction loops |
| np.einsum(...) | one-pass tensor contraction | replaces multiple multiplies + sums |
| BLAS thread cap (OPENBLAS_NUM_THREADS=...) | avoid oversubscription | inside parallel CV / multiprocessing |
| numexpr for big expressions | SIMD interpreter | mathy column expressions on > 1M elements |
| numba JIT | LLVM-compiled inner loops | algorithms with Python-loop hot spots |
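Two of the levers in the table, out= and np.einsum, can be illustrated in a few lines. This is a minimal sketch with arbitrary array sizes chosen only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(200, 300))
b = rng.normal(size=(200, 300))

# out=: write the result into a preallocated buffer instead of a new array
buf = np.empty_like(a)
np.multiply(a, b, out=buf)

# einsum: row-wise dot products in one call, without the (200, 300) temporary
row_dots = np.einsum("ij,ij->i", a, b)

print(np.allclose(row_dots, buf.sum(axis=1)))
```

In a hot loop the buffer passed to out= would be allocated once outside the loop and reused on every iteration.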
Memory layout check:
import numpy as np
a = np.zeros((1000, 1000))
print(a.flags["C_CONTIGUOUS"], a.flags["F_CONTIGUOUS"])
print(a.strides)
Output: True False and the C-order strides (8000, 8) — Fortran-ordered transposes look identical but have swapped strides; mixing the two thrashes the CPU cache
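Profiling micro-benchmarks: use %timeit in Jupyter, or perfplot for plotted speed-vs-N curves. Avoid time.time() for sub-millisecond ops because its resolution can be too coarse on some platforms; time.perf_counter() and the timeit module are built for this.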
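Outside Jupyter, the standard-library timeit module gives comparable measurements. The sketch below times an arbitrary vectorised expression chosen for illustration; the function name and the repeat counts are assumptions, not recommendations.

```python
import timeit

import numpy as np

x = np.random.default_rng(0).normal(size=100_000)


def vectorised():
    return np.sqrt(np.abs(x)).sum()


# repeat() returns the total seconds for each batch of `number` calls;
# the minimum across batches is the least noisy estimate.
runs = timeit.repeat(vectorised, number=100, repeat=5)
best_per_call = min(runs) / 100
print(f"{best_per_call * 1e6:.1f} microseconds per call")
```

Reporting the minimum rather than the mean filters out interference from other processes on the machine.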
Memory & dataset-size scaling
NumPy is fundamentally an in-RAM library, but it has three escape hatches for data that does not fit: memmaps, dtype trimming, and the array-dispatch protocols (__array_ufunc__ and __array_function__) that let non-NumPy array implementations pass through code written for ndarrays.
import numpy as np
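# A 4-GB float64 array is 2 GB as float32, 1 GB as float16, 500 MB as int8 quantised
# Draw float32 directly; normal(...).astype(np.float32) would first allocate the 4-GB float64 array
big = np.random.default_rng(0).standard_normal(size=500_000_000, dtype=np.float32)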
print(big.nbytes / 1e9, "GB")
Output: 2.0 GB — half the footprint of the default float64
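The footprint can also be estimated before allocating anything, which is useful when deciding whether a dtype change is enough. The helper below is a hypothetical sketch (`estimate_gb` is not a NumPy function) that multiplies the element count by the dtype's item size.

```python
import math

import numpy as np


def estimate_gb(shape, dtype):
    """Size in decimal GB of an array with this shape and dtype, without allocating it."""
    return math.prod(shape) * np.dtype(dtype).itemsize / 1e9


for dt in (np.float64, np.float32, np.float16, np.int8):
    print(np.dtype(dt).name, estimate_gb((500_000_000,), dt), "GB")
```

For 500 million elements this reports 4.0, 2.0, 1.0 and 0.5 GB respectively, matching the figures in the comment above.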
Out-of-core options, ordered by friction:
- np.memmap: disk-backed array, OS-managed paging. Zero code change after the open.
- Zarr / h5py: chunked storage with explicit reads; better for cloud/HDFS than memmap.
- dask.array: distributed lazy ndarrays. Same numpy API, executes across workers.
- CuPy: GPU memory with the numpy API (NVIDIA only).
- JAX device arrays: XLA-compiled, autograd; not a drop-in but very close.
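The Array API standard: from NumPy 2.0 the main numpy namespace is compatible with the cross-library Array API standard (the experimental numpy.array_api submodule shipped in 1.22 was removed in 2.0 and continues as the separate array-api-strict package). Library code that obtains its namespace from the input array, xp = x.__array_namespace__(), and then calls xp.matmul(...) can run against other conforming libraries such as CuPy, PyTorch, and JAX with little or no modification, depending on each library's level of conformance. This is the modern way to write library code that scales across backends.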
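One way to write backend-agnostic code is to ask the input array for its namespace through the `__array_namespace__()` method defined by the Array API standard. The sketch below assumes NumPy arrays expose that method from 2.0 onward and falls back to the numpy module otherwise; `get_namespace` and `l2_normalise` are hypothetical helper names.

```python
import numpy as np


def get_namespace(x):
    """Return the array's Array API namespace, falling back to numpy."""
    # __array_namespace__ comes from the Array API standard; the fallback
    # covers arrays (for example from older NumPy) that do not define it.
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np


def l2_normalise(x):
    """Scale x to unit Euclidean length using only namespace functions."""
    xp = get_namespace(x)
    return x / xp.sqrt(xp.sum(x * x))


v = l2_normalise(np.asarray([3.0, 4.0]))
print(v)
```

Because `l2_normalise` never names numpy directly, the same function body could in principle accept arrays from another conforming library; that portability is untested here.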
Version migration guide
The 2.0 release (mid-2024) was the most impactful break in numpy's history. The checklist below covers the actual code changes you will hit.
1.x → 2.x checklist:
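# Dtype aliases removed in 1.24 (hit when jumping from older 1.x code) — fix:
# np.int -> int (Python builtin) or np.int64 explicit
# np.float -> float or np.float64
# np.bool -> bool or np.bool_ (np.bool exists again in 2.0 as an alias of np.bool_)
# np.object -> object
# np.long -> np.int_ (or specific size)
# Dtype aliases removed in 2.0 — fix:
# np.float_ -> np.float64
# np.complex_ -> np.complex128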
# Namespace cleanup — fix:
# np.cumproduct -> np.cumprod
# np.product -> np.prod
# np.in1d -> np.isin (in1d is deprecated in 2.0 rather than removed)
# np.alltrue -> np.all
# np.sometrue -> np.any
# NEP-50 scalar promotion — Python ints no longer auto-promote integer arrays
import numpy as np
a = np.array([1, 2, 3], dtype=np.int8)
# 1.x: a + 1000 -> int16 result; 2.x: a + 1000 raises OverflowError because 1000 does not fit in int8
print(a + np.int16(1000)) # a typed NumPy scalar still promotes: the result is int16
Output: [1001 1002 1003] with dtype int16. Under NEP 50 a Python int adopts the array's dtype, so widening requires an explicitly typed operand or a cast
Downstream wheel compatibility: the 2.0 C-API break required every compiled wheel that uses the numpy C-API to be rebuilt. Most major packages (pandas, scipy, scikit-learn, matplotlib) shipped fixed wheels within a quarter. After upgrading, run pip check to catch declared version conflicts and import each compiled dependency once to surface un-rebuilt wheels; pin numpy<2 as a fallback if a critical dep has not yet updated.
Free-threaded build (Python 3.13t): numpy 2.1+ supports the no-GIL build, but performance is uneven and many downstream packages do not support it yet. Treat as experimental.
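The renames in the checklist can be located mechanically before upgrading. The following is a rough sketch using only the standard library; the `REMOVED` table and `find_removed` are hypothetical names, and the table lists only names that appear in this document.

```python
import re

# Old name -> suggested replacement, taken from names mentioned in this document
REMOVED = {
    "np.cumproduct": "np.cumprod",
    "np.product": "np.prod",
    "np.alltrue": "np.all",
    "np.sometrue": "np.any",
    "np.in1d": "np.isin",
    "np.float_": "np.float64",
}


def find_removed(source: str):
    """Return (line_number, old_name, replacement) for each hit in source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for old, new in REMOVED.items():
            # The trailing \b stops np.float_ from matching np.float_power
            if re.search(re.escape(old) + r"\b", line):
                hits.append((lineno, old, new))
    return hits


sample = "total = np.product(xs)\nok = np.prod(xs)\nmask = np.in1d(a, b)\n"
hits = find_removed(sample)
for hit in hits:
    print(hit)
```

For real projects, a linter rule such as Ruff's NPY201 covers far more cases than this sketch and can apply many of the fixes automatically.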
Interop with adjacent ecosystems
Almost every Python data library either accepts or returns numpy arrays. Knowing when a conversion is zero-copy (shared buffer) vs a full copy is essential for memory-tight pipelines.
| Library | From numpy | To numpy | Zero-copy? |
|---|---|---|---|
| pandas | pd.DataFrame(arr) | df.to_numpy() | Yes for matching dtypes |
| polars | pl.from_numpy(arr) | df.to_numpy() | Partial — depends on dtype |
| pyarrow | pa.array(arr) | arr.to_numpy() | Yes for fixed-width numeric |
| torch | torch.from_numpy(arr) | t.numpy() | Yes (CPU only) |
| jax | jnp.asarray(arr) | np.asarray(jax_arr) | Copy for device arrays |
| cupy | cp.asarray(arr) | cp.asnumpy(arr) | Always copies (CPU↔GPU) |
| matplotlib | plt.plot(arr) | n/a | Yes (views) |
| sklearn | model.fit(X) | model.predict(X) | Yes |
import numpy as np
import pandas as pd
# Round-trip without copy when the pandas frame is numeric and contiguous
arr = np.arange(12, dtype=np.float64).reshape(3, 4)
df = pd.DataFrame(arr)
print(df.values.base is arr or np.shares_memory(df.values, arr))
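Output: True when zero-copy succeeded. False means pandas allocated a new buffer, which happens with mixed dtypes or object columns, and also in pandas versions where copy-on-write makes the constructor copy NumPy input by default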
The DLPack / __array_interface__ / Array API protocols — numpy supports all three, letting any conforming library hand off buffers without explicit conversion. For new code, prefer Array API (xp.asarray(...)) over the legacy __array_interface__ polling.
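A DLPack hand-off can be sketched with NumPy alone. The example assumes NumPy 1.22 or later, where np.from_dlpack is available, and uses an ndarray as the producer only so that no second library is required; in practice the producer would be a tensor from another framework.

```python
import numpy as np

src = np.arange(5, dtype=np.float64)

# from_dlpack accepts any object exposing __dlpack__; the consumer receives
# a view of the producer's buffer rather than a copy.
dst = np.from_dlpack(src)

src[0] = 42.0  # a write to the producer is visible through the consumer
print(dst[0], np.shares_memory(src, dst))
```

The shared buffer means the producer must stay alive and unmodified for as long as the consumer relies on its contents.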
Troubleshooting common errors
The list below catalogues the recurring frictions; each is one of those "I have seen this error a thousand times" cases.
- ValueError: numpy.dtype size changed, may indicate binary incompatibility. Cause: a wheel built against numpy 1.x loaded with numpy 2.x installed. Fix: pip install --upgrade <offender> or pip install "numpy<2".
- AttributeError: module 'numpy' has no attribute 'int'. Cause: removed alias. Fix: use int (builtin) or np.int64.
- OverflowError after arithmetic on a small-int array. Cause: NEP-50 promotion rules. Cast explicitly: arr.astype(np.int64) + 1000.
- MemoryError on a float64 allocation. Try float32 or float16; each step halves memory at the cost of precision.
- np.float64(...) vs np.float_(...) confusion. np.float_ was removed in 2.0. Use the explicit np.float64.
- dtype('O') showing up unexpectedly. numpy fell back to Python objects because of a mixed type or None. Inspect with arr.dtype; clean upstream.
- Slicing returns a view, not a copy. A slice such as arr[:3] shares memory with arr, so mutating it mutates the original. Use .copy() when you want independence.
- TypeError: cannot perform reduce with flexible type. You called .sum() on a string or structured array. Cast to numeric or use the structured-array column accessor.
- Random results that change across runs even with a seed. The legacy global RNG (np.random.seed) is shared state: any other code that draws from it or reseeds it shifts your stream, and it is unsafe to share across threads. Use rng = np.random.default_rng(42) and pass it everywhere.
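Two of the entries above, the unexpected object dtype and seeded runs that still drift, can be reproduced and fixed in a few lines. This is a minimal sketch; `noisy` is a hypothetical function used only to show a Generator being passed explicitly.

```python
import numpy as np

# A None in the input silently produces an object array
raw = np.array([1.5, None, 3.0])
print(raw.dtype)

# Cleaning upstream: map None to NaN and request float64 explicitly
clean = np.array([np.nan if v is None else v for v in raw], dtype=np.float64)
print(clean.dtype, np.nanmean(clean))


# Reproducible randomness: pass a Generator instead of seeding global state
def noisy(values, rng):
    return values + rng.normal(scale=0.1, size=values.shape)


first = noisy(np.zeros(3), np.random.default_rng(42))
second = noisy(np.zeros(3), np.random.default_rng(42))
print(np.array_equal(first, second))
```

Passing the Generator as an argument keeps each call site in control of its own random stream, so unrelated code cannot perturb it.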
When NOT to use this
NumPy is the right answer almost always; the cases below are where another tool fits better.
- Tabular data with labels: use pandas or polars. NumPy arrays lose column names.
- Autograd / GPU: PyTorch or JAX. NumPy is CPU-only and has no derivative tracking.
- Symbolic math: SymPy. NumPy is numerical.
- Very small data (< 100 elements): Python lists are fine and have lower import overhead.
- Sparse matrices: scipy.sparse. NumPy's np.ndarray materialises every zero.
- Distributed compute: dask.array, ray, or pyspark. NumPy is single-process.
See also
- sections/python/numpy — full API tutorial (ndarray, broadcasting, indexing)
- sections/python/scipy — scientific algorithms built on numpy
- sections/python/pandas — DataFrame library built on numpy
- sections/packages-pip/pip-scipy — package-level companion
- sections/packages-pip/pip-scikit-learn — ML stack on top of numpy