cheat sheet

numpy

Create and manipulate N-dimensional arrays with NumPy. Covers array creation, broadcasting, vectorized math, indexing, and matrix operations.

numpy — Numerical Arrays

What it is

NumPy provides the ndarray — a fixed-type, C-backed multi-dimensional array. It is the foundation of the entire Python scientific stack (pandas, scipy, matplotlib all build on it). Vectorized operations on ndarray are typically 100–1000× faster than equivalent pure-Python loops.

Install

bash
pip install numpy

Output: (none — exits 0 on success)

Quick example

python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
print(a * 2)
print(f"mean={a.mean():.1f}, std={a.std():.4f}")
print(np.dot(a, a))

Output:

text
[ 2  4  6  8 10]
mean=3.0, std=1.4142
55

When / why to use it

  • You need fast numeric computation without pandas overhead.
  • Working with images, audio, or other multi-dimensional data (images are just 3-D arrays).
  • Linear algebra (matrix multiply, decompositions).
  • Feeding data into machine learning libraries (all expect numpy arrays or array-like objects).

Common pitfalls

dtype defaults can surprise younp.array([1, 2, 3]) creates int64, but np.zeros((3, 3)) creates float64. Mixing dtypes in operations silently upcasts. Be explicit: np.array([1, 2], dtype=np.float32).

Views vs copies — slicing a numpy array returns a view, not a copy. Modifying the slice modifies the original. Use .copy() when you need independence: b = a[1:3].copy().

Check shapes before operations: a.shape, a.dtype, a.ndim. Mismatched shapes are the #1 cause of ValueError: operands could not be broadcast.

Array creation

NumPy provides a family of constructors for the common cases: np.array() converts Python lists; np.zeros/np.ones/np.full fill with a constant; np.arange mirrors Python's range but returns an array; np.linspace gives evenly spaced floats inclusive of the endpoint. Always specify dtype explicitly when the default (float64 or int64) isn't what you need.

python
import numpy as np

np.array([1, 2, 3])               # from list
np.zeros((3, 4))                  # 3×4 zeros (float64)
np.ones((2, 2), dtype=int)        # 2×2 ones (int)
np.eye(3)                         # 3×3 identity
np.arange(0, 10, 2)               # [0, 2, 4, 6, 8]
np.linspace(0, 1, 5)              # [0, 0.25, 0.5, 0.75, 1.0]
np.random.default_rng(42).random((3, 3))  # random 3×3 (seeded)

Indexing and slicing

Basic indexing (a[1, 2], a[:, 1]) returns a view — modifying it modifies the original. Boolean masks (a[a > 5]) and fancy indexing (integer arrays) return copies. For multi-dimensional arrays, each comma-separated expression selects along one axis.

python
a = np.arange(12).reshape(3, 4)
print(a)
print(a[1, 2])       # row 1, col 2
print(a[:, 1])       # all rows, col 1
print(a[a > 5])      # boolean mask → 1-D array

Output:

text
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
6
[1 5 9]
[ 6  7  8  9 10 11]

Richer example — matrix operations and broadcasting

python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print("Matrix product:")
print(A @ B)

print("\nElement-wise product:")
print(A * B)

# Broadcasting: add a row vector to every row of a matrix
offsets = np.array([10, 20])
print("\nBroadcast add:")
print(A + offsets)

# Trig over evenly spaced angles
x = np.linspace(0, 2 * np.pi, 5)
print("\nsin values:")
print(np.round(np.sin(x), 4))

Output:

text
Matrix product:
[[19 22]
 [43 50]]

Element-wise product:
[[ 5 12]
 [21 32]]

Broadcast add:
[[11 22]
 [13 24]]

sin values:
[ 0.  1.  0. -1. -0.]

Useful functions quick reference

FunctionPurpose
np.concatenate([a, b], axis=0)Join arrays along an axis
np.stack([a, b], axis=0)Stack along a new axis
np.split(a, 3)Split into N equal parts
np.sort(a)Sorted copy
np.argsort(a)Indices that would sort the array
np.unique(a)Unique values
np.where(a > 0, a, 0)Conditional element selection
np.clip(a, 0, 1)Clamp values to [0, 1]
np.linalg.norm(a)Euclidean norm
np.linalg.eig(A)Eigenvalues and eigenvectors
np.fft.fft(signal)Fast Fourier transform

dtypes in depth

A NumPy dtype records the binary layout of each element: kind (int, float, complex, bool, bytes, datetime), width in bytes, and byte order. Mismatched dtypes silently upcast (an int8 + float64 returns float64) and can blow up memory by 8x on big arrays, so set the dtype explicitly whenever the default doesn't match the data.

python
import numpy as np

print(np.array([1, 2, 3]).dtype)               # platform default int (usually int64)
print(np.array([1.0, 2.0]).dtype)              # float64
print(np.array([True, False]).dtype)           # bool, 1 byte each
print(np.array(["abc", "de"]).dtype)           # <U3 (Unicode, width 3)
print(np.zeros(3, dtype=np.float32).itemsize)  # 4 bytes per element
print(np.zeros(3, dtype=np.float64).itemsize)  # 8 bytes per element

Output:

text
int64
float64
bool
<U3
4
8

Cast with .astype() — it always returns a copy:

python
a = np.array([1.7, 2.3, 3.9])
print(a.astype(np.int32))     # truncates toward zero
print(np.round(a).astype(int))  # round-half-to-even, then cast

Output:

text
[1 2 3]
[2 2 4]

For 100M+ element arrays consider float32 over float64 and int8/int16 over int64 — half/quarter the RAM and faster on SIMD hardware. The downside is reduced range and precision.

Broadcasting rules

Broadcasting is how NumPy applies an operation to arrays of different shapes without materialising an explicit copy. The rules: starting from the right, two dimensions are compatible if they are equal or one of them is 1. A length-1 axis is virtually "stretched" along the other operand's matching axis. If any non-1 pair disagrees, you get ValueError: operands could not be broadcast together.

python
import numpy as np

a = np.arange(12).reshape(3, 4)   # (3, 4)
row = np.array([10, 20, 30, 40])  # (4,)  — stretched along axis 0
col = np.array([[100], [200], [300]])  # (3, 1) — stretched along axis 1

print(a + row)
print()
print(a + col)

Output:

text
[[10 21 32 43]
 [14 25 36 47]
 [18 29 40 51]]

[[100 101 102 103]
 [204 205 206 207]
 [308 309 310 311]]

Use np.newaxis (or None) to introduce a length-1 axis when you need to broadcast against a different axis:

python
a = np.array([1, 2, 3])
b = np.array([10, 20])
# outer-product style (3,) * (2,) -> (3, 2)
print(a[:, np.newaxis] * b)

Output:

text
[[10 20]
 [20 40]
 [30 60]]

Fancy indexing and boolean masks

Fancy indexing uses integer arrays to pick arbitrary elements; boolean indexing uses a same-shape bool array as a mask. Both return copies — unlike basic slicing, you cannot use them to assign through a view in chained form. Combined indexing (a[idx, :]) can be cryptic, so name your index arrays.

python
import numpy as np

a = np.arange(10) * 10

# Fancy indexing — pick specific positions
idx = np.array([0, 3, 7, 9])
print(a[idx])

# Boolean mask
mask = (a > 20) & (a < 80)
print(a[mask])

# 2-D fancy indexing — pair rows with cols
m = np.arange(16).reshape(4, 4)
print(m[[0, 2, 3], [1, 1, 3]])  # picks (0,1), (2,1), (3,3)

Output:

text
[ 0 30 70 90]
[30 40 50 60 70]
[ 1  9 15]

a[mask] = 0 modifies a in place even though a[mask] returns a copy, because Python translates this into a.__setitem__(mask, 0). The same does NOT work for a[mask][:] = 0 — the intermediate copy is what gets mutated.

Vectorisation vs Python loops

A Python-level for loop on an ndarray is 50–500x slower than the equivalent vectorised expression because each iteration crosses the C/Python boundary. Almost any element-wise operation has a ufunc form (+, np.sin, np.where); use it. When you absolutely cannot vectorise, drop into Numba, Cython, or a list comprehension over scalars — never for i in range(len(a)): a[i] = ....

python
import numpy as np
import time

x = np.arange(1_000_000, dtype=np.float64)

# Slow: Python loop
t0 = time.perf_counter()
out = np.empty_like(x)
for i in range(len(x)):
    out[i] = x[i] ** 2 + 1
slow = time.perf_counter() - t0

# Fast: vectorised
t0 = time.perf_counter()
out = x ** 2 + 1
fast = time.perf_counter() - t0

print(f"loop: {slow*1000:.1f} ms")
print(f"vec:  {fast*1000:.1f} ms")
print(f"speedup: {slow/fast:.0f}x")

Output (typical):

text
loop: 240.0 ms
vec:  3.2 ms
speedup: 75x

Linear algebra (np.linalg)

np.linalg covers the common matrix routines: solve (Ax=b), inv, det, eig, svd, qr, cholesky, and matrix_rank. Always prefer solve(A, b) over inv(A) @ b — it is faster and numerically more stable. For larger problems or specialised decompositions, switch to scipy.linalg, which links to system LAPACK/BLAS.

python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)
print("x =", x)
print("det =", np.linalg.det(A))
print("rank =", np.linalg.matrix_rank(A))

# Singular Value Decomposition
U, S, Vt = np.linalg.svd(A)
print("singular values =", S.round(4))

Output:

text
x = [2. 3.]
det = 5.0
rank = 2
singular values = [3.6180 1.382 ]

Random numbers — modern Generator API

The legacy np.random.rand / np.random.seed functions use a global, non-thread-safe state and are deprecated for new code. Use np.random.default_rng(seed) instead — it returns a Generator with explicit state, better statistical quality (PCG64), and faster sampling. Always seed in tests and pipelines so results are reproducible.

python
import numpy as np

rng = np.random.default_rng(42)

print(rng.random(3))                            # uniform [0, 1)
print(rng.integers(0, 10, size=5))              # uniform ints [0, 10)
print(rng.normal(loc=0.0, scale=1.0, size=4))   # N(0, 1)
print(rng.choice(["A", "B", "C"], size=5, p=[0.5, 0.3, 0.2]))
print(rng.permutation(np.arange(5)))            # shuffled copy

Output:

text
[0.77395605 0.43887844 0.85859792]
[1 7 6 4 4]
[ 0.42175859 -1.06560158  0.54470015  0.0717954 ]
['A' 'A' 'B' 'A' 'C']
[2 4 0 3 1]

Structured arrays and record arrays

A structured dtype lets one ndarray hold heterogeneous fields per element — like a fixed-schema, in-memory database row. Useful for parsing binary file formats and interop with C structs, but for general tabular data, pandas or polars is almost always a better choice.

python
import numpy as np

dt = np.dtype([("name", "U10"), ("age", "i4"), ("weight", "f4")])
people = np.array(
    [("Alice", 30, 65.5), ("Bob", 25, 78.2), ("Carol", 35, 60.1)],
    dtype=dt,
)
print(people)
print(people["name"])
print(people["age"].mean())

Output:

text
[('Alice', 30, 65.5) ('Bob', 25, 78.2) ('Carol', 35, 60.1)]
['Alice' 'Bob' 'Carol']
30.0

Memory layout — C vs Fortran order

Multi-dimensional arrays store elements as a flat memory buffer; C order (the default) is row-major (last axis varies fastest), F order is column-major (first axis varies fastest, like MATLAB/Fortran). Choosing the right order can speed up cache-friendly access patterns 2–10x; mismatches force a copy when handing arrays to libraries that demand one or the other (e.g. BLAS).

python
import numpy as np

a = np.arange(12).reshape(3, 4)             # C-order (default)
print(a.flags["C_CONTIGUOUS"], a.flags["F_CONTIGUOUS"])
print(a.strides)                            # (32, 8): step 32 bytes per row, 8 per col

b = np.asfortranarray(a)
print(b.flags["C_CONTIGUOUS"], b.flags["F_CONTIGUOUS"])
print(b.strides)                            # (8, 24): step 8 bytes per row, 24 per col

# Transpose is free — it just re-labels strides, no copy
print(a.T.flags["F_CONTIGUOUS"])

Output:

text
True False
(32, 8)
False True
(8, 24)
True

Saving and loading

NumPy's native binary formats round-trip dtypes and shapes exactly, unlike CSV. Use .npy for a single array, .npz for multiple, and memmap when an array is too big for RAM but you only need to touch a slice at a time. For interop with other languages, prefer HDF5 (via h5py) or Parquet.

python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.linspace(0, 1, 5)

# Single array
np.save("a.npy", a)
loaded = np.load("a.npy")

# Multiple arrays in one file
np.savez("bundle.npz", first=a, second=b)
data = np.load("bundle.npz")
print(data["first"].shape, data["second"].shape)

# Compressed
np.savez_compressed("bundle.npz", first=a, second=b)

# Memory-mapped — read/write without loading into RAM
mm = np.memmap("big.dat", dtype="float32", mode="w+", shape=(10_000, 10_000))
mm[0, 0] = 3.14
mm.flush()                       # write to disk
del mm                           # close
mm2 = np.memmap("big.dat", dtype="float32", mode="r", shape=(10_000, 10_000))
print(mm2[0, 0])

Output:

text
(3, 4) (5,)
3.14

Interop with pandas, polars, and Arrow

A pandas Series or DataFrame exposes its underlying buffer through .to_numpy(); numeric DataFrames give a 2-D array with no copy when dtypes are uniform. Polars uses Arrow internally so df.to_numpy() is zero-copy for fixed-width numeric columns and a quick memory-buffer copy otherwise. Going the other way is just pd.DataFrame(arr) or pl.from_numpy(arr).

python
import numpy as np
import pandas as pd

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)

df = pd.DataFrame(a, columns=["x", "y", "z"])
back = df.to_numpy()
print(back.dtype, back.shape)

# polars
import polars as pl
pl_df = pl.from_numpy(a, schema=["x", "y", "z"])
print(pl_df.to_numpy().shape)

Output:

text
float64 (2, 3)
(2, 3)

GPU and JIT alternatives

When CPU-bound NumPy code is the bottleneck, three drop-in replacements expose nearly identical APIs:

  • CuPy — NumPy on NVIDIA GPUs. import cupy as cp and use cp.array(...); most functions match np.*. Best for large arrays and operations that fit in GPU memory.
  • JAXjax.numpy as jnp provides NumPy semantics plus automatic differentiation, JIT compilation via XLA, and seamless GPU/TPU execution. Arrays are immutable; in-place mutation becomes x.at[i].set(v).
  • Numba@numba.njit decorates a Python function and compiles it to native code at first call. Best for tight numeric loops that don't vectorise cleanly.
python
# Numba example
from numba import njit
import numpy as np

@njit
def sum_squares(x):
    total = 0.0
    for i in range(len(x)):
        total += x[i] * x[i]
    return total

x = np.arange(1_000_000, dtype=np.float64)
print(sum_squares(x))

Output:

text
3.3333328333335e+17

Before reaching for GPU/JIT, confirm you have actually vectorised the NumPy version. A clean NumPy expression often outperforms a hand-rolled Numba loop on small or medium arrays because of better cache behaviour.

Real-world recipes

Image processing: invert and normalise

Images are just H × W × 3 uint8 arrays. Convert to float, normalise to [0, 1], invert, save back.

python
import numpy as np
from PIL import Image

img = np.asarray(Image.open("photo.jpg"))           # uint8, shape (H, W, 3)
arr = img.astype(np.float32) / 255.0                 # normalise
inverted = 1.0 - arr
out = (inverted * 255).astype(np.uint8)
Image.fromarray(out).save("inverted.jpg")

Audio: detect zero-crossings

A zero-crossing is where the signal sign flips; counting them gives a rough frequency estimate.

python
import numpy as np

signal = np.sin(np.linspace(0, 10 * np.pi, 5000))
sign = np.sign(signal)
crossings = np.where(np.diff(sign) != 0)[0]
print(f"zero-crossings: {len(crossings)}")

Output:

text
zero-crossings: 10

Sliding-window mean (cumulative-sum trick)

For a flat numeric series, a sliding window mean via cumsum is O(n) and beats np.convolve for large windows.

python
import numpy as np

x = np.arange(1, 11, dtype=float)
w = 3
cs = np.cumsum(np.insert(x, 0, 0))
windowed = (cs[w:] - cs[:-w]) / w
print(windowed)

Output:

text
[2. 3. 4. 5. 6. 7. 8. 9.]

One-hot encode integer labels

Useful before feeding labels into a neural net or sklearn.linear_model.

python
import numpy as np

labels = np.array([0, 2, 1, 2, 0])
n_classes = labels.max() + 1
one_hot = np.eye(n_classes, dtype=np.int8)[labels]
print(one_hot)

Output:

text
[[1 0 0]
 [0 0 1]
 [0 1 0]
 [0 0 1]
 [1 0 0]]

Pairwise Euclidean distance matrix

Broadcasting computes a full pairwise distance matrix without an explicit loop.

python
import numpy as np

pts = np.array([[0, 0], [3, 4], [6, 8]], dtype=float)
diff = pts[:, np.newaxis, :] - pts[np.newaxis, :, :]   # (N, N, 2)
dist = np.linalg.norm(diff, axis=-1)
print(dist)

Output:

text
[[ 0.  5. 10.]
 [ 5.  0.  5.]
 [10.  5.  0.]]

For larger point clouds use scipy.spatial.distance_matrix or scipy.spatial.KDTree — they avoid the full N×N intermediate.

See also

  • pandas and polars — DataFrame libraries that wrap NumPy/Arrow buffers.
  • scipy — domain-specific algorithms (stats, optimize, signal, sparse) on top of NumPy.
  • matplotlib — pairs naturally with NumPy for plotting arrays.
  • scikit-learn — ML models consume NumPy arrays directly.