cheat sheet
pygments
Package-level reference for Pygments on PyPI — install, version policy, lexers, formatters, and the pygmentize CLI it ships.
What it is
Pygments is a generic syntax highlighter for source code. It ships lexers for over 500 languages (Python, JavaScript, Rust, COBOL, JCL, Lisp dialects, shell scripts, configuration formats, log files…) and formatters that turn the tokenized output into HTML, ANSI-terminal, LaTeX, RTF, PNG (via PIL), and more. Sphinx, MkDocs, Jupyter, the rich library, and the IPython REPL all use it under the hood.
Reach for Pygments when you need to render source code on a page or in a terminal and you don't want to hand-roll a lexer. The practical surface is small even though the catalog is huge: pick a lexer, pick a formatter, call highlight(). The 500+ lexers are nice for the rare day you need to highlight a 1970s assembly variant, but day-to-day you'll use the same five.
Install
pip install pygments
Output: pip's install log, ending in a "Successfully installed" line. One install provides both the pygments library and the pygmentize CLI.
uv add pygments
Output: dependency resolved + added to pyproject.toml
poetry add pygments
Output: updated lockfile + virtualenv install
pipx install pygments # CLI only, isolated
Output: pygmentize available on $PATH without polluting any project venv.
Versioning & Python support
- Stable `2.x` line; the project has been on `2.x` since 2.0 shipped in 2014. Releases are semver-ish but the surface is conservative.
- Python 3.8+ on `2.18`+; the older `2.16` and `2.17` lines still support 3.7.
- The `3.0` milestone has been floated but not landed as of mid-2026. Expect lexer-internals cleanup rather than user-visible API breaks.
- Style names occasionally shift (`default`, `monokai`, `friendly`, `dracula`, …). Pin the version in production docs builds if your CSS depends on specific style class names.
Package metadata
- Maintainer: Georg Brandl (original author) and the `pygments` GitHub org
- Project home: github.com/pygments/pygments
- Docs: pygments.org/docs
- PyPI: pypi.org/project/Pygments
- License: BSD-2-Clause
- First released: 2006
- Downloads: hundreds of millions per month — pulled in by Sphinx, Jupyter, IPython, MkDocs, rich, and the documentation toolchains of effectively every public Python package
Optional dependencies & extras
- `Pygments[plugins]`: enables entry-point discovery of third-party lexers/formatters. Default on for modern installs.
- `Pygments[windows-terminal]`: small Windows console color helper.
Most users install plain pygments. The runtime has no compiled extensions, no transitive dependencies, and is safe to add to any project.
Alternatives
| Package | Trade-off |
|---|---|
| tree-sitter | Faster, more accurate parsers per language. Heavier setup (per-language grammar packages); no built-in HTML formatter. Use for IDE-style highlighting. |
| chroma (Go) / shiki (JS) | Better defaults for web rendering; not Python-native. Use only if you're already in those ecosystems. |
| bat (Rust CLI) | Replaces pygmentize for terminal use with prettier defaults. Use as a CLI, not a library. |
| rich.syntax.Syntax | Uses Pygments under the hood but adds line numbers, line ranges, theme integration in rich. Use when already using rich. |
| Hand-rolled regex | Painful past ~50 LOC. Only viable for trivially small DSLs. |
Real-world recipes
Most calls to Pygments are one-liners. The recipes below cover the four formatter targets most people actually use, plus two lexer-selection patterns.
Recipe 1 — Highlight a string to HTML for a static site.
```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter

# The sample source keeps its 4-space indent inside the string literal.
code = 'def hi(name: str) -> str:\n    return f"hi {name}"\n'
html = highlight(code, PythonLexer(), HtmlFormatter(cssclass="codehilite"))
# CSS is generated separately; the selector must match cssclass above.
css = HtmlFormatter(style="monokai").get_style_defs(".codehilite")
print(html[:120])
```
Output: <div class="codehilite"><pre><span></span><span class="k">def</span> <span class="nf">hi</span>... — wrapped in spans with token-class attributes. get_style_defs(...) returns the matching CSS block to drop into your stylesheet.
Recipe 2 — Highlight to ANSI for a terminal pager.
```python
from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import Terminal256Formatter

print(highlight("for i in 1..3 { println!(\"{i}\") }",
                get_lexer_by_name("rust"),
                Terminal256Formatter(style="monokai")))
```
Output: the Rust snippet printed with 256-color ANSI escapes — looks like bat's output, ready to pipe to less -R.
Recipe 3 — Pick a lexer by filename.
```python
from pygments.lexers import get_lexer_for_filename

lexer = get_lexer_for_filename("script.ts")
print(lexer.name)
```
Output: TypeScript. Pygments matches the filename against each lexer's glob patterns. A *.txt name resolves to the plain-text lexer; an extension that no lexer registers raises ClassNotFound, so catch it and fall through to get_lexer_by_name("text").
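The fall-through to the plain-text lexer can be sketched as a small wrapper. This is a minimal sketch, not a prescribed pattern: `lexer_for` is a hypothetical helper name, and it relies on `get_lexer_for_filename` raising `pygments.util.ClassNotFound` when no filename pattern matches.

```python
from pygments.lexers import get_lexer_by_name, get_lexer_for_filename
from pygments.util import ClassNotFound


def lexer_for(filename):
    """Return a lexer for `filename`, falling back to plain text.

    Hypothetical helper for illustration; not part of Pygments.
    """
    try:
        return get_lexer_for_filename(filename)
    except ClassNotFound:
        # No registered filename pattern matched; render as plain text.
        return get_lexer_by_name("text")


known = lexer_for("script.ts")
fallback = lexer_for("notes.unknownext")
print(known.name, "|", fallback.name)
```

Catching only `ClassNotFound` keeps unrelated failures visible instead of silently rendering everything as plain text.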
Recipe 4 — Pick a lexer by content (analyse_text heuristic).
```python
from pygments.lexers import guess_lexer

src = "#!/usr/bin/env bash\nset -euo pipefail\necho hi\n"
print(guess_lexer(src).name)
```
Output: Bash — guess_lexer runs each lexer's analyse_text(src) -> 0.0..1.0 confidence and picks the winner. Useful when the filename is gone (uploaded snippets, paste boards).
Recipe 5 — Custom formatter that emits Markdown fences.
```python
from pygments import highlight
from pygments.formatter import Formatter
from pygments.lexers import PythonLexer


class FenceFormatter(Formatter):
    """Emit the source text unchanged, wrapped in a Markdown code fence."""

    def __init__(self, lang="text", **kw):
        super().__init__(**kw)
        self.lang = lang

    def format(self, tokensource, outfile):
        outfile.write(f"```{self.lang}\n")
        for _, value in tokensource:  # ignore token types, keep the text
            outfile.write(value)
        outfile.write("```\n")


print(highlight("x = 1", PythonLexer(), FenceFormatter(lang="python")))
```
Output:

````text
```python
x = 1
```
````

`Formatter` subclasses can do anything with the (`token`, `value`) stream: emit JSON token tables, count lines, strip comments, or render to a custom DOM.

Recipe 6 — Highlight to LaTeX for a typeset PDF.

```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import LatexFormatter

print(highlight("x = [1, 2, 3]", PythonLexer(), LatexFormatter()))
print(LatexFormatter().get_style_defs())
```
Output: \begin{Verbatim}[commandchars=\\\{\}] wrapping the colorized tokens, plus a \PY{...} color-command preamble.
Performance tuning
Pygments is a pure-Python lexer — fast enough for documentation builds but not for IDE-style real-time highlighting on large files.
- Cache the lexer object. `get_lexer_by_name("python")` instantiates on each call; build it once and reuse.
- Cache `HtmlFormatter(style=...).get_style_defs(...)`. It's small but constant per build.
- Lex once, format twice. If you need both HTML and ANSI output, materialize `list(lexer.get_tokens(code))` and pass that list to each formatter through `pygments.format()`. The raw iterator is exhausted after the first formatter consumes it.
- For documentation builds, parallelize at the page level. Sphinx and MkDocs already do this; if you're rolling your own, use a worker pool, since each page is independent.
- Avoid `guess_lexer` on every file. It runs every lexer's `analyse_text`; on >100 inputs the cost dominates. Cache the guess by content hash.
- Tree-sitter is 10-100× faster if you can afford the grammar packages. For any "highlight 10,000 files" job, Pygments will be the bottleneck.
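The "cache the lexer" and "lex once, format twice" tips combine naturally. The sketch below is one possible arrangement, not the only one: `PY_LEXER` and `render_both` are hypothetical names, and the token stream is turned into a list because `get_tokens()` yields a one-shot iterator.

```python
from pygments import format as pygments_format
from pygments.formatters import HtmlFormatter, Terminal256Formatter
from pygments.lexers import get_lexer_by_name

# Built once at import time and reused for every snippet.
PY_LEXER = get_lexer_by_name("python")


def render_both(code):
    """Lex `code` once and return (html, ansi) renderings."""
    # Materialize the tokens so two formatters can consume the same stream.
    tokens = list(PY_LEXER.get_tokens(code))
    html = pygments_format(tokens, HtmlFormatter(cssclass="codehilite"))
    ansi = pygments_format(tokens, Terminal256Formatter(style="monokai"))
    return html, ansi


html_out, ansi_out = render_both("x = 1\n")
print(html_out)
```

Holding the full token list in memory is fine for page-sized snippets; for very large files, lexing twice may be the better trade.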
Version migration guide
- `2.10 → 2.11`: removed several long-deprecated lexer aliases. Check whether `get_lexer_by_name` raises for any alias you use.
- `2.13 → 2.14`: `HtmlFormatter`'s `cssclass=` defaults to `highlight` when omitted. Pass it explicitly and pin your CSS to that name.
- `2.15 → 2.16`: dropped Python 3.6. The `python` lexer covers a single Python 3 dialect; `python3` resolves to the same lexer, and legacy code needs `python2`.
- `2.17 → 2.18`: improved Rust lexer; some token classes renamed (`Token.Name.Function.Magic` → `Token.Name.Function.Builtin`). Update custom styles that referenced the old names.
- `2.18 → 2.19`: new lexers added for modern languages; existing API unchanged.
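```python
from pygments.formatters import HtmlFormatter

# Before (2.13)
fmt = HtmlFormatter(cssclass="codehilite")

# After (2.14+)
# Same call works; if cssclass is omitted, the wrapper class is "highlight".
fmt = HtmlFormatter(cssclass="codehilite")
```
Output: identical HTML. Passing cssclass explicitly keeps the wrapper class, and the CSS selector that targets it, independent of the default.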
Production deployment notes
- Build CSS once at deploy time, not per request. `HtmlFormatter(style="monokai").get_style_defs(".codehilite")` should land in your static CSS bundle.
- Use `cssclass=` consistently. Every formatter call must agree with the CSS class; mismatched names produce un-styled `<div>`s.
- Style names are version-locked. If you bundle Pygments-generated CSS, regenerate when bumping Pygments. Mixing CSS from `2.17` with HTML from `2.19` causes subtle color drift.
- Sphinx/MkDocs already cache, so don't try to add a layer. Their builders ship correct cache invalidation.
- For server-side highlighting in a web app, render to HTML at write-time (when content is uploaded), not read-time. Stored HTML costs one render per snippet; on-demand highlighting in a hot path costs one render per request.
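Generating the stylesheet at deploy time could look like the sketch below. The function name `write_pygments_css` and the `static/pygments.css` location are assumptions for illustration; the demo writes into a temporary directory standing in for a real static bundle.

```python
import tempfile
from pathlib import Path

from pygments.formatters import HtmlFormatter


def write_pygments_css(out_path, style="monokai", selector=".codehilite"):
    """Write the stylesheet for one style, scoped under `selector`.

    Hypothetical build-step helper; call it once per deploy.
    """
    css = HtmlFormatter(style=style).get_style_defs(selector)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(css + "\n", encoding="utf-8")
    return out_path


# Demo target: a throwaway directory instead of the real static bundle.
css_file = write_pygments_css(Path(tempfile.mkdtemp()) / "static" / "pygments.css")
print(css_file.read_text(encoding="utf-8")[:80])
```

Rerunning this step whenever the Pygments version changes keeps the CSS and the generated HTML in step.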
Security considerations
- Pygments lexers are regex-driven and can backtrack. Pathological input on some lexers triggers near-quadratic time. For user-uploaded snippets, set a length cap and a timeout.
- `guess_lexer` runs every lexer's analyser, which is much more expensive than a known-filename path. Disable on untrusted input or constrain to a small lexer list.
- HTML output is safe by default. Pygments HTML-escapes token values; never disable that.
- Custom formatters are arbitrary code. Plugin loading via entry points pulls in third-party packages, so vet what's installed.
- Style file injection. If a user supplies a `style=` name, validate against the known list (`pygments.styles.get_all_styles()`). An unknown name raises, but a path-traversal-looking input shouldn't even reach the call.
- No native code. No CVEs from compiled-extension memory bugs; the attack surface is pure-Python regex and string handling.
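The length cap and the style allow-list can be sketched as a thin wrapper around `highlight()`. Treat this as an illustration under stated assumptions: `render_untrusted` and the `MAX_CHARS` value are hypothetical, and the timeout is deliberately left out because enforcing one usually needs a separate worker process.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_by_name
from pygments.styles import get_all_styles
from pygments.util import ClassNotFound

MAX_CHARS = 20_000  # hypothetical cap; tune for your workload
ALLOWED_STYLES = frozenset(get_all_styles())


def render_untrusted(code, lang, style="default"):
    """Highlight user-supplied code after basic input checks."""
    if len(code) > MAX_CHARS:
        raise ValueError("snippet too long")
    if style not in ALLOWED_STYLES:
        # Reject anything not on the known list before it reaches Pygments.
        raise ValueError("unknown style")
    try:
        lexer = get_lexer_by_name(lang)
    except ClassNotFound:
        lexer = get_lexer_by_name("text")  # unknown language: plain text
    return highlight(code, lexer, HtmlFormatter(style=style, cssclass="codehilite"))


print(render_untrusted("<script>alert(1)</script>", "text"))
```

The language name is taken explicitly here, so `guess_lexer` never runs on untrusted input.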
Testing & CI integration
```python
# pip install pygments
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer


def test_python_highlight_contains_def_token():
    out = highlight("def f(): pass\n", PythonLexer(), HtmlFormatter())
    assert 'class="k"' in out  # keyword token
    assert "def" in out
```
Output: test passes. The class="k" (Keyword) check is more stable than asserting full HTML strings — Pygments occasionally reformats whitespace.
For Sphinx/MkDocs sites, snapshot the rendered HTML of one page and diff in CI; that catches Pygments style-class renames at upgrade time.
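If the HTML class names themselves are what worries you, a test can assert on the token stream instead of on markup. This is a sketch of that alternative, using the same snippet as above; the test name is illustrative.

```python
from pygments.lexers import PythonLexer
from pygments.token import Keyword, Name


def test_def_is_lexed_as_keyword():
    # get_tokens() yields (token_type, text) pairs; no formatter is involved.
    tokens = list(PythonLexer().get_tokens("def f(): pass\n"))
    assert (Keyword, "def") in tokens
    assert (Name.Function, "f") in tokens


test_def_is_lexed_as_keyword()
```

Token-level assertions survive formatter and style changes, though they still break if a lexer reclassifies a token.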
Ecosystem integrations
- Sphinx: the built-in `code-block::` directive uses Pygments; theme CSS often includes a Pygments style.
- MkDocs / Material for MkDocs: `pymdownx.highlight` uses Pygments by default; switch to `pygments_lang_class: true` for stable class names.
- Jupyter: notebook syntax highlighting routes through Pygments; nbconvert exports via Pygments formatters.
- IPython: terminal REPL uses `Terminal256Formatter`.
- rich: `rich.syntax.Syntax` wraps Pygments and adds line numbers and themes.
- `bat` (Rust): not Pygments-based, but the practical CLI alternative.
- `black` / `ruff format`: not Pygments-based either; any `--diff` coloring comes from their own code (ruff is a Rust binary).
- `pygments-style-*`: community style packages installed as entry points; they appear in `get_all_styles()` after install.
Troubleshooting common errors
| Error / Symptom | Likely cause | Fix |
|---|---|---|
| ClassNotFound: no lexer for alias 'foo' | Typo or dropped alias | get_all_lexers() or pygmentize -L lexer to see the live list. |
| Output HTML is un-styled | Missing CSS or cssclass= mismatch | Run HtmlFormatter(style="...").get_style_defs(".X") and inline/output it; ensure <div class="X"> matches. |
| Wrong language detected | guess_lexer ambiguity | Pass get_lexer_for_filename or hard-code get_lexer_by_name. |
| Token.Name.Function.Magic style ignored | Token renamed in newer Pygments | Update style sheet, use the new token names. |
| Pygments-rendered CSS clashes with site theme | Pygments uses generic .k, .s classes | Set cssclass= and a scoping CSS prefix. |
| Long input runs slowly | Lexer regex backtracking | Cap input size; profile and consider tree-sitter for repeated calls. |
| 'cli' extra import fails | Old confusion with httpx; Pygments has no [cli] extra | Just pip install pygments; the CLI ships unconditionally. |
When NOT to use this
- Real-time editor highlighting — too slow for keystroke-by-keystroke updates. Use tree-sitter.
- Just code blocks in Markdown. Most static-site generators already invoke Pygments; you don't need to call it directly.
- One language, ultra-simple output. A 20-line regex highlighter is faster and easier to vendor.
- JavaScript runtimes. Use `shiki` or `prismjs`; Pygments doesn't run in the browser.
- You already use `rich`. Use `rich.syntax.Syntax` instead: same engine, better integration.
Compatibility matrix
| Python | Pygments line | Notes |
|---|---|---|
| 3.7 | 2.17 and earlier | Last supported by the 2.17 line; legacy only. |
| 3.8 | 2.18, 2.19 | Current minimum for new releases. |
| 3.9–3.12 | 2.x | Fully supported. |
| 3.13 | 2.18+ | Free-threaded build works. |
See also
- Python: pygmentize — the CLI shipped with Pygments
- Concept: regex — lexers are regex-engine state machines