cheat sheet

idna

Package-level reference for idna on PyPI — IDNA2008 vs UTS46, encode/decode, install, integration with requests / urllib3, alternatives.

updated 05-31-2026

What it is

idna is a Python implementation of RFC 5891 (IDNA2008) for converting between Unicode domain names like münchen.de and their ASCII-compatible encoding (xn--mnchen-3ya.de). It also implements the compatibility mapping from Unicode Technical Standard #46 (UTS46), which accepts input that browsers have historically tolerated, such as uppercase letters. The library sits on the request path of requests, urllib3, httpx, and other HTTP libraries: they call idna.encode() on non-ASCII hostnames before passing the host to the resolver.

Reach for idna directly when you need to: validate or normalize a user-supplied domain name; convert between Unicode and Punycode forms; check whether a label conforms to IDNA2008 rules; or implement a protocol (SMTP, FTP) that requires IDN-safe hostnames.

Install

bash

pip install idna

Output: pip ends with "Successfully installed idna-<version>"; the package is pure Python with zero dependencies

bash

uv add idna

Output: dependency resolved + added to pyproject.toml

bash

poetry add idna

Output: updated lockfile + virtualenv install

There are no optional extras — idna is a single pure-Python package with no install variants.

Versioning & Python support

Current line is the 3.x series. Semantic versioning — minor releases are backwards-compatible.
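The 2.x line is frozen and exists for Python 2 users (pin idna<3 there); remaining downstream pins gradually migrate.
The 3.x line is Python 3 only. Its minimum Python version has been raised during the series, so check the Requires-Python metadata of the release you pin.
The library is small and stable. Releases are infrequent and mostly refresh the Unicode tables to match newer versions of the Unicode standard.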

Package metadata

Maintainer: Kim Davies
Project home: github.com/kjd/idna
Docs: pypi.org/project/idna
License: BSD-3-Clause
Governance: single-maintainer; ICANN engagement
First released: 2014
Downloads: consistently in PyPI top 10 (transitive via requests, urllib3, httpx)
Standards followed: RFC 5891 (IDNA2008), RFC 5892 (tables), UTS46

Optional dependencies & extras

None. idna has no third-party dependencies.

Alternatives

Package	Trade-off
`encodings.idna` (stdlib)	Implements only the older IDNA2003 RFC. Use only as a last resort — it accepts strings IDNA2008 would reject.
`libidn2` (via ctypes / bindings)	Reference C implementation. Faster on large batches; native dep.
`tld`	Higher-level — extracts effective TLD ("public suffix"). Different layer; pair with `idna`.
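The gap between the stdlib codec and idna shows up on the German sharp s. A minimal sketch, assuming idna is installed; the sample name straße.de is only an illustration:

```python
import idna

name = "straße.de"

# Stdlib codec (IDNA2003): nameprep folds ß to "ss" before encoding
stdlib_ascii = name.encode("idna")

# idna package (IDNA2008): ß is a valid codepoint and survives via Punycode
idna_ascii = idna.encode(name)

print(stdlib_ascii)  # b'strasse.de'
print(idna_ascii)    # b'xn--strae-oqa.de'
```

The two results are different domain names, which is why mixing the two codecs within one system can send traffic to the wrong host.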

Common gotchas

idna.encode() returns bytes, not str. Many callers want .decode("ascii") afterward to get a regular string like "xn--mnchen-3ya.de".
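IDNA2008 is strict. Labels containing codepoints the tables disallow (uppercase letters in a non-ASCII label, most symbols, emoji) or breaking the hyphen, contextual, or bidi rules raise IDNAError. Mixed-script labels (Latin + Cyrillic, say) are not rejected by the protocol itself; that check is left to registries and to your own policy. The stdlib idna codec follows the older, more permissive IDNA2003 rules, so acceptance there is sometimes a sign of bugs.
uts46=True applies the UTS46 mapping before encoding: it lowercases, maps compatibility characters such as full-width forms, and normalizes to NFC. Use it for parsing user input from address bars.
Empty labels (consecutive dots) and labels longer than 63 octets raise IDNAError. ASCII-only valid hosts pass through unchanged.
A single trailing dot (example.com.) is accepted and preserved: idna.encode("example.com.") returns b"example.com.". A leading dot or consecutive dots produce an empty label and raise.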
alabel() and ulabel() are per-label functions; for full names, use encode() and decode() which split on . for you.
Internationalized TLDs. .рф, .中国, .tokyo all work through idna — there's nothing special to enable.
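Several of the gotchas above can be checked in a few lines. This is a small sketch, assuming idna 3.x is installed; the sample names are illustrative only:

```python
import idna

# encode() returns bytes; decode to get a plain str
encoded = idna.encode("münchen.de")
print(type(encoded).__name__, encoded.decode("ascii"))  # bytes xn--mnchen-3ya.de

# A single trailing dot (fully qualified name) is kept
fqdn = idna.encode("example.com.")
print(fqdn)  # b'example.com.'

# alabel() and ulabel() work on one label at a time
a_label = idna.alabel("münchen")
u_label = idna.ulabel("xn--mnchen-3ya")
print(a_label)  # b'xn--mnchen-3ya'
print(u_label)  # münchen
```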

Real-world recipes

The recipes cover the operations you'll actually do: encode, decode, validate, choose between UTS46 and strict IDNA2008, and round-trip a name.

Recipe 1 — Encode an internationalised domain to Punycode.

python

import idna
ascii_name = idna.encode("münchen.de").decode("ascii")
print(ascii_name)

Output:

text

xn--mnchen-3ya.de

Pass ascii_name to socket.getaddrinfo() or any other ASCII-only API.

Recipe 2 — Decode Punycode back to Unicode.

python

import idna
print(idna.decode("xn--mnchen-3ya.de"))
print(idna.decode("xn--80akhbyknj4f.xn--p1ai"))

Output:

text

münchen.de
испытание.рф

decode() accepts either bytes or str input.

Recipe 3 — Validate an arbitrary domain.

python

import idna

def is_valid_domain(name: str) -> bool:
    try:
        idna.encode(name)
        return True
    except idna.IDNAError:
        return False

print(is_valid_domain("example.com"))     # True
print(is_valid_domain("münchen.de"))      # True
print(is_valid_domain("ab--cd.com"))      # False: hyphens in 3rd and 4th position are reserved
print(is_valid_domain("-leading.com"))    # False: label starts with a hyphen

Output:

text

True
True
False
False

idna.IDNAError covers every rejection reason; inspect str(exc) for the specific cause.
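When the caller needs to know why a name failed, the exception subclasses that idna exports (InvalidCodepoint, IDNABidiError) can be caught before the base class. A minimal sketch; rejection_reason is a hypothetical helper name, not part of the library:

```python
from typing import Optional

import idna


def rejection_reason(name: str) -> Optional[str]:
    """Return None if the name encodes, else a short description of why not."""
    try:
        idna.encode(name)
    except idna.InvalidCodepoint as exc:
        # A character that the IDNA2008 tables do not allow
        return f"disallowed codepoint: {exc}"
    except idna.IDNABidiError as exc:
        # Right-to-left label that breaks the bidi rule
        return f"bidi rule violation: {exc}"
    except idna.IDNAError as exc:
        # Everything else: empty labels, hyphen placement, length limits
        return f"invalid name: {exc}"
    return None


for candidate in ["example.com", "ab--cd.com", "a..b", "exa mple.com"]:
    print(repr(candidate), "->", rejection_reason(candidate))
```

Order matters here: the subclasses must be listed before idna.IDNAError, otherwise the base-class handler would catch everything first.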

Recipe 4 — UTS46 vs IDNA2008 — browser-permissive vs spec-strict.

python

import idna

# Strict IDNA2008: uppercase in a non-ASCII label is a disallowed codepoint
try:
    idna.encode("München.de")
except idna.IDNAError as e:
    print("strict:", e)

# UTS46: lowercases and maps compatibility characters first (browser behavior)
print("uts46:", idna.encode("München.de", uts46=True).decode())

# UTS46 transitional: additionally maps deviation characters such as ß (sharp s, eszett)
print("uts46 transitional:", idna.encode("straße.de", uts46=True, transitional=True).decode())

Output:

text

strict: Codepoint U+004D at position 1 of 'München' not allowed
uts46: xn--mnchen-3ya.de
uts46 transitional: strasse.de

encode() defaults to uts46=False. With uts46=True, transitional defaults to False, which keeps ß as a distinct character (straße.de encodes to xn--strae-oqa.de). transitional=True maps ß → ss, the historical browser behavior; it is rarely what you want today.

Recipe 5 — Round-trip with non-ASCII TLD.

python

import idna
original = "тест.испытание"     # Cyrillic "test" under the Cyrillic test TLD
encoded = idna.encode(original).decode("ascii")
decoded = idna.decode(encoded)
print(encoded)
print(decoded)
print(decoded == original)

Output:

text

xn--e1aybc.xn--80akhbyknj4f
тест.испытание
True

Round-trip is lossless for valid IDNA2008 input.

Performance tuning

idna is fast enough for typical use: roughly 10 to 50 µs per encode() call, so it is not a hot path in any normal HTTP stack.
Cache results when batch-processing large domain lists: wrap idna.encode in functools.lru_cache(maxsize=10_000) if you re-encode the same names repeatedly.
The Unicode tables are pregenerated and shipped inside the package. Nothing is generated at install time or downloaded at runtime.
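The caching advice above might look like the following. This is a sketch under the assumption that names repeat often in the batch; cached_encode is a hypothetical wrapper name:

```python
from functools import lru_cache

import idna


@lru_cache(maxsize=10_000)
def cached_encode(name: str) -> str:
    """Encode a domain to its ASCII form, memoizing successful results."""
    return idna.encode(name).decode("ascii")


names = ["münchen.de", "example.com", "münchen.de"]
encoded = [cached_encode(n) for n in names]
print(encoded)
print(cached_encode.cache_info())  # the repeated name is served from the cache
```

lru_cache does not store exceptions, so invalid names are re-validated on every call; keep a separate set of rejected names if the input contains many of them.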

Version migration guide

2.x → 3.0: Python 2 support was dropped; the 2.x line remains for Python 2 users (pin idna<3).
3.0 → 3.2: maintenance releases; the encode()/decode() signatures are unchanged and uts46 still defaults to False.
3.4 → 3.6: Unicode tables refreshed to match Unicode 15.x.
3.6 → 3.7: security fix. Specially crafted input to encode() could take a very long time to process (CVE-2024-3651); upgrade if you encode untrusted input.

python

# Avoid importing helpers from internal modules:
# from idna.codec import ulabel, alabel
# Prefer the top-level exports
from idna import alabel, ulabel

Output: same semantics; cleaner imports.

Security considerations

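Homograph attacks are the main security risk that IDNs introduce. IDNA2008 rejects many confusable symbols, but it does not reject mixed-script labels (a Cyrillic а standing in for Latin a encodes without error), so add your own script or confusables check where spoofing matters.
uts46=True, transitional=True maps ß → ss, so straße.de and strasse.de collapse to the same ASCII name even though they are distinct names under IDNA2008. Avoid transitional processing in security-sensitive contexts.
Always normalize before comparing. Compare encoded forms (idna.encode(a) == idna.encode(b)), never raw Unicode: é (U+00E9) ≠ e + combining acute (U+0065 U+0301). Strict encode() rejects labels that are not in NFC, while uts46=True normalizes to NFC first.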
Allowlist your TLDs. IDNA validates the format of labels; it doesn't tell you whether .zz is a real TLD. Pair with tld or the IANA TLD list.
Email IDN. SMTP and Email IDN (SMTPUTF8) have different rules — for email use idna for the domain part only, after @.
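The comparison rule above can be sketched as follows. same_domain is a hypothetical helper, and using uts46=True here is an assumption that suits user-supplied input; both names must still be valid or idna.IDNAError propagates:

```python
import idna


def same_domain(a: str, b: str) -> bool:
    """Compare two names by their UTS46-mapped ASCII forms."""
    return idna.encode(a, uts46=True) == idna.encode(b, uts46=True)


composed = "caf\u00e9.com"      # é as a single codepoint
decomposed = "cafe\u0301.com"   # e followed by a combining acute

print(composed == decomposed)                  # False: raw strings differ
print(same_domain(composed, decomposed))       # True: both normalize to NFC
print(same_domain("straße.de", "strasse.de"))  # False: ß stays distinct
```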

Testing & CI

python

import idna, pytest

@pytest.mark.parametrize("name,expected", [
    ("example.com", "example.com"),
    ("münchen.de", "xn--mnchen-3ya.de"),
    ("испытание.рф", "xn--80akhbyknj4f.xn--p1ai"),
])
def test_encode(name, expected):
    assert idna.encode(name).decode("ascii") == expected

@pytest.mark.parametrize("bad", ["ab--cd.com", "-leading.com", "exa mple.com", "", ".com"])
def test_rejects(bad):
    with pytest.raises(idna.IDNAError):
        idna.encode(bad)

Output: parametrised test passes for valid names and asserts that invalid forms raise IDNAError.

Ecosystem integrations

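requests: encodes non-ASCII hostnames with idna.encode(host, uts46=True) while preparing a URL.
urllib3: encodes non-ASCII hostnames with idna and reports a parse error if the package is missing.
httpx: encodes non-ASCII hostnames with idna.
smtplib / aiosmtplib: smtplib is part of the standard library and does not import idna, so encode the domain part yourself; check aiosmtplib's documentation for its behavior.
cryptography: older releases used idna to parse U-labels in X.509 names; current releases expect A-labels, so encode with idna before building a Subject Alternative Name (SAN).
dnspython: uses idna for IDNA2008 name encoding when the package is installed.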

Compatibility matrix

Python	`idna`	Notes
3.5	older `3.x`	Dropped during the 3.x series; pip resolves the last compatible release.
3.6	`3.x`	Lowest 3.x floor.
3.7	`3.x`	Stable.
3.8	`3.x`	Stable.
3.9	`3.x`	Stable.
3.10	`3.x`	Stable.
3.11	`3.x`	Stable.
3.12	`3.x`	Stable.
3.13	`3.x`	Wheel available immediately.

idna is pure Python — wheel availability is universal.

Production deployment

Pin a minimum version (idna>=3.7) to get recent Unicode tables and the fix for CVE-2024-3651.
Validate user-supplied domain input at the boundary with idna.encode(...) inside a try/except idna.IDNAError. Failing fast is preferable to passing invalid hosts down the stack.
Log the encoded form in audit logs — Unicode in logs is a footgun.
uts46=True for user-facing input (browser-style permissiveness); uts46=False for protocol-internal validation (be strict with peer software).
Refresh idna annually to track Unicode standard updates.
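The boundary-validation and logging advice above could be combined into one helper. A minimal sketch; validate_host, its user_facing flag, and the "hosts" logger name are hypothetical choices, not library features:

```python
import logging

import idna

log = logging.getLogger("hosts")


def validate_host(raw: str, user_facing: bool = True) -> str:
    """Return the ASCII form of a host or raise ValueError.

    user_facing=True applies the UTS46 mapping; False enforces strict IDNA2008.
    """
    try:
        ascii_host = idna.encode(raw, uts46=user_facing).decode("ascii")
    except idna.IDNAError as exc:
        raise ValueError(f"invalid host {raw!r}: {exc}") from exc
    # Log only the ASCII form so audit logs stay free of raw Unicode
    log.info("accepted host %s", ascii_host)
    return ascii_host


print(validate_host("MÜNCHEN.de"))  # xn--mnchen-3ya.de
```

Calling validate_host("MÜNCHEN.de", user_facing=False) raises ValueError, because strict mode does not lowercase the non-ASCII label.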

When NOT to use this

You only handle ASCII hostnames. No real need for idna if your traffic is example.com-style. (You'll still get it transitively.)
You need actual DNS resolution. idna is encoding only — use dnspython or socket.getaddrinfo() for resolution.
You need the public suffix list. idna doesn't know .co.uk is a registry suffix — use tld or publicsuffix2.
You're decoding email local parts. Use email.headerregistry or email-validator; IDN rules don't apply to local parts.
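For email addresses, only the part after the last @ goes through idna. A small sketch; encode_email_domain is a hypothetical helper, and it deliberately does no validation of the local part:

```python
import idna


def encode_email_domain(address: str) -> str:
    """Encode only the domain part of an address; the local part is untouched."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local}@{idna.encode(domain).decode('ascii')}"


print(encode_email_domain("info@münchen.de"))  # info@xn--mnchen-3ya.de
```

Pair this with a dedicated email validator for the local part, since IDN rules say nothing about it.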

Troubleshooting common errors

Error / Symptom	Likely cause	Fix
`InvalidCodepoint: Codepoint U+XXXX at position N of '...' not allowed`	Disallowed codepoint (uppercase in a non-ASCII label, symbol, emoji)	Use `uts46=True` for browser-permissive input; reject otherwise.
`IDNAError: Empty Label`	Consecutive dots or a leading dot (a single trailing dot is accepted)	Strip and validate input.
`IDNAError: Label too long`	A label > 63 octets after encoding	Shorten the label.
`TypeError` when joining the result of `idna.encode` with a `str`	Mixing bytes and str	Call `.decode("ascii")` on the result.
`requests` raises `InvalidURL: URL has an invalid label.`	Host failed IDNA encoding inside requests, which already applies `uts46=True`	Validate the host with `idna` before building the URL.
Old stdlib `encodings.idna` accepts what `idna` rejects	Stdlib uses IDNA2003	Always prefer the `idna` library; treat the stdlib codec as legacy.
