#parsing

8 pages tagged parsing.

PyYAML

Package-level reference for PyYAML on PyPI — safe_load vs load, dump, custom tags, install, alternatives like ruamel.yaml.

05-31-2026 #pip #package #yaml

python-dateutil

Package-level reference for python-dateutil on PyPI — parser, relativedelta, rrule, timezone, install, alternatives.

05-31-2026 #pip #package #datetime

beautifulsoup4

Package-level reference for beautifulsoup4 on PyPI — install variants, parser-backend selection (lxml/html5lib/html.parser), and alternatives.

05-31-2026 #pip #package #scraping

Regular Expressions

A pattern-matching mini-language for searching, validating, and rewriting text, implemented (with subtly different dialects) by most modern languages and many CLI tools.

05-25-2026 #text #pattern-matching #scripting

json

Encode and decode JSON in Python with the stdlib json module. Covers dumps/loads, indent/sort_keys/separators, custom default= and JSONEncoder, object_hook decoding, JSONL streaming, and orjson/ujson/msgspec comparison.

05-25-2026 #python #stdlib #json

datetime

Work with dates, times, and timezones in Python using the stdlib datetime module and zoneinfo. Covers aware vs naive datetimes, ISO-8601 parsing, strftime/strptime, timedelta arithmetic, and DST handling.

05-25-2026 #python #stdlib #time

BeautifulSoup

Parse, search, and mutate HTML/XML with BeautifulSoup 4. Covers parser choice (html.parser/lxml/html5lib), find/find_all/select, tree navigation, attribute access, and pairing with requests/httpx/playwright for end-to-end scraping.

05-25-2026 #python #beautifulsoup #scraping

unstructured

Extract structured text from PDFs, Word docs, HTML, images, and more with the unstructured library. Covers partitioning, chunking, cleaning, metadata, and pipeline integrations.

04-27-2026 #python #unstructured #pdf