#parsing
8 pages tagged parsing.
PyYAML
Package-level reference for PyYAML on PyPI — safe_load vs load, dump, custom tags, install, alternatives like ruamel.yaml.
python-dateutil
Package-level reference for python-dateutil on PyPI — parser, relativedelta, rrule, timezone, install, alternatives.
beautifulsoup4
Package-level reference for beautifulsoup4 on PyPI — install variants, parser-backend selection (lxml/html5lib/html.parser), and alternatives.
Regular Expressions
A pattern-matching mini-language for searching, validating, and rewriting text, implemented (with subtly different dialects) by nearly every modern language and many CLI tools.
json
Encode and decode JSON in Python with the stdlib json module. Covers dumps/loads, indent/sort_keys/separators, custom default= and JSONEncoder, object_hook decoding, JSONL streaming, and orjson/ujson/msgspec comparison.
datetime
Work with dates, times, and time zones in Python using the stdlib datetime and zoneinfo modules. Covers aware vs naive datetimes, ISO 8601 parsing, strftime/strptime, timedelta arithmetic, and DST handling.
BeautifulSoup
Parse, search, and mutate HTML/XML with BeautifulSoup 4. Covers parser choice (html.parser/lxml/html5lib), find/find_all/select, tree navigation, attribute access, and pairing with requests/httpx/playwright for end-to-end scraping.
unstructured
Extract structured text from PDFs, Word docs, HTML, images, and more with the unstructured library. Covers partitioning, chunking, cleaning, metadata, and pipeline integrations.