data

33 pages in this category.

unstructured

Package-level reference for unstructured on PyPI — install variants, the huge extras tree, system-level dependencies, and alternative parsers.

05-31-2026#pip#package#ai

streamlit

Package-level reference for the streamlit framework on PyPI — install variants, version policy, extras, and alternatives.

05-31-2026#pip#package#web

scipy

Package-level reference for scipy — install, versioning, submodules, license caveats, and gotchas. Optimization, statistics, signal processing, and linear algebra.

05-31-2026#pip#package#scipy

scikit-learn

Package-level reference for scikit-learn — install, versioning, extras, and gotchas. The de-facto classical-ML library on PyPI.

05-31-2026#pip#package#scikit-learn

prefect

Package-level reference for Prefect on PyPI — install variants, version policy, cloud-vs-OSS extras, and alternatives.

05-31-2026#pip#package#orchestration

polars

Package-level reference for polars — install, versioning, extras, and gotchas. The Rust-powered Arrow-native alternative to pandas.

05-31-2026#pip#package#polars

Pillow

Package-level reference for Pillow on PyPI — install variants, format-specific native deps, version policy, and alternatives.

05-31-2026#pip#package#images

pandas

Package-level reference for pandas — install, versioning, Python compatibility, extras, and gotchas. The de-facto DataFrame library on PyPI.

05-31-2026#pip#package#pandas

numpy

Package-level reference for numpy — install, versioning, ABI breaks, extras, and gotchas. The bedrock of the Python scientific stack.

05-31-2026#pip#package#numpy

modin

Package-level reference for modin — install, backend extras, versioning, and gotchas. Speeds up existing pandas code with a one-line import swap.

05-31-2026#pip#package#modin

matplotlib

Package-level reference for matplotlib on PyPI — install variants, backends, version policy, extras, and alternatives.

05-31-2026#pip#package#plotting

jupyter

Package-level reference for the jupyter meta-package on PyPI — install variants, what it pulls in, version policy, and alternatives.

05-31-2026#pip#package#notebooks

duckdb

Package-level reference for duckdb — install, versioning, extensions, and gotchas. In-process columnar OLAP for Python.

05-31-2026#pip#package#duckdb

dagster

Package-level reference for Dagster on PyPI — install variants, the dagster-* plugin family, version policy, and alternatives.

05-31-2026#pip#package#orchestration

beautifulsoup4

Package-level reference for beautifulsoup4 on PyPI — install variants, parser-backend selection (lxml/html5lib/html.parser), and alternatives.

05-31-2026#pip#package#scraping

Db2 SPUFI

Run SQL through SPUFI, drive Db2 with DSN subsystem commands, BIND packages and plans, schedule DSNTEP2 in JCL, query the SYSIBM catalog, and generate table declarations with DCLGEN.

05-26-2026#db2#sql#spufi

streamlit

Build interactive web apps for data and ML in pure Python. Covers widgets, layout, session state, caching, multipage apps, and deployment patterns.

05-25-2026#python#streamlit#ui

scikit-learn

Build classical ML pipelines with scikit-learn. Covers the estimator API, train_test_split, Pipeline, ColumnTransformer, cross-validation, metrics, and model persistence.

05-25-2026#python#scikit-learn#ml

qsv

Comprehensive reference for qsv: count, headers, stats, moarstats, select, search, sort, dedup, frequency, join, sqlp, luau, apply, schema, validate, sample, split, MCP server, and more — with examples and outputs.

05-25-2026#qsv#csv#cli

json

Encode and decode JSON in Python with the stdlib json module. Covers dumps/loads, indent/sort_keys/separators, custom default= and JSONEncoder, object_hook decoding, JSONL streaming, and orjson/ujson/msgspec comparison.

05-25-2026#python#stdlib#json

jq

Slice, filter, map, and transform JSON data from the command line. Covers all essential filters, built-in functions, select, map, reduce, streaming, jq 1.7/1.8 additions, and real-world API response processing.

05-25-2026#jq#json#data

BeautifulSoup

Parse, search, and mutate HTML/XML with BeautifulSoup 4. Covers parser choice (html.parser/lxml/html5lib), find/find_all/select, tree navigation, attribute access, and pairing with requests/httpx/playwright for end-to-end scraping.

05-25-2026#python#beautifulsoup#scraping

prefect

Build, schedule, and observe Python workflows with Prefect. Covers flows, tasks, retries, schedules, deployments, caching, concurrency, and Prefect Cloud.

04-27-2026#python#prefect#orchestration

polars

High-performance DataFrames with a lazy expression API. Covers read/write, select, filter, group_by, joins, LazyFrame, datetime, string operations, and pandas interop.

04-27-2026#python#polars#dataframes

modin

Speed up pandas workloads across all CPU cores with a one-line import swap. Covers Ray and Dask backends, config tuning, pandas interop, and when modin wins vs polars.

04-27-2026#python#modin#pandas

DuckDB

Run fast analytical SQL queries in-process with DuckDB. Covers Python API, CSV/Parquet ingestion, pandas interop, Arrow, window functions, and persistent databases.

04-27-2026#python#duckdb#sql

dagster

Build, schedule, and observe data pipelines as software-defined assets with Dagster. Covers assets, jobs, schedules, sensors, resources, partitions, and the Dagster UI.

04-27-2026#python#dagster#orchestration

scipy

Statistical distributions, optimization, integration, signal processing, and linear algebra with SciPy. Builds on NumPy arrays.

04-25-2026#python#scipy#statistics

Pillow

Open, resize, crop, convert, and save images with Pillow (PIL fork). Covers format conversion, filters, drawing, and EXIF handling.

04-25-2026#python#pillow#pil

pandas

Load, filter, transform, and aggregate tabular data with pandas. Covers DataFrame creation, read_csv, groupby, merge, and the SettingWithCopyWarning pitfall.

04-25-2026#python#pandas#dataframes

numpy

Create and manipulate N-dimensional arrays with NumPy. Covers array creation, broadcasting, vectorized math, indexing, and matrix operations.

04-25-2026#python#numpy#arrays

matplotlib

Create publication-quality 2-D plots with matplotlib. Covers pyplot basics, subplots, savefig, common chart types, and the show-vs-save pitfall.

04-25-2026#python#matplotlib#plotting

jupyter

Run interactive Python notebooks with Jupyter. Covers JupyterLab setup, cell types, keyboard shortcuts, magic commands, nbconvert export, and common pitfalls.

04-25-2026#python#jupyter#notebooks