1 page tagged golden-dataset.
Build production evaluation pipelines for LLM applications: golden datasets, LLM-as-judge, rubrics, statistical significance, regression detection, and evals vs. tests.