1 page tagged llm-as-judge.
llm-as-judge
Build production evaluation pipelines for LLM applications: golden datasets, LLM-as-judge, rubrics, statistical significance, regression detection, and evals vs. tests.