What is Ground Truth?

Ground truth is the reference data accepted as correct, used to train machine learning models and to measure how well they perform. It usually takes the form of labeled examples, expert annotations, or a curated benchmark set.

A medical imaging team, for instance, treats radiologist-confirmed diagnoses as ground truth, then scores a model by how often its predictions match those labels.

The reliability of any evaluation depends on the ground truth behind it. If the reference set changes without record, accuracy scores shift for reasons no one can trace. Treating ground truth as a versioned, reproducible data state keeps evaluations comparable across runs and over time.

Frequently asked questions

Why does ground truth matter for AI?

Models are trained and scored against it, so the quality and stability of ground truth set the ceiling for evaluation reliability.

What forms does ground truth take?

Labeled examples, expert annotations, or curated benchmark datasets.

How can ground truth distort results?

If the reference set changes without versioning, evaluation scores drift and become hard to compare across runs.