Data drift occurs when the data feeding a production AI system changes over time, so the model’s inputs no longer match what it was built and validated on, even though the model code is unchanged. The shift can affect feature distributions, schemas, value ranges, or upstream pipelines, and it quietly degrades accuracy until someone notices the downstream impact.
Drift is hard to act on when the data state behind each run isn’t pinned to a known version. If you can compare a live dataset against a released, AI-ready baseline, you can see exactly which fields and distributions moved and reproduce the earlier state to confirm the cause.
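As a rough illustration of that comparison, the sketch below checks a live sample against a stored baseline for three of the drift types mentioned above: schema changes, values outside the baseline range, and distribution shift. It is a minimal sketch under stated assumptions, not a reference implementation. The function names (`psi`, `compare_to_baseline`), the list-of-dicts row format, and the sample field names are all hypothetical. Distribution shift is scored with the Population Stability Index, one common choice among several, and the 0.2 cutoff is a widely used rule of thumb rather than a fixed standard.

```python
import math


def psi(baseline, live, bins=10):
    """Population Stability Index between two numeric samples.

    Bin edges come from the baseline; live values outside that range
    are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant baseline

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    base_share, live_share = shares(baseline), shares(live)
    return sum(
        (lv - bv) * math.log(lv / bv)
        for bv, lv in zip(base_share, live_share)
    )


def compare_to_baseline(baseline_rows, live_rows, psi_threshold=0.2):
    """Report schema, range, and distribution differences between two datasets.

    Rows are assumed to be dicts keyed by field name (an illustrative format).
    """
    base_fields = set().union(*(r.keys() for r in baseline_rows))
    live_fields = set().union(*(r.keys() for r in live_rows))
    report = {
        "added_fields": sorted(live_fields - base_fields),
        "removed_fields": sorted(base_fields - live_fields),
        "out_of_range": {},  # field -> count of live values outside baseline min/max
        "shifted": {},       # field -> PSI score above the threshold
    }
    for field in sorted(base_fields & live_fields):
        base = [r[field] for r in baseline_rows
                if isinstance(r.get(field), (int, float))]
        live = [r[field] for r in live_rows
                if isinstance(r.get(field), (int, float))]
        if not base or not live:
            continue  # only numeric fields are scored in this sketch
        lo, hi = min(base), max(base)
        outside = sum(1 for v in live if v < lo or v > hi)
        if outside:
            report["out_of_range"][field] = outside
        score = psi(base, live)
        if score > psi_threshold:
            report["shifted"][field] = round(score, 3)
    return report


# Hypothetical data: the live feed renamed a field and skews older.
baseline = [{"age": a, "income": 1000 * a} for a in range(20, 60)]
live = [{"age": a, "income_usd": 1000 * a} for a in range(50, 90)]
report = compare_to_baseline(baseline, live)
```

Here `report` would flag `income_usd` as added, `income` as removed, and `age` as both out of range and shifted. Because the baseline is a stored snapshot, the same comparison can be rerun later against the same reference to confirm which change caused a regression.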