evals are a bit like unit or system tests

evals are a bit like unit tests or system tests
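
One way to picture the resemblance is the rough sketch below. It is illustrative only: `toy_system`, `run_eval`, and `CASES` are hypothetical names made up for this note, and the lookup table merely stands in for a real model or agent. The unit-test-style check asserts one exact answer, while the eval-style check runs many cases and reports a score.

```python
from typing import Callable

# Hypothetical stand-in for the system being evaluated (a model or an agent).
def toy_system(prompt: str) -> str:
    answers = {"2+2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "unknown")

def test_single_case() -> None:
    # Unit-test style: one input, one exact expected output, pass or fail.
    assert toy_system("2+2") == "4"

def run_eval(system: Callable[[str], str], cases: list[tuple[str, str]]) -> float:
    # Eval style: run every case and return the fraction that match.
    passed = sum(1 for prompt, expected in cases if system(prompt) == expected)
    return passed / len(cases)

# Hypothetical eval set; the last case is one the toy system cannot answer.
CASES = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9")]

if __name__ == "__main__":
    test_single_case()
    print(f"eval score: {run_eval(toy_system, CASES):.2f}")
```

Both shapes feed inputs to a system and compare against expectations; the eval simply aggregates over a set of cases instead of failing on the first mismatch.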

Backlinks

why give evals a different name than just tests?

evals are a bit like unit or system tests, so why not just call evals "tests", as we have for every other software system we've been evaluating for decades? Maybe because of how they started? Evals started off simply evaluating models, and only over time have companies started to develop more complex compound systems, including agents, where evaluating the system is more like doing a system test.