Many of the most popular benchmarks for AI models are outdated or poorly designed. Whenever a new AI model is released, it is typically touted as having aced a series of benchmarks.
Paul Allin is a member of the UK National Statistician's Expert User Advisory Committee and the Royal Statistical Society's Honorary Officer for National Statistics. Views expressed in this ...