Inference Cost

Intelligence Per Dollar

June 3, 2026

AI metrics

Yesterday Microsoft added a new metric to a model release card, one that will likely become a standard.¹

Average token usage.

In the first row, the Microsoft model hits 71.6 on SWE-Bench Verified using about a third of the tokens Claude Haiku 4.5 burns.

Benchmarks are now measured on two dimensions: overall performance & the cost to achieve it.
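
One way to fold those two dimensions into a single number is to divide the benchmark score by the dollars spent per task. The sketch below is only an illustration: the class, the field names, the token counts & the prices are all hypothetical placeholders, not figures from the release card.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """One row of a hypothetical release card."""
    name: str
    score: float                   # benchmark score, e.g. percent of tasks resolved
    avg_tokens: float              # average tokens used per task
    usd_per_million_tokens: float  # blended token price (assumed, not published)

    def cost_per_task(self) -> float:
        # Dollars spent on an average task at the assumed price.
        return self.avg_tokens / 1_000_000 * self.usd_per_million_tokens

    def points_per_dollar(self) -> float:
        # Benchmark points bought per dollar of inference on an average task.
        return self.score / self.cost_per_task()


# Placeholder numbers: same score & price, but model_a uses a third of the tokens.
model_a = BenchmarkResult("model-a", score=70.0, avg_tokens=300_000, usd_per_million_tokens=2.0)
model_b = BenchmarkResult("model-b", score=70.0, avg_tokens=900_000, usd_per_million_tokens=2.0)

for m in (model_a, model_b):
    print(f"{m.name}: ${m.cost_per_task():.2f} per task, "
          f"{m.points_per_dollar():.1f} points per dollar")
```

At the same score & the same price, a model that uses a third of the tokens buys three times the points per dollar. Real comparisons would also need each model's actual price, which can differ as much as the token counts do.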