.@christinahkim says the frontier isn't benchmarks anymore. It's usage. Eval scores are saturated, but daily life isn’t. The real signal of progress is how many people use AI to get real things done. That’s how we’ll know we’re approaching AGI.