Hacker News

Let me toss a grenade in here.

What if we didn’t measure success by sales, but by impact on the industry (or society), or value to people’s lives?

Zooming out to AI broadly: what if we didn’t measure intelligence by (game-able, arguably meaningless) benchmarks, but by real-world use cases, adaptability, etc.?



I recently watched some Claude Plays Pokemon and believe it's a better measure than all those AI benchmarks. The game can be beaten by an 8-year-old, who obviously doesn't have all the knowledge that even small local LLMs possess, but who has actual intelligence and could figure out the game in under 100 hours. So far Claude can't even get past the first half, and I doubt any other AI could get much further.


Now I want to watch Claude play Pokemon Go, hitching a ride on self-driving cars to random destinations and then trying to autonomously interpret a live video feed to spin the ball at the right pixels...

2026 news feed: Anthropic cited as AI agents simultaneously block traffic across 42 major cities while trying to capture a not-even-that-rare pokemon


the true measure of AI: does it have fun playing pokemon? did it make friends along the way?


We humans love quantifiability. Since you used the word "measure", do you believe the measurement you're aspiring for is quantifiable?

I currently assert that it's not, but I would also say that trying to follow your suggestion is better than our current approach of measuring everything by money.


> We humans love quantifiability.

No. Screw quantifiability. I don't want "we've improved the SOTA by 1.931%" on basically anything that matters. Show me improvements that are obvious, improvements that stand out.

Claude Plays Pokemon is one of the few really important "benchmarks". No numbers, just the progress and the mood.


This is difficult to do because one of the juiciest parts of AI is being able to take credit for its work.



