
> I wonder why scores on TriviaQA vis-a-vis 14b model lags behind Gemma 12b so much; that one is not a formatting-heavy benchmark.

My guess is the vast scale of Google's data. They've been hoovering up data for decades now, and have had curation pipelines (guided by real human interactions) since forever.




