
This was trained on 6T tokens. Neat to see so many tokens used for such a small model.

