Exactly. You want to come close to maxing out your RAM for model+context. I've run Gemma on a 64GB M1 and it was pretty okay, although that was before the Quantization-Aware Training version was released last week, so it might be even better now.
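A rough way to sanity-check whether a model+context fits in RAM is to add the quantized weight size to the KV cache size. A sketch below, where every figure (parameter count, layer/head shapes, context length) is an illustrative assumption for a Gemma-27B-class model, not an official spec:

```python
# Back-of-the-envelope RAM estimate for a quantized local model.
# All concrete numbers here are illustrative assumptions, not specs.

def model_weights_gb(params_b: float, bits_per_weight: float) -> float:
    """GB needed for the quantized weights alone."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: int = 2) -> float:
    """GB for the KV cache: two tensors (K and V) per layer, fp16 elements."""
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem / 1e9

# Hypothetical 27B model at 4-bit quantization with a 32k-token context.
weights = model_weights_gb(27, 4.0)
kv = kv_cache_gb(layers=46, kv_heads=16, head_dim=128, context_len=32_768)
print(f"weights ~ {weights:.1f} GB, KV cache ~ {kv:.1f} GB, "
      f"total ~ {weights + kv:.1f} GB")
```

With those assumed numbers the total lands around 26 GB, which is why a 64GB machine has headroom; the same math shows why context length, not just parameter count, decides whether you fit.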

