Hacker News

I was able to fine-tune the pretrained 13B model (using the same Alpaca dataset, just running on my own dime) in about 14 hours on an A6000. I think it will work out to between $50 and $60 when the bill hits me.
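For anyone budgeting a similar run, here is a rough sketch of the hourly rate those numbers imply. The only inputs are the 14 hours and the $50 to $60 estimate above; the function name is made up for illustration, and the result is an all-in figure (whatever ends up on the bill), not a quoted GPU price.

```python
def implied_hourly_rate(total_cost_usd: float, hours: float) -> float:
    """Return the all-in cost per hour implied by a total bill and a run time."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_cost_usd / hours


HOURS = 14                            # approximate run time from the comment
LOW_USD, HIGH_USD = 50.0, 60.0        # estimated bill range from the comment

low_rate = implied_hourly_rate(LOW_USD, HOURS)
high_rate = implied_hourly_rate(HIGH_USD, HOURS)

print(f"${low_rate:.2f} to ${high_rate:.2f} per hour")
# prints "$3.57 to $4.29 per hour"
```

Swap in your own hours and bill to compare against what your provider actually charges.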

Be warned: everyone is slapping code together so fast that, if your experience is like mine, you'll spend most of your time working around assumptions made by prior authors or hand-merging patches between forks to get your setup running well.



Thank you! Very insightful.

Crazy pace.



