
I don’t think it will ever make sense; you can buy so much cloud-based usage for this kind of price.

From my perspective, the biggest problem is that I’m just not going to be using it 24/7, which means I’m not getting nearly as much value out of it as the cloud vendors do from their hardware.

Last but not least, if I want to run queries against open-source models, I prefer to use a provider like Groq or Cerebras, since it’s extremely convenient to get results back nearly instantly.



my issue is that once I have it in my workflow, I’d be pretty latency-sensitive. imagine those record-it-all apps working well; eventually you’d become pretty reliant on it. I don’t necessarily want to be at the whims of the cloud


Aren’t those “record it all” applications implemented as RAG, with snippets injected into the context based on embedding similarity?

Obviously you’re not going to always inject everything into the context window.
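
Something like this, as a rough sketch: it assumes a local sentence-transformers embedding model and plain cosine similarity, and the snippets, model choice, and function name are purely illustrative rather than how any particular app actually does it.

    import numpy as np
    from sentence_transformers import SentenceTransformer

    # Embed every captured snippet once and keep the vectors alongside the text.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    snippets = ["meeting notes from Tuesday", "draft email to Alice", "shopping list"]
    vectors = model.encode(snippets, normalize_embeddings=True)

    def retrieve(query, k=2):
        # With normalized vectors, cosine similarity is just a dot product.
        q = model.encode([query], normalize_embeddings=True)[0]
        scores = vectors @ q
        top = np.argsort(scores)[::-1][:k]
        return [snippets[i] for i in top]

    # Only the top-k matches get injected into the prompt, not the whole archive.
    context = "\n".join(retrieve("what did I write to Alice?"))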


As long as you're willing to wait up to an hour for your GPU to get scheduled when you do want to use it.


I don’t understand what you’re saying. What’s preventing you from using e.g. OpenRouter to run a query against Kimi-K2 from whatever provider?
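
Roughly like this, assuming you have an OpenRouter API key; OpenRouter exposes an OpenAI-compatible endpoint, so the standard client works, though the Kimi-K2 model slug below is my guess at how it’s listed there:

    from openai import OpenAI

    # OpenRouter speaks the OpenAI chat-completions API.
    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="sk-or-...",  # your OpenRouter key
    )

    resp = client.chat.completions.create(
        model="moonshotai/kimi-k2",  # assumed slug; check OpenRouter's model list
        messages=[{"role": "user", "content": "Summarize this thread in one line."}],
    )
    print(resp.choices[0].message.content)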


Because you have Cloudflare (MITM 1), OpenRouter (MITM 2), and finally the "AI" provider, all of whom can read, store, analyze, and resell your queries.

EDIT: Thanks for downvoting what is literally one of the most important reasons for people to use local models. Denying and censoring reality does not prevent the bubble from bursting.


you can use chutes.ai's TEE (Trusted Execution Environment), and Kimi K2 is running there at about 100 t/s right now


and you'll get a faster model this way


I think you’re missing the whole point, which is not using cloud compute.


For privacy reasons? Yeah, I’m not going to spend a small fortune just to be able to use these kinds of models.


There are plenty of examples and reasons to do so besides privacy: because one can, because it’s cool, for research, for fine-tuning, etc. I never mentioned privacy. Your use case is not everyone’s.


You can still do all of those things by renting AI server compute, though? I think privacy and the cool factor are the only real reasons it would be rational for someone to spend *checks the Apple Store* $19,000 on computer hardware...


Why do you look at this as a consumer? Have you never heard of businesses spending money on hardware???


And what reasons would a business have to spend the money on hardware instead of cloud services? Privacy.


Seriously?? You’ve never seen a company want to control its entire stack and hardware for ANY reason but privacy? Cloud is great, but it doesn’t fit every use case.



