Gracana's comments

It's too bad they went to Windows later on. Having the Unix environment available makes the older stuff really flexible.


Interesting, I have the opposite impression. I want to like it because it's the biggest model I can run at home, but its punchy style and insistence on heavily structured output scream "tryhard AI." I was really hoping that this model would deviate from what I was seeing in their previous release.


What do you mean by "heavily structured output"? I find it generates the most natural-sounding output of any of the LLMs—it cuts straight to the answer with natural-sounding prose (except when it sometimes decides to use ChatGPT-style output with emoji headings for no reason). I've only used it on kimi.com though, so I'm wondering what you're seeing.


Yeah, by "structured" I mean how it wants to do ChatGPT-style output with headings and emoji and lists and stuff. And the punchy style of K2 0905 as shown in the fiction example in the linked article is what I really dislike. K2 Thinking's output in that example seems a lot more natural.

I'd be totally on board if it cut straight to the answer with natural-sounding prose, as you described, but for whatever reason that has not been my experience.


From what I've heard, Kimi K2 0905 was a major downgrade for writing.

So, when you hear people recommend Kimi K2 for writing, it's likely that they recommend the first release, 0711, and not the 0905 update.


Ohhh, thanks, that's really good to know. I'll have to give that one a shot.


Interesting. As others have noted, it has a cut-straight-to-the-point, non-sycophantic style that I find exceptionally rich in detail and impressive. But it sounds like you're saying an earlier version was even better.


Again, it's just what I've heard, but the way I've heard it described is: they must have fine-tuned 0905 on way too many ChatGPT traces.


> I find it generates the most natural-sounding output of any of the LLMs

Curious, does it sound as natural as Claude 3.5/3.6 Sonnet? That was imo the most "human" an AI has ever sounded. (Gemini 2.5 Pro is a distant second, and ChatGPT is way behind imo.)


If you want to do it at home, ik_llama.cpp has some performance optimizations that make it semi-practical to run a model of this size on a server with lots of memory bandwidth and a GPU or two for offload. You can get 6-10 tok/s with modest workstation hardware. Thinking chews up a lot of tokens though, so it will be a slog.


What kind of server have you used to run a trillion parameter model? I'd love to dig more into this.


Hi Simon. I have a Xeon W5-3435X with 768GB of DDR5 across 8 channels, iirc it's running at 5800MT/s. It also has 7x A4000s, water cooled to pack them into a desktop case. Very much a compromise build, and I wouldn't recommend Xeon Sapphire Rapids because the memory bandwidth you get in practice is less than half of what you'd calculate from the specs. If I did it again, I'd build an EPYC machine with 12 channels of DDR5 and put in a single RTX 6000 Pro Blackwell. That'd be a lot easier and probably a lot faster.
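
To put rough numbers on the bandwidth point: token generation on CPU is more or less memory-bound, so you can sketch a throughput ceiling with back-of-envelope math. The figures below (~32B active parameters for Kimi K2, a ~4.25-bit Q4-class quant, and the "less than half" derating) are assumptions for illustration, not measurements:

    # Token generation streams every active weight from RAM once per token,
    # so memory bandwidth sets a rough ceiling on tok/s.
    channels = 8               # Xeon W-3400: 8 DDR5 channels
    mt_per_s = 5800e6          # transfer rate quoted above
    bytes_per_transfer = 8     # 64-bit bus per channel

    theoretical = channels * mt_per_s * bytes_per_transfer  # bytes/s
    practical = theoretical * 0.45     # "less than half" in practice

    active_params = 32e9        # assumption: ~32B active params/token (MoE)
    bytes_per_param = 4.25 / 8  # assumption: Q4-class quant, ~4.25 bits/weight
    bytes_per_token = active_params * bytes_per_param

    print(f"theoretical bandwidth: {theoretical / 1e9:.0f} GB/s")       # ~371
    print(f"rough tok/s ceiling:   {practical / bytes_per_token:.1f}")  # ~9.8

That lands in the same 6-10 tok/s ballpark mentioned above, which is why 12 channels on EPYC looks so attractive.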

There's a really good thread on level1techs about running DeepSeek at home, and everything there more-or-less applies to Kimi K2.

https://forum.level1techs.com/t/deepseek-deep-dive-r1-at-hom...


If I had to guess, I'd say it's one with lots of memory bandwidth and a GPU or two for offload. (sorry, I had to, happy Friday Jr.)


My employer was running Growthpower (ERP software) on an HP 3000 system up until 2018 or so. We replaced it with a "modern" .NET/MSSQL ERP solution that does a lot more, but it's slow and terrible to navigate compared to the old console menu system, and its database is hundreds of tables without a single foreign key. The frontend application makes a long series of sequential queries to build each view... if you're willing to wade through the muck, you can write a server-side query that does in milliseconds what it does in minutes.
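
To make the pattern concrete: the frontend does the classic N+1 thing, one round trip per row, where a single join would do. Table and column names below are invented, and it's sketched against SQLite just to be self-contained (the real system is MSSQL):

    import sqlite3

    # Hypothetical schema, for illustration only.
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE orders (order_id INTEGER, part_id INTEGER);
        CREATE TABLE parts  (part_id INTEGER, description TEXT);
        INSERT INTO orders VALUES (1, 10), (1, 11);
        INSERT INTO parts  VALUES (10, 'widget'), (11, 'bracket');
    """)

    # Frontend style: one query per row (minutes, over a real network).
    for (part_id,) in db.execute("SELECT part_id FROM orders WHERE order_id = 1"):
        desc = db.execute("SELECT description FROM parts WHERE part_id = ?",
                          (part_id,)).fetchone()
        print(part_id, desc[0])

    # Server-side style: one join, one pass (milliseconds).
    for row in db.execute("""
        SELECT o.part_id, p.description
        FROM orders o JOIN parts p ON p.part_id = o.part_id
        WHERE o.order_id = 1
    """):
        print(*row)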


I don’t own a laptop. I run DeepSeek-V3 IQ4_XS on a Xeon workstation with lots of RAM and a few RTX A4000s.

It’s not very fast, and I built it up slowly without knowing quite where I was headed. If I could do it over again, I’d go with a recent EPYC with 12 channels of DDR5 and pair it with a single RTX 6000 Pro Blackwell.


Yep, you need to install a ground conditioner for that.


https://proot-me.github.io/

Wow, that is so cool. This looks a lot more approachable than other sandboxing tools.
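
For reference, a minimal sketch of an invocation (wrapped in Python's subprocess here; ./alpine-rootfs is a placeholder for any extracted root filesystem, and -r/-b are proot's rootfs and bind options):

    import subprocess

    # Run a shell inside an extracted rootfs: no root privileges,
    # no namespaces, no daemon. Paths below are placeholders.
    subprocess.run([
        "proot",
        "-r", "./alpine-rootfs",  # treat this directory as /
        "-b", "/proc",            # bind host /proc into the guest
        "-b", "/dev",             # bind host /dev into the guest
        "/bin/sh",                # command to run inside
    ])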


I would perhaps put pin receptacles / sockets there instead.


Shrouded headers are also available.


Odd perspective, I guess? Here's a different angle: https://techland.time.com/2012/04/16/photos-the-apple-ii-tur...


I used Omron's K3GN panel meters in a project at work and I had to draw the alphabet in the configuration drawing because it is so unintuitive. It's not a whole lot worse than the one shown in the article, but still... it's pretty rough. I think I prefer numbered parameters like you typically see on VFDs. It's a lot easier to just scroll to P148 or whatever, enter to view/modify, scroll the value, enter to set. Menu trees on seven-segment interfaces are a mistake.

Page six shows the alphabet: https://www.myomron.com/downloads/1.Manuals/Panel%20Indicato...


I wonder how they display "S5" or "5S".
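
On a standard seven-segment layout it probably can't be done unambiguously: "5" and "S" light exactly the same segments. A quick sketch with the usual a-g segment naming (a=top, then clockwise b, c, d, e, f, with g=middle):

    # Common seven-segment encodings; "5" and "S" are indistinguishable.
    GLYPHS = {
        "5": {"a", "f", "g", "c", "d"},
        "S": {"a", "f", "g", "c", "d"},  # same five segments as "5"
        "6": {"a", "f", "g", "e", "c", "d"},
        "b": {"f", "g", "e", "c", "d"},
    }
    print(GLYPHS["5"] == GLYPHS["S"])  # True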

