Does CUDA even matter that much for LLMs, especially inference? I don't think software would be the limiting factor for this hypothetical GPU. After all, it would be competing with Apple's M chips, not with the 4090 or Nvidia's enterprise GPUs.
It's the only thing that matters. Folks act like AMD support is there because suddenly you can run the most basic LLM workload. Try doing anything actually interesting with AMD (e.g., anything cool in the mechanistic interpretability or representation/attention engineering world) and suddenly everything is broken, nothing works, and you have to spend millions of dollars' worth of AI engineering time trying to salvage a working solution.
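To make that concrete, here's a minimal sketch (my own illustration, not anyone's actual pipeline) of the kind of interpretability workload I mean: hooking a transformer's attention blocks to capture activations. Model choice ("gpt2") and the transformers/PyTorch APIs are just for illustration. Note that ROCm builds of PyTorch also report themselves through torch.cuda, so the device check passing tells you nothing about whether the fused kernels and custom extensions this kind of tooling leans on will actually work on AMD hardware.

    # Illustrative sketch: capture per-layer attention activations with
    # forward hooks -- a typical mechanistic-interpretability primitive.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # ROCm builds of PyTorch expose themselves via torch.cuda too, so this
    # check alone doesn't guarantee downstream kernels work on AMD.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").to(device).eval()

    captured = {}

    def make_hook(name):
        def hook(module, inputs, output):
            # GPT-2's attention module returns a tuple; element 0 is the
            # attention block's output hidden states.
            captured[name] = output[0].detach().cpu()
        return hook

    # Register a hook on every attention block.
    handles = [
        block.attn.register_forward_hook(make_hook(f"layer_{i}"))
        for i, block in enumerate(model.transformer.h)
    ]

    with torch.no_grad():
        inputs = tokenizer("The quick brown fox", return_tensors="pt").to(device)
        model(**inputs)

    for h in handles:
        h.remove()

    print({k: v.shape for k, v in captured.items()})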