
That is misleading. In some fields that may hold, but in other fields nothing can trump specializing the hardware to the computation and the algorithm.


> nothing can trump specializing the hardware to the computation and the algorithm

Is implementing an algorithm in hardware ever done before all of the obvious software optimizations have been exhausted? If not, how would you know you're implementing the best known version of the algorithm?

It seems to me that hardware is a route to improvement, but only after all of the software improvements have been done. If you want to spend your career making computers do stuff faster, there'll always be more work to do in software, simply because if you can make something fast enough there, you don't need to implement it in hardware.


One would assume that hardware implementations are attempted only for tried-and-true algorithms that have seen no improvement in decades. Developing software is usually easier than developing hardware, and does not require the overhead of insane manufacturing pipelines and the associated logistics. Furthermore, FPGAs often make it possible to do product development and establish oneself in the market _now_. Even if improvements on the original algorithms are found later, competitors would have trouble gaining a foothold in the market if the gains from the new algorithms are their only unique selling point.

Then again, there is this crypto-mining boom, where custom ASICs and GPUs literally pay for themselves. There, the associated algorithms were deliberately specified to be slow, and no improvements on them can reasonably be expected unless quantum computing takes off for real and opens new avenues.


To state it another way: a less-than-optimal algorithm implemented in hardware can be completely dominated by an algorithmic improvement. If the improved algorithm is then implemented in hardware, it becomes the new fastest implementation.

That said, when you implement a specialized algorithm in hardware, it's likely because that algorithm is already good enough for the task it's designed for.
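
A rough back-of-the-envelope sketch of that first point: even a large, fixed constant-factor win from hardware eventually loses to a better asymptotic complexity. The 100x speedup and unit operation costs below are made-up assumptions, purely for illustration.

  import math

  HW_SPEEDUP = 100.0  # assumed constant-factor gain from specialized hardware

  def hw_quadratic_cost(n):
      # O(n^2) basic operations, reduced by the assumed hardware speedup
      return n * n / HW_SPEEDUP

  def sw_nlogn_cost(n):
      # O(n log n) basic operations in plain software
      return n * math.log2(n)

  for n in (10, 100, 1_000, 10_000, 100_000):
      print(f"n={n:>7}: hardware O(n^2) ~ {hw_quadratic_cost(n):,.0f}   "
            f"software O(n log n) ~ {sw_nlogn_cost(n):,.0f}")

  # Around a thousand elements the better algorithm starts to win despite
  # the hardware advantage, and the gap only widens from there.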


I don't specialize in hardware, but I'd guess this is very situation dependent. Algorithms that can easily be implemented on FPGAs and parallelized could probably see easy gains. There are also real-world considerations that big-O notation doesn't cover: sorting algorithm A may be faster than algorithm B due to better cache usage even if B is algorithmically better.
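
A small illustrative micro-benchmark of that last point (sizes and timings are machine-dependent, and the two sorts here are textbook versions, not tuned implementations): insertion sort is O(n^2) but often beats an O(n log n) merge sort on small inputs thanks to low constant factors and cache-friendly sequential access.

  import random
  import timeit

  def insertion_sort(a):
      # O(n^2), but tight inner loop over contiguous memory
      a = list(a)
      for i in range(1, len(a)):
          key, j = a[i], i - 1
          while j >= 0 and a[j] > key:
              a[j + 1] = a[j]
              j -= 1
          a[j + 1] = key
      return a

  def merge_sort(a):
      # O(n log n), but with extra allocation and pointer chasing
      if len(a) <= 1:
          return list(a)
      mid = len(a) // 2
      left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
      out, i, j = [], 0, 0
      while i < len(left) and j < len(right):
          if left[i] <= right[j]:
              out.append(left[i]); i += 1
          else:
              out.append(right[j]); j += 1
      return out + left[i:] + right[j:]

  for n in (8, 32, 128, 1024):
      data = [random.random() for _ in range(n)]
      t_ins = timeit.timeit(lambda: insertion_sort(data), number=2000)
      t_mrg = timeit.timeit(lambda: merge_sort(data), number=2000)
      print(f"n={n:>5}: insertion {t_ins:.4f}s   merge {t_mrg:.4f}s")

  # Typically the quadratic sort wins at the small sizes and loses at the
  # largest one -- exactly the kind of effect big-O notation doesn't capture.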



