> it's proving far better to be able to have lots of people iterating quickly (which necessarily means broad access to the necessary hardware) than to rely on massive models and bespoke hardware
Very true, but can't Google just wait, take the findings from the open-source LLM community, and then quickly update their own models on their huge clusters? It's not like they'll lose the top position; they already have.
Yes and no. Some of the optimisation techniques being researched at the moment use the output of a larger model to fine-tune a smaller one, and that kind of improvement can obviously only flow one way. The same goes for quantising a model beyond the point where the network is still trainable. But anything that helps smaller models run faster without relying on the properties of a bigger model that already has to exist? Absolutely, yes.
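For anyone unfamiliar with the fine-tuning trick being described, here's a minimal knowledge-distillation sketch in PyTorch. Everything in it (model sizes, temperature, data) is a toy placeholder rather than any specific lab's recipe, but it shows the shape of the idea: a frozen "teacher" produces soft targets that a smaller "student" is trained to match.

```python
# Minimal knowledge-distillation sketch (illustrative only; models and data are toy placeholders).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical "large" teacher and "small" student on a 10-class toy task.
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

teacher.eval()                      # the big model is frozen; it only provides targets
for p in teacher.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0                             # softmax temperature to soften the teacher's distribution

for step in range(100):
    x = torch.randn(64, 32)         # stand-in for real training inputs
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)

    # KL divergence between the teacher's softened distribution and the student's.
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The gradients only ever update the student, so nothing here makes the teacher any better. That's the one-way part: the big model has to exist first, and it gets nothing back.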