Interesting - my vision of a future LLM interface is also one where more bits of information per second are required per interaction to steer it to exactly the spec you want. And that's fine, precisely because it'll just become a plain old engineering problem.
I think that fundamentally the UIs will become more realtime. The models will - thanks to much lower latencies and more efficient inference throughput - become realtime autosuggest: prompt tuning and I/O feedback at a rate of roughly:
(reading wpm) / (UI wpm)
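As a minimal sketch of what that ratio could mean in practice (the numbers here are illustrative assumptions, not measurements), a streaming UI could throttle token output so the effective "UI wpm" never outruns the reader:

```python
import time

def stream_at_reading_pace(tokens, reading_wpm=250, tokens_per_word=1.3):
    # Throttle output so "UI wpm" stays pinned to the reader's wpm,
    # i.e. the (reading wpm)/(UI wpm) ratio stays at ~1.
    # reading_wpm and tokens_per_word are illustrative guesses.
    tokens_per_second = (reading_wpm / 60.0) * tokens_per_word
    delay = 1.0 / tokens_per_second
    for tok in tokens:
        print(tok, end=" ", flush=True)
        time.sleep(delay)

stream_at_reading_pace("the quick brown fox jumps over the lazy dog".split())
```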
In fact, it might be interesting to have a model optimize for "likely comprehensible and as concise as possible" rather than "most alike the human dataset after RLHF alignment", just for this bandwidth idea.
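A hedged sketch of what that alternative objective might look like as a scalar reward - the judge score, the function name, and the length penalty alpha are all hypothetical, not an existing training setup:

```python
def concision_reward(comprehensibility: float, n_tokens: int, alpha: float = 0.01) -> float:
    # comprehensibility: a 0..1 score, e.g. from a judge model asked
    # "would a reader likely understand this?" (assumed setup).
    # The per-token penalty pushes the policy toward the shortest
    # output that still scores as comprehensible.
    return comprehensibility - alpha * n_tokens

# A terse answer that is still understood beats a verbose one:
print(concision_reward(0.95, 40))   # 0.55
print(concision_reward(0.99, 200))  # -1.01
```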