Any analogy breaks down if you stretch it far enough; otherwise it wouldn't be an analogy...
My clock analogy works up to this point: ChatGPT's success in factually answering a query is merely a happy coincidence, so it does not work well as a primary source of facts. Exactly like... a broken clock. It tells the correct time twice a day, but it does not work well as a primary source of timekeeping.
Please don't read more deeply into the analogy than that :)
Nope, not random behavior. ChatGPT works by predicting the continuation of a sentence. It has been trained on enough data to emulate some pretty awesome and deep statistical structure in human language. Some studies even argue it has built world models in some contexts, but I'd say that needs more careful analysis. Nonetheless, in no way, shape, or form has it developed a sense of right vs. wrong, or real vs. fiction, that you can depend on for precise, factual information. It's a language model. If enough data said bananas are larger than the Empire State Building, it would repeat that, even if it's absurd.
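To make "predicting the continuation of a sentence" concrete, here's a toy sketch. It's just a bigram counter over a made-up corpus, nothing like GPT's actual transformer architecture, and the corpus and function names are invented for illustration; the point is only that a text predictor repeats whatever its data says, true or not:

```python
from collections import Counter, defaultdict

# Toy "training corpus". If the data repeats something absurd often enough,
# the model will happily repeat it; it has no notion of true vs. false.
corpus = (
    "bananas are larger than the empire state building . "
    "bananas are larger than the empire state building . "
    "bananas are yellow . "
).split()

# Count bigrams: for each word, how often each next word is observed.
next_word_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word_counts[prev][nxt] += 1

def continue_text(prompt: str, n_words: int = 7) -> str:
    """Greedily extend the prompt with the most frequent next word."""
    words = prompt.split()
    for _ in range(n_words):
        candidates = next_word_counts.get(words[-1])
        if not candidates:
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(continue_text("bananas are"))
# -> "bananas are larger than the empire state building ."
```

A real model predicts with a neural network over huge amounts of text instead of raw counts, but the objective is the same: continue the text plausibly, not truthfully.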
I didn’t say it was random behavior. You did when you said it was a happy coincidence.
I know it is just a language model. I know that if you took the same model and trained it on some other corpus that it would produce different results.
But it wasn't, so it doesn't have enough data to say that bananas are larger than the Empire State Building. Not that it would really matter anyway.
One important part of this story that you're missing is that even if there were no texts about bananas and skyscrapers, the model could infer a relationship between those based on the massive amounts of other size comparisons. It is comparing everything to everything else.
See the Norvig-Chomsky debate for a concrete example of how a language model can create sentences that have never existed.
> the model could infer a relationship between those based on the massive amounts of other size comparisons
That is true! But would it be factually correct? That's the whole point of my argument.
The knowledge and connections it acquires come from its training data, and it is trained for completing well-structured sentences, not correct ones. Its training data is the freaking internet. ChatGPT stating facts is a happy coincidence because (1) the internet is filled with incorrect information, (2) its training is wired for mimicking human language's rich statistical structure, not for generating factual sentences, and (3) its own powerful and awesome inference capabilities can make it hallucinate completely false but convincingly structured sentences.
Sure, it can regurgitate simple facts accurately, especially those that are repeated enough in its training corpus. But it fails for more challenging queries.
For a personal anecdote, I tried asking it for some references on a particular topic I needed to review for my master's dissertation. It gave me a few papers, complete with title, author, year, and a short summary. I got really excited. Turns out all the papers it referenced were completely hallucinated :)
Clock correctness is relative. If the antique windup clock in your living room is off by 5 minutes, it's still basically right. But if the clock in your smartphone is 5 minutes off, something has clearly gone wrong.
Nor is it only incorrect one time in a billion, as your hypotheticals seem to suggest. Depending on what I've asked it about, it can be incorrect at an extremely high rate.
The broken clock is not the correct analogy.