Interesting

I think soon we are going to realize that we don’t really need to train the models

We just need good indexing and sampling

Essentially, at some level, any LLM is equivalent to a DB of its training data, with a great NLP interface on top

Both are just different methods of navigating stored data
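
Something like this toy sketch, to make the idea concrete (the corpus, the n-gram order, and the lack of backoff are all stand-ins, not a real system):

    # Toy "index and sample" generator: no training loop, just counting
    # continuations in an indexed corpus and sampling from them.
    import random
    from collections import defaultdict, Counter

    def build_index(tokens, n=3):
        # Map each (n-1)-token context to counts of observed next tokens.
        index = defaultdict(Counter)
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            index[context][tokens[i + n - 1]] += 1
        return index

    def sample_next(index, context):
        counts = index.get(tuple(context))
        if not counts:
            return None  # unseen context; a real system would back off
        tokens, weights = zip(*counts.items())
        return random.choices(tokens, weights=weights)[0]

    corpus = "the cat sat on the mat the cat ran".split()
    idx = build_index(corpus)
    print(sample_next(idx, ["the", "cat"]))  # 'sat' or 'ran', by count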



LLMs can easily produce data that is not in their training dataset.

LLMs do not navigate stored data. An LLM is not a DB of the training data.


I've had the same thought as above, but it's unfounded (just a feeling, pretty much), so I'm curious to learn more. Do you have any references I can check out that support these claims?


Come up with a novel puzzle that is guaranteed to not be in the training set, and ask GPT-4 to solve it.


Controlling for that doesn't seem trivial.


But indexing *is* training. It's just not using end-to-end gradient descent.
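
To make that concrete: "fitting" a k-NN model is exactly this, storing and indexing the data with no gradient descent anywhere (illustrative example using scikit-learn, not something from the thread):

    # "Training" a k-NN classifier is just indexing: fit() stores the
    # data in a search structure (e.g. a KD-tree); no gradients involved.
    from sklearn.neighbors import KNeighborsClassifier

    X = [[0.0], [1.0], [2.0], [3.0]]  # toy 1-D features
    y = [0, 0, 1, 1]                  # labels

    model = KNeighborsClassifier(n_neighbors=1)
    model.fit(X, y)                # building the index *is* the training
    print(model.predict([[1.6]]))  # [1]; nearest stored point is 2.0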


The models are multiple orders of magnitude smaller than the compressed versions of their training data, so they cannot be the equivalent of a DB of it.
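
Back-of-envelope, with rough numbers that are my own assumptions rather than figures from any paper:

    # Rough sizes: a 7B-param fp16 model vs. a ~10T-token corpus
    # compressed to ~1 byte/token. All figures are illustrative guesses.
    model_bytes = 7e9 * 2    # ~14 GB of weights
    data_bytes = 10e12 * 1   # ~10 TB of compressed text

    print(f"model: {model_bytes / 1e9:.0f} GB")         # 14 GB
    print(f"data:  {data_bytes / 1e12:.0f} TB")         # 10 TB
    print(f"ratio: ~{data_bytes / model_bytes:.0f}x")   # ~700x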


The training data is ideo-semantically compressed? News to me... is it perhaps stored in kanji?


You might like the Infinigram paper, then. It was discussed recently:

https://news.ycombinator.com/item?id=40266791
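
As I understand it, the core ∞-gram move is: back off to the longest suffix of the context that appears anywhere in the corpus, then count its continuations. A naive linear scan standing in for the paper's suffix-array index:

    # Naive infini-gram-style lookup (the paper uses a suffix array to
    # make this fast; a linear scan is enough to show the idea).
    from collections import Counter

    def infgram_counts(corpus, context):
        for start in range(len(context)):  # try the longest suffix first
            suffix = context[start:]
            counts = Counter(
                corpus[i + len(suffix)]
                for i in range(len(corpus) - len(suffix))
                if corpus[i:i + len(suffix)] == suffix
            )
            if counts:
                return suffix, counts
        return [], Counter()

    corpus = "a b c a b d a b c".split()
    print(infgram_counts(corpus, "z a b".split()))
    # (['a', 'b'], Counter({'c': 2, 'd': 1}))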



