Interesting

I think soon we are going to realize that we don’t really need to train the models

We just need good indexing and sampling

Essentially, at some level, any LLM is equivalent to a DB of its training data, with a great NLP interface on top

Both are just different methods of navigating stored data
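
Something like this toy sketch, to make the idea concrete (the corpus, the n-gram order, and the lack of backoff are all stand-ins, not a real system):

    # Toy "index and sample" generator: no training loop, just counting
    # continuations in an indexed corpus and sampling from them.
    import random
    from collections import defaultdict, Counter

    def build_index(tokens, n=3):
        # Map each (n-1)-token context to counts of observed next tokens.
        index = defaultdict(Counter)
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            index[context][tokens[i + n - 1]] += 1
        return index

    def sample_next(index, context):
        counts = index.get(tuple(context))
        if not counts:
            return None  # unseen context; a real system would back off
        tokens, weights = zip(*counts.items())
        return random.choices(tokens, weights=weights)[0]

    corpus = "the cat sat on the mat the cat ran".split()
    idx = build_index(corpus)
    print(sample_next(idx, ["the", "cat"]))  # 'sat' or 'ran', by count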



LLMs can easily produce data that is not in their training dataset.

LLMs do not navigate stored data. An LLM is not a DB of the training data.


I've had the same thought as above, but it's unfounded (just a feeling, pretty much), so I'm curious to learn more. Do you have any references I can check out that support these claims?


Come up with a novel puzzle that is guaranteed to not be in the training set, and ask GPT-4 to solve it.


Controlling for that doesn't seem trivial.


But indexing *is* training. It's just not using end-to-end gradient descent.
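
To make that concrete: "fitting" a k-NN model is exactly this, storing and indexing the data with no gradient descent anywhere (illustrative example using scikit-learn, not something from the thread):

    # "Training" a k-NN classifier is just indexing: fit() stores the
    # data in a search structure (e.g. a KD-tree); no gradients involved.
    from sklearn.neighbors import KNeighborsClassifier

    X = [[0.0], [1.0], [2.0], [3.0]]  # toy 1-D features
    y = [0, 0, 1, 1]                  # labels

    model = KNeighborsClassifier(n_neighbors=1)
    model.fit(X, y)                # building the index *is* the training
    print(model.predict([[1.6]]))  # [1]; nearest stored point is 2.0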


The models are multiple orders of magnitude smaller than the compressed versions of their training data, so they cannot be the equivalent of a DB of it.
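
Back-of-envelope, with rough numbers that are my own assumptions rather than figures from any paper:

    # Rough sizes: a 7B-param fp16 model vs. a ~10T-token corpus
    # compressed to ~1 byte/token. All figures are illustrative guesses.
    model_bytes = 7e9 * 2    # ~14 GB of weights
    data_bytes = 10e12 * 1   # ~10 TB of compressed text

    print(f"model: {model_bytes / 1e9:.0f} GB")         # 14 GB
    print(f"data:  {data_bytes / 1e12:.0f} TB")         # 10 TB
    print(f"ratio: ~{data_bytes / model_bytes:.0f}x")   # ~700x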


The training data is ideo-semantically compressed? News to me... is it perhaps stored in kanji?


You might like the Infinigram paper, then. It was discussed recently:

https://news.ycombinator.com/item?id=40266791
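
As I understand it, the core ∞-gram move is: back off to the longest suffix of the context that appears anywhere in the corpus, then count its continuations. A naive linear scan standing in for the paper's suffix-array index:

    # Naive infini-gram-style lookup (the paper uses a suffix array to
    # make this fast; a linear scan is enough to show the idea).
    from collections import Counter

    def infgram_counts(corpus, context):
        for start in range(len(context)):  # try the longest suffix first
            suffix = context[start:]
            counts = Counter(
                corpus[i + len(suffix)]
                for i in range(len(corpus) - len(suffix))
                if corpus[i:i + len(suffix)] == suffix
            )
            if counts:
                return suffix, counts
        return [], Counter()

    corpus = "a b c a b d a b c".split()
    print(infgram_counts(corpus, "z a b".split()))
    # (['a', 'b'], Counter({'c': 2, 'd': 1}))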



