
Finetuning never adds new knowledge. For internal corporate data, use RAG or train a model from scratch. Finetuning will not help anyone answer questions from that data.


This isn’t true at all. Fine tune with a single sample and see what happens to your model.


Will share a longer post on this once I finish it. I have tested this multiple times, on bigger models, on custom smaller models, and it does not work.

In a strict sense, finetuning can add new knowledge, but for that you need millions of tokens and multiple full fine-tuning runs, not LoRA or PEFT. For practical purposes, it does not.


I get the sense that you want to do the anti-RAG: take some relatively small corpus of data, train a LoRA, and then magically have a chatbot that knows your stuff... yeah, that will not work.

But chatbots are only one single use-case. And broadly I think this pattern of LLM-as-store-of-knowledge is a bad one (of course, until ASI, and then it isn't).

That said, you absolutely can impart new knowledge through fine-tuning. Millions of tokens is a rather small hurdle to overcome. And if you're not retraining with original/general data, then your model will become very specialized and possibly overfit...which is not an issue in many instances, and may even be desirable.
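For intuition on why a LoRA trained on a small corpus is such a light touch, here is a toy sketch of the LoRA idea in pure Python (toy dimensions of my own choosing, not from any real model): instead of updating a full d x d weight matrix, you train two low-rank factors B (d x r) and A (r x d) and use W' = W + B @ A, so the trainable parameter count scales with r, not d.

```python
# Toy LoRA sketch: a rank-r additive update to a frozen weight matrix.
# Real models have d in the thousands and r around 4..64; this is d=4, r=1.

def matmul(X, Y):
    """Plain-Python matrix multiply (rows of X times columns of Y)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

d, r = 4, 1
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base weight
B = [[0.5] for _ in range(d)]       # d x r factor, trainable
A = [[0.1, 0.2, 0.3, 0.4]]          # r x d factor, trainable

delta = matmul(B, A)                # rank-r update
W_eff = [[w + dw for w, dw in zip(wr, dr)] for wr, dr in zip(W, delta)]

full_params = d * d                 # what a full fine-tune would update
lora_params = d * r + r * d         # what LoRA actually trains
print(full_params, lora_params)     # 16 vs 8 even at this toy scale
```

At realistic sizes the gap is enormous (e.g. d=4096, r=8 trains ~0.4% of the matrix's parameters), which is exactly why PEFT-style adapters are cheap to train but a weak vehicle for stuffing in millions of tokens of new facts.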


Bring on the new post and more details, this stuff is interesting.


PEFT is not for adding knowledge; that's obvious.


How does one do RAG? I see it mentioned like 20 times


This article reviews some of the most advanced RAG methods: https://medium.com/@krtarunsingh/advanced-rag-techniques-unl...

(not mine)


You supply the LLM with the subset of your data that is relevant to a specific prompt. That's RAG (retrieval-augmented generation).
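The retrieve-then-prompt loop described above can be sketched in a few lines. This is a hypothetical minimal example (names and the keyword-overlap scorer are my own; production systems use embedding similarity and a vector store instead):

```python
# Minimal RAG sketch: score documents against the query, take the top-k,
# and stuff them into the prompt the LLM actually sees.

def score(query: str, doc: str) -> int:
    """Crude keyword-overlap relevance score; real RAG uses embeddings."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_prompt(query: str, corpus: list[str], k: int = 2) -> str:
    top = sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]
    context = "\n".join(top)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Invoices are archived under /finance/2023.",
    "The VPN gateway address is in the IT runbook.",
    "Quarterly revenue figures live in the finance dashboard.",
]
prompt = build_prompt("Where are the 2023 invoices archived?", corpus)
print(prompt)
```

The model never needs the knowledge baked into its weights; it only needs to read the retrieved snippets at inference time, which is why RAG works on small internal corpora where fine-tuning fails.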



