DennisP 7 days ago | on: Qwen3-Omni-Flash-2025-12-01：a next-generation nati...

Just because it's in the training data doesn't mean the model can remember it. The parameters total 60 gigabytes, and only so much trivia can fit in there, so the model has to do lossy compression.
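
A rough back-of-the-envelope sketch of that capacity argument. Only the 60 GB figure comes from the comment; the corpus size and bytes-per-token values are hypothetical placeholders chosen for illustration, not published numbers for this or any other model.

```python
def param_bytes_per_token(param_bytes: float, corpus_tokens: float) -> float:
    """Average parameter storage available per training token."""
    return param_bytes / corpus_tokens


def raw_to_param_ratio(corpus_tokens: float, bytes_per_token: float,
                       param_bytes: float) -> float:
    """How many times larger the raw training text is than the parameters."""
    return (corpus_tokens * bytes_per_token) / param_bytes


PARAM_BYTES = 60e9        # from the comment: parameters total 60 gigabytes
CORPUS_TOKENS = 10e12     # HYPOTHETICAL: 10 trillion training tokens
BYTES_PER_TOKEN = 4       # HYPOTHETICAL: rough average bytes of text per token

budget = param_bytes_per_token(PARAM_BYTES, CORPUS_TOKENS)
ratio = raw_to_param_ratio(CORPUS_TOKENS, BYTES_PER_TOKEN, PARAM_BYTES)

print(f"{budget:.4f} bytes of parameters per training token")
print(f"raw text is roughly {ratio:.0f}x larger than the parameters")
```

Under these assumed numbers the parameters hold well under one byte per training token, so the model cannot store everything verbatim and some detail has to be lost. Different corpus sizes change the ratio but not the direction of the argument.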