
The paper is quite interesting, but efficiency on OCR tasks does not mean the approach could be plugged into a general LLM directly without performance loss. If you trained a text tokenizer only on OCR-style text, you might already get better compression, as in the rough sketch below.
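
For what it's worth, here is a minimal sketch of that baseline using Hugging Face's tokenizers library: train a BPE vocab on a domain corpus, then compare tokens per character against a general-purpose tokenizer. The corpus file name and vocab size are placeholders, not anything from the paper.

    # Sketch: domain-specific BPE vs. a general tokenizer on the same text.
    # "ocr_corpus.txt" and vocab_size=32000 are illustrative placeholders.
    from tokenizers import Tokenizer
    from tokenizers.models import BPE
    from tokenizers.trainers import BpeTrainer
    from tokenizers.pre_tokenizers import Whitespace
    from transformers import AutoTokenizer

    # Train a BPE tokenizer only on the domain corpus.
    domain = Tokenizer(BPE(unk_token="[UNK]"))
    domain.pre_tokenizer = Whitespace()
    domain.train(files=["ocr_corpus.txt"],
                 trainer=BpeTrainer(vocab_size=32000, special_tokens=["[UNK]"]))

    # A generic tokenizer as the comparison point (GPT-2 here, arbitrarily).
    general = AutoTokenizer.from_pretrained("gpt2")

    sample = open("ocr_corpus.txt").read()[:10000]
    # Fewer tokens per character = better compression on this domain.
    print("domain :", len(domain.encode(sample).ids) / len(sample))
    print("general:", len(general.encode(sample)) / len(sample))

A fair comparison would of course measure on held-out text rather than the training corpus itself, but the point stands: a tokenizer fit to a narrow distribution compresses that distribution better than a general one.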

