
I’ve tried that too, detecting the scan layout to get better OCR, but it didn’t really beat a fine-tuned Qwen2.5-VL 7B. I’d say fine-tuning is the way to go.


What's the cost of the fine-tuned model? If you were attempting to optimize for cost, would it be worth it to detect scan layouts to get better OCR?

Honestly, I'm such a noob in this space. I had one project I needed to do and didn't want to do it by hand, which would have taken 2 days, so I spent 5 trying to get a script to do it for me.


The model runs on an H200 in ~20s, costing about $2.40/hr. On an L4 it’s cheaper at ~$0.30/hr but takes ~85s to finish. Overall, the H200 ends up cheaper at volume. My scan has a separate issue though: each page has two columns, so text from the right column sometimes overflows into the left. OCR can’t really tell where sentences start and end unless the layout is split by column.
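One way to pre-split that kind of layout before OCR is a vertical ink-projection profile to find the gutter and crop each column separately. This is just a rough sketch, not what I actually ran: pytesseract stands in for whatever OCR engine you use, and it assumes the gutter sits somewhere in the middle third of the page.

```python
# Sketch: split a two-column scan at the whitespace gutter, then OCR each
# column separately so sentences don't interleave across columns.
import numpy as np
from PIL import Image
import pytesseract  # stand-in OCR engine; swap for your own

def split_columns(page: Image.Image) -> list[Image.Image]:
    gray = np.array(page.convert("L"))
    ink = 255 - gray                       # dark pixels ~ text
    profile = ink.sum(axis=0)              # total ink per pixel column
    w = profile.shape[0]
    # assume the gutter (whitespace valley) lies in the middle third of the page
    mid = slice(w // 3, 2 * w // 3)
    gutter = w // 3 + int(np.argmin(profile[mid]))
    return [page.crop((0, 0, gutter, page.height)),
            page.crop((gutter, 0, page.width, page.height))]

def ocr_two_column(page: Image.Image) -> str:
    left, right = split_columns(page)
    # read the left column fully before the right, preserving sentence order
    return pytesseract.image_to_string(left) + "\n" + pytesseract.image_to_string(right)
```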


What fine-tuning approach did you use?


Just Unsloth on Colab using an A100, with the dataset on Google Drive.
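Roughly along these lines; a minimal sketch rather than the exact notebook. It assumes Unsloth's FastVisionModel API, the unsloth/Qwen2.5-VL-7B-Instruct checkpoint name, a made-up Drive path, and a dataset already converted to the vision chat format Unsloth expects; hyperparameters are placeholders.

```python
# Rough Colab sketch: LoRA fine-tune of Qwen2.5-VL 7B with Unsloth on an A100.
from google.colab import drive
drive.mount("/content/drive")

from unsloth import FastVisionModel
from unsloth.trainer import UnslothVisionDataCollator
from trl import SFTTrainer, SFTConfig
from datasets import load_from_disk

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Qwen2.5-VL-7B-Instruct",   # assumed checkpoint name
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",
)
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,
    finetune_language_layers=True,
    r=16, lora_alpha=16, lora_dropout=0, bias="none",
)

# placeholder path; dataset saved in vision chat format on Google Drive
train_dataset = load_from_disk("/content/drive/MyDrive/ocr_dataset")

FastVisionModel.for_training(model)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    data_collator=UnslothVisionDataCollator(model, tokenizer),
    train_dataset=train_dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,                       # A100 supports bf16
        optim="adamw_8bit",
        output_dir="outputs",
        remove_unused_columns=False,
        dataset_text_field="",
        dataset_kwargs={"skip_prepare_dataset": True},
        max_seq_length=2048,
    ),
)
trainer.train()
```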



