Baidu's 0.9B PaddleOCR-VL 1.5 Just Beat GPT-4o at Reading Documents—But Who's Cashing In?
Everyone figured bloated giants like GPT-4o owned document parsing. Baidu's scrappy 0.9B model just flipped the script: 94.5% on OmniDocBench, cheaper, faster. But is it hype or a real hardware shift?
⚡ Key Takeaways
- PaddleOCR-VL 1.5's 0.9B-parameter model hits 94.5% on OmniDocBench, topping GPT-4o, with polygon-based layout segmentation and native-resolution encoding.
- Its hybrid architecture fixes traditional OCR flaws such as irregular shapes and reading order, and runs cheaply on consumer hardware.
- Baidu's efficiency play signals a shift to sub-2B document models, leapfrogging far larger US models like GPT-4o as well as legacy OCR engines like Tesseract.
Originally reported by Towards AI