Show HN: Unsiloed AI – #1 on olmOCR-Bench

4 points by adnan9999, about 1 month ago
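Most document parsers fail on real-world challenges such as complex tables, handwritten documents, historical document scans, equations, multi-column layouts, and complex reading order. We built Unsiloed Parser to handle exactly these cases.

Our latest parser, v3.1, ranks #1 on olmOCR-Bench with a strict pass rate of 88.0%. We ran the evaluation across 1,403 PDFs and 8,413 unit tests using the unmodified upstream Allen AI scorer (olmocr==0.4.27) and found that Unsiloed beats 18 other OCR services, including GPT-5.5, Claude Opus 4.7, LlamaParse, Reducto, Azure Document Intelligence, AWS Textract, and Unstructured.

When we dug into the failure cases, we found that many were not OCR errors but differences such as \frac vs \dfrac, whitespace, or equivalent LaTeX renderings. We ran a secondary LLM-as-Judge evaluation to separate real misses from semantic equivalents, which lifts the corrected score to 94.8 (explained in detail in the blog post).

Blog with full methodology and examples: [https://www.unsiloed.ai/blog/unsiloed-ai-achieves-1-rank-on-olmocr-bench-2](https://www.unsiloed.ai/blog/unsiloed-ai-achieves-1-rank-on-olmocr-bench-2)

Evaluation code for reproducibility: [https://github.com/Unsiloed-AI/unsiloed-olmocr-benchmark](https://github.com/Unsiloed-AI/unsiloed-olmocr-benchmark)

Feel free to post your messiest PDFs in the comments; we'll run them through Unsiloed Parser and share the output here.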
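For readers wondering what a "semantic equivalent" looks like in practice, here is a minimal sketch of a rule-based pre-filter for the \frac vs \dfrac and whitespace cases mentioned above. This is not the LLM-as-Judge step described in the post, and it is not part of the olmocr scorer; the function names `normalize_latex` and `classify_miss` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical helpers for illustration only; neither olmocr nor Unsiloed
# is known to expose functions with these names.

# \dfrac and \tfrac render the same fraction as \frac, only at a different size.
_FRAC_VARIANTS = re.compile(r"\\[dt]frac(?![A-Za-z])")
_WHITESPACE = re.compile(r"\s+")
# Spaces next to these characters do not change the rendered math.
_PUNCT_SPACING = re.compile(r"\s*([{}^_=+\-(),])\s*")


def normalize_latex(expr: str) -> str:
    """Reduce a LaTeX string to a canonical form for comparison."""
    expr = _FRAC_VARIANTS.sub(r"\\frac", expr)
    expr = _WHITESPACE.sub(" ", expr).strip()
    # Keep spaces after control words (e.g. "\alpha x") untouched.
    return _PUNCT_SPACING.sub(r"\1", expr)


def classify_miss(expected: str, predicted: str) -> str:
    """Label a comparison as 'exact', 'equivalent', or 'real_miss'."""
    if expected == predicted:
        return "exact"
    if normalize_latex(expected) == normalize_latex(predicted):
        return "equivalent"
    return "real_miss"
```

A filter like this only catches the mechanical differences. Equivalent renderings that differ structurally would still need a judge model or a human to review them.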