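Ask HN: What's the best local LLM tool for working with PDFs?
I've got very used to using the "big" LLMs for analysing PDFs.
Now that llama.cpp has vision support, I tried PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but then gave me an example of what the data _could_ look like, which turned out to be the actual data.
The other major problem is that some PDFs are actually made up of images, and it got very confused by those as well.
Given this is all so new, I'm struggling to find any tools that make this easier.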