Ask HN: Using AI/LLM APIs makes me want to give up. What am I doing wrong?

1 point by moomoo11, 7 months ago
I'm trying to automate a few manual processes we have right now, but I still can't get over this hump. What am I doing wrong?

I am using these AI APIs for actual processing-type work, and I am left defeated and somewhat angry, if I'm being honest. These AI companies sell us some galaxy-brain vision of automation, but actually using their services is a disappointing experience.

1. The results are never consistent. "Please ensure you extract ALL items" -> [Item1, Item2, Item3, "literally a comment // ...remaining items"] WHAT THE F$#K!! Sometimes it gives me a full list of all items, and sometimes it does that BS. I provided a tool, and half of the time it just grabs the first 3, and maybe it will grab the very last one too (ignoring everything in the middle).

2. Because the results are not reliable, I have to do more post-processing. About 60% of the time, even after post-processing, I have to reject the results because they don't meet my confidence threshold.

3. The APIs are poorly supported by the vendors.

- iOS has some insane behavior where file extensions are sometimes .jpg or .JPG, etc. OpenAI's API, for example, will return Bad Request because the extension was not ".jpg", so now I have to add more code to ensure that when the user uploads files, I rename the file.

- The docs will say the API supports a list of file formats, but then it rejects the request because the file was not a .PDF, even though the purpose was "assistants" (which the docs say can handle images). No problem, I'll just convert.

- Dealing with files coming from other sources (G Drive, etc.) where the extension is missing but the MIME type is present. Again, Bad Request.

4. We went from "AGI any day now" in 2024 to "_A_rtificial _S_uper _I_ntelligence any day now" today. Can we just relax?

Did I fall for a marketing trap?

I think LLMs are great for applications like Cursor, or for customer support, where the model doesn't need to give "perfect" responses because a human operator will prompt it further. How many times have you had to deal with stupid output from Cursor? (I'm a power user, I deal with this daily.) RAG is a cool application, and there's no real need for correctness or exactness there, IMO. I've got hundreds of my notes that I've fed in, which I reference sometimes. I get different answers each time, but I don't need them to be perfect.

:q!
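
For the file-naming problems in point 3 (mixed-case extensions from iOS, and files with a MIME type but no extension), the renaming workaround could be sketched roughly as below. This is only an illustration in Python: the helper name `normalize_upload_name`, the `MIME_TO_EXT` map, and the `EXT_ALIASES` map are all hypothetical, and which extensions a given vendor API actually accepts is an assumption here, not something checked against any vendor's docs.

```python
import os

# Hypothetical mapping from MIME type to the extension we choose to send.
# Assumption: the target API expects lowercase extensions such as ".jpg".
MIME_TO_EXT = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "application/pdf": ".pdf",
}

# Spellings collapsed to one canonical form (our own choice, not a vendor rule).
EXT_ALIASES = {".jpeg": ".jpg"}


def normalize_upload_name(filename, mime_type=None):
    """Return filename with a lowercase, canonical extension.

    If the name has no extension, fall back to the MIME type when it is
    present in MIME_TO_EXT; otherwise the name is returned without one.
    """
    stem, ext = os.path.splitext(filename)
    ext = ext.lower()
    ext = EXT_ALIASES.get(ext, ext)
    if not ext and mime_type:
        # Drop parameters such as "; charset=..." before the lookup.
        key = mime_type.split(";")[0].strip().lower()
        ext = MIME_TO_EXT.get(key, "")
    return stem + ext


if __name__ == "__main__":
    print(normalize_upload_name("IMG_0001.JPG"))
    print(normalize_upload_name("scan", "image/jpeg"))
```

An explicit map is used here rather than `mimetypes.guess_extension`, since the extension that function returns for a given type can differ between Python versions. Unknown MIME types are left alone so the caller can decide whether to reject the upload.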