Ask HN: Are we pretending RAG is ready, when it's really still at the demo stage?

3 points by TXTOS, 28 days ago
Been watching the RAG (Retrieval-Augmented Generation) wave crash into production for over a year now.

But something keeps bugging me: most setups still feel like glorified notebooks stitched together with hope and vector search.

Yeah, it "works", until you actually need it to. Suddenly: irrelevant chunks, hallucinations, shallow query rewriting, no memory loop, and a retrieval stack that breaks if you breathe on it wrong.

We've got:
• pipelines that don't align with what users *actually* want to ask,
• retrieval that acts more like a search engine than a reasoning aid,
• brittle evals (because "correct context" ≠ "correct answer"),
• and no one's sure where grounding ends and illusion begins.

Sure, you *can* make it work, if you're okay duct-taping every component and babysitting the system 24/7.

So I gotta ask: is RAG just stuck in prototype land pretending to be production? Or has someone here actually built a setup that survives user chaos and edge cases?

Would love to hear what's worked, what hasn't, and what you had to throw away.

Not pushing anything, just been knee-deep in this and looking to sanity check with folks who've actually shipped stuff.
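To make the "correct context" ≠ "correct answer" point concrete, here's a minimal sketch of scoring retrieval and answers separately instead of folding them into one number. Everything in it is an assumption for illustration: the `EvalCase` fields, the substring-match grader, and the four sample cases are hypothetical, not taken from any real system or benchmark.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EvalCase:
    """One hypothetical eval record; all field names are illustrative."""
    question: str
    gold_doc_id: str          # document known to contain the answer
    gold_answer: str          # expected answer string
    retrieved_ids: List[str]  # ids the retriever actually returned
    model_answer: str         # text the generator actually produced


def retrieval_hit(case: EvalCase) -> bool:
    # "Correct context": the gold document is among the retrieved ones.
    return case.gold_doc_id in case.retrieved_ids


def answer_correct(case: EvalCase) -> bool:
    # Crude stand-in for a real grader: case-insensitive substring match.
    return case.gold_answer.strip().lower() in case.model_answer.lower()


def confusion(cases: List[EvalCase]) -> Dict[str, int]:
    """Cross-tabulate retrieval outcome against answer outcome."""
    counts = {"hit_correct": 0, "hit_wrong": 0,
              "miss_correct": 0, "miss_wrong": 0}
    for case in cases:
        retrieval = "hit" if retrieval_hit(case) else "miss"
        answer = "correct" if answer_correct(case) else "wrong"
        counts[retrieval + "_" + answer] += 1
    return counts


# Hypothetical sample data, one case per quadrant.
CASES = [
    EvalCase("q1", "doc-a", "30 days", ["doc-a", "doc-x"],
             "The window is 30 days."),
    EvalCase("q2", "doc-b", "30 days", ["doc-b"],
             "The window is 14 days."),   # right context, wrong answer
    EvalCase("q3", "doc-c", "blue", ["doc-y"],
             "It is blue."),              # wrong context, right answer
    EvalCase("q4", "doc-d", "blue", ["doc-z"],
             "It is red."),
]

if __name__ == "__main__":
    print(confusion(CASES))
```

The two off-diagonal buckets are the ones a single score hides: `hit_wrong` is the generator fumbling good context, and `miss_correct` is an answer that happens to be right without being grounded in anything retrieved.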