Launch HN: Onyx (YC W24) – Open-source chat UI
Hey HN, Chris and Yuhong here from Onyx (<a href="https://github.com/onyx-dot-app/onyx" rel="nofollow">https://github.com/onyx-dot-app/onyx</a>). We’re building an open-source chat UI that works with any LLM (proprietary + open weight) <i>and</i> gives these LLMs the tools they need to be useful (RAG, web search, MCP, deep research, memory, etc.).<p>Demo: <a href="https://youtu.be/2g4BxTZ9ztg" rel="nofollow">https://youtu.be/2g4BxTZ9ztg</a><p>Two years ago, Yuhong and I had the same recurring problem. We were on growing teams and it was ridiculously difficult to find the right information across our docs, Slack, meeting notes, etc. Existing solutions required sending out our company's data, lacked customization, and frankly didn't work well. So, we started Danswer, an open-source enterprise search project built to be self-hosted and easily customized.<p>As the project grew, we started seeing an interesting trend—even though we were explicitly a search app, people wanted to use Danswer just to chat with LLMs. We’d hear, “the connectors, indexing, and search are great, but I’m going to start by connecting GPT-4o, Claude Sonnet 4, and Qwen to provide my team with a secure way to use them.”<p>Many users would add RAG, agents, and custom tools later, but much of the usage stayed ‘basic chat’. We thought: “why would people co-opt an enterprise search tool when other AI chat solutions exist?”<p>As we continued talking to users, we realized two key points:<p>(1) just giving a company secure access to an LLM with a great UI and simple tools is a huge part of the value-add of AI<p>(2) providing this <i>well</i> is much harder than you might think, and the bar is incredibly high<p>Consumer products like ChatGPT and Claude already provide a great experience—and chat with AI for work is something (ideally) everyone at the company uses 10+ times per day. People expect the same snappy, simple, and intuitive UX with a full feature set. 
Getting hundreds of small details right to take the experience from “this works” to “this feels magical” is not easy, and nothing else in the space has managed to do it.<p>So ~3 months ago we pivoted to Onyx, the open-source chat UI with:<p>- (truly) world-class chat UX. Usable both by a fresh college grad who grew up with AI and an industry veteran who’s using AI tools for the first time.<p>- Support for all the common add-ons: RAG, connectors, web search, custom tools, MCP, assistants, deep research.<p>- RBAC, SSO, and permission syncing, plus easy on-prem hosting to make it work for larger enterprises.<p>Through building features like deep research and code interpreter that work across model providers, we've learned a ton of non-obvious things about engineering LLMs that have been key to making Onyx work. I'd like to share two that were particularly interesting (happy to discuss more in the comments).<p>First, context management is one of the most difficult and important things to get right. We’ve found that LLMs really struggle to remember both system prompts and previous user messages in long conversations. Even simple instructions like “ignore sources of type X” in the system prompt are very often ignored. This is exacerbated by multiple tool calls, which can feed in huge amounts of context. We solved this problem with a “Reminder” prompt—a short 1-3 sentence blurb injected at the end of the user message that describes the non-negotiables that the LLM must abide by. Empirically, LLMs attend most to the very end of the context window, so this placement gives the highest likelihood of adherence.<p>Second, we’ve needed to build an understanding of the “natural tendencies” of certain models when using tools, and build around them. For example, the GPT family of models is fine-tuned to use a Python code interpreter that operates in a Jupyter notebook. 
Even if told explicitly, it refuses to add `print()` around the last line, since, in Jupyter, this last line is automatically written to stdout. Other models don’t have this strong preference, so we’ve had to design our model-agnostic code interpreter to also automatically `print()` the last bare line.<p>So far, we’ve had a Fortune 100 team fork Onyx and provide 10k+ employees access to every model within a single interface, and create thousands of use-case specific Assistants for every department, each using the best model for the job. We’ve seen teams operating in sensitive industries completely airgap Onyx w/ locally hosted LLMs to provide a copilot that wouldn’t have been possible otherwise.<p>If you’d like to try Onyx out, follow <a href="https://docs.onyx.app/deployment/getting_started/quickstart">https://docs.onyx.app/deployment/getting_started/quickstart</a> to get set up locally w/ Docker in <15 minutes. For our Cloud: <a href="https://www.onyx.app/">https://www.onyx.app/</a>. If there’s anything you'd like to see to make it a no-brainer to replace your ChatGPT Enterprise/Claude Enterprise subscription, we’d love to hear it!
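The Reminder-prompt placement described above can be sketched roughly like this — a minimal illustration, not Onyx's actual code; the function, message shapes, and reminder text are all hypothetical:

```python
# Sketch of a "Reminder" prompt: append the non-negotiables to the
# final user message, since models attend most to the end of context.
# Names and message format here are illustrative assumptions.

REMINDER = (
    "Reminder: ignore sources of type X. "
    "Cite every document you quote."
)

def inject_reminder(messages: list[dict], reminder: str) -> list[dict]:
    """Return a copy of the chat history with the reminder blurb
    appended to the last user message."""
    out = [dict(m) for m in messages]  # copy; don't mutate the caller's history
    for m in reversed(out):
        if m["role"] == "user":
            m["content"] = f'{m["content"]}\n\n{reminder}'
            break
    return out

history = [
    {"role": "system", "content": "You are a helpful work assistant. Ignore sources of type X."},
    {"role": "user", "content": "Summarize last week's incident reports."},
]
patched = inject_reminder(history, REMINDER)
```

The reminder is re-injected on every turn, so it stays at the tail of the context window even as tool outputs pile up in between.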
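The last-bare-line behavior a model-agnostic interpreter needs can be approximated with Python's `ast` module — a sketch of the idea, not the actual Onyx implementation:

```python
import ast

def autoprint_last_expression(source: str) -> str:
    """If the final top-level statement is a bare expression (which Jupyter
    would echo automatically), wrap it in print() so it reaches stdout in a
    plain Python interpreter. Other statements are left untouched."""
    tree = ast.parse(source)
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last = tree.body[-1]
        tree.body[-1] = ast.Expr(
            value=ast.Call(
                func=ast.Name(id="print", ctx=ast.Load()),
                args=[last.value],
                keywords=[],
            )
        )
        ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

So `"x = 2 + 2\nx"` becomes `"x = 2 + 2\nprint(x)"`, while code whose last statement is an assignment or function definition passes through unchanged.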