Verdic – An intent governance layer for AI systems

Author: kundan_s__r · about 1 month ago
We built Verdic after repeatedly running into the same issue while deploying LLMs in production: most AI failures aren't about content safety, they're about intent drift.

As models become more agentic, outputs often shift quietly from descriptive to prescriptive behavior, without any explicit signal that the system is now effectively taking action. Keyword filters and rule-based guardrails break down quickly in these cases.

Verdic is an intent governance layer that sits between the model and the application. Instead of checking topics or keywords, it evaluates:

- whether an output collapses future choices into a specific course of action
- whether the response exerts normative pressure (directing behavior vs. explaining)

The goal isn't moderation, but behavioral control: detecting when an AI system is operating outside the intent it was deployed for, especially in regulated or decision-critical workflows.

Verdic currently runs as an API with configurable allow / warn / block outcomes. We're testing it on agentic workflows and long-running chains, where intent drift is hardest to detect.
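To make the "sits between the model and the application" placement concrete, here is a minimal sketch of what gating a model output on an allow / warn / block verdict could look like. The endpoint URL, field names, and response shape below are my own assumptions for illustration, not Verdic's documented API; the point is where the check sits, not the specific scoring logic.

```python
# Hypothetical integration sketch: gate an LLM output on an intent-governance
# verdict before it reaches the application. The endpoint, request fields, and
# response shape are assumptions for illustration, not Verdic's actual API.
import requests

VERDIC_URL = "https://api.verdic.example/v1/evaluate"  # placeholder endpoint

def governed_respond(model_output: str, deployed_intent: str) -> str:
    # Ask the governance layer whether the output stays within the deployed intent.
    resp = requests.post(
        VERDIC_URL,
        json={"output": model_output, "intent": deployed_intent},
        timeout=5,
    )
    resp.raise_for_status()
    verdict = resp.json().get("outcome", "warn")  # expected: "allow" | "warn" | "block"

    if verdict == "allow":
        return model_output  # pass the output through unchanged
    if verdict == "warn":
        # Surface the output but flag it for review or logging downstream.
        return f"[flagged for possible intent drift] {model_output}"
    # "block": withhold a prescriptive output the deployment did not intend.
    return "This response was withheld by the intent governance layer."
```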
This is an early release. I'm mainly looking for feedback from people deploying LLMs in production, especially around:

- agentic systems
- AI governance
- risk & compliance
- failure modes we might be missing

Happy to answer questions or share more details about the approach.