An agent's value is proportional to the permissions it's been granted.

There's been a lot of hype around solutions like default-deny proxies, key vaults, and more, but nothing seems to address the core tension: an agent can be tricked into doing an attacker's bidding.

The best thing I could think of was to just run an observer loop and monitor everything the agent does with another LLM, but I'm curious if anyone has an elegant solution.
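
For concreteness, here is a rough sketch of what I mean by an observer loop. Everything in it is made up for illustration: the `Action` type, the tool names, and `keyword_observer`, which is only a rule-based stand-in for the second LLM so the example runs on its own. In practice the observer would be a call to a separate model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Action:
    tool: str      # hypothetical tool name, e.g. "send_email"
    argument: str  # single argument, kept simple for the sketch


# An observer gets the user's original task plus one proposed action
# and returns True to allow it, False to block it.
Observer = Callable[[str, Action], bool]


def keyword_observer(task: str, action: Action) -> bool:
    """Stand-in for a second LLM (assumption: a real setup would call a model).

    Allows a sensitive tool only if the user's task mentions it.
    """
    sensitive = {"send_email", "delete_file", "transfer_funds"}
    if action.tool not in sensitive:
        return True
    return action.tool.replace("_", " ") in task.lower()


def run_with_observer(
    task: str,
    proposed: Iterable[Action],
    observer: Observer,
    execute: Callable[[Action], str],
) -> List[str]:
    """Check every proposed action with the observer before executing it."""
    results = []
    for action in proposed:
        if observer(task, action):
            results.append(execute(action))
        else:
            results.append(f"BLOCKED {action.tool}")
    return results


if __name__ == "__main__":
    task = "Summarize this web page"
    # Second action simulates what an injected instruction might produce.
    proposed = [
        Action("read_url", "https://example.com"),
        Action("send_email", "attacker@example.com"),
    ]
    print(run_with_observer(task, proposed, keyword_observer,
                            lambda a: f"ran {a.tool}"))
```

The observer here only sees the user's task and the proposed action, not the untrusted content the agent read. That is a design choice in this sketch, not a claim that it closes the hole.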

Ask HN: How are you solving the confused deputy problem for AI?