Launch HN: MindFort (YC X25) – AI agents for continuous pentesting

Posted by bveiseh, 12 days ago
Hey HN! We're Brandon, Sam, and Akul from MindFort (https://mindfort.ai). We're building autonomous AI agents that continuously find, validate, and patch security vulnerabilities in web applications: essentially an AI red team that runs 24/7.

Here's a demo: https://www.loom.com/share/e56faa07d90b417db09bb4454dce8d5a

Security testing today is increasingly challenging. Traditional scanners generate 30-50% false positives, drowning engineering teams in noise. Manual penetration testing happens quarterly at best, costs tens of thousands of dollars per assessment, and takes weeks to complete. Meanwhile, teams are shipping code faster than ever with AI assistance, but security review has become an even bigger bottleneck.

All three of us encountered this problem from different angles. Brandon worked at ProjectDiscovery building the Nuclei scanner, then at NetSPI (one of the largest pen testing firms) building AI tools for testers. Sam was a senior engineer at Salesforce leading security for Tableau, where he dealt firsthand with juggling security findings and managing remediations. Akul did his master's on AI and security, co-authored papers on using LLMs for security attacks, and participated in red teams at OpenAI and Anthropic.

We all realized that AI agents were going to fundamentally change security testing, and that the wave of AI-generated code would need an equally powerful solution to keep it secure.

We've built AI agents that perform reconnaissance, exploit vulnerabilities, and suggest patches, much like a human penetration tester. The key difference from traditional scanners is that our agents validate exploits in runtime environments before reporting them, reducing false positives.

We use multiple foundation models orchestrated together. The agents perform recon to understand the attack surface, then use that context to inform testing strategies. When they find potential vulnerabilities, they spin up isolated environments to validate exploitation. If successful, they analyze the codebase to generate contextual patches.

What makes this different from existing tools?

- Validation through exploitation: we don't just pattern-match; we exploit vulnerabilities to prove they're real.

- Codebase integration: the agents understand your code structure to find complex logic bugs and suggest appropriate fixes.

- Continuous operation: instead of point-in-time assessments, we're constantly testing as your code evolves.

- Attack chain discovery: the agents can find multi-step vulnerabilities that require chaining different issues together.

We're currently in early access, working with initial partners to refine the platform. Our agents are already finding vulnerabilities that other tools miss and scoring well on penetration testing benchmarks.

Looking forward to your thoughts and comments!
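To make the recon → strategize → validate → patch loop above concrete, here is a minimal sketch of that pipeline shape. Everything in it is hypothetical: the function names, the endpoint-to-vulnerability mapping, and the stubbed sandbox check are illustrations of the described flow, not MindFort's actual implementation (which uses LLM agents and real isolated environments).

```python
from dataclasses import dataclass


@dataclass
class Finding:
    url: str
    category: str
    validated: bool = False
    patch_hint: str = ""


def recon(target: str) -> list[str]:
    # Recon step: enumerate the attack surface. A real agent would crawl
    # and fingerprint the app; here we return a fixed set of endpoints.
    return [f"{target}/login", f"{target}/search", f"{target}/api/export"]


def hypothesize(endpoints: list[str]) -> list[Finding]:
    # Strategy step: use recon context to pick candidate vulnerability
    # classes per endpoint (a stand-in for the model-driven planning).
    strategies = {"login": "auth-bypass", "search": "sqli", "export": "idor"}
    return [
        Finding(url=ep, category=cat)
        for ep in endpoints
        for key, cat in strategies.items()
        if key in ep
    ]


def validate_in_sandbox(finding: Finding) -> bool:
    # Validation step: attempt the exploit in an isolated environment and
    # keep only findings that actually reproduce. Stubbed here: pretend
    # only the SQL injection succeeds.
    return finding.category == "sqli"


def suggest_patch(finding: Finding) -> str:
    # Patch step: the real system reads the codebase to produce a
    # contextual fix; here we return a canned hint per category.
    hints = {
        "sqli": "use parameterized queries",
        "idor": "enforce object-level authorization",
        "auth-bypass": "re-check session validation on every request",
    }
    return hints.get(finding.category, "review input handling")


def run_pipeline(target: str) -> list[Finding]:
    validated = []
    for finding in hypothesize(recon(target)):
        # Only report findings whose exploitation was demonstrated,
        # which is what keeps the false-positive rate down.
        if validate_in_sandbox(finding):
            finding.validated = True
            finding.patch_hint = suggest_patch(finding)
            validated.append(finding)
    return validated


if __name__ == "__main__":
    for f in run_pipeline("https://example.test"):
        print(f.url, f.category, f.patch_hint)
```

The point of the structure is the filter in `run_pipeline`: candidates that fail sandbox validation never reach the report, which is the claimed difference from pattern-matching scanners.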