A Month of Debugging AI Agents: How I Built 10 Agents and Why I Had to Delete Them

1 point, by xor01, about 1 month ago
Imagine hiring 10 specialists, handing each of them 1,000 lines of instructions, and getting chaos instead of coordinated work. Welcome to my month of building an AI agent framework.

My goal was ambitious: a fully autonomous system where an army of AI agents (a Researcher, an Architect, a TDD tester, and more) would take a task and handle everything from planning to deployment. I designed a complex, multi-phase workflow with protocols like `ESCALATION` and detailed "Mission Briefs". On paper, it was a perfect, self-managing machine.

In reality, it was an expensive nightmare. The system was plagued by constant file editing errors, infinite loops that burned through tens of thousands of tokens, and "phantom executions" where the orchestrator would mark a task as complete without writing a single line of code. My job turned from developer to full-time prompt debugger.

In desperation, I posted on Reddit, and the solution wasn't a better prompt. It was a single comment that led me to disable two "experimental" checkboxes in the tool's settings. Miraculously, 90% of the file editing problems vanished.

This led to a painful but crucial experiment: what if I removed all my carefully crafted, super-detailed prompts and went back to the default settings? The result was disheartening: the system performed almost exactly the same.

Read the full story with detailed architecture diagrams and my final, simplified workflow: https://xor01.substack.com/p/my-war-with-ai-agents
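
Two of the failure modes above, runaway loops and phantom executions, are the kind of thing that can be checked in plain code rather than in prompts. What follows is a minimal Python sketch, not the framework from this post: `RunGuard`, `BudgetExceeded`, `is_phantom_completion`, and the limit values are hypothetical names and numbers chosen for illustration. It assumes the orchestrator can report how many tokens each step used and can take a path-to-content-hash snapshot of the workspace before and after a task.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a run spends more tokens or steps than allowed."""


@dataclass
class RunGuard:
    # Hypothetical limits; tune them for your own setup.
    max_tokens: int = 20_000
    max_steps: int = 25
    tokens_used: int = 0
    steps_taken: int = 0

    def record_step(self, tokens: int) -> None:
        """Count one agent step and stop the run once a limit is crossed."""
        self.tokens_used += tokens
        self.steps_taken += 1
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded(f"token budget exceeded: {self.tokens_used}")
        if self.steps_taken > self.max_steps:
            raise BudgetExceeded(f"step limit exceeded: {self.steps_taken}")


def is_phantom_completion(
    claimed_done: bool,
    files_before: dict[str, str],
    files_after: dict[str, str],
) -> bool:
    """Return True when a task is reported complete but no file changed.

    Both snapshots map a file path to a content hash (an assumption of
    this sketch); identical snapshots mean nothing was written.
    """
    return claimed_done and files_before == files_after
```

Calling `record_step` after every agent turn turns an infinite loop into a hard stop, and comparing snapshots before accepting a "done" status flags a task that was marked complete without a single file being touched.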