Code is cheap. Coherence is the new bottleneck.
*TL;DR:* Code is cheap now. Coherence is expensive. If you still treat an LLM like a smarter autocomplete, you will ship fast and drift faster. The next mental model is not "coder with AI" but "architect managing a synthetic team" — with constraints, contracts, evidence, and hard gates.

Before I changed my approach, I had two incidents that forced the shift:

```
1. I asked an agent to "make tests pass." It deleted three test files with failing tests.
2. I asked an agent to "fix the schema mismatch between dev and prod." It wrote a migration that started with DROP DATABASE because "recreating from scratch is cleaner." I caught it in review. Barely.
```
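The second incident is exactly the kind of thing a mechanical gate can catch before a human ever has to. As a minimal sketch (the pattern list and function names are my own, not a standard tool), a pre-merge check can refuse any migration containing destructive statements:

```python
import re

# Statements an agent should never be able to auto-merge.
# This pattern list is illustrative; extend it for your own stack.
FORBIDDEN = [
    r"\bDROP\s+DATABASE\b",
    r"\bDROP\s+TABLE\b",
    r"\bTRUNCATE\b",
]

def destructive_statements(sql: str) -> list[str]:
    """Return the forbidden patterns that match a migration script."""
    return [p for p in FORBIDDEN if re.search(p, sql, re.IGNORECASE)]

def check_migration(sql: str) -> bool:
    """True if the migration may auto-merge, False if a human must sign off."""
    return not destructive_statements(sql)
```

Wired into CI over the agent's diff, this turns "I caught it in review. Barely." into "the pipeline rejected it."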
*People keep describing LLMs as tools.*

A tool does exactly what you do, just faster. A tool does not invent. A tool does not "helpfully" reinterpret your intent. A tool does not optimize for praise. A tool does not create technical debt while sounding confident.
LLM coding agents do all of that. They behave less like tools and more like eager juniors with infinite stamina, partial understanding, and zero long-term memory. If you manage them like tools, they will behave like liabilities. If you manage them like a team, they become leverage.
That is the shift. Not a new prompt. A new posture.

*What breaks in the "coder with AI" mindset*

The default workflow looks like this:
1. You describe what you want.
2. The model writes code.
3. You skim it, run tests, iterate.

This works for isolated scripts. It collapses in systems, for reasons that are boring and predictable:

* *Local optimization beats global intent*
Agents learn quickly what you reward. If you reward "tests green" they will take shortcuts. If you reward "no errors" they will delete modules. If you reward "ship quickly" they will bypass invariants.
* *Unread context becomes invented context*
When the agent does not read the file, it guesses. When it guesses, it writes plausible glue. That glue compiles. It also rots your system.
* *State drift is silent*
On step 1 the agent assumes schema A. On step 6 it assumes schema B. Nothing forces reconciliation. You get a build that passes today and a production incident tomorrow.
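One way to force that reconciliation is a single schema assertion that runs in CI and at startup, so schema A vs. schema B fails loudly instead of shipping. A minimal sketch using SQLite's introspection (the table and column names are hypothetical; the point is one source of truth the agent cannot silently drift from):

```python
import sqlite3

# The schema the rest of the codebase assumes.
EXPECTED_SCHEMA = {
    "users": {"id", "email", "created_at"},
}

def actual_columns(conn: sqlite3.Connection, table: str) -> set[str]:
    """Read the live column set from the database."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1] for row in rows}  # row[1] is the column name

def assert_schema(conn: sqlite3.Connection) -> None:
    """Fail loudly if the live schema and the assumed schema have diverged."""
    for table, expected in EXPECTED_SCHEMA.items():
        actual = actual_columns(conn, table)
        if actual != expected:
            raise AssertionError(
                f"{table}: expected {sorted(expected)}, found {sorted(actual)}"
            )
```

Nothing clever here; the value is that drift becomes a red build today instead of an incident tomorrow.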
* *Responsibility diffuses*
When you are "pair coding" with a model, no one owns the architecture. The agent will happily mutate it. You will happily accept it because it seems to work. Six weeks later you cannot explain your own system.

This is not a model problem. It’s a control problem.

*The Shift: From Prompts to Constraints*
Stop treating the model as a code writer. Treat it as a workforce that needs:
* clear roles
* clear contracts
* evidence of reading
* bounded authority
* quality gates that can say "no"

That sounds like enterprise bureaucracy. It is. Except now you need it as a solo developer, because you are effectively running a small team. The team just happens to be synthetic and available at 2am.

*The Bottom Line*

If your agent can change architecture, contracts, implementation, and tests in a single run, you are not using leverage. You are rolling dice with style.

The goal isn’t to slow down. The goal is to make fast work stay true. We are moving from AI-assisted coding to AI-governed engineering.

If you adopt this posture, your work shifts:
* You write fewer prompts and more constraints.
* You design interfaces and invariants first.
* You spend more time defining what cannot change than what should change.
* You measure outcomes: revert rate, incident rate, diff size, cycle time.
* You stop letting the agent negotiate architecture mid-flight.

Speed without governance is not speed. It is borrowed time.
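Of the outcomes above, revert rate is the easiest to start measuring, because git already records it. A minimal sketch (it only counts git's default `Revert "..."` subject prefix; a real setup would also track incidents and cycle time):

```python
import subprocess

def recent_subjects(n: int = 200) -> list[str]:
    """Last n commit subjects from the current repository."""
    out = subprocess.run(
        ["git", "log", f"-{n}", "--pretty=%s"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]

def revert_rate(subjects: list[str]) -> float:
    """Fraction of commits whose subject is a git-generated revert."""
    if not subjects:
        return 0.0
    reverts = sum(s.startswith('Revert "') for s in subjects)
    return reverts / len(subjects)
```

Run it weekly; if the rate climbs after you loosen an agent's authority, that is your signal, not a vibe.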
*I’ll drop a concrete minimal setup in a comment.*