I made 4,000 agent calls in Cursor last month. Every model has a distinct personality.

3 points | by mike210 | 5 days ago
The lazy architect (OpenAI's o3). o3 is incredibly lazy at writing code, but very good at planning. It will happily read tens of files and do deep analysis, but it often struggles when it needs to edit more than one file.

The over-eager child (Claude Sonnet 3.7 Thinking). Claude Sonnet is eager to just get going, man! It's not the most careful, and in longer strings of tool calls it may start editing something completely unrelated to what you asked for.

Pretty balanced? (Gemini 2.5 Pro). Gemini 2.5 is a little more intelligent than Sonnet 3.7, and significantly faster and more reserved. It is usually the best choice for writing code across multiple files.

I've found o4-mini to be incredibly slow and fairly mediocre, and GPT 4.1 useful only in very situational areas. My tips:

- Use o3 to plan and/or write code in one or at most two files. If you ask for more, it may openly revolt and just refuse to write any further.

- Always make sure Sonnet 3.7 is following a tightly scoped plan on a relatively small section of the product, and supervise it. If you have an easy change to make in many areas of your codebase, for example, letting Sonnet run, still supervised, is a perfect use of the model's persona.

Generally what I do:

- Medium complexity: editing one file: o3. Editing multiple files: plan with o3, write with gemini-2.5.

- Simple complexity: editing many files, very simple: plan with o3 if needed, write with claude-3.7. Editing many files, simple, needs a formulaic approach: write a detailed prompt into GPT 4.1.

- High complexity: plan with o3, separate the work into multiple chunks, write small chunks at a time with gemini-2.5, and be very careful with each section. If I'm super lazy, sometimes I just YOLO all of the sections and then fix all the bugs at the end, but this probably leads to code issues later down the line.

Would love to hear how other people are using the different models!
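For anyone who wants the "Generally what I do" routing written down as a lookup, here is a minimal Python sketch. The names `Complexity`, `Workflow`, and `pick_workflow` are made up for illustration, and the model strings are plain labels copied from the post, not real API model identifiers. The post does not cover simple one-file edits, so that branch is an assumption (it falls back to o3 because the post says o3 copes with one-file edits).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Complexity(Enum):
    SIMPLE = "simple"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Workflow:
    planner: Optional[str]  # model that drafts the plan, or None if there is no separate planning step
    writer: str             # model that writes the code
    note: str = ""          # extra handling advice taken from the post


def pick_workflow(complexity: Complexity, files: int, formulaic: bool = False) -> Workflow:
    """Map a task to the planner/writer pairing described in the post.

    Model names are labels from the post, not API identifiers.
    """
    if files < 1:
        raise ValueError("files must be at least 1")

    if complexity is Complexity.HIGH:
        # The post gives one recipe for high complexity, regardless of file count.
        return Workflow("o3", "gemini-2.5", "split the plan into small chunks; review each one")

    if complexity is Complexity.MEDIUM:
        if files == 1:
            # o3 handles the whole single-file task on its own.
            return Workflow(None, "o3")
        return Workflow("o3", "gemini-2.5")

    # Simple tasks from here on.
    if files == 1:
        # Assumption: this case is not covered by the post.
        return Workflow(None, "o3", "assumed; not covered by the post")
    if formulaic:
        return Workflow(None, "gpt-4.1", "write a detailed prompt")
    return Workflow("o3", "claude-3.7", "planning is optional; supervise the run")
```

Calling `pick_workflow(Complexity.MEDIUM, files=4)` returns the pairing with o3 as planner and gemini-2.5 as writer. Treat the thresholds as one person's habits rather than a rule; the post itself only distinguishes "one file" from "multiple" or "many".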