DeepSeek V4 is out. It's the best open-source coding tool. Here's the rundown.

2 points | Author: Alisaqq | 3 days ago | original post
Two models: Flash (284B total, 13B active) and Pro (1.6T total, 49B active). Both support a 1M-token context.

V4-Pro is their flagship. It beats Claude Opus 4.6 Max on agentic coding tasks (their words), specifically calls out being better than Sonnet 4.5 on coding, and is competitive with Opus 4.6 on general benchmarks. On world knowledge and STEM, they say it's ahead of Gemini-Pro-3.1.

V4-Flash is the sleeper pick: faster and cheaper than Pro, with better long-context efficiency.

Original text: Agent capabilities massively improved: V4-Pro hits SOTA on agentic-coding benchmarks among open-source models. In practice, users report it feels better than Sonnet 4.5, and output quality is close to Opus 4.6's non-thinking mode, though there's still a gap vs. Opus 4.6 with thinking enabled.

World knowledge: V4-Pro leads all open-source models by a significant margin on knowledge benchmarks, sitting just behind Gemini-Pro-3.1 among closed-source frontier models.

Top-tier reasoning: on math, STEM, and competitive coding, V4-Pro beats every publicly benchmarked open-source model and trades blows with the best closed-source models in the world.

The 1M context is the real headline. The attention mechanism was redesigned entirely, using a technique called DSA (Deeply Sparse Attention) to handle the scale without blowing up compute. V4's inference cost stays flat as token counts scale up, whereas V3.2's shoots up. That architecture improvement is what makes the long context actually usable, not just a spec-sheet number.

Agent capabilities got a dedicated upgrade. The model was trained specifically against Claude Code, OpenClaw, OpenCode, and CodeBuddy. V4-Pro is now the recommended model for any agentic/coding workflow; Flash is explicitly not recommended for the most complex agent tasks.

API is live.
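DSA's internals aren't public, so purely as an illustration of how sparse attention keeps per-token cost down, here is a toy top-k sparse attention for a single query in NumPy. The top-k selection rule, the function name, and all shapes are assumptions for the sketch, not DeepSeek's actual mechanism:

```python
import numpy as np

def topk_sparse_attention(q, K, V, k):
    """Toy sparse attention: softmax and value mixing touch only the k
    highest-scoring keys, so that part is O(k) rather than O(n).
    Illustrative only; not DSA."""
    # Score all keys. A real sparse-attention system would avoid this
    # dense pass too (e.g. via an index); kept simple here.
    scores = K @ q / np.sqrt(q.shape[0])
    idx = np.argpartition(scores, -k)[-k:]       # indices of the top-k keys
    w = np.exp(scores[idx] - scores[idx].max())  # stable softmax over k scores
    w /= w.sum()
    return w @ V[idx]                            # mix only k value rows

rng = np.random.default_rng(0)
n, d, k = 4096, 64, 64                           # 4096-token toy "context"
q = rng.standard_normal(d)                       # one query vector
K = rng.standard_normal((n, d))                  # keys
V = rng.standard_normal((n, d))                  # values
out = topk_sparse_attention(q, K, V, k)          # shape: (d,)
```

The point of the sketch is the cost profile: the softmax/value work depends on k, not on the context length n, which is the kind of property that lets inference cost stay flat as context grows.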
Pricing:

DeepSeek-V4-Flash: $0.14 / $0.28 per M input/output tokens

DeepSeek-V4-Pro: $1.74 / $3.48 per M input/output tokens

A reasoning_effort parameter lets you set thinking intensity (low/high/max) per call; "max" is specifically recommended for agent tasks.

The model will launch on Atlas Cloud. Developers can get API access.
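For a feel of what those rates mean in practice, here is a small cost estimator using the prices listed above, plus a hypothetical request payload. The payload shape assumes an OpenAI-compatible chat API; the `reasoning_effort` field name comes from the post, but the model IDs and everything else in the payload are assumptions:

```python
PRICES = {  # USD per 1M tokens (input, output), from the post
    "deepseek-v4-flash": (0.14, 0.28),
    "deepseek-v4-pro": (1.74, 3.48),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one API call."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Example: a 100k-in / 10k-out agent turn on Pro
cost = call_cost("deepseek-v4-pro", 100_000, 10_000)
# = (100_000 * 1.74 + 10_000 * 3.48) / 1e6, roughly $0.21

# Hypothetical request payload (model ID and endpoint conventions assumed;
# "reasoning_effort" and its low/high/max values are from the post):
payload = {
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Refactor this module."}],
    "reasoning_effort": "max",  # recommended for agent tasks per the post
}
```

At these rates a heavy 100k-token agent turn on Pro costs about two orders of magnitude less than a dollar, which is the economic argument the post is implicitly making for Flash on simpler tasks.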