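Ask HN: Which LLMs are you using, and why?
Hello, HN!
I'm wondering what y'all are using as your daily driver these days, and why?
I've found myself using GPT-5.5 more than Opus 4.7 for work, which has been a pretty big reversal. Previously, I used Opus 4.6 for everything, and GPT-5.4 was only ever in the picture to provide a second opinion (with Grok a distant third, only when I wanted to throw some "chaos" into the mix). The reason I've pivoted is that I've found GPT-5.5 to be a bit more consistent and predictable, and it tends to write in a way I find less tiresome (even if its code isn't quite as good as Opus 4.7's).
For personal projects, I've started experimenting with DeepSeek V4 and have been pretty blown away by its cost-to-quality ratio. I've also found the 1M token window incredibly helpful for long-running tasks, though I may just have an overabundance of fear of compaction during tasks. DeepSeek isn't quite as good at one-shotting things as either GPT-5.5 or Opus 4.7, but with sufficient linter/static-analysis guardrails I've found it really hard to complain or find faults (especially at the price).
Finally, if you're also using reranking and/or embedding models, or anything else, to augment or perform specific tasks, please share those too!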