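Ask HN: Anyone struggling to get value out of coding LLMs?
I use LLMs daily for things like:
- solving tasks that just require applying knowledge ("here's a paste of my Python import structure. I don't write Python often and I know I'm doing something wrong here because I get this error; tell me the proper way to organise the package").
- writing self-contained throwaway pieces of code ("here's a paste of my DESCRIBE TABLE output, write an SQL query to show the median [...]").
- as a debugging partner ("I can SSH to this host directly, but Ansible fails to connect with this error; what could be causing this difference?").
All these use cases work great, and I save a lot of time. But with the core work of writing the code that I work on, I've almost never had any success. I've tried:
- Cursor (can't remember which model, the default)
- Google's Jules
- OpenAI Codex with o4
In all cases I found that the underlying capability is clearly there (the model can understand and write code), but the end-to-end value is not. It could write code that _worked_, but getting it to generate code that I am willing to maintain and "put my name on" took longer than writing the code myself would have.
I had to micromanage them endlessly ("be sure to rerun the formatter, make sure all tests pass", "please follow the coding style of the repository", "you've added irrelevant comments, remove those", "you've refactored most of the file but forgot a single function"). Trivial issues took many, many iterations, and because each iteration is slow, I had to context-switch a lot, which is exhausting in its own right.
Basically, it was like having an intern who has successfully learned the core skill of programming but is not really capable of good collaboration and needs to be babysat all the time.
I asked friends who are enthusiastic vibe coders, and they basically said "your standards are too high".
Is the model for success here that you just say "I don't care about code quality because I don't have to maintain it, because I will use LLMs for that too"? Am I just not using the tools correctly?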