Ask HN: Are diffs still useful for AI-assisted code changes?

I'm wondering whether traditional diffs are becoming less suitable for AI-assisted development.

Lately I've been feeling frustrated during reviews when an AI generates a large number of changes. Even if the diff is "small", it can be very hard to understand what actually changed in behavior or structure.

I started experimenting with a different approach: comparing two snapshots of the code (baseline and current) instead of raw line diffs. Each snapshot captures a rough API shape and a behavior signal derived from the AST. The goal isn't deep semantic analysis, but something fast that can signal whether anything meaningful actually changed. (There's a rough sketch at the end of this post.)

It's intentionally shallow and non-judgmental: just signals, not verdicts.

At the same time, I see more and more LLM-based tools helping with PR reviews. Reviewing probabilistic changes with probabilistic tools feels a bit dangerous to me.

Curious how others here think about this:

– Do diffs still work well for AI-generated changes?
– How do you review large AI-assisted refactors today?
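For concreteness, here is roughly what the snapshot comparison looks like. This is a minimal sketch in Python using the stdlib `ast` module; the names (`api_shape`, `behavior_signal`, `compare_snapshots`) are illustrative, not from any existing tool, and the hashing is deliberately coarse:

```python
# Hypothetical sketch of the snapshot idea: extract a rough API shape and a
# per-function structural fingerprint from each snapshot, then diff those.
import ast
import hashlib

def api_shape(source: str) -> dict[str, str]:
    """Map each top-level function/class to a rough signature string."""
    shape = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = [a.arg for a in node.args.args]
            shape[node.name] = f"def {node.name}({', '.join(args)})"
        elif isinstance(node, ast.ClassDef):
            methods = sorted(
                n.name for n in node.body if isinstance(n, ast.FunctionDef)
            )
            shape[node.name] = f"class {node.name}: {methods}"
    return shape

def behavior_signal(source: str) -> dict[str, str]:
    """Hash each function's AST dump. The dump ignores comments, whitespace,
    and line numbers, so the hash moves only when structure changes.
    (Coarse on purpose: nested functions with the same name will collide.)"""
    signals = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            dump = ast.dump(node, annotate_fields=False)
            signals[node.name] = hashlib.sha256(dump.encode()).hexdigest()[:12]
    return signals

def compare_snapshots(baseline: str, current: str) -> list[str]:
    """Emit coarse signals, not verdicts."""
    report = []
    old_api, new_api = api_shape(baseline), api_shape(current)
    for name in sorted(old_api.keys() | new_api.keys()):
        if name not in new_api:
            report.append(f"API removed: {name}")
        elif name not in old_api:
            report.append(f"API added: {name}")
        elif old_api[name] != new_api[name]:
            report.append(f"API changed: {old_api[name]} -> {new_api[name]}")
    old_sig, new_sig = behavior_signal(baseline), behavior_signal(current)
    for name in sorted(old_sig.keys() & new_sig.keys()):
        if old_sig[name] != new_sig[name]:
            report.append(f"behavior may have changed: {name}")
    return report
```

Running `compare_snapshots` on the old and new source of a file yields lines like "API changed: def f(a) -> def f(a, b)" or "behavior may have changed: f". That's the level of signal I'm after: enough to know where to look, without pretending to verify semantics.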