Ask HN: How does training an AI on another AI actually work?

2 points by timonpimba, about 1 month ago
How is Deepseek actually doing this? Are they just feeding Claude's answers into their own models as training data to improve reasoning? How exactly does one train a model on the output of another? What engineering is involved here?

I'd love a breakdown of how this is executed at scale.

Backstory: Anthropic recently accused Deepseek, Minimax, and Moonshot of using lots of fake accounts to generate exchanges with Claude and using the outputs to train their models, and called it a "distillation attack".