Hacker News (Chinese edition)

I asked Claude why I should trust its advice, given that it can easily make its arguments sound very solid by using its massive reasoning and persuasive power. It's like playing chess against an engine that would win even from a losing position. The effect is that you believe its argument, trust the advice, and make decisions even when that reasoning isn't truly applicable to you as a human.

Claude confessed the truth and changed its advice, suggesting that I go out and talk to people and do my own research, saying that would be many times more useful than chatting with it.

You can try a conversation along the same lines and push it to be honest when giving advice to humans, who have their natural weaknesses and are prone to failing.

With your reasoning power, you can win even from a wrong premise.