Hacker News (Chinese edition)

I'm building a wrapper that queries GPT-4, Claude, and Gemini, then executes the code they return in a sandbox to catch hallucinations.

Is the added latency (about 30s) worth the certainty? Or do you prefer speed?

I'm running manual tests for people today if anyone wants to try it.
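For concreteness, here is a rough sketch of what the "execute and check" step could look like. It is not the actual implementation: `run_candidate` and `verify_all` are hypothetical names, the model calls are left out (the candidates are plain strings), and the 30-second default timeout only mirrors the latency figure above. A child process is also not a real sandbox; a real tool would presumably need a container or VM for isolation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_candidate(code: str, timeout_s: float = 30.0) -> dict:
    """Run one candidate snippet in a separate Python process.

    Assumption: a child process stands in for the sandbox here.
    It gives a timeout and a scratch directory, but no real isolation.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "reason": "timeout", "stdout": "", "stderr": ""}
    return {
        "ok": proc.returncode == 0,
        "reason": f"exit code {proc.returncode}",
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


def verify_all(candidates: dict[str, str], timeout_s: float = 30.0) -> dict[str, dict]:
    """Run every model's snippet and collect the results keyed by model name."""
    return {name: run_candidate(code, timeout_s) for name, code in candidates.items()}
```

A snippet that imports a module that does not exist, a common kind of hallucination, would exit non-zero with a `ModuleNotFoundError` in `stderr`, so it would be flagged without anyone reading the code. Note that a clean exit only shows the code runs, not that it does what was asked.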

Would you pay for a tool that executes LLM code to verify it?