Ask HN: Do you use LLMs for HTML translation?
I've come across an interesting but quite hard problem: translating HTML pages into any human language.

I started tinkering with DeepL, which offers this feature, but the results are quite off because you cannot give it any context.

So it looked like a clear case for an LLM to me. After crafting a good context prompt, I tried o4-mini from OpenAI, and its results are nearly perfect.

However, business requirements came along and I need to do this WAY faster: a ~100 KB page takes about a minute to generate, which is far too long for the end user.

I've optimised the content a little (sending only the HTML parts that are interesting), but it's still quite slow in the end (more than 10s). I'm thinking about converting the HTML to JSON and converting it back after translation, which might be faster.

Does anyone else do this? Or does anyone have insight into a better-performing API/LLM for this problem?
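To make the HTML-to-JSON idea concrete, here is a minimal sketch of the round trip I have in mind, using only the Python standard library. It pulls the visible text runs out into a JSON array, keeps the markup aside as a template, and stitches the translated strings back in. The names `SegmentExtractor`, `html_to_json`, `json_to_html` and `fake_translate` are made up for illustration, and `fake_translate` only stands in for the real LLM call. The assumption (untested) is that sending only the text means far fewer tokens to generate, hence less latency.

```python
import json
from html import escape
from html.parser import HTMLParser

# Tags whose content should never be sent for translation.
SKIP_TAGS = {"script", "style"}


class SegmentExtractor(HTMLParser):
    """Splits HTML into a template (markup + integer placeholders) and text segments."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True: text arrives unescaped
        self.template = []  # str = raw markup, int = index into self.segments
        self.segments = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        self.template.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.template.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        self.template.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip_depth or not data.strip():
            # Script/style content and pure whitespace stay in the template.
            self.template.append(data)
        else:
            self.template.append(len(self.segments))
            self.segments.append(data)

    def handle_comment(self, data):
        self.template.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.template.append(f"<!{decl}>")


def html_to_json(html_text):
    """Returns (template, JSON array of the text segments to translate)."""
    parser = SegmentExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.template, json.dumps(parser.segments, ensure_ascii=False)


def json_to_html(template, translated_json):
    """Rebuilds the page from the template and a JSON array of translated segments."""
    translated = json.loads(translated_json)
    return "".join(
        escape(translated[part], quote=False) if isinstance(part, int) else part
        for part in template
    )


def fake_translate(segments_json):
    """Placeholder for the LLM call: upper-cases each segment."""
    return json.dumps([s.upper() for s in json.loads(segments_json)])


page = '<div class="a"><p>Hello <b>world</b></p><script>var x = "<p>";</script></div>'
template, segments_json = html_to_json(page)
translated_page = json_to_html(template, fake_translate(segments_json))
```

One catch I can already see: splitting at every tag separates inline elements like `<b>` from their sentence, so the model loses some context per segment. Grouping segments by block-level element would probably translate better, at the cost of a more complicated template.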