Ask HN: For those building AI agents, how are you making them faster?

1 point | by arkmm | 23 days ago | original post
Because of the coordination across multiple systems and chaining of LLM calls, a lot of agents today can feel really slow. I would love to know how others are tackling this:

- How are you all identifying performance bottlenecks in agents?

- What types of changes have gotten you the biggest speedups?

For us, we vibe-coded a profiler to identify slow LLM calls. Sometimes we could then switch to a faster model for that step, or we'd realize we could shrink the input tokens by eliminating unnecessary context. For steps requiring external access (browser use, API calls), we've moved to fast-start external containers plus thread pools for parallelization. We've also experimented with UI changes to mask some of the latency.

What other performance-enhancing techniques are people using?
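Two of the ideas above (per-step profiling, and thread pools for independent external calls) can be sketched in a few lines of Python. This is a minimal illustration, not the poster's actual tooling: the `timed` helper and `fetch` stand-in are hypothetical names, and `time.sleep` simulates network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical profiling helper: times any agent step by name,
# so slow steps stand out when you inspect `timings` afterwards.
timings = {}

def timed(name, fn):
    start = time.perf_counter()
    result = fn()
    timings[name] = time.perf_counter() - start
    return result

# Stand-in for an external call (browser use, API request, etc.).
def fetch(url):
    time.sleep(0.2)  # simulated network latency
    return f"response from {url}"

urls = ["https://a.example", "https://b.example", "https://c.example"]

# Sequential baseline: latencies add up (~0.6s for three calls here).
sequential = timed("sequential", lambda: [fetch(u) for u in urls])

# Thread pool: independent I/O-bound calls overlap (~0.2s here).
with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    parallel = timed("parallel", lambda: list(pool.map(fetch, urls)))

assert sequential == parallel  # same results, much less wall-clock time
print(timings)
```

Because the external calls are I/O-bound, threads overlap their waits despite the GIL; for CPU-bound steps a process pool would be the analogous choice.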