Show HN: WatchLLM – Step-through debugging and cost attribution for AI agents
Hi HN! I built WatchLLM to solve two problems I kept hitting while building AI agents:

1. Debugging agents is painful. When your agent makes 20 tool calls and fails, good luck figuring out which decision was wrong. WatchLLM gives you a step-by-step timeline showing every decision, tool call, and model response, with explanations for why the agent did what it did.

2. Agent costs spiral fast. Agents love getting stuck in loops or calling expensive tools repeatedly. WatchLLM tracks cost per step and flags anomalies like "loop detected - same action repeated 3x, wasted $0.012" or "high-cost step - $0.08 exceeds threshold".

The core features:

- Timeline view of every agent decision with cost breakdown
- Anomaly detection (loops, repeated tools, high-cost steps)
- Semantic caching that cuts 40-70% off your LLM bill as a bonus
- Works with OpenAI, Anthropic, and Groq - just change your baseURL

It's built on ClickHouse for real-time telemetry and uses vector similarity for the caching layer. The agent debugger explains decisions using LLM-generated summaries of why each step happened.
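To make the anomaly flags concrete, here is a minimal sketch of the kind of loop and cost detection described above. The step shape, function name, and thresholds are my illustrative assumptions, not WatchLLM's actual implementation or defaults:

```python
from collections import Counter

def flag_anomalies(steps, repeat_threshold=3, cost_threshold=0.08):
    """Flag looping actions and high-cost steps in an agent trace.

    Each step is a dict like {"action": "search_web", "cost": 0.004}.
    The shape and thresholds are illustrative assumptions.
    """
    flags = []
    # Loop detection: the same action repeated too many times in one run.
    counts = Counter(s["action"] for s in steps)
    for action, n in counts.items():
        if n >= repeat_threshold:
            wasted = sum(s["cost"] for s in steps if s["action"] == action)
            flags.append(
                f"loop detected - {action} repeated {n}x, wasted ${wasted:.3f}"
            )
    # High-cost detection: any single step over the cost threshold.
    for s in steps:
        if s["cost"] >= cost_threshold:
            flags.append(f"high cost step - ${s['cost']:.2f} exceeds threshold")
    return flags
```

A trace with three identical $0.004 calls and one $0.09 call would produce both flags from the examples above.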
Right now it's free for up to 50K requests/month. I'm looking for early users who are building agents and want better observability into what's actually happening (and what it's costing).
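The vector-similarity caching layer mentioned above could work along these lines. This is a toy in-memory sketch with a made-up class name and threshold, not the production implementation (which sits on ClickHouse):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class SemanticCache:
    """Toy semantic cache: reuse a stored LLM response when a new
    prompt's embedding is close enough to a previously cached one."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def get(self, embedding):
        best = max(
            self.entries,
            key=lambda e: cosine(embedding, e[0]),
            default=None,
        )
        if best and cosine(embedding, best[0]) >= self.threshold:
            return best[1]  # cache hit: the LLM call is skipped entirely
        return None  # cache miss: call the model, then put() the result

    def put(self, embedding, response):
        self.entries.append((embedding, response))
```

Every hit avoids a model call outright, which is where the claimed 40-70% savings would come from on workloads with repetitive, near-duplicate prompts.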
Try it: https://watchllm.dev
Would love feedback on what other debugging features would be useful. What do you wish you had when your agents misbehave?