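Show HN: Tabstack – Browser infrastructure for AI agents (by Mozilla)
Hi HN,
My team and I are building Tabstack to handle the "web layer" for AI agents. Launch post: [https://tabstack.ai/blog/intro-browsing-infrastructure-ai-agents](https://tabstack.ai/blog/intro-browsing-infrastructure-ai-agents)
Maintaining the infrastructure for web browsing is one of the biggest bottlenecks in building reliable agents. You start with a simple fetch, but you quickly end up managing a complex stack of proxies, handling client-side hydration, debugging brittle selectors, and writing custom parsing logic for every site.
Tabstack is an API that abstracts that infrastructure away. You send a URL and an intent; we handle the rendering and return clean, structured data for the LLM.
How it works under the hood: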
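- Escalation logic: We don't spin up a full browser instance for every request (which is slow and expensive). We attempt lightweight fetches first and escalate to full browser automation only when the site requires JS execution or hydration.
- Token optimization: Raw HTML is noisy and burns context-window tokens. We process the DOM to strip non-content elements and return a markdown-friendly structure optimized for LLM consumption.
- Infrastructure stability: Scaling headless browsers is notoriously hard (zombie processes, memory leaks, crashing instances). We manage the fleet lifecycle and orchestration so you can run thousands of concurrent requests without maintaining the underlying grid.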
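To make the first two points concrete, here is a minimal Python sketch of the general pattern. It is not Tabstack's actual implementation. The tag list, the 200-character threshold, and the names `extract_text`, `needs_browser`, and `fetch_content` are all assumptions made for illustration, and the two fetchers are passed in as plain callables so the sketch stays independent of any HTTP or browser library.

```python
from html.parser import HTMLParser
from typing import Callable

# Tags treated as non-content. This list is an assumption for the sketch.
SKIP_TAGS = {"script", "style", "noscript", "nav", "footer", "header", "aside"}


class ContentExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Strip non-content elements and return one text chunk per line."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def needs_browser(html: str, min_chars: int = 200) -> bool:
    """Heuristic: very little visible text suggests client-side rendering."""
    return len(extract_text(html)) < min_chars


def fetch_content(
    url: str,
    light_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
    min_chars: int = 200,
) -> tuple[str, str]:
    """Try the cheap fetch first; escalate to the browser only if needed."""
    html = light_fetch(url)
    if needs_browser(html, min_chars):
        # Hypothetical escalation path: a headless browser renders the page.
        return extract_text(browser_fetch(url)), "browser"
    return extract_text(html), "light"
```

In practice `light_fetch` would wrap a plain HTTP client and `browser_fetch` a headless browser session. A real system would likely use richer escalation signals than text length, such as known framework markers or per-site history.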
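On ethics: Since we are backed by Mozilla, we are strict about how this interacts with the open web.
- We respect robots.txt rules.
- We identify our User Agent.
- We do not use requests/content to train models.
- Data is ephemeral and discarded after the task.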
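As an illustration of the first two points, a crawler can check robots.txt and send an identifying User-Agent using only the Python standard library. This is a sketch under stated assumptions, not Tabstack's code: the `ExampleAgentBot` token, the info URL in the header, and the helper names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent token; a real service would publish its own.
USER_AGENT = "ExampleAgentBot"


def build_rules(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that has already been downloaded."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def may_fetch(rules: RobotFileParser, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True only if robots.txt allows this user agent to fetch the URL."""
    return rules.can_fetch(user_agent, url)


def request_headers(user_agent: str = USER_AGENT) -> dict[str, str]:
    """Headers that identify the client instead of imitating a browser."""
    # The info URL is a placeholder, not a real endpoint.
    return {"User-Agent": f"{user_agent}/1.0 (+https://example.com/bot-info)"}
```

A caller would run `may_fetch` before every request and skip any URL it rejects.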
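The linked post goes into more detail on the infrastructure and why we think browsing needs to be a distinct layer in the AI stack.
This is obviously a very new space and we're all learning together. There are plenty of known unknowns (and likely even more unknown unknowns) when it comes to agentic browsing, so we'd genuinely appreciate your feedback, questions, and tips.
Happy to answer questions about the stack, our architecture, or the challenges of building browser infrastructure.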