Launch HN: Terminal Use (YC W26) – Vercel for filesystem-based agents

1 | Author: filipbalucha | about 3 hours ago | Original post
Hello Hacker News! We're Filip, Stavros, and Vivek from Terminal Use (<a href="https://www.terminaluse.com/">https://www.terminaluse.com/</a>). We built Terminal Use to make it easier to deploy agents that run in a sandboxed environment and need a filesystem to do their work. This includes coding agents, research agents, document processing agents, and internal tools that read and write files.<p>Here's a demo: <a href="https://www.youtube.com/watch?v=ttMl96l9xPA" rel="nofollow">https://www.youtube.com/watch?v=ttMl96l9xPA</a>.<p>Our biggest pain point with hosting agents was that you'd need to stitch together multiple pieces: packaging your agent, running it in a sandbox, streaming messages back to users, persisting state across turns, and moving files to and from the agent workspace.<p>We wanted something like Cog from Replicate, but for agents: a simple way to package agent code from a repo and serve it behind a clean API/SDK. We wanted to provide a protocol for communicating with your agent without constraining the agent logic or the harness itself.<p>On Terminal Use, you package your agent from a repo with a config.yaml and a Dockerfile, then deploy it with our CLI. You define the logic of three endpoints (on_create, on_event, and on_cancel), which track the lifecycle of a task (conversation). The config.yaml contains details about resources, build context, etc.<p>Out of the box, we support Claude Agent SDK and Codex SDK agents. By support, we mean that we have an adapter that converts the SDK message types to ours. If you'd like to use your own custom harness, you can convert and send messages with our types (Vercel AI SDK v6 compatible). 
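<p>To make the three-endpoint lifecycle above concrete, here is a minimal sketch in plain Python. It is not the Terminal Use SDK: the Task and EchoAgent classes, their fields, and the handler signatures are all hypothetical, and only the hook names on_create, on_event, and on_cancel come from the post.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """One conversation. Every field name here is hypothetical."""
    task_id: str
    messages: List[str] = field(default_factory=list)
    cancelled: bool = False


class EchoAgent:
    """Toy agent exposing the three lifecycle hooks named in the post."""

    def on_create(self, task: Task) -> None:
        # Runs once, when the task (conversation) is created.
        task.messages.append("agent: ready")

    def on_event(self, task: Task, text: str) -> None:
        # Runs for each incoming user message; does nothing after cancellation.
        if task.cancelled:
            return
        task.messages.append(f"user: {text}")
        task.messages.append(f"agent: echo {text}")

    def on_cancel(self, task: Task) -> None:
        # Runs when the task is stopped.
        task.cancelled = True
        task.messages.append("agent: cancelled")


def run_demo() -> List[str]:
    """Drive one task through create, one event, cancel, and a late event."""
    agent = EchoAgent()
    task = Task(task_id="demo-1")
    agent.on_create(task)
    agent.on_event(task, "hello")
    agent.on_cancel(task)
    agent.on_event(task, "too late")  # dropped: the task is already cancelled
    return task.messages
```

In this sketch the platform's job would be to call the hooks in order and stream task.messages back to the user; the agent logic inside each hook stays free-form.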
For the frontend, we have a Vercel AI SDK provider that lets you use your agent with Vercel's AI SDK, and a messages module so that you don't have to manage streaming and persistence yourself.<p>The part we think is most different is storage.<p>We treat filesystems as first-class primitives, separate from the lifecycle of a task. That means you can persist a workspace across turns, share it between different agents, or upload and download files whether or not the sandbox is active. Further, our filesystem SDK provides presigned URLs that let your users upload and download files directly, so you don't need to proxy file transfers through your backend.<p>Since your agent logic and filesystem storage are decoupled, you can iterate on your agents without worrying about the files in the sandbox: if you ship a bug, you can deploy a fix and auto-migrate all your tasks to the new deployment. If you make a breaking change, you can specify that existing tasks stay on the existing version, and only new tasks use the new version.<p>We're also adding support for multi-filesystem mounts with configurable mount paths and read/write modes, so storage stays durable and reusable while mount layout stays task-specific.<p>On the deployment side, we've been influenced by modern developer platforms: simple CLI deployments, preview/production environments, git-based environment targeting, logs, and rollback. All the configuration you need to build, deploy, and manage resources for your agent is stored in the config.yaml file, which makes it easy to build and deploy your agent in CI/CD pipelines.<p>Finally, we've explicitly designed our platform so that your CLI coding agents can help you build, test, and iterate on your agents. With our CLI, your coding agents can send messages to your deployed agents and download filesystem contents to help you understand your agent's output. 
A common way we test our agents is to write Markdown files with the user scenarios we'd like to test, then ask Claude Code to impersonate our users and chat with our deployed agent.<p>What we do not have yet: full parity with general-purpose sandbox providers. For example, preview URLs and lower-level sandbox.exec(...) style APIs are still on the roadmap.<p>We're excited to hear any thoughts, insights, questions, and concerns in the comments below!
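<p>As a footnote to the migration behavior described in the storage section (auto-migrate every task after a bug fix, keep existing tasks on the old version after a breaking change), the rule can be sketched as a small pure function. This is an illustration only: the Rollout enum, the version_for_task function, and the version strings are hypothetical and are not taken from the Terminal Use API or config.yaml.

```python
from enum import Enum
from typing import Optional


class Rollout(Enum):
    """Hypothetical rollout modes for a new deployment."""
    MIGRATE_ALL = "migrate_all"    # bug fix: every task moves to the new version
    PIN_EXISTING = "pin_existing"  # breaking change: only new tasks move


def version_for_task(current: Optional[str], latest: str, rollout: Rollout) -> str:
    """Pick the deployment version that should serve a task.

    current is None for a brand-new task, otherwise the version the task
    was last served by.
    """
    if current is None:
        # New tasks always start on the latest deployment.
        return latest
    if rollout is Rollout.MIGRATE_ALL:
        # Existing tasks are moved forward as well.
        return latest
    # PIN_EXISTING: existing tasks stay where they are.
    return current
```

Because the workspace lives outside the deployment in this model, switching the version that serves a task does not touch the task's files.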