Show HN: Off Grid – on-device AI with tool use, vision, images, and voice, now 3x faster
Nine days ago I posted Off Grid here and you showed up - 124 points, 66 comments, bug reports I fixed same-day, and the kind of feedback that makes open source worth it.
You told me what you wanted. Here's what I shipped:
Your AI can now use tools — entirely offline.
Web search, calculator, date/time, device info — with automatic tool loops.
Your 3B parameter model doesn't just generate text anymore. It reasons, calls tools, and synthesizes results.
On your phone. No API key. No server. No cloud function.
So what? It means the gap between "local toy" and "useful assistant" just got dramatically smaller.
You don't need GPT-4 to look something up and give you an answer. A quantized Qwen 3 / SMOLLM3 running on your Snapdragon can do it in no time.
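The automatic tool loop described above can be sketched in a few lines. This is a minimal illustration, not Off Grid's actual code: the tool name, message shape, and the `run_llm` stub (which stands in for the local GGUF model) are all hypothetical.

```python
def calculator(expression: str) -> str:
    # Toy tool: evaluate simple arithmetic. A real app would sandbox this.
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

def run_llm(messages):
    # Stub standing in for the on-device model. It fakes one tool call
    # followed by a final answer, just to show the loop's shape.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "calculator", "args": {"expression": "17 * 3"}}
    return {"answer": "17 * 3 is 51."}

def tool_loop(user_msg: str, max_steps: int = 5) -> str:
    # The "automatic tool loop": keep calling the model, executing any
    # tool it requests and feeding the result back, until it answers.
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        out = run_llm(messages)
        if "answer" in out:  # model produced a final answer
            return out["answer"]
        result = TOOLS[out["tool"]](**out["args"])  # run the requested tool
        messages.append({"role": "tool", "content": result})
    return "Gave up after too many tool calls."

print(tool_loop("What is 17 * 3?"))
```

The key point is that the loop, not the user, shuttles tool results back to the model, so a small local model can chain reason → call → synthesize without any server in between.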
3x faster with configurable KV cache.
You can now choose between f16, q8_0, and q4_0 KV cache types. On q4_0, models that were doing 10 tok/s are hitting 30. The app even nudges you after your first generation: "Hey, you could be running faster." One tap.
So what? The #1 complaint about on-device AI is "it's too slow to be useful." That argument just lost a lot of weight. 30 tokens/second on a phone is faster than most people read.
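To see why the cache type matters so much on a phone, here is a back-of-the-envelope sizing sketch. The layer/head counts are illustrative, not any specific model's config; the bytes-per-element figures come from ggml's block formats (f16 = 2 bytes/elem, q8_0 = 34 bytes per 32-element block, q4_0 = 18 bytes per 32-element block).

```python
# Approximate bytes per element for ggml KV cache types.
BYTES_PER_ELEM = {"f16": 2.0, "q8_0": 34 / 32, "q4_0": 18 / 32}

def kv_cache_mb(cache_type, n_layers=28, n_kv_heads=8, head_dim=128, n_ctx=4096):
    # K and V each store n_layers * n_kv_heads * head_dim values per token.
    elems = 2 * n_layers * n_kv_heads * head_dim * n_ctx
    return elems * BYTES_PER_ELEM[cache_type] / 1024**2

for t in ("f16", "q8_0", "q4_0"):
    print(f"{t:5s} ~{kv_cache_mb(t):4.0f} MiB")
```

With these (hypothetical) dimensions the cache drops from roughly 448 MiB at f16 to about 126 MiB at q4_0 for a 4k context, which is the difference between thrashing and fitting comfortably in a phone's memory. In llama.cpp itself this is exposed via the `--cache-type-k` / `--cache-type-v` flags.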
Live on both stores. No sideloading. No Xcode.
Off Grid is now on the App Store and Google Play. Install it like any other app. Your parents could use this.
So what? On-device AI just went from "cool weekend project for developers" to "thing normal people can actually try." That matters because privacy shouldn't require a CS degree.
What hasn't changed:
- MIT licensed. Fully open source. Every line.
- Zero data leaves your device. No analytics. No telemetry. No "anonymous usage data."
- Text gen (15-30 tok/s), image gen (5-10s on NPU), vision AI, voice transcription, document analysis — all offline
- Bring any GGUF model. Run Qwen 3, Llama 3.2, Gemma 3, Phi-4, whatever you want.
I'm building this because I believe the phone in your pocket should be the most private computer you own — not the most surveilled. Every week the models get smaller and faster. The hardware is already there. The software just needs to catch up.
If this resonates, a star on GitHub genuinely helps: https://github.com/alichherawalla/off-grid-mobile
I'm in the comments. Tell me what to build next.