Show HN: Off Grid: On-device AI with web search, tools, vision, image, voice – 3x faster

Author: ali_chherawalla · about 3 hours ago
Nine days ago I posted Off Grid here and you showed up - 124 points, 66 comments, bug reports I fixed same-day, and the kind of feedback that makes open source worth it.

You told me what you wanted. Here's what I shipped:

Your AI can now use tools — entirely offline.

Web search, calculator, date/time, device info — with automatic tool loops.

Your 3B-parameter model doesn't just generate text anymore. It reasons, calls tools, and synthesizes results.

On your phone. No API key. No server. No cloud function.

So what? It means the gap between "local toy" and "useful assistant" just got dramatically smaller.

You don't need GPT-4 to look something up and give you an answer. A quantized Qwen 3 / SmolLM3 running on your Snapdragon can do it in no time.

3x faster with a configurable KV cache.

You can now choose between f16, q8_0, and q4_0 KV cache types. On q4_0, models that were doing 10 tok/s are hitting 30. The app even nudges you after your first generation: "Hey, you could be running faster." One tap.

So what? The #1 complaint about on-device AI is "it's too slow to be useful." That argument just lost a lot of weight. 30 tokens/second on a phone is faster than most people read.

Live on both stores. No sideloading. No Xcode.

Off Grid is now on the App Store and Google Play. Install it like any other app. Your parents could use this.

So what? On-device AI just went from "cool weekend project for developers" to "thing normal people can actually try." That matters because privacy shouldn't require a CS degree.

What hasn't changed:

- MIT licensed. Fully open source. Every line.
- Zero data leaves your device. No analytics. No telemetry. No "anonymous usage data."
- Text gen (15-30 tok/s), image gen (5-10s on NPU), vision AI, voice transcription, document analysis — all offline.
- Bring any GGUF model. Run Qwen 3, Llama 3.2, Gemma 3, Phi-4, whatever you want.

I'm building this because I believe the phone in your pocket should be the most private computer you own — not the most surveilled. Every week the models get smaller and faster. The hardware is already there. The software just needs to catch up.

If this resonates, a star on GitHub genuinely helps: https://github.com/alichherawalla/off-grid-mobile

I'm in the comments. Tell me what to build next.
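For anyone curious what the "automatic tool loop" mentioned above means in principle: the model either answers directly or emits a tool call, the runtime executes the tool, and the result is fed back into the context until the model produces a final answer. Here is a minimal sketch in Python — every name in it (`generate`, `TOOLS`, the message format) is illustrative, not Off Grid's actual code, and the model call is stubbed out:

```python
# Minimal tool-loop sketch. A real on-device runtime would call a local
# GGUF model via llama.cpp and parse its tool-call tokens; here `generate`
# is a hard-coded stand-in so the control flow is runnable on its own.

TOOLS = {
    # Demo-only calculator; never eval untrusted input in real code.
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def generate(messages):
    """Stand-in for a local model call: request a tool on the first turn,
    then answer once a tool result is present in the context."""
    last = messages[-1]["content"]
    if last.startswith("tool_result:"):
        return {"type": "answer", "text": f"The result is {last.split(':', 1)[1]}."}
    return {"type": "tool_call", "name": "calculator", "args": "2 + 3 * 4"}

def tool_loop(user_prompt, max_steps=5):
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        out = generate(messages)
        if out["type"] == "answer":
            return out["text"]
        result = TOOLS[out["name"]](out["args"])  # run the requested tool
        messages.append({"role": "tool", "content": f"tool_result:{result}"})
    return "Step limit reached without a final answer."

print(tool_loop("What is 2 + 3 * 4?"))  # -> The result is 14.
```

The `max_steps` cap is the important design detail: a small model can get stuck requesting tools forever, so the loop needs a bound before it forces a final answer.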
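To see why the q4_0 KV cache buys so much headroom, a back-of-envelope memory calculation helps. The byte-per-element figures below follow the ggml block formats (q8_0 stores 32 values in 34 bytes, q4_0 in 18); the layer and head counts are rough assumptions for a ~3B model, not the specs of any particular one:

```python
# KV cache size per token: K and V tensors, one pair per layer.
# bytes_per_elem: f16 = 2.0; q8_0 = 34/32 = 1.0625; q4_0 = 18/32 = 0.5625.
def kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_elem):
    return 2 * layers * kv_heads * head_dim * bytes_per_elem

# Illustrative shape for a ~3B model with grouped-query attention.
LAYERS, KV_HEADS, HEAD_DIM = 36, 8, 128

for name, nbytes in [("f16", 2.0), ("q8_0", 1.0625), ("q4_0", 0.5625)]:
    per_tok = kv_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM, nbytes)
    print(f"{name:5s} {per_tok / 1024:6.1f} KiB/token  "
          f"{per_tok * 8192 / 2**20:6.1f} MiB at 8K context")
```

Under these assumptions an f16 cache needs roughly 3.5x the memory of q4_0 at the same context length, which is why a phone that thrashes at 10 tok/s on f16 can have room to breathe on q4_0.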