I built an open-source offline ChatGPT alternative in 40MB

By RajGuruYadav, 7 months ago (original post)
by Raj Guru Yadav

Like many developers, I've been fascinated by LLMs. But the moment I asked, "Can I run a ChatGPT-like assistant offline, fast, and without needing 16GB+ RAM?", the challenge became too tempting to ignore.

The Goal

Build a fully offline, lightweight AI assistant with:

- < 50MB download size
- No internet requirement
- Fast responses (under 1 second)
- Zero telemetry
- Fully local embeddings & inference

Result: a 40MB offline ChatGPT clone you can run in-browser or on a USB stick.

What's Inside the 40MB?

Here's how I squeezed intelligent conversation into such a tiny package:

- Model: Mistral 7B Q4_K_M, quantized via llama.cpp
- Inference engine: llama.cpp (compiled to WebAssembly or native C++)
- UI: lightweight React/Tailwind interface
- Storage: IndexedDB for local chat history
- Embeddings: local MiniLM for smart PDF or note search
- Extras: Whisper.cpp for local voice input; Coqui TTS for speech output

Why I Built It

I (Raj Guru Yadav), a 16-year-old dev and student, wanted to:

- Learn deeply how LLMs actually work under the hood
- Build something privacy-respecting and local
- Prove that AI doesn't need the cloud to be powerful
- Give offline users (like many students in India) real AI support

Challenges

- Memory bottlenecks on low-RAM devices
- Prompt tuning for smarter replies from tiny models
- WebAssembly optimizations for browser performance
- Offline voice + text integration with small TTS/ASR models

Performance (on a 4GB laptop)

- Answers factual, coding, and math questions decently
- Reads and summarizes offline PDFs
- Remembers conversations locally
- (Optional) Speaks answers aloud

Final Thought

AI shouldn't be locked behind paywalls or clouds. My goal is to bring smart assistants into everyone's hands: fully offline, fully free, fully yours.

Made by Raj Guru Yadav
Dev | Builder of 700+ projects | Passionate about open AI for all
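Before a local model like MiniLM can search PDFs or notes offline, the text usually has to be split into chunks small enough to embed. A minimal sketch of that step, assuming a simple overlapping word-window strategy (the function and parameter names here are illustrative, not from the project):

```javascript
// Split a document into overlapping word-window chunks before embedding.
// windowSize and overlap are tuning knobs (assumed values, not the post's).
function chunkText(text, windowSize = 200, overlap = 40) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  const step = windowSize - overlap; // how far the window advances each time
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + windowSize).join(" "));
    // Stop once the window has covered the end of the document.
    if (start + windowSize >= words.length) break;
  }
  return chunks;
}
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk, at the cost of embedding slightly more text.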
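The "smart PDF or note search" the post describes with local MiniLM embeddings boils down to nearest-neighbor retrieval: embed the query, then rank stored chunk vectors by cosine similarity. A hedged sketch of that ranking step, assuming chunks have already been embedded into plain number arrays (cosineSimilarity and topK are illustrative names, not the project's API):

```javascript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored chunks against a query vector and return the best k.
function topK(queryVec, chunks, k = 3) {
  return chunks
    .map(c => ({ ...c, score: cosineSimilarity(queryVec, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

For a few thousand chunks this brute-force scan is fast enough that no vector index is needed, which fits the tiny-footprint goal.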
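One of the listed challenges is prompt tuning for smarter replies from tiny models. Small quantized models tend to drift without a tight, explicit template, so the usual approach is to assemble a fixed system instruction plus the recent turns into one prompt string. A sketch under that assumption (the template and names are illustrative, not the project's actual prompt):

```javascript
// Assemble a prompt for a small local model: a terse system instruction,
// the recent chat turns, and a trailing "assistant:" cue for the model
// to complete. history is an array of { role, text } objects.
function buildPrompt(history, userMsg) {
  const system = "You are a concise offline assistant. Answer directly.";
  const turns = history.map(t => `${t.role}: ${t.text}`).join("\n");
  return `${system}\n${turns}\nuser: ${userMsg}\nassistant:`;
}
```

Keeping the instruction short also matters here: every template token eats into the small context window the quantized model has to work with.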