Hacker News (Chinese edition)



What I’m asking HN: What does your actually useful local LLM stack look like? I’m looking for something that provides you with real value, not just a sexy demo.

---

After a recent internet outage, I realized I need a local LLM setup as a backup, not just for experimentation and fun.

My daily (remote) LLM stack:

 - Claude Max ($100/mo): My go-to for pair programming. Heavy user of both the Claude web and desktop clients.
 - Windsurf Pro ($15/mo): Love the multi-line autocomplete and how it uses clipboard/context awareness.
 - ChatGPT Plus ($20/mo): My rubber duck, editor, and ideation partner. I use it for everything except code.

Here’s what I’ve cobbled together for my local stack so far:

Tools

 - Ollama: for running models locally
 - Aider: Claude-code-style CLI interface
 - VSCode w/ continue.dev extension: local chat & autocomplete

Models

 - Chat: llama3.1:latest
 - Autocomplete: Qwen2.5 Coder 1.5B
 - Coding/Editing: deepseek-coder-v2:16b

Things I’m not worried about:

 - CPU/Memory (running on an M1 MacBook)
 - Cost (within reason)
 - Data privacy / being trained on (not trying to start a philosophical debate here)

I am worried about:

 - Actual usefulness (i.e. “vibes”)
 - Ease of use (tools that fit with my muscle memory)
 - Correctness (not benchmarks)
 - Latency & speed

Right now: I’ve got it working. I could make a slick demo. But it’s not actually useful yet.

---

Who I am

 - CTO of a small startup (5 amazing engineers)
 - 20 years of coding (since I was 13)
 - Ex-big tech
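For the "latency & speed" worry, one rough way to compare the models listed above is to time a single prompt against each through Ollama's local HTTP API. This is only a sketch under a few assumptions: Ollama is serving on its default port 11434, the three models are already pulled, and `qwen2.5-coder:1.5b` is a guessed tag for "Qwen2.5 Coder 1.5B" (the post does not give one). The helper names (`build_payload`, `tokens_per_second`, `time_model`, `main`) are made up for illustration.

```python
import json
import time
import urllib.request

# Ollama's default local endpoint (assumes an unmodified install).
OLLAMA_URL = "http://localhost:11434/api/generate"

# Role -> model tag. The chat and coding tags come from the post; the
# autocomplete tag is an ASSUMED Ollama name for "Qwen2.5 Coder 1.5B".
MODELS = {
    "chat": "llama3.1:latest",
    "autocomplete": "qwen2.5-coder:1.5b",
    "coding": "deepseek-coder-v2:16b",
}


def build_payload(model, prompt):
    """Build a non-streaming /api/generate request body."""
    return {"model": model, "prompt": prompt, "stream": False}


def tokens_per_second(result):
    """Generation speed from eval_count and eval_duration (nanoseconds)."""
    duration_ns = result.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0
    return result.get("eval_count", 0) / (duration_ns / 1e9)


def time_model(model, prompt, url=OLLAMA_URL):
    """Send one prompt; return (wall-clock seconds, tokens per second)."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=300) as response:
        result = json.loads(response.read())
    return time.perf_counter() - start, tokens_per_second(result)


def main():
    """Time the same prompt against every model in MODELS."""
    prompt = "Write a Python function that reverses a linked list."
    for role, model in MODELS.items():
        seconds, tps = time_model(model, prompt)
        print(f"{role:<12} {model:<24} {seconds:6.1f}s  {tps:6.1f} tok/s")
```

Calling `main()` with Ollama running prints one line per model. The first request to each model includes load time, so a second run is probably a fairer picture of steady-state speed. It says nothing about correctness or "vibes", only about how long you wait.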

Ask HN: What is your actually useful local LLM stack?