Ask HN: How do you give AI agents real codebase context without burning too many tokens?
Working on a large Rust codebase. The token problem is real — Claude Code will happily spend $5 of context just trying to understand how two modules relate before writing a single line. And once context compaction kicks in, it's even worse — the agent loses the thread completely and starts grepping the same files again from scratch.

Approaches I've tried:

- Feeding CLAUDE.md / architecture docs manually — helps, but gets stale fast.
- Cursor's built-in indexing — breaks on monorepos, and I don't love proprietary code going to their servers.
- Basic MCP server with grep — works for exact matches, useless for semantic queries.

Eventually built something more serious: a local Tree-sitter indexer that builds a knowledge graph of file relationships and exposes it via MCP so agents query semantically instead of grepping blind. One tool call instead of 15 grep iterations. Published it here: https://github.com/Muvon/octocode

But genuinely curious what others are doing before I go deeper on it.

Three specific questions:

1. How do you handle the "ripple effect" problem — knowing that changing one file semantically affects others that aren't obviously linked?

2. Do you trust closed-source indexing with proprietary code, or have you gone local-first?

3. Has anyone gotten GraphRAG-style relationship mapping to work in practice at scale, or is it still mostly hype?
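To make question 1 concrete: the "ripple effect" query is essentially a reverse-dependency traversal over whatever import graph the indexer builds. This is not octocode's actual implementation; it's a minimal self-contained sketch, assuming a naive line-scan for `use crate::...` statements where a real indexer would walk a Tree-sitter AST. The module names and sources are made up for illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Toy index: module name -> first-segment modules it imports.
/// A real indexer would parse the syntax tree; here we just scan
/// `use crate::<mod>...` lines as a stand-in.
fn index_imports(sources: &[(&str, &str)]) -> HashMap<String, HashSet<String>> {
    let mut graph = HashMap::new();
    for (name, src) in sources {
        let deps: HashSet<String> = src
            .lines()
            .filter_map(|l| {
                l.trim().strip_prefix("use crate::").map(|rest| {
                    // keep only the first path segment:
                    // `use crate::db::Pool;` -> "db"
                    rest.split(|c| c == ':' || c == ';')
                        .next()
                        .unwrap_or("")
                        .to_string()
                })
            })
            .collect();
        graph.insert(name.to_string(), deps);
    }
    graph
}

/// Everything that transitively depends on `changed`, found by
/// BFS over the reversed import graph.
fn impacted(graph: &HashMap<String, HashSet<String>>, changed: &str) -> HashSet<String> {
    // reverse edges: dependency -> dependents
    let mut rev: HashMap<&str, Vec<&str>> = HashMap::new();
    for (m, deps) in graph {
        for d in deps {
            rev.entry(d.as_str()).or_default().push(m.as_str());
        }
    }
    let mut seen = HashSet::new();
    let mut queue = VecDeque::from([changed]);
    while let Some(m) = queue.pop_front() {
        for &dep in rev.get(m).map(|v| v.as_slice()).unwrap_or(&[]) {
            if seen.insert(dep.to_string()) {
                queue.push_back(dep);
            }
        }
    }
    seen
}

fn main() {
    let sources = [
        ("db", "pub struct Pool;"),
        ("auth", "use crate::db::Pool;"),
        ("api", "use crate::auth::check;"),
        ("cli", "use crate::api::run;"),
    ];
    let graph = index_imports(&sources);
    let hit = impacted(&graph, "db");
    // changing `db` ripples through auth -> api -> cli
    assert!(hit.contains("auth") && hit.contains("api") && hit.contains("cli"));
    println!("{:?}", hit);
}
```

The payoff over grep is exactly the "one tool call" point above: an agent asks "what breaks if I touch `db`?" once, instead of iteratively grepping for every symbol `db` exports. Re-exports and trait impls make the real graph messier, which is presumably where semantic (AST-level) edges earn their keep.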