Hacker News (Chinese edition)



I’ve been working on a project called Valori, a Python-native vector database I built from the ground up: not by reinventing every algorithm, but by wiring together efficient, well-known indexing and search techniques into a cohesive, hackable framework.

The idea came from my frustration with existing vector DBs that were either too heavy for experimentation or too opaque to modify. I wanted something simple, modular, and extensible, so I built it.

What it does:

- Lets you store, index, and search high-dimensional vectors
- Supports multiple indices (Flat, HNSW, IVF, LSH, Annoy)
- Has memory, disk, and hybrid storage backends
- Includes a full document processing pipeline (parsing, cleaning, chunking, embedding)
- Offers quantization, persistence, and plugin-based extensibility

All written in Python, integrated with NumPy, and production-tested with logging and monitoring built in.

Install: pip install valori

GitHub: https://github.com/varshith-Git/valori
PyPI: https://pypi.org/project/valori

I’d love to hear your thoughts:

- What’s missing for you in current vector DBs?
- If you’ve built LLM or RAG systems, what do you wish a lightweight, pure Python DB like this handled better?
- Would you prefer tighter integrations (LangChain, Haystack, etc.) or a more “build-it-yourself” style?

Feedback, criticism, or collaboration ideas are all welcome.

Varshith (varshith.gudur17@gmail.com)
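For readers unfamiliar with the index types listed above, Flat is the simplest: it compares the query against every stored vector and returns the closest matches. Below is a minimal NumPy sketch of that idea. It is illustrative only and does not use Valori's API; the `FlatIndex` class, its `add` and `search` methods, and the choice of cosine similarity are all assumptions made for this example.

```python
import numpy as np


class FlatIndex:
    """Hypothetical brute-force (flat) index; not Valori's actual API."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = np.empty((0, dim), dtype=np.float32)

    def add(self, vectors):
        # Append new rows; each row is one vector of length `dim`.
        vectors = np.asarray(vectors, dtype=np.float32).reshape(-1, self.dim)
        self.vectors = np.vstack([self.vectors, vectors])

    def search(self, query, k=3):
        # Cosine similarity of the query against every stored vector.
        query = np.asarray(query, dtype=np.float32)
        norms = np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(query)
        scores = self.vectors @ query / np.maximum(norms, 1e-12)
        # Indices of the k highest scores, best match first.
        top = np.argsort(-scores)[:k]
        return top, scores[top]


index = FlatIndex(dim=3)
index.add([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]])
ids, scores = index.search([1.0, 0.0, 0.0], k=2)
```

Every query costs one pass over all stored vectors, which is exact but scales linearly with the collection size. Approximate structures such as HNSW, IVF, LSH, and Annoy trade some recall for faster lookups on larger collections.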

Valori – A Python-native vector database I built from scratch