A disk-first C++ vector engine

2 points by saeedq, about 1 month ago
Hey all, I built brinicle, an in-process C++ vector engine with a Python wrapper that consumes substantially less memory while staying quite fast. On 1.2 million Amazon products, it achieves sub-millisecond P99 latency. It also supports lexical search and hybrid search. For hybrid search, we did not build two indexes and then fuse their results; instead, we create ONE HNSW graph and use it for semantic, lexical, and hybrid search.

Benchmark comparisons: https://brinicle.bicardinal.com/benchmark
Hybrid search benchmark comparisons: https://brinicle.bicardinal.com/search_benchmark
Repository: github.com/bicardinal/brinicle
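
For readers unfamiliar with the "two indexes, then fuse" approach that the post contrasts itself with, here is a minimal sketch of one common way to do that fusion, reciprocal rank fusion (RRF). This is only an illustration of the conventional baseline, not brinicle's method or API: the function name, the constant `k=60`, and the sample product ids are all assumptions made up for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumption).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 hits from two separate indexes for the same query.
semantic_hits = ["p3", "p1", "p7"]  # e.g. from a vector index
lexical_hits = ["p1", "p9", "p3"]   # e.g. from a keyword index

fused = reciprocal_rank_fusion([semantic_hits, lexical_hits])
print(fused)
```

In this baseline, two indexes have to be built, kept in memory or on disk, and queried separately before the lists are merged. The post's claim is that a single HNSW graph serving all three search modes avoids that duplication; how brinicle scores candidates inside that graph is not described here, so see the repository for details.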