

Most dense retrieval systems rely on cosine similarity or dot product, which implicitly assumes a flat embedding space. But embedding spaces often live on curved manifolds with non-uniform structure: dense regions, semantic gaps, asymmetric paths.

I've been exploring the use of:
- Ricci curvature as a reranking signal
- Soft-graphs to preserve local density
- Geodesic-aware losses during training

Curious if others have tried anything similar? Especially in information retrieval, QA, or explainability. Happy to share some experiments (FiQA/BEIR) if there's interest.
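For anyone who wants a concrete picture of the flat-versus-curved gap, here is a minimal toy sketch. It is not the FiQA/BEIR experiments mentioned above and not a real retrieval pipeline. It assumes points sampled on a half-circle as a stand-in for a curved embedding manifold, builds a k-nearest-neighbour graph, and compares the straight-line distance with the shortest-path (geodesic) distance over the graph. The names `knn_graph` and `geodesic_distances` are made up for illustration.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph with Euclidean edge weights."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrise so paths can run both ways
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` to all reachable nodes."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return best


# Toy "manifold" (an assumption for illustration): 50 points on a unit half-circle.
n = 50
points = [
    (math.cos(math.pi * t / (n - 1)), math.sin(math.pi * t / (n - 1)))
    for t in range(n)
]

graph = knn_graph(points, k=2)
geo = geodesic_distances(graph, source=0)

straight = math.dist(points[0], points[-1])  # chord through the "flat" ambient space
along = geo[n - 1]                           # path that follows the arc

print(f"straight-line: {straight:.3f}, graph geodesic: {along:.3f}")
```

On this arc the two endpoints look much closer in the ambient space than they are along the curve, which is the kind of mismatch a geodesic-aware loss or a graph-based reranker would try to account for. Whether real embedding spaces show a gap large enough to matter is an empirical question this toy does not answer.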

Why are we still flattening embedding spaces?