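Show HN: BSE, a semantic compression engine for text, images, and audio
We built BSE (Bramble Semantic Engine), a semantic compressor that transforms natural inputs into low-dimensional structured representations.
It is designed as a preprocessing engine for large language models (LLMs), reducing long inputs to compact, logic-preserving forms across three modalities: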
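1. Language
- Extracts subject-verb-object (SVO) structure
- Captures modifiers: adjectives and adverbs
- Restores pronoun referents from short-term memory
- Detects questions
- Computes:
  - Compression rate (%)
  - Semantic loss (%)
- Compares sentence compression outputs via SDC:
  - Subject-subject, verb-verb, and object-subject similarity
  - Sentence distance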
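The post does not say how BSE computes these numbers, so the following is only a minimal Python sketch under assumed definitions: compression rate as the share of tokens removed, semantic loss as the share of content words missing from the compressed triple, and slot similarity as plain string similarity from `difflib` rather than a true semantic measure. `Triple`, `STOP_WORDS`, and every function name are hypothetical, and the triple is written by hand instead of being extracted by a parser.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

# Hypothetical stop-word list; BSE's actual vocabulary is not described in the post.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "of", "to", "in", "on", "and"}


@dataclass
class Triple:
    """Hand-written stand-in for one compressed sentence (subject, verb, object)."""
    subject: str
    verb: str
    obj: str

    def tokens(self) -> list[str]:
        return [self.subject, self.verb, self.obj]


def tokenize(text: str) -> list[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,!?;:\"'()").lower() for w in text.split())
    return [w for w in words if w]


def compression_rate(original: str, triple: Triple) -> float:
    """Assumed definition: percentage of tokens removed by compression."""
    n_orig = len(tokenize(original))
    if n_orig == 0:
        return 0.0
    return 100.0 * (1 - len(triple.tokens()) / n_orig)


def semantic_loss(original: str, triple: Triple) -> float:
    """Assumed definition: percentage of content words missing from the triple."""
    content = {w for w in tokenize(original) if w not in STOP_WORDS}
    if not content:
        return 0.0
    kept = {t.lower() for t in triple.tokens()}
    return 100.0 * len(content - kept) / len(content)


def slot_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a crude stand-in for a semantic measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def sentence_distance(x: Triple, y: Triple) -> float:
    """Assumed definition: one minus the mean of the three slot similarities."""
    sims = [
        slot_similarity(x.subject, y.subject),
        slot_similarity(x.verb, y.verb),
        slot_similarity(x.obj, y.subject),  # object-to-subject pairing, direction assumed
    ]
    return 1 - sum(sims) / len(sims)
```

For "The quick brown fox jumps over the lazy dog." and the triple (fox, jumps, dog), these definitions give a compression rate of about 66.7% and a semantic loss of about 57.1%. A real implementation would presumably swap the string matcher for embeddings or another semantic measure.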
2. Image
- Crops and weights center-priority patches
- Converts them into 100x100 weighted matrices
- Visualizes:
  - R, G, B channels
  - Brightness
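One way to picture the center-priority weighting is the Python sketch below. It is an illustration under stated assumptions, not BSE's code: the Gaussian falloff, the `sigma` value, nearest-neighbour resampling, and the Rec. 601 luma weights used for brightness are all choices made here, and the function names are hypothetical. It uses only the standard library and takes a single channel as a list of rows.

```python
import math

SIZE = 100  # output side length, as stated in the post


def brightness(r: float, g: float, b: float) -> float:
    """Rec. 601 luma; the post does not say which brightness formula BSE uses."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def center_weights(size: int = SIZE, sigma: float = 0.35) -> list[list[float]]:
    """Gaussian falloff from the centre; sigma is a fraction of the side length (assumed)."""
    c = (size - 1) / 2
    s = sigma * size
    return [
        [math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * s * s)) for x in range(size)]
        for y in range(size)
    ]


def to_weighted_matrix(pixels: list[list[float]], size: int = SIZE) -> list[list[float]]:
    """Centre-crop to a square, resample to size x size, then apply the centre weights."""
    h, w = len(pixels), len(pixels[0])
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    weights = center_weights(size)
    return [
        [
            # Nearest-neighbour lookup into the cropped square, scaled by its weight.
            pixels[top + (y * side) // size][left + (x * side) // size] * weights[y][x]
            for x in range(size)
        ]
        for y in range(size)
    ]
```

Running `to_weighted_matrix` once per channel (R, G, B) and once on the brightness values would yield the four matrices the list describes; with this weighting, pixels near the center keep almost their full value while corners are strongly attenuated.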
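3. Audio
- Decomposes audio into pitch and intensity across frequency bands
- Returns normalized 2D matrices
- Visualizes them as grayscale spectro-patches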
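The audio step can be sketched in a similar way. The Python below is a rough illustration only: it splits a signal into fixed frames, takes a naive DFT of each frame, sums the magnitudes into equal-width frequency bands, and scales the result to [0, 1]. The frame length, band count, peak normalization, and the `band_matrix` name are assumptions made here; the post does not describe how BSE frames, bands, or normalizes audio, and this sketch covers intensity per band but not pitch tracking.

```python
import cmath
import math


def band_matrix(samples: list[float], frame: int = 64, bands: int = 8) -> list[list[float]]:
    """Return a bands x frames matrix of spectral magnitudes, scaled to [0, 1]."""
    half = frame // 2          # keep only the non-negative frequency bins
    per_band = half // bands   # number of DFT bins summed into each band
    columns = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        # Naive DFT magnitudes for bins 0 .. half-1 (fine for a demo, slow for real audio).
        mags = [
            abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame) for n in range(frame)))
            for k in range(half)
        ]
        columns.append([sum(mags[b * per_band:(b + 1) * per_band]) for b in range(bands)])
    # Normalize by the global peak; fall back to 1.0 for silence or empty input.
    peak = max((v for col in columns for v in col), default=0.0) or 1.0
    # Transpose so that rows are bands and columns are time frames.
    return [[col[b] / peak for col in columns] for b in range(bands)]
```

Multiplying each value by 255 gives pixel intensities for a grayscale patch. For real recordings an FFT library would replace the naive DFT.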
Live demo (Gradio):
[https://huggingface.co/spaces/Sibyl-V/BSE_demo](https://huggingface.co/spaces/Sibyl-V/BSE_demo)
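Feedback welcome on:
- Compression logic
- Use cases (LLM fine-tuning, retrieval, alignment)
- Design of the multi-modal structured output
Built in 48 hours by a solo dev and their black nine-tailed fox partner. Let us know what you'd improve, and what scares you.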