Show HN: CCCP – A programmable, context-aware compression protocol (early stage)

1 | by brucekaushik | about 1 month ago | original post
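I have been tinkering with an idea I call CCCP: Context-Aware Composable Compression Protocol.

Most compression formats treat the process as a black box: you feed bytes in, you get bytes out. I wanted something programmable and composable, where the format itself can be adapted to different domains, and even customized by different vendors.

So far, CCCP has a few interesting properties:

- Composable: Multiple LUTs (look-up tables) and encoding phases can be combined.
- Context-aware: Decoding is guided by explicit metadata, not just raw byte streams.
- Round-trippable IR: The intermediate representation can reconstruct the original logic before final binary compression.
- Programmable: Vendors can plug in their own LUTs, encoders, and decoders.

It is still very early and experimental. I would love to hear if anyone has seen similar approaches, or where this might break down in real-world usage.

Repos:

- [https://github.com/brucekaushik/cccp](https://github.com/brucekaushik/cccp)
- [https://github.com/brucekaushik/cccp-python-poc](https://github.com/brucekaushik/cccp-python-poc)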
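
To make the four properties above concrete, here is a rough Python sketch of how composable LUT stages, metadata-guided decoding, and a round-trippable IR could fit together. It is not taken from the CCCP repos: the LUT names, the `to_ir`/`from_ir`/`pack`/`unpack` helpers, the 2-byte length-prefixed JSON header, and the use of zlib as the final binary stage are all assumptions made for illustration.

```python
import json
import zlib

# Hypothetical vendor LUTs: each maps a frequent byte string to a one-byte token.
# Assumption of this sketch: tokens never occur in the input or in any phrase.
LUTS = {
    "http-headers": {b"Content-Type": b"\x01", b"application/json": b"\x02"},
    "json-keys": {b'"timestamp"': b"\x03", b'"user_id"': b"\x04"},
}


def to_ir(data, stages):
    """Apply the named LUT stages in order and record them as metadata."""
    payload = data
    for name in stages:
        lut = LUTS[name]
        # Refuse inputs that already contain a token, so decoding stays exact.
        if any(token in payload for token in lut.values()):
            raise ValueError(f"token collision in stage {name!r}")
        for phrase, token in lut.items():
            payload = payload.replace(phrase, token)
    return {"stages": list(stages), "payload": payload}


def from_ir(ir):
    """Undo the stages in reverse order, guided only by the IR's metadata."""
    payload = ir["payload"]
    for name in reversed(ir["stages"]):
        for phrase, token in LUTS[name].items():
            payload = payload.replace(token, phrase)
    return payload


def pack(ir):
    """Final binary step: length-prefixed JSON header + zlib-compressed payload."""
    header = json.dumps({"stages": ir["stages"]}).encode("utf-8")
    return len(header).to_bytes(2, "big") + header + zlib.compress(ir["payload"])


def unpack(blob):
    """Read the header first; it tells the decoder which stages to reverse."""
    n = int.from_bytes(blob[:2], "big")
    header = json.loads(blob[2:2 + n].decode("utf-8"))
    return {"stages": header["stages"], "payload": zlib.decompress(blob[2 + n:])}


sample = b'Content-Type: application/json\n{"timestamp": 1, "user_id": 42}'
stages = ["http-headers", "json-keys"]
ir = to_ir(sample, stages)          # inspectable intermediate representation
blob = pack(ir)                     # final binary form
restored = from_ir(unpack(blob))    # decode driven by the embedded metadata
```

In this sketch the decoder never guesses: it reads the stage list from the header and reverses exactly those stages. One place a scheme like this could break down in practice is token collisions with real input bytes, which the sketch sidesteps by rejecting such inputs rather than escaping them.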