


I have been tinkering with an idea I call CCCP: Context-Aware Composable Compression Protocol.

Most compression formats treat the process as a black box: you feed bytes in, you get bytes out. I wanted something programmable and composable, where the format itself can be adapted to different domains, and even customized by different vendors.

So far, CCCP has a few interesting properties:

- Composable: Multiple LUTs (look-up tables) and encoding phases can be combined.
- Context-aware: Decoding is guided by explicit metadata, not just raw byte streams.
- Round-trippable IR: The intermediate representation can reconstruct the original logic before final binary compression.
- Programmable: Vendors can plug in their own LUTs, encoders, and decoders.

It is still very early and experimental. Would love to hear if anyone has seen similar approaches, or where this might break down in real-world usage.

Repos:

- https://github.com/brucekaushik/cccp
- https://github.com/brucekaushik/cccp-python-poc
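To make the four properties a bit more concrete, here is a minimal Python sketch. It is not the format or API from the repos; the `LUTS` tables, the stage names, the JSON-shaped IR, and the use of `zlib` as the final binary step are all assumptions made purely for illustration.

```python
import json
import zlib

# Hypothetical vendor look-up tables (phrase -> one-character token).
# Names and contents are illustrative, not taken from the CCCP repos.
LUTS = {
    "url": {"https://": "\x01", "http://": "\x02"},
    "log": {"ERROR ": "\x03", "INFO ": "\x04"},
}


def to_ir(text, stages):
    """Apply each named LUT in order and record that order as metadata."""
    reserved = {tok for name in stages for tok in LUTS[name].values()}
    # Substitution is only reversible if the input has no reserved tokens.
    if any(ch in reserved for ch in text):
        raise ValueError("input already contains a reserved token")
    payload = text
    for name in stages:
        for phrase, token in LUTS[name].items():
            payload = payload.replace(phrase, token)
    return {"stages": list(stages), "payload": payload}


def from_ir(ir):
    """Undo the stages in reverse order, guided only by the IR metadata."""
    text = ir["payload"]
    for name in reversed(ir["stages"]):
        for phrase, token in LUTS[name].items():
            text = text.replace(token, phrase)
    return text


def compress(text, stages):
    """IR first, then a generic byte-level compressor as the last phase."""
    ir = to_ir(text, stages)
    return zlib.compress(json.dumps(ir).encode("utf-8"))


def decompress(blob):
    """Recover the IR from the bytes, then let its metadata drive decoding."""
    return from_ir(json.loads(zlib.decompress(blob).decode("utf-8")))


sample = "INFO fetched https://example.com\nERROR failed http://example.org\n"
blob = compress(sample, ["url", "log"])
assert decompress(blob) == sample
```

The stage list travels inside the IR, so the decoder needs no out-of-band knowledge of which LUTs were applied or in what order. The reserved-token check is what keeps the substitution step reversible in this toy version; a real format would presumably need an escaping scheme instead of rejecting such input.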

Show HN: CCCP – A programmable, context-aware compression protocol (early stage)