Show HN: SJT - A lightweight structured JSON table format for APIs
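Hi HN,
I built a small experimental format called SJT (Structured JSON Table) to optimize data transport in APIs. The idea is simple: instead of repeating object keys in every row, SJT separates the structure (the header) from the values. This makes payloads both more compact and easier to stream.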
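To make the header/value split concrete, here is a minimal sketch of the general idea for flat records that all share the same keys. The helper names `toTable` and `fromTable` and the `{ header, rows }` layout are assumptions for illustration only; they are not the SJT wire format or the sjt.js API (see the spec for the real encoding).

```javascript
// Hypothetical helpers illustrating the header/value split.
// This is NOT the SJT wire format or the sjt.js API.

// Turn an array of flat, uniformly shaped records into { header, rows }.
function toTable(records) {
  if (records.length === 0) return { header: [], rows: [] };
  const header = Object.keys(records[0]); // keys are stored once
  const rows = records.map((record) => header.map((key) => record[key]));
  return { header, rows };
}

// Rebuild the original records from { header, rows }.
function fromTable({ header, rows }) {
  return rows.map((row) =>
    Object.fromEntries(header.map((key, i) => [key, row[i]]))
  );
}

const records = [
  { id: 1, author: "alice", content: "hello" },
  { id: 2, author: "bob", content: "hi there" },
];

const table = toTable(records);
const plain = JSON.stringify(records);
const compact = JSON.stringify(table);

console.log(compact);
console.log(plain.length, compact.length);
```

Because the keys are written once instead of once per record, the saving in this sketch grows with the number of rows and with the length of the key names.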
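For example, with Discord's /messages endpoint:
Raw JSON payload: ~50,110 bytes
Same data encoded with SJT: ~26,494 bytes
That is a reduction of about 47%, and the data can still be decoded incrementally, record by record. Surprisingly, decoding can even be faster than plain JSON, because there is less string parsing overhead (keys appear once rather than in every record).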
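Record-by-record decoding could look roughly like the sketch below. It assumes a hypothetical line-delimited layout (first line is the header array, each later line is one row array), which is chosen only to keep the example short and is not the actual SJT encoding. The function name `decodeRecords` is also made up for this example.

```javascript
// Hypothetical line-delimited layout: line 1 is the header array,
// every following line is one row array. Not the actual SJT encoding.
function* decodeRecords(lines) {
  let header = null;
  for (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    const values = JSON.parse(line);
    if (header === null) {
      header = values; // first non-empty line carries the keys
      continue;
    }
    // Emit one record as soon as its row has been parsed.
    yield Object.fromEntries(header.map((key, i) => [key, values[i]]));
  }
}

const payload = [
  '["id","author","content"]',
  '[1,"alice","hello"]',
  '[2,"bob","hi there"]',
].join("\n");

for (const record of decodeRecords(payload.split("\n"))) {
  console.log(record);
}
```

Since the generator yields each record as soon as its row is parsed, a consumer can start working before the rest of the payload has been read. In a real client the lines would come from a network stream rather than an in-memory string.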
Quick benchmark:
| Format | Size (KB) | Encode Time | Decode Time |
| ------------ | --------- | ----------- | ----------- |
| JSON | 3849.34 | 41.81 ms | 51.86 ms |
| JSON + Gzip | 379.67 | 55.66 ms | 39.61 ms |
| MessagePack | 2858.83 | 51.66 ms | 74.53 ms |
| SJT (json) | 2433.38 | 36.76 ms | 42.13 ms |
| SJT + Gzip | 359.00 | 69.59 ms | 46.82 ms |
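The size column is easier to compare as a saving relative to plain JSON. The short script below only does arithmetic on the numbers in the table; the function name `savingVsJson` is an illustrative choice.

```javascript
// Sizes in KB, copied from the benchmark table above.
const sizesKB = {
  "JSON": 3849.34,
  "JSON + Gzip": 379.67,
  "MessagePack": 2858.83,
  "SJT (json)": 2433.38,
  "SJT + Gzip": 359.0,
};

// Percentage saved relative to plain JSON, rounded to one decimal place.
function savingVsJson(format) {
  const ratio = sizesKB[format] / sizesKB["JSON"];
  return Math.round((1 - ratio) * 1000) / 10;
}

for (const format of Object.keys(sizesKB)) {
  console.log(`${format}: ${savingVsJson(format)}% smaller than JSON`);
}
```

On these figures SJT is about 36.8% smaller than JSON before compression, while the two gzipped variants end up close to each other (379.67 KB vs 359.00 KB).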
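Test conditions:
Dataset: Synthetic tabular dataset containing 50,000 records with mixed primitive fields, nested arrays, and nested objects (representative of large REST API payloads).
Runtime: Node.js 20 (V8 engine).
Implementation: JavaScript (via sjt.js).
Size (KB): Size in kilobytes, uncompressed except for the Gzip rows (estimated for binary formats).
Encode / Decode (ms): Average time in milliseconds to serialize/deserialize the entire dataset.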
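For anyone who wants to measure their own payloads the same way, a rough harness for the plain JSON baseline might look like this. It is a sketch under stated assumptions, not the script behind the numbers above. The dataset shape, the record count (1,000 rather than 50,000), the run count, and 1 KB = 1024 bytes are all assumptions, and it deliberately does not call sjt.js.

```javascript
// Rough timing harness for the JSON baseline on Node.js.
// Dataset shape, record count, and run count are assumptions.

// Build a small synthetic dataset with primitives, a nested array, and a nested object.
function makeDataset(count) {
  return Array.from({ length: count }, (_, i) => ({
    id: i,
    name: `user-${i}`,
    active: i % 2 === 0,
    tags: ["a", "b"],
    profile: { age: 20 + (i % 50), city: "Paris" },
  }));
}

// Average wall-clock time of fn over `runs` executions, in milliseconds.
function averageMs(fn, runs) {
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn();
  return (performance.now() - start) / runs;
}

const dataset = makeDataset(1000);
const encoded = JSON.stringify(dataset);

// Assumes 1 KB = 1024 bytes; the post does not say which convention it uses.
const sizeKB = Buffer.byteLength(encoded, "utf8") / 1024;
const encodeMs = averageMs(() => JSON.stringify(dataset), 20);
const decodeMs = averageMs(() => JSON.parse(encoded), 20);

console.log(
  `JSON: ${sizeKB.toFixed(2)} KB, encode ${encodeMs.toFixed(2)} ms, decode ${decodeMs.toFixed(2)} ms`
);
```

Timings from a loop like this vary with JIT warm-up and machine load, so they are only useful for comparing formats run side by side in the same process.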
Spec: [https://github.com/SJTF/SJT](https://github.com/SJTF/SJT)
JS implementation: [https://github.com/yukiakai212/SJT.js](https://github.com/yukiakai212/SJT.js)
Curious to hear feedback from people who have worked with JSON-heavy APIs, streaming, or compact data formats (CSV, Parquet, etc.).