Show HN: Stream Kafka to Ducklake

1 point by dm03514 7 months ago
Ducklake is a new lake storage format by MotherDuck. It aims to solve some of the issues with Iceberg by centralizing metadata in a SQL database such as Postgres, instead of storing it directly on blob storage.

SQLFlow is a stream processing engine that ingests data from Kafka, runs SQL against that stream, and sinks the output.

SQLFlow makes a DuckDB context available during stream processing. This makes it trivial to stream data from Kafka to Ducklake!

<a href="https://sql-flow.com/docs/tutorials/ducklake-sink/" rel="nofollow">https://sql-flow.com/docs/tutorials/ducklake-sink/</a>

<a href="https://github.com/turbolytics/sql-flow">https://github.com/turbolytics/sql-flow</a>
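For readers who want a feel for the ingest, query, sink loop described above, here is a rough sketch using only the Python standard library. It is not SQLFlow's API or configuration: a plain list of JSON strings stands in for the Kafka topic, an in-memory SQLite database stands in for the DuckDB context, and the `batch` and `sink` tables, the `batches` and `run_pipeline` helpers, and the sample fields are all made up for illustration. In a real setup the sink would be a Ducklake table, which the linked tutorial covers.

```python
import json
import sqlite3
from typing import Iterable, Iterator, List, Tuple


def batches(messages: Iterable[str], size: int) -> Iterator[List[str]]:
    """Group raw messages into lists of at most `size` items."""
    batch: List[str] = []
    for msg in messages:
        batch.append(msg)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, partially filled batch
        yield batch


def run_pipeline(messages: Iterable[str], batch_size: int = 2) -> List[Tuple[str, float]]:
    """Load each batch into a table, run SQL over it, append the result to a sink.

    SQLite is a stand-in for the DuckDB context; the `sink` table is a
    stand-in for a Ducklake table. Both names are hypothetical.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE batch (city TEXT, temp REAL)")
    con.execute("CREATE TABLE sink (city TEXT, max_temp REAL)")

    for batch in batches(messages, batch_size):
        rows = [(rec["city"], rec["temp"]) for rec in map(json.loads, batch)]
        con.execute("DELETE FROM batch")  # each batch is queried in isolation
        con.executemany("INSERT INTO batch VALUES (?, ?)", rows)
        # The SQL that runs against the stream: one aggregate row per city per batch.
        con.execute(
            "INSERT INTO sink SELECT city, MAX(temp) FROM batch GROUP BY city"
        )

    result = con.execute(
        "SELECT city, max_temp FROM sink ORDER BY city, max_temp"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    stream = [
        '{"city": "Oslo", "temp": 3.5}',
        '{"city": "Oslo", "temp": 5.0}',
        '{"city": "Lima", "temp": 19.0}',
    ]
    print(run_pipeline(stream, batch_size=2))
```

The point of the sketch is the shape of the loop: the SQL only ever sees one batch at a time, and its output is appended to the sink, so swapping the stand-ins for a Kafka consumer and a Ducklake table should not change the structure.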