Show HN: VAERS DuckDB Database

2 points by yehosef 13 days ago
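I've been playing with the VAERS database for a while, and I just finished importing it into DuckDB for local analytics. The dataset has various problems, which I try to fix to improve its quality.

The import scripts are here: [https://github.com/yehosef/vaers-duckdb](https://github.com/yehosef/vaers-duckdb)

A ready-made data file is available here (~3GB): [https://drive.google.com/file/d/1d3wRRr2UFvCYR9r7J5XBym2dYKTsMKtF/view?usp=sharing](https://drive.google.com/file/d/1d3wRRr2UFvCYR9r7J5XBym2dYKTsMKtF/view?usp=sharing)

I'm planning to add some dashboards like the ones in my original Elasticsearch project. I would also like to add some embeddings for vector search. It's a work in progress.

I'd love to hear what you think!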