Show HN: VAERS DuckDB database
I've been playing with the VAERS database for a while, and I just finished importing it into DuckDB for local analytics. The dataset has various problems, which I'm working on fixing to improve its quality.
The import scripts are here: [https://github.com/yehosef/vaers-duckdb](https://github.com/yehosef/vaers-duckdb)
A ready-made data file is available here (~3GB): [https://drive.google.com/file/d/1d3wRRr2UFvCYR9r7J5XBym2dYKTsMKtF/view?usp=sharing](https://drive.google.com/file/d/1d3wRRr2UFvCYR9r7J5XBym2dYKTsMKtF/view?usp=sharing)
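I'm planning to add some dashboards, similar to my original Elasticsearch project. I'd also like to add embeddings for vector search. This is a work in progress.
I'd love to hear what you think!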