Show HN: I built a personal AI news curator to filter RSS feeds (n8n + OpenAI)
Hi HN,
I found myself wasting too much time doomscrolling through tech news and RSS feeds, scanning hundreds of headlines just to find the 3-4 items that actually mattered to my work.
To fix this, I built a self-hosted automation workflow using n8n that acts as a personal editor.
The Architecture:
Ingest: Pulls RSS feeds (TechCrunch, Hacker News, etc.) every morning.
Filter (The Agent): Passes headlines to GPT-4o-mini with a system prompt to "act as a senior editor." It scores each article 0-10 based on specific interests (e.g., "High interest in Local LLMs," "Low interest in crypto gossip").
Logic: Discards anything with a score below 7.
Research: Uses the Tavily API to fetch and summarize the full content of the high-scoring articles.
Delivery: Sends a single, clean email digest via SMTP.
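The filter-and-threshold steps above can be sketched as a small script. The 0-10 scoring and the below-7 cutoff are from the post; `scoreHeadline` here is an illustrative stand-in for the actual GPT-4o-mini call, not the real workflow code:

```javascript
// Sketch of the filter → threshold → digest pipeline.
// In the real workflow, GPT-4o-mini returns a 0-10 relevance
// score per headline; here a keyword check stands in for it.

const THRESHOLD = 7; // discard anything scored below 7, as in the post

// Stand-in for the GPT-4o-mini scoring call (illustrative only).
function scoreHeadline(headline) {
  return /llm|local model/i.test(headline) ? 9 : 2;
}

// Score every headline, keep the high scorers, and render the
// kept items as lines of a plain-text email digest.
function buildDigest(headlines) {
  return headlines
    .map((title) => ({ title, score: scoreHeadline(title) }))
    .filter((item) => item.score >= THRESHOLD)
    .map((item) => `- ${item.title} (score ${item.score})`)
    .join('\n');
}
```

The same shape works inside an n8n Function node: map, filter, and hand the surviving items to the email node.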
The Hardest Part (SSE & Timeouts): The biggest technical hurdle was handling timeouts. Since the AI research step takes time, the HTTP requests would often drop. I had to configure Server-Sent Events (SSE) and adjust the execution-timeout environment variables in Node.js to keep the connection alive during the deep-dive research phase.
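For context, the SSE part boils down to streaming newline-delimited frames and sending periodic comment frames as a heartbeat so proxies don't drop the idle connection. A minimal sketch of the frame format (helper names are illustrative, not from the actual workflow):

```javascript
// Build one SSE frame: an optional "event:" line, a "data:" line,
// and a blank line terminating the frame.
function formatSseEvent(data, event) {
  const lines = [];
  if (event) lines.push(`event: ${event}`);
  lines.push(`data: ${JSON.stringify(data)}`);
  return lines.join('\n') + '\n\n';
}

// Comment frames (lines starting with ":") are ignored by clients
// but keep the connection alive during the long research phase.
function heartbeatFrame() {
  return ': keep-alive\n\n';
}
```

A real server would write `heartbeatFrame()` to the response on an interval while the research step runs, then send the result as a normal event.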
Resources:
Workflow/Source (JSON): [https://github.com/sojojp-hue/NewsSummarizer/tree/main](https://github.com/sojojp-hue/NewsSummarizer/tree/main)
Video Walkthrough & Demo: [https://youtu.be/mOnbK6DuFhc](https://youtu.be/mOnbK6DuFhc)
I'd love to hear how others are handling information overload, or whether there are better ways to handle long polling for AI agents.