Show HN: I built a personal AI news curator to filter RSS feeds (n8n and OpenAI)

1 point by practicalaifg, about 1 month ago
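Hi HN,

I found myself wasting too much time doomscrolling through tech news and RSS feeds, scanning hundreds of headlines just to find the 3-4 items that actually mattered to my work.

To fix this, I built a self-hosted automation workflow using n8n that acts as a personal editor.

The Architecture:

Ingest: Pulls RSS feeds (TechCrunch, Hacker News, etc.) every morning.

Filter (The Agent): Passes headlines to GPT-4o-mini with a system prompt to "act as a senior editor." It scores each article 0-10 based on specific interests (e.g., "High interest in Local LLMs," "Low interest in crypto gossip").

Logic: Discards anything with a score < 7.

Research: Uses the Tavily API to scrape and summarize the full content of the high-scoring articles.

Delivery: Sends a single, clean email digest via SMTP.

The Hardest Part (SSE & Timeouts): The biggest technical hurdle was handling timeouts. Since the AI research step takes time, the HTTP requests would often drop. I had to configure Server-Sent Events (SSE) and adjust the execution timeout env variables in Node.js to keep the connection alive during the deep-dive research phase.

Resources:

Workflow/Source (JSON): https://github.com/sojojp-hue/NewsSummarizer/tree/main

Video Walkthrough & Demo: https://youtu.be/mOnbK6DuFhc

I'd love to hear how others are handling information overload or if there are better ways to handle the long-polling for the AI agents.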
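For readers who want to see the Filter and Logic steps outside of n8n, here is a minimal Python sketch of the score-and-discard rule. Only the 0-10 scale and the "discard below 7" cutoff come from the post; `ScoredArticle`, `parse_score`, and `keep_high_scoring` are hypothetical names, and treating an unparseable model reply as a score of 0 is an assumption, not something the workflow is known to do.

```python
from dataclasses import dataclass
from typing import List

# From the post: anything with a score below 7 is discarded.
SCORE_THRESHOLD = 7


@dataclass
class ScoredArticle:
    """Hypothetical container for one headline and its model-assigned score."""
    title: str
    url: str
    score: int  # 0-10


def parse_score(raw: str) -> int:
    """Turn the model's text reply into an integer clamped to 0-10.

    Assumption: a reply that is not a plain integer counts as 0,
    so the article is dropped rather than crashing the run.
    """
    try:
        value = int(raw.strip())
    except ValueError:
        return 0
    return max(0, min(10, value))


def keep_high_scoring(
    articles: List[ScoredArticle], threshold: int = SCORE_THRESHOLD
) -> List[ScoredArticle]:
    """Keep only articles whose score is at or above the threshold."""
    return [a for a in articles if a.score >= threshold]
```

Clamping and defaulting to 0 keeps a malformed reply from passing the filter by accident; whether that is the right failure mode depends on how much you trust the model's output format.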
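The post does not show how SSE was configured, so the following is only a generic Python sketch of why SSE helps with slow steps. It formats frames in the standard `text/event-stream` wire format, where a line starting with a colon is a comment that clients ignore; such comments are commonly sent as heartbeats so that idle connections are not closed. All function names and the event names `progress` and `complete` are hypothetical. A real server would send heartbeats on a timer while a step is running; here one is emitted before each step to keep the sketch synchronous.

```python
import json
from typing import Callable, Iterable, Iterator, Optional, Tuple


def format_sse_event(data: dict, event: Optional[str] = None) -> str:
    """Format one SSE frame: optional 'event:' line, one 'data:' line, blank line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps emits no raw newlines by default, so a single data: line is enough.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def format_sse_heartbeat() -> str:
    """An SSE comment frame; clients ignore it, but it keeps bytes flowing."""
    return ": keep-alive\n\n"


def stream_with_heartbeats(
    steps: Iterable[Tuple[str, Callable[[], str]]]
) -> Iterator[str]:
    """Yield a heartbeat before each slow step, then that step's result."""
    for name, run in steps:
        yield format_sse_heartbeat()
        yield format_sse_event({"step": name, "result": run()}, event="progress")
    yield format_sse_event({"status": "done"}, event="complete")
```

The point of the pattern is that the client sees traffic throughout the research phase instead of one long silent request, which is what tends to trip request timeouts.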