Ask HN: How do you build a service for FOSS projects when ToS forbid scraping?

I'm working on a service for FOSS developers to help enforce code license compliance and make projects more sustainable.

The challenge: many websites' Terms of Service explicitly prohibit scraping, crawling, or automation. At the same time, the information needed (repos, dependencies, metadata) is often available only through those sites.

For those who've built tools around open source ecosystems:

* How do you navigate ToS restrictions while still delivering value to users?
* Do you focus on official APIs only, even if they're limited?
* Are there established legal/technical best practices for this situation?
* How do you balance ToS compliance with the mission of supporting FOSS?

Curious to hear what others have done (or seen work) in this space.
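As context for the questions above: one common technical baseline for tools that fetch site data is to honor robots.txt and crawl-delay directives before any automated request. A minimal sketch using Python's standard library (note the caveat: robots.txt is separate from a site's ToS, which may still forbid scraping even where robots.txt allows it; the user-agent name and paths here are made up for illustration):

```python
# Sketch: check robots.txt rules before fetching a path.
# Caveat: passing a robots.txt check does NOT mean the site's ToS
# permits automation -- that is a separate legal question.
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if the given robots.txt body permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical robots.txt body for illustration.
ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""

print(allowed(ROBOTS, "foss-compliance-bot", "/repos/example"))  # True
print(allowed(ROBOTS, "foss-compliance-bot", "/private/data"))   # False
```

For sites with official APIs (e.g. code forges), the usual practice is to prefer those endpoints and respect their documented rate limits rather than scraping HTML at all.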