Ask HN: How to stop an AWS bot sending 2B requests/month?

4 points by lgats 28 days ago
I have been struggling with a bot, 'Mozilla/5.0 (compatible; crawler)', coming from AWS Singapore and sending an absurd number of requests to a domain of mine, averaging over 700 requests/second for several months now. Thankfully, CloudFlare is able to handle the traffic with a simple WAF rule and a 444 response to reduce the outbound traffic.

I've submitted several complaints to AWS to get this traffic to stop. Their typical follow-up is: "We have engaged with our customer, and based on this engagement have determined that the reported activity does not require further action from AWS at this time."

I've tried various 4XX responses to see if the bot will back off, and I've tried 30X redirects (which it follows), to no avail.

The traffic is hitting numbers that require me to renegotiate my contract with CloudFlare, and it is otherwise a nuisance when reviewing analytics and logs.

I've considered redirecting the entirety of the traffic to the AWS abuse report page, but at this scale it's essentially a small DDoS network, and sending it anywhere could be considered abuse in itself.

Are there others who have had a similar experience?