Ask HN: How do I stop an AWS bot sending 2 billion requests per month?
I have been struggling with a bot, 'Mozilla/5.0 (compatible; crawler)', coming from AWS Singapore, that has been sending an absurd number of requests to a domain of mine, averaging over 700 requests/second for several months now.
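For scale, here is a rough back-of-the-envelope check of how that rate relates to the monthly figure in the title. It is only a sketch, assuming a constant rate and a 30-day month; the function name `requests_per_month` is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def requests_per_month(requests_per_second: float, days: int = 30) -> float:
    """Total requests over `days` days at a constant per-second rate."""
    return requests_per_second * SECONDS_PER_DAY * days


monthly = requests_per_month(700)
print(f"{monthly:,.0f} requests per 30 days")  # prints "1,814,400,000 requests per 30 days"
```

At exactly 700 requests/second this comes to about 1.8 billion requests per 30 days; a sustained average of around 770 requests/second reaches 2 billion.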
Thankfully, CloudFlare is able to handle the traffic with a simple WAF rule and a 444 response to reduce the outbound traffic.
I've submitted several complaints to AWS to get this traffic to stop. Their typical follow-up is:
"We have engaged with our customer, and based on this engagement have determined that the reported activity does not require further action from AWS at this time."
I've tried various 4XX responses to see if the bot will back off, and I've tried 30X redirects (which it follows), to no avail.
The traffic is hitting numbers that require me to renegotiate my contract with CloudFlare, and it is otherwise a nuisance when reviewing analytics/logs.
I've considered redirecting the entirety of the traffic to the AWS abuse report page, but at this scale it's essentially a small DDoS network, and sending it anywhere could be considered abuse in itself.
Are there others who have had a similar experience?