I'm getting lots of connections every day from Singapore. It's now the top country by traffic... despite the whole website being French-only. AI crawlers, for sure.
Amazonbot does this despite my efforts in robots.txt to help it out. I look at all the Singapore requests and they’re Amazonbot trying to get various variants of the Special:RecentChanges page. You’re wasting your time, Amazonbot. I’m trying to help you.
That makes sense. I wonder why Amazonbot specifically was picked as a target to spoof.
I hoped a robots.txt would keep them from getting stuck, but they refuse to obey it and keep hitting that page with various params. No problem for me, but they are going nowhere.
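For anyone who wants to check that their own rules say what they think they say, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `/wiki/` path prefix and the `example.org` host are assumptions for illustration; the actual file from this thread isn't shown. It only tells you what a well-behaved crawler would do, which is clearly not what is happening here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the real robots.txt from this thread is not shown,
# and the /wiki/ path prefix is an assumption about the URL layout.
ROBOTS_TXT = """\
User-agent: Amazonbot
Disallow: /wiki/Special:RecentChanges

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a crawler that honours robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Rules match by path prefix, so query-string variants are covered too.
print(is_allowed("Amazonbot", "https://example.org/wiki/Special:RecentChanges?days=30"))
print(is_allowed("Amazonbot", "https://example.org/wiki/Main_Page"))
```

If the first call reports the page as disallowed and the bot still requests it, the rules are fine and the crawler (or whatever is spoofing it) is simply ignoring them.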
Fun fact: you don't get rid of them even when you put a captcha on all visitors from Singapore. I still see a spike in traffic that perfectly matches the spike in served captchas, but this time it's geographically distributed between places like Iraq, Bangladesh and Brazil.
Hopefully it at least costs them a little bit more.
Usually, there are multiple layers of counter-protection measures. If you block by country, they shift to different IP ranges; if you block by IP, they might use a new IP for every request, and they escalate further depending on the bot owner and your actions.
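Since per-IP counting stops working once every request comes from a new address, one rough way to still see the pattern is to count requests per network prefix instead. A minimal sketch with the standard `ipaddress` module follows; the function name `requests_per_network`, the /24 grouping and the sample addresses (RFC 5737 documentation ranges) are assumptions, and it handles IPv4 only.

```python
import ipaddress
from collections import Counter


def requests_per_network(client_ips, prefix_len=24):
    """Count requests per IPv4 network of the given prefix length."""
    counts = Counter()
    for ip in client_ips:
        # strict=False masks off the host bits, e.g. 203.0.113.5 -> 203.0.113.0/24
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


# Placeholder addresses, not real crawler IPs.
sample = ["203.0.113.5", "203.0.113.77", "198.51.100.9"]
for network, hits in requests_per_network(sample).most_common():
    print(network, hits)
```

A crawler that rotates addresses inside one provider's range would show up as a single busy network here, although it won't help against traffic spread across unrelated ranges.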
Yeah same for my Gitea instance. These were all ByteDance and Tencent ASNs from some AWS-equivalent. Blocked the whole subnet belonging to them in my server's ufw and haven't had any problems since then. Same for Vultr and Google Cloud.
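In case it helps, a small sketch of how the subnet blocking could be scripted: it validates a list of CIDR ranges and prints the matching `ufw deny from ...` commands rather than running them. The ranges below are placeholder documentation networks (RFC 5737), not the actual ByteDance or Tencent ranges, and `ufw_deny_commands` is a made-up helper name.

```python
import ipaddress


def ufw_deny_commands(cidrs):
    """Return one `ufw deny from <network>` command per CIDR range."""
    commands = []
    for cidr in cidrs:
        # Raises ValueError on malformed input; strict=False normalises
        # a host address like 198.51.100.17/24 to its network address.
        network = ipaddress.ip_network(cidr, strict=False)
        commands.append(f"ufw deny from {network}")
    return commands


# Placeholder ranges; substitute the subnets you actually see in your logs.
BLOCKLIST = ["203.0.113.0/24", "198.51.100.17/24"]
for command in ufw_deny_commands(BLOCKLIST):
    print(command)
```

The output is meant to be reviewed before running it as root. ufw evaluates rules in order, so a deny rule may need to go ahead of existing allow rules (for example with `ufw insert 1 deny from ...`) to take effect.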
Thanks for this tip.