One annoyance that hasn’t changed much since the early days of the internet is the proliferation of bad bots. Think of spam but for internet traffic. By hogging server threads, they hurt website loading times for your actual users. Other bots are constantly probing your apps for vulnerabilities so that they can eventually use your servers to run their code.
We had to fight a bot today. Normally, I would keep these notes in our private docs, but seeing how next month is cybersecurity month, I decided it would be worth making the notes public.
While checking some of our public-facing servers, we noticed that one server was unresponsive. A closer inspection of the server logs indicated that the server threads were overloaded. The solution, worked out over the next several hours, was to update our fail2ban configuration. We also found other server hardening techniques that could be useful later.
Inspect proxy logs
The location of your proxy logs depends on your proxy and operating system. For nginx on Fedora, the server logs are in /var/log/nginx. In our case, I saw thousands of entries that looked something like this:
4.62.202.81 - - [24/Sep/2022:16:24:46 +0000] "GET /u/abc HTTP/1.1" 499 0 "-" "Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)" "-"
The above bot identified itself as Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com) and was clogging our server. There were other bots as well.
101.43.68.37 - - [24/Sep/2022:23:17:39 +0000] "GET http://1.2.3.4:80/mysql/scripts/setup.php HTTP/1.0" 301 185 "-" "-" "-"
The above bot remained anonymous and was probing for vulnerabilities, but infrequently, hoping to remain invisible in the sea of requests.
185.162.235.116 - - [24/Sep/2022:20:04:18 +0000] "\x16\x03\x01\x02\x00\x01\x00\x01\xFC\x03\x03'\x91r\xEA\x95/\xAD\xC2t\x94\x04\xF5\xDC\x8E\x9ES0_o\x0E\xC8\x07\xB3/\x14=\x91\x99Z\xF5\xE4\xFC \x1D\xC9\x19\xD6{\xC5 [\x0C\xD9S{\x83\xEF6Y\x1C?\x1B\x0B\xD0\xCA\xA9\xBA\x0B:\xBF{f\xD2\xA4\xC1\x00\x96\x13\x02\x13\x03\x13\x01\xC0,\xC00\xC0+\xC0/\xCC\xA9\xCC\xA8\x00\xA3\x00\x9F\x00\xA2\x00\x9E\xCC\xAA\xC0\xAF\xC0\xAD\xC0$\xC0(\xC0" 400 173 "-" "-" "-"
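The above bot was also anonymous and was doing something weird: the request starts with \x16\x03\x01, which looks like the start of a TLS handshake sent to a plain HTTP port.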
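When there are thousands of entries, it helps to tally them instead of scrolling. The following is a minimal sketch, not what we actually ran: it assumes the nginx `main` log format shown in the samples above (with the trailing X-Forwarded-For field), and the names `LOG_LINE` and `top_clients` are made up for illustration.

```python
import re
from collections import Counter

# Regex for the nginx "main" log format (assumption: the format used in the samples above).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>.*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" "(?P<xff>[^"]*)"$'
)

def top_clients(lines, n=5):
    """Count requests per (IP, User-Agent) pair and return the n busiest."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match:  # lines in another format are silently skipped
            counts[(match["ip"], match["agent"])] += 1
    return counts.most_common(n)

# Two of the sample entries from this post.
SPAM_LINE = ('4.62.202.81 - - [24/Sep/2022:16:24:46 +0000] "GET /u/abc HTTP/1.1" 499 0 '
             '"-" "Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)" "-"')
PROBE_LINE = ('101.43.68.37 - - [24/Sep/2022:23:17:39 +0000] '
              '"GET http://1.2.3.4:80/mysql/scripts/setup.php HTTP/1.0" 301 185 "-" "-" "-"')

for (ip, agent), hits in top_clients([SPAM_LINE] * 3 + [PROBE_LINE]):
    print(hits, ip, agent)
```

To run it against a real log, pass an open file handle (for example for /var/log/nginx/access.log) as `lines`. Grouping by IP and User-Agent together makes a single noisy client stand out even when it shares an address with legitimate traffic.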
I searched for possible fixes and decided to update our fail2ban filters.
Update fail2ban filters
There are many tutorials that show you how to set up fail2ban on your server (here is an example). Our fail2ban filters were in /etc/fail2ban/filter.d.
I first tried adding our bad bot User-Agent string to the regular expression in the filter.
# /etc/fail2ban/filter.d/apache-badbots.conf
badbotscustom = Mozilla\/5\.0 \(compatible; \+centuryb\.o\.t9\[at\]gmail\.com\)|EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider|(?:Mozilla/\d+\.\d+ )?Jorgee
$ systemctl restart fail2ban
This did not have any effect and the bot was still happily spamming our server.
# Check jail
$ fail2ban-client status nginx-badbots
# Test filter
$ fail2ban-regex /var/log/nginx/access.log /etc/fail2ban/filter.d/apache-badbots.conf
Testing our updated filter against the past logs showed that the filter was not matching the offending strings. After some poking around, I figured out that the default regular expression was not matching our nginx log configuration (see pull request). Specifically, our nginx log format ended with an extra field ($http_x_forwarded_for) that the fail2ban filter was not expecting:
# nginx.conf
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';
# /etc/fail2ban/filter.d/apache-badbots.conf
# OLD
failregex = ^<HOST> -.*"(GET|POST|HEAD).*HTTP.*"(?:%(badbots)s|%(badbotscustom)s)"$
# NEW
failregex = ^<HOST> -.*"(GET|POST|HEAD).*HTTP.*"(?:%(badbots)s|%(badbotscustom)s)"
Testing the filter again showed that the regular expression was matching.
# Test filter
$ fail2ban-regex /var/log/nginx/access.log /etc/fail2ban/filter.d/apache-badbots.conf
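Since fail2ban filters are Python regular expressions, the effect of the trailing `$` can be reproduced in a few lines. This is only a sketch under simplifying assumptions: `HOST` is a crude stand-in for fail2ban's `<HOST>` tag, and only our custom User-Agent pattern is used in place of the full `badbots` list.

```python
import re

# Our custom User-Agent pattern; the stock badbots list is left out for brevity.
BAD_BOT = r"Mozilla\/5\.0 \(compatible; \+centuryb\.o\.t9\[at\]gmail\.com\)"
# Stand-in for fail2ban's <HOST> tag (assumption: any token without spaces).
HOST = r"(?P<host>\S+)"

PREFIX = r'^' + HOST + r' -.*"(GET|POST|HEAD).*HTTP.*"(?:' + BAD_BOT + r')"'
OLD = re.compile(PREFIX + r'$')  # User-Agent must be the last field on the line
NEW = re.compile(PREFIX)         # anything may follow the User-Agent

# Sample entry in our nginx format, ending with the X-Forwarded-For field.
LINE = ('4.62.202.81 - - [24/Sep/2022:16:24:46 +0000] "GET /u/abc HTTP/1.1" 499 0 '
        '"-" "Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)" "-"')
# The same entry as it would look without the extra trailing field.
LINE_WITHOUT_XFF = LINE[:-len(' "-"')]

print("old regex, our format:     ", bool(OLD.search(LINE)))
print("old regex, no extra field: ", bool(OLD.search(LINE_WITHOUT_XFF)))
print("new regex, our format:     ", bool(NEW.search(LINE)))
```

The old expression only matches when the User-Agent is the final quoted field, so the extra `"-"` at the end of each of our lines was enough to make every entry slip through.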
However, checking the proxy logs again, our bot was still spamming the server.
Update fail2ban jails
After some more experimentation, I found that the badbots jail had never been properly configured. Specifically, the fail2ban docs state that backend = systemd will not work for jails that need to read log files outside of the systemd journal. Setting backend = pyinotify fixed the issue.
# /etc/fail2ban/jail.local
[nginx-badbots]
enabled = true
filter = apache-badbots
logpath = %(nginx_access_log)s
backend = pyinotify
$ fail2ban-client status nginx-badbots
Status for the jail: nginx-badbots
|- Filter
|  |- Currently failed: 0
|  |- Total failed: 1
|  `- File list: /var/log/nginx/access.log
`- Actions
   |- Currently banned: 1
   |- Total banned: 1
   `- Banned IP list: 64.62.202.81
Yay!
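To catch this kind of misconfiguration earlier next time, a small script could flag enabled jails that declare a logpath but would still use the systemd backend. The sketch below is hypothetical and makes several assumptions: it treats the jail file as plain INI, only looks at a single file (fail2ban merges several), and the function name `jails_reading_files_via_systemd` and the sample `[DEFAULT]` section are invented for illustration.

```python
import configparser

def jails_reading_files_via_systemd(config_text):
    """Return enabled jails that set a logpath but resolve to backend = systemd."""
    # interpolation=None keeps fail2ban's %(name)s references as literal text.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    suspects = []
    for name in parser.sections():  # [DEFAULT] is not listed, but its values are inherited
        jail = parser[name]
        enabled = jail.get("enabled", "false").strip().lower() == "true"
        backend = jail.get("backend", "auto").strip().lower()
        if enabled and "logpath" in jail and backend == "systemd":
            suspects.append(name)
    return suspects

# Hypothetical "before" config: the jail inherits backend = systemd from [DEFAULT].
BROKEN = """
[DEFAULT]
backend = systemd

[nginx-badbots]
enabled = true
filter = apache-badbots
logpath = %(nginx_access_log)s
"""
# The "after" config adds the per-jail override described above.
FIXED = BROKEN + "backend = pyinotify\n"

print("before:", jails_reading_files_via_systemd(BROKEN))
print("after: ", jails_reading_files_via_systemd(FIXED))
```

This is a convenience check only; `fail2ban-client status <jail>` remains the authoritative way to see which files a jail is actually watching.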
Consider ultimate blocker
During all this experimentation, I found an interesting project called the NGINX Ultimate Bad Bot Blocker. I will make a note to try this next time.
Maintain security hygiene
Lesson: Check your logs and locks periodically!