Bot Cybersecurity

One annoyance that hasn’t changed much since the early days of the internet is the proliferation of bad bots. Think of spam but for internet traffic. Some bots hog server threads and slow down page loads for your actual users. Others constantly probe your apps for vulnerabilities so that they can eventually use your servers to run their code.

We had to fight a bot today. Normally, I would keep these notes in our private docs, but seeing how next month is cybersecurity month, I decided it would be worth making the notes public.

While checking some of our public-facing servers, we noticed that one server was unresponsive. A closer inspection of the server logs indicated that the server threads were overloaded. The solution, which took us several hours to work out, was to update our fail2ban configuration. We also found other server hardening techniques that could be useful later.

Inspect proxy logs

The location of your proxy logs depends on your proxy and operating system. For nginx on Fedora, the server logs are in /var/log/nginx. In our case, I saw thousands of entries that looked something like this:

- - [24/Sep/2022:16:24:46 +0000] "GET /u/abc HTTP/1.1" 499 0 "-" "Mozilla/5.0 (compatible; +centuryb.o.t9[at]" "-"

The above bot identified itself as Mozilla/5.0 (compatible; +centuryb.o.t9[at] and was clogging our server. There were other bots as well.

- - [24/Sep/2022:23:17:39 +0000] "GET HTTP/1.0" 301 185 "-" "-" "-"

The above bot remained anonymous and was probing for vulnerabilities, but infrequently, hoping to remain invisible in the sea of requests.

- - [24/Sep/2022:20:04:18 +0000] "\x16\x03\x01\x02\x00\x01\x00\x01\xFC\x03\x03'\x91r\xEA\x95/\xAD\xC2t\x94\x04\xF5\xDC\x8E\x9ES0_o\x0E\xC8\x07\xB3/\x14=\x91\x99Z\xF5\xE4\xFC \x1D\xC9\x19\xD6{\xC5 [\x0C\xD9S{\x83\xEF6Y\x1C?\x1B\x0B\xD0\xCA\xA9\xBA\x0B:\xBF{f\xD2\xA4\xC1\x00\x96\x13\x02\x13\x03\x13\x01\xC0,\xC00\xC0+\xC0/\xCC\xA9\xCC\xA8\x00\xA3\x00\x9F\x00\xA2\x00\x9E\xCC\xAA\xC0\xAF\xC0\xAD\xC0$\xC0(\xC0" 400 173 "-" "-" "-"

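The above bot was also anonymous and was sending something that is not HTTP at all. The request starts with \x16\x03\x01, which looks like the beginning of a TLS handshake, most likely sent to a plain HTTP port. nginx rejected it with a 400.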
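To get an overview instead of scrolling through thousands of lines, the log can be tallied by user agent. The following is a minimal sketch, assuming the nginx "main" log format with a trailing X-Forwarded-For field. The IP addresses and the ExampleBot/1.0 agent in the sample are made up for illustration, and treating a request that begins with \x16\x03 as a TLS handshake is my own heuristic.

```python
import re
from collections import Counter

# Assumed layout: addr - user [time] "request" status bytes "referer" "agent" "xff"
LINE_RE = re.compile(
    r'^(?P<addr>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" "(?P<xff>[^"]*)"$'
)


def summarize(lines):
    """Return (user-agent counts, TLS-looking request count, unparsed line count)."""
    agents = Counter()
    tls_probes = 0
    unparsed = 0
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            unparsed += 1
            continue
        # nginx writes non-printable bytes as literal \xHH escapes in the log.
        if match.group("request").startswith(r"\x16\x03"):
            tls_probes += 1
        agents[match.group("agent")] += 1
    return agents, tls_probes, unparsed


# Hypothetical sample lines (documentation-range IPs, invented user agent).
SAMPLE = [
    r'203.0.113.7 - - [24/Sep/2022:16:24:46 +0000] "GET /u/abc HTTP/1.1" 499 0 "-" "ExampleBot/1.0" "-"',
    r'203.0.113.7 - - [24/Sep/2022:16:24:47 +0000] "GET /u/abd HTTP/1.1" 499 0 "-" "ExampleBot/1.0" "-"',
    r'198.51.100.9 - - [24/Sep/2022:20:04:18 +0000] "\x16\x03\x01\x02\x00" 400 173 "-" "-" "-"',
]

if __name__ == "__main__":
    agents, tls_probes, unparsed = summarize(SAMPLE)
    for agent, count in agents.most_common(5):
        print(count, agent)
    print("TLS-looking requests:", tls_probes, "unparsed lines:", unparsed)
```

To run it against a real log, pass an open file object to summarize(), for example open("/var/log/nginx/access.log").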

I searched for possible fixes and decided to update our fail2ban filters.

Update fail2ban filters

There are many tutorials that show you how to set up fail2ban on your server (here is an example). Our fail2ban filters were in /etc/fail2ban/filter.d.

I first tried adding our bad bot User-Agent string to the regular expression in the filter.

# /etc/fail2ban/filter.d/apache-badbots.conf

badbotscustom = Mozilla\/5\.0 \(compatible; \+centuryb\.o\.t9\[at\]gmail\.com\)|EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider|(?:Mozilla/\d+\.\d+ )?Jorgee

# Restart fail2ban to load the updated filter
$ systemctl restart fail2ban

This did not have any effect and the bot was still happily spamming our server.

# Check jail
$ fail2ban-client status nginx-badbots

# Test filter
$ fail2ban-regex /var/log/nginx/access.log /etc/fail2ban/filter.d/apache-badbots.conf

Testing our updated filter against the past logs showed that the filter was not matching the offending strings. After some poking around, I figured out that the default regular expression was not matching our nginx log configuration (see pull request). Specifically, our nginx log format appends an extra field ($http_x_forwarded_for) after the user agent, whereas the default failregex expects the user agent to be the last thing on the line:

# nginx.conf

log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';

# /etc/fail2ban/filter.d/apache-badbots.conf

# Before: anchored to the end of the line, so the extra field breaks the match
failregex = ^<HOST> -.*"(GET|POST|HEAD).*HTTP.*"(?:%(badbots)s|%(badbotscustom)s)"$

# After: end-of-line anchor removed
failregex = ^<HOST> -.*"(GET|POST|HEAD).*HTTP.*"(?:%(badbots)s|%(badbotscustom)s)"

Testing the filter again showed that the regular expression was matching.

# Test filter
$ fail2ban-regex /var/log/nginx/access.log /etc/fail2ban/filter.d/apache-badbots.conf
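The effect of the end-of-line anchor can be reproduced outside fail2ban with a few lines of Python. This is only a sketch: the HOST pattern is a simplified stand-in for fail2ban's <HOST> tag, ExampleBot/1.0 is a hypothetical user agent, and the log lines use a made-up IP address.

```python
import re

# Simplified stand-in for fail2ban's <HOST> tag (the real expansion is more involved).
HOST = r"(?P<host>\S+)"
# Hypothetical bad-bot list; ExampleBot/1.0 is invented for this example.
BAD_AGENTS = r"ExampleBot/1\.0|EmailCollector"

# Same shape as the two failregex variants: with and without the trailing $.
ANCHORED = re.compile(rf'^{HOST} -.*"(GET|POST|HEAD).*HTTP.*"(?:{BAD_AGENTS})"$')
UNANCHORED = re.compile(rf'^{HOST} -.*"(GET|POST|HEAD).*HTTP.*"(?:{BAD_AGENTS})"')

# A line where the user agent is the last field.
agent_last = (
    '203.0.113.7 - - [24/Sep/2022:16:24:46 +0000] '
    '"GET /u/abc HTTP/1.1" 499 0 "-" "ExampleBot/1.0"'
)
# The same line with an extra "$http_x_forwarded_for" field appended.
with_xff = agent_last + ' "-"'

if __name__ == "__main__":
    for name, line in (("agent last", agent_last), ("with xff", with_xff)):
        print(name, "anchored:", bool(ANCHORED.search(line)),
              "unanchored:", bool(UNANCHORED.search(line)))
```

The anchored pattern only matches when the user agent closes the line, which is why the unchanged filter found nothing in our access log.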

However, checking the proxy logs again, our bot was still spamming the server.

Update fail2ban jails

After some more experimentation, I found that the badbots jail had never been properly configured. Specifically, the fail2ban docs state that backend = systemd will not work for jails that need to read log files outside of the systemd journal, and nginx writes its access log to a plain file. Setting backend = pyinotify fixed the issue.

# /etc/fail2ban/jail.local

[nginx-badbots]
enabled = true
filter = apache-badbots
logpath = %(nginx_access_log)s
backend = pyinotify

$ fail2ban-client status nginx-badbots

Status for the jail: nginx-badbots
|- Filter
|  |- Currently failed:	0
|  |- Total failed:	1
|  `- File list:	/var/log/nginx/access.log
`- Actions
   |- Currently banned:	1
   |- Total banned:	1
   `- Banned IP list:
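Since this misconfiguration went unnoticed for a long time, a small check for it may be worth having. The sketch below is a heuristic of my own, not a fail2ban feature: it parses jail configuration text with Python's configparser and lists enabled jails that set a logpath while using backend = systemd. The sample configuration, including the [DEFAULT] section, is hypothetical.

```python
import configparser


def jails_reading_files_via_systemd(config_text):
    """Return names of enabled jails that set logpath but use backend = systemd."""
    # interpolation=None keeps fail2ban-style %(name)s references as raw text.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    flagged = []
    for name in parser.sections():  # [DEFAULT] is inherited, not listed here
        jail = parser[name]
        if not jail.getboolean("enabled", fallback=False):
            continue
        backend = jail.get("backend", fallback="auto").strip()
        if backend == "systemd" and jail.get("logpath"):
            flagged.append(name)
    return flagged


# Hypothetical configuration: the jail inherits backend = systemd from [DEFAULT].
BROKEN = """
[DEFAULT]
backend = systemd

[sshd]
enabled = true

[nginx-badbots]
enabled = true
filter = apache-badbots
logpath = %(nginx_access_log)s
"""

# Same jail with the file-based backend set explicitly.
FIXED = BROKEN + "backend = pyinotify\n"

if __name__ == "__main__":
    print("broken:", jails_reading_files_via_systemd(BROKEN))
    print("fixed:", jails_reading_files_via_systemd(FIXED))
```

A real fail2ban setup merges several files (jail.conf, jail.local, jail.d), so this only checks the text it is given.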


Consider ultimate blocker

During all this experimentation, I found an interesting project called the NGINX Ultimate Bad Bot Blocker. I will make a note to try this next time.

Maintain security hygiene

Lesson: Check your logs and locks periodically!