Collating Kevin

Learn from my mistakes. Build it better.

I Was DDoS'd This Week, How I Used ASN Blocking to Resolve The Problem

The Alerts

“Why do things always have to happen on a Friday?” was the question I asked myself at 5:04 when I read the Grafana alert delivered to my phone. I had just booted up my gaming PC and I was going to get started on an early weekend gaming session when my phone began buzzing and the notifications began piling up.

Bereft of hundreds of frames per second for the moment, I logged in to my dashboard to see what was wrong.

A graph showing that the number of requests being served was way above normal levels

Well that isn’t good.

This graph shows the number of in-flight requests currently being processed by NGINX, with the red bar representing the ‘emergency’ level where things tend to stop working. To put things in perspective, most of the services that I host see a few thousand requests a day; the number of requests served in the last hour was over 1,000,000.

Was it Hacker News? Did I make the front page of Reddit? Let’s take a look.

A graph showing the connecting IP addresses that I was serving at the time

All of these IPs seemed to belong to the same entity; they all fell within a /24 subnet. I headed over to ARIN, did the whois lookup, and found that they all belong to AS209366, owned by Semrush Holdings Inc. Disappointed that I wouldn’t be a social media superstar, I submitted my email to their abuse contact. I’m sure that they will get back to a small contributor on the internet like me immediately, with an apology and a fruit basket. Meanwhile, in the real world, I decided that I needed to take action.

What About the WAF and Rate Limits?

If you are familiar with my content, you might be wondering why the WAF or rate limiting didn’t stop this. It’s a question of numbers. As previously mentioned, I saw 1,000,000 requests in an hour, or approximately 16,667 requests a minute, but I never said that the site went down. NGINX was still happily serving traffic, and although about 30MB/s was flowing over the VPN and saturating the backend connections, everything was working as intended.

As far as the WAF was concerned, the traffic was legitimate: it was mostly spiders crawling the different sites and getting stuck in loops, searching the same pages over and over again.

Let’s Get to Fixing

There were a few different ways to approach this particular problem: I could filter the traffic at the firewall, I could set up rules in CrowdSec or ModSecurity, or I could limit the traffic and ride out the storm.

Firewall Filtering

Filtering at the firewall was my first instinct. After all, I could set up the rules to block the traffic and be done with it. However, this was not as dynamic as I wanted: the current solution has no way to easily modify rules, so I would need to script something to update the rule sets and keep them up to date.

CrowdSec or WAF Rules

I could certainly set up WAF rules or a CrowdSec decision to block the traffic, but again, this required feeding them a blocklist via a script, which is always prone to errors and omissions.

Limit the Traffic

I could also just limit the traffic coming in and see if the problem resolved itself. This didn’t seem like much of an option, though, as I was risking outages if things continued to trend upwards.
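For reference, this kind of limiting in NGINX usually means the limit_req module. The fragment below is a minimal sketch only: the zone name per_ip, the 10m zone size, the 10 requests per second rate, and the burst of 20 are illustrative placeholders rather than tuned values.

```nginx
# http {} context: a shared-memory zone keyed by client address.
# "per_ip", 10m and 10r/s are illustrative assumptions, not tuned values.
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

server {
    # (existing listen, server_name and other settings go here)

    location / {
        # Allow bursts of up to 20 requests above the rate without delay;
        # anything beyond that is rejected.
        limit_req zone=per_ip burst=20 nodelay;
        # Answer rejected requests with 429 instead of the default 503.
        limit_req_status 429;
    }
}
```

Note that a limit keyed on the client address treats every IP separately, so a crawler spread across many addresses can stay under the per-IP rate while still producing a large total.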

Enter GeoIP

One option that I couldn’t find a lot of information on was using GeoIP to block ASNs. I was sure it could be done: the information was already there, and my version of NGINX was built with GeoIP support. Apache users have it lucky, as they have this functionality out of the box with mod_asn, but I had to get a little creative.

I won’t go into too much detail about installing and maintaining the GeoIP2 lists; that’s a subject for another article. In short, the steps are as follows:

  1. Register for a MaxMind account to get access to the GeoLite2 databases.
  2. Install the geoipupdate tool.
  3. Download the latest GeoLite2 databases, which will be located in /usr/share/GeoIP.

After you have the databases the next steps are fairly simple.

First, add the following to your nginx.conf, or to an included file somewhere else.

# If NGINX is compiled with the GeoIP2 module, you won't need this part.
# If the module was built separately, include these lines at the top to load the modules.
load_module /etc/nginx/modules/ngx_http_geoip2_module.so;
load_module /etc/nginx/modules/ngx_stream_geoip2_module.so;
...
        # Inside the http {} block: this informs the GeoIP module to load the database and
        # construct variables based on the autonomous_system_number and
        # autonomous_system_organization fields.
        geoip2 /usr/share/GeoIP/GeoLite2-ASN.mmdb {
            $geoip2_data_asn autonomous_system_number;
            $geoip2_data_asn_name autonomous_system_organization;
        }
...

Now that we have access to the ASN data, we need to map it to a new variable so that NGINX always has access to something even if there is no match.

        # Make sure that there is always 'something' in the variable,
        # in this case a match of "" (nothing) will return 000000. 
        map $geoip2_data_asn $asn {
          default $geoip2_data_asn;
          ""      "000000";
        }
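With these variables defined, it can also help to write the ASN into the access log, so you can see at a glance which networks are sending traffic. The fragment below is a sketch that assumes the $asn and $geoip2_data_asn_name variables defined above; the format name asn_combined and the log path are hypothetical names chosen for illustration.

```nginx
# http {} context: the standard combined log fields plus the ASN and its organization.
# "asn_combined" and the log file path are hypothetical names.
log_format asn_combined '$remote_addr - $remote_user [$time_local] "$request" '
                        '$status $body_bytes_sent "$http_referer" "$http_user_agent" '
                        'asn=$asn asn_org="$geoip2_data_asn_name"';

access_log /var/log/nginx/asn_access.log asn_combined;
```

Requests that are later dropped with a 444 are still logged, so this log also shows how much traffic each blocked ASN is still sending.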

Next, create a new file, /etc/nginx/snippets/asn_block.conf. It is separated out to make the management of blocked ASNs cleaner and easier.

if ($asn ~ "^(000000|209366)$") {
    return 444;
}

I’m returning 444 here because that will cause NGINX to drop the connection immediately without further processing. This is a little bit faster than adding a deny all; since it forces the connection to drop without rendering the error page. You can do what you want here.

Here I’m doing a simple regex lookup to see if the ASN matches against the variable we created earlier. Incidentally, I’m also blocking the null or default ASN since most of the traffic we’re expecting to see should be legitimate and from a provider like ATT-INTERNET4 or CLOUDFLARENET. More can be added by simply appending them to the regex.
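As the list grows, a long regex becomes harder to read, and a pattern without ^ and $ anchors would also match any longer ASN that merely contains one of the blocked numbers. One alternative is an exact-match map lookup. This is a sketch rather than the configuration used above; it assumes the $asn variable defined earlier, and $blocked_asn is a hypothetical variable name.

```nginx
# http {} context: exact-match lookup, one blocked ASN per line.
# $blocked_asn is a hypothetical variable name used for illustration.
map $asn $blocked_asn {
    default 0;
    000000  1;   # no ASN found for the client address
    209366  1;   # the ASN blocked in this article
}

# /etc/nginx/snippets/asn_block.conf: drop the connection for flagged ASNs.
# An "if" treats an empty string or "0" as false, so only mapped ASNs match.
if ($blocked_asn) {
    return 444;
}
```

Adding another network is then a one-line change to the map, and each entry can carry its own comment explaining why it was blocked.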

Finally, I included the new snippets inside my server {} blocks with the below:

server {
    ...
    include snippets/asn_block.conf;
    ...
}

Next, run nginx -t && systemctl reload nginx, and it’s time to see the outcome!

Postmortem

It’s been almost 24 hours since I started blocking the offending ASN and things are back to normal. I suppose I should thank Semrush for the free load test; it’s not often that you get to test your configuration against real-world scenarios. Had this been a real and purposeful DDoS, the outcome might have been quite different. At least now I can sit back and enjoy the games I was supposed to be playing last night.

A graph of the number of inbound requests, which have returned to normal following the steps taken in this article

You can see that there was a substantial drop-off in traffic following the new configuration, with an expected increase thereafter. Luckily, traffic was reined in before the bulk of it hit my sites.

As always, if you are experiencing increased site load and need to manage traffic, please reach out to me and I’ll see what solution we can come up with.