DOS Attacks and DNS: How to Stay Up If Your DNS Provider goes DOWN

mushroom[ The original article was a bit "wordy" and appears in its entirety after the page break. ]

Nameservers and DNS services are one of the most popular attack vectors for Denial of Service Attacks. Since we wrote the original article, the types of attacks against DNS infrastructure have since bifurcated into two distinct types.

There are basically two reasons why your DNS servers may be attacked:

  1. To take you down. If not you, then to take down somebody else using the same nameservers you are. So for example, an attack may be directed against the nameservers for a specific vendor, like a DNS provider, a domain registrar, a web hosting provider, an ISP or anybody else hosting multiple domains on behalf of numerous clients. Often times if the attack is successful, the damage goes beyond the target, some or all of the other domains using the same vendor will be impacted
  2. Amplification and reflection attacks: On the surface similar to above, in that the attack will affect one or more specific vendors, but they differ at the point of the target, which is someplace else entirely. In these attacks the vendors nameserver infrastructure is being leveraged to attack somebody else. It can still affect your vendor and inflict widespread collateral damage, but perhaps more pernicious because neither you nor anybody else sharing the same vendor is the target.

In either case, if your DNS solution gets hit with either of these, you're the one who suffers the impact if they can't fully mitigate the attack.

There is a magic bullet to avoid this, and you can ensure you will always have functional DNS no matter what happens to any specific vendor.

That magic bullet is this:

Use Multiple DNS Solutions

As we never tire of saying: All DNS Solutions, regardless of how much redundancy, how much mitigation gear, how many POPs worldwide, whatever, they are still a SPOF unto themselves. Even us. Maybe something new comes along, like NTP reflection and suddenly 300 Gb/Sec attacks start happening, or maybe instead of a DDOS the core routers meltdown, or maybe some CXO pulls a Marlon Brando in Apocalypse Now and just totally quits the program, you just never know. A lot of companies have redundancy at every level – from power supplies right on up to datacenters, except for their nameservers and DNS. It's just a blind spot in so many cases until the Black Swan event occurs.

So if you absolutely, positively have to have 100% DNS availability all the time, you must:

  • Use multiple DNS providers or solutions
  • Have a coherent methodology for syndicating your zones across those solutions
  • Be able to track the health of each component of your "DNS mosaic" (I just made that up)
  • Have the ability to switch / enable or disable components as required

and, the one piece which all of the above hinges on, usually the one part you usually cannot do once "The Event" has started is to

  • Have it all setup in advance.

( We have a  system which actually automates all of this, called Proactive Nameservers, if you are interested in our whitepaper on that, enter your email address below )

Thank you, your sign-up request was successful! Please check your e-mail inbox.
Given email address is already subscribed, thank you!
Please provide a valid email address.
Please complete the CAPTCHA.
Oops. Something went wrong. Please try again later.

In general terms however, what follows are the methods you employ to make all this happen and thus guarantee that you have DNS availability 100% of the time, whether you're with easyDNS, somebody else or doing it all yourself, and even if you are managing DNS on behalf of other downstream users.

Hot Spares / Warm Spares / Mixed Delegations

Step one is to simply have at the very least, one or two out-of-band nameservers somewhere that are already configured to host your zones (warm spare), or even better, is always up-to-date (hot spare). You can do this as simply as adding one or more external nameserver IPs to your master nameserver configs, allowing them to mirror from your master and that they receive NOTIFY packets whenever you update your zone (also-notify).

In bind it's as easy as this:

zone "example.com" {
      type master;
      file "example.com.zone";
      masters {
           10.2.1.33;
      };

         allow-transfer { 192.168.179.248; 192.168.179.249; };
         also-notify { 192.168.179.248; 192.168.179.249; };
};

and most DNS providers allow you to add third party nameservers. In the easyDNS control panel you just add them under the External or Integrations tab in your domain administration module:easy_integrations

[ On easyDNS we also support more robust integrations, for example you can configure your zones to automatically export (or import) your DNS settings on Amazon's Route 53 DNS. You don't even need your own AWS account to use it if you go with our fully managed option.  Hooks into the Linode and Digital Ocean DNS API are also in beta.]

Use DNS Anycast

IP Anycast is when a single IP address actually equates to multiple POPs distributed across the world (geographic diversity) and (ideally) multiple transit providers (network diversity). On it's own, anycast will not mitigate a DDoS attack aside from helping diffuse the attack. If you have enough POPs than sometimes the hostile traffic gets spread thin enough that not all of them fall over.

Also, sometimes anycast can mean the difference between a global and localized outage. Say for example a lot of the hostile traffic is originating from one area (I dunno, maybe… China? Just as an example) and you have a POP near there (one that doesn't get null routed or it's BGP announcements dropped) during the attack, even if it does fall over, it will act as a "sinkhole" for the attack traffic – pulling it away from your other global POPs. It means a localized outage (users in that region perceive you as having gone dark) versus a global one. As we noted in the original article "some users may experience localized outages" is a better outcome than "everything is down hard".

Put your nameservers in Realtime Rootzones

Most serious Top Level Domains update their root zones in realtime. (Alas, being based in Canada, .CA isn't one of them, it only updates hourly). This is critical if you are going to add or modify your domain's nameserver delegation to bring in your hot spares or remove a nameserver that has failed.

Choose your nameservers accordingly: .com, .net, .org, .biz, .info all update in realtime so once you modify your delegation it's live across the internet in seconds.

You Can Round-Robin a Nameserver Record

It's true. We tried it in the past, it's nowhere close to as effective as DNS anycast, but in a DDoS situation, anything you can do to diffuse the attack can help.

Separate Your Nameservers From Everything Else

Don't have your nameservers inside the same net blocks as your other core infrastructure like database clusters, mail servers, et al. Because when you're upstream null routes your nameservers you want to be able to still have come communications and other functionality available to you (which it will be, once you activate your backup nameservers, right?) If everything is inside the same net block that has been null routed or had its advertisements dropped, you're totally gone.

Decide on Active / Active or Active / Passive

From High Availability parlance this means do you have your "Plan B" DNS option configured, ready to go, but not live in your active delegation? Or do you run your multiple DNS configurations all the time. Your own circumstances will be your guide.

Perhaps your main DNS is anycast, but your Plan B is unicast, you probably don't want to run active / active because your Plan B will dilute the optimization and performance gains that you are getting from your anycast deployment.

Conclusion

The basic gist is to take a similar approach to nameservers and DNS as you would any other High Availability component of your IT infrastructure:

  • Treat every DNS component, be it a vendor, data centre or cloud provider, as a discrete, logical unit
  • Stack up multiple "units", that are separate from each other
  • Keep them configured, ready and hopefully in-sync with your current zone data
  • Have the ability to activate them when the S hits the F, or else, just run them all, all the time.

The Original Article Follows

Comments

  1. ps says

    As usual, it was a great read.

    It's the reason I cite your blog as a source of examples on what ethical business thinking is all about.

  2. says

    Great post. As a LAMP sys admin I've always complained about the horrible web based interfaces of some DNS providers, like having to individually set all 86+ subdomain records to a smaller TTL before making any server migrations, and re-setting them all back to sane defaults.
    I have very seriously contemplated setting up Bind on some servers, but thought the web interfaces are still better than my previous workplace, where they required people to email in DNS requests… which would only be processed during standard business hours after being escalated from the Helpdesk.

    Now you've made me understand the difference between an OK DNS host and a proper DNS hosting provider and convinced me that the scale to which is required to properly run my own DNS (assuming it was all myself), is simply unfeasible, but that having some spare slaves isn't a bad thing.
    Also, having read this blog post has made me want to change over to EasyDNS as my DNS host for all new domains. You have passion and knowledge, something I am drawn to.

    Thank you Mark!

  3. says

    Brilliant article, I wish I'd read it before the attack last night. Your Route 53 integration that you are beta testing is awesome too. I'm much more clued up now to minising DNS risk.

  4. says

    of course like your web-site but you need to check the spelling on several of your posts.
    Many of them are rife with spelling problems and I find it very bothersome to tell the truth
    nevertheless I will definitely come back again.

    • says

      Hi there,

      Our apologies for that. Sometimes these posts get sent out in a hurry due to the urgency of the matter, and don't get proofed by as many eyes as they should. We'll try and take better care.

      Regards

      Arnon