Post-Mortem of the Jan-07 DDoS Attack

On Saturday, Jan 07 commencing at approximately 3:30pm EST (almost the exact moment I hit "publish" on a post about Save The Elephants, which we'll repost later) we were hit with a multi-faceted DDoS Attack across three anycast constellations: dns1, dns2 and dns3.

The attack was a combined SYN, ICMP and DNS flood, in excess of 1 Gig/sec across our anycast IPs, with packet rates ranging from 500K/sec to 1M/sec on each nameserver.

At the outset of the attack it looks like all three affected DNS constellations were rendered non-responsive for a period of 30 to 60 minutes.

We were able to identify the target of the attack and had them delegate away from our nameservers (this domain has now cycled through 8 other DNS providers in under 48 hours, bringing this DDoS with it to every one of them).

The attack traffic against us is still ongoing, but it is being mitigated.

Prolexic was able to mitigate starting around 4:40pm EST, bringing parts of DNS2 back online.

We then made changes to DNS1 to direct queries toward the functional nodes of DNS2.

The attack overwhelmed DNS3, and our upstream providers dropped our BGP sessions to preserve operational integrity.

We restored native DNS1 functionality at approximately midnight Saturday evening by renumbering the anycast broadcast IP address.

At approximately 3:00am Sunday morning we began routing queries for DNS3 to DNS4. We later renumbered the public anycast IPs for DNS3.EASYDNS.ORG and DNS3.EASYDNS.CA and that traffic has reverted to those anycast constellations.

Policy Response

The target domain in question was of a type we've seen before, one that has caused us grief in the form of other DDoS attacks. We have made additions to our domain prescreening rules to prevent similar domains from acquiring service from us in the future. (Nearly all DDoS attacks are against domains that have moved onto the system within the previous 72 hours.)
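
The 72-hour observation lends itself to a simple check during an incident: list the domains that arrived most recently and look at those first. What follows is only a minimal sketch in Python under stated assumptions. The `Domain` record, the function names and the fixed window are hypothetical, and the actual prescreening rules are not described in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: the window mirrors the 72-hour observation in the text;
# it is not a documented setting of any real system.
RECENT_WINDOW = timedelta(hours=72)


@dataclass
class Domain:
    """Hypothetical record for a domain hosted on the system."""
    name: str
    onboarded_at: datetime  # timezone-aware time the domain moved onto the system


def is_recent_arrival(domain: Domain, now: datetime) -> bool:
    """Return True if the domain moved onto the system within the window."""
    age = now - domain.onboarded_at
    # A negative age (timestamp in the future) is treated as bad data, not "recent".
    return timedelta(0) <= age <= RECENT_WINDOW


def recent_arrivals(domains, now):
    """Return the names of recently arrived domains, in input order."""
    return [d.name for d in domains if is_recent_arrival(d, now)]
```

Calling `recent_arrivals(all_domains, datetime.now(timezone.utc))` at the start of an attack would give a short list of likely targets to examine. A recent arrival is not proof of anything, so the result is a place to start looking rather than a verdict.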

Technical Response

We have identified DNS3 as the weakest link in our offering and will be making substantive changes to it first, followed by DNS1.

Key Takeaways

We made several unforced errors in the course of this incident which aggravated our pain:

  • While we were frantically trying to get the ccTLD registry to pull the delegation for the domain, the domain moved itself to a licensee website that specializes in DDoS mitigation. That site was still serving up our NS records in response to queries, because of a configuration error in the licensee site (which we maintain).
  • We had the easyDNS.com and blog.easydns.org DNS fully backed up to Amazon Route53 via our easyRoute53 interface for exactly this type of scenario, but nobody in the company knew that except me. I never thought to push the button until long after. Had I done that, as was supposed to be the plan, then the easyDNS and blog website would have remained available the entire time. Some of our members did avail themselves of this for their own domains and we're told it worked great. File under: The cobbler's children have no shoes.
  • There was inadequate communication between systems staff (who were all handling this remotely) and support staff back in the office. This translated into additional frustration for members. So while people following all this on Twitter, or signed up to the blog mailing list, were giving us kudos for our communications savvy, the guys back at the office were wondering what the hell was going on.
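
The first error in the list above, a site still answering with our NS records after the domain had moved, is the kind of thing a plain comparison can catch. Here is a minimal sketch in Python, assuming the NS hostnames returned by the other site have already been collected by some other means (for example, by querying it directly). The function names and the `ns1.mitigation.example` hostname are hypothetical.

```python
def normalize(host: str) -> str:
    """Lower-case a hostname and drop any trailing dot so names compare equal."""
    return host.strip().rstrip(".").lower()


def stale_nameservers(served_ns, our_ns):
    """Return the hostnames in served_ns that still belong to our_ns.

    served_ns: NS hostnames another site returns for the domain.
    our_ns:    hostnames of nameservers that should no longer be listed.
    The result is sorted and de-duplicated; an empty list means no overlap.
    """
    ours = {normalize(h) for h in our_ns}
    served = {normalize(h) for h in served_ns}
    return sorted(served & ours)


# Illustrative input only: one of our names is still being served.
served = ["DNS3.EASYDNS.ORG.", "ns1.mitigation.example."]
ours = ["dns3.easydns.org", "dns3.easydns.ca"]
leftovers = stale_nameservers(served, ours)
```

A non-empty `leftovers` list means attack traffic can still be steered at the old nameservers even though the delegation has nominally moved.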

All in all, not our proudest day. We've handled previous DDoS attacks better. We have been thinking on this deeply since it happened, and we are profoundly sorry for the pain this has caused our members.

There will be substantive changes here as a result of this incident.

 

Comments

  1. smith says

    quick question Mark- Why didn't you immediately start using DDoS mitigation services like Prolexic? Have you considered them in the past?

  2. says

    Thanks for your resolution into this incident.
    Can you explain how we can get to the easyRoute53 interface in order to mirror our easydns zones over for failover?

    -Tony.

    • says

      Hi Tony,

      The route53 interface is in your control panel, in Domain Administration, in the 'external' tab.

      Sorry for the delay. We're just getting used to the blog being used for questions. :)

  3. Mike Polek says

    I would like to commend your response to the DDoS attack, not just in handling the technical aspects of mitigating the attack, but equally as important being very transparent in the post-mortem. I applaud your courage to post the "Key Takeaways" section which contains valuable information and learning opportunities for any administrator who has ever (or will ever) experience a DDoS attack.

    Documentation and Communication Channels set up ahead of time are critical to shortening the response time. I know firsthand what it is like when an attack occurs. If it's an attack you haven't seen before, restoring service in an hour's time is excellent. Sharing your learnings, so that others may benefit is outstanding.

    Thanks for all you do to support your customers.

    -Mike.

    • says

      Hi Mike,

      Arnon here, answering the blog, sorry for the delay.

      It was definitely a learning experience for us all!

      Nobody does things by themselves after all. We're big fans of open-source and partnering here, and that means sharing knowledge. Even the stuff that hurts to learn. :)

      Thanks for the kind words, and your OWN support.

    • says

      Thanks Geoffrey!

      Feedback is really useful, so never hesitate to drop us a line. The more we know about what our clients are seeing and thinking about the service, the better we can make it.

      Arnon

  4. says

    Glad to see you got it worked out as quick as you did.

    I first noticed my DNS resolution down on SA when I was
    checking a friend's server (who I had set up a hostname
    for) to check on a few things shortly before going out.
    Of course at first I thought it was a domain name
    hijacking but the record at CIRA checked out OK, then
    I thought maybe it was Shaw's DNS so I checked with a
    friend who uses MTS. Thank you for being transparent
    enough that I could eventually figure out what was going
    on.

    I'm glad that you do your best to take care of your
    customers.*

    Has there been any success identifying who might be
    behind the attack? I ask because although some
    domains are at higher risk than others conceivably
    this could happen to any of your customers including
    those of us who've been with you for over 10 years.

    =========
    *This comment edited at the request of easyDNS and with the permission of the
    author. For any details regarding the edit, please contact glenap@moonie.ca

    • says

      Hi Glen,

      We don't really have the resources to do this sort of investigation ourselves, I'm afraid. This is the sort of thing that requires police, private armies, movie soundtracks, etc.

      The whole botnet/DDoS thing is a pretty huge affair, and very complicated. It's much like any other criminal activity, unfortunately. If criminals target you, either for monetary reasons, political, or the lulz, then you're a target. Not much you can do aside from mitigate if it happens and try and figure out who you ticked off. It's a protection racket. :( On the plus side, there are a lot of people out there working to deal with the issue.

      For our part, we're working hard right now to make sure if this sort of thing happens again, we have a stronger response time, more hardened infrastructure, and solid communications to keep our clients in the loop.

      Hope that helps!

      Arnon
      -an easyPerson

  5. Omri says

    We've set up easyRoute53 now but don't see a button to automatically switch the nameservers over (in the event of another attack) – is there one? Or do we have to manually go in and change the nameservers one by one?

    Also, it would be great if we could choose to have easyRoute53 automatically change the nameservers in the event of another attack, and then change them back once the attack was resolved.

    • says

      Hi Omri,

      There's no 'hotswap' button, as it were, though that's certainly a good idea. We'll put it on the list for future instances.

      We're still going to be evolving things. An automated system for this sort of thing might not be a bad plan, but taking the client out of the equation in making decisions regarding nameserver delegation is a pretty major step, and involves legal responsibilities, etc. Among other things, we'd have to be certain before we did that everything was ok on the Amazon side of things. I'm not saying it's not do-able, just that it's a major project, and not one we can say yea or nay to off the bat. I've got it noted for future checking, though.

      In the meantime, to change the nameservers, you can do so through the route53 tool by clicking on edit while within the tool and selecting the nameservers you want (for example, adding in Amazon or removing our own, or whatever you choose). Please keep in mind that easyDNS has to be the registrar for the domain to be able to include that functionality. If we're not, then you won't see anything about the nameservers.

      If you have any other issues with it, please let us know at support@easydns.com as we reply to that a lot faster. :)

      Regards

      Arnon