What You Should Know About DDoS Incident Response

August 27, 2019 | Mina Hao

This document covers the overall strategy and process for DDoS incident response and analyzes some typical attacks and their countermeasures, to help organizations respond to DDoS attacks more effectively and efficiently. It does not dwell on the detailed configuration of specific mitigations for each type of DDoS attack.

0x00 Introduction

DDoS incidents have been a constant over the past several years. DDoS analysis reports from security vendors show that attacks are growing rapidly in both size and frequency. With attack costs falling, the required technical skill dropping, attack tools spreading widely, and bot machines readily available on the Internet, launching a DDoS attack has become a piece of cake. Amid this trend, organizations have to invest more and more in DDoS defenses, and with more investment they naturally expect higher returns. When we say an organization does a good job in DDoS mitigation, it is largely because it responds to DDoS incidents promptly and effectively.

This document aims to give readers a solid understanding of DDoS incident response and a broader view of the effort as a whole. After reading it, organizations should be able to handle a DDoS attack prepared and with ease, instead of being at a loss as to what to do.

0x01 Overview

In routine operation and maintenance (O&M), incident response is conducted to fix device faults, to restore services carried by faulty devices, or both.

Device faults are listed here as well. Although they are not the focus of this document, they are an integral part of the overall response framework: if anti-DDoS devices fail during a DDoS attack, we are left helpless.

0x02 DDoS Incident Response Strategy

The DDoS incident response strategy can be generalized as “all-round defenses, multilevel filtering”, as shown in the following figure.

As we all know, DDoS is characterized by floods of traffic, but still there are ways to hit targets with less traffic. That is why our defense strategy is layered, as shown in the preceding figure.

During a DDoS attack, when the attack bandwidth is below 80% of the line bandwidth, on-premises anti-DDoS devices are capable enough to scrub DDoS attack traffic. In this case, external assistance is unnecessary.

However, when the DDoS attack bandwidth exceeds 100% of the line bandwidth, you have to turn to the carrier for DDoS scrubbing. Oops! The carrier that owns the line under attack does not provide a DDoS scrubbing service. “What shall we do?” Take it easy. You can initiate plan B: ask the carrier to expand the bandwidth temporarily. As long as the line bandwidth is greater than the attack bandwidth, on-premises devices can do their job effectively.

In case the carrier’s scrubbing service does not work to your satisfaction, you can play the trump card: cloud scrubbing.
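
The tiers described above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not a vendor procedure: the function name choose_mitigation, its parameters, and the tier labels are hypothetical, and the 80% and 100% thresholds are simply the figures quoted above. The text does not say what to do between 80% and 100% of the line bandwidth, so the sketch assumes on-premises scrubbing continues while the carrier is put on standby.

```python
def choose_mitigation(attack_gbps, line_gbps, carrier_scrubs=True,
                      carrier_effective=True):
    """Return an ordered list of mitigation tiers (hypothetical labels)."""
    if line_gbps <= 0:
        raise ValueError("line bandwidth must be positive")
    ratio = attack_gbps / line_gbps
    if ratio < 0.8:
        # Below 80% of the line: on-premises devices are enough.
        return ["on-premises scrubbing"]
    if ratio <= 1.0:
        # Assumption: the 80%-100% band is not covered by the text.
        return ["on-premises scrubbing", "carrier on standby"]
    # Above 100% of the line: upstream help comes first.
    if not carrier_scrubs:
        upstream = "temporary bandwidth expansion"   # plan B
    elif carrier_effective:
        upstream = "carrier-side scrubbing"
    else:
        upstream = "cloud scrubbing"                 # the trump card
    return [upstream, "on-premises scrubbing"]


plan = choose_mitigation(attack_gbps=15, line_gbps=10, carrier_scrubs=False)
```

In every branch above 100%, on-premises scrubbing stays in the list, because it takes over once the upstream tier has brought the attack bandwidth back under the line bandwidth.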

Most real-life DDoS attacks are multi-vector (a mix of various attack types). For example, some DDoS attacks generate floods of reflective traffic as background traffic, in which CC (Challenge Collapsar, an HTTP request flood), connection exhaustion, and low-and-slow attack traffic is hidden. In this case, a likely option is to use the carrier-side scrubbing service (which targets volume-based attacks) to first filter out over 80% of the traffic and so free up enough line bandwidth. Of the remaining 20%, 80% is probably still attack traffic (low-and-slow attacks, CC attacks, and so on), which needs further scrubbing.
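
The 80%/20% reasoning above can be worked through with a few lines of arithmetic. This is only an illustration: the function name layered_filtering and the 10 Gbps total are hypothetical, and the two 80% ratios are the rough figures from the paragraph above, not measurements.

```python
def layered_filtering(total_gbps, carrier_drop=0.80, residual_attack=0.80):
    """Split traffic into what survives carrier scrubbing and what it contains."""
    after_carrier = total_gbps * (1 - carrier_drop)   # reaches the access line
    attack_left = after_carrier * residual_attack     # still needs scrubbing
    legitimate = after_carrier - attack_left          # real user traffic
    return after_carrier, attack_left, legitimate


after_carrier, attack_left, legitimate = layered_filtering(10.0)
print(f"{after_carrier:.1f} Gbps reaches the line, "
      f"{attack_left:.1f} Gbps is still attack traffic, "
      f"{legitimate:.1f} Gbps is legitimate")
```

With these assumed numbers, roughly 2 Gbps reaches the line after carrier scrubbing, of which about 1.6 Gbps is application-layer attack traffic and about 0.4 Gbps is legitimate, which is why a second scrubbing layer is still needed.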

0x03 DDoS Incident Response Process

The following figure illustrates a general DDoS incident response process that is suitable for most organizations. Some details still need clarification:

  1. Where there is no 24/7 onsite security O&M, the network management and monitoring center (NMMC) is usually responsible for monitoring DDoS alerts. A collaborative handling mechanism should therefore be in place so that security staff can work with the monitoring personnel at the NMMC.
  2. If on-premises scrubbing devices are not configured for automatic diversion, diversion must be initiated manually when an attack is detected. Who does this in an emergency, how, and with what authorization should be agreed beforehand and written into the response process. Suppose a DDoS attack takes place at 2:00 a.m.: if these matters have been settled, emergency response personnel can do their job confidently.
  3. For the carrier-side scrubbing service, a working mechanism should be agreed with the carrier in advance so that it can actually be invoked during an incident. At a minimum, response personnel should know how to reach the contact person on the carrier’s side and which authorization method both sides accept (for example, some carriers require a faxed request bearing the customer’s official seal).
  4. For the vendor’s expert support, it is advisable to hold technical exchanges with the vendor in advance. Response personnel should at least know when to invoke this mechanism. In addition, basic information should be collected beforehand for second-line support personnel, because they take time to arrive on site and time to recovery is an important factor for the organization.

0x04 DDoS Incident Response Guidance

4.1 DDoS Attack Types

DDoS Attack Type | Symptom
Volume-based attack (direct) | DDoS alerts on SYN, ACK, ICMP, UDP, or connection floods
Volume-based attack (reflective) | DDoS alerts on NTP, DNS, SSDP, or ICMP floods
CC | Possibly no evident traffic fluctuation, slow response to service access requests, frequent timeouts, and lots of requests for access to the same page(s)
Low-and-slow HTTP attack | Possibly no evident traffic fluctuation, slow response to service access requests, frequent timeouts, many incomplete HTTP GET requests, and HTTP POST request packets of regular sizes (usually very small)
URL reflection | Evident traffic fluctuations, slow response to service access requests, frequent timeouts, and many request packets with the same Referer header that indicates the same page linked to the requested resource
Various exploits with the DoS effect | Alerts possibly generated by intrusion detection/prevention devices, but seldom by DDoS detection devices

Once the attack type has been identified, the next step is to deploy defenses according to the following DDoS defense guidelines.

  • Volume-based attack (direct): When the attack bandwidth is below the line bandwidth, on-premises scrubbing suffices to overcome this type of attack.
  • Volume-based attack (direct): When the attack bandwidth exceeds the line bandwidth, three options are available: carrier-side scrubbing, temporary bandwidth expansion, and cloud scrubbing. Once the attack bandwidth is brought below the line bandwidth, on-premises devices can take over the remaining work.

For SYN, ACK, UDP, ICMP, and other flood attacks:

Generally, defense algorithms (for example, dropping the first packet and IP traceback) configured on on-premises devices are effective enough to cope with these attacks.

In particular circumstances, it is advisable to apply rate limits for various packets along with the preceding algorithms to at least ensure the basic availability of services during an attack.

If most source IP addresses are found to be located in a specific region, location-based restriction may be a good option, especially for attacks originating from foreign countries.

  • Volume-based attack (reflective): When the attack bandwidth is below the line bandwidth, on-premises scrubbing suffices to overcome this type of attack.
  • Volume-based attack (reflective): When the attack bandwidth exceeds the line bandwidth, three options are available: carrier-side scrubbing, temporary bandwidth expansion, and cloud scrubbing. Once the attack bandwidth is brought below the line bandwidth, on-premises devices can take over the remaining work.

For NTP, DNS, SSDP, and other reflection attacks:

Generally, defense algorithms (for example, dropping UDP fragments and rate limiting) configured on on-premises devices are effective enough to mitigate these attacks.

In particular circumstances, a more thorough dropping rule can be configured: reflection traffic typically comes from fixed source ports, targets fixed destination IP addresses, and uses over 90% of the bandwidth, so a rule matching those ports and addresses removes most of it.

  • CC attack: First on-premises devices can be used to scrub the traffic and, if the effect is not satisfactory, cloud scrubbing can step in.

For CC attacks, if traffic scrubbing hardly works, the affected pages can be replaced with static pages as an emergency stopgap.

  • Low-and-slow HTTP attack: First on-premises devices can be used to scrub the traffic and, if the effect is not satisfactory, cloud scrubbing can step in.

For slow HTTP body attacks, characteristics of the attack tool should be first identified and then policies should be configured on on-premises devices accordingly.

  • URL reflection: On-premises scrubbing and cloud scrubbing should be combined.

For URL reflection attacks, it is important to identify reflectors in the attack process and then configure advanced settings on on-premises devices.

  • Various exploits with the DoS effect: Intrusion detection or prevention devices should be monitored for alerts and system vulnerabilities should be promptly fixed once discovered.

Strictly speaking, this type of attack is not a DDoS attack, but its effect is similar to that of a DoS attack. It is listed here only to make our classification as exhaustive as possible.

4.2 DDoS Incident Response Guidance

4.2.1 Volume-based DDoS Attack (Direct)

Examples of this type of attack are SYN flood, ACK flood, ICMP flood, and UDP flood attacks. When it detects a DDoS attack, the DDoS detection device immediately generates an alert, so we can obtain firsthand information from this device. If the DDoS scrubbing device is not deployed in in-path mode, the attack traffic must first be diverted to it, either automatically or manually, before it can be scrubbed. Either way, we can capture packets for further analysis to identify the signatures of the attack.

Generally, when packets of a certain type account for over 80% of the total number of packets captured, an attack is believed to be in progress.
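
The 80% rule of thumb can be checked mechanically once each captured packet has been labeled with its type. The following is a minimal sketch, assuming the labels have already been extracted from a capture by some other tool; the function name dominant_packet_type and the sample data are hypothetical.

```python
from collections import Counter


def dominant_packet_type(packet_types, threshold=0.80):
    """Return (label, share) if one packet type exceeds the threshold, else None."""
    counts = Counter(packet_types)
    total = sum(counts.values())
    if total == 0:
        return None
    label, count = counts.most_common(1)[0]
    share = count / total
    return (label, share) if share > threshold else None


# Hypothetical capture: 85 SYN packets out of 100.
sample = ["TCP-SYN"] * 85 + ["TCP-ACK"] * 10 + ["UDP"] * 5
suspect = dominant_packet_type(sample)
```

Here suspect names TCP-SYN with a share of 0.85, while a capture in which no type dominates returns None.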

  • SYN flood
  1. Checking the number of packets

TCP-SYN packets take up around 80% of the total packets captured.

  2. Checking the number of connections to the server

On a Windows server, run the netstat -an | find "SYN_RECEIVED" command to check TCP connections (on Linux, the equivalent is netstat -an | grep SYN_RECV). If a lot of connections are in the SYN_RECEIVED state (half-open), a SYN flood attack is believed to be happening.

  • ACK flood

Most ACK flood attacks aim to saturate the bandwidth. If a large proportion of the packets captured are TCP-ACK packets that do not belong to any established TCP connection, and many of them are retransmissions, an ACK flood attack is almost certainly in progress.

  • ICMP flood

Normally, ICMP packets make up a very small proportion of network traffic. When over 20% of the packets captured are ICMP packets, it may be rash to conclude that an ICMP flood attack is under way, but the symptom at least indicates an anomaly in the current network environment. Typically, when a core transport network becomes faulty, routers may report packets that cannot reach their destination by sending the server ICMP messages that encapsulate them. This can cause a detection device to generate a DDoS alert on ICMP floods. To determine whether a real ICMP flood attack is happening, we can also check the size of the ICMP packets, which is usually smaller than 100 bytes (unless they implement some special function such as probing). If most packets captured are ICMP packets larger than 1000 bytes, or there is an overwhelming number of ICMP fragments, an ICMP flood attack is almost certainly in progress.

  • UDP flood

As UDP flood attacks are mainly aimed at overwhelming the target’s ability to process and respond by occupying the whole bandwidth, there must be a great number of UDP packets in a very short time. Besides, payloads of these packets are largely similar. In a UDP flood attack, we can find that, of all UDP packets captured with Wireshark, most contain similar information in the Data field although their source IP addresses and destination ports may be different.
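
The payload similarity described for UDP floods can be approximated by fingerprinting the start of each UDP payload. This is a rough sketch, not a detection algorithm used by any particular device: the function name payload_concentration is hypothetical, and treating the first 16 bytes as the fingerprint is an assumption made only for illustration.

```python
from collections import Counter


def payload_concentration(payloads, prefix_len=16):
    """Share of packets whose leading payload bytes match the most common prefix."""
    prefixes = Counter(p[:prefix_len] for p in payloads)
    total = sum(prefixes.values())
    if total == 0:
        return 0.0
    return prefixes.most_common(1)[0][1] / total


# Hypothetical capture: nine identical filler payloads and one unrelated packet.
captured = [b"A" * 64] * 9 + [b"hello"]
share = payload_concentration(captured)
```

A share close to 1.0 across packets with varying source IP addresses and destination ports matches the symptom described above; ordinary mixed UDP traffic would be expected to score much lower.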

For volume-based DDoS attacks (direct), the common algorithms employed by most DDoS traffic scrubbing devices work well. How to configure policies on these devices is beyond the scope of this document.
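
The half-open connection check described for SYN floods can also be scripted against saved netstat output. The sketch below only parses text; the function name count_half_open and the sample lines are hypothetical, and it accepts both SYN_RECEIVED (Windows) and SYN_RECV (Linux) as state names.

```python
def count_half_open(netstat_output, states=("SYN_RECEIVED", "SYN_RECV")):
    """Count lines of `netstat -an` output that show a half-open TCP state."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        if any(field in states for field in fields):
            count += 1
    return count


# Hypothetical excerpt of `netstat -an` output using documentation addresses.
SAMPLE = """\
  TCP    192.0.2.10:80     198.51.100.7:40112    SYN_RECEIVED
  TCP    192.0.2.10:80     198.51.100.8:40113    SYN_RECEIVED
  TCP    192.0.2.10:80     203.0.113.9:51544     ESTABLISHED
  TCP    192.0.2.10:443    0.0.0.0:0             LISTENING
"""
half_open = count_half_open(SAMPLE)
```

What counts as "a lot" depends on the server's normal load, so the number is best compared against a baseline recorded during normal operation.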

4.2.2 Volume-based DDoS Attack (Reflective)

The following table lists some representative volume-based DDoS attacks (reflective).

Attack Type | Amplification Factor | Vector
NTP Amplification Attack | 556.9 | Monlist query
DNS Amplification Attack | 28 to 54 | Text query
SSDP Amplification Attack | 30.8 | SEARCH request
CharGEN Amplification Attack | 358.8 | Character generation request
SNMP Amplification Attack | 6.3 | GetBulk request
NetBIOS Amplification Attack | 3.8 | Name resolution
QOTD Amplification Attack | 140.3 | Quote request
Reflective DDoS attacks have two features: (1) the attack traffic is overwhelmingly large; (2) the attack source is difficult to trace. Reflection conceals the real attack source, even when it is a botnet, as is often the case in such attacks. For this reason, the hackers behind this type of attack are especially audacious and unscrupulous.

These attacks have distinctive features. Experience tells us that, whether packets are captured by a scrubbing device or a network probe, the attack traffic can take up over 90%, and sometimes even 99%, of the total network traffic. After all, the sole purpose of reflection attacks is to consume network bandwidth and congest the ingress line.

During a reflection attack, almost all alerts on DDoS detection devices are about UDP floods.
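
Because the alerts only say "UDP flood", a quick way to tell which reflection vector is involved is to group captured UDP packets by source port. This is a minimal sketch: the function name reflection_breakdown and the sample data are hypothetical, and the port numbers are the standard well-known ports for the protocols in the table above.

```python
from collections import Counter

# Well-known UDP ports of commonly abused reflector services.
REFLECTOR_PORTS = {
    17: "QOTD",
    19: "CharGEN",
    53: "DNS",
    123: "NTP",
    137: "NetBIOS",
    161: "SNMP",
    1900: "SSDP",
}


def reflection_breakdown(udp_source_ports):
    """Map each reflector protocol (or 'other') to its share of the packets."""
    counts = Counter(REFLECTOR_PORTS.get(port, "other")
                     for port in udp_source_ports)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Hypothetical capture dominated by NTP responses.
shares = reflection_breakdown([123] * 90 + [53] * 5 + [40000] * 5)
```

A single protocol holding a share around 0.9, as NTP does here, is consistent with the 90% figure mentioned above.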

The following figures show signatures of various reflection attack packets.

DNS reflection attack:

NTP reflection attack:

SSDP reflection attack:

Reflective DDoS attacks are not difficult to mitigate. When the attack bandwidth exceeds the line bandwidth (on the web-based manager of a protection device, we often see the attack bandwidth is equal to the line bandwidth because traffic in excess of the maximum line bandwidth is already dropped by the upstream carrier), the organization should request initiation of the carrier-side DDoS scrubbing service. If the attack bandwidth does not exceed the line bandwidth, on-premises scrubbing is enough. Besides, ACLs can be configured on edge routers to filter out this type of traffic. On an on-premises DDoS traffic scrubbing device, the following policy can be configured to thoroughly filter out reflective DDoS traffic:

To defend against DNS reflection attacks, we can configure a DNS keyword filtering policy. So far, all DNS reflection attacks we have handled have used query type 0x00ff (ANY).
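
To illustrate what such a filter matches on, the sketch below reads the query type from the question section of a raw DNS message. It is an illustration of the wire format only, not the scrubbing device's implementation: the function names are hypothetical, and it looks only at the first question. DNS responses normally repeat the question section, so the same check can be applied to reflected responses, although fragments after the first one will not contain it.

```python
import struct


def dns_question_type(message):
    """Return the QTYPE of the first question in a raw DNS message, or None."""
    if len(message) < 12:                      # fixed header is 12 bytes
        return None
    qdcount = struct.unpack("!H", message[4:6])[0]
    if qdcount == 0:
        return None
    offset = 12
    while offset < len(message):               # walk the QNAME labels
        length = message[offset]
        if length == 0:                        # root label ends the name
            offset += 1
            break
        if length & 0xC0:                      # pointer: unexpected in a question
            return None
        offset += 1 + length
    else:
        return None                            # ran off the end of the message
    if offset + 4 > len(message):
        return None
    qtype, _qclass = struct.unpack("!HH", message[offset:offset + 4])
    return qtype


def build_query(name, qtype):
    """Build a minimal DNS query (helper for the example only)."""
    header = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in name.split(".")) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


is_any = dns_question_type(build_query("example.com", 0x00FF)) == 0x00FF
```

An ordinary A-record lookup has query type 1 and would not match, which is why the 0x00ff value is a useful filtering keyword.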

4.2.3 Application-Layer DDoS Attack

Typical application-layer DDoS attacks include CC attacks and low-and-slow attacks. The most distinctive difference between these attacks and volume-based DDoS attacks is that the former can achieve the effect of the latter with small volumes of traffic. In extreme situations, no obvious traffic fluctuation is observed before services are paralyzed.

The basic algorithms employed by DDoS traffic scrubbing devices are less effective against this type of attack. We must capture attack signatures in real time to determine the best cure.

CC attacks have obvious patterns. Normal users browse from page to page across a site instead of staying on a few pages. During CC attacks, most visits target a small number of pages (5–10). In this case, we can configure a filtering rule on the DDoS traffic scrubbing device to protect these pages.
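
The page-concentration pattern can be measured from web access logs. The following is a minimal sketch, assuming the requested paths have already been extracted from the logs; the function name page_concentration and the sample paths are hypothetical.

```python
from collections import Counter


def page_concentration(request_paths, top_n=10):
    """Return the top_n most requested paths and their combined share of requests."""
    counts = Counter(request_paths)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0
    top = counts.most_common(top_n)
    share = sum(n for _, n in top) / total
    return [path for path, _ in top], share


# Hypothetical log: one page receives 90 of 100 requests.
paths = ["/login"] * 90 + [f"/article/{i}" for i in range(10)]
hot_pages, hot_share = page_concentration(paths, top_n=1)
```

The pages returned in hot_pages are the candidates for the filtering rule; how high hot_share must be before it is suspicious depends on the site's normal traffic profile.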

Normal traffic does not contain a large number of very small packets, which is, however, the case with low-and-slow HTTP attacks, especially body-related ones. Besides, the packet size of these attacks has a pattern. After identifying these features, we can configure parameters accordingly on the DDoS scrubbing device to effectively defend against these attacks.
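
The two symptoms mentioned above, many very small packets and a regular packet size, can be summarized from a list of payload sizes. This is a rough sketch: the function name small_packet_profile is hypothetical, and the 64-byte cutoff for "very small" is an assumption chosen only for illustration.

```python
from collections import Counter


def small_packet_profile(payload_sizes, small_threshold=64):
    """Return (share of small payloads, most common small size or None)."""
    total = len(payload_sizes)
    if total == 0:
        return 0.0, None
    small = [size for size in payload_sizes if size <= small_threshold]
    most_common_size = Counter(small).most_common(1)[0][0] if small else None
    return len(small) / total, most_common_size


# Hypothetical capture: 80 one-byte body chunks among 20 normal-sized packets.
small_share, typical_size = small_packet_profile([1] * 80 + [1200] * 20)
```

A high small_share combined with a single dominant size is the kind of feature that can then be turned into a parameter on the scrubbing device.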

0x05 DDoS Incident Response Exercise

To ensure truly efficient responses to DDoS attacks, we should carry out incident response exercises routinely. Once the DDoS incident response strategy is set, the response process established, and the various DDoS attacks and their countermeasures analyzed, the next step is to conduct regular exercises, whether in sandbox mode or hands-on. Exercises walk us through the DDoS incident response process and help us identify what needs to be improved in our response efforts.

0x06 Must-Knows About DDoS Incident Response

What needs to be considered for DDoS incident response is listed as follows for readers’ reference:

  • In the current network environment, how many Internet access lines are there and what is the bandwidth of each line?
  • Do the carriers owning Internet access lines support DDoS scrubbing? If yes, have we purchased this service or can we use it on a trial basis in case of emergency? Is there an emergency response process in place for initiating the carrier-side scrubbing service during DDoS attacks?
  • Do the carriers owning Internet access lines support temporary bandwidth expansion in case of emergency? If yes, have we purchased the service or can we use it on a trial basis? Is there an emergency response process in place for initiating the temporary bandwidth expansion service during DDoS attacks?
  • Is on-premises DDoS scrubbing available for each of the Internet access lines?
  • Does the on-premises anti-DDoS device and service vendor provide an emergency response plan for DDoS attacks?
  • Are all services that need to be protected included in the monitoring scope of anti-DDoS devices?
  • For services that require automatic DDoS scrubbing, can automatic diversion and scrubbing work properly during a DDoS attack?
  • Is there an internal guiding process for DDoS incident response?
  • Can we become aware of a DDoS attack as soon as it starts? If so, how?