Detection of DoS and DDoS attacks using Breakout
Detection and Time Series Models.
B.K.S.P. Kumar Raju Alluri, Department of Computer Science, National Institute of Technology Andhra Pradesh, India
Rahul Sai, Department of Computer Science, National Institute of Technology Andhra Pradesh, India
Krishnatej, Department of Computer Science, National Institute of Technology Andhra Pradesh, India
Abstract—DoS and DDoS attacks have become a major threat to cyber security, and detecting them remains a challenging task. We propose an algorithm for detecting these attacks. First, breakout detection is applied to the frequency of captured packets; this serves as a message observation mechanism that singles out sets of packets with unusually high frequency for analysis. Next, time series models are fitted to the dataset, and their RMSE and MAE values are compared to select the best model. The best model is used for forecasting, and the error rate is calculated as the difference between the actual and predicted values. Finally, the error rate values are tested for chaotic behaviour, which helps in detecting and analysing attack traffic.
Index Terms—DoS and DDoS Detection, Breakout Detection, Time Series Models, Local Lyapunov Exponent.
I. INTRODUCTION
A denial-of-service (DoS) attack is any attack in which the attackers attempt to prevent legitimate users from accessing a service. In a DoS attack, the attacker usually sends excessive messages asking the network or server to authenticate requests that have invalid return addresses. The network or server cannot find the return address of the attacker when sending the authentication approval, so the server waits before closing the connection. When the server closes the connection, the attacker sends more authentication messages with invalid return addresses. The cycle of authentication and waiting then begins again, keeping the network or server busy.
Distributed denial-of-service (DDoS) attacks are a subclass of DoS attacks. According to recent market research, DDoS attacks are quickly becoming the most prevalent type of cyber threat, having grown rapidly over the past year in both number and volume. A DDoS attack is a malicious attempt to make an online service unavailable to legitimate users, usually by temporarily interrupting or suspending the services of its hosting server. It involves multiple connected online devices, collectively known as a botnet, which are used to overwhelm a target website with fake traffic. DDoS assaults often last for days, weeks or even months at a time, making them extremely destructive to any online organization. Among other things, DDoS attacks can lead to loss of revenue, erode consumer trust, force businesses to spend fortunes on compensation and cause long-term reputation damage.
A. Problem Statement
To limit the damage caused by DoS and DDoS attacks, the attack packets have to be analysed, and detecting those packets is a major problem. In this paper we propose an algorithm that detects the time at which a DoS attack occurred, which helps in analysing the attack traffic.
The remaining sections are organised as follows. Section II describes related works. Section III describes the proposed algorithm. Section IV shows the experimental results. Finally, conclusions are given in Section V.
II. RELATED WORKS
Jhaveri, Patel and Jinwala [1] give a detailed account of the different kinds of DoS attacks and mention a message detection mechanism, which we take as the primary step in detecting DoS packets. Wiemken et al. [2] discuss breakout detection, which uses the E-Divisive with Medians concepts covered by Mosqueiro et al. [3] to mark the points in a graph that deviate most. We use breakout detection as the message detection mechanism: it finds sudden increases in traffic, where the possibility of attack is high, so that they can be analysed, which may reduce detection time. The next stage of detection uses time series models.
The study on the impact on forecasting errors [7] clearly shows that the output of different time series models varies across datasets. This variation makes it necessary to apply several time series models and keep the best one. Yaacob et al. [6] applied ARIMA to DoS detection, and Hyndman and Khandakar [5] describe time series models such as Holt-Winters. The models are built on an initial set of values and then used to forecast the predicted values. Time series models can be compared using the metrics described by Harvey [8], in particular the RMSE and MAE values. Finally, once the best model is found, the error rate is calculated and the Lyapunov exponent is applied to it. Wolf et al. [9] explain how the Lyapunov exponent characterises the nature of the error, which helps to find the points in time at which the data become chaotic. Combining all of the above steps, we designed the PRK algorithm to detect the chaotic nature of the collected network traffic.
Figure 1. Graph between time period and frequency
III. PROPOSED ALGORITHM FOR DETECTION OF DOS
This section presents the proposed algorithm (PRK), summarised in Algorithm 1 and followed by a detailed explanation.
Algorithm 1 DoS attack detection algorithm
Require: $p_i$ (time period) and $q_i$ (frequency of captured packets)
1: Apply breakout analysis to detect breakouts and analyse attack messages.
2: Forecast with the different time series models.
3: Compare the time series models and choose the best model.
4: Calculate the error rate using the best model.
5: Apply the error detection mechanism to detect the chaotic point in the error rate values.
A. Extracting time series attributes
Consider the time attribute from the dataset. Fix a suitable time period $p_i$ and count the number of packets captured in each period, which gives the frequency $q_i$. This results in the required time series dataset. Fig. 1 shows the graph between $p_i$ and $q_i$.
B. Applying Breakout Detection
Apply breakout detection to the time series data, which results in one or more breakout points of higher frequency. These may help in detecting attack traffic directly. The underlying algorithm, referred to as E-Divisive with Medians (EDM), employs energy statistics to detect divergence in mean. It detects a change in distribution in a given time series. Consider the breakouts above a certain frequency value and analyse the corresponding packets in the original data; this process is the message detection mechanism. The breakout points are shown in Fig. 2. This is the basic and most direct step in detecting attack traffic.
Figure 2. Breakout Points
C. Applying Time series Models
Time series models are important for forecasting time series data. There are many time series models, and different models work effectively on different datasets. In this section we apply two main time series models, ARIMA (AutoRegressive Integrated Moving Average) and Holt-Winters.
1) ARIMA: Given a time series of data $X_t$, where $t$ is an integer index and the $X_t$ are real numbers, an ARMA($p'$, $q$) model is given by
$$X_t - \varphi_1 X_{t-1} - \cdots - \varphi_{p'} X_{t-p'} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$
or equivalently by
$$\left(1 - \sum_{i=1}^{p'} \varphi_i L^i\right) X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t \tag{1}$$
where $L$ is the lag operator, the $\varphi_i$ are the parameters of the autoregressive part of the model, the $\theta_i$ are the parameters of the moving average part and the $\varepsilon_t$ are error terms. The error terms $\varepsilon_t$ are generally assumed to be independent, identically distributed variables sampled from a normal distribution with zero mean.
Apply the ARIMA model (1) to the given time series data. Forecast with the model to predict the future values of the data. Fig. 3 shows how the future values vary when forecasting with the ARIMA model.
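To make the recursion in (1) concrete, the following minimal Python sketch computes the residuals and a one-step-ahead forecast of an ARMA($p'$, $q$) model for given coefficients. It is an illustration only and not the implementation used in this paper: the function names `arma_residuals` and `arma_forecast_one_step` and the example numbers are hypothetical, pre-sample values are assumed to be zero, and both parameter estimation and the differencing step of ARIMA are omitted.

```python
from typing import List, Sequence


def arma_residuals(x: Sequence[float], phi: Sequence[float],
                   theta: Sequence[float]) -> List[float]:
    """Residuals eps_t implied by (1): eps_t = X_t - sum(phi_i X_{t-i}) - sum(theta_j eps_{t-j}).

    Assumption: values before the start of the series are treated as zero.
    """
    eps: List[float] = []
    for t in range(len(x)):
        ar = sum(phi[i] * x[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        eps.append(x[t] - ar - ma)
    return eps


def arma_forecast_one_step(x: Sequence[float], phi: Sequence[float],
                           theta: Sequence[float]) -> float:
    """Forecast the next value; the unknown future error term is replaced by its mean, zero."""
    eps = arma_residuals(x, phi, theta)
    n = len(x)
    ar = sum(phi[i] * x[n - 1 - i] for i in range(len(phi)) if n - 1 - i >= 0)
    ma = sum(theta[j] * eps[n - 1 - j] for j in range(len(theta)) if n - 1 - j >= 0)
    return ar + ma


# Hypothetical packet counts per period and hypothetical AR(1) coefficient.
counts = [8.0, 4.0, 2.0, 1.0]
print(arma_residuals(counts, phi=[0.5], theta=[]))
print(arma_forecast_one_step(counts, phi=[0.5], theta=[]))
```

In practice the coefficients would be estimated from the data by a fitting routine; here they are passed in directly so that the link to (1) stays visible.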
Figure 3. Forecast Graph of ARIMA Time Series Model
2) HoltWinter: Holt (1957) and Winters (1960) extended Holt's method to capture seasonality.
The additive Holt-Winters prediction function (for time series with period length $p$) is
$$\hat{Y}_{t+h} = a_t + h\,b_t + s_{t-p+1+(h-1) \bmod p} \tag{2}$$
where $a_t$, $b_t$ and $s_t$ are given by
$$a_t = \alpha (Y_t - s_{t-p}) + (1 - \alpha)(a_{t-1} + b_{t-1})$$
$$b_t = \beta (a_t - a_{t-1}) + (1 - \beta)\,b_{t-1}$$
$$s_t = \gamma (Y_t - a_t) + (1 - \gamma)\,s_{t-p}$$
The multiplicative Holt-Winters prediction function (for time series with period length $p$) is
$$\hat{Y}_{t+h} = (a_t + h\,b_t)\, s_{t-p+1+(h-1) \bmod p} \tag{3}$$
where $a_t$, $b_t$ and $s_t$ are given by
$$a_t = \alpha (Y_t / s_{t-p}) + (1 - \alpha)(a_{t-1} + b_{t-1})$$
$$b_t = \beta (a_t - a_{t-1}) + (1 - \beta)\,b_{t-1}$$
$$s_t = \gamma (Y_t / a_t) + (1 - \gamma)\,s_{t-p}$$
The data are required to be non-zero for a multiplicative model, but it makes most sense if they are all positive. The fitting function tries to find the optimal values of $\alpha$, $\beta$ and $\gamma$ by minimizing the squared one-step prediction error. Apply the additive (2) or multiplicative (3) Holt-Winters model, depending on the dataset values. Forecast with the model to predict the future values of the data. Fig. 4 shows how the future values vary when forecasting with the Holt-Winters model.
Figure 4. Forecast Graph of HoltWinter Time Series Model
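The additive recursions in (2) can be traced with a short Python sketch. This is a hedged illustration, not the routine used for the experiments: the function name `holt_winters_additive` is hypothetical, the smoothing parameters and the initial level, trend and seasonal terms are assumed to be supplied by the caller rather than estimated, and the example series is made up.

```python
from typing import List, Sequence


def holt_winters_additive(
    y: Sequence[float],
    p: int,
    alpha: float,
    beta: float,
    gamma: float,
    level0: float,
    trend0: float,
    season0: Sequence[float],
    horizon: int,
) -> List[float]:
    """Run the additive updates for a_t, b_t, s_t and return forecasts for h = 1..horizon.

    season0 holds the p initial seasonal terms s_{1-p}, ..., s_0 (assumed given).
    """
    if len(season0) != p:
        raise ValueError("season0 must contain exactly p seasonal terms")
    a, b = level0, trend0
    s = list(season0)  # s[k] stores s_{k-p+1}, so s_{t-p} sits at index t-1
    for t, y_t in enumerate(y, start=1):
        s_old = s[t - 1]                                   # s_{t-p}
        a_new = alpha * (y_t - s_old) + (1 - alpha) * (a + b)
        b_new = beta * (a_new - a) + (1 - beta) * b
        s.append(gamma * (y_t - a_new) + (1 - gamma) * s_old)
        a, b = a_new, b_new
    n = len(y)
    # Forecast (2): a_n + h*b_n + s_{n-p+1+(h-1) mod p}, stored at index n + (h-1) mod p.
    return [a + h * b + s[n + (h - 1) % p] for h in range(1, horizon + 1)]


# Made-up series: level 10, trend +1 per step, seasonal pattern (+2, -2) with period 2.
series = [13.0, 10.0, 15.0, 12.0]
print(holt_winters_additive(series, p=2, alpha=0.5, beta=0.5, gamma=0.5,
                            level0=10.0, trend0=1.0, season0=[2.0, -2.0], horizon=3))
```

Because the made-up series follows the additive structure exactly and the initial states match it, the forecasts continue the same pattern; on real traffic counts the parameters would instead be chosen by minimizing the one-step prediction error, as described above.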
D. Calculating Error rate
Select the time series model that gives the lower RMSE and MAE when forecasting the time series dataset. Calculate the error rate as the difference between the predicted values and the actual time series data, as in (4). Fig. 5 shows the resulting error variation.
$$\Delta x_i = x_i - y_i \tag{4}$$
where $x_i$ represents the predicted values, $y_i$ represents the actual values and $\Delta x_i$ represents the error rate.
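As a minimal sketch of this step, the Python snippet below computes RMSE and MAE for each candidate model, picks the model with the lowest values, and forms the error series of (4). The helper names (`rmse`, `mae`, `error_rate`, `best_model`) and the sample numbers are hypothetical, and ranking by RMSE first with MAE as a tie-breaker is an assumption made here for illustration.

```python
import math
from typing import Dict, List, Sequence


def error_rate(predicted: Sequence[float], actual: Sequence[float]) -> List[float]:
    """Error series of (4): predicted value minus actual value, element by element."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return [x - y for x, y in zip(predicted, actual)]


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error."""
    errors = error_rate(predicted, actual)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error."""
    errors = error_rate(predicted, actual)
    return sum(abs(e) for e in errors) / len(errors)


def best_model(forecasts: Dict[str, Sequence[float]], actual: Sequence[float]) -> str:
    """Name of the model with the lowest RMSE; MAE breaks ties (assumed ranking rule)."""
    return min(forecasts, key=lambda name: (rmse(forecasts[name], actual),
                                            mae(forecasts[name], actual)))


# Made-up actual counts and forecasts from two hypothetical models.
actual = [10.0, 12.0, 14.0]
forecasts = {"model_a": [11.0, 10.0, 14.0], "model_b": [13.0, 9.0, 17.0]}
chosen = best_model(forecasts, actual)
print(chosen, error_rate(forecasts[chosen], actual))
```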
E. Time of Chaos using Lyapunov Exponent
Error is common in the forecast of any time series model. In this step we apply Lyapunov error detection to the error rate values to detect chaos.
$$\lambda_i = \frac{1}{t_i} \ln\left|\frac{\Delta x_i}{\Delta x_1}\right| \tag{5}$$
The Lyapunov exponent is calculated using (5). A positive value of the Lyapunov exponent shows that the predicted value has deviated from the actual value, which indicates chaotic behaviour. A negative value of the Lyapunov exponent represents normal behaviour. Fig. 6 plots the Lyapunov exponent against time and shows the chaotic and normal behaviour of the traffic. The point where the first positive value occurs is the point of chaos.
Figure 5. Error Variations of predicted and actual values
Figure 6. Error analysis using Lyapunov Exponent
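The detection rule built on (5) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: $t_i$ is taken to be the (positive) time of the $i$-th error value, the first error value is used as the reference and must be non-zero, a zero error is mapped to negative infinity (treated as normal behaviour), and the function names and sample numbers are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def lyapunov_exponents(errors: Sequence[float], times: Sequence[float]) -> List[float]:
    """Exponent of (5) for every error value, relative to the first error value."""
    if len(errors) != len(times):
        raise ValueError("errors and times must have the same length")
    if not errors or errors[0] == 0:
        raise ValueError("the first error value must exist and be non-zero")
    first = errors[0]
    exponents: List[float] = []
    for dx, t in zip(errors, times):
        if t <= 0:
            raise ValueError("times must be positive")
        if dx == 0:
            # Assumption: a perfect prediction counts as normal behaviour.
            exponents.append(float("-inf"))
        else:
            exponents.append(math.log(abs(dx / first)) / t)
    return exponents


def first_chaotic_time(errors: Sequence[float], times: Sequence[float]) -> Optional[float]:
    """Time of the first positive exponent, or None if no value is positive."""
    for lam, t in zip(lyapunov_exponents(errors, times), times):
        if lam > 0:
            return t
    return None


# Made-up error values: the error shrinks at t = 2 and then grows past its start at t = 3.
print(first_chaotic_time([2.0, 1.0, 4.0], [1.0, 2.0, 3.0]))
```

With this definition the exponent at the reference point is always zero, so only later points can be flagged; the flagged time is the candidate attack time to examine in the original capture.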
IV. EXPERIMENTAL RESULTS
The proposed algorithm was implemented on a Core i3 laptop with a 2 GHz CPU and 4 GB of RAM. ARIMA and breakout detection were implemented in R. The breakout points are plotted in Fig. 2; these points are considered the primary detection points. Fig. 3 and Fig. 4 show the variation of the forecast values of the ARIMA and HoltWinters models. Table I compares ARIMA and HoltWinters on the basis of their RMSE and MAE values. Fig. 5 shows the calculated error values, which are used for chaos detection with the Lyapunov exponent. The positive deviation in Fig. 6 is the predicted DoS attack point, where the possibility of attack is high. The network dataset used for the experiment is ISCX-IDS.
Table I: Comparison between ARIMA and HoltWinters
Time Series Model    RMSE     MAE
ARIMA                11.61    7.29
HoltWinters          12.25    7.83
V. CONCLUSION AND FUTURE WORKS
Breakout detection is applied to detect a set of possible DoS and DDoS attack points. In case this fails, we apply time series models to the dataset. With the help of the RMSE and MAE values we obtain the most suitable time series model for the dataset. The best model is used for forecasting and the error rate is calculated. Finally, by applying the Lyapunov exponent to the error rate we detect the time at which the attack occurred, which is then used for analysis. As future work we propose applying breakout detection to on-the-fly datasets. Various other time series models can also be implemented for comparison.
REFERENCES
[1] Jhaveri, R.H., Patel, S.J. and Jinwala, D.C., 2012, January. DoS attacks in mobile ad hoc networks: A survey. In Advanced Computing and Communication Technologies (ACCT), 2012 Second International Conference on (pp. 535-541).
[2] Wiemken, T.L., Furmanek, S.P., Mattingly, W.A., Wright, M.O., Persaud, A.K., Guinn, B.E., Carrico, R.M., Arnold, F.W. and Ramirez, J.A., 2018. Methods for computational disease surveillance in infection prevention and control: Statistical process control versus Twitter's anomaly and breakout detection algorithms. American Journal of Infection Control,
[3] Mosqueiro, T., Strube-Bloss, M., Tuma, R., Pinto, R., Smith, B.H. and Huerta, R., 2016, March. Non-parametric change point detection for spike trains. In Information Science and Systems (CISS), 2016 Annual Conference on (pp. 545-550).
[4] De Gooijer, J.G. and Hyndman, R.J., 2006. 25 years of time series forecasting. International Journal of Forecasting, 22(3), pp.443-473.
[5] Hyndman, R.J. and Khandakar, Y., 2007. Automatic time series for forecasting: the forecast package for R (No. 6/07). Monash University, Department of Econometrics and Business Statistics.
[6] Yaacob, Asrul H., Ian KT Tan, Su Fong Chien, and Hon Khi Tan. ARIMA based network anomaly detection. In Communication Software and Networks, 2010. ICCSN'10. Second International Conference on, pp.
[7] Comparison between time series models: https://www.ons.gov.uk/ons/guide-method/ukcemga/publications-home/publications/archive/from-holt-winters-to-arima-modelling–measuring-the-impact-on-forecasting-errors-for-components-of-quarterly-estimates-of-public-service-output.pdf
[8] Harvey, A.C., 1990. Forecasting, structural time series models and the Kalman filter. Cambridge University Press.
[9] Wolf, A., Swift, J.B., Swinney, H.L. and Vastano, J.A., 1985. Determining Lyapunov exponents from a time series. Physica D: Nonlinear Phenomena, 16(3), pp.285-317.
[10] Datasets used: http://www.unb.ca/cic/datasets/index.html