Anomaly Detection and Characterization to Classify Traffic Anomalies Case Study: TOT Public Company Limited Network
Authors: O. Siriporn, S. Benjawan
Abstract:
This paper presents four unsupervised clustering algorithms, namely sIB, RandomFlatClustering, FarthestFirst, and FilteredClusterer, that previous work has not used for network traffic classification. We describe the methodology, the results, the clusters produced, and an evaluation of each algorithm's efficiency in terms of accuracy. We also consider efficiency in terms of the time each algorithm takes to generate clusters quickly and correctly. Our work studies and tests which algorithm performs best at classifying traffic anomalies in network traffic, using different attributes that have not been used before. We analyze the algorithm with the best efficiency or the best learning and compare it with the previously used K-Means. Our research will be used to develop anomaly detection systems that are more efficient and better suited to future requirements.
Keywords: Unsupervised, clustering, anomaly, machine learning.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1078213
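The comparison described in the abstract (clustering quality and running time of an algorithm such as FarthestFirst against K-Means) can be illustrated with a minimal sketch. This is not the authors' implementation or data: the code below is plain Python, the traffic is synthetic, the two flow features are hypothetical, and cluster purity is used here only as an assumed stand-in for the paper's accuracy measure.

```python
import math
import random
import time
from collections import Counter


def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def farthest_first(points, k):
    """Farthest-first traversal: each new center is the point farthest
    from its nearest already-chosen center; points then join the nearest center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return [nearest(p, centers) for p in points]


def k_means(points, k, iterations=20):
    """Plain Lloyd's algorithm, seeded with the first k points."""
    centers = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [nearest(p, centers) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # keep the old center if a cluster becomes empty
                centers[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


def purity(assignment, labels):
    """Fraction of points whose label matches the majority label of their cluster."""
    correct = 0
    for cluster in set(assignment):
        counts = Counter(l for a, l in zip(assignment, labels) if a == cluster)
        correct += counts.most_common(1)[0][1]
    return correct / len(labels)


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
# Hypothetical features per flow: (packets per second, mean packet size in bytes).
normal = [(rng.gauss(10, 2), rng.gauss(500, 30)) for _ in range(90)]
anomalous = [(rng.gauss(200, 10), rng.gauss(60, 10)) for _ in range(10)]
points = normal + anomalous
labels = ["normal"] * 90 + ["anomaly"] * 10

results = {}
for name, algorithm in [("FarthestFirst", farthest_first), ("K-Means", k_means)]:
    start = time.perf_counter()
    assignment = algorithm(points, 2)
    elapsed = time.perf_counter() - start
    results[name] = (purity(assignment, labels), elapsed)
    print(f"{name}: purity={results[name][0]:.2f}, time={elapsed * 1000:.2f} ms")
```

Because 90% of the synthetic flows are labeled normal, a purity of 0.90 is the trivial baseline here; only values above that indicate that the anomalous flows were actually separated. Timings on a toy dataset of this size say little about behavior on real traffic volumes.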