Network Intrusion Detection Design Using Feature Selection of Soft Computing Paradigms

Authors: T. S. Chou, K. K. Yen, J. Luo


Network traffic data available for intrusion detection design are typically large, contain much uninformative content, and provide only limited and ambiguous information about users' activities. We study these problems and propose a two-phase approach to intrusion detection design. In the first phase, we develop a correlation-based feature selection algorithm to remove uninformative features from the original high-dimensional database. In the second phase, we design an intrusion detection method that addresses the uncertainty caused by limited and ambiguous information. In the experiments, we use six UCI databases and the DARPA KDD99 intrusion detection data set for evaluation. Empirical studies indicate that our feature selection algorithm is capable of reducing the size of the data set, and that our intrusion detection method achieves better performance than the other participating intrusion detectors.

Keywords: Intrusion Detection, Feature Selection, Fuzzy Clustering, Dempster-Shafer Theory, k-Nearest Neighbors
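
The abstract does not spell out the correlation-based feature selection algorithm, so the following is only a minimal sketch of the general idea of a correlation-based filter, not the authors' method. It assumes Pearson correlation as the measure, and the function names `pearson` and `select_features` as well as the thresholds `relevance_min` and `redundancy_max` are hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant (correlation undefined).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(columns, labels, relevance_min=0.3, redundancy_max=0.9):
    """Return sorted indices of the feature columns that are kept.

    A feature is kept when its absolute correlation with the class labels
    is at least `relevance_min` (assumed threshold) and its absolute
    correlation with every already-kept feature is at most
    `redundancy_max` (assumed threshold).
    """
    relevance = [abs(pearson(col, labels)) for col in columns]
    # Visit features from most to least relevant to the class labels.
    order = sorted(range(len(columns)), key=lambda i: relevance[i], reverse=True)
    kept = []
    for i in order:
        if relevance[i] < relevance_min:
            break  # all remaining features are even less relevant
        if all(abs(pearson(columns[i], columns[j])) <= redundancy_max for j in kept):
            kept.append(i)
    return sorted(kept)


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    informative = [1, 2, 1, 8, 9, 8]
    duplicate = [2 * v for v in informative]  # redundant copy of the first column
    noise = [5, 1, 4, 2, 5, 1]                # weakly related to the labels
    print(select_features([informative, duplicate, noise], labels))
```

On this toy data the filter keeps only one of the two perfectly correlated columns and discards the noisy one, which is the kind of reduction in data set size the abstract refers to. A filter of this type looks only at the data and never trains a classifier, so it stays cheap on high-dimensional data.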


