A K-Means Based Clustering Approach for Finding Faulty Modules in Open Source Software Systems

Parvinder S. Sandhu; Jagdeep Singh; Vikas Gupta; Mandeep Kaur; Sonia Manhas; Ramandeep Sidhu

Abstract:

Prediction of fault-prone modules provides one way to support software quality engineering. Clustering is used to determine the intrinsic grouping in a set of unlabeled data. Among the clustering techniques available in the literature, K-Means is the most widely used. This paper introduces a K-Means based clustering approach for finding the fault proneness of object-oriented software systems. The contribution of this paper is that it uses metric values of the JEdit open source software to generate rules for categorizing software modules as faulty or non-faulty, followed by empirical validation. The results are measured in terms of accuracy of prediction, probability of detection, and probability of false alarm.

Keywords: K-Means, Software Fault, Classification, Object-Oriented Metrics.
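
The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the metric vectors, the fault labels, the initial centroids, the Euclidean distance, and the rule that the cluster with the larger centroid is the faulty one are all assumptions made for illustration. The three measures use their common definitions (accuracy = correct predictions / all modules, probability of detection = TP / (TP + FN), probability of false alarm = FP / (FP + TN)), which the paper is assumed to share.

```python
from math import dist


def k_means(points, centroids, max_iter=100):
    """Plain K-Means; returns (centroids, cluster index per point)."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid (Euclidean).
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its centroid
        if updated == centroids:  # converged: centroids stopped moving
            break
        centroids = updated
    return centroids, labels


def evaluate(predicted, actual):
    """Return (accuracy, probability of detection, probability of false alarm)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return (tp + tn) / len(pairs), tp / (tp + fn), fp / (fp + tn)


# Hypothetical two-metric vectors per module (not JEdit data).
modules = [(1, 2), (2, 1), (2, 2), (3, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
# Hypothetical ground truth: True means the module is faulty.
actual = [False, False, False, True, False, True, True, True]

# Assumption: seed the two clusters with the first and last module.
centroids, labels = k_means(modules, [modules[0], modules[-1]])

# Assumption: the cluster whose centroid has the larger metric sum is "faulty".
faulty_cluster = max(range(len(centroids)), key=lambda k: sum(centroids[k]))
predicted = [lab == faulty_cluster for lab in labels]

accuracy, detection, false_alarm = evaluate(predicted, actual)
print(accuracy, detection, false_alarm)
```

A high probability of detection together with a low probability of false alarm indicates that the clusters separate faulty from non-faulty modules well. Accuracy alone can mislead when faulty modules are rare.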

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1080951
