Cross Project Software Fault Prediction at Design Phase

Pradeep Singh; Shrish Verma

Abstract:

Software fault prediction models are built from source code, metrics computed from the same or a previous version of the code, and the related fault data. Some companies do not store or track all the artifacts required for software fault prediction. For such companies, training data from other projects is one potential way to construct a fault prediction model. The earlier a fault is predicted, the less it costs to correct. The training data consists of metrics data and related fault data at the function/module level. This paper investigates fault prediction at an early stage using cross-project data, focusing on design metrics. An empirical analysis is carried out to validate design metrics for cross-project fault prediction. The machine learning technique used for evaluation is Naïve Bayes. The design-phase metrics of other projects can be used as an initial guideline for projects where no previous fault data is available. We analyze seven datasets from the NASA Metrics Data Program, which offer design as well as code metrics. Overall, the results of cross-project learning are comparable to those of within-company learning.

Keywords: Software Metrics, Fault prediction, Cross project, Within project.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1106977
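
The cross-project setup described in the abstract can be illustrated with a minimal sketch: a Gaussian Naïve Bayes classifier is fitted on design metrics from one project and evaluated on modules from a different project. Everything specific here is an assumption made for illustration, not taken from the paper: the three metric columns, the toy values, the Gaussian form of the classifier, and the use of recall and false alarm rate as evaluation measures. The paper's actual datasets, preprocessing, and evaluation may differ.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small floor keeps variances positive for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one module."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical design metrics per module (toy values, not NASA MDP data):
# [branch_count, call_pairs, design_complexity]; label 1 = faulty.
source_rows = [
    [2, 1, 1], [4, 2, 1], [3, 1, 2], [5, 2, 2],
    [20, 9, 8], [25, 12, 10], [18, 8, 7], [30, 14, 12],
]
source_labels = [0, 0, 0, 0, 1, 1, 1, 1]

# A different project with no fault history of its own; labels are used
# only to score the cross-project predictions.
target_rows = [[3, 2, 1], [6, 3, 2], [22, 10, 9], [28, 11, 11], [4, 1, 1]]
target_labels = [0, 0, 1, 1, 0]

model = fit_gaussian_nb(source_rows, source_labels)
predictions = [predict(model, row) for row in target_rows]

tp = sum(p == 1 and t == 1 for p, t in zip(predictions, target_labels))
fn = sum(p == 0 and t == 1 for p, t in zip(predictions, target_labels))
fp = sum(p == 1 and t == 0 for p, t in zip(predictions, target_labels))
tn = sum(p == 0 and t == 0 for p, t in zip(predictions, target_labels))

recall = tp / (tp + fn)        # share of faulty modules that were flagged
false_alarm = fp / (fp + tn)   # share of clean modules flagged by mistake
print(predictions, recall, false_alarm)
```

A within-project baseline for comparison would fit and score the same classifier on disjoint subsets of the target project's own modules, which is only possible when that project has recorded fault data.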
