Defect Cause Modeling with Decision Tree and Regression Analysis

Authors: B. Bakır, İ. Batmaz, F. A. Güntürkün, İ. A. İpekçi, G. Köksal, N. E. Özdemirel


The main aim of this study is to identify the most influential variables that cause defects in the items produced by a casting company located in Turkey. To this end, one of the company's items with a high percentage of defectives was selected. Two approaches, regression analysis and decision trees, were used to model the relationship between process parameters and defect types. Although the logistic regression models failed, the decision tree model gave meaningful results. Based on these results, the decision tree approach appears to be a promising technique for determining the most important process variables.

Keywords: Casting industry, decision tree algorithm C5.0, logistic regression, quality improvement.
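
As an illustration of how a decision tree can single out influential process variables, the following minimal sketch ranks two candidate variables by an entropy-based gain ratio, the kind of split criterion used by C4.5-family trees. It is not the authors' model: the variable names (pour_temp, mold_moisture), the records, and the helper functions are hypothetical and made up for the example, and the commercial C5.0 tool may differ in its details.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, feature, target):
    """Information gain from splitting rows on feature, divided by split info."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    total = len(rows)
    # Weighted entropy of the target that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split info penalizes features that fragment the data into many values.
    split_info = entropy([row[feature] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical records (NOT data from the study): discretized process
# settings and a defect flag for eight cast items.
records = [
    {"pour_temp": "high", "mold_moisture": "low", "defect": "yes"},
    {"pour_temp": "high", "mold_moisture": "high", "defect": "yes"},
    {"pour_temp": "high", "mold_moisture": "low", "defect": "yes"},
    {"pour_temp": "high", "mold_moisture": "high", "defect": "yes"},
    {"pour_temp": "low", "mold_moisture": "low", "defect": "no"},
    {"pour_temp": "low", "mold_moisture": "high", "defect": "no"},
    {"pour_temp": "low", "mold_moisture": "low", "defect": "no"},
    {"pour_temp": "low", "mold_moisture": "high", "defect": "no"},
]

# Rank the candidate variables from most to least informative.
ranking = sorted(
    ("pour_temp", "mold_moisture"),
    key=lambda f: gain_ratio(records, f, "defect"),
    reverse=True,
)
print(ranking)
```

In this toy data the defect flag follows pour_temp exactly and is unrelated to mold_moisture, so pour_temp ranks first. A full tree learner applies the same criterion recursively to each resulting subset.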


