Using Data Mining Techniques for Finding Cardiac Outlier Patients

Authors: Farhan Ismaeel Dakheel, Raoof Smko, K. Negrat, Abdelsalam Almarimi


In this paper, we use data mining techniques to identify outlier patients who use large amounts of drugs over a long period of time. Any healthcare or health insurance system must manage the quantities of drugs used by patients with chronic diseases. In the Kingdom of Bahrain, about 20% of the health budget is spent on medications. Managers of healthcare systems do not have enough information about how patients with chronic diseases use drugs, whether there is any misuse, and whether there are outlier patients. In this work, carried out in cooperation with the information department of the Bahrain Defence Force hospital, we selected the data for cardiac patients for the period from 1 January 2008 to 31 December 2008 as the data for the model. We used three techniques to analyze drug utilization by cardiac patients: first we applied a clustering technique, then we measured clustering validity, and finally we applied a decision tree as the classification algorithm. The clustering partitioned the 1603 patients, who received 15,806 prescriptions during this period, into three groups according to drug utilization; 23 patients (2.59%), who received 1316 prescriptions (8.32%), were classified as outliers. The classification algorithm shows that average drug utilization, age, and gender of the patient are the main predictive factors in the induced model.

Keywords: Data Mining, Clustering, Classification, Drug Utilization
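
The first two steps described in the abstract (clustering patients by drug utilization, then checking clustering validity) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the clustering algorithm or the validity measure, so k-means and the Dunn index are used as stand-ins, the data are synthetic rather than the hospital's records, and the names `kmeans_1d`, `dunn_index`, and `usage` are hypothetical. The decision tree step is not sketched.

```python
import random

# Synthetic data (an assumption, NOT the hospital's records): yearly
# prescription counts per patient for three hypothetical usage levels.
rng = random.Random(42)
low = [rng.uniform(1, 8) for _ in range(200)]
medium = [rng.uniform(10, 20) for _ in range(60)]
heavy = [rng.uniform(45, 65) for _ in range(5)]
usage = low + medium + heavy


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's k-means on 1-D data, seeded at evenly spaced order statistics."""
    ordered = sorted(values)
    n = len(ordered)
    centroids = [ordered[i * (n - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def dunn_index(values, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    groups = list(clusters.values())
    max_diameter = max(max(g) - min(g) for g in groups)
    min_separation = min(
        abs(a - b)
        for i, g in enumerate(groups)
        for h in groups[i + 1:]
        for a in g
        for b in h
    )
    return min_separation / max_diameter


centroids, labels = kmeans_1d(usage, k=3)

# Treat the cluster with the highest mean utilization as the outlier group.
top = max(range(3), key=lambda c: centroids[c])
outliers = [v for v, lab in zip(usage, labels) if lab == top]
dunn = dunn_index(usage, labels)

print(f"centroids: {[round(c, 1) for c in centroids]}")
print(f"outlier patients: {len(outliers)} of {len(usage)}")
print(f"Dunn index: {dunn:.3f}")
```

A higher Dunn index indicates compact, well-separated clusters, so in a setting like this one it could be compared across different numbers of clusters before settling on three.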


