Concepts Extraction from Discharge Notes using Association Rule Mining

Authors: Basak Oguz Yolcular


A large amount of valuable information is available in plain-text clinical reports, and new techniques and technologies are being applied to extract it. In this study, we developed a domain-based software system that transforms 600 otorhinolaryngology discharge notes into a structured form from which clinical data can be extracted. To reduce processing time, the discharge notes were preprocessed and then transformed into a data table. Several word lists were compiled to identify the common sections of the discharge notes, including patient history, age, problems, and diagnosis. The n-gram method was used to discover term co-occurrences within each section. This produced a dataset of candidate concepts for the validation step, in which the Predictive Apriori algorithm for Association Rule Mining (ARM) was applied to validate the candidates.

Keywords: association rule mining, otorhinolaryngology, predictive apriori, text mining
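
The two stages described in the abstract (n-gram candidate generation within a section, then rule-based validation) can be illustrated with a minimal sketch. It is not the system built in the study and it is not the Predictive Apriori algorithm: it only counts bigrams per section and computes plain support and confidence for a word-to-word rule, the two quantities that Predictive Apriori trades off. The function names, the thresholds, and the English sample lines are all hypothetical and chosen for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_concepts(sections, n=2, min_count=2):
    """Count in how many sections each n-gram occurs and keep the frequent ones."""
    counts = Counter()
    for text in sections:
        tokens = text.lower().split()
        counts.update(set(ngrams(tokens, n)))  # count each n-gram once per section
    return {gram: c for gram, c in counts.items() if c >= min_count}


def rule_stats(sections, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent over sections."""
    token_sets = [set(s.lower().split()) for s in sections]
    n_ante = sum(1 for t in token_sets if antecedent in t)
    n_both = sum(1 for t in token_sets if antecedent in t and consequent in t)
    support = n_both / len(token_sets)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical diagnosis lines (made up for this sketch, not data from the study)
diagnosis_sections = [
    "chronic otitis media",
    "chronic otitis media with effusion",
    "acute otitis externa",
    "chronic sinusitis",
]

candidates = candidate_concepts(diagnosis_sections, n=2, min_count=2)
support, confidence = rule_stats(diagnosis_sections, "chronic", "otitis")
print(candidates)
print(support, confidence)
```

Here a bigram becomes a candidate concept when it appears in at least `min_count` sections, and a candidate such as "chronic otitis" would then be accepted or rejected according to the statistics of the corresponding rule. A real validation step would replace `rule_stats` with a proper association rule miner.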


