TY - JFULL
AU - Hany Mahgoub
PY - 2008/9/
TI - Mining Association Rules from Unstructured Documents
T2 - International Journal of Computer and Information Engineering
SP - 2707
EP - 2713
VL - 2
SN - 1307-6892
UR - https://publications.waset.org/pdf/3514
PU - World Academy of Science, Engineering and Technology
NX - Open Science Index 20, 2008
N2 - This paper presents EART (Extract Association Rules from Text), a system for discovering association rules from collections of unstructured documents. EART processes text only, not images or figures. It discovers association rules among the keywords that label the documents in the collection. Its main characteristic is that it integrates XML technology (to transform unstructured documents into structured documents) with an information retrieval scheme (TF-IDF) and a data mining technique for association rule extraction. EART relies on word features to extract association rules. It consists of four phases: structure, index, text mining, and visualization. Our work analyzes the keywords in the extracted association rules in two ways: through the co-occurrence of the keywords in a single sentence of the original text, and through the presence of the keywords in a sentence without co-occurrence. Experiments were carried out on a collection of scientific documents selected from MEDLINE that relate to the outbreak of the H5N1 avian influenza virus.
ER -