{"title":"Using Textual Pre-Processing and Text Mining to Create Semantic Links","authors":"Ricardo Avila, Gabriel Lopes, Vania Vidal, Jose Macedo","volume":151,"journal":"International Journal of Educational and Pedagogical Sciences","pagesStart":959,"pagesEnd":967,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/10010548","abstract":"This article offers an approach to the automatic discovery
\r\nof semantic concepts and links in the domain of Oil Exploration
\r\nand Production (E&P). Machine learning methods combined with
\r\ntextual pre-processing techniques were used to detect local patterns in
\r\ntexts and, thus, generate new concepts and new semantic links. Even
\r\nusing more specific vocabularies within the oil domain, our approach
\r\nhas achieved satisfactory results, suggesting that the proposal can
\r\nbe applied in other domains and languages, requiring only minor
\r\nadjustments.","references":"[1] Miles, A. & Brickley, D. (2009, August 18). SKOS Simple\r\nKnowledge Organization System Primer. Retrieved from\r\nhttps:\/\/www.w3.org\/TR\/skos-primer\/.\r\n[2] Ag\u00eancia Nacional do Petr\u00f3leo, G\u00e1s Natural e Biocombust\u00edveis (2016,\r\nAugust 19). Gloss\u00e1rio. Retrieved from http:\/\/www.anp.gov.br\/glossario.\r\n[3] Fern\u00e1ndez, E. F., Pedrosa Junior, O., Pinho, A. C. (2015, January 7).\r\nDicion\u00e1rio do Petr\u00f3leo. Retrieved from http:\/\/dicionariodopetroleo.com.br.\r\n[4] Anthonysamy, P., Edwards, M. J., Weichel, C. & Rashid, A. (2016).\r\nInferring Semantic Mapping Between Policies and Code: The Clue is\r\nin the Language. In: ESSoS. Springer, pp. 233\u2013250.\r\n[5] Avila, Ricardo, Santos, Salomao, Araujo, David, Vidal, Vania Maria Ponte\r\nand de Macedo, Jose Antonio Fernandes. Semantic Links Using SKOS\r\nPredicates. Paper presented at the meeting of the KES, 2017.\r\n[6] Bland, J. M. and D. G. Altman (1996). Transformations, means, and\r\nconfidence intervals. BMJ 312(7038), 1079.\r\n[7] Bot, M. C. J. (2000). Improving Induction of Linear Classification Trees\r\nwith Genetic Programming. In: Proc. of the Genetic and Evolutionary\r\nComputation Conference (GECCO-2000). Las Vegas, Nevada, USA, pp.\r\n403\u2013410.\r\n[8] Brown, M. L. and J. F. Kros (2009). Imprecise Data and the Data Mining\r\nProcess. In: Encyclopedia of Data Warehousing and Mining. IGI Global,\r\npp. 999\u20131005.\r\n[9] Chakrabarti, S. (2002). Mining the Web: Discovering Knowledge from\r\nHypertext Data. Morgan Kaufmann.\r\n[10] Engels, R., G. Lindner, and R. Studer (1997). A Guided Tour through\r\nthe Data Mining Jungle. In: KDD. AAAI Press, pp. 163\u2013166.\r\n[11] Engels, R. and C. Theusinger (1998). Using a Data Metric for\r\nPreprocessing Advice for Data Mining Applications. In: ECAI, pp.\r\n430\u2013434.\r\n[12] Hasan, M. A., V. Chaoji, S. Salem, and M. Zaki (2006). 
Link Prediction\r\nUsing Supervised Learning. In: Proc. of SDM 06 workshop on Link\r\nAnalysis, Counterterrorism and Security.\r\n[13] Joachims, T. (2002). Learning to Classify Text Using Support Vector\r\nMachines. Kluwer Academic Publishers.\r\n[14] Kuhn, M. and K. Johnson (2013). Applied predictive modeling. Vol. 26.\r\nSpringer.\r\n[15] Lampos, V., B. Zou, and I. J. Cox (2017). Enhancing Feature Selection\r\nUsing Word Embeddings: The Case of Flu Surveillance. In: WWW. ACM,\r\npp. 695\u2013704.\r\n[16] Lesk, M. (1986). Automatic sense disambiguation using machine\r\nreadable dictionaries: how to tell a pine cone from an ice cream cone. In:\r\nProceedings of ACM SIGDOC Conference, pp. 24\u201326.\r\n[17] Lichtenwalter, R., J. T. Lussier, and N. V. Chawla (2010). New\r\nperspectives and methods in link prediction. In: KDD, pp. 243\u2013252.\r\n[18] Miles, A. and S. Bechhofer (2008). SKOS Simple\r\nKnowledge Organization System Reference. W3C. URL:\r\nhttp:\/\/www.w3.org\/TR\/skos-reference.\r\n[19] Miles, A., B. Matthews, M. Wilson, and D. Brickley (2005). SKOS core:\r\nSimple Knowledge Organisation for the Web. In: Proc. of international\r\nconference on DC and metadata applications. DC Metadata Initiative, pp.\r\n1\u20139.\r\n[20] Morik, K. (2000). The Representation Race - Preprocessing for Handling\r\nTime Phenomena. In: ECML. Vol. 1810. Lecture Notes in Computer\r\nScience. Springer, pp. 4\u201319.\r\n[21] Muller, P., C. Fabre, and C. Adam (2014). Predicting the relevance\r\nof distributional semantic similarity with contextual information. In:\r\nProc. of the 52nd Annual Meeting of the Association for Computational\r\nLinguistics. Volume 1, pp. 479\u2013488.\r\n[22] Mustapha, S. M. F. D. S. (2018). Case-based reasoning for identifying\r\nknowledge leader within online community. Expert Syst. Appl. 97,\r\n244\u2013252.\r\n[23] Su, Y. and S.-U. Guan (2016). Density and Distance Based KNN\r\nApproach to Classification. IJAEC 7(2), 45\u201360.\r\n[24] Sun, S., D. 
Liu, G. Li, W. Yu, and L. Pang (2010). Combination of\r\nOntology Model and Semantic Link Network in Web Resource Retrieval.\r\nIn: SKG. IEEE Computer Society, pp. 285\u2013288.\r\n[25] Ukey, K. and A. Alvi (2012). Text Classification using Support\r\nVector Machine. In: International Journal of Engineering and Technology\r\n(IJERT).\r\n[26] Volker, J., P. Haase, and P. Hitzler (2009). Learning expressive\r\nontologies. IOS Press.\r\n[27] Volz, J., C. Bizer, M. Gaedke, and G. Kobilarov (2009). Discovering\r\nand Maintaining Links on the Web of Data. In: International Semantic\r\nWeb Conference. Vol. 5823. Springer, pp. 650\u2013665.\r\n[28] Wang, Z., J. Li, Y. Zhao, R. Setchi, and J. Tang (2013). A unified\r\napproach to matching semantic data on the Web. Knowl.-Based Syst. 39,\r\n173\u2013184.\r\n[29] Weiss, S. M., N. Indurkhya, T. Zhang, and F. Damerau (2005). Text\r\nMining: Predictive Methods for Analyzing Unstructured Information.\r\nSpringer.\r\n[30] Zhang, C., G.-R. Xue, Y. Yu, and H. Zha (2009). Web-scale classification\r\nwith naive bayes. In: WWW. ACM, pp. 1083\u20131084.\r\n[31] Zhang, J. and Y. Yang (2003). Robustness of regularized linear\r\nclassification methods in text categorization. In: SIGIR. ACM, pp.\r\n190\u2013197.\r\n[32] Zhuge, H. (2009). Communities and Emerging Semantics in Semantic\r\nLink Network: Discovery and Learning. IEEE Trans. Knowl. Data Eng. 21\r\n(6), 785\u2013799.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 151, 2019"}