Text Summarization for Oil and Gas Drilling Topic
Authors: Y. Y. Chen, O. M. Foong, S. P. Yong, Kurniawan Iwan
Abstract:
Information sharing and gathering are important in an era of rapid technological advancement. The World Wide Web has caused an explosion in the amount of available information. Readers are overloaded with lengthy text documents when they would prefer shorter versions. The oil and gas industry is no exception. In this paper, we develop an automated text summarization system, AutoTextSumm, that extracts the salient points of oil and gas drilling articles by combining a statistical approach, keyword identification, synonym words and sentence position. We conducted interviews with Petroleum Engineering experts and English Language experts to identify the most commonly used keywords in the oil and gas drilling domain. The performance of AutoTextSumm is evaluated using precision, recall and F-score. In the experiments, AutoTextSumm performed satisfactorily, with an F-score of 0.81.
Keywords: Keyword's probability, synonym sets.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1072882
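The abstract names precision, recall and F-score as the evaluation measures but does not state how they were computed. The sketch below shows one common way to apply them to extractive summaries, assuming each summary is represented as a set of selected sentence indices and that F-score means the balanced F1 measure. The function name evaluate_summary and the example indices are hypothetical and are not taken from the paper.

```python
def evaluate_summary(system_sentences, reference_sentences):
    """Score an extractive summary against a reference summary.

    Both arguments are iterables of sentence indices (an assumed
    representation; the paper does not specify one).
    Returns (precision, recall, f_score).
    """
    system = set(system_sentences)
    reference = set(reference_sentences)
    overlap = len(system & reference)  # sentences chosen by both

    # Precision: share of system-selected sentences that are in the reference.
    precision = overlap / len(system) if system else 0.0
    # Recall: share of reference sentences that the system recovered.
    recall = overlap / len(reference) if reference else 0.0

    # Balanced F-score: harmonic mean of precision and recall.
    if precision + recall == 0:
        f_score = 0.0
    else:
        f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical example: the system picks sentences 0, 2, 5, 7 and
# the human reference summary contains sentences 0, 2, 3, 7, 9.
precision, recall, f_score = evaluate_summary([0, 2, 5, 7], [0, 2, 3, 7, 9])
print(f"precision={precision:.2f} recall={recall:.2f} f_score={f_score:.2f}")
```

In this made-up example three of the four selected sentences appear in the reference, so precision is 0.75, recall is 0.60 and the F-score is about 0.67. These numbers only illustrate the formulae and are unrelated to the reported F-score of 0.81.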