Automatic Text Summarization

Authors: Mohamed Abdel Fattah, Fuji Ren

Abstract:

This work proposes an approach to automatic text summarization. The approach is a trainable summarizer that scores each sentence using several features: sentence position, positive keywords, negative keywords, sentence centrality, sentence resemblance to the title, inclusion of named entities, inclusion of numerical data, relative sentence length, the bushy path of the sentence, and aggregated similarity. We first investigate the effect of each sentence feature on the summarization task in isolation. We then combine all features into a single score function and train genetic algorithm (GA) and mathematical regression (MR) models to obtain a suitable combination of feature weights. The performance of the proposed approach is measured at several compression rates on a corpus of 100 English religious articles, and the results are promising.
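
The sketch below is a minimal Python illustration of the scoring scheme the abstract describes, not the authors' implementation: the function names, the toy fitness measure, and the GA settings are all assumptions added for clarity. Each sentence is represented by a vector of feature scores, a summary is formed from the sentences with the highest weighted-sum scores at a chosen compression rate, and a simple genetic algorithm searches for the feature weights.

import random

def sentence_score(features, weights):
    # Weighted sum of per-sentence feature scores (each assumed scaled to [0, 1]).
    return sum(w * f for w, f in zip(weights, features))

def summarize(sent_features, weights, compression_rate):
    # Keep the top-scoring fraction of sentences, then restore document order.
    n_keep = max(1, int(len(sent_features) * compression_rate))
    ranked = sorted(range(len(sent_features)),
                    key=lambda i: sentence_score(sent_features[i], weights),
                    reverse=True)
    return sorted(ranked[:n_keep])

def fitness(weights, documents, reference_indices, compression_rate=0.2):
    # Toy objective (an assumption): fraction of reference-summary sentences recovered.
    hits, total = 0, 0
    for feats, ref in zip(documents, reference_indices):
        chosen = set(summarize(feats, weights, compression_rate))
        hits += len(chosen & set(ref))
        total += len(ref)
    return hits / max(total, 1)

def train_weights_ga(documents, reference_indices, n_features,
                     pop_size=30, generations=50, mutation_rate=0.1):
    # Toy GA: keep the best half, refill via uniform crossover and Gaussian mutation.
    pop = [[random.random() for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, documents, reference_indices), reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            child = [ai if random.random() < 0.5 else bi for ai, bi in zip(a, b)]
            child = [max(0.0, g + random.gauss(0, 0.2)) if random.random() < mutation_rate
                     else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, documents, reference_indices))

An analogous weight vector could instead be fitted by regressing sentence feature vectors against target inclusion labels, which corresponds to the MR model mentioned above.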

Keywords: Automatic Summarization, Genetic Algorithm, Mathematical Regression, Text Features.

Digital Object Identifier (DOI): https://doi.org/10.5281/zenodo.1084246

