Using Genetic Algorithm to Improve Information Retrieval Systems

Authors: Ahmed A. A. Radwan, Bahgat A. Abdel Latef, Abdel Mgeid A. Ali, Osman A. Sadek

Abstract:

This study investigates the use of genetic algorithms in information retrieval. The method is shown to be applicable to three well-known document collections, where the genetic modification presents more relevant documents to users. We present a new fitness function for approximate information retrieval that is faster and more flexible than the cosine similarity fitness function.
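
The abstract does not specify the new fitness function, so the following is only a minimal sketch of the cosine similarity baseline it is compared against, used inside a simple genetic query-learning loop. Everything here is an illustrative assumption rather than the authors' method: the names `cosine_similarity`, `fitness`, and `evolve`, the real-valued term-weight chromosomes, one-point crossover, the mutation scheme, and all parameter values.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine of the angle between two term-weight vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fitness(query, relevant_docs):
    """Hypothetical fitness: mean cosine similarity of a query chromosome
    to the documents judged relevant."""
    return sum(cosine_similarity(query, d) for d in relevant_docs) / len(relevant_docs)


def evolve(initial_query, relevant_docs, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Evolve query term weights; parameter values are assumptions for illustration."""
    rng = random.Random(seed)
    n = len(initial_query)  # number of index terms; must be at least 2
    # Seed the population with the user's query plus random weight vectors.
    population = [initial_query[:]]
    population += [[rng.random() for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        # Rank by fitness and keep the better half unchanged (elitism).
        population.sort(key=lambda q: fitness(q, relevant_docs), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover position
            child = a[:cut] + b[cut:]
            # Mutation: occasionally replace a weight with a fresh random value.
            child = [rng.random() if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        population = parents + children
    return max(population, key=lambda q: fitness(q, relevant_docs))


if __name__ == "__main__":
    # Toy data: four index terms, two relevant documents.
    docs = [[1.0, 0.0, 1.0, 0.0], [0.8, 0.1, 0.9, 0.0]]
    start = [0.0, 1.0, 0.0, 1.0]
    best = evolve(start, docs)
    print("initial fitness:", fitness(start, docs))
    print("evolved fitness:", fitness(best, docs))
```

Because the better half of each generation is carried over unchanged, the best fitness in this sketch never decreases from one generation to the next. A different fitness function, such as the one the paper proposes, could be substituted for `fitness` without changing the rest of the loop.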

Keywords: Cosine similarity, Fitness function, Genetic Algorithm, Information Retrieval, Query learning.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1071610

