Search results for: Databases and Information Retrieval
4180 Context-Aware Querying in Multimedia Databases – A Futuristic Approach
Authors: Nadeem Iftikhar, Zouhaib Zafar, Shaukat Ali
Abstract:
Efficient retrieval of multimedia objects has gained enormous focus in recent years. A number of techniques have been suggested for retrieval of textual information; however, relatively little has been suggested for efficient retrieval of multimedia objects. In this paper we propose a generic architecture for context-aware retrieval of multimedia objects. The proposed framework combines the well-known approaches of text-based retrieval and context-aware retrieval to formulate an architecture for accurate retrieval of multimedia data.
Keywords: Context-aware retrieval, information retrieval, multimedia databases, multimedia data.
Downloads 1536
4179 A Frame Work for Query Results Refinement in Multimedia Databases
Authors: Humaira Liaquat, Nadeem Iftikhar, Shaukat Ali, Zohaib Zafar Iqbal
Abstract:
In the current age, retrieving relevant information from massive amounts of data is a challenging job. Over the years, precise and relevant retrieval of information has attained high significance. There is a growing need in the market to build systems that can retrieve multimedia information that precisely meets the user's current needs. In this paper, we introduce a framework for refining query results before showing them to the user, using ambient intelligence, user profile, group profile, user location, time, day, user device type and extracted features. A prototype tool was also developed to demonstrate the efficiency of the proposed approach.
Keywords: Context aware retrieval, Information retrieval, Ambient Intelligence, Multimedia databases, User and group profile.
Downloads 1547
4178 Algorithm for Information Retrieval Optimization
Authors: Kehinde K. Agbele, Kehinde Daniel Aruleba, Eniafe F. Ayetiran
Abstract:
When using Information Retrieval Systems (IRS), users often present search queries made of ad-hoc keywords. It is then up to the IRS to obtain a precise representation of the user’s information need and the context of the information. This paper investigates optimization of IRS to individual information needs in order of relevance. The study addresses the development of algorithms that optimize the ranking of documents retrieved from IRS. It discusses and describes a Document Ranking Optimization (DROPT) algorithm for information retrieval (IR) in an Internet-based or designated-database environment. As the volume of information available online and in designated databases is growing continuously, ranking algorithms can play a major role in the context of search results. In this paper, a DROPT technique for documents retrieved from a corpus is developed with respect to document index keywords and the query vectors. This is based on calculating the weight (
Keywords: Internet ranking,
Downloads 1475
4177 Storing OWL Ontologies in SQL Relational Databases
Authors: Irina Astrova, Nahum Korda, Ahto Kalja
Abstract:
Relational databases are often used as a basis for persistent storage of ontologies to facilitate rapid operations such as search and retrieval, and to utilize the benefits of relational database management systems such as transaction management, security and integrity control. At the same time, more and more OWL files containing ontologies are appearing. Therefore, this paper proposes to extract ontologies from OWL files and then store them in relational databases. A prerequisite for this storage is the transformation of ontologies to relational databases, which is the purpose of this paper.
Keywords: Ontologies, relational databases, SQL, and OWL.
Downloads 5214
4176 A Review on Important Aspects of Information Retrieval
Authors: Yogesh Gupta, Ashish Saini, A.K. Saxena
Abstract:
Information retrieval has become an important field of study and research in computer science due to the explosive growth of information available in the form of full text, hypertext, administrative text, directory, numeric or bibliographic text. Research is ongoing on various aspects of information retrieval systems to improve their efficiency and reliability. This paper presents a comprehensive study that discusses not only the emergence and evolution of information retrieval but also different information retrieval models and some important aspects such as document representation, similarity measure and query expansion.
Keywords: Information Retrieval, query expansion, similarity measure, vector space model.
Downloads 3339
4175 Soft Computing based Retrieval System for Medical Applications
Authors: Pardeep Singh, Sanjay Sharma
Abstract:
With increasing data in medical databases, medical data retrieval is growing in popularity. Some of this analysis includes inducing propositional rules from databases using many soft techniques, and then using these rules in an expert system. Diagnostic rules and information on features are extracted from clinical databases on diseases of congenital anomaly. This paper explains the latest soft computing techniques and some of the adaptive techniques, which encompass an extensive group of methods that have been applied in the medical domain and that are used for the discovery of data dependencies, importance of features, patterns in sample data, and feature space dimensionality reduction. These approaches pave the way for new and interesting avenues of research in medical imaging and represent an important challenge for researchers.
Keywords: CBIR, GA, Rough sets, CBMIR, SVM.
Downloads 1732
4174 A Novel Framework for User-Friendly Ontology-Mediated Access to Relational Databases
Authors: Efthymios Chondrogiannis, Vassiliki Andronikou, Efstathios Karanastasis, Theodora Varvarigou
Abstract:
A large amount of data is typically stored in relational databases (DB). The latter can efficiently handle user queries that aim to elicit the appropriate information from data sources. However, direct access to and use of this data requires end users to have an adequate technical background, and they must also cope with the internal data structure and the values presented. Consequently, information retrieval is quite a difficult process even for IT or DB experts, taking into account the limited contributions of relational databases from the conceptual point of view. Ontologies enable users to formally describe a domain of knowledge in terms of concepts and the relations among them, and hence they can be used for unambiguously specifying the information captured by the relational database. However, accessing information residing in a database using ontologies is feasible provided that the users are keen on using semantic web technologies. To enable users from different disciplines to retrieve the appropriate data, the design of a Graphical User Interface is necessary. In this work, we present an interactive, ontology-based, semantically enabled web tool that can be used for information retrieval purposes. The tool is based entirely on the ontological representation of the underlying database schema, and it provides a user-friendly environment through which users can graphically form and execute their queries.
Keywords: Ontologies, Relational Databases, SPARQL, Web Interface.
Downloads 1930
4173 Performance Evaluation of Content Based Image Retrieval Using Indexed Views
Authors: Tahir Iqbal, Mumtaz Ali, Syed Wajahat Kareem, Muhammad Harris
Abstract:
Digital information is expanding exponentially in our lives. Information residing online and offline is stored in huge repositories relating to every aspect of our lives. Getting the required information is the task of retrieval systems. Content based image retrieval (CBIR) is a retrieval system that retrieves the required information from repositories on the basis of the contents of the image. Time is a critical factor in retrieval systems, and using indexed views with a CBIR system improves the time efficiency of retrieved results.
Keywords: Content based image retrieval (CBIR), Indexed view, Color, Image retrieval, Cross correlation.
Downloads 1856
4172 Language and Retrieval Accuracy
Authors: Ahmed Abdelali, Jim Cowie, Hamdy S. Soliman
Abstract:
One of the major challenges in the Information Retrieval field is handling the massive amount of information available to Internet users. Existing ranking techniques and strategies that govern the retrieval process fall short of expected accuracy. Often relevant documents are buried deep in the list of documents returned by the search engine. In order to improve retrieval accuracy we examine the issue of language effect on the retrieval process. Then, we propose a solution for a more biased, user-centric relevance for retrieved data. The results demonstrate that using indices based on variations of the same language enhances the accuracy of search engines for individual users.
Keywords: Information Search and Retrieval, Language Variants, Search Engine, Retrieval Accuracy.
Downloads 1477
4171 Using Genetic Algorithm to Improve Information Retrieval Systems
Authors: Ahmed A. A. Radwan, Bahgat A. Abdel Latef, Abdel Mgeid A. Ali, Osman A. Sadek
Abstract:
This study investigates the use of genetic algorithms in information retrieval. The method is shown to be applicable to three well-known document collections, where more relevant documents are presented to users after the genetic modification. In this paper we present a new fitness function for approximate information retrieval that is faster and more flexible than the cosine similarity fitness function.
Keywords: Cosine similarity, Fitness function, Genetic Algorithm, Information Retrieval, Query learning.
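For orientation, the cosine similarity baseline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the authors' fitness function (which the abstract does not specify); the term-weight vectors, the document names and the `rank_documents` helper are hypothetical.

```python
import math

def cosine_similarity(query, document):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(q * d for q, d in zip(query, document))
    norm_q = math.sqrt(sum(q * q for q in query))
    norm_d = math.sqrt(sum(d * d for d in document))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_q * norm_d)

def rank_documents(query, documents):
    """Return document ids ordered by decreasing similarity to the query."""
    return sorted(documents,
                  key=lambda doc_id: cosine_similarity(query, documents[doc_id]),
                  reverse=True)

# Hypothetical three-term vocabulary and three documents.
documents = {"d1": [1, 1, 0], "d2": [0, 1, 1], "d3": [1, 0, 0]}
query = [1, 0, 0]
ranking = rank_documents(query, documents)
print(ranking)
```

In a genetic algorithm for query learning, a score of this kind would typically be evaluated for each candidate query vector, which is why its cost matters.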
Downloads 2756
4170 Balancing Strategies for Parallel Content-based Data Retrieval Algorithms in a k-tree Structured Database
Authors: Radu Dobrescu, Matei Dobrescu, Daniela Hossu
Abstract:
The paper proposes a unified model for multimedia data retrieval which includes data representatives, content representatives, index structure, and search algorithms. The multimedia data are defined as k-dimensional signals indexed in a multidimensional k-tree structure. The benefits of using the k-tree unified model were demonstrated by running the data retrieval application on a test bed cluster of six networked nodes. The tests were performed with two retrieval algorithms: one that allows parallel searching using a single feature, and a second that performs a weighted cascade search for multiple-feature querying. The experiments show a significant reduction of retrieval time while maintaining the quality of results.
Keywords: balancing strategies, multimedia databases, parallel processing, retrieval algorithms
Downloads 1424
4169 Information Retrieval: A Comparative Study of Textual Indexing Using an Oriented Object Database (db4o) and the Inverted File
Authors: Mohammed Erritali
Abstract:
The growth over centuries in the volume of text data such as books and articles in libraries has made it necessary to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing and the use of classification categories marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (corpus), allowing a user to find the ones relevant to him, that is to say, the content that matches the information needs of the user. Most information retrieval models use a specific data structure to index a corpus, called an "inverted file" or "reverse index". This inverted file collects information on all terms over the corpus documents, specifying the identifiers of the documents that contain the term in question, the frequency of each term in the documents of the corpus, the positions of the occurrences of the word... In this paper we use an object-oriented database (db4o) instead of the inverted file; that is to say, instead of searching for a term in the inverted file, we search for it in the db4o database. The purpose of this work is to make a comparative study to see whether object-oriented databases can compete with the inverted index in terms of access speed and resource consumption on a large volume of data.
Keywords: Information Retrieval, indexation, oriented object database (db4o), inverted file.
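The inverted file described in the abstract (document identifiers, term frequencies and occurrence positions per term) can be sketched as below. This is a minimal in-memory illustration under simplifying assumptions (lowercasing and whitespace tokenization); the function names and the toy corpus are hypothetical and it does not reproduce the paper's db4o implementation.

```python
from collections import defaultdict

def build_inverted_file(corpus):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in corpus.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index

def lookup(index, term):
    """Return (doc_id, frequency, positions) for each document containing term."""
    postings = index.get(term.lower(), {})
    return [(doc_id, len(positions), positions)
            for doc_id, positions in sorted(postings.items())]

# Hypothetical two-document corpus.
corpus = {
    1: "information retrieval uses an inverted file",
    2: "an object database can store the same information",
}
index = build_inverted_file(corpus)
hits = lookup(index, "information")
print(hits)
```

A lookup touches only the postings of the queried term, which is the access pattern an alternative store such as an object database would have to match.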
Downloads 1734
4168 A New Model of English-Vietnamese Bilingual Information Retrieval System
Authors: Chinh Trong Nguyen, Dang Tuan Nguyen
Abstract:
In this paper, we propose a new model of an English-Vietnamese bilingual Information Retrieval system. Although many CLIR systems have been researched and built, the accuracy of search results in the different languages that a CLIR system supports still needs to improve, especially in finding bilingual documents. The problems identified in this paper are the limitations of machine translation results and the extra-large collections of documents to be searched. We therefore try to establish a different model to overcome these problems.
Keywords: Bilingual Information Retrieval, Cross-lingual Information Retrieval, Bilingual Web sites.
Downloads 1628
4167 Word Stemming Algorithms and Retrieval Effectiveness in Malay and Arabic Documents Retrieval Systems
Authors: Tengku Mohd T. Sembok
Abstract:
Document retrieval in Information Retrieval Systems (IRS) is generally about understanding the information in the documents concerned. The better the system is able to understand the contents of documents, the more effective the retrieval outcomes will be. But understanding the contents is a very complex task. Conventional IRS apply algorithms that can only approximate the meaning of document contents through a keyword approach using the vector space model. Keywords may be unstemmed or stemmed. When keywords are stemmed and conflated in the retrieval process, we are a step closer to applying semantic technology in IRS. Word stemming is a process in morphological analysis under natural language processing, performed before syntactic and semantic analysis. We have developed stemming algorithms for Malay and Arabic and incorporated stemming in our experimental systems in order to measure retrieval effectiveness. The results show that retrieval effectiveness increases when stemming is used in the systems.
Keywords: Information Retrieval, Natural Language Processing, Artificial Intelligence.
Downloads 2258
4166 Using Dempster-Shafer Theory in XML Information Retrieval
Authors: F. Raja, M. Rahgozar, F. Oroumchian
Abstract:
XML is a markup language which is becoming the standard format for information representation and data exchange. A major purpose of XML is the explicit representation of the logical structure of a document. Much research has been performed to exploit the logical structure of documents in information retrieval in order to precisely extract the user's information need from large collections of XML documents. In this paper, we describe an XML information retrieval weighting scheme that tries to find the most relevant elements in XML documents in response to a user query. We present this weighting model for information retrieval systems that utilize plausible inferences to infer the relevance of elements in XML documents. We also add to this model the Dempster-Shafer theory of evidence to express the uncertainty in plausible inferences, and the Dempster-Shafer rule of combination to combine evidence derived from different inferences.
Keywords: Dempster-Shafer theory, plausible inferences, XML information retrieval.
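As a rough illustration of the rule of combination mentioned in the abstract, the sketch below applies the standard Dempster rule to two mass functions over a two-element frame. It is not the paper's weighting scheme: the frame {relevant, irrelevant}, the mass values and the `combine` helper are assumptions chosen only to show the mechanics.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule: combine two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; combination is undefined")
    # Renormalize so that the remaining masses sum to 1.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}

# Hypothetical evidence from two inferences about one XML element.
R = frozenset({"relevant"})
I = frozenset({"irrelevant"})
THETA = R | I  # the whole frame, i.e. "unknown"

m1 = {R: 0.6, THETA: 0.4}
m2 = {R: 0.5, I: 0.2, THETA: 0.3}
fused = combine(m1, m2)
print(fused)
```

Mass left on the whole frame expresses the remaining uncertainty after the two pieces of evidence are fused.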
Downloads 1530
4165 ARCS for Critical Information Retrieval Development
Authors: Suttipong Boonphadung
Abstract:
The research on ARCS for critical information retrieval development aimed to (1) investigate the state of the critical information retrieval skill of Mathematics pre-service teachers before applying the ARCS model in learning activities, (2) study and analyze the development of the critical information retrieval skill of the Mathematics pre-service teachers after utilizing the ARCS model in learning activities, and (3) evaluate the Mathematics pre-service teachers’ satisfaction with using the ARCS model in learning activities as a tool to develop critical information retrieval skill. Forty-one 4th-year Mathematics pre-service teachers who were enrolled in the subject Research for Learning Development in semester 2 of 2012 were purposively selected as the research cohort. The research tools were a self-report and an interview questionnaire that were validated for content validity and reliability (IOC=.66-1.00, α =.834). The research found that the critical information retrieval skill of the research sample before using the ARCS model in learning activities was at the normal high level. According to the in-depth interview and focus group, however, the pre-service teachers still lacked adequate and effective knowledge of information retrieval. Additionally, the critical information retrieval skill of the research cohort after applying the ARCS model in learning activities appeared to be at a high level. The results revealed that the pre-service teachers are able to explain methods of searching for, extracting, and selecting information, as well as evaluating the quality of information and making effective decisions about accepting information. Moreover, the research found that the pre-service teachers showed a normal high to highest level of satisfaction with using the ARCS model in learning activities as a tool to develop their critical information retrieval skill.
Keywords: Critical information retrieval skill, ARCS model, Satisfaction.
Downloads 1523
4164 Text Retrieval Relevance Feedback Techniques for Bag of Words Model in CBIR
Authors: Nhu Van NGUYEN, Jean-Marc OGIER, Salvatore TABBONE, Alain BOUCHER
Abstract:
The state-of-the-art Bag of Words model in Content-Based Image Retrieval has been used for years, but the relevance feedback strategies for this model have not been fully investigated. Inspired by text retrieval, the Bag of Words model is able to use the wealth of knowledge and practices available in text retrieval. We study and experiment with the relevance feedback model from text retrieval in order to adapt it to image retrieval. The experiments show that the techniques from text retrieval give good results for image retrieval and that further improvements are possible.
Keywords: Relevance feedback, bag of words model, probabilistic model, vector space model, image retrieval
Downloads 2117
4163 Multi-agent Data Fusion Architecture for Intelligent Web Information Retrieval
Authors: Amin Milani Fard, Mohsen Kahani, Reza Ghaemi, Hamid Tabatabaee
Abstract:
In this paper we propose a multi-agent architecture for web information retrieval using a fuzzy-logic-based result fusion mechanism. The model is designed in the JADE framework and takes advantage of the JXTA agent communication method to allow agent communication through firewalls and network address translators. This approach enables developers to build and deploy P2P applications through a unified medium to manage agent-based document retrieval from multiple sources.
Keywords: Information retrieval systems, list fusion methods, document score, multi-agent systems.
Downloads 1600
4162 Local Mesh Co-Occurrence Pattern for Content Based Image Retrieval
Authors: C. Yesubai Rubavathi, R. Ravi
Abstract:
This paper presents local mesh co-occurrence patterns (LMCoP) using the HSV color space for image retrieval systems. The HSV color space is used in this method to utilize the color, intensity and brightness of images. Local mesh patterns are applied to define the local information of an image, and gray level co-occurrence is used to obtain the co-occurrence of LMeP pixels. The local mesh co-occurrence pattern extracts the local directional information from the local mesh pattern and converts it into a well-mannered feature vector using the gray level co-occurrence matrix. The proposed method is tested on three different databases: MIT VisTex, Corel, and STex. The algorithm is also compared with existing methods, and results in terms of precision and recall are shown in this paper.
Keywords: Content-based image retrieval system, HSV color space, gray level co-occurrence matrix, local mesh pattern.
Downloads 2222
4161 Controlled Vocabularies and Information Retrieval: 1918 Pandemic’s Scientific Literature as an Example
Authors: M. Garcia-Alsina, J. Cobarsí
Abstract:
The role of controlled vocabularies in information retrieval is broadly recognized as a relevant feature. Besides, there is a standing demand that editors and databases should consider the effective introduction of controlled vocabularies in their procedures to index scientific literature. That is especially important because information retrieval is regarded as a significant driver of systematic literature reviews. Hence, a first question emerges: are controlled vocabularies currently being considered? On the other hand, subject searching in catalogs is complex, mainly due to the dichotomy between authors' keywords and keywords based on controlled vocabularies. Finally, there is some demand to unify health-related terminology to make medical history exploitation and research easier. Considering these features, this paper focuses on controlled vocabularies related to the health field and their role in storing, classifying, and retrieving relevant literature. The objective is to learn what role controlled vocabularies related to the health field play in indexing and retrieving research literature in databases such as Web of Science (WoS) and Scopus. This exploratory research is therefore grounded in two research questions: 1) Which terms are considered in specific controlled vocabularies of the health field? and 2) How are papers indexed in relevant databases so that they can be easily retrieved, considering keywords versus health-specific controlled vocabularies? This research takes as its fieldwork the controlled vocabularies related to health and the scientific interest in the 1918 flu pandemic, also known equivocally as ‘Spanish flu’. This interest has been fostered by the emergence in the early 21st century of epidemics of pneumonic diseases caused by viruses. Searches about and with controlled vocabularies are conducted on the WoS and Scopus databases. The first results of this work in progress are surprising.
There are different controlled vocabularies for the health field, in which the collected and preferred terms related to the ‘1918 pandemic’ are identified. To summarize, ‘Spanish influenza epidemic’ and ‘Spanish flu’ are collected as non-preferred terms. The preferred terms are ‘influenza’ or ‘influenza pandemic, 1918-1919’. Although the controlled vocabularies are clear in their choice, most of the literature about the ‘1918 pandemic’ is retrievable by either ‘Spanish’ or ‘1918’ separately, and the dominant word for retrieving literature is ‘Spanish’ rather than ‘1918’. This is surprising considering the existence of suitable controlled vocabularies related to health topics, and the modern World Health Organization guidelines on the naming of diseases, which point to other preferred terms. A first conclusion is that controlled vocabularies are not being used effectively in a field such as health, and consequently in WoS and Scopus. This research opens further research questions about the role that controlled vocabularies play in the instructions that journals give to authors.
Keywords: Controlled vocabularies, indexing, 1918 influenza, information retrieval, keywords, 1918 pandemic, scientific databases.
Downloads 427
4160 Application of a Novel Audio Compression Scheme in Automatic Music Recommendation, Digital Rights Management and Audio Fingerprinting
Authors: Anindya Roy, Goutam Saha
Abstract:
Rapid progress in audio compression technology has contributed to the explosive growth of music available in digital form today. In a reversal of ideas, this work makes use of a recently proposed efficient audio compression scheme to develop three important applications in the context of Music Information Retrieval (MIR) for the effective manipulation of large music databases, namely automatic music recommendation (AMR), digital rights management (DRM) and audio finger-printing for song identification. The performance of these three applications has been evaluated with respect to a database of songs collected from a diverse set of genres.
Keywords: Audio compression, Music Information Retrieval, Digital Rights Management, Audio Fingerprinting.
Downloads 1540
4159 Developing the Color Temperature Histogram Method for Improving the Content-Based Image Retrieval
Authors: P. Phokharatkul, S. Chaisriya, S. Somkuarnpanit, S. Phaiboon, C. Kimpan
Abstract:
This paper proposes a new method for image searching and image indexing in databases using a color temperature histogram. The color temperature histogram can be used to improve the performance of content-based image retrieval by combining color temperature and histogram. The color temperature histogram can be represented by a range of 46 colors, which is more than the color histogram and the dominant color temperature. Moreover, with our method colors that have the same color temperature can be separated, while with the dominant color temperature they cannot. The results showed that the color temperature histogram retrieved an accurate image more often than the dominant color temperature method or the color histogram method. It also took less time, so the color temperature histogram can be used for indexing and searching for images.
Keywords: Color temperature histogram, color temperature, image retrieval and content-based image retrieval.
Downloads 2453
4158 Enhancing Retrieval Effectiveness of Malay Documents by Exploiting Implicit Semantic Relationship between Words
Authors: Mohd Pouzi Hamzah, Tengku Mohd Tengku Sembok
Abstract:
Phrases have a long history in information retrieval, particularly in commercial systems. The implicit semantic relationship between words in the form of BaseNP has shown significant improvement in terms of precision in many IR studies. Our research focuses on linguistic phrases, which are language dependent. Our results show that using BaseNP can improve performance even though more than 62% of word formation in the Malay language is based on derivational affixes and suffixes.
Keywords: Information Retrieval, Malay Language, Semantic Relationship, Retrieval Effectiveness, Conceptual Indexing.
Downloads 1428
4157 Image Retrieval: Techniques, Challenge, and Trend
Authors: Hui Hui Wang, Dzulkifli Mohamad, N.A Ismail
Abstract:
This paper attempts to discuss the evolution of retrieval techniques, focusing on the development, challenges and trends of image retrieval. It highlights both the already addressed and the outstanding issues. The explosive growth of image data leads to the need for research and development in image retrieval. Image retrieval research is moving from keywords to low-level features and on to semantic features. The drive towards semantic features is due to the problem that keywords can be very subjective and time consuming, while low-level features cannot always describe the high-level concepts in the user's mind.
Keywords: content based image retrieval, keyword based image retrieval, semantic gap, semantic image retrieval.
Downloads 2524
4156 A System to Integrate and Manipulate Protein Database Using BioPerl and XML
Authors: Zurinahni Zainol, Rosalina Abdul Salam, Rosni Abdullah, Nur'Aini, Wahidah Husain
Abstract:
The size, complexity and number of databases used for protein information have caused bioinformatics to lag behind in adapting to the need to handle this distributed information. Integrating all the information from different databases into one database is a challenging problem. Our main research aim is to develop a tool that can be used to access and manipulate protein information from different databases. In our approach, we have integrated different databases such as Swiss-Prot, PDB, InterPro, and EMBL and transformed these databases from flat file format into relational form using XML and BioPerl. As a result, we show that this tool can search protein information of different sizes stored in a relational database and that results can be retrieved faster than from a flat file database. A web-based user interface is provided to allow users to access or search for protein information in the local database.
Keywords: Protein sequence database, relational database, integrated database.
Downloads 1443
4155 Data Extraction of XML Files using Searching and Indexing Techniques
Authors: Sushma Satpute, Vaishali Katkar, Nilesh Sahare
Abstract:
XML files contain data in a well-formatted manner. Studying the format or semantics of the grammar is helpful for fast retrieval of the data. There are many algorithms that describe how to search for data in XML files. A number of approaches use the data structure or are related to the contents of the document. In these cases the user must know the structure of the document, and information retrieval techniques using NLP are related to the content of the document. Hence the result may be irrelevant or not very successful, and the search may take more time. This paper presents fast XML retrieval techniques using a new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns a value to each tag in the file. To query the system, the user is not constrained to a fixed query format.
Keywords: XML Retrieval, Indexed Search, Information Retrieval.
Downloads 1783
4154 Selection of Relevant Servers in Distributed Information Retrieval System
Authors: Benhamouda Sara, Guezouli Larbi
Abstract:
Nowadays, the dissemination of information touches the distributed world, where selecting the servers relevant to a user request is an important problem in distributed information retrieval. During the last decade, several research studies on this issue have been launched to find optimal solutions, and many approaches to collection selection have been proposed. In this paper, we propose a new collection selection approach that takes into consideration the number of documents in a collection that contain terms of the query and the weights of those terms in these documents. We tested our method, and our studies show that this technique can compete with the other state-of-the-art algorithms that we chose for testing the performance of our approach.
Keywords: Distributed information retrieval, relevance, server selection, collection selection.
Downloads 1378
4153 Query Optimization Techniques for XML Databases
Authors: Su Cheng Haw, G. S. V. Radha Krishna Rao
Abstract:
Over the past few years, XML (eXtensible Mark-up Language) has emerged as the standard for information representation and data exchange over the Internet. This paper provides a kick-start for new researchers venturing into the XML database field. We survey the storage representations for XML documents and review XML query processing and optimization techniques with respect to the particular storage instance. Various optimization technologies have been developed to solve the query retrieval and updating problems. In more recent years, most researchers have proposed hybrid optimization techniques. A hybrid system opens the possibility of covering each technology's weaknesses with the strengths of another. This paper reviews the advantages and limitations of optimization techniques.
Keywords: indexing, labeling scheme, query optimization, XML storage.
Downloads 2038
4152 Composite Relevance Feedback for Image Retrieval
Authors: Pushpa B. Patil, Manesh B. Kokare
Abstract:
This paper presents a content-based image retrieval (CBIR) framework with relevance feedback (RF) based on combined learning of support vector machines (SVM) and AdaBoost. The framework incorporates only the most relevant images obtained from both learning algorithms. To speed up the system, it removes from the database the irrelevant images returned by the SVM learner. This is the key to achieving effective retrieval performance in terms of time and accuracy. The experimental results show that this framework yields a significant improvement in retrieval effectiveness, which can ultimately improve retrieval performance.
Keywords: Image retrieval, relevance feedback, wavelet transform.
Downloads 1993
4151 Query Reformulation Guided by External Resource for Information Retrieval
Authors: Mohammed El Amine Abderrahim
Abstract:
Reformulating the user query is a technique that aims to improve the performance of an Information Retrieval System (IRS) in terms of precision and recall. This paper evaluates the technique of query reformulation guided by an external resource for Arabic texts. To do this, various precision and recall measurements were carried out, using two corpora with different external resources such as Arabic WordNet (AWN) and the Arabic Dictionary (thesaurus) of Meaning (ADM). Examination of the obtained results allows us to measure the real contribution of this reformulation technique to improving IRS performance.
Keywords: Arabic NLP, Arabic Information Retrieval, Arabic WordNet, Query Expansion.
Downloads 1401