Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 86

Search results for: Information retrieval

86 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in data-driven analysis in major areas of medicine, such as clinical decision support systems, survival analysis, patient similarity analysis, and image analytics. Most of the data in the field are well-structured and available in numerical or categorical formats that can be used for experiments directly. At the opposite end of the spectrum, however, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature. Such data takes the form of discharge summaries, clinical notes, and procedural notes, which are written as human narrative and have neither a relational model nor a standard grammatical structure. An important step in using these texts for such studies is to transform and process the data so that structured information can be retrieved from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map, a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique based on a string matching algorithm indexed on curated knowledge sources, which is both fast and configurable. The authors also briefly compare its performance with that of MetaMap, one of the most reputed tools for medical concept retrieval, and present the advantages the former displays over the latter.
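
As an illustration only, the idea of string matching against an indexed knowledge source can be sketched as a greedy longest-match dictionary lookup. This is not the Q-Map algorithm itself, whose details the abstract does not give; the phrase table, the concept identifiers, and the function name `extract_concepts` are hypothetical placeholders.

```python
import re

# Hypothetical mini knowledge source: normalized phrase -> concept identifier.
# A real system would index a curated vocabulary; these IDs are placeholders.
KNOWLEDGE_SOURCE = {
    "chest pain": "CONCEPT_001",
    "shortness of breath": "CONCEPT_002",
    "diabetes": "CONCEPT_003",
}

# Longest phrase (in tokens) that the index can contain.
MAX_PHRASE_LEN = max(len(p.split()) for p in KNOWLEDGE_SOURCE)


def extract_concepts(text):
    """Greedy longest-match lookup of token n-grams in the phrase index."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        matched_len = 0
        # Try the longest candidate phrase first so multiword terms win.
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in KNOWLEDGE_SOURCE:
                found.append((phrase, KNOWLEDGE_SOURCE[phrase]))
                matched_len = n
                break
        # Skip past a match, or advance one token when nothing matched.
        i += matched_len if matched_len else 1
    return found
```

Calling `extract_concepts` on a free-text note returns the matched phrases with their placeholder identifiers in the order they occur; because the lookup is a hash-table probe per n-gram, the cost grows linearly with the note length for a fixed maximum phrase length.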

Keywords: Information retrieval (IR), unified medical language system (UMLS), Syntax Based Analysis, natural language processing (NLP), medical informatics.

Downloads: 251
85 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own specific clustering algorithm. In this context, two algorithms have been proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. A main problem encountered in data mining applications is clustering categorical data, which is highly relevant in many datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we experiment with an approach based on this idea, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The obtained results show that the proposed method outperforms the binary method in all cases.
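
The transformation described above can be sketched as follows, under the assumption that "relative frequency of each modality" means the share of rows in which a value occurs within its attribute; the paper's exact definition may differ, and the function name `relative_frequency_encode` is hypothetical.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace each categorical value by its relative frequency in its column."""
    n = len(rows)
    # Transpose the rows so each attribute (column) can be counted on its own.
    columns = list(zip(*rows))
    counts = [Counter(col) for col in columns]
    return [[counts[j][value] / n for j, value in enumerate(row)] for row in rows]


rows = [("red", "S"), ("red", "M"), ("blue", "S"), ("red", "S")]
encoded = relative_frequency_encode(rows)
# "red" occurs in 3 of 4 rows, so it becomes 0.75; "blue" becomes 0.25.
```

The resulting numeric rows can then be handed to any standard k-means implementation. Note that two different modalities with the same frequency map to the same number, which is a known limitation of this kind of encoding.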

Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.

Downloads: 2771
84 Domain Knowledge Representation through Multiple Sub Ontologies: An Application Interoperability

Authors: Sunitha Abburu, Golla Suresh Babu

Abstract:

The issues that limit application interoperability are the lack of a common vocabulary, a common structure, and application domain knowledge. Ontology-based semantic technology provides solutions that resolve these interoperability issues. Ontology is broadly used in diverse applications such as artificial intelligence, bioinformatics, biomedicine, and information integration. Ontology can be used to interpret the knowledge of various domains. To reuse and enrich the available ontologies and to reduce the duplication of ontologies of the same domain, there is a strong need to integrate the ontologies of a particular domain. The integrated ontology gives complete knowledge about the domain when this comprehensive domain ontology is shared among the groups. According to the literature survey, there is no well-defined methodology to represent the knowledge of a whole domain. The current research presents a systematic methodology for knowledge representation using multiple sub-ontologies at different levels, which addresses application interoperability and enables semantic information retrieval. The method represents complete knowledge of a domain by importing concepts from multiple sub-ontologies of the same and related domains, which reduces ontology duplication, rework, and implementation cost through ontology reusability.

Keywords: Knowledge acquisition, knowledge representation, knowledge transfer, ontologies, semantics.

Downloads: 345
83 An Improvement of Multi-Label Image Classification Method Based on Histogram of Oriented Gradient

Authors: Ziad Abdallah, Mohamad Oueidat, Ali El-Zaart

Abstract:

Image Multi-label Classification (IMC) assigns a label or a set of labels to an image. The high demand for image annotation and archiving on the web has led researchers to develop many algorithms for this application domain. The existing techniques for IMC have two drawbacks: the description of the elementary characteristics of the image and the correlation between labels are not taken into account. In this paper, we present an algorithm (MIML-HOGLPP) that handles both limitations simultaneously. The algorithm uses the histogram of gradients as a feature descriptor and applies the Label Priority Power-set as a multi-label transformation to solve the problem of label correlation. The experiment shows that MIML-HOGLPP gives better results on some of the evaluation metrics compared with the two existing techniques.

Keywords: Data mining, information retrieval system, multi-label, problem transformation, histogram of gradients.

Downloads: 829
82 Development of Fuzzy Logic Control Ontology for E-Learning

Authors: Muhammad Sollehhuddin A. Jalil, Mohd Ibrahim Shapiai, Rubiyah Yusof

Abstract:

Nowadays, ontology is common in many areas such as artificial intelligence, bioinformatics, e-commerce, and education. Ontology is one of the focus areas in the field of Information Retrieval. The purpose of an ontology is to describe a conceptual representation of concepts and their relationships within a particular domain. In other words, an ontology provides a common vocabulary for anyone who needs to share information in the domain. There are several ontology domains in various fields, covering both engineering and non-engineering knowledge. However, only a few ontologies are available for engineering knowledge, and fuzzy logic, as engineering knowledge, is not yet available as an ontology domain. In general, fuzzy logic requires step-by-step guidelines and instructions for lab experiments. In this study, we present a domain ontology for Fuzzy Logic Control (FLC) knowledge. We provide a Table of Contents (ToC) with a middle strategy, based on the Uschold and King method, to develop the FLC ontology. The proposed framework is developed using Protégé as the ontology tool. Protégé's ontology reasoner, known as the Pellet reasoner, is then used to validate the presented framework. The presented framework offers better performance based on the consistency and classification parameter index. In general, this ontology can provide a platform for anyone who needs to understand FLC knowledge.

Keywords: Engineering knowledge, fuzzy logic control ontology, ontology development, table of contents.

Downloads: 737
81 AINA: Disney Animation Information as Educational Resources

Authors: Piedad Garrido, Fernando Repulles, Andy Bloor, Julio A. Sanguesa, Jesus Gallardo, Vicente Torres, Jesus Tramullas

Abstract:

With the emergence and development of Information and Communications Technologies (ICTs), Higher Education is experiencing rapid changes, not only in its teaching strategies but also in students' learning skills. However, we have noticed that students often have difficulty when seeking innovative, useful, and interesting learning resources for their work. This is due to the lack of supervision in the selection of good query tools. This paper presents AINA, an Information Retrieval (IR) computer system aimed at providing motivating and stimulating content to both students and teachers working in different areas and at different educational levels. In particular, our proposal consists of an open virtual resource environment oriented to the vast universe of Disney comics and cartoons. Our test suite includes Disney's feature-length and short films, and we have performed some activities based on the Just In Time Teaching (JiTT) methodology. More specifically, it has been tested by groups of university and secondary school students.

Keywords: Information retrieval, animation, educational resources, JiTT.

Downloads: 742
80 Algorithm for Information Retrieval Optimization

Authors: Kehinde K. Agbele, Kehinde Daniel Aruleba, Eniafe F. Ayetiran

Abstract:

When using Information Retrieval Systems (IRS), users often present search queries made of ad-hoc keywords. It is then up to the IRS to obtain a precise representation of the user’s information need and the context of the information. This paper investigates optimization of IRS to individual information needs in order of relevance. The study addressed development of algorithms that optimize the ranking of documents retrieved from IRS. This study discusses and describes a Document Ranking Optimization (DROPT) algorithm for information retrieval (IR) in an Internet-based or designated databases environment. Conversely, as the volume of information available online and in designated databases is growing continuously, ranking algorithms can play a major role in the context of search results. In this paper, a DROPT technique for documents retrieved from a corpus is developed with respect to document index keywords and the query vectors. This is based on calculating the weight (

Keywords: Internet ranking,

Downloads: 1041
79 Augmented Reality for Maintenance Operator for Problem Inspections

Authors: Chong-Yang Qiao, Teeravarunyou Sakol

Abstract:

Current production-oriented factories need maintenance operators to work in shifts, monitoring and inspecting complex systems and different equipment in the event of mechanical breakdown. Augmented reality (AR) is an emerging technology that embeds data into the environment for situation awareness to help maintenance operators make decisions and solve problems. An application was designed to identify problems in steam generators and to inspect centrifugal pumps. The objective of this research was to find the best AR medium and the best type of problem-solving strategy among analogy, the focal object method, and means-ends analysis. The two leakage inspection scenarios were temperature and vibration. Two experiments were used for usability evaluation and future innovation, covering the decision-making process and the problem-solving strategy. This study found that maintenance operators prefer a built-in magnifier to zoom in on components (55.6%), a 3D exploded view to track the problem parts (50%), and a line chart to find altered data or information (61.1%). There is a significant difference in the use of the analogy (44.4%), focal object (38.9%), and means-ends (16.7%) strategies. The marked differences between maintainers and operators lie in the application of a problem-solving strategy. Future work should explore multimedia information retrieval that supports maintenance operators in decision-making.

Keywords: Augmented reality, situation awareness, decision-making, problem-solving.

Downloads: 851
78 Domain Driven Design vs Soft Domain Driven Design Frameworks

Authors: Mohammed Salahat, Steve Wade

Abstract:

This paper presents the SSDDD "Systematic Soft Domain Driven Design Framework" and compares it with the DDD "Domain Driven Design Framework" as a soft systems approach to information systems development. The framework uses SSM as a guiding methodology within which we have embedded a sequence of design tasks based on the UML, leading to the implementation of a software system using the Naked Objects framework. This framework has been used in action research projects that have involved the investigation and modelling of business processes using object-oriented domain models and the implementation of software systems based on those domain models. Within this framework, Soft Systems Methodology (SSM) is used as a guiding methodology to explore the problem situation and to develop the domain model using UML for the given business domain. The framework was proposed and evaluated in our previous works. This paper presents a comparison between SSDDD and DDD to show how SSDDD improves on DDD as an approach to modelling and implementing business domain perspectives for Information Systems Development. The comparison process, the results, and the improvements are presented in the following sections of this paper.

Keywords: SSM, UML, domain-driven design, soft domain-driven design, naked objects, soft language, information retrieval, multimethodology.

Downloads: 1412
77 Selection of Relevant Servers in Distributed Information Retrieval System

Authors: Benhamouda Sara, Guezouli Larbi

Abstract:

Nowadays, the dissemination of information touches the distributed world, where selecting the servers relevant to a user request is an important problem in distributed information retrieval. During the last decade, several research studies on this issue have been launched to find optimal solutions, and many collection selection approaches have been proposed. In this paper, we propose a new collection selection approach that takes into consideration the number of documents in a collection that contain terms of the query and the weights of those terms in these documents. We tested our method, and our studies show that this technique can compete with the state-of-the-art algorithms that we chose as baselines for evaluating our approach.
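
A minimal sketch of how a collection could be scored from these two signals is given below. The abstract does not state the scoring formula, so the one used here (summing, for each query term, the weights of that term over all documents that contain it) is purely an assumption for illustration, as are the names `score_collection` and `rank_servers` and the sample data.

```python
def score_collection(collection, query_terms):
    """Score one collection; `collection` is a list of {term: weight} documents.

    Assumed formula: for each query term, add up its weights over the documents
    that contain it. More matching documents and higher weights both raise the score.
    """
    score = 0.0
    for term in query_terms:
        weights = [doc[term] for doc in collection if term in doc]
        score += sum(weights)
    return score


def rank_servers(servers, query_terms):
    """Return server names ordered from most to least relevant to the query."""
    return sorted(
        servers,
        key=lambda name: score_collection(servers[name], query_terms),
        reverse=True,
    )


# Hypothetical servers, each holding documents as term -> weight mappings.
servers = {
    "A": [{"retrieval": 0.5, "index": 0.25}, {"retrieval": 0.25}],
    "B": [{"index": 0.5}],
    "C": [{"music": 1.0}],
}
ranking = rank_servers(servers, ["retrieval", "index"])
```

A broker would then forward the query only to the top few servers in `ranking` instead of to every server.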

Keywords: Distributed information retrieval, relevance, server selection, collection selection.

Downloads: 936
76 Genetic Algorithms for Feature Generation in the Context of Audio Classification

Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes

Abstract:

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning aims to learn the transformation of raw data into a representation that is useful for machine learning tasks. In automatic audio classification tasks, this is interesting since audio, usually complex information, needs to be transformed into a computationally convenient input. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have been used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, such as PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.

Keywords: Feature generation, feature learning, genetic algorithm, music information retrieval.

Downloads: 529
75 Leveraging Quality Metrics in Voting Model Based Thread Retrieval

Authors: Atefeh Heydari, Mohammadali Tavakoli, Zuriati Ismail, Naomie Salim

Abstract:

Seeking and sharing knowledge on online forums has made them popular in recent years. Although online forums are valuable sources of information, retrieving reliable threads with high-quality content is an issue because of the variety of message sources. The majority of existing information retrieval systems ignore the quality of retrieved documents, particularly in the field of thread retrieval. In this research, we present an approach that employs various quality features in order to investigate the quality of retrieved threads. Different aspects of content quality, including completeness, comprehensiveness, and politeness, are assessed using these features, which leads to finding threads that are not only textually but also conceptually relevant to a user query within a forum. To analyse the influence of the features, we used an adapted version of voting model thread search as a retrieval system. We equipped it with each feature on its own and with various combinations of features in turn during multiple runs. The results show that incorporating the quality features significantly enhances the effectiveness of the retrieval system used.

Keywords: Content quality, Forum search, Thread retrieval, Voting techniques.

Downloads: 1258
74 Personalization of Web Search Using Web Page Clustering Technique

Authors: Amol Bapuso Rajmane, Pradeep M. Patil, Prakash J. Kulkarni

Abstract:

The Information Retrieval community is facing the problem of effective representation of Web search results. When web search results are organized into clusters, it becomes easy for users to browse through them quickly. Traditional search engines organize search results into clusters for ambiguous queries, with each cluster representing one meaning of the query. The clusters are obtained according to the topical similarity of the retrieved search results, but it is possible for results to be totally dissimilar and still correspond to the same meaning of the query. People search is also one of the most common tasks on the Web nowadays, but when a particular person's name is queried, search engines return web pages related to different persons who share the queried name. This places the burden of disambiguating and collecting the pages relevant to a particular person on the user. In this paper, we have developed an approach that clusters web pages based on their association with different people, with clusters that are based on generic entity search.

Keywords: Entity resolution, information retrieval, graph based disambiguation, web people search, clustering.

Downloads: 860
73 Business Domain Modelling Using an Integrated Framework

Authors: Mohammed Salahat, Steve Wade

Abstract:

This paper presents an application of a "Systematic Soft Domain Driven Design Framework" as a soft systems approach to domain-driven design in information systems development. The framework uses SSM as a guiding methodology within which we have embedded a sequence of design tasks based on the UML, leading to the implementation of a software system using the Naked Objects framework. This framework has been used in action research projects that have involved the investigation and modelling of business processes using object-oriented domain models and the implementation of software systems based on those domain models. Within this framework, Soft Systems Methodology (SSM) is used as a guiding methodology to explore the problem situation and to develop the domain model using UML for the given business domain. The framework was proposed and evaluated in our previous works, and in this paper a real case study, "Information Retrieval System for academic research", is used to show further practice and evaluation of the framework in a different business domain. We argue that there are advantages to combining and using techniques from different methodologies in this way for business domain modelling. The framework is overviewed and justified as a multimethodology using Mingers' multimethodology ideas.

Keywords: SSM, UML, domain-driven design, soft domain-driven design, naked objects, soft language, information retrieval, multimethodology.

Downloads: 1330
72 A Novel Framework for User-Friendly Ontology-Mediated Access to Relational Databases

Authors: Efthymios Chondrogiannis, Vassiliki Andronikou, Efstathios Karanastasis, Theodora Varvarigou

Abstract:

A large amount of data is typically stored in relational databases (DB). The latter can efficiently handle user queries that are intended to elicit the appropriate information from data sources. However, direct access to and use of this data requires end users to have an adequate technical background, and they must also cope with the internal data structure and the values presented. Consequently, information retrieval is quite a difficult process even for IT or DB experts, given the limited contribution of relational databases from the conceptual point of view. Ontologies enable users to formally describe a domain of knowledge in terms of concepts and the relations among them, and hence they can be used to specify unambiguously the information captured by the relational database. However, accessing information residing in a database using ontologies is feasible only if the users are keen on using semantic web technologies. To enable users from different disciplines to retrieve the appropriate data, the design of a Graphical User Interface is necessary. In this work, we present an interactive, ontology-based, semantically enabled web tool that can be used for information retrieval purposes. The tool is based entirely on the ontological representation of the underlying database schema, and it provides a user-friendly environment through which users can graphically form and execute their queries.

Keywords: Ontologies, Relational Databases, SPARQL, Web Interface.

Downloads: 1532
71 Semantic Indexing Approach of a Corpora Based On Ontology

Authors: Mohammed Erritali

Abstract:

The growth over centuries in the volume of text data, such as books and articles in libraries, has made it necessary to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing, and the use of classification categories marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (a corpus), allowing a user to find the ones relevant to him, that is to say, the contents that match the user's information needs. This paper presents a new semantic indexing approach for a documentary corpus. The indexing process starts with a term weighting phase to determine the importance of the terms in the documents. Then the use of a thesaurus such as WordNet allows moving to the conceptual level. Each candidate concept is evaluated by determining how well it represents the document, that is to say, the importance of the concept in relation to the other concepts of the document. Finally, the semantic index is constructed by attaching to each concept of the ontology the documents of the corpus in which that concept is found.

Keywords: Semantic, indexing, corpora, WordNet, ontology.

Downloads: 1025
70 Role of Natural Language Processing in Information Retrieval; Challenges and Opportunities

Authors: Khaled M. Alhawiti

Abstract:

This paper aims to analyze the role of natural language processing (NLP). The paper discusses this role in the context of automated data retrieval, automated question answering, and text structuring. NLP techniques are gaining wider acceptance in real-life applications and industrial concerns. There are various complexities involved in processing natural language text so that it satisfies the needs of decision makers. This paper begins with a description of the qualities of NLP practices. The paper then focuses on the challenges in natural language processing. The paper also discusses major techniques of NLP. The last section describes opportunities and challenges for future research.

Keywords: Data Retrieval, Information retrieval, Natural Language Processing, Text Structuring.

Downloads: 2213
69 Information Retrieval: A Comparative Study of Textual Indexing Using an Oriented Object Database (db4o) and the Inverted File

Authors: Mohammed Erritali

Abstract:

The growth over centuries in the volume of text data, such as books and articles in libraries, has made it necessary to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing, and the use of classification categories marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (a corpus), allowing a user to find the ones relevant to him, that is to say, the contents that match the user's information needs. Most models of information retrieval use a specific data structure to index a corpus, called an "inverted file" or "reverse index". This inverted file collects information on all terms across the corpus documents, specifying the identifiers of the documents that contain the term in question, the frequency of each term in the documents of the corpus, the positions of the occurrences of the word, and so on. In this paper, we use an object-oriented database (db4o) instead of the inverted file; that is to say, instead of searching for a term in the inverted file, we search for it in the db4o database. The purpose of this work is to carry out a comparative study to see whether object-oriented databases can compete with the inverted index in terms of access speed and resource consumption on a large volume of data.
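
For readers unfamiliar with the structure, a minimal in-memory inverted file of the kind described above can be sketched as follows. It records, for each term, the documents that contain it and the positions of its occurrences, from which the term frequency follows. The whitespace tokenization and the function names are simplifying assumptions, and the sketch says nothing about db4o.

```python
from collections import defaultdict


def build_inverted_file(corpus):
    """Map each term to {doc_id: [positions]} for a corpus of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in corpus.items():
        # Naive tokenization: lowercase and split on whitespace.
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


def lookup(index, term):
    """Return {doc_id: term frequency} for the documents containing the term."""
    postings = index.get(term.lower(), {})
    return {doc_id: len(positions) for doc_id, positions in postings.items()}


corpus = {
    1: "information retrieval indexes information",
    2: "inverted file index",
}
index = build_inverted_file(corpus)
```

Here `lookup(index, "information")` reports that document 1 contains the term twice, while a term absent from the corpus yields an empty result.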

Keywords: Information Retrieval, indexation, oriented object database (db4o), inverted file.

Downloads: 1311
68 A Comparative Analysis of Different Web Content Mining Tools

Authors: T. Suresh Kumar, M. Arthanari, N. Shanthi

Abstract:

Nowadays, the Web has become one of the most pervasive platforms for information exchange and retrieval. It collects the suitable and well-fitting information that one requires from websites. Data mining is a way of extracting the data available on the Internet. Web mining is one of the techniques of data mining, and it relates to various research communities such as information retrieval, database management systems, and artificial intelligence. In this paper, we discuss the concepts of Web mining. We have mainly focused on one of the categories of Web mining, namely Web Content Mining, and its various tasks. Mining tools are essential for scanning the many images, texts, and HTML documents, and the results are then used by various search engines. We conclude by presenting a comparative table of these tools based on some pertinent criteria.

Keywords: Data Mining, Web Mining, Web Content Mining, Mining Tools, Information retrieval.

Downloads: 2843
67 Comparative Analysis of Different Page Ranking Algorithms

Authors: S. Prabha, K. Duraiswamy, J. Indhumathi

Abstract:

Search engines play an important role on the Internet in retrieving the relevant documents from among the huge number of web pages. However, a search engine retrieves a large number of documents, all of which are relevant to the search topics. To retrieve the most meaningful documents related to the search topics, a ranking algorithm is used in information retrieval. Ranking the retrieved documents is one of the issues in data mining and one of the practical problems in information retrieval. This paper covers various page ranking and page segmentation algorithms and compares the algorithms used for information retrieval. Diverse Page Rank based algorithms, such as Page Rank (PR), Weighted Page Rank (WPR), Weight Page Content Rank (WPCR), Hyperlink-Induced Topic Search (HITS), Distance Rank, Eigen Rumor, Distance Rank Time Rank, Tag Rank, Relational Based Page Rank, and Query Dependent Ranking algorithms, are discussed and compared.
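
As a rough illustration of the basic Page Rank (PR) idea only, and not of any variant as evaluated in the paper, the following sketch runs the standard power iteration on a tiny link graph. The damping factor of 0.85, the iteration count, and the sample graph are illustrative assumptions; every link target is assumed to appear as a key of `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank; `links` maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the same "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page without outlinks spreads its rank evenly over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
```

In this small graph, page C collects links from both A and B and ends up with the highest score, followed by A and then B.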

Keywords: Information Retrieval, Web Page Ranking, search engine, web mining, page segmentations.

Downloads: 3486
66 Fuzzy C-Means Clustering for Biomedical Documents Using Ontology Based Indexing and Semantic Annotation

Authors: S. Logeswari, K. Premalatha

Abstract:

Search is the most obvious application of information retrieval. The variety of widely obtainable biomedical data is enormous and is expanding fast. Because of this expansion, existing techniques are no longer sufficient to extract the most interesting patterns from the collection according to user requirements. Recent research concentrates more on semantic-based searching than on traditional term-based searching. Algorithms for semantic searches are implemented based on the relations that exist between the words of the documents. Ontologies are used as domain knowledge for identifying the semantic relations as well as for structuring the data for effective information retrieval. Annotating data with ontology concepts is one of the most widespread practices for clustering documents. In this paper, concept-based indexing and annotation are proposed for clustering biomedical documents. The fuzzy c-means (FCM) clustering algorithm is used to cluster the documents. The performance of the proposed methods is compared with traditional term-based clustering on PubMed articles from five different disease communities. The experimental results show that the proposed methods outperform term-based fuzzy clustering.
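
The two alternating update steps of standard fuzzy c-means can be sketched as below. This is the textbook formulation, not the paper's pipeline: document vectors (for example concept weights) are assumed to be given as tuples of floats, the fuzzifier `m = 2.0` is an illustrative default, and the function names are hypothetical.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return one membership row per point; each row sums to 1 across clusters."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        zeros = [d == 0.0 for d in dists]
        if any(zeros):
            # A point lying exactly on a center is shared only among those centers.
            rows.append([1.0 / sum(zeros) if z else 0.0 for z in zeros])
        else:
            rows.append(
                [1.0 / sum((di / dj) ** exponent for dj in dists) for di in dists]
            )
    return rows


def fcm_centers(points, memberships, m=2.0):
    """Recompute each center as the membership-weighted mean of the points."""
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        ))
    return centers


points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
memberships = fcm_memberships(points, centers)
new_centers = fcm_centers(points, memberships)
```

A full run would repeat these two steps until the centers stop moving. Unlike hard k-means, each document keeps a graded membership in every cluster, which suits articles that touch on more than one topic.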

Keywords: MeSH Ontology, Concept Indexing, Annotation, semantic relations, Fuzzy c-means.

Downloads: 1918
65 Relevance Feedback within CBIR Systems

Authors: Mawloud Mosbah, Bachir Boucheham

Abstract:

We present here the results of a comparative study of some techniques available in the literature that relate to the relevance feedback mechanism in the case of short-term learning. Only one of the methods considered here, the K-nearest neighbors algorithm (KNN), belongs to the data mining field, while the rest relate purely to the information retrieval field and fall under the following three major axes: query shifting, feature weighting, and optimization of the parameters of the similarity metric. As a contribution, in addition to the comparison, we propose a new version of the KNN algorithm, referred to as incremental KNN, which differs from the original version in that, besides the influence of the seeds, the rating of the current target image is also influenced by the images already rated. The results presented here were obtained from experiments conducted on the Wang database for one iteration, using color moments on the RGB space. This compact descriptor, Color Moments, is adequate for the efficiency needed in interactive systems. The results obtained allow us to claim that the proposed algorithm gives good results; it even outperforms a wide range of techniques available in the literature.

Keywords: CBIR, Category Search, Relevance Feedback (RFB), Query Point Movement, Standard Rocchio’s Formula, Adaptive Shifting Query, Feature Weighting, Optimization of the Parameters of Similarity Metric, Original KNN, Incremental KNN.

Downloads: 1966
64 Identification of Coauthors in Scientific Database

Authors: Thiago M. R Dias, Gray F. Moita

Abstract:

The analysis of scientific collaboration networks has contributed significantly to the understanding of how collaboration between researchers takes place and of how the scientific production of researchers or research groups evolves. However, identifying collaborations in large scientific databases is not a trivial task, given the high computational cost of the methods commonly used. This paper proposes a method for identifying collaborations in a large database of researchers' curricula. The proposed method has a low computational cost and gives satisfactory results, making it an interesting alternative for the modeling and characterization of large scientific collaboration networks.

Keywords: Extraction and data integration, Information Retrieval, Scientific Collaboration.

Downloads: 1326
63 A Review on Important Aspects of Information Retrieval

Authors: Yogesh Gupta, Ashish Saini, A.K. Saxena

Abstract:

Information retrieval has become an important field of study and research within computer science due to the explosive growth of information available in the form of full text, hypertext, administrative text, directory, numeric or bibliographic text. Research is ongoing on various aspects of information retrieval systems to improve their efficiency and reliability. This paper presents a comprehensive study that discusses not only the emergence and evolution of information retrieval but also different information retrieval models and some important aspects such as document representation, similarity measures and query expansion.
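
To make the notions of document representation and similarity measure concrete, here is a small sketch of cosine similarity in the vector space model, using raw term frequencies as weights. The whitespace tokenizer and the function name `cosine_similarity` are simplifying assumptions for illustration; practical systems typically add stemming, stop-word removal and TF-IDF weighting.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two term-frequency vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Only terms shared by both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


query = "query expansion model"
docs = ["vector space model and query expansion", "water pipeline systems"]
# Rank documents by decreasing similarity to the query.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, d), reverse=True)
```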

Keywords: Information Retrieval, query expansion, similarity measure, vector space model.

Downloads: 2402
62 A Review of Genetic Algorithm Optimization: Operations and Applications to Water Pipeline Systems

Authors: I. Abuiziah, N. Shakarneh

Abstract:

Genetic Algorithm (GA) is a powerful technique for solving optimization problems. It follows the idea of survival of the fittest: better and better solutions evolve from previous generations until a near-optimal solution is obtained. GA uses three main operations, selection, crossover and mutation, to produce new generations from the old ones. GA has been widely used to solve optimization problems in many applications such as the traveling salesman problem, airport traffic control, information retrieval (IR), reactive power optimization, job shop scheduling, and hydraulic systems such as water pipeline systems. In water pipeline systems, some goals need to be achieved optimally, such as minimum construction cost, minimum pipe lengths and diameters, and the placement of protection devices. GA shows high performance compared with other optimization techniques; moreover, it is easy to implement and use. It also searches only a limited number of solutions.
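
The three operations described above can be sketched in a few lines. The example below is a generic illustration on bit strings, not a pipeline model: tournament selection, one-point crossover, bit-flip mutation, and keeping the best individual (elitism) are assumed design choices, and the toy fitness function simply counts 1-bits.

```python
import random


def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=60,
                      mutation_rate=0.02, seed=0):
    """Maximize fitness over bit strings (illustrative sketch)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]

    def select():
        # Tournament selection: the fitter of two random individuals wins.
        a, b = rng.choice(pop), rng.choice(pop)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        children = [max(pop, key=fitness)[:]]
        while len(children) < pop_size:
            p1, p2 = select(), select()
            # One-point crossover: head of one parent, tail of the other.
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)


# Toy objective: maximize the number of 1-bits in the string.
best = genetic_algorithm(fitness=sum)
```

For a pipeline design problem, the bit string would instead encode choices such as pipe diameters, and the fitness function would score cost and hydraulic constraints.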

Keywords: Genetic Algorithm, optimization, pipeline systems, selection, crossover.

Downloads: 4669
61 ARCS for Critical Information Retrieval Development

Authors: Suttipong Boonphadung

Abstract:

The research on ARCS for critical information retrieval development aimed to (1) investigate the state of the critical information retrieval skill of Mathematics pre-service teachers before applying the ARCS model in learning activities, (2) study and analyze the development of the critical information retrieval skill of the Mathematics pre-service teachers after using the ARCS model in learning activities, and (3) evaluate the Mathematics pre-service teachers' satisfaction with using the ARCS model in learning activities as a tool to develop critical information retrieval skill. Forty-one 4th-year Mathematics pre-service teachers who were enrolled in the subject Research for Learning Development in semester 2 of 2012 were purposively selected as the research cohort. The research tools were a self-report and an interview questionnaire, which were approved for content validity and reliability (IOC = .66-1.00, α = .834). The research found that the critical information retrieval skill of the research sample before using the ARCS model in learning activities was at the normal high level. According to the in-depth interview and focus group, however, the pre-service teachers still lacked adequate and effective knowledge of information retrieval. Additionally, the critical information retrieval skill of the research cohort after applying the ARCS model in learning activities appeared to be at a high level. The results revealed that the pre-service teachers were able to explain the methods of searching for, extracting, and selecting information, as well as evaluating the quality of information and effectively deciding whether to accept it. Moreover, the research found that the pre-service teachers showed a normal high to highest level of satisfaction with using the ARCS model in learning activities as a tool to develop their critical information retrieval skill.

Keywords: Critical information retrieval skill, ARCS model, Satisfaction.

Downloads: 1187
60 Tele-Diagnosis System for Rural Thailand

Authors: C. Snae Namahoot, M. Brueckner

Abstract:

Thailand's health system is challenged by the rising number of patients and the decreasing ratio of medical practitioners to patients, especially in rural areas. This may tempt inexperienced GPs to rush through the process of anamnesis, with the risk of incorrect diagnosis. Patients have to travel far to the hospital and wait a long time to present their case. Many patients try to cure themselves with traditional Thai medicine. Many countries are making use of the Internet for medical information gathering, distribution and storage. Telemedicine applications are a relatively new field of study in Thailand; the ICT infrastructure had hampered widespread use of the Internet for medical information. With recent improvements, health and technology professionals can work out novel applications and systems to help advance telemedicine for the benefit of the people. Here we explore the use of telemedicine for people with health problems in rural areas of Thailand and present a Telemedicine Diagnosis System for Rural Thailand (TEDIST) for diagnosing certain conditions, which people with Internet access can use to establish contact with Community Health Centers, e.g. by mobile phone. The system uses a Web-based input method for individual patients' symptoms, which are passed to an expert system for the analysis of conditions and corresponding diseases. The analysis harnesses a knowledge base and a backward chaining component to find out which health professionals should be presented with the case. Doctors have the opportunity to exchange emails or chat with the patients they are responsible for or with other specialists. Patients' data are then stored in a Personal Health Record.
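
Backward chaining, as mentioned in this abstract, starts from a goal and works back through rules until it reaches known facts. The sketch below shows the general idea only; the rule names, symptom names and the function `backward_chain` are invented for illustration, are not taken from TEDIST, and carry no medical meaning.

```python
def backward_chain(goal, rules, facts, _seen=None):
    """Return True if goal can be derived from facts.

    rules maps a conclusion to a list of alternative premise lists.
    """
    seen = _seen or frozenset()
    if goal in facts:
        return True
    if goal in seen:
        # The goal is already being proven higher up: avoid infinite loops.
        return False
    seen = seen | {goal}
    # The goal holds if every premise of at least one rule can be proven.
    return any(all(backward_chain(p, rules, facts, seen) for p in premises)
               for premises in rules.get(goal, []))


# Hypothetical knowledge base: which kind of professional should see the case.
rules = {
    "refer_to_specialist_a": [["symptom_x", "condition_y"]],
    "condition_y": [["symptom_z"], ["symptom_w", "symptom_v"]],
}
patient_facts = {"symptom_x", "symptom_z"}
needs_specialist_a = backward_chain("refer_to_specialist_a", rules, patient_facts)
```

Because the search is goal-driven, only the symptoms relevant to a candidate referral need to be examined, which suits a system that asks patients for input through a Web form.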

Keywords: Biomedical engineering, data acquisition, expert system, information management system, and information retrieval.

Downloads: 2216
59 Query Reformulation Guided by External Resource for Information Retrieval

Authors: Mohammed El Amine Abderrahim

Abstract:

Reformulating the user query is a technique that aims to improve the performance of an Information Retrieval System (IRS) in terms of precision and recall. This paper evaluates the technique of query reformulation guided by an external resource for Arabic texts. To do this, various precision and recall measurements were carried out on two corpora, using different external resources such as Arabic WordNet (AWN) and the Arabic Dictionary (thesaurus) of Meaning (ADM). Examination of the obtained results will allow us to measure the real contribution of this reformulation technique to improving IRS performance.
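
A rough sketch of the two ingredients discussed here, expanding a query from an external resource and scoring the result with precision and recall, is given below. The thesaurus is a plain dictionary standing in for a resource such as AWN; the function names, the English terms and the document identifiers are assumptions made for the example.

```python
def expand_query(terms, thesaurus):
    """Append related terms from an external resource, keeping order."""
    expanded = list(terms)
    for term in terms:
        for related in thesaurus.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical thesaurus entry and relevance judgments.
thesaurus = {"car": ["automobile", "vehicle"]}
query = expand_query(["car"], thesaurus)
p, r = precision_recall(retrieved=["d1", "d2", "d3", "d4"],
                        relevant=["d1", "d3", "d5"])
```

Expansion usually raises recall because more documents match, but it can lower precision when the added terms are ambiguous, which is why both measures are reported together.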

Keywords: Arabic NLP, Arabic Information Retrieval, Arabic WordNet, Query Expansion.

Downloads: 1069
58 Learning Styles of University Students in Bangkok: The Characteristics and the Relevant Instructional Context

Authors: Chaiwat Tantarangsee

Abstract:

The purposes of this study are 1) to identify the learning styles of university students in Bangkok, and 2) to study the frequency of the relevant instructional context of the identified learning styles. The learning styles employed in this study are those of Honey and Mumford, which include 1) Reflectors, 2) Theorists, 3) Pragmatists, and 4) Activists. The population comprises 1383 students and 5 lecturers. The research tools are 2 questionnaires: one used for identifying students' learning styles, and the other used for identifying the frequency of the relevant instructional context of the identified learning styles. The research findings reveal that 32.30 percent are Activists, while 28.10 percent are Theorists, 20.10 percent are Reflectors, and 19.50 percent are Pragmatists. In terms of the relevant instructional context of the 4 identified learning styles, the frequency level of the instructional context is found to be high overall. Moreover, the 2 contexts conducted most frequently are 'Lead-in activity to review background knowledge' and 'Information retrieval report', and these two activities serve the learning styles of Theorists and Activists. It is therefore suggested that more instructional contexts supporting the Activists, the largest group in the population, who learn best by doing, as well as emotional learning situations, should be added.

Keywords: Instructional Context, Learning Styles, Learning Style Preference, and Learning Style Questionnaire.

Downloads: 1488
57 Analyzing the Relation of Community Group for Research Paper Bookmarking by Using Association Rule

Authors: P. Jomsri

Abstract:

Searching the Internet is currently very popular, especially in academia. The huge amount of educational information, such as research papers, overloads users. Community-based web sites have therefore been developed to help users search for information more easily by customizing a web site to the needs of each specific user or set of users. This paper proposes using association rules to analyze community groups in research paper bookmarking. A set of design goals for community group frameworks is developed and discussed. Additionally, the researcher analyzes the initial relations by using association rule discovery between the antecedent and the consequent of a rule in the groups of users, in order to generate ideas for improving search result ranking and developing a recommender system.
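
The relation between the antecedent and the consequent of a rule is usually quantified by support and confidence. The following sketch computes both for a single rule; the function name `rule_metrics` and the toy bookmarking data (sets of group tags per user) are hypothetical and only illustrate the measures, not the paper's dataset or procedure.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Transactions containing every item of both sides of the rule.
    both = sum(1 for t in transactions
               if (antecedent | consequent) <= set(t))
    # Transactions containing every item of the antecedent.
    ante = sum(1 for t in transactions if antecedent <= set(t))
    support = both / n if n else 0.0
    confidence = both / ante if ante else 0.0
    return support, confidence


# Each set lists the groups one user bookmarked papers in (made-up data).
bookmarks = [
    {"ir", "ml"},
    {"ir", "ml", "db"},
    {"ir"},
    {"db"},
]
support, confidence = rule_metrics(bookmarks, {"ir"}, {"ml"})
```

A rule with high confidence, such as users in one group often also bookmarking in another, is the kind of signal that could feed a ranking or recommendation step.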

Keywords: association rule, information retrieval, research paper bookmarking.

Downloads: 1096