Search results for: Information retrieval systems
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7587

Search results for: Information retrieval systems

7557 Secure Image Retrieval Based On Orthogonal Decomposition under Cloud Environment

Authors: Yanyan Xu, Lizhi Xiong, Zhengquan Xu, Li Jiang

Abstract:

In order to protect data privacy, image with sensitive or private information needs to be encrypted before being outsourced to the cloud. However, this causes difficulties in image retrieval and data management. A secure image retrieval method based on orthogonal decomposition is proposed in the paper. The image is divided into two different components, for which encryption and feature extraction are executed separately. As a result, cloud server can extract features from an encrypted image directly and compare them with the features of the queried images, so that the user can thus obtain the image. Different from other methods, the proposed method has no special requirements to encryption algorithms. Experimental results prove that the proposed method can achieve better security and better retrieval precision.

Keywords: Secure image retrieval, secure search, orthogonal decomposition, secure cloud computing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2114
7556 A Study of Gaps in CBMIR Using Different Methods and Prospective

Authors: Pradeep Singh, Sukhwinder Singh, Gurjinder Kaur

Abstract:

In recent years, rapid advances in software and hardware in the field of information technology along with a digital imaging revolution in the medical domain facilitate the generation and storage of large collections of images by hospitals and clinics. To search these large image collections effectively and efficiently poses significant technical challenges, and it raises the necessity of constructing intelligent retrieval systems. Content-based Image Retrieval (CBIR) consists of retrieving the most visually similar images to a given query image from a database of images[5]. Medical CBIR (content-based image retrieval) applications pose unique challenges but at the same time offer many new opportunities. On one hand, while one can easily understand news or sports videos, a medical image is often completely incomprehensible to untrained eyes.

Keywords: Classification, clustering, content-based image retrieval (CBIR), relevance feedback (RF), statistical similarity matching, support vector machine (SVM).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1787
7555 Salient Points Reduction for Content-Based Image Retrieval

Authors: Yao-Hong Tsai

Abstract:

Salient points are frequently used to represent local properties of the image in content-based image retrieval. In this paper, we present a reduction algorithm that extracts the local most salient points such that they not only give a satisfying representation of an image, but also make the image retrieval process efficiently. This algorithm recursively reduces the continuous point set by their corresponding saliency values under a top-down approach. The resulting salient points are evaluated with an image retrieval system using Hausdoff distance. In this experiment, it shows that our method is robust and the extracted salient points provide better retrieval performance comparing with other point detectors.

Keywords: Barnard detector, Content-based image retrieval, Points reduction, Salient point.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1469
7554 Business Domain Modelling Using an Integrated Framework

Authors: Mohammed Salahat, Steve Wade

Abstract:

This paper presents an application of a “Systematic Soft Domain Driven Design Framework” as a soft systems approach to domain-driven design of information systems development. The framework use SSM as a guiding methodology within which we have embedded a sequence of design tasks based on the UML leading to the implementation of a software system using the Naked Objects framework. This framework have been used in action research projects that have involved the investigation and modelling of business processes using object-oriented domain models and the implementation of software systems based on those domain models. Within this framework, Soft Systems Methodology (SSM) is used as a guiding methodology to explore the problem situation and to develop the domain model using UML for the given business domain. The framework is proposed and evaluated in our previous works, and a real case study “Information Retrieval System for academic research” is used, in this paper, to show further practice and evaluation of the framework in different business domain. We argue that there are advantages from combining and using techniques from different methodologies in this way for business domain modelling. The framework is overviewed and justified as multimethodology using Mingers multimethodology ideas.

Keywords: SSM, UML, domain-driven design, soft domaindriven design, naked objects, soft language, information retrieval, multimethodology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1777
7553 Enhancing capabilities of Texture Extraction for Color Image Retrieval

Authors: Pranam Janney, Sridhar G, Sridhar V.

Abstract:

Content-Based Image Retrieval has been a major area of research in recent years. Efficient image retrieval with high precision would require an approach which combines usage of both the color and texture features of the image. In this paper we propose a method for enhancing the capabilities of texture based feature extraction and further demonstrate the use of these enhanced texture features in Texture-Based Color Image Retrieval.

Keywords: Image retrieval, texture feature extraction, color extraction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1622
7552 Graph Codes-2D Projections of Multimedia Feature Graphs for Fast and Effective Retrieval

Authors: Stefan Wagenpfeil, Felix Engel, Paul McKevitt, Matthias Hemmje

Abstract:

Multimedia Indexing and Retrieval is generally de-signed and implemented by employing feature graphs. These graphs typically contain a significant number of nodes and edges to reflect the level of detail in feature detection. A higher level of detail increases the effectiveness of the results but also leads to more complex graph structures. However, graph-traversal-based algorithms for similarity are quite inefficient and computation intensive, espe-cially for large data structures. To deliver fast and effective retrieval, an efficient similarity algorithm, particularly for large graphs, is mandatory. Hence, in this paper, we define a graph-projection into a 2D space (Graph Code) as well as the corresponding algorithms for indexing and retrieval. We show that calculations in this space can be performed more efficiently than graph-traversals due to a simpler processing model and a high level of parallelisation. In consequence, we prove that the effectiveness of retrieval also increases substantially, as Graph Codes facilitate more levels of detail in feature fusion. Thus, Graph Codes provide a significant increase in efficiency and effectiveness (especially for Multimedia indexing and retrieval) and can be applied to images, videos, audio, and text information.

Keywords: indexing, retrieval, multimedia, graph code, graph algorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 443
7551 Balancing Strategies for Parallel Content-based Data Retrieval Algorithms in a k-tree Structured Database

Authors: Radu Dobrescu, Matei Dobrescu, Daniela Hossu

Abstract:

The paper proposes a unified model for multimedia data retrieval which includes data representatives, content representatives, index structure, and search algorithms. The multimedia data are defined as k-dimensional signals indexed in a multidimensional k-tree structure. The benefits of using the k-tree unified model were demonstrated by running the data retrieval application on a six networked nodes test bed cluster. The tests were performed with two retrieval algorithms, one that allows parallel searching using a single feature, the second that performs a weighted cascade search for multiple features querying. The experiments show a significant reduction of retrieval time while maintaining the quality of results.

Keywords: balancing strategies, multimedia databases, parallelprocessing, retrieval algorithms

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1424
7550 Information Retrieval in the Semantic LIFE Personal Digital Memory Framework

Authors: Hanh Huu Hoang, Tho Manh Nguyen

Abstract:

Ever increasing capacities of contemporary storage devices inspire the vision to accumulate (personal) information without the need of deleting old data over a long time-span. Hence the target of SemanticLIFE project is to create a Personal Information Management system for a human lifetime data. One of the most important characteristics of the system is its dedication to retrieve information in a very efficient way. By adopting user demands regarding the reduction of ambiguities, our approach aims at a user-oriented and yet powerful enough system with a satisfactory query performance. We introduce the query system of SemanticLIFE, the Virtual Query System, which uses emerging Semantic Web technologies to fulfill users- requirements.

Keywords: Ontology-based Information Retrieval, Digital Memories, SemanticLIFE.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1346
7549 3D Model Retrieval based on Normal Vector Interpolation Method

Authors: Ami Kim, Oubong Gwun, Juwhan Song

Abstract:

In this paper, we proposed the distribution of mesh normal vector direction as a feature descriptor of a 3D model. A normal vector shows the entire shape of a model well. The distribution of normal vectors was sampled in proportion to each polygon's area so that the information on the surface with less surface area may be less reflected on composing a feature descriptor in order to enhance retrieval performance. At the analysis result of ANMRR, the enhancement of approx. 12.4%~34.7% compared to the existing method has also been indicated.

Keywords: Interpolated Normal Vector, Feature Descriptor, 3DModel Retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1474
7548 Extension and Evaluation of Interface “2D-RIB“ for Impression-Based Retrieval

Authors: S. Kikuchi, Y. Hashimoto, T. Takayama, T. Ikeda, Y. Murata

Abstract:

Recently, lots of researchers are attracted to retrieving multimedia database by using some impression words and their values. Ikezoe-s research is one of the representatives and uses eight pairs of opposite impression words. We had modified its retrieval interface and proposed '2D-RIB'. In '2D-RIB', after a retrieval person selects a single basic music, the system visually shows some other music around the basic one along relative position. He/she can select one of them fitting to his/her intention, as a retrieval result. The purpose of this paper is to improve his/her satisfaction level to the retrieval result in 2D-RIB. One of our extensions is to define and introduce the following two measures: 'melody goodness' and 'general acceptance'. We implement them in different five combinations. According to an evaluation experiment, both of these two measures can contribute to the improvement. Another extension is three types of customization. We have implemented them and clarified which customization is effective.

Keywords: Multimedia database, impression-based retrieval, interface, satisfaction level.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1299
7547 Experiments on Element and Document Statistics for XML Retrieval

Authors: Mohamed Ben Aouicha, Mohamed Tmar, Mohand Boughanem, Mohamed Abid

Abstract:

This paper presents an information retrieval model on XML documents based on tree matching. Queries and documents are represented by extended trees. An extended tree is built starting from the original tree, with additional weighted virtual links between each node and its indirect descendants allowing to directly reach each descendant. Therefore only one level separates between each node and its indirect descendants. This allows to compare the user query and the document with flexibility and with respect to the structural constraints of the query. The content of each node is very important to decide weither a document element is relevant or not, thus the content should be taken into account in the retrieval process. We separate between the structure-based and the content-based retrieval processes. The content-based score of each node is commonly based on the well-known Tf × Idf criteria. In this paper, we compare between this criteria and another one we call Tf × Ief. The comparison is based on some experiments into a dataset provided by INEX1 to show the effectiveness of our approach on one hand and those of both weighting functions on the other.

Keywords: XML retrieval, INEX, Tf × Idf, Tf × Ief

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1336
7546 A Learning Agent for Knowledge Extraction from an Active Semantic Network

Authors: Simon Thiel, Stavros Dalakakis, Dieter Roller

Abstract:

This paper outlines the development of a learning retrieval agent. Task of this agent is to extract knowledge of the Active Semantic Network in respect to user-requests. Based on a reinforcement learning approach, the agent learns to interpret the user-s intention. Especially, the learning algorithm focuses on the retrieval of complex long distant relations. Increasing its learnt knowledge with every request-result-evaluation sequence, the agent enhances his capability in finding the intended information.

Keywords: Reinforcement learning, learning retrieval agent, search in semantic networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1494
7545 Local Image Descriptor using VQ-SIFT for Image Retrieval

Authors: Qiu Chen, Feifei Lee, Koji Kotani, Tadahiro Ohmi

Abstract:

In this paper, we present local image descriptor using VQ-SIFT for more effective and efficient image retrieval. Instead of SIFT's weighted orientation histograms, we apply vector quantization (VQ) histogram as an alternate representation for SIFT features. Experimental results show that SIFT features using VQ-based local descriptors can achieve better image retrieval accuracy than the conventional algorithm while the computational cost is significantly reduced.

Keywords: SIFT feature, Vector quantization histogram, Localdescriptor, Image retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2403
7544 A Comparative Analysis of Different Web Content Mining Tools

Authors: T. Suresh Kumar, M. Arthanari, N. Shanthi

Abstract:

Nowadays, the Web has become one of the most pervasive platforms for information change and retrieval. It collects the suitable and perfectly fitting information from websites that one requires. Data mining is the form of extracting data’s available in the internet. Web mining is one of the elements of data mining Technique, which relates to various research communities such as information recovery, folder managing system and simulated intellects. In this Paper we have discussed the concepts of Web mining. We contain generally focused on one of the categories of Web mining, specifically the Web Content Mining and its various farm duties. The mining tools are imperative to scanning the many images, text, and HTML documents and then, the result is used by the various search engines. We conclude by presenting a comparative table of these tools based on some pertinent criteria.

Keywords: Data Mining, Web Mining, Web Content Mining, Mining Tools, Information retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3553
7543 Extended “2D-RIB“ for Impression-Based Satisfactory Retrieval and its Evaluation

Authors: T. Takayama, S. Kikuchi, Y. Hashimoto, T. Ikeda, Y. Murata

Abstract:

Recently, lots of researchers are attracted to retrieving multimedia database by using some impression words and their values. Ikezoe-s research is one of the representatives and uses eight pairs of opposite impression words. We had modified its retrieval interface and proposed '2D-RIB' in the previous work. The aim of the present paper is to improve his/her satisfaction level to the retrieval result in the 2D-RIB. Our method is to extend the 2D-RIB. One of our extensions is to define and introduce the following two measures: 'melody goodness' and 'general acceptance'. Another extension is three types of customization menus. The result of evaluation using a pilot system is as follows. Both of these two measures 'melody goodness' and -general acceptance- can contribute to the improvement. Moreover, it is effective if we introduce the customization menu which enables a retrieval person to reduce the strictness level of retrieval condition in an impression pair based on his/her need.

Keywords: Multimedia database, impression-based retrieval, interface, satisfaction level.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1218
7542 Semi-Automatic Analyzer to Detect Authorial Intentions in Scientific Documents

Authors: Kanso Hassan, Elhore Ali, Soule-dupuy Chantal, Tazi Said

Abstract:

Information Retrieval has the objective of studying models and the realization of systems allowing a user to find the relevant documents adapted to his need of information. The information search is a problem which remains difficult because the difficulty in the representing and to treat the natural languages such as polysemia. Intentional Structures promise to be a new paradigm to extend the existing documents structures and to enhance the different phases of documents process such as creation, editing, search and retrieval. The intention recognition of the author-s of texts can reduce the largeness of this problem. In this article, we present intentions recognition system is based on a semi-automatic method of extraction the intentional information starting from a corpus of text. This system is also able to update the ontology of intentions for the enrichment of the knowledge base containing all possible intentions of a domain. This approach uses the construction of a semi-formal ontology which considered as the conceptualization of the intentional information contained in a text. An experiments on scientific publications in the field of computer science was considered to validate this approach.

Keywords: Information research, text analyzes, intentionalstructure, segmentation, ontology, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1638
7541 Opponent Color and Curvelet Transform Based Image Retrieval System Using Genetic Algorithm

Authors: Yesubai Rubavathi Charles, Ravi Ramraj

Abstract:

In order to retrieve images efficiently from a large database, a unique method integrating color and texture features using genetic programming has been proposed. Opponent color histogram which gives shadow, shade, and light intensity invariant property is employed in the proposed framework for extracting color features. For texture feature extraction, fast discrete curvelet transform which captures more orientation information at different scales is incorporated to represent curved like edges. The recent scenario in the issues of image retrieval is to reduce the semantic gap between user’s preference and low level features. To address this concern, genetic algorithm combined with relevance feedback is embedded to reduce semantic gap and retrieve user’s preference images. Extensive and comparative experiments have been conducted to evaluate proposed framework for content based image retrieval on two databases, i.e., COIL-100 and Corel-1000. Experimental results clearly show that the proposed system surpassed other existing systems in terms of precision and recall. The proposed work achieves highest performance with average precision of 88.2% on COIL-100 and 76.3% on Corel, the average recall of 69.9% on COIL and 76.3% on Corel. Thus, the experimental results confirm that the proposed content based image retrieval system architecture attains better solution for image retrieval.

Keywords: Content based image retrieval, Curvelet transform, Genetic algorithm, Opponent color histogram, Relevance feedback.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1822
7540 Content Based Image Retrieval of Brain MR Images across Different Classes

Authors: Abraham Varghese, Kannan Balakrishnan, Reji R. Varghese, Joseph S. Paul

Abstract:

Magnetic Resonance Imaging play a vital role in the decision-diagnosis process of brain MR images. For an accurate diagnosis of brain related problems, the experts mostly compares both T1 and T2 weighted images as the information presented in these two images are complementary. In this paper, rotational and translational invariant form of Local binary Pattern (LBP) with additional gray scale information is used to retrieve similar slices of T1 weighted images from T2 weighted images or vice versa. The incorporation of additional gray scale information on LBP can extract more local texture information. The accuracy of retrieval can be improved by extracting moment features of LBP and reweighting the features based on users feedback. Here retrieval is done in a single subject scenario where similar images of a particular subject at a particular level are retrieved, and multiple subjects scenario where relevant images at a particular level across the subjects are retrieved.

Keywords: Local Binary pattern (LBP), Modified Local Binary pattern (MOD-LBP), T1 and T2 weighted images, Moment features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2381
7539 A Specification-Based Approach for Retrieval of Reusable Business Component for Software Reuse

Authors: Meng Fanchao, Zhan Dechen, Xu Xiaofei

Abstract:

Software reuse can be considered as the most realistic and promising way to improve software engineering productivity and quality. Automated assistance for software reuse involves the representation, classification, retrieval and adaptation of components. The representation and retrieval of components are important to software reuse in Component-Based on Software Development (CBSD). However, current industrial component models mainly focus on the implement techniques and ignore the semantic information about component, so it is difficult to retrieve the components that satisfy user-s requirements. This paper presents a method of business component retrieval based on specification matching to solve the software reuse of enterprise information system. First, a business component model oriented reuse is proposed. In our model, the business data type is represented as sign data type based on XML, which can express the variable business data type that can describe the variety of business operations. Based on this model, we propose specification match relationships in two levels: business operation level and business component level. In business operation level, we use input business data types, output business data types and the taxonomy of business operations evaluate the similarity between business operations. In the business component level, we propose five specification matches between business components. To retrieval reusable business components, we propose the measure of similarity degrees to calculate the similarities between business components. Finally, a business component retrieval command like SQL is proposed to help user to retrieve approximate business components from component repository.

Keywords: Business component, business operation, business data type, specification matching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1409
7538 Semantic Indexing Approach of a Corpora Based On Ontology

Authors: Mohammed Erritali

Abstract:

The growth in the volume of text data such as books and articles in libraries for centuries has imposed to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing and the use of classification categories have marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (corpus) to allow a user to find the relevant ones for him, that is to say, the contents which matches with the information needs of the user. This paper presents a new semantic indexing approach of a documentary corpus. The indexing process starts first by a term weighting phase to determine the importance of these terms in the documents. Then the use of a thesaurus like Wordnet allows moving to the conceptual level. Each candidate concept is evaluated by determining its level of representation of the document, that is to say, the importance of the concept in relation to other concepts of the document. Finally, the semantic index is constructed by attaching to each concept of the ontology, the documents of the corpus in which these concepts are found.

Keywords: Semantic, indexing, corpora, WordNet, ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1368
7537 A Weighted-Profiling Using an Ontology Basefor Semantic-Based Search

Authors: Hikmat A. M. Abd-El-Jaber, Tengku M. T. Sembok

Abstract:

The information on the Web increases tremendously. A number of search engines have been developed for searching Web information and retrieving relevant documents that satisfy the inquirers needs. Search engines provide inquirers irrelevant documents among search results, since the search is text-based rather than semantic-based. Information retrieval research area has presented a number of approaches and methodologies such as profiling, feedback, query modification, human-computer interaction, etc for improving search results. Moreover, information retrieval has employed artificial intelligence techniques and strategies such as machine learning heuristics, tuning mechanisms, user and system vocabularies, logical theory, etc for capturing user's preferences and using them for guiding the search based on the semantic analysis rather than syntactic analysis. Although a valuable improvement has been recorded on search results, the survey has shown that still search engines users are not really satisfied with their search results. Using ontologies for semantic-based searching is likely the key solution. Adopting profiling approach and using ontology base characteristics, this work proposes a strategy for finding the exact meaning of the query terms in order to retrieve relevant information according to user needs. The evaluation of conducted experiments has shown the effectiveness of the suggested methodology and conclusion is presented.

Keywords: information retrieval, user profiles, semantic Web, ontology, search engine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3217
7536 Application of l1-Norm Minimization Technique to Image Retrieval

Authors: C. S. Sastry, Saurabh Jain, Ashish Mishra

Abstract:

Image retrieval is a topic where scientific interest is currently high. The important steps associated with image retrieval system are the extraction of discriminative features and a feasible similarity metric for retrieving the database images that are similar in content with the search image. Gabor filtering is a widely adopted technique for feature extraction from the texture images. The recently proposed sparsity promoting l1-norm minimization technique finds the sparsest solution of an under-determined system of linear equations. In the present paper, the l1-norm minimization technique as a similarity metric is used in image retrieval. It is demonstrated through simulation results that the l1-norm minimization technique provides a promising alternative to existing similarity metrics. In particular, the cases where the l1-norm minimization technique works better than the Euclidean distance metric are singled out.

Keywords: l1-norm minimization, content based retrieval, modified Gabor function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3432
7535 Local Mesh Co-Occurrence Pattern for Content Based Image Retrieval

Authors: C. Yesubai Rubavathi, R. Ravi

Abstract:

This paper presents the local mesh co-occurrence patterns (LMCoP) using HSV color space for image retrieval system. HSV color space is used in this method to utilize color, intensity and brightness of images. Local mesh patterns are applied to define the local information of image and gray level co-occurrence is used to obtain the co-occurrence of LMeP pixels. Local mesh co-occurrence pattern extracts the local directional information from local mesh pattern and converts it into a well-mannered feature vector using gray level co-occurrence matrix. The proposed method is tested on three different databases called MIT VisTex, Corel, and STex. Also, this algorithm is compared with existing methods, and results in terms of precision and recall are shown in this paper.

Keywords: Content-based image retrieval system, HSV color space, gray level co-occurrence matrix, local mesh pattern.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2222
7534 Distributional Semantics Approach to Thai Word Sense Disambiguation

Authors: Sunee Pongpinigpinyo, Wanchai Rivepiboon

Abstract:

Word sense disambiguation is one of the most important open problems in natural language processing applications such as information retrieval and machine translation. Many approach strategies can be employed to resolve word ambiguity with a reasonable degree of accuracy. These strategies are: knowledgebased, corpus-based, and hybrid-based. This paper pays attention to the corpus-based strategy that employs an unsupervised learning method for disambiguation. We report our investigation of Latent Semantic Indexing (LSI), an information retrieval technique and unsupervised learning, to the task of Thai noun and verbal word sense disambiguation. The Latent Semantic Indexing has been shown to be efficient and effective for Information Retrieval. For the purposes of this research, we report experiments on two Thai polysemous words, namely  /hua4/ and /kep1/ that are used as a representative of Thai nouns and verbs respectively. The results of these experiments demonstrate the effectiveness and indicate the potential of applying vector-based distributional information measures to semantic disambiguation.

Keywords: Distributional semantics, Latent Semantic Indexing, natural language processing, Polysemous words, unsupervisedlearning, Word Sense Disambiguation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1814
7533 Advanced Information Extraction with n-gram based LSI

Authors: Ahmet Güven, Ö. Özgür Bozkurt, Oya Kalıpsız

Abstract:

Number of documents being created increases at an increasing pace while most of them being in already known topics and little of them introducing new concepts. This fact has started a new era in information retrieval discipline where the requirements have their own specialties. That is digging into topics and concepts and finding out subtopics or relations between topics. Up to now IR researches were interested in retrieving documents about a general topic or clustering documents under generic subjects. However these conventional approaches can-t go deep into content of documents which makes it difficult for people to reach to right documents they were searching. So we need new ways of mining document sets where the critic point is to know much about the contents of the documents. As a solution we are proposing to enhance LSI, one of the proven IR techniques by supporting its vector space with n-gram forms of words. Positive results we have obtained are shown in two different application area of IR domain; querying a document database, clustering documents in the document database.

Keywords: Document clustering, Information Extraction, Information Retrieval, LSI, n-gram.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1803
7532 WebGD: A CORBA-based Document Classification and Retrieval System on the Web

Authors: Fuyang Peng, Bo Deng, Chao Qi, Mou Zhan

Abstract:

This paper presents the design and implementation of the WebGD, a CORBA-based document classification and retrieval system on Internet. The WebGD makes use of such techniques as Web, CORBA, Java, NLP, fuzzy technique, knowledge-based processing and database technology. Unified classification and retrieval model, classifying and retrieving with one reasoning engine and flexible working mode configuration are some of its main features. The architecture of WebGD, the unified classification and retrieval model, the components of the WebGD server and the fuzzy inference engine are discussed in this paper in detail.

Keywords: Text Mining, document classification, knowledgeprocessing, fuzzy logic, Web, CORBA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1848
7531 Knowledge Representation and Retrieval in Design Project Memory

Authors: Smain M. Bekhti, Nada T. Matta

Abstract:

Knowledge sharing in general and the contextual access to knowledge in particular, still represent a key challenge in the knowledge management framework. Researchers on semantic web and human machine interface study techniques to enhance this access. For instance, in semantic web, the information retrieval is based on domain ontology. In human machine interface, keeping track of user's activity provides some elements of the context that can guide the access to information. We suggest an approach based on these two key guidelines, whilst avoiding some of their weaknesses. The approach permits a representation of both the context and the design rationale of a project for an efficient access to knowledge. In fact, the method consists of an information retrieval environment that, in the one hand, can infer knowledge, modeled as a semantic network, and on the other hand, is based on the context and the objectives of a specific activity (the design). The environment we defined can also be used to gather similar project elements in order to build classifications of tasks, problems, arguments, etc. produced in a company. These classifications can show the evolution of design strategies in the company.

Keywords: Project Memory, Knowledge re-use, Design rationale, Knowledge representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1626
7530 A Universal Model for Content-Based Image Retrieval

Authors: S. Nandagopalan, Dr. B. S. Adiga, N. Deepak

Abstract:

In this paper a novel approach for generalized image retrieval based on semantic contents is presented. A combination of three feature extraction methods namely color, texture, and edge histogram descriptor. There is a provision to add new features in future for better retrieval efficiency. Any combination of these methods, which is more appropriate for the application, can be used for retrieval. This is provided through User Interface (UI) in the form of relevance feedback. The image properties analyzed in this work are by using computer vision and image processing algorithms. For color the histogram of images are computed, for texture cooccurrence matrix based entropy, energy, etc, are calculated and for edge density it is Edge Histogram Descriptor (EHD) that is found. For retrieval of images, a novel idea is developed based on greedy strategy to reduce the computational complexity. The entire system was developed using AForge.Imaging (an open source product), MATLAB .NET Builder, C#, and Oracle 10g. The system was tested with Coral Image database containing 1000 natural images and achieved better results.

Keywords: Content Based Image Retrieval (CBIR), Cooccurrencematrix, Feature vector, Edge Histogram Descriptor(EHD), Greedy strategy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2933
7529 Evolutionary Query Optimization for Heterogeneous Distributed Database Systems

Authors: Reza Ghaemi, Amin Milani Fard, Hamid Tabatabaee, Mahdi Sadeghizadeh

Abstract:

Due to new distributed database applications such as huge deductive database systems, the search complexity is constantly increasing and we need better algorithms to speedup traditional relational database queries. An optimal dynamic programming method for such high dimensional queries has the big disadvantage of its exponential order and thus we are interested in semi-optimal but faster approaches. In this work we present a multi-agent based mechanism to meet this demand and also compare the result with some commonly used query optimization algorithms.

Keywords: Information retrieval systems, list fusion methods, document score, multi-agent systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3424
7528 Effective Digital Music Retrieval System through Content-based Features

Authors: Bokyung Sung, Kwanghyo Koo, Jungsoo Kim, Myung-Bum Jung, Jinman Kwon, Ilju Ko

Abstract:

In this paper, we propose effective system for digital music retrieval. We divided proposed system into Client and Server. Client part consists of pre-processing and Content-based feature extraction stages. In pre-processing stage, we minimized Time code Gap that is occurred among same music contents. As content-based feature, first-order differentiated MFCC were used. These presented approximately envelop of music feature sequences. Server part included Music Server and Music Matching stage. Extracted features from 1,000 digital music files were stored in Music Server. In Music Matching stage, we found retrieval result through similarity measure by DTW. In experiment, we used 450 queries. These were made by mixing different compression standards and sound qualities from 50 digital music files. Retrieval accurate indicated 97% and retrieval time was average 15ms in every single query. Out experiment proved that proposed system is effective in retrieve digital music and robust at various user environments of web.

Keywords: Music Retrieval, Content-based, Music Feature and Digital Music.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1519