Search results for: content based retrieval
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 12389

12329 Organization Model of Semantic Document Repository and Search Techniques for Studying Information Technology

Authors: Nhon Do, Thuong Huynh, An Pham

Abstract:

Nowadays, organizing a repository of documents and resources for learning in a specialized field such as Information Technology (IT), together with search techniques based on domain knowledge or a document's content, is an urgent need in the practice of teaching, learning and researching. There have been several works related to methods of organization and search by content. However, the results are still limited and insufficient to meet users' demand for semantic document retrieval. This paper presents a solution for the organization of a repository that supports semantic representation and processing in search. The proposed solution is a model which integrates components such as an ontology describing domain knowledge, a database of document repository, semantic representation for documents and a file system; with problems, semantic processing techniques and advanced search techniques based on measuring semantic similarity. The solution is applied to build an IT learning materials management system for a university, with a semantic search function serving students, teachers, and managers as well. The application has been implemented and tested at the University of Information Technology, Ho Chi Minh City, Vietnam, and has achieved good results.

Keywords: document retrieval system, knowledge representation, document representation, semantic search, ontology.

Downloads 1663
12328 Using Dempster-Shafer Theory in XML Information Retrieval

Authors: F. Raja, M. Rahgozar, F. Oroumchian

Abstract:

XML is a markup language which is becoming the standard format for information representation and data exchange. A major purpose of XML is the explicit representation of the logical structure of a document. Much research has been performed to exploit logical structure of documents in information retrieval in order to precisely extract user information need from large collections of XML documents. In this paper, we describe an XML information retrieval weighting scheme that tries to find the most relevant elements in XML documents in response to a user query. We present this weighting model for information retrieval systems that utilize plausible inferences to infer the relevance of elements in XML documents. We also add to this model the Dempster-Shafer theory of evidence to express the uncertainty in plausible inferences and Dempster-Shafer rule of combination to combine evidences derived from different inferences.
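The Dempster-Shafer rule of combination mentioned in this abstract can be illustrated with a minimal sketch. The frame of discernment (`relevant` / `nonrelevant`), the two evidence sources and their mass values are assumptions made up for illustration; they are not taken from the paper, and the paper's weighting model is not reproduced here.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset (a subset of the frame of
    discernment) to its mass; the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence: the product goes to the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Contradictory evidence: the product is counted as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    norm = 1.0 - conflict
    return {s: w / norm for s, w in combined.items()}


# Hypothetical evidence about one XML element from two inferences.
REL = frozenset({"relevant"})
NON = frozenset({"nonrelevant"})
EITHER = REL | NON  # mass left on the whole frame expresses uncertainty

m_a = {REL: 0.6, EITHER: 0.4}
m_b = {REL: 0.5, NON: 0.2, EITHER: 0.3}
m = combine(m_a, m_b)
```

The mass assigned to `EITHER` is what lets each source say "I do not know" instead of forcing a split between the two hypotheses.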

Keywords: Dempster-Shafer theory, plausible inferences, XML information retrieval.

Downloads 1483
12327 Retrieval of User Specific Images Using Semantic Signatures

Authors: K. Venkateswari, U. K. Balaji Saravanan, K. Thangaraj, K. V. Deepana

Abstract:

Image search engines rely on the surrounding textual keywords for the retrieval of images. It is tedious for search engines like Google and Bing to interpret the user’s search intention and provide the desired results. Recent research also states that the Google image search engine does not work well on all images. Consequently, this has led to the emergence of efficient image retrieval techniques that interpret the user’s search intention and show the desired results. To accomplish this task, an efficient image re-ranking framework is required. To provide the best image retrieval, a new image re-ranking framework is tested in this paper. The implemented framework provides the best image retrieval from the image dataset by re-ranking the retrieved images based on the user’s desired images. The experiment has two sections: an offline section and an online section. In the offline section, the re-ranking framework learns different reference classes (semantic spaces) for diverse user query keywords. The semantic signatures are generated by combining the textual and visual features of the images. In the online section, images are re-ranked by comparing the semantic signatures obtained from the reference classes with the user-specified image query keywords. This re-ranking methodology increases image retrieval efficiency, and the results are more effective for the user.

Keywords: CBIR, Image Re-ranking, Image Retrieval, Semantic Signature, Semantic Space.

Downloads 1897
12326 Lecture Video Indexing and Retrieval Using Topic Keywords

Authors: B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa

Abstract:

In this paper, we propose a framework to help users to search and retrieve the portions in the lecture video of their interest. This is achieved by temporally segmenting and indexing the lecture video using the topic keywords. We use transcribed text from the video and documents relevant to the video topic extracted from the web for this purpose. The keywords for indexing are found by applying the non-negative matrix factorization (NMF) topic modeling techniques on the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end time of the portions of the video for a particular topic. This time information is stored in the index table along with the topic keyword which is used to retrieve the specific portions of the video for the query provided by the users.
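The index table described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the topic keywords are given directly rather than derived by NMF, the transcript segments and their timestamps are invented, and `build_topic_index` is a hypothetical helper name, not the authors' implementation.

```python
from collections import defaultdict


def build_topic_index(segments, topic_keywords):
    """Map each topic keyword to the (start, end) spans where it occurs.

    segments: list of (start_sec, end_sec, text) transcript chunks in
    time order. Touching or overlapping spans for a keyword are merged.
    """
    index = defaultdict(list)
    for start, end, text in segments:
        words = set(text.lower().split())
        for kw in topic_keywords:
            if kw.lower() in words:
                spans = index[kw]
                if spans and start <= spans[-1][1]:
                    # Extend the previous span instead of adding a new one.
                    spans[-1] = (spans[-1][0], max(end, spans[-1][1]))
                else:
                    spans.append((start, end))
    return dict(index)


# Hypothetical transcript of a lecture, cut into 30-second chunks.
segments = [
    (0, 30, "today we introduce sorting algorithms"),
    (30, 60, "sorting by comparison needs n log n steps"),
    (60, 90, "next topic is hashing and hash tables"),
    (90, 120, "back to sorting with radix sort"),
]
index = build_topic_index(segments, ["sorting", "hashing"])
```

A query for a topic keyword then reduces to a dictionary lookup that returns the video portions to play.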

Keywords: Video indexing and retrieval, lecture videos, content based video search, multimodal indexing.

Downloads 1495
12325 Advanced Information Extraction with n-gram based LSI

Authors: Ahmet Güven, Ö. Özgür Bozkurt, Oya Kalıpsız

Abstract:

The number of documents being created is increasing at a growing pace, while most of them cover already known topics and few introduce new concepts. This fact has started a new era in the information retrieval discipline, where the requirements have their own specialties: digging into topics and concepts and finding subtopics or relations between topics. Until now, IR research has been interested in retrieving documents about a general topic or clustering documents under generic subjects. However, these conventional approaches can't go deep into the content of documents, which makes it difficult for people to reach the right documents they are searching for. So we need new ways of mining document sets, where the critical point is to know much about the contents of the documents. As a solution, we propose to enhance LSI, one of the proven IR techniques, by supporting its vector space with n-gram forms of words. The positive results we have obtained are shown in two different application areas of the IR domain: querying a document database and clustering documents in the document database.

Keywords: Document clustering, Information Extraction, Information Retrieval, LSI, n-gram.

Downloads 1753
12324 An Optical Flow Based Segmentation Method for Objects Extraction

Authors: C. Lodato, S. Lopes

Abstract:

This paper describes a segmentation algorithm based on the cooperation of an optical flow estimation method with edge detection and region growing procedures. The proposed method has been developed as a pre-processing stage to be used in methodologies and tools for video/image indexing and retrieval by content. The addressed problem consists in extracting whole objects from background for producing images of single complete objects from videos or photos. The extracted images are used for calculating the object visual features necessary for both indexing and retrieval processes. The first task of the algorithm exploits the cues from motion analysis for moving area detection. Objects and background are then refined using respectively edge detection and region growing procedures. These tasks are iteratively performed until objects and background are completely resolved. The developed method has been applied to a variety of indoor and outdoor scenes where objects of different type and shape are represented on variously textured background.

Keywords: Motion Detection, Object Extraction, Optical Flow, Segmentation.

Downloads 1853
12323 Algorithm for Information Retrieval Optimization

Authors: Kehinde K. Agbele, Kehinde Daniel Aruleba, Eniafe F. Ayetiran

Abstract:

When using Information Retrieval Systems (IRS), users often present search queries made of ad-hoc keywords. It is then up to the IRS to obtain a precise representation of the user’s information need and the context of the information. This paper investigates optimization of IRS to individual information needs in order of relevance. The study addressed development of algorithms that optimize the ranking of documents retrieved from IRS. This study discusses and describes a Document Ranking Optimization (DROPT) algorithm for information retrieval (IR) in an Internet-based or designated databases environment. Conversely, as the volume of information available online and in designated databases is growing continuously, ranking algorithms can play a major role in the context of search results. In this paper, a DROPT technique for documents retrieved from a corpus is developed with respect to document index keywords and the query vectors. This is based on calculating the weight (

Keywords: Internet ranking,

Downloads 1431
12322 Role of Natural Language Processing in Information Retrieval; Challenges and Opportunities

Authors: Khaled M. Alhawiti

Abstract:

This paper aims to analyze the role of natural language processing (NLP). The paper will discuss the role in the context of automated data retrieval, automated question answer, and text structuring. NLP techniques are gaining wider acceptance in real life applications and industrial concerns. There are various complexities involved in processing the text of natural language that could satisfy the need of decision makers. This paper begins with the description of the qualities of NLP practices. The paper then focuses on the challenges in natural language processing. The paper also discusses major techniques of NLP. The last section describes opportunities and challenges for future research.

Keywords: Data Retrieval, Information retrieval, Natural Language Processing, Text Structuring.

Downloads 2772
12321 A Frame Work for Query Results Refinement in Multimedia Databases

Authors: Humaira Liaquat, Nadeem Iftikhar, Shaukat Ali, Zohaib Zafar Iqbal

Abstract:

In the current age, retrieval of relevant information from massive amounts of data is a challenging job. Over the years, precise and relevant retrieval of information has attained high significance. There is a growing need in the market to build systems which can retrieve multimedia information that precisely meets the user's current needs. In this paper, we introduce a framework for refining query results before showing them to the user, using ambient intelligence, user profile, group profile, user location, time, day, user device type and extracted features. A prototype tool was also developed to demonstrate the efficiency of the proposed approach.

Keywords: Context aware retrieval, Information retrieval, Ambient Intelligence, Multimedia databases, User and group profile.

Downloads 1499
12320 A Multilanguage Source Code Retrieval System Using Structural-Semantic Fingerprints

Authors: Mohamed Amine Ouddan, Hassane Essafi

Abstract:

Source code retrieval is of immense importance in the software engineering field. The complex task of retrieving and extracting information from source code documents is vital in the development cycle of large software systems. The two main subtasks which result from these activities are code duplication prevention and plagiarism detection. In this paper, we propose a source code retrieval system based on a two-level fingerprint representation, capturing respectively the structural and the semantic information within source code. A sequence alignment technique is applied to these fingerprints in order to quantify the similarity between source code portions. The specific purpose of the system is to detect plagiarism and duplicated code between programs written in different programming languages belonging to the same class, such as C, C++, Java and CSharp. These four languages are supported by the current version of the system, which is designed such that it may be easily adapted to any programming language.
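To make the idea of aligning fingerprints concrete, here is a minimal sketch that uses Smith-Waterman local alignment, which is one common sequence alignment technique; the abstract does not say which alignment algorithm the system uses. The token names in the fingerprints, the scoring values and the normalisation are all assumptions for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score between two token sequences."""
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Local alignment: scores never drop below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def similarity(a, b, match=2):
    """Score normalised by the best possible score of the shorter sequence."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    return local_alignment_score(a, b, match=match) / (match * shorter)


# Hypothetical structural fingerprints of two functions in different languages.
fp_c = ["FUNC", "LOOP", "ASSIGN", "COND", "CALL", "RETURN"]
fp_java = ["CLASS", "FUNC", "LOOP", "ASSIGN", "CALL", "RETURN"]
sim = similarity(fp_c, fp_java)
```

Because the fingerprints abstract away concrete syntax, the same comparison applies whether the two fragments are written in the same language or in different ones.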

Keywords: Source code retrieval, plagiarism detection, clone detection, sequence alignment.

Downloads 1740
12319 Shape-Based Image Retrieval Using Shape Matrix

Authors: C. Sheng, Y. Xin

Abstract:

Retrieving images by shape similarity, given a template shape, is particularly challenging, owing to the difficulty of deriving a similarity measurement that closely conforms to the common human perception of similarity. In this paper, a new method for the representation and comparison of shapes is presented, which is based on the shape matrix and the snake model. It is invariant to scaling, rotation and translation, and it can retrieve shape images with some missing or occluded parts. In this method, the deformation spent by the template to match the shape images and the matching degree are used to evaluate the similarity between them.

Keywords: shape representation, shape matching, shape matrix, deformation

Downloads 1468
12318 OCIRS: An Ontology-based Chinese Idioms Retrieval System

Authors: Hu Haibo, Tu Chunmei, Fu Chunlei, Fu Li, Mao Fan, Ma Yuan

Abstract:

Chinese idioms are a type of traditional Chinese idiomatic expression with specific meanings and stereotyped structures, which are widely used in classical Chinese and are still common in vernacular written and spoken Chinese today. Currently, Chinese idioms are retrieved from glossaries by key character or key word in a morphology or pronunciation index, which cannot meet the need for semantic searching. OCIRS is proposed to search for the desired idiom when users know only its meaning, without any key character or key word. The user's request, in a sentence or phrase, is grammatically analyzed in advance by word segmentation, key word extraction and semantic similarity computation, and can thus be mapped to the idiom domain ontology, which is constructed to provide ample semantic relations and to facilitate description logics-based reasoning for idiom retrieval. The experimental evaluation shows that OCIRS realizes the function of searching idioms via semantics, obtaining preliminary achievement as requested by the users.

Keywords: Chinese idiom, idiom retrieval, semantic searching, ontology, semantics similarity.

Downloads 1669
12317 Distributional Semantics Approach to Thai Word Sense Disambiguation

Authors: Sunee Pongpinigpinyo, Wanchai Rivepiboon

Abstract:

Word sense disambiguation is one of the most important open problems in natural language processing applications such as information retrieval and machine translation. Many strategies can be employed to resolve word ambiguity with a reasonable degree of accuracy. These strategies are knowledge-based, corpus-based, and hybrid-based. This paper pays attention to the corpus-based strategy, which employs an unsupervised learning method for disambiguation. We report our investigation of applying Latent Semantic Indexing (LSI), an information retrieval technique and unsupervised learning method, to the task of Thai noun and verb word sense disambiguation. Latent Semantic Indexing has been shown to be efficient and effective for information retrieval. For the purposes of this research, we report experiments on two Thai polysemous words, namely /hua4/ and /kep1/, which are used as representatives of Thai nouns and verbs respectively. The results of these experiments demonstrate the effectiveness and indicate the potential of applying vector-based distributional information measures to semantic disambiguation.

Keywords: Distributional semantics, Latent Semantic Indexing, natural language processing, Polysemous words, unsupervised learning, Word Sense Disambiguation.

Downloads 1760
12316 Knowledge Representation and Retrieval in Design Project Memory

Authors: Smain M. Bekhti, Nada T. Matta

Abstract:

Knowledge sharing in general, and contextual access to knowledge in particular, still represent a key challenge in the knowledge management framework. Researchers on the semantic web and on human-machine interfaces study techniques to enhance this access. For instance, in the semantic web, information retrieval is based on a domain ontology. In human-machine interfaces, keeping track of the user's activity provides some elements of the context that can guide access to information. We suggest an approach based on these two key guidelines, whilst avoiding some of their weaknesses. The approach permits a representation of both the context and the design rationale of a project for efficient access to knowledge. In fact, the method consists of an information retrieval environment that, on the one hand, can infer knowledge, modeled as a semantic network, and on the other hand, is based on the context and the objectives of a specific activity (the design). The environment we defined can also be used to gather similar project elements in order to build classifications of tasks, problems, arguments, etc. produced in a company. These classifications can show the evolution of design strategies in the company.

Keywords: Project Memory, Knowledge re-use, Design rationale, Knowledge representation.

Downloads 1580
12315 Design, Development and Analysis of Automated Storage and Retrieval System with Single and Dual Command Dispatching using MATLAB

Authors: M. Aslam, Farrukh, A. R. Gardezi, Nasir Hayat

Abstract:

Automated material handling is given prime importance in semi-automated and automated facilities, since it provides a solution to the major problems related to inventory and also supports recent philosophies such as just-in-time (JIT) production and lean production. An automated storage and retrieval system is an antidote (if designed properly) to facility problems such as difficulty getting the right material, materials perishing, long cycle times and many other similar problems. A working model of an automated storage and retrieval system (AS/RS) is designed and developed under the design parameters specified by the Material Handling Industry of America (MHIA). An analysis was then carried out to calculate the throughput and size of the machine. The possible implementation of this technology in the local scenario is also discussed in this paper.

Keywords: Automated storage and retrieval system (AS/RS), Material handling, Computer integrated manufacturing (CIM), Light-dependent resistor (LDR)

Downloads 3374
12314 Decision Rule Induction in a Learning Content Management System

Authors: Nittaya Kerdprasop, Narin Muenrat, Kittisak Kerdprasop

Abstract:

A learning content management system (LCMS) is an environment to support web-based learning content development. The primary function of the system is to manage the learning process as well as to generate content customized to meet the unique requirements of each learner. Among the available supporting tools offered by several vendors, we propose to enhance the LCMS functionality to individualize the presented content with an induction ability. Our induction technique is based on rough set theory. The induced rules are intended to be the supportive knowledge for guiding content flow planning. They can also be used as decision rules to help content developers manage the content delivered to individual learners.
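Rough set theory, on which the induction technique is based, rests on lower and upper approximations of a concept. The sketch below computes both for a toy decision table; the learner records, the attribute names (`quiz`, `pace`, `content`) and the function name `approximations` are invented for illustration and do not come from the paper.

```python
from collections import defaultdict


def approximations(rows, condition_attrs, decision_attr, target):
    """Return (lower, upper) approximations of {rows with decision == target}.

    rows: dict mapping an object id to a dict of attribute values.
    """
    # Indiscernibility classes: objects with identical condition values.
    classes = defaultdict(set)
    for obj, attrs in rows.items():
        key = tuple(attrs[a] for a in condition_attrs)
        classes[key].add(obj)

    concept = {o for o, attrs in rows.items() if attrs[decision_attr] == target}
    lower, upper = set(), set()
    for members in classes.values():
        if members <= concept:
            lower |= members  # certainly in the concept
        if members & concept:
            upper |= members  # possibly in the concept
    return lower, upper


# Hypothetical learner records: which content level suited each learner.
learners = {
    "s1": {"quiz": "high", "pace": "fast", "content": "advanced"},
    "s2": {"quiz": "high", "pace": "fast", "content": "advanced"},
    "s3": {"quiz": "low", "pace": "slow", "content": "basic"},
    "s4": {"quiz": "high", "pace": "slow", "content": "advanced"},
    "s5": {"quiz": "high", "pace": "slow", "content": "basic"},
}
lower, upper = approximations(learners, ["quiz", "pace"], "content", "advanced")
```

In this toy table, the lower approximation supports a certain rule (high quiz score and fast pace imply advanced content), while objects that appear only in the upper approximation support at most a possible rule.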

Keywords: Decision rules, Knowledge induction, Learning content management system, Rough set.

Downloads 1518
12313 Key Frame Based Video Summarization via Dependency Optimization

Authors: Janya Sainui

Abstract:

With the rapid growth of digital videos and data communications, video summarization, which provides a shorter version of a video for fast browsing and retrieval, is necessary. Key frame extraction is one of the mechanisms used to generate a video summary. In general, the extracted key frames should both represent the entire video content and contain minimum redundancy. However, most of the existing approaches select key frames heuristically; hence, the selected key frames may not be the most different frames and/or may not cover the entire content of a video. In this paper, we propose a method of video summarization which provides reasonable objective functions for selecting key frames. In particular, we apply a statistical dependency measure called quadratic mutual information as our objective function for maximizing the coverage of the entire video content as well as minimizing the redundancy among selected key frames. The proposed key frame extraction algorithm finds key frames by solving an optimization problem. Through experiments, we demonstrate the success of the proposed video summarization approach, which produces video summaries with better coverage of the entire video content and less redundancy among key frames compared to the state-of-the-art approaches.

Keywords: Video summarization, key frame extraction, dependency measure, quadratic mutual information, optimization.

Downloads 916
12312 A Weighted-Profiling Using an Ontology Base for Semantic-Based Search

Authors: Hikmat A. M. Abd-El-Jaber, Tengku M. T. Sembok

Abstract:

The information on the Web increases tremendously. A number of search engines have been developed for searching Web information and retrieving relevant documents that satisfy inquirers' needs. Search engines return irrelevant documents among the search results, since the search is text-based rather than semantic-based. The information retrieval research area has presented a number of approaches and methodologies, such as profiling, feedback, query modification and human-computer interaction, for improving search results. Moreover, information retrieval has employed artificial intelligence techniques and strategies, such as machine learning heuristics, tuning mechanisms, user and system vocabularies and logical theory, for capturing users' preferences and using them to guide the search based on semantic analysis rather than syntactic analysis. Although a valuable improvement has been recorded in search results, the survey has shown that search engine users are still not really satisfied with their search results. Using ontologies for semantic-based searching is likely the key solution. Adopting a profiling approach and using ontology base characteristics, this work proposes a strategy for finding the exact meaning of the query terms in order to retrieve relevant information according to user needs. The evaluation of the conducted experiments has shown the effectiveness of the suggested methodology, and a conclusion is presented.

Keywords: information retrieval, user profiles, semantic Web, ontology, search engine.

Downloads 3165
12311 AINA: Disney Animation Information as Educational Resources

Authors: Piedad Garrido, Fernando Repulles, Andy Bloor, Julio A. Sanguesa, Jesus Gallardo, Vicente Torres, Jesus Tramullas

Abstract:

With the emergence and development of Information and Communications Technologies (ICTs), Higher Education is experiencing rapid changes, not only in its teaching strategies but also in students’ learning skills. However, we have noticed that students often have difficulty when seeking innovative, useful, and interesting learning resources for their work. This is due to the lack of supervision in the selection of good query tools. This paper presents AINA, an Information Retrieval (IR) computer system aimed at providing motivating and stimulating content to both students and teachers working in different areas and at different educational levels. In particular, our proposal consists of an open virtual resource environment oriented to the vast universe of Disney comics and cartoons. Our test suite includes Disney’s long and short films, and we have performed some activities based on the Just In Time Teaching (JiTT) methodology. More specifically, it has been tested by groups of university and secondary school students.

Keywords: Information retrieval, animation, educational resources, JiTT.

Downloads 1154
12310 Ontology-based Domain Modelling for Consistent Content Change Management

Authors: Muhammad Javed, Yalemisew M. Abgaz, Claus Pahl

Abstract:

Ontology-based modelling of multi-formatted software application content is a challenging area in content management. When the number of software content units is huge and in a continuous process of change, content change management is important. The management of content in this context requires targeted access and manipulation methods. We present a novel approach to deal with model-driven content-centric information systems and access to their content. At the core of our approach is an ontology-based semantic annotation technique for diversely formatted content that can improve the accuracy of access and systems evolution. Domain ontologies represent domain-specific concepts and conform to metamodels. Different ontologies, from application domain ontologies to software ontologies, capture and model the different properties and perspectives on a software content unit. Interdependencies between domain ontologies, the artifacts and the content are captured through a trace model. The annotation traces are formalised and a graph-based system is selected for the representation of the annotation traces.

Keywords: Consistent Content Management, Impact Categorisation, Trace Model, Ontology Evolution

Downloads 1639
12309 Mining and Visual Management of XML-Based Image Collections

Authors: Khalil Shihab, Nida Al-Chalabi

Abstract:

This article describes Uruk, the virtual museum of Iraq that we developed for visual exploration and retrieval of image collections. The system largely exploits the loosely structured hierarchy of XML documents, which provides a useful representation method for storing semi-structured or unstructured data that does not easily fit into existing databases. The system offers users the capability to mine and manage the XML-based image collections through a web-based Graphical User Interface (GUI). Typically, in an interactive session with the system, the user can browse a visual structural summary of the XML database in order to select interesting elements. Using this intermediate result, queries combining structure and textual references can be composed and presented to the system. After query evaluation, the full set of answers is presented in a visual and structured way.

Keywords: Data-centric XML, graphical user interfaces, information retrieval, case-based reasoning, fuzzy sets

Downloads 1742
12308 Query Reformulation Guided by External Resource for Information Retrieval

Authors: Mohammed El Amine Abderrahim

Abstract:

Reformulating the user query is a technique that aims to improve the performance of an Information Retrieval System (IRS) in terms of precision and recall. This paper tries to evaluate the technique of query reformulation guided by an external resource for Arabic texts. To do this, various precision and recall measures were conducted and two corpora with different external resources like Arabic WordNet (AWN) and the Arabic Dictionary (thesaurus) of Meaning (ADM) were used. Examination of the obtained results will allow us to measure the real contribution of this reformulation technique in improving the IRS performance.
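As a rough illustration of query reformulation guided by an external resource, the sketch below expands a query with related terms from a lookup table and computes precision and recall for a result list. The toy English thesaurus stands in for a resource such as Arabic WordNet and is purely hypothetical, as are the function names and the document ids; the paper's actual expansion procedure is not described in the abstract.

```python
def expand_query(query, thesaurus, max_per_term=2):
    """Append up to max_per_term related terms for each query term."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for related in thesaurus.get(term, [])[:max_per_term]:
            if related not in expanded:
                expanded.append(related)
    return expanded


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical external resource (a stand-in for a WordNet-like thesaurus).
toy_thesaurus = {
    "car": ["automobile", "vehicle", "motorcar"],
    "price": ["cost"],
}
expanded = expand_query("car price", toy_thesaurus)
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
```

Comparing `precision_recall` for the original and the expanded query over the same corpus is the kind of measurement the abstract refers to.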

Keywords: Arabic NLP, Arabic Information Retrieval, Arabic WordNet, Query Expansion.

Downloads 1358
12307 Object-Based Image Indexing and Retrieval in DCT Domain using Clustering Techniques

Authors: Hossein Nezamabadi-pour, Saeid Saryazdi

Abstract:

In this paper, we present a new and effective image indexing technique that extracts features directly from the DCT domain. Our proposed approach is object-based image indexing. For each 8×8 block in the DCT domain, a feature vector is extracted. Then, the feature vectors of all blocks of the image are clustered into groups using a k-means algorithm. Each cluster represents a particular object of the image. We then select the clusters that have the most members after clustering. The centroids of the selected clusters are taken as image feature vectors and indexed into the database. We also propose an approach for using the proposed image indexing method in automatic image classification. Experimental results on a database of 800 images from 8 semantic groups in automatic image classification are reported.
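The clustering step in this abstract (cluster the block feature vectors, keep the largest clusters, use their centroids as the image signature) can be sketched with plain Lloyd's k-means. The 2-D toy vectors stand in for the per-block DCT features, which the abstract does not specify; taking the first k points as initial centroids and the names `kmeans` and `image_signature` are assumptions for illustration.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's k-means on tuples; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                # Mean of each coordinate over the cluster members.
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's seed
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


def image_signature(block_features, k=3, keep=2):
    """Centroids of the `keep` most populated clusters, largest first."""
    centroids, clusters = kmeans(block_features, k)
    ranked = sorted(range(k), key=lambda i: len(clusters[i]), reverse=True)
    return [centroids[i] for i in ranked[:keep]]


# Toy stand-ins for per-block feature vectors of one image.
blocks = [
    (0.0, 0.0), (10.0, 10.0), (5.0, 0.0), (0.0, 1.0),
    (1.0, 0.0), (1.0, 1.0), (10.0, 11.0), (11.0, 10.0),
]
signature = image_signature(blocks)
```

Dropping the smallest clusters keeps the index compact and discards blocks that belong to minor regions of the image.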

Keywords: Object-based image retrieval, DCT domain, Image indexing, Image classification.

Downloads 1980
12306 Content-Based Color Image Retrieval Based On 2-D Histogram and Statistical Moments

Authors: Khalid Elasnaoui, Brahim Aksasse, Mohammed Ouanan

Abstract:

In this paper, we are interested in the problem of finding similar images in a large database. For this purpose, we propose a new algorithm based on a combination of the 2-D histogram intersection in the HSV space and statistical moments. The proposed histogram is based on a 3x3 window and not only on the intensity of the pixel. This approach overcomes the drawback of the conventional 1-D histogram, which ignores the spatial distribution of pixels in the image, while the statistical moments are used to escape the effects of the discretisation of the color space, which is intrinsic to the use of histograms. We compare the performance of our new algorithm to various state-of-the-art methods and show that it has several advantages. It is fast, consumes little memory and requires no learning. To validate our results, we apply this algorithm to search for similar images in different image databases.
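Histogram intersection itself is a short computation. The sketch below shows the standard form, normalised by the model histogram, on flat lists of bin counts; a 2-D histogram can be flattened row by row before the call. The bin counts are invented, and the paper's 3x3-window histogram construction and its moment features are not reproduced here.

```python
def histogram_intersection(query_hist, model_hist):
    """Sum of bin-wise minima, normalised by the model histogram's total."""
    if len(query_hist) != len(model_hist):
        raise ValueError("histograms must have the same number of bins")
    total = sum(model_hist)
    if total == 0:
        return 0.0
    return sum(min(q, m) for q, m in zip(query_hist, model_hist)) / total


# Hypothetical 4-bin histograms of a query image and a database image.
query = [4, 2, 0, 2]
model = [2, 2, 2, 2]
score = histogram_intersection(query, model)
```

A score of 1.0 means every bin of the model histogram is fully covered by the query; lower values indicate less overlap in the colour distribution.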

Keywords: 2-D histogram, Statistical moments, Indexing, Similarity distance, Histograms intersection.

Downloads 1890
12305 Selection of Relevant Servers in Distributed Information Retrieval System

Authors: Benhamouda Sara, Guezouli Larbi

Abstract:

Nowadays, the dissemination of information touches the distributed world, where selecting the servers relevant to a user request is an important problem in distributed information retrieval. During the last decade, several research studies on this issue have been launched to find optimal solutions, and many collection selection approaches have been proposed. In this paper, we propose a new collection selection approach that takes into consideration the number of documents in a collection that contain query terms and the weights of those terms in these documents. We tested our method, and our studies show that it can compete with the state-of-the-art algorithms that we chose as a basis for comparison.
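To make the idea of this abstract concrete, the sketch below scores a collection from the two quantities it mentions: how many documents contain query terms, and the weights of those terms in these documents. The abstract does not give the actual formula, so summing the term weights over matching documents is an assumption, and the names `score_collection` and `rank_collections` are hypothetical.

```python
def score_collection(documents, query_terms):
    """Return (matching document count, summed weight of query terms).

    `documents` is a list of dicts mapping a term to its weight in that
    document (for example a tf-idf value).
    """
    matching_docs = 0
    score = 0.0
    for doc in documents:
        weights = [doc[term] for term in query_terms if term in doc]
        if weights:
            matching_docs += 1
            score += sum(weights)
    return matching_docs, score


def rank_collections(collections, query_terms):
    """Order collection names by descending score for the query."""
    scores = {
        name: score_collection(docs, query_terms)[1]
        for name, docs in collections.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

A broker would forward the query only to the first few names returned by `rank_collections`.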

Keywords: Distributed information retrieval, relevance, server selection, collection selection.

Downloads 1326
12304 Fast Extraction of Edge Histogram in DCT Domain based on MPEG7

Authors: Minyoung Eom, Yoonsik Choe

Abstract:

These days, multimedia data is transmitted and processed in compressed format. Due to the decoding procedure and the filtering required for edge detection, the feature extraction process of the MPEG-7 Edge Histogram Descriptor is time-consuming as well as computationally expensive. To improve the efficiency of compressed image retrieval, we propose in this paper a new edge histogram generation algorithm that works in the DCT domain. Using the edge information provided by only two AC coefficients of the DCT, we can obtain edge directions and strengths directly in the DCT domain. The experimental results demonstrate that our system performs well in terms of retrieval efficiency and effectiveness.
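The following sketch illustrates how two low-order AC coefficients can indicate edge direction and strength in an 8x8 block. The abstract does not say which two coefficients are used or how they are mapped to the MPEG-7 edge types, so the choice of the first horizontal and first vertical AC coefficients, the threshold, and the two-way classification are assumptions. In a compressed-domain system the coefficients would be read from the bitstream; they are computed from pixels here only to keep the example self-contained.

```python
import math


def dct_coefficient(block, u, v):
    """One coefficient of the 8x8 DCT-II.

    `u` is the vertical (row) frequency, `v` the horizontal (column) one.
    """
    cu = 1 / math.sqrt(2) if u == 0 else 1.0
    cv = 1 / math.sqrt(2) if v == 0 else 1.0
    total = 0.0
    for row in range(8):
        for col in range(8):
            total += (block[row][col]
                      * math.cos((2 * row + 1) * u * math.pi / 16)
                      * math.cos((2 * col + 1) * v * math.pi / 16))
    return 0.25 * cu * cv * total


def block_edge(block, threshold=10.0):
    """Classify a block as 'vertical', 'horizontal', or 'none' with a strength."""
    ac01 = dct_coefficient(block, 0, 1)  # responds to left-right intensity change
    ac10 = dct_coefficient(block, 1, 0)  # responds to top-bottom intensity change
    strength = math.hypot(ac01, ac10)
    if strength < threshold:
        return "none", strength
    direction = "vertical" if abs(ac01) >= abs(ac10) else "horizontal"
    return direction, strength
```

A block whose left half is dark and right half is bright yields a large `ac01` and a near-zero `ac10`, so it is reported as a vertical edge.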

Keywords: DCT, Descriptor, EHD, MPEG7.

Downloads 2069
12303 Frequency- and Content-Based Tag Cloud Font Distribution Algorithm

Authors: Ágnes Bogárdi-Mészöly, Takeshi Hashimoto, Shohei Yokoyama, Hiroshi Ishikawa

Abstract:

The spread of Web 2.0 has caused an explosion of user-generated content. Users can tag resources to describe and organize them. Tag clouds provide a rough impression of the relative importance of each tag within the overall cloud in order to facilitate browsing among numerous tags and resources. The goal of our paper is to enrich the visualization of tag clouds. A font distribution algorithm is proposed that calculates a novel metric based on frequency and content and classifies tags into classes using this metric, based on a power law distribution and percentages. The suggested algorithm has been validated and verified on the tag cloud of a real-world thesis portal.

Keywords: Tag cloud, font distribution algorithm, frequency-based, content-based, power law.

Downloads 2043
12302 Enhanced Frame-based Video Coding to Support Content-based Functionalities

Authors: Prabhudev Hosur, Rolando Carrasco

Abstract:

This paper presents an enhanced frame-based video coding scheme. The input source video to the enhanced frame-based video encoder consists of a rectangular-size video and the shapes of arbitrarily shaped objects on the video frames. The rectangular frame texture is encoded by the conventional frame-based coding technique, and the video object's shape is encoded using contour-based vertex coding. By utilizing the shape information in the bitstream, it is possible to achieve several useful content-based functionalities at the cost of a very small overhead to the bitrate.

Keywords: Video coding, content-based, hyper video, interactivity, shape coding, polygon.

Downloads 1613
12301 Fuzzy C-Means Clustering for Biomedical Documents Using Ontology Based Indexing and Semantic Annotation

Authors: S. Logeswari, K. Premalatha

Abstract:

Search is the most obvious application of information retrieval. The variety of widely obtainable biomedical data is enormous and is expanding fast. Because of this expansion, existing techniques are no longer sufficient to extract the most interesting patterns from the collection according to user requirements. Recent research concentrates more on semantic-based searching than on traditional term-based searches. Algorithms for semantic searches are implemented based on the relations that exist between the words of the documents. Ontologies are used as domain knowledge for identifying the semantic relations as well as to structure the data for effective information retrieval. Annotation of data with ontology concepts is one of the wide-ranging practices for clustering documents. In this paper, concept-based indexing and annotation-based indexing are proposed for clustering biomedical documents. The fuzzy c-means (FCM) clustering algorithm is used to cluster the documents. The performance of the proposed methods is compared with traditional term-based clustering on PubMed articles from five different disease communities. The experimental results show that the proposed methods outperform term-based fuzzy clustering.
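For readers unfamiliar with the clustering step named in this abstract, here is a minimal sketch of the standard fuzzy c-means iteration (membership update followed by centre update). It operates on generic numeric vectors; how documents are turned into concept or annotation vectors is not covered, and the function name, the explicit `centers` argument, and the default fuzzifier `m=2.0` are illustrative assumptions.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, iterations=50, tol=1e-6):
    """Standard FCM. Returns (centres, memberships from the last update)."""
    centers = [tuple(c) for c in centers]
    dims = len(points[0])
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if 0.0 in dists:
                # The point coincides with a centre: full membership there.
                zeros = dists.count(0.0)
                row = [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
            else:
                row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                       for d_i in dists]
            memberships.append(row)
        # Centre update: mean of the points weighted by u_ij ** m.
        new_centers = []
        for j, old in enumerate(centers):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old)  # no support: keep the old centre
                continue
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships
```

Each row of the returned memberships sums to 1, so a document can belong partly to several clusters, which is the property that distinguishes FCM from hard k-means.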

Keywords: MeSH Ontology, Concept Indexing, Annotation, semantic relations, Fuzzy c-means.

Downloads 2255
12300 Comparative Evaluation of Color-Based Video Signatures in the Presence of Various Distortion Types

Authors: Aritz Sánchez de la Fuente, Patrick Ndjiki-Nya, Karsten Sühring, Tobias Hinz, Karsten Müller, Thomas Wiegand

Abstract:

The robustness of color-based signatures in the presence of a selection of representative distortions is investigated. Considered are five signatures that have been developed and evaluated within a new modular framework. Two signatures presented in this work are directly derived from histograms gathered from video frames. The other three signatures are based on temporal information by computing difference histograms between adjacent frames. In order to obtain objective and reproducible results, the evaluations are conducted based on several randomly assembled test sets. These test sets are extracted from a video repository that contains a wide range of broadcast content including documentaries, sports, news, movies, etc. Overall, the experimental results show the adequacy of color-histogram-based signatures for video fingerprinting applications and indicate which type of signature should be preferred in the presence of certain distortions.
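The two signature families mentioned in this abstract, per-frame histograms and difference histograms between adjacent frames, can be sketched as below. Frames are simplified to flat lists of integer values from a single channel; the bin count and the function names are assumptions, and the five signatures evaluated in the paper are not reproduced.

```python
def frame_histogram(frame, bins=4, max_value=256):
    """Normalised histogram of one frame given as a flat list of integers."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // max_value] += 1
    return [count / len(frame) for count in counts]


def difference_histograms(frames, bins=4):
    """Bin-wise histogram differences between each pair of adjacent frames."""
    hists = [frame_histogram(frame, bins) for frame in frames]
    return [
        [cur - prev for prev, cur in zip(h_prev, h_cur)]
        for h_prev, h_cur in zip(hists, hists[1:])
    ]
```

For a clip of N frames, `difference_histograms` yields N-1 vectors that capture temporal change, whereas `frame_histogram` describes each frame on its own.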

Keywords: color histograms, robust hashing, video retrieval, video signature

Downloads 1409