Search results for: Latent Semantic Indexing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 399

Search results for: Latent Semantic Indexing

399 Topic Modeling Using Latent Dirichlet Allocation and Latent Semantic Indexing on South African Telco Twitter Data

Authors: Phumelele P. Kubheka, Pius A. Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users share their opinions on different subjects. Twitter can be considered a great source for mining text due to the high volumes of data generated through the platform daily. Many industries such as telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model in this experiment. A higher topic coherence score indicates better performance of the model.

Keywords: Big data, latent Dirichlet allocation, latent semantic indexing, Telco, topic modeling, Twitter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 378
398 Distributional Semantics Approach to Thai Word Sense Disambiguation

Authors: Sunee Pongpinigpinyo, Wanchai Rivepiboon

Abstract:

Word sense disambiguation is one of the most important open problems in natural language processing applications such as information retrieval and machine translation. Many approach strategies can be employed to resolve word ambiguity with a reasonable degree of accuracy. These strategies are: knowledgebased, corpus-based, and hybrid-based. This paper pays attention to the corpus-based strategy that employs an unsupervised learning method for disambiguation. We report our investigation of Latent Semantic Indexing (LSI), an information retrieval technique and unsupervised learning, to the task of Thai noun and verbal word sense disambiguation. The Latent Semantic Indexing has been shown to be efficient and effective for Information Retrieval. For the purposes of this research, we report experiments on two Thai polysemous words, namely  /hua4/ and /kep1/ that are used as a representative of Thai nouns and verbs respectively. The results of these experiments demonstrate the effectiveness and indicate the potential of applying vector-based distributional information measures to semantic disambiguation.

Keywords: Distributional semantics, Latent Semantic Indexing, natural language processing, Polysemous words, unsupervisedlearning, Word Sense Disambiguation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1755
397 An Semantic Algorithm for Text Categoritation

Authors: Xu Zhao

Abstract:

Text categorization techniques are widely used to many Information Retrieval (IR) applications. In this paper, we proposed a simple but efficient method that can automatically find the relationship between any pair of terms and documents, also an indexing matrix is established for text categorization. We call this method Indexing Matrix Categorization Machine (IMCM). Several experiments are conducted to show the efficiency and robust of our algorithm.

Keywords: Text categorization, Sub-space learning, Latent Semantic Space

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1414
396 A Content Vector Model for Text Classification

Authors: Eric Jiang

Abstract:

As a popular rank-reduced vector space approach, Latent Semantic Indexing (LSI) has been used in information retrieval and other applications. In this paper, an LSI-based content vector model for text classification is presented, which constructs multiple augmented category LSI spaces and classifies text by their content. The model integrates the class discriminative information from the training data and is equipped with several pertinent feature selection and text classification algorithms. The proposed classifier has been applied to email classification and its experiments on a benchmark spam testing corpus (PU1) have shown that the approach represents a competitive alternative to other email classifiers based on the well-known SVM and naïve Bayes algorithms.

Keywords: Feature Selection, Latent Semantic Indexing, Text Classification, Vector Space Model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1838
395 Semantic Indexing Approach of a Corpora Based On Ontology

Authors: Mohammed Erritali

Abstract:

The growth in the volume of text data such as books and articles in libraries for centuries has imposed to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing and the use of classification categories have marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (corpus) to allow a user to find the relevant ones for him, that is to say, the contents which matches with the information needs of the user. This paper presents a new semantic indexing approach of a documentary corpus. The indexing process starts first by a term weighting phase to determine the importance of these terms in the documents. Then the use of a thesaurus like Wordnet allows moving to the conceptual level. Each candidate concept is evaluated by determining its level of representation of the document, that is to say, the importance of the concept in relation to other concepts of the document. Finally, the semantic index is constructed by attaching to each concept of the ontology, the documents of the corpus in which these concepts are found.

Keywords: Semantic, indexing, corpora, WordNet, ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1328
394 Content-based Retrieval of Medical Images

Authors: Lilac A. E. Al-Safadi

Abstract:

With the advance of multimedia and diagnostic images technologies, the number of radiographic images is increasing constantly. The medical field demands sophisticated systems for search and retrieval of the produced multimedia document. This paper presents an ongoing research that focuses on the semantic content of radiographic image documents to facilitate semantic-based radiographic image indexing and a retrieval system. The proposed model would divide a radiographic image document, based on its semantic content, and would be converted into a logical structure or a semantic structure. The logical structure represents the overall organization of information. The semantic structure, which is bound to logical structure, is composed of semantic objects with interrelationships in the various spaces in the radiographic image.

Keywords: Semantic Indexing, Content-Based Retrieval, Radiographic Images, Data Model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1446
393 Fuzzy C-Means Clustering for Biomedical Documents Using Ontology Based Indexing and Semantic Annotation

Authors: S. Logeswari, K. Premalatha

Abstract:

Search is the most obvious application of information retrieval. The variety of widely obtainable biomedical data is enormous and is expanding fast. This expansion makes the existing techniques are not enough to extract the most interesting patterns from the collection as per the user requirement. Recent researches are concentrating more on semantic based searching than the traditional term based searches. Algorithms for semantic searches are implemented based on the relations exist between the words of the documents. Ontologies are used as domain knowledge for identifying the semantic relations as well as to structure the data for effective information retrieval. Annotation of data with concepts of ontology is one of the wide-ranging practices for clustering the documents. In this paper, indexing based on concept and annotation are proposed for clustering the biomedical documents. Fuzzy c-means (FCM) clustering algorithm is used to cluster the documents. The performances of the proposed methods are analyzed with traditional term based clustering for PubMed articles in five different diseases communities. The experimental results show that the proposed methods outperform the term based fuzzy clustering.

Keywords: MeSH Ontology, Concept Indexing, Annotation, semantic relations, Fuzzy c-means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2250
392 Object-Based Image Indexing and Retrieval in DCT Domain using Clustering Techniques

Authors: Hossein Nezamabadi-pour, Saeid Saryazdi

Abstract:

In this paper, we present a new and effective image indexing technique that extracts features directly from DCT domain. Our proposed approach is an object-based image indexing. For each block of size 8*8 in DCT domain a feature vector is extracted. Then, feature vectors of all blocks of image using a k-means algorithm is clustered into groups. Each cluster represents a special object of the image. Then we select some clusters that have largest members after clustering. The centroids of the selected clusters are taken as image feature vectors and indexed into the database. Also, we propose an approach for using of proposed image indexing method in automatic image classification. Experimental results on a database of 800 images from 8 semantic groups in automatic image classification are reported.

Keywords: Object-based image retrieval, DCT domain, Image indexing, Image classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1977
391 UB-Tree Indexing for Semantic Query Optimization of Range Queries

Authors: S. Housseno, A. Simonet, M. Simonet

Abstract:

Semantic query optimization consists in restricting the search space in order to reduce the set of objects of interest for a query. This paper presents an indexing method based on UB-trees and a static analysis of the constraints associated to the views of the database and to any constraint expressed on attributes. The result of the static analysis is a partitioning of the object space into disjoint blocks. Through Space Filling Curve (SFC) techniques, each fragment (block) of the partition is assigned a unique identifier, enabling the efficient indexing of fragments by UB-trees. The search space corresponding to a range query is restricted to a subset of the blocks of the partition. This approach has been developed in the context of a KB-DBMS but it can be applied to any relational system.

Keywords: Index, Range query, UB-tree, Space Filling Curve, Query optimization, Views, Database, Integrity Constraint, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1450
390 Indexing and Searching of Image Data in Multimedia Databases Using Axial Projection

Authors: Khalid A. Kaabneh

Abstract:

This paper introduces and studies new indexing techniques for content-based queries in images databases. Indexing is the key to providing sophisticated, accurate and fast searches for queries in image data. This research describes a new indexing approach, which depends on linear modeling of signals, using bases for modeling. A basis is a set of chosen images, and modeling an image is a least-squares approximation of the image as a linear combination of the basis images. The coefficients of the basis images are taken together to serve as index for that image. The paper describes the implementation of the indexing scheme, and presents the findings of our extensive evaluation that was conducted to optimize (1) the choice of the basis matrix (B), and (2) the size of the index A (N). Furthermore, we compare the performance of our indexing scheme with other schemes. Our results show that our scheme has significantly higher performance.

Keywords: Axial Projection, images, indexing, multimedia database, searching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1344
389 Enhancing Retrieval Effectiveness of Malay Documents by Exploiting Implicit Semantic Relationship between Words

Authors: Mohd Pouzi Hamzah, Tengku Mohd Tengku Sembok

Abstract:

Phrases has a long history in information retrieval, particularly in commercial systems. Implicit semantic relationship between words in a form of BaseNP have shown significant improvement in term of precision in many IR studies. Our research focuses on linguistic phrases which is language dependent. Our results show that using BaseNP can improve performance although above 62% of words formation in Malay Language based on derivational affixes and suffixes.

Keywords: Information Retrieval, Malay Language, Semantic Relationship, Retrieval Effectiveness, Conceptual Indexing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1377
388 Toward a Use of Ontology to Reinforcing Semantic Classification of Message Based On LSA

Authors: S. Lgarch, M. Khalidi Idrissi, S. Bennani

Abstract:

For best collaboration, Asynchronous tools and particularly the discussion forums are the most used thanks to their flexibility in terms of time. To convey only the messages that belong to a theme of interest of the tutor in order to help him during his tutoring work, use of a tool for classification of these messages is indispensable. For this we have proposed a semantics classification tool of messages of a discussion forum that is based on LSA (Latent Semantic Analysis), which includes a thesaurus to organize the vocabulary. Benefits offered by formal ontology can overcome the insufficiencies that a thesaurus generates during its use and encourage us then to use it in our semantic classifier. In this work we propose the use of some functionalities that a OWL ontology proposes. We then explain how functionalities like “ObjectProperty", "SubClassOf" and “Datatype" property make our classification more intelligent by way of integrating new terms. New terms found are generated based on the first terms introduced by tutor and semantic relations described by OWL formalism.

Keywords: Classification of messages, collaborative communication tools, discussion forum, e-learning, formal description, latente semantic analysis, ontology, owl, semantic relations, semantic web, thesaurus, tutoring.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1568
387 Parallel Querying of Distributed Ontologies with Shared Vocabulary

Authors: Sharjeel Aslam, Vassil Vassilev, Karim Ouazzane

Abstract:

Ontologies and various semantic repositories became a convenient approach for implementing model-driven architectures of distributed systems on the Web. SPARQL is the standard query language for querying such. However, although SPARQL is well-established standard for querying semantic repositories in RDF and OWL format and there are commonly used APIs which supports it, like Jena for Java, its parallel option is not incorporated in them. This article presents a complete framework consisting of an object algebra for parallel RDF and an index-based implementation of the parallel query engine capable of dealing with the distributed RDF ontologies which share common vocabulary. It has been implemented in Java, and for validation of the algorithms has been applied to the problem of organizing virtual exhibitions on the Web.

Keywords: Distributed ontologies, parallel querying, semantic indexing, shared vocabulary, SPARQL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 605
386 Lecture Video Indexing and Retrieval Using Topic Keywords

Authors: B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa

Abstract:

In this paper, we propose a framework to help users to search and retrieve the portions in the lecture video of their interest. This is achieved by temporally segmenting and indexing the lecture video using the topic keywords. We use transcribed text from the video and documents relevant to the video topic extracted from the web for this purpose. The keywords for indexing are found by applying the non-negative matrix factorization (NMF) topic modeling techniques on the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end time of the portions of the video for a particular topic. This time information is stored in the index table along with the topic keyword which is used to retrieve the specific portions of the video for the query provided by the users.

Keywords: Video indexing and retrieval, lecture videos, content based video search, multimodal indexing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1492
385 Concept Indexing using Ontology and Supervised Machine Learning

Authors: Rossitza M. Setchi, Qiao Tang

Abstract:

Nowadays, ontologies are the only widely accepted paradigm for the management of sharable and reusable knowledge in a way that allows its automatic interpretation. They are collaboratively created across the Web and used to index, search and annotate documents. The vast majority of the ontology based approaches, however, focus on indexing texts at document level. Recently, with the advances in ontological engineering, it became clear that information indexing can largely benefit from the use of general purpose ontologies which aid the indexing of documents at word level. This paper presents a concept indexing algorithm, which adds ontology information to words and phrases and allows full text to be searched, browsed and analyzed at different levels of abstraction. This algorithm uses a general purpose ontology, OntoRo, and an ontologically tagged corpus, OntoCorp, both developed for the purpose of this research. OntoRo and OntoCorp are used in a two-stage supervised machine learning process aimed at generating ontology tagging rules. The first experimental tests show a tagging accuracy of 78.91% which is encouraging in terms of the further improvement of the algorithm.

Keywords: Concepts, indexing, machine learning, ontology, tagging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1629
384 Approaches to Developing Semantic Web Services

Authors: Jorge Cardoso

Abstract:

It has been recognized that due to the autonomy and heterogeneity, of Web services and the Web itself, new approaches should be developed to describe and advertise Web services. The most notable approaches rely on the description of Web services using semantics. This new breed of Web services, termed semantic Web services, will enable the automatic annotation, advertisement, discovery, selection, composition, and execution of interorganization business logic, making the Internet become a common global platform where organizations and individuals communicate with each other to carry out various commercial activities and to provide value-added services. This paper deals with two of the hottest R&D and technology areas currently associated with the Web – Web services and the semantic Web. It describes how semantic Web services extend Web services as the semantic Web improves the current Web, and presents three different conceptual approaches to deploying semantic Web services, namely, WSDL-S, OWL-S, and WSMO.

Keywords: Semantic Web, Web service, Web process, WWW

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1385
383 Semantic Markup for Web Applications

Authors: Martin Dostal, Dalibor Fiala, Karel Ježek

Abstract:

In this paper we would like to introduce some of the best practices of using semantic markup and its significance in the success of web applications. Search engines are one of the best ways to reach potential customers and are some of the main indicators of web sites' fruitfulness. We will introduce the most important semantic vocabularies which are used by Google and Yahoo. Afterwards, we will explain the process of semantic markup implementation and its significance for search engines and other semantic markup consumers. We will describe techniques for slow conceiving RDFa markup to our web application for collecting Call for papers (CFP) announcements.

Keywords: Call for papers, Google, RDFa, semantic markup, semantic web, Yahoo.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1730
382 Image Indexing Using a Color Similarity Metric based on the Human Visual System

Authors: Angelo Nodari, Ignazio Gallo

Abstract:

The novelty proposed in this study is twofold and consists in the developing of a new color similarity metric based on the human visual system and a new color indexing based on a textual approach. The new color similarity metric proposed is based on the color perception of the human visual system. Consequently the results returned by the indexing system can fulfill as much as possibile the user expectations. We developed a web application to collect the users judgments about the similarities between colors, whose results are used to estimate the metric proposed in this study. In order to index the image's colors, we used a text indexing engine to facilitate the integration of visual features in a database of text documents. The textual signature is build by weighting the image's colors in according to their occurrence in the image. The use of a textual indexing engine, provide us a simple, fast and robust solution to index images. A typical usage of the system proposed in this study, is the development of applications whose data type is both visual and textual. In order to evaluate the proposed method we chose a price comparison engine as a case of study, collecting a series of commercial offers containing the textual description and the image representing a specific commercial offer.

Keywords: Color Extraction, Content-Based Image Retrieval, Indexing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2983
381 A Proposed Trust Model for the Semantic Web

Authors: Hoda Waguih

Abstract:

A serious problem on the WWW is finding reliable information. Not everything found on the Web is true and the Semantic Web does not change that in any way. The problem will be even more crucial for the Semantic Web, where agents will be integrating and using information from multiple sources. Thus, if an incorrect premise is used due to a single faulty source, then any conclusions drawn may be in error. Thus, statements published on the Semantic Web have to be seen as claims rather than as facts, and there should be a way to decide which among many possibly inconsistent sources is most reliable. In this work, we propose a trust model for the Semantic Web. The proposed model is inspired by the use trust in human society. Trust is a type of social knowledge and encodes evaluations about which agents can be taken as reliable sources of information or services. Our proposed model allows agents to decide which among different sources of information to trust and thus act rationally on the semantic web.

Keywords: Semantic Web, Trust, Web of Trust, WWW.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1489
380 On the Move to Semantic Web Services

Authors: Jorge Cardoso

Abstract:

Semantic Web services will enable the semiautomatic and automatic annotation, advertisement, discovery, selection, composition, and execution of inter-organization business logic, making the Internet become a common global platform where organizations and individuals communicate with each other to carry out various commercial activities and to provide value-added services. There is a growing consensus that Web services alone will not be sufficient to develop valuable solutions due the degree of heterogeneity, autonomy, and distribution of the Web. This paper deals with two of the hottest R&D and technology areas currently associated with the Web – Web services and the Semantic Web. It presents the synergies that can be created between Web Services and Semantic Web technologies to provide a new generation of eservices.

Keywords: Semantic Web, Web service, Web process, WWW.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1241
379 Grouping and Indexing Color Features for Efficient Image Retrieval

Authors: M. V. Sudhamani, C. R. Venugopal

Abstract:

Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image based on matching of features derived from the image content. This paper focuses on a low-dimensional color based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. Then the cluster (region) mode is used as representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method that uses *R -tree thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidian distance. Only representative (centroids) features of these clusters are indexed using *R -tree thus improving the efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A JAVA based query engine supporting query-by- example is built to retrieve images by color.

Keywords: Content-based, indexing, cluster, region.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1767
378 A Materialized Approach to the Integration of XML Documents: the OSIX System

Authors: H. Ahmad, S. Kermanshahani, A. Simonet, M. Simonet

Abstract:

The data exchanged on the Web are of different nature from those treated by the classical database management systems; these data are called semi-structured data since they do not have a regular and static structure like data found in a relational database; their schema is dynamic and may contain missing data or types. Therefore, the needs for developing further techniques and algorithms to exploit and integrate such data, and extract relevant information for the user have been raised. In this paper we present the system OSIX (Osiris based System for Integration of XML Sources). This system has a Data Warehouse model designed for the integration of semi-structured data and more precisely for the integration of XML documents. The architecture of OSIX relies on the Osiris system, a DL-based model designed for the representation and management of databases and knowledge bases. Osiris is a viewbased data model whose indexing system supports semantic query optimization. We show that the problem of query processing on a XML source is optimized by the indexing approach proposed by Osiris.

Keywords: Data integration, semi-structured data, views, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1540
377 Latent Semantic Inference for Agriculture FAQ Retrieval

Authors: Dawei Wang, Rujing Wang, Ying Li, Baozi Wei

Abstract:

FAQ system can make user find answer to the problem that puzzles them. But now the research on Chinese FAQ system is still on the theoretical stage. This paper presents an approach to semantic inference for FAQ mining. To enhance the efficiency, a small pool of the candidate question-answering pairs retrieved from the system for the follow-up work according to the concept of the agriculture domain extracted from user input .Input queries or questions are converted into four parts, the question word segment (QWS), the verb segment (VS), the concept of agricultural areas segment (CS), the auxiliary segment (AS). A semantic matching method is presented to estimate the similarity between the semantic segments of the query and the questions in the pool of the candidate. A thesaurus constructed from the HowNet, a Chinese knowledge base, is adopted for word similarity measure in the matcher. The questions are classified into eleven intension categories using predefined question stemming keywords. For FAQ mining, given a query, the question part and answer part in an FAQ question-answer pair is matched with the input query, respectively. Finally, the probabilities estimated from these two parts are integrated and used to choose the most likely answer for the input query. These approaches are experimented on an agriculture FAQ system. Experimental results indicate that the proposed approach outperformed the FAQ-Finder system in agriculture FAQ retrieval.

Keywords: FAQ, Semantic Inference, Ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1335
376 Toward An Agreement on Semantic Web Architecture

Authors: Haytham Al-Feel, M.A.Koutb, Hoda Suoror

Abstract:

There are many problems associated with the World Wide Web: getting lost in the hyperspace; the web content is still accessible only to humans and difficulties of web administration. The solution to these problems is the Semantic Web which is considered to be the extension for the current web presents information in both human readable and machine processable form. The aim of this study is to reach new generic foundation architecture for the Semantic Web because there is no clear architecture for it, there are four versions, but still up to now there is no agreement for one of these versions nor is there a clear picture for the relation between different layers and technologies inside this architecture. This can be done depending on the idea of previous versions as well as Gerber-s evaluation method as a step toward an agreement for one Semantic Web architecture.

Keywords: Semantic Web Architecture, XML, RDF and Ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1653
375 Categorizing Search Result Records Using Word Sense Disambiguation

Authors: R. Babisaraswathi, N. Shanthi, S. S. Kiruthika

Abstract:

Web search engines are designed to retrieve and extract the information in the web databases and to return dynamic web pages. The Semantic Web is an extension of the current web in which it includes semantic content in web pages. The main goal of semantic web is to promote the quality of the current web by changing its contents into machine understandable form. Therefore, the milestone of semantic web is to have semantic level information in the web. Nowadays, people use different keyword- based search engines to find the relevant information they need from the web. But many of the words are polysemous. When these words are used to query a search engine, it displays the Search Result Records (SRRs) with different meanings. The SRRs with similar meanings are grouped together based on Word Sense Disambiguation (WSD). In addition to that semantic annotation is also performed to improve the efficiency of search result records. Semantic Annotation is the process of adding the semantic metadata to web resources. Thus the grouped SRRs are annotated and generate a summary which describes the information in SRRs. But the automatic semantic annotation is a significant challenge in the semantic web. Here ontology and knowledge based representation are used to annotate the web pages.

Keywords: Ontology, Semantic Web, WordNet, Word Sense Disambiguation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1712
374 An Evaluation Model for Semantic Enablement of Virtual Research Environments

Authors: Tristan O'Neill, Trina Myers, Jarrod Trevathan

Abstract:

The Tropical Data Hub (TDH) is a virtual research environment that provides researchers with an e-research infrastructure to congregate significant tropical data sets for data reuse, integration, searching, and correlation. However, researchers often require data and metadata synthesis across disciplines for crossdomain analyses and knowledge discovery. A triplestore offers a semantic layer to achieve a more intelligent method of search to support the synthesis requirements by automating latent linkages in the data and metadata. Presently, the benchmarks to aid the decision of which triplestore is best suited for use in an application environment like the TDH are limited to performance. This paper describes a new evaluation tool developed to analyze both features and performance. The tool comprises a weighted decision matrix to evaluate the interoperability, functionality, performance, and support availability of a range of integrated and native triplestores to rank them according to requirements of the TDH.

Keywords: Virtual research environment, Semantic Web, performance analysis, tropical data hub.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1730
373 A Formulation of the Latent Class Vector Model for Pairwise Data

Authors: Tomoya Okubo, Kuninori Nakamura, Shin-ichi Mayekawa

Abstract:

In this research, a latent class vector model for pairwise data is formulated. As compared to the basic vector model, this model yields consistent estimates of the parameters since the number of parameters to be estimated does not increase with the number of subjects. The result of the analysis reveals that the model was stable and could classify each subject to the latent classes representing the typical scales used by these subjects.

Keywords: finite mixture models, latent class analysis, Thrustone's paired comparison method, vector model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1165
372 New Methods for E-Commerce Databases Designing in Semantic Web Systems (Modern Systems)

Authors: Karim Heidari, Serajodin Katebi, Ali Reza Mahdavi Far

Abstract:

The purpose of this paper is to study Database Models to use them efficiently in E-commerce websites. In this paper we are going to find a method which can save and retrieve information in Ecommerce websites. Thus, semantic web applications can work with, and we are also going to study different technologies of E-commerce databases and we know that one of the most important deficits in semantic web is the shortage of semantic data, since most of the information is still stored in relational databases, we present an approach to map legacy data stored in relational databases into the Semantic Web using virtually any modern RDF query language, as long as it is closed within RDF. To achieve this goal we study XML structures for relational data bases of old websites and eventually we will come up one level over XML and look for a map from relational model (RDM) to RDF. Noting that a large number of semantic webs get advantage of relational model, opening the ways which can be converted to XML and RDF in modern systems (semantic web) is important.

Keywords: E-Commerce, Semantic Web, Database, XML, RDF.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2266
371 A Contribution to the Polynomial Eigen Problem

Authors: Malika Yaici, Kamel Hariche, Tim Clarke

Abstract:

The relationship between eigenstructure (eigenvalues and eigenvectors) and latent structure (latent roots and latent vectors) is established. In control theory eigenstructure is associated with the state space description of a dynamic multi-variable system and a latent structure is associated with its matrix fraction description. Beginning with block controller and block observer state space forms and moving on to any general state space form, we develop the identities that relate eigenvectors and latent vectors in either direction. Numerical examples illustrate this result. A brief discussion of the potential of these identities in linear control system design follows. Additionally, we present a consequent result: a quick and easy method to solve the polynomial eigenvalue problem for regular matrix polynomials.

Keywords: Eigenvalues/Eigenvectors, Latent values/vectors, Matrix fraction description, State space description.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1841
370 The Semantic Web: a New Approach for Future World Wide Web

Authors: Sahar Nasrolahi, Mahdi Nikdast, Mehrdad Mahdavi Boroujerdi

Abstract:

The purpose of semantic web research is to transform the Web from a linked document repository into a distributed knowledge base and application platform, thus allowing the vast range of available information and services to be more efficiently exploited. As a first step in this transformation, languages such as OWL have been developed. Although fully realizing the Semantic Web still seems some way off, OWL has already been very successful and has rapidly become a defacto standard for ontology development in fields as diverse as geography, geology, astronomy, agriculture, defence and the life sciences. The aim of this paper is to classify key concepts of Semantic Web as well as introducing a new practical approach which uses these concepts to outperform Word Wide Web.

Keywords: Semantic Web, Ontology, OWL, Microformat, Word Wide Web.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1554