Search results for: semantic segmentation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 593

83 On-line Lao Handwritten Recognition with Proportional Invariant Feature

Authors: Khampheth Bounnady, Boontee Kruatrachue, Somkiat Wangsiripitak

Abstract:

This paper proposes a high-level feature for online Lao handwriting recognition. The feature must be high-level enough that it does not change when characters are written by different people at different speeds and in different proportions (shorter or longer strokes, heads, tails, loops, curves). A character is divided into a sequence of curve segments, where a new segment starts wherever the curve reverses its rotation (between counter-clockwise and clockwise). For each segment, the following features are gathered: cumulative change in direction of the curve (negative for clockwise), cumulative curve length, and cumulative length travelled left to right, right to left, top to bottom and bottom to top (the cumulative change along the X and Y axes of the segment). This feature is simple yet robust enough for high-accuracy recognition. It can be gathered by parsing the original time-sampled sequence of X, Y pen locations without re-sampling. We also experiment with other segmentation points, such as the maximum curvature point, which has been widely used by other researchers. Experimental results show a recognition rate of 94.62%, compared with 75.07% when using maximum curvature points. This is due to the large variation of turning points in handwriting.
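
The segmentation rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `segment_features`, the dictionary keys, the `min_turn` tolerance and the choice of a y axis pointing up are all assumptions made for the example.

```python
import math

def segment_features(points, min_turn=1e-9):
    """Split a pen trajectory wherever its rotation direction reverses.

    points: list of (x, y) samples in writing order. The y axis is assumed
    to point up; negate y first when working in screen coordinates.
    Returns one feature dictionary per curve segment.
    """
    def new_segment():
        return {"turn": 0.0, "length": 0.0,
                "left_to_right": 0.0, "right_to_left": 0.0,
                "top_to_bottom": 0.0, "bottom_to_top": 0.0}

    segments = []
    current = new_segment()
    prev_heading = None
    prev_sign = 0  # +1 counter-clockwise, -1 clockwise, 0 not yet known
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # repeated sample, the pen did not move
        heading = math.atan2(dy, dx)
        if prev_heading is not None:
            diff = heading - prev_heading
            # Signed turning angle wrapped into (-pi, pi]; negative is clockwise.
            turn = math.atan2(math.sin(diff), math.cos(diff))
            sign = 0 if abs(turn) < min_turn else (1 if turn > 0 else -1)
            if sign != 0:
                if prev_sign != 0 and sign != prev_sign:
                    # Rotation reversed: close this segment and start a new one.
                    segments.append(current)
                    current = new_segment()
                prev_sign = sign
            current["turn"] += turn
        current["length"] += math.hypot(dx, dy)
        if dx > 0:
            current["left_to_right"] += dx
        else:
            current["right_to_left"] += -dx
        if dy > 0:
            current["bottom_to_top"] += dy
        else:
            current["top_to_bottom"] += -dy
        prev_heading = heading
    segments.append(current)
    return segments
```

Because the features are accumulated directly over the raw samples, no re-sampling step is needed, which matches the claim in the abstract. An S-shaped stroke such as `[(0, 0), (1, 0), (2, 1), (2, 2), (3, 3), (4, 3)]` yields two segments, one counter-clockwise and one clockwise.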

Keywords: Handwritten feature, chain code, Lao handwritten recognition.

Downloads: 2032
82 A Structural Support Vector Machine Approach for Biometric Recognition

Authors: Vishal Awasthi, Atul Kumar Agnihotri

Abstract:

The face is a strong, non-intrusive biometric for identification and for distinguishing original faces from dummy faces produced by different artificial means. Face recognition is extremely important in the contexts of computer vision, psychology, surveillance, pattern recognition, neural networks, and content-based video processing. The availability of a widespread face database is crucial for testing the performance of face recognition algorithms. The openly available face databases include face images with a wide range of poses, illumination, gestures and face occlusions, but no dummy face database is accessible in the public domain. This paper presents a face detection algorithm based on image segmentation in terms of distance from a fixed point and on template matching methods. The proposed work uses the most appropriate number of nodal points, resulting in the most appropriate outcomes in terms of face recognition and detection. The time taken to identify and extract distinctive facial features is improved, in the range of 90 to 110 s, with a 3% increase in efficiency.

Keywords: Face recognition, Principal Component Analysis, PCA, Linear Discriminant Analysis, LDA, Improved Support Vector Machine, iSVM, elastic bunch mapping technique.

Downloads: 494
81 Effect Comparison of Speckle Noise Reduction Filters on 2D-Echocardigraphic Images

Authors: Faten A. Dawood, Rahmita W. Rahmat, Suhaini B. Kadiman, Lili N. Abdullah, Mohd D. Zamrin

Abstract:

Echocardiography is one of the most common diagnostic tests and is widely used for assessing abnormalities of regional heart ventricle function. The main goal of the image enhancement task in 2D echocardiography (2DE) is to address two major problems that obscure anatomical structures: speckle noise and low image quality. Speckle noise reduction is therefore an important pre-processing step that reduces distortion effects in 2DE image segmentation. In this paper, we present common filters based on some form of low-pass spatial smoothing, such as the Mean, Gaussian, and Median filters. The Laplacian filter is used as a high-pass sharpening filter. A comparative analysis tests the effectiveness of these filters after they are applied to original 2DE images of 4-chamber and 2-chamber views. Three statistical measures, root mean square error (RMSE), peak signal-to-noise ratio (PSNR) and signal-to-noise ratio (SNR), are used to evaluate filter performance quantitatively on the enhanced output image.
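
The three quality measures named in this abstract can be sketched as follows. The abstract does not give its exact formulas, so the definitions below are common textbook ones and should be read as assumptions: PSNR uses a peak value of 255 (8-bit images), and SNR is the ratio of reference signal energy to error energy. The function names are illustrative.

```python
import math

def _flatten(image):
    """Turn a list of pixel rows into one flat list of floats."""
    return [float(v) for row in image for v in row]

def rmse(reference, filtered):
    """Root mean square error between two images of equal size."""
    ref, out = _flatten(reference), _flatten(filtered)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref))

def psnr(reference, filtered, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit pixels."""
    err = rmse(reference, filtered)
    return math.inf if err == 0 else 20.0 * math.log10(peak / err)

def snr(reference, filtered):
    """Signal-to-noise ratio in dB (reference energy over error energy)."""
    ref, out = _flatten(reference), _flatten(filtered)
    noise = sum((a - b) ** 2 for a, b in zip(ref, out))
    signal = sum(a * a for a in ref)
    if noise == 0:
        return math.inf
    if signal == 0:
        return -math.inf
    return 10.0 * math.log10(signal / noise)
```

A lower RMSE and a higher PSNR or SNR indicate that the filtered image stays closer to the reference; which image serves as the reference in a speckle study is a design decision the abstract does not specify.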

Keywords: Gaussian operator, median filter, speckle texture, peak signal-to-noise ratio

Downloads: 1995
80 Teaching Linguistic Humour Research Theories: Egyptian Higher Education EFL Literature Classes

Authors: O. F. Elkommos

Abstract:

“Humour studies” is a relatively recent interdisciplinary research area. It interests researchers from psychology, sociology, medicine, nursing, workplace studies and gender studies, among others, and certainly from teaching, language learning, linguistics, and literature. Linguistic theories of humour research are numerous, and some of them are of interest to the present study. Although humour courses are now taught in universities around the world, they are not included in the Egyptian context. The purpose of the present study is two-fold: to review the state of the art and to show how linguistic theories of humour can be used as an art and craft of teaching and learning in EFL literature classes. In the present study, linguistic theories of humour were applied to selected literary texts to interpret humour as an intrinsic artistic communicative competence challenge. Humour in the area of linguistics was seen as a fifth component of the communicative competence of the second language learner. In literature it was studied as satire, irony, wit, or comedy. Linguistic theories of humour now describe its linguistic structure, mechanism, function, and linguistic deviance. Semantic Script Theory of Verbal Humor (SSTH), General Theory of Verbal Humor (GTVH), Audience Based Theory of Humor (ABTH), and their extensions and subcategories, as well as the pragmatic perspective, were employed in the analyses. This research analysed the linguistic semantic structure of humour, its mechanism, and how the audience reader (teacher or learner) becomes an interactive interpreter of the humour. This promotes humour competence together with linguistic, social, cultural, and discourse communicative competence. Studying humour as part of literary texts, and perceiving its function in the work, also brings its positive associations into class for educational purposes. Humour is by default a provoking, laughter-generating device.
Recognising, perceiving and resolving incongruity is a cognitive mastery. This cognitive process involves a humour experience that lightens up the classroom and the mind. It establishes connections necessary for the learning process. In this context the study examined selected narratives to exemplify the application of the theories. It is therefore recommended that the theories be taught and applied to literary texts for a better understanding of the language. Students will then develop their language competence. Teachers in EFL/ESL classes will teach the theories, assist students in applying them and interpreting texts, and in the process will also use humour. This eases students' acquisition of the second language and makes the classroom an enjoyable, cheerful, self-assuring, and self-illuminating experience for both teachers and students. It is further recommended that courses in humour research studies become an integral part of higher education curricula in Egypt.

Keywords: ABTH, deviance, disjuncture, episodic, GTVH, humour competence, humour comprehension, humour in the classroom, humour in the literary texts, humour research linguistic theories, incongruity- resolution, isotopy-disjunction, jab line, longer text joke, narrative story line (macro-micro), punch line, six knowledge resource, SSTH, stacks, strands, teaching linguistics, teaching literature, TEFL, TESL.

Downloads: 1410
79 Automated Heart Sound Classification from Unsegmented Phonocardiogram Signals Using Time Frequency Features

Authors: Nadia Masood Khan, Muhammad Salman Khan, Gul Muhammad Khan

Abstract:

Cardiologists perform cardiac auscultation to detect abnormalities in heart sounds. Since accurate auscultation is a crucial first step in screening patients with heart diseases, there is a need to develop computer-aided detection/diagnosis (CAD) systems that assist cardiologists in interpreting heart sounds and provide second opinions. In this paper, different algorithms are implemented for automated heart sound classification using unsegmented phonocardiogram (PCG) signals. Support vector machine (SVM), artificial neural network (ANN) and Cartesian genetic programming evolved artificial neural network (CGPANN) models are explored without the application of any segmentation algorithm. The signals are first pre-processed to remove unwanted frequencies. Both time- and frequency-domain features are then extracted for training the different models. The algorithms are tested in multiple scenarios and their strengths and weaknesses are discussed. Results indicate that SVM outperforms the rest with an accuracy of 73.64%.

Keywords: Pattern recognition, machine learning, computer aided diagnosis, heart sound classification, and feature extraction.

Downloads: 1284
78 Semi-Automatic Analyzer to Detect Authorial Intentions in Scientific Documents

Authors: Kanso Hassan, Elhore Ali, Soule-dupuy Chantal, Tazi Said

Abstract:

Information Retrieval aims to study models and build systems that allow a user to find the relevant documents suited to his or her information need. Information search remains a difficult problem because of the difficulty of representing and processing natural languages, for example in the presence of polysemy. Intentional Structures promise to be a new paradigm for extending existing document structures and enhancing the different phases of document processing, such as creation, editing, search and retrieval. Recognizing the intentions of the authors of texts can reduce the scale of this problem. In this article, we present an intention recognition system based on a semi-automatic method for extracting intentional information from a text corpus. The system is also able to update the ontology of intentions to enrich the knowledge base containing all possible intentions of a domain. This approach relies on the construction of a semi-formal ontology, which is considered to be the conceptualization of the intentional information contained in a text. An experiment on scientific publications in the field of computer science was carried out to validate this approach.

Keywords: Information research, text analysis, intentional structure, segmentation, ontology, natural language processing.

Downloads: 1638
77 DocPro: A Framework for Processing Semantic and Layout Information in Business Documents

Authors: Ming-Jen Huang, Chun-Fang Huang, Chiching Wei

Abstract:

With the recent advance of the deep neural network, we observe new applications of NLP (natural language processing) and CV (computer vision) powered by deep neural networks for processing business documents. However, creating a real-world document processing system needs to integrate several NLP and CV tasks, rather than treating them separately. There is a need to have a unified approach for processing documents containing textual and graphical elements with rich formats, diverse layout arrangement, and distinct semantics. In this paper, a framework that fulfills this unified approach is presented. The framework includes a representation model definition for holding the information generated by various tasks and specifications defining the coordination between these tasks. The framework is a blueprint for building a system that can process documents with rich formats, styles, and multiple types of elements. The flexible and lightweight design of the framework can help build a system for diverse business scenarios, such as contract monitoring and reviewing.

Keywords: Document processing, framework, formal definition, machine learning.

Downloads: 640
76 Binarization of Text Region based on Fuzzy Clustering and Histogram Distribution in Signboards

Authors: Jonghyun Park, Toan Nguyen Dinh, Gueesang Lee

Abstract:

In this paper, we present a novel approach to accurately detect text regions, including the shop name, in signboard images with complex backgrounds for mobile system applications. The proposed method combines text detection using edge profiles with region segmentation using the fuzzy c-means method. In the first step, we apply an elaborate Canny edge operator to extract all possible object edges. Then, edge profile analysis in the vertical and horizontal directions is performed on these edge pixels to detect the potential text region containing the shop name in a signboard. The edge profile and geometrical characteristics of each object contour are carefully examined to construct candidate text regions and separate the main text region from the background. Finally, the fuzzy c-means algorithm is applied to segment and binarize the detected text region. Experimental results show that the proposed method is robust in text detection with respect to different character sizes and colors and provides reliable text binarization results.
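
The final binarization step can be illustrated with a small fuzzy c-means sketch. This is an assumption-laden example rather than the paper's pipeline: it clusters scalar gray levels only, uses the standard fuzzy c-means update rules with fuzzifier `m`, and initializes the centers deterministically between the minimum and maximum value. The name `fuzzy_c_means_1d` is illustrative.

```python
def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, n_iter=50):
    """Standard fuzzy c-means on scalar values (for example gray levels).

    Returns (centers, memberships); memberships[j][i] is the degree to which
    values[j] belongs to cluster i, and each row sums to 1.
    """
    if n_clusters < 2 or m <= 1.0:
        raise ValueError("need n_clusters >= 2 and fuzzifier m > 1")
    lo, hi = min(values), max(values)
    # Deterministic initialization: spread the centers evenly over the range.
    centers = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]

    def memberships_for(x):
        dists = [abs(x - c) for c in centers]
        zeros = [d == 0.0 for d in dists]
        if any(zeros):
            # A point sitting exactly on a center belongs fully to it.
            n_zero = sum(zeros)
            return [1.0 / n_zero if z else 0.0 for z in zeros]
        inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(inv)
        return [v / total for v in inv]

    for _ in range(n_iter):
        u = [memberships_for(x) for x in values]
        for i in range(n_clusters):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            if total > 0:
                centers[i] = sum(w * x for w, x in zip(weights, values)) / total
    return centers, [memberships_for(x) for x in values]
```

With two clusters, a pixel can be binarized by assigning it to the cluster with the larger membership, for example `label = 0 if row[0] >= row[1] else 1` for each membership row. Whether text corresponds to the darker or the brighter cluster depends on the signboard and has to be decided separately.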

Keywords: Text detection, edge profile, signboard image, fuzzy clustering.

Downloads: 2227
75 Stock Market Integration Measurement: Investigation of Malaysia and Singapore Stock Markets

Authors: B. K. Yeoh, Z. Arsad, C. W. Hooy

Abstract:

This paper tests the level of integration of the Malaysia and Singapore stock markets with the world market. The Kalman Filter (KF) methodology is applied to the International Capital Asset Pricing Model (ICAPM), and the pricing errors estimated within the ICAPM framework are used as a measure of market integration or segmentation. The advantage of the KF technique is that it allows for time-varying coefficients in estimating the ICAPM and hence is able to capture the varying degree of market integration. Empirical results show clear evidence of a varying degree of market integration for both Malaysia and Singapore. Furthermore, the results show that changes in the level of market integration coincide with certain economic events that have taken place. The findings provide evidence of the practicability of the KF technique for estimating stock market integration. In the comparison between the Malaysia and Singapore stock markets, the results show that the trends of the market integration indices look similar through time but the magnitude is notably different, with the Malaysia stock market showing a greater degree of market integration. Finally, the significant evidence of a varying degree of market integration shows that OLS is inappropriate for estimating the level of market integration.
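
The idea of a time-varying coefficient estimated by a Kalman filter can be sketched with a deliberately simplified model. This is not the ICAPM specification used in the paper: it assumes a single regressor, a coefficient that follows a random walk, and hypothetical noise variances `q` and `r` chosen for illustration.

```python
def kalman_time_varying_beta(x, y, q=0.01, r=0.01, beta0=0.0, p0=1.0):
    """Scalar Kalman filter for y_t = beta_t * x_t + e_t,
    with beta_t = beta_{t-1} + w_t (random-walk coefficient).

    q: variance of the state noise w_t, r: variance of the observation
    noise e_t. Both are illustrative values, not estimates.
    Returns (betas, innovations), one entry per observation.
    """
    beta, p = beta0, p0
    betas, innovations = [], []
    for xt, yt in zip(x, y):
        p = p + q                     # predict: uncertainty grows by q
        innovation = yt - xt * beta   # one-step-ahead prediction error
        s = xt * xt * p + r           # variance of the innovation
        gain = p * xt / s             # Kalman gain
        beta = beta + gain * innovation
        p = (1.0 - gain * xt) * p     # update the state variance
        betas.append(beta)
        innovations.append(innovation)
    return betas, innovations
```

Unlike an OLS fit, which returns a single coefficient for the whole sample, the filter returns one estimate per period, so a shift in the underlying relationship shows up as a change in the `betas` path.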

Keywords: ICAPM, Kalman filter, stock market integration.

Downloads: 2173
74 Security Architecture for At-Home Medical Care Using Sensor Network

Authors: S.S.Mohanavalli, Sheila Anand

Abstract:

This paper proposes a novel architecture for At-Home medical care which enables senior citizens, patients with chronic ailments and patients requiring post-operative care to be remotely monitored in the comfort of their homes. The architecture is implemented using sensors and wireless networking to transmit patient data to hospitals and health-care centers for monitoring by medical professionals. Patients are equipped with sensors that measure physiological parameters such as blood pressure and pulse rate, and a Wearable Data Acquisition Unit transmits the patient sensor data. Medical professionals can be alerted to any abnormal variations in these values for diagnosis and suitable treatment. Security threats and challenges inherent to wireless communication and sensor networks are discussed, and a security mechanism to ensure data confidentiality and source authentication is proposed. The symmetric key algorithm AES is used for encrypting the data, and a patent-free, two-pass block cipher mode, CCFB, is used for implementing semantic security.

Keywords: data confidentiality, integrity, remote monitoring, source authentication

Downloads: 1742
73 The Design of the HL7 RIM-based Sharing Components for Clinical Information Systems

Authors: Wei-Yi Yang, Li-Hui Lee, Hsiao-Li Gien, Hsing-Yi Chu, Yi-Ting Chou, Der-Ming Liou

Abstract:

The American Health Level Seven (HL7) Reference Information Model (RIM) consists of six backbone classes that have different specialized attributes. Furthermore, to enforce semantic expression, specific mandatory vocabulary domains have been defined to represent the content values of some attributes. Because workflows vary, most hospitals duplicate effort, spending a great deal of time and human cost on developing and modifying Clinical Information Systems (CIS). This study attempts to design and develop shared RIM-based components of the CIS for different business processes, so that the CIS contains data of a consistent format and type. Programmers can perform transactions with the RIM-based clinical repository through the shared RIM-based components, and the shared components can also be adopted when developing new functions of the CIS. These components not only satisfy physicians' needs in using a CIS but also reduce the time needed to develop new components of a system. All in all, this study provides a new viewpoint: integrating the data and functions with the business processes is an easy and flexible approach to building a new CIS.

Keywords: HL7, Reference Information Model (RIM), web service, process management.

Downloads: 1886
72 A New Fuzzy Decision Support Method for Analysis of Economic Factors of Turkey's Construction Industry

Authors: R. Tur, A. Yardımcı

Abstract:

Imperfect knowledge cannot always be avoided. Imperfections may take several forms: uncertainty, imprecision and incompleteness. Among the methods for managing imperfect knowledge are fuzzy set-based techniques. The choice of a method to process data is linked to the choice of knowledge representation, which can be numerical, symbolic, logical or semantic, and it depends on the nature of the problem to be solved, for example decision support, which is the subject of our study. Fuzzy Logic is used for its ability to manage imprecise knowledge, but it can also take advantage of the ability of neural networks to learn coefficients or functions. Such an association of methods is typical of so-called soft computing. In this study, a new method was used to manage the imprecision of collected knowledge related to the economic analysis of the construction industry in Turkey. Sudden changes in economic factors decrease the competitive strength of construction companies. A better evaluation of these changes in economic factors from the viewpoint of the construction industry will have a positive influence on the decisions of companies engaged in construction.

Keywords: Fuzzy logic, decision support systems, construction industry.

Downloads: 1636
71 Ontology Population via NLP Techniques in Risk Management

Authors: Jawad Makki, Anne-Marie Alquier, Violaine Prince

Abstract:

In this paper we propose an NLP-based method for Ontology Population from texts and apply it to semi-automatically instantiate a Generic Knowledge Base (Generic Domain Ontology) in the risk management domain. The approach is semi-automatic and relies on the intervention of a domain expert for validation. It is based on a set of Instance Recognition Rules derived from syntactic structures, and on the predicative power of verbs in the instantiation process. It is not domain dependent, since it relies heavily on linguistic knowledge. A description of an experiment performed on a part of the ontology of the PRIMA project (supported by the European Community) is given. A first validation of the method is done by populating this ontology with Chemical Fact Sheets from the Environmental Protection Agency. The results of this experiment complete the paper and support the hypothesis that relying on the predicative power of verbs in the instantiation process improves performance.

Keywords: Information Extraction, Instance Recognition Rules, Ontology Population, Risk Management, Semantic analysis.

Downloads: 1536
70 Feature-Based Summarizing and Ranking from Customer Reviews

Authors: Dim En Nyaung, Thin Lai Lai Thein

Abstract:

Due to the rapid growth of the Internet, web opinion sources emerge dynamically and are useful to both potential customers and product manufacturers for prediction and decision purposes. These sources are user-generated content written in natural language as unstructured free text. Opinion mining techniques have therefore become popular for automatically processing customer reviews to extract product features and the user opinions expressed about them. Since customer reviews may contain both opinionated and factual sentences, a supervised machine learning technique is applied for subjectivity classification to improve mining performance. In this paper, we dedicate our work to the task of opinion summarization. Product feature and opinion extraction is critical to opinion summarization, because its effectiveness significantly affects the identification of semantic relationships. The polarity and numeric score of all the features are determined using the Senti-WordNet lexicon. The problem of opinion summarization is how to relate the opinion words to a certain feature. A probabilistic supervised learning model improves the result and is more flexible and effective.

Keywords: Opinion Mining, Opinion Summarization, Sentiment Analysis, Text Mining.

Downloads: 2933
69 A Universal Model for Content-Based Image Retrieval

Authors: S. Nandagopalan, Dr. B. S. Adiga, N. Deepak

Abstract:

In this paper, a novel approach for generalized image retrieval based on semantic content is presented. It combines three feature extraction methods, namely color, texture, and the edge histogram descriptor. There is provision to add new features in the future for better retrieval efficiency. Any combination of these methods that is appropriate for the application can be used for retrieval. This is provided through the User Interface (UI) in the form of relevance feedback. The image properties in this work are analyzed using computer vision and image processing algorithms. For color, the histograms of the images are computed; for texture, co-occurrence matrix based entropy, energy, etc. are calculated; and for edge density, the Edge Histogram Descriptor (EHD) is computed. For the retrieval of images, a novel idea based on a greedy strategy is developed to reduce the computational complexity. The entire system was developed using AForge.Imaging (an open source product), MATLAB .NET Builder, C#, and Oracle 10g. The system was tested with the Coral Image database containing 1000 natural images and achieved better results.
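
The texture features mentioned in this abstract, co-occurrence matrix based entropy and energy, can be sketched as follows. The paper's system was built with C# and MATLAB, so this standard-library Python version is only an illustration. The pixel offset (one step to the right), the base-2 logarithm and the function name `cooccurrence_features` are assumptions.

```python
import math
from collections import Counter

def cooccurrence_features(image, dr=0, dc=1):
    """Energy and entropy of a gray-level co-occurrence matrix.

    image: list of rows of integer gray levels.
    (dr, dc): offset between the two pixels of each pair; the default
    pairs every pixel with its right-hand neighbour.
    Returns (energy, entropy) with entropy measured in bits.
    """
    counts = Counter()
    rows = len(image)
    for r in range(rows):
        for c in range(len(image[r])):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < len(image[r2]):
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    # Normalize pair counts into joint probabilities.
    probs = [n / total for n in counts.values()]
    energy = sum(p * p for p in probs)
    entropy = -sum(p * math.log2(p) for p in probs)
    return energy, entropy
```

A uniform region gives energy 1 and entropy 0, while a busy texture spreads the probability mass over many gray-level pairs, lowering the energy and raising the entropy. Values like these can be placed in a feature vector alongside the color histogram and the EHD.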

Keywords: Content Based Image Retrieval (CBIR), Co-occurrence matrix, Feature vector, Edge Histogram Descriptor (EHD), Greedy strategy.

Downloads: 2934
68 Fuzzy Mathematical Morphology approach in Image Processing

Authors: Yee Yee Htun, Dr. Khaing Khaing Aye

Abstract:

Morphological operators transform the original image into another image through interaction with another image of a certain shape and size, known as the structuring element. Mathematical morphology provides a systematic approach to analyzing the geometric characteristics of signals or images, and has been widely applied to many applications such as edge detection, object segmentation, noise suppression and so on. Fuzzy Mathematical Morphology aims to extend the binary morphological operators to grey-level images. To define the basic morphological operations such as fuzzy erosion, dilation, opening and closing, a general method based upon fuzzy implication and inclusion grade operators is introduced. The fuzzy morphological operations extend the ordinary morphological operations by using fuzzy sets, where the union operation is replaced by a maximum operation and the intersection operation is replaced by a minimum operation. This work consists of two parts. In the first part, fuzzy set theory, fuzzy mathematical morphology, which is based on fuzzy logic and fuzzy set theory, and fuzzy mathematical operations and their properties are studied in detail. In the second part, the application of fuzziness in mathematical morphology to practical work such as image processing is discussed with illustrative problems.
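
The replacement of union by maximum and intersection by minimum can be made concrete with a small sketch on 1-D signals. The abstract does not state which conjunction and implication operators it uses, so this example assumes the minimum t-norm for dilation and the Kleene-Dienes implication, max(1 - b, a), for erosion. Memberships are assumed to lie in [0, 1], and samples outside the signal are skipped.

```python
def fuzzy_dilate(signal, element):
    """Fuzzy dilation: D(i) = max over k of min(signal[i - k], element[k]).

    signal: list of membership values in [0, 1].
    element: dict mapping integer offsets to memberships in [0, 1].
    """
    n = len(signal)
    out = []
    for i in range(n):
        best = 0.0
        for k, b in element.items():
            j = i - k
            if 0 <= j < n:
                best = max(best, min(signal[j], b))
        out.append(best)
    return out

def fuzzy_erode(signal, element):
    """Fuzzy erosion: E(i) = min over k of max(signal[i + k], 1 - element[k])."""
    n = len(signal)
    out = []
    for i in range(n):
        worst = 1.0
        for k, b in element.items():
            j = i + k
            if 0 <= j < n:
                worst = min(worst, max(signal[j], 1.0 - b))
        out.append(worst)
    return out
```

With a crisp structuring element (all memberships equal to 1), these definitions reduce to the ordinary flat grey-level dilation and erosion, which is the sense in which the fuzzy operators extend the classical ones.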

Keywords: Binary Morphological, Fuzzy sets, Grayscale morphology, Image processing, Mathematical morphology.

Downloads: 3247
67 A Text Clustering System based on k-means Type Subspace Clustering and Ontology

Authors: Liping Jing, Michael K. Ng, Xinhua Yang, Joshua Zhexue Huang

Abstract:

This paper presents a text clustering system based on a k-means type subspace clustering algorithm for clustering large, high-dimensional and sparse text data. In this algorithm, a new step is added to the k-means clustering process to automatically calculate the weights of keywords in each cluster, so that the important words of a cluster can be identified by their weight values. To support understanding and interpretation of the clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to the weights calculated by our new algorithm. Then, the candidates are fed to WordNet to identify the set of nouns and to consolidate synonyms and hyponyms. Experimental results show that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting the words that represent the topics of the clusters.

Keywords: Subspace Clustering, Text Mining, Feature Weighting, Cluster Interpretation, Ontology

Downloads: 2462
66 A Web-Based Self-Learning Grammar for Spoken Language Understanding

Authors: S. M. Biondi, V. Catania, R. Di Natale, A. R. Intilisano, D. Panno

Abstract:

One of the major goals of Spoken Dialog Systems (SDS) is to understand what the user utters. In the SDS domain, the Spoken Language Understanding (SLU) module classifies user utterances by means of predefined conceptual knowledge. The SLU module is able to recognize only the meanings previously included in its knowledge base. Due to the vastness of that knowledge, storing the information is a very expensive process. Updating and managing the knowledge base are time-consuming and error-prone processes because of the rapidly growing number of entities such as proper nouns and domain-specific nouns. This paper proposes a solution to the problem of Named Entity Recognition (NER) applied to an SDS domain. The proposed solution attempts to automatically recognize the meaning associated with an utterance by using the PANKOW (Pattern based Annotation through Knowledge On the Web) method at runtime. The proposed method extracts information from the Web to extend the knowledge of the SLU module and reduces the development effort. In particular, the Google Search Engine is used to extract information from the Facebook social network.

Keywords: Spoken Dialog System, Spoken Language Understanding, Web Semantic, Named Entity Recognition.

Downloads: 1776
65 Using Mean-Shift Tracking Algorithms for Real-Time Tracking of Moving Images on an Autonomous Vehicle Testbed Platform

Authors: Benjamin Gorry, Zezhi Chen, Kevin Hammond, Andy Wallace, Greg Michaelson

Abstract:

This paper describes new computer vision algorithms that have been developed to track moving objects as part of a long-term study into the design of (semi-)autonomous vehicles. We present the results of a study to exploit variable kernels for tracking in video sequences. The basis of our work is the mean shift object-tracking algorithm; for a moving target, it is usual to define a rectangular target window in an initial frame, and then process the data within that window to separate the tracked object from the background using the mean shift segmentation algorithm. Rather than use the standard Epanechnikov kernel, we have used a kernel weighted by the Chamfer distance transform to improve the accuracy of target representation and localization, minimising the distance between the two distributions in RGB color space using the Bhattacharyya coefficient. Experimental results show the improved tracking capability and versatility of the algorithm in comparison with results using the standard kernel. These algorithms are incorporated as part of a robot test-bed architecture which has been used to demonstrate their effectiveness.
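
The baseline that this abstract compares against, a kernel-weighted color histogram scored with the Bhattacharyya coefficient, can be sketched as follows. The example uses the standard Epanechnikov profile, not the Chamfer-distance weighting proposed in the paper, and it assumes that each pixel's color has already been quantized to a bin index. All function names are illustrative.

```python
import math

def epanechnikov_profile(r2):
    """Epanechnikov kernel profile (up to a constant factor) as a function
    of the squared normalized distance r2; zero outside the unit disc."""
    return max(0.0, 1.0 - r2)

def weighted_histogram(pixels, center, bandwidth, n_bins):
    """Kernel-weighted, normalized histogram of a target window.

    pixels: iterable of (x, y, bin_index), where bin_index is the
    quantized color of the pixel (the quantization is assumed done).
    """
    hist = [0.0] * n_bins
    cx, cy = center
    for x, y, b in pixels:
        r2 = ((x - cx) ** 2 + (y - cy) ** 2) / bandwidth ** 2
        hist[b] += epanechnikov_profile(r2)
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two normalized histograms (1 = identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))
```

In mean shift tracking, the candidate window is moved so that the coefficient between the target model and the candidate histogram increases. Pixels near the window center receive larger weights, which makes the model less sensitive to background pixels at the border.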

Keywords: Hume, functional programming, autonomous vehicle, pioneer robot, vision.

Downloads: 1652
64 Ontology-based Concept Weighting for Text Documents

Authors: Hmway Hmway Tar, Thi Thi Soe Nyaunt

Abstract:

Document clustering has become an essential technology with the popularity of the Internet, which makes fast and high-quality document clustering techniques a core topic. Text clustering, or simply clustering, is about discovering semantically related groups in an unstructured collection of documents. Clustering has been very popular for a long time because it provides unique ways of digesting and generalizing large amounts of information. One of the issues in clustering is extracting the proper features (concepts) of a problem domain. Existing clustering technology mainly focuses on term weight calculation. To achieve more accurate document clustering, more informative features, including concept weight, are important. Feature selection is important for the clustering process because irrelevant or redundant features may distort the clustering results. To counteract this issue, the proposed system introduces concept weights into a text clustering system based on a k-means algorithm in accordance with the principles of ontology, so that the importance of the words of a cluster can be identified by their weight values. To a certain extent, it resolves the semantic problem in specific areas.

Keywords: Clustering, Concept Weight, Document clustering, Feature Selection, Ontology

Downloads 2406
63 Personal Knowledge Management: Systematic Review and Future Direction

Authors: Kuribachew Gizaw Tohiye, Monica Garfield

Abstract:

Personal knowledge management is the aspect of knowledge management that relates to the way in which individuals organize and manage their own set of knowledge. While there has been research in this area for the past 25 years, it is now necessary to reflect on what research has been done and what we have discovered about this area of knowledge management. In contrast to organizational knowledge management, which focuses on a firm’s profitability and competitiveness, personal knowledge management (PKM) is concerned with the person’s self-effectiveness, competence and success. People are interested in managing their knowledge in order to become more efficient in a variety of personal and organizational interests. This study presents a systematic review of PKM studies. Articles with PKM concepts are reviewed with the objective of clearly defining PKM, identifying the benefits of PKM, classifying the tools that enable PKM and finding the research gaps to indicate future research directions in the area. Consequently, we have developed a definition of PKM and identified the benefits of PKM, including an understanding of who seeks PKM and for what. Tools enabling PKM are identified and classified under three categories: Web 1.0, 2.0 and 3.0. Finally, the research gap and future directions are suggested. Research which facilitates collaboration by using semantic technologies is suggested for further study to improve PKM effectiveness.

Keywords: Knowledge management, organizational knowledge management, personal knowledge management, systematic review.

Downloads 2484
62 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as an active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, and parallel algorithms for image segmentation were employed. Using the feature vector database that has been efficiently constructed, one can perform various data mining tasks such as clustering and classification with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Coral image database, and the results show that their performance is better than previously reported results.

Keywords: Active Contour, Bayesian, Echocardiographic image, Feature vector.

Downloads 1714
61 Opinion Mining Framework in the Education Domain

Authors: A. M. H. Elyasir, K. S. M. Anbananthen

Abstract:

The internet is growing larger and becoming the most popular platform for people to share their opinions on different interests. We choose the education domain, specifically comparing some Malaysian universities against each other. This comparison produces a benchmark based on different criteria shared by online users in various online resources, including Twitter, Facebook and web pages. The comparison is accomplished using an opinion mining framework that extracts and processes the unstructured text and classifies the result as positive, negative or neutral (polarity). Hence, we divide our framework into three main stages: opinion collection (extraction), unstructured text processing and polarity classification. The extraction stage includes web crawling, HTML parsing, sentence segmentation for punctuation classification and Part of Speech (POS) tagging. The second stage processes the unstructured text with stemming and stop word removal and finally prepares the raw text for classification using Named Entity Recognition (NER). The last phase classifies the polarity and presents the overall result of the comparison among the Malaysian universities. The final result is useful for those who are interested in studying in Malaysia, as our final output declares clear winners based on public opinion across the web.

Keywords: Entity Recognition, Education Domain, Opinion Mining, Unstructured Text.

Downloads 2965
60 Humanoid Personalized Avatar Through Multiple Natural Language Processing

Authors: Jin Hou, Xia Wang, Fang Xu, Viet Dung Nguyen, Ling Wu

Abstract:

There has been a growing interest in implementing humanoid avatars in networked virtual environments. However, most existing avatar communication systems do not take avatars' social backgrounds into consideration. This paper proposes a novel humanoid avatar animation system to represent personalities and facial emotions of avatars based on culture, profession, mood, age, taste, and so forth. We extract semantic keywords from the input text through natural language processing, and then the animations of personalized avatars are retrieved and displayed according to the order of the keywords. Our primary work is focused on giving avatars runtime instruction from multiple natural languages. Experiments with Chinese, Japanese and English input based on the prototype show that interactive avatar animations can be displayed in real time and be made available online. This system provides a more natural and interesting means of human communication, and is therefore expected to be used for cross-cultural communication, multiuser online games, and other entertainment applications.

Keywords: personalized avatar, multiple natural language processing, social backgrounds, animation, human computer interaction

Downloads 1972
59 Automatic Music Score Recognition System Using Digital Image Processing

Authors: Yuan-Hsiang Chang, Zhong-Xian Peng, Li-Der Jeng

Abstract:

Music has always been an integral part of people’s daily lives. For most people, however, reading a musical score and turning it into melody is not easy. This study aims to develop an automatic music score recognition system using digital image processing, which can be used to read and analyze musical score images automatically. The technical approaches included: (1) staff region segmentation; (2) image preprocessing; (3) note recognition; and (4) accidental and rest recognition. Digital image processing techniques (e.g., horizontal/vertical projections, connected component labeling, morphological processing, template matching, etc.) were applied according to musical notes, accidentals, and rests in staff notations. Preliminary results showed that our system could achieve detection and recognition rates of 96.3% and 91.7%, respectively. In conclusion, we presented an effective automated musical score recognition system that could be integrated in a system with a media player to play music/songs given input images of musical score. Ultimately, this system could also be incorporated in applications for mobile devices as a learning tool, such that a music player could learn to play music/songs.
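As an illustration of the horizontal projection technique mentioned in the abstract, the following minimal sketch locates candidate staff lines in a binarised score image. It assumes the image is a list of rows with 1 for ink and 0 for background; the function names and the `min_fill` threshold of 0.8 are hypothetical choices, not values reported by the authors.

```python
def horizontal_projection(binary_image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in binary_image]

def staff_line_rows(binary_image, min_fill=0.8):
    """Return indices of rows whose ink count reaches min_fill of the image width.

    min_fill is an assumed threshold; staff lines span almost the full width,
    while notes and other symbols only fill a small part of a row.
    """
    width = len(binary_image[0])
    profile = horizontal_projection(binary_image)
    return [y for y, count in enumerate(profile) if count >= min_fill * width]

# Toy 5x6 image: two full-width lines with a short symbol between them.
image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
lines = staff_line_rows(image)
```

On real scans the rows found this way would typically be grouped into sets of five to delimit each staff region before note recognition.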

Keywords: Connected component labeling, image processing, morphological processing, optical musical recognition.

Downloads 1931
58 Quad Tree Decomposition Based Analysis of Compressed Image Data Communication for Lossy and Lossless Using WSN

Authors: N. Muthukumaran, R. Ravi

Abstract:

A Quad Tree Decomposition (QTD) based performance analysis of lossy and lossless compressed image data communication through a wireless sensor network is presented. Images have considerably higher storage requirements than text. While transmitting multimedia content, there is a chance of packets being dropped due to noise and interference. At the receiver end, the packets that carry valuable information might be damaged or lost due to noise, interference and congestion. To prevent valuable information from being dropped, various retransmission schemes have been proposed. In the proposed scheme, QTD is used. QTD is an image segmentation method that divides the image into homogeneous areas. The proposed scheme involves analysis of parameters such as compression ratio, peak signal to noise ratio, mean square error and bits per pixel in the compressed image, and analysis of difficulties during data packet communication in wireless sensor networks. Considering the above, this paper uses QTD to improve the compression ratio as well as visual quality, with the algorithm implemented in MATLAB 7.1 and the NS2 simulator software tool.
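The idea of dividing an image into homogeneous areas can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation: it assumes a square greyscale image whose side is a power of two, and it treats a block as homogeneous when the difference between its largest and smallest intensity does not exceed a hypothetical `threshold`.

```python
def quadtree(image, x, y, size, threshold=10, min_size=1):
    """Recursively split a square block until it is homogeneous.

    A block counts as homogeneous when max - min intensity <= threshold
    (an assumed criterion). Returns a list of (x, y, size) leaf blocks.
    """
    values = [image[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size <= min_size or max(values) - min(values) <= threshold:
        return [(x, y, size)]
    half = size // 2
    blocks = []
    for dy in (0, half):          # top half, then bottom half
        for dx in (0, half):      # left quadrant, then right quadrant
            blocks.extend(quadtree(image, x + dx, y + dy, half, threshold, min_size))
    return blocks

# Toy 4x4 image: three uniform quadrants and one detailed quadrant.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 10, 90],
    [10, 10, 50, 10],
]
blocks = quadtree(image, 0, 0, 4)
```

Uniform regions end up as a few large blocks while detailed regions are split down to single pixels, which is what makes the decomposition useful as a basis for compression.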

Keywords: Image compression, Compression Ratio, Quad tree decomposition, Wireless sensor networks, NS2 simulator.

Downloads 2391
57 Automatic Reusability Appraisal of Software Components using Neuro-fuzzy Approach

Authors: Parvinder S. Sandhu, Hardeep Singh

Abstract:

Automatic reusability appraisal could be helpful in evaluating the quality of developed or developing reusable software components and in identifying reusable components from existing legacy systems, which can save the cost of developing the software from scratch. But the issue of how to identify reusable components from existing systems has remained relatively unexplored. In this paper, we present a two-tier approach that studies the structural attributes as well as the usability or relevancy of the component to a particular domain. Latent semantic analysis is used for the feature vector representation of various software domains. It exploits the fact that FeatureVector codes can be seen as documents containing terms (the identifiers present in the components), so text modeling methods that capture co-occurrence information in low-dimensional spaces can be used. Further, we devised a Neuro-Fuzzy hybrid inference system, which takes structural metric values as input and calculates the reusability of the software component. A decision tree algorithm is used to decide the initial set of fuzzy rules for the Neuro-fuzzy system. The results obtained are convincing enough to propose the system for economical identification and retrieval of reusable software components.

Keywords: Clustering, ID3, LSA, Neuro-fuzzy System, SVD

Downloads 1662
56 A Robust Diverged Localization and Recognition of License Registration Characters

Authors: M. Sankari, R. Bremananth, C. Meena

Abstract:

Localization and recognition of license registration characters from a moving vehicle is a computationally complex task in the field of machine vision and is of substantial interest because of its diverse applications, such as cross-border security, law enforcement and various other intelligent transportation applications. Previous research used plate-specific details such as aspect ratio, character style, color or dimensions of the plate in the complex task of plate localization. In this paper, the license registration characters are localized by the Enhanced Weight based density map (EWBDM) method, which is independent of such constraints. Building on our previous method, this paper proposes a method that relaxes constraints on lighting conditions, on the different character fonts occurring on the plate, and on plates with hand-drawn characters in various aspect quotients. The robustness of this method makes it well suited for applications where the appearance of plates varies widely. Experimental results show that this approach is suited for recognizing license plates in different external environments.

Keywords: Character segmentation, Connectivity checking, Edge detection, Image analysis, license plate localization, license number recognition, Quality frame selection

Downloads 1897
55 An Ontology Based Question Answering System on Software Test Document Domain

Authors: Meltem Serhatli, Ferda N. Alpaslan

Abstract:

Processing data by computers and performing reasoning tasks is an important aim in Computer Science. The Semantic Web is one step towards it. The use of ontologies to enhance information semantically is the current trend. A huge amount of domain-specific, unstructured on-line data needs to be expressed in a machine-understandable and semantically searchable format. Currently, users are often forced to search manually in the results returned by keyword-based search services. They also want to use their native languages to express what they are searching for. In this paper, an ontology-based automated question answering system for the software test document domain is presented. The system allows users to enter a question about the domain in natural language and returns the exact answer to the question. Conversion of the natural language question into the ontology-based query is the challenging part of the system. To achieve this, a new algorithm for converting free text to an ontology-based search engine query is proposed. The algorithm is based on investigating the suitable question type and parsing the words of the question sentence.

Keywords: Description Logics, ontology, question answering, reasoning.

Downloads 2149
54 The Role of Contextual Ontologies in Enterprise Modeling

Authors: Ahmed Arara

Abstract:

Information sharing and exchange, rather than information processing, is what characterizes information technology in the 21st century. Ontologies, as shared common understanding, gain increasing attention, as they appear to be the most promising solution to enable information sharing both at a semantic level and in a machine-processable way. Domain ontology-based modeling has been exploited to provide shareability and information exchange among diversified, heterogeneous applications of enterprises. Contextual ontologies are “an explicit specification of contextual conceptualization”. That is, an ontology is characterized by concepts that have multiple representations and may exist in several contexts. Hence, contextual ontologies are a set of concepts and relationships which are seen from different perspectives. Contextualization allows ontologies to be partitioned according to their contexts. The need for contextual ontologies in enterprise modeling has become crucial due to the nature of today's competitive market. Information resources in an enterprise are distributed and diversified and need to be shared and communicated locally through the intranet and globally through the internet. This paper discusses the roles that ontologies play in enterprise modeling, and how ontologies assist in building a conceptual model in order to provide communicative and interoperable information systems. The issue of enterprise modeling based on contextual domain ontology is also investigated, and a framework is proposed for an enterprise model that consists of various applications.

Keywords: Contextual ontologies, Enterprise model, domain ontology.

Downloads 1842