Search results for: Semantic textual similarity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 692

Search results for: Semantic textual similarity

512 Genetic Diversity Based Population Study of Freshwater Mud Eel (Monopterus cuchia) in Bangladesh

Authors: M. F. Miah, K. M. A. Zinnah, M. J. Raihan, H. Ali, M. N. Naser

Abstract:

As genetic diversity is most important for existing, breeding and production of any fish; this study was undertaken for investigating genetic diversity of freshwater mud eel, Monopterus cuchia at population level where three ecological populations such as flooded area of Sylhet (P1), open water of Moulvibazar (P2) and open water of Sunamganj (P3) districts of Bangladesh were considered. Four arbitrary RAPD primers (OPB-12, C0-4, B-03 and OPB-08) were screened and RAPD banding patterns were analyzed among the populations considering 15 individuals of each population. In total 174, 138 and 149 bands were detected in the populations of P1, P2 and P3 respectively; however, each primer revealed less number of bands in each population. 100% polymorphic loci were recorded in P2 and P3 whereas only one monomorphic locus was observed in P1, recorded 97.5% polymorphism. Different genetic parameters such as inter-individual pairwise similarity, genetic distance, Nei genetic similarity, linkage distances, cluster analysis and allelic information, etc. were considered for measuring genetic diversity. The average inter-individual pairwise similarity was recorded 2.98, 1.47 and 1.35 in P1, P2 and P3 respectively. Considering genetic distance analysis, the highest distance 1 was recorded in P2 and P3 and the lowest genetic distance 0.444 was found in P2. The average Nei genetic similarity was observed 0.19, 0.16 and 0.13 in P1, P2 and P3, respectively; however, the average linkage distance was recorded 24.92, 17.14 and 15.28 in P1, P3 and P2 respectively. Based on linkage distance, genetic clusters were generated in three populations where 6 clades and 7 clusters were found in P1, 3 clades and 5 clusters were observed in P2 and 4 clades and 7 clusters were detected in P3. In addition, allelic information was observed where the frequency of p and q alleles were observed 0.093 and 0.907 in P1, 0.076 and 0.924 in P2, 0.074 and 0.926 in P3 respectively. The average gene diversity was observed highest in P2 (0.132) followed by P3 (0.131) and P1 (0.121) respectively.

Keywords: Genetic diversity, Monopterus cuchia, population, RAPD, Bangladesh.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1811
511 A Review on Important Aspects of Information Retrieval

Authors: Yogesh Gupta, Ashish Saini, A.K. Saxena

Abstract:

Information retrieval has become an important field of study and research under computer science due to explosive growth of information available in the form of full text, hypertext, administrative text, directory, numeric or bibliographic text. The research work is going on various aspects of information retrieval systems so as to improve its efficiency and reliability. This paper presents a comprehensive study, which discusses not only emergence and evolution of information retrieval but also includes different information retrieval models and some important aspects such as document representation, similarity measure and query expansion.

Keywords: Information Retrieval, query expansion, similarity measure, query expansion, vector space model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3322
510 Bottom Up Text Mining through Hierarchical Document Representation

Authors: Y. Djouadi., F. Souam.

Abstract:

Most of the existing text mining approaches are proposed, keeping in mind, transaction databases model. Thus, the mined dataset is structured using just one concept: the “transaction", whereas the whole dataset is modeled using the “set" abstract type. In such cases, the structure of the whole dataset and the relationships among the transactions themselves are not modeled and consequently, not considered in the mining process. We believe that taking into account structure properties of hierarchically structured information (e.g. textual document, etc ...) in the mining process, can leads to best results. For this purpose, an hierarchical associations rule mining approach for textual documents is proposed in this paper and the classical set-oriented mining approach is reconsidered profits to a Direct Acyclic Graph (DAG) oriented approach. Natural languages processing techniques are used in order to obtain the DAG structure. Based on this graph model, an hierarchical bottom up algorithm is proposed. The main idea is that each node is mined with its parent node.

Keywords: Graph based association rules mining, Hierarchical document structure, Text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2048
509 Applying Similarity Theory and Hilbert Huang Transform for Estimating the Differences of Pig-s Blood Pressure Signals between Situations of Intestinal Artery Blocking and Unblocking

Authors: Jia-Rong Yeh, Tzu-Yu Lin, Jiann-Shing Shieh, Yun Chen

Abstract:

A mammal-s body can be seen as a blood vessel with complex tunnels. When heart pumps blood periodically, blood runs through blood vessels and rebounds from walls of blood vessels. Blood pressure signals can be measured with complex but periodic patterns. When an artery is clamped during a surgical operation, the spectrum of blood pressure signals will be different from that of normal situation. In this investigation, intestinal artery clamping operations were conducted to a pig for simulating the situation of intestinal blocking during a surgical operation. Similarity theory is a convenient and easy tool to prove that patterns of blood pressure signals of intestinal artery blocking and unblocking are surely different. And, the algorithm of Hilbert Huang Transform can be applied to extract the character parameters of blood pressure pattern. In conclusion, the patterns of blood pressure signals of two different situations, intestinal artery blocking and unblocking, can be distinguished by these character parameters defined in this paper.

Keywords: Blood pressure, spectrum, intestinal artery, similarity theory and Hilbert Huang Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1613
508 Similarity Based Retrieval in Case Based Reasoning for Analysis of Medical Images

Authors: M. Das Gupta, S. Banerjee

Abstract:

Content Based Image Retrieval (CBIR) coupled with Case Based Reasoning (CBR) is a paradigm that is becoming increasingly popular in the diagnosis and therapy planning of medical ailments utilizing the digital content of medical images. This paper presents a survey of some of the promising approaches used in the detection of abnormalities in retina images as well in mammographic screening and detection of regions of interest in MRI scans of the brain. We also describe our proposed algorithm to detect hard exudates in fundus images of the retina of Diabetic Retinopathy patients.

Keywords: Case based reasoning, Exudates, Retina image, Similarity based retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2111
507 Information Retrieval in the Semantic LIFE Personal Digital Memory Framework

Authors: Hanh Huu Hoang, Tho Manh Nguyen

Abstract:

Ever increasing capacities of contemporary storage devices inspire the vision to accumulate (personal) information without the need of deleting old data over a long time-span. Hence the target of SemanticLIFE project is to create a Personal Information Management system for a human lifetime data. One of the most important characteristics of the system is its dedication to retrieve information in a very efficient way. By adopting user demands regarding the reduction of ambiguities, our approach aims at a user-oriented and yet powerful enough system with a satisfactory query performance. We introduce the query system of SemanticLIFE, the Virtual Query System, which uses emerging Semantic Web technologies to fulfill users- requirements.

Keywords: Ontology-based Information Retrieval, Digital Memories, SemanticLIFE.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1331
506 Flow Behavior and Performances of Centrifugal Compressor Stage Vaneless Diffusers

Authors: Y. Galerkin, O. Solovieva

Abstract:

Parameters of flow are calculated in vaneless diffusers with relative width 0,014–0,10. Inlet angles of flow and similarity criteria were varied. There is information on flow separation, boundary layer development, configuration of streamlines. Polytrophic efficiency, loss coefficient and recovery coefficient are used to compare effectiveness of diffusers. The sample of optimization of narrow diffuser with conical walls is presented. Three wide diffusers with narrowing walls are compared. The work is made in the R&D laboratory “Gas dynamics of turbo machines” of the TU SPb.

Keywords: Vaneless diffuser, relative width, flow angle, flow separation, loss coefficient, similarity criteria.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2255
505 Comparative Studies on Vertical Stratification,Floristic Composition, and Woody Species Diversity of Subtropical Evergreen Broadleaf Forests Between the Ryukyu Archipelago, Japan, and South China

Authors: M. Wu, S. M. Feroz, A. Hagihara, L. Xue, Z. L. Huang

Abstract:

In order to compare vertical stratification, floristic composition, and woody species diversity of subtropical evergreen broadleaf forests between the Ryukyu Archipelago, Japan, and South China, tree censuses in a 400 m2 plot in Ishigaki Island and a 1225 m2 plot in Dinghushan Nature Reserve were performed. Both of the subtropical forests consisted of five vertical strata. The floristic composition of the Ishigaki forest was quite different from that of the Dinghushan forest in terms of similarity on a species level (Kuno-s similarity index r0 = 0.05). The values of Shannon-s index H' and Pielou-s index J ' tended to increase from the bottom stratum upward in both forests, except H' for the top stratum in the Ishigaki forest and the upper two strata in the Dinghushan forest. The woody species diversity in the Dinghushan forest (H'= 3.01 bit) was much lower than that in the Ishigaki forest (H'= 4.36 bit).

Keywords: Floristic similarity, subtropical evergreen broadleaf forest, vertical stratification, woody species diversity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1652
504 Designing a Patient Monitoring System Using Cloud and Semantic Web Technologies

Authors: Chryssa Thermolia, Ekaterini S. Bei, Stelios Sotiriadis, Kostas Stravoskoufos, Euripides G.M. Petrakis

Abstract:

Moving into a new era of healthcare, new tools and devices are developed to extend and improve health services, such as remote patient monitoring and risk prevention. In this concept, Internet of Things (IoT) and Cloud Computing present great advantages by providing remote and efficient services, as well as cooperation between patients, clinicians, researchers and other health professionals. This paper focuses on patients suffering from bipolar disorder, a brain disorder that belongs to a group of conditions called affective disorders, which is characterized by great mood swings. We exploit the advantages of Semantic Web and Cloud Technologies to develop a patient monitoring system to support clinicians. Based on intelligently filtering of evidence-knowledge and individual-specific information we aim to provide treatment notifications and recommended function tests at appropriate times or concluding into alerts for serious mood changes and patient’s nonresponse to treatment. We propose an architecture as the back-end part of a cloud platform for IoT, intertwining intelligence devices with patients’ daily routine and clinicians’ support.

Keywords: Bipolar disorder, intelligent systems patient monitoring, semantic web technologies, IoT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2426
503 Semantic Enhanced Social Media Sentiments for Stock Market Prediction

Authors: K. Nirmala Devi, V. Murali Bhaskaran

Abstract:

Traditional document representation for classification follows Bag of Words (BoW) approach to represent the term weights. The conventional method uses the Vector Space Model (VSM) to exploit the statistical information of terms in the documents and they fail to address the semantic information as well as order of the terms present in the documents. Although, the phrase based approach follows the order of the terms present in the documents rather than semantics behind the word. Therefore, a semantic concept based approach is used in this paper for enhancing the semantics by incorporating the ontology information. In this paper a novel method is proposed to forecast the intraday stock market price directional movement based on the sentiments from Twitter and money control news articles. The stock market forecasting is a very difficult and highly complicated task because it is affected by many factors such as economic conditions, political events and investor’s sentiment etc. The stock market series are generally dynamic, nonparametric, noisy and chaotic by nature. The sentiment analysis along with wisdom of crowds can automatically compute the collective intelligence of future performance in many areas like stock market, box office sales and election outcomes. The proposed method utilizes collective sentiments for stock market to predict the stock price directional movements. The collective sentiments in the above social media have powerful prediction on the stock price directional movements as up/down by using Granger Causality test.

Keywords: Bag of Words, Collective Sentiments, Ontology, Semantic relations, Sentiments, Social media, Stock Prediction, Twitter, Vector Space Model and wisdom of crowds.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2790
502 A Sentence-to-Sentence Relation Network for Recognizing Textual Entailment

Authors: Isaac K. E. Ampomah, Seong-Bae Park, Sang-Jo Lee

Abstract:

Over the past decade, there have been promising developments in Natural Language Processing (NLP) with several investigations of approaches focusing on Recognizing Textual Entailment (RTE). These models include models based on lexical similarities, models based on formal reasoning, and most recently deep neural models. In this paper, we present a sentence encoding model that exploits the sentence-to-sentence relation information for RTE. In terms of sentence modeling, Convolutional neural network (CNN) and recurrent neural networks (RNNs) adopt different approaches. RNNs are known to be well suited for sequence modeling, whilst CNN is suited for the extraction of n-gram features through the filters and can learn ranges of relations via the pooling mechanism. We combine the strength of RNN and CNN as stated above to present a unified model for the RTE task. Our model basically combines relation vectors computed from the phrasal representation of each sentence and final encoded sentence representations. Firstly, we pass each sentence through a convolutional layer to extract a sequence of higher-level phrase representation for each sentence from which the first relation vector is computed. Secondly, the phrasal representation of each sentence from the convolutional layer is fed into a Bidirectional Long Short Term Memory (Bi-LSTM) to obtain the final sentence representations from which a second relation vector is computed. The relations vectors are combined and then used in then used in the same fashion as attention mechanism over the Bi-LSTM outputs to yield the final sentence representations for the classification. Experiment on the Stanford Natural Language Inference (SNLI) corpus suggests that this is a promising technique for RTE.

Keywords: Deep neural models, natural language inference, recognizing textual entailment, sentence-to-sentence relation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1441
501 Using Suffix Tree Document Representation in Hierarchical Agglomerative Clustering

Authors: Daniel I. Morariu, Radu G. Cretulescu, Lucian N. Vintan

Abstract:

In text categorization problem the most used method for documents representation is based on words frequency vectors called VSM (Vector Space Model). This representation is based only on words from documents and in this case loses any “word context" information found in the document. In this article we make a comparison between the classical method of document representation and a method called Suffix Tree Document Model (STDM) that is based on representing documents in the Suffix Tree format. For the STDM model we proposed a new approach for documents representation and a new formula for computing the similarity between two documents. Thus we propose to build the suffix tree only for any two documents at a time. This approach is faster, it has lower memory consumption and use entire document representation without using methods for disposing nodes. Also for this method is proposed a formula for computing the similarity between documents, which improves substantially the clustering quality. This representation method was validated using HAC - Hierarchical Agglomerative Clustering. In this context we experiment also the stemming influence in the document preprocessing step and highlight the difference between similarity or dissimilarity measures to find “closer" documents.

Keywords: Text Clustering, Suffix tree documentrepresentation, Hierarchical Agglomerative Clustering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1900
500 Generic Multimedia Database Architecture

Authors: Mohib ur Rehman, Imran Ihsan, Mobin Uddin Ahmed, Nadeem Iftikhar, Muhammad Abdul Qadir

Abstract:

Multimedia, as it stands now is perhaps the most diverse and rich culture around the globe. One of the major needs of Multimedia is to have a single system that enables people to efficiently search through their multimedia catalogues. Many Domain Specific Systems and architectures have been proposed but up till now no generic and complete architecture is proposed. In this paper, we have suggested a generic architecture for Multimedia Database. The main strengths of our architecture besides being generic are Semantic Libraries to reduce semantic gap, levels of feature extraction for more specific and detailed feature extraction according to classes defined by prior level, and merging of two types of queries i.e. text and QBE (Query by Example) for more accurate yet detailed results.

Keywords: Multimedia Database Architecture, Semantics, Feature Extraction, Ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1772
499 Analysis of Diverse Clustering Tools in Data Mining

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Clustering in data mining is an unsupervised learning technique of aggregating the data objects into meaningful groups such that the intra cluster similarity of objects are maximized and inter cluster similarity of objects are minimized. Over the past decades several clustering tools were emerged in which clustering algorithms are inbuilt and are easier to use and extract the expected results. Data mining mainly deals with the huge databases that inflicts on cluster analysis and additional rigorous computational constraints. These challenges pave the way for the emergence of powerful expansive data mining clustering softwares. In this survey, a variety of clustering tools used in data mining are elucidated along with the pros and cons of each software.

Keywords: Cluster Analysis, Clustering Algorithms, Clustering Techniques, Association, Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2190
498 Unsteady Reversed Stagnation-Point Flow over a Flat Plate

Authors: Vai Kuong Sin, Chon Kit Chio

Abstract:

This paper investigates the nature of the development of two-dimensional laminar flow of an incompressible fluid at the reversed stagnation-point. ". In this study, we revisit the problem of reversed stagnation-point flow over a flat plate. Proudman and Johnson (1962) first studied the flow and obtained an asymptotic solution by neglecting the viscous terms. This is no true in neglecting the viscous terms within the total flow field. In particular it is pointed out that for a plate impulsively accelerated from rest to a constant velocity V0 that a similarity solution to the self-similar ODE is obtained which is noteworthy completely analytical.

Keywords: reversed stagnation-point flow, similarity solutions, analytical solution, numerical solution

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1443
497 Isolation and Identification of Diacylglycerol Acyltransferase Type- 2 (GAT2) Genes from Three Egyptian Olive Cultivars

Authors: Yahia I. Mohamed, Ahmed I. Marzouk, Mohamed A. Yacout

Abstract:

Aim of this work was to study the genetic basis for oil accumulation in olive fruit via tracking DGAT2 (Diacylglycerol acyltransferase type-2) gene in three Egyptian Origen Olive cultivars namely Toffahi, Hamed and Maraki using molecular marker techniques and bioinformatics tools. Results illustrate that, firstly: specific genomic band of Maraki cultivars was identified as DGAT2 (Diacylglycerol acyltransferase type-2) and identical for this gene in Olea europaea with 100% of similarity. Secondly, differential genomic band of Maraki cultivars which produced from RAPD fingerprinting technique reflected predicted distinguished sequence which identified as DGAT2 (Diacylglycerol acyltransferase type-2) in Fragaria vesca subsp. Vesca with 76% of sequential similarity. Third and finally, specific genomic specific band of Hamed cultivars was identified as two fragments, 1- Olea europaea cultivar Koroneiki diacylglycerol acyltransferase type 2 mRNA, complete cds with two matches regions with 99% or 2- Predicted: Fragaria vesca subsp. vesca diacylglycerol O-acyltransferase 2-like (LOC101313050), mRNA with 86 % of similarity.

Keywords: Olea europaea, fingerprinting, Diacylglycerol acyltransferase type- 2 (DGAT2).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2404
496 Powerful Tool to Expand Business Intelligence: Text Mining

Authors: Li Gao, Elizabeth Chang, Song Han

Abstract:

With the extensive inclusion of document, especially text, in the business systems, data mining does not cover the full scope of Business Intelligence. Data mining cannot deliver its impact on extracting useful details from the large collection of unstructured and semi-structured written materials based on natural languages. The most pressing issue is to draw the potential business intelligence from text. In order to gain competitive advantages for the business, it is necessary to develop the new powerful tool, text mining, to expand the scope of business intelligence. In this paper, we will work out the strong points of text mining in extracting business intelligence from huge amount of textual information sources within business systems. We will apply text mining to each stage of Business Intelligence systems to prove that text mining is the powerful tool to expand the scope of BI. After reviewing basic definitions and some related technologies, we will discuss the relationship and the benefits of these to text mining. Some examples and applications of text mining will also be given. The motivation behind is to develop new approach to effective and efficient textual information analysis. Thus we can expand the scope of Business Intelligence using the powerful tool, text mining.

Keywords: Business intelligence, document warehouse, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2644
495 Incremental Algorithm to Cluster the Categorical Data with Frequency Based Similarity Measure

Authors: S.Aranganayagi, K.Thangavel

Abstract:

Clustering categorical data is more complicated than the numerical clustering because of its special properties. Scalability and memory constraint is the challenging problem in clustering large data set. This paper presents an incremental algorithm to cluster the categorical data. Frequencies of attribute values contribute much in clustering similar categorical objects. In this paper we propose new similarity measures based on the frequencies of attribute values and its cardinalities. The proposed measures and the algorithm are experimented with the data sets from UCI data repository. Results prove that the proposed method generates better clusters than the existing one.

Keywords: Clustering, Categorical, Incremental, Frequency, Domain

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1811
494 A Systems Approach to Gene Ranking from DNA Microarray Data of Cervical Cancer

Authors: Frank Emmert Streib, Matthias Dehmer, Jing Liu, Max Mühlhauser

Abstract:

In this paper we present a method for gene ranking from DNA microarray data. More precisely, we calculate the correlation networks, which are unweighted and undirected graphs, from microarray data of cervical cancer whereas each network represents a tissue of a certain tumor stage and each node in the network represents a gene. From these networks we extract one tree for each gene by a local decomposition of the correlation network. The interpretation of a tree is that it represents the n-nearest neighbor genes on the n-th level of a tree, measured by the Dijkstra distance, and, hence, gives the local embedding of a gene within the correlation network. For the obtained trees we measure the pairwise similarity between trees rooted by the same gene from normal to cancerous tissues. This evaluates the modification of the tree topology due to progression of the tumor. Finally, we rank the obtained similarity values from all tissue comparisons and select the top ranked genes. For these genes the local neighborhood in the correlation networks changes most between normal and cancerous tissues. As a result we find that the top ranked genes are candidates suspected to be involved in tumor growth and, hence, indicates that our method captures essential information from the underlying DNA microarray data of cervical cancer.

Keywords: Graph similarity, DNA microarray data, cancer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1745
493 Wasp Venom Peptides may play a role in the Pathogenesis of Acute Disseminated Encephalomyelitis in Humans: A Structural Similarity Analysis

Authors: Permphan Dharmasaroja

Abstract:

Acute disseminated encephalomyelitis (ADEM) has been reported to develop after a hymenoptera sting, but its pathogenesis is not known in detail. Myelin basic protein (MBP)- specific T cells have been detected in the blood of patients with ADEM, and a proportion of these patients develop multiple sclerosis (MS). In an attempt to understand the mechanisms underlying ADEM, molecular mimicry between hymenoptera venom peptides and the human immunodominant MBP peptide was scrutinized, based on the sequence and structural similarities, whether it was the root of the disease. The results suggest that the three wasp venom peptides have low sequence homology with the human immunodominant MBP residues 85-99. Structural similarity analysis among the three venom peptides and the MS-related HLA-DR2b (DRA, DRB1*1501)-associated immunodominant MHC binding/TCR contact residues 88-93, VVHFFK showed that hyaluronidase residues 7-12, phospholipase A1 residues 98-103, and antigen 5 residues 109-114 showed a high degree of similarity 83.3%, 100%, and 83.3% respectively. In conclusion, some wasp venom peptides, particularly phospholipase A1, may potentially act as the molecular motifs of the human 3HLA-DR2b-associated immunodominant MBP88-93, and possibly present a mechanism for induction of wasp sting-associated ADEM.

Keywords: central nervous system, Hymenoptera, myelin basicprotein, molecular mimicry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1609
492 An Optimization Algorithm Based on Dynamic Schema with Dissimilarities and Similarities of Chromosomes

Authors: Radhwan Yousif Sedik Al-Jawadi

Abstract:

Optimization is necessary for finding appropriate solutions to a range of real-life problems. In particular, genetic (or more generally, evolutionary) algorithms have proved very useful in solving many problems for which analytical solutions are not available. In this paper, we present an optimization algorithm called Dynamic Schema with Dissimilarity and Similarity of Chromosomes (DSDSC) which is a variant of the classical genetic algorithm. This approach constructs new chromosomes from a schema and pairs of existing ones by exploring their dissimilarities and similarities. To show the effectiveness of the algorithm, it is tested and compared with the classical GA, on 15 two-dimensional optimization problems taken from literature. We have found that, in most cases, our method is better than the classical genetic algorithm.

Keywords: Genetic algorithm, similarity and dissimilarity, chromosome injection, dynamic schema.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1288
491 A Content Vector Model for Text Classification

Authors: Eric Jiang

Abstract:

As a popular rank-reduced vector space approach, Latent Semantic Indexing (LSI) has been used in information retrieval and other applications. In this paper, an LSI-based content vector model for text classification is presented, which constructs multiple augmented category LSI spaces and classifies text by their content. The model integrates the class discriminative information from the training data and is equipped with several pertinent feature selection and text classification algorithms. The proposed classifier has been applied to email classification and its experiments on a benchmark spam testing corpus (PU1) have shown that the approach represents a competitive alternative to other email classifiers based on the well-known SVM and naïve Bayes algorithms.

Keywords: Feature Selection, Latent Semantic Indexing, Text Classification, Vector Space Model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1871
490 A Similarity Metric for Assessment of Image Fusion Algorithms

Authors: Nedeljko Cvejic, Artur Łoza, David Bull, Nishan Canagarajah

Abstract:

In this paper, we present a novel objective nonreference performance assessment algorithm for image fusion. It takes into account local measurements to estimate how well the important information in the source images is represented by the fused image. The metric is based on the Universal Image Quality Index and uses the similarity between blocks of pixels in the input images and the fused image as the weighting factors for the metrics. Experimental results confirm that the values of the proposed metrics correlate well with the subjective quality of the fused images, giving a significant improvement over standard measures based on mean squared error and mutual information.

Keywords: Fusion performance measures, image fusion, nonreferencequality measures, objective quality measures.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2469
489 Generation of Sets of Synthetic Classifiers for the Evaluation of Abstract-Level Combination Methods

Authors: N. Greco, S. Impedovo, R.Modugno, G. Pirlo

Abstract:

This paper presents a new technique for generating sets of synthetic classifiers to evaluate abstract-level combination methods. The sets differ in terms of both recognition rates of the individual classifiers and degree of similarity. For this purpose, each abstract-level classifier is considered as a random variable producing one class label as the output for an input pattern. From the initial set of classifiers, new slightly different sets are generated by applying specific operators, which are defined at the purpose. Finally, the sets of synthetic classifiers have been used to estimate the performance of combination methods for abstract-level classifiers. The experimental results demonstrate the effectiveness of the proposed approach.

Keywords: Abstract-level Classifier, Dempster-Shafer Rule, Multi-expert Systems, Similarity Index, System Evaluation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1476
488 Cutting and Breaking Events in Telugu

Authors: Vasanta Duggirala, Y. Viswanatha Naidu

Abstract:

This paper makes a contribution to the on-going debate on conceptualization and lexicalization of cutting and breaking (C&B) verbs by discussing data from Telugu, a language of India belonging to the Dravidian family. Five Telugu native speakers- verbalizations of agentive actions depicted in 43 short video-clips were analyzed. It was noted that verbalization of C&B events in Telugu requires formal units such as simple lexical verbs, explicator compound verbs, and other complex verb forms. The properties of the objects involved, the kind of instruments used, and the manner of action had differential influence on the lexicalization patterns. Further, it was noted that all the complex verb forms encode 'result' and 'cause' sub-events in that order. Due to the polysemy associated with some of the verb forms, our data does not support the straightforward bipartition of this semantic domain.

Keywords: Cluster analysis, Cutting and breaking events, Polysemy, Semantic extension, Telugu.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2132
487 Method of Cluster Based Cross-Domain Knowledge Acquisition for Biologically Inspired Design

Authors: Shen Jian, Hu Jie, Ma Jin, Peng Ying Hong, Fang Yi, Liu Wen Hai

Abstract:

Biologically inspired design inspires inventions and new technologies in the field of engineering by mimicking functions, principles, and structures in the biological domain. To deal with the obstacles of cross-domain knowledge acquisition in the existing biologically inspired design process, functional semantic clustering based on functional feature semantic correlation and environmental constraint clustering composition based on environmental characteristic constraining adaptability are proposed. A knowledge cell clustering algorithm and the corresponding prototype system is developed. Finally, the effectiveness of the method is verified by the visual prosthetic device design.

Keywords: Knowledge based engineering, biologically inspired design, knowledge cell, knowledge clustering, knowledge acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1003
486 Social Software Approach to E-Learning 3.0

Authors: Anna Nedyalkova, KrassimirNedyalkov, TeodoraBakardjieva

Abstract:

In the present paper, we-ll explore how social media tools provide an opportunity for new developments of the e-Learning in the context of managing personal knowledge. There will be a discussion how social media tools provide a possibility for helping knowledge workersand students to gather, organize and manage their personal information as a part of the e-learning process. At the centre of this social software driven approach to e-learning environments are the challenges of personalization and collaboration. We-ll share concepts of how organizations are using social media for e-Learning and believe that integration of these tools into traditional e-Learning is probably not a choice, but inevitability. Students- Survey of use of web technologies and social networking tools is presented. Newly developed framework for semantic blogging capable of organizing results relevant to user requirements is implemented at Varna Free University (VFU) to provide more effective navigation and search.

Keywords: Semantic blogging, social media tools, e-Learning, web 2.0, web 3.0.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1799
485 Distances over Incomplete Diabetes and Breast Cancer Data Based on Bhattacharyya Distance

Authors: Loai AbdAllah, Mahmoud Kaiyal

Abstract:

Missing values in real-world datasets are a common problem. Many algorithms were developed to deal with this problem, most of them replace the missing values with a fixed value that was computed based on the observed values. In our work, we used a distance function based on Bhattacharyya distance to measure the distance between objects with missing values. Bhattacharyya distance, which measures the similarity of two probability distributions. The proposed distance distinguishes between known and unknown values. Where the distance between two known values is the Mahalanobis distance. When, on the other hand, one of them is missing the distance is computed based on the distribution of the known values, for the coordinate that contains the missing value. This method was integrated with Wikaya, a digital health company developing a platform that helps to improve prevention of chronic diseases such as diabetes and cancer. In order for Wikaya’s recommendation system to work distance between users need to be measured. Since there are missing values in the collected data, there is a need to develop a distance function distances between incomplete users profiles. To evaluate the accuracy of the proposed distance function in reflecting the actual similarity between different objects, when some of them contain missing values, we integrated it within the framework of k nearest neighbors (kNN) classifier, since its computation is based only on the similarity between objects. To validate this, we ran the algorithm over diabetes and breast cancer datasets, standard benchmark datasets from the UCI repository. Our experiments show that kNN classifier using our proposed distance function outperforms the kNN using other existing methods.

Keywords: Missing values, distance metric, Bhattacharyya distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 769
484 Enhanced Conference Organization Based On Correlation of Web Information and Ontology Based Expertise Search

Authors: Hassan Noureddine, Maria Sokhn, Iman Jarkass, Elena Mugellini, Omar Abou Khaled

Abstract:

From the importance of the conference and its constructive role in the studies discussion, there must be a strong organization that allows the exploitation of the discussions in opening new horizons. The vast amount of information scattered across the web, make it difficult to find experts, who can play a prominent role in organizing conferences. In this paper we proposed a new approach of extracting researchers- information from various Web resources and correlating them in order to confirm their correctness. As a validator of this approach, we propose a service that will be useful to set up a conference. Its main objective is to find appropriate experts, as well as the social events for a conference. For this application we us Semantic Web technologies like RDF and ontology to represent the confirmed information, which are linked to another ontology (skills ontology) that are used to present and compute the expertise.

Keywords: Expert finding, Information extraction, Ontologies, Semantic web, Social events.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608
483 Machine Learning for Music Aesthetic Annotation Using MIDI Format: A Harmony-Based Classification Approach

Authors: Lin Yang, Zhian Mi, Jiacheng Xiao, Rong Li

Abstract:

Swimming with the tide of deep learning, the field of music information retrieval (MIR) experiences parallel development and a sheer variety of feature-learning models has been applied to music classification and tagging tasks. Among those learning techniques, the deep convolutional neural networks (CNNs) have been widespreadly used with better performance than the traditional approach especially in music genre classification and prediction. However, regarding the music recommendation, there is a large semantic gap between the corresponding audio genres and the various aspects of a song that influence user preference. In our study, aiming to bridge the gap, we strive to construct an automatic music aesthetic annotation model with MIDI format for better comparison and measurement of the similarity between music pieces in the way of harmonic analysis. We use the matrix of qualification converted from MIDI files as input to train two different classifiers, support vector machine (SVM) and Decision Tree (DT). Experimental results in performance of a tag prediction task have shown that both learning algorithms are capable of extracting high-level properties in an end-to end manner from music information. The proposed model is helpful to learn the audience taste and then the resulting recommendations are likely to appeal to a niche consumer.

Keywords: Harmonic analysis, machine learning, music classification and tagging, MIDI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 734