Search results for: Jaccard’s similarity measure.
1268 A Comparative Study of Web-pages Classification Methods using Fuzzy Operators Applied to Arabic Web-pages
Authors: Ahmad T. Al-Taani, Noor Aldeen K. Al-Awad
Abstract:
In this study, a fuzzy similarity approach for Arabic web page classification is presented. The approach uses a fuzzy term-category relation by manipulating the membership degree for the training data and the degree value for a test web page. Six measures are used and compared in this study: the Einstein, Algebraic, Hamacher, MinMax, Special case fuzzy and Bounded Difference approaches. These measures are applied and compared using 50 different Arabic web pages. The Einstein measure gave the best performance among the measures. An analysis of these measures and concluding remarks are drawn in this study.
Keywords: Text classification, HTML, web pages, machine learning, fuzzy logic, Arabic web pages.
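Several of the measures named in this abstract correspond to standard fuzzy conjunction operators (t-norms). The sketch below shows their textbook definitions for two membership degrees in [0, 1]. How the paper combines them into a term-category score is not stated in the abstract, so the function names and the reading of "Bounded Difference" as max(0, a + b - 1) are assumptions made for illustration only.

```python
# Textbook fuzzy conjunction operators (t-norms) on membership degrees in [0, 1].
# Assumption: these are the standard definitions; the paper's exact variants
# and how it aggregates them per category are not given in the abstract.

def algebraic_product(a: float, b: float) -> float:
    return a * b

def einstein_product(a: float, b: float) -> float:
    # Denominator equals 1 + (1 - a)(1 - b), so it is never below 1.
    return (a * b) / (2.0 - (a + b - a * b))

def hamacher_product(a: float, b: float) -> float:
    denom = a + b - a * b
    # By convention the Hamacher product is 0 when both degrees are 0.
    return 0.0 if denom == 0.0 else (a * b) / denom

def bounded_difference(a: float, b: float) -> float:
    # Assumed t-norm form; some texts call this the bounded product.
    return max(0.0, a + b - 1.0)

def minimum(a: float, b: float) -> float:
    return min(a, b)

if __name__ == "__main__":
    a, b = 0.7, 0.4  # hypothetical membership degrees
    for op in (algebraic_product, einstein_product, hamacher_product,
               bounded_difference, minimum):
        print(f"{op.__name__}: {op(a, b):.4f}")
```

For the same pair of degrees, these operators order as bounded difference ≤ Einstein ≤ algebraic ≤ Hamacher ≤ minimum, which is one reason a comparison across them can produce different classification decisions.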
Downloads 2191
1267 Applying Similarity Theory and Hilbert Huang Transform for Estimating the Differences of Pig's Blood Pressure Signals between Situations of Intestinal Artery Blocking and Unblocking
Authors: Jia-Rong Yeh, Tzu-Yu Lin, Jiann-Shing Shieh, Yun Chen
Abstract:
A mammal's body can be seen as a blood vessel with complex tunnels. When the heart pumps blood periodically, blood runs through the blood vessels and rebounds from their walls. Blood pressure signals can be measured with complex but periodic patterns. When an artery is clamped during a surgical operation, the spectrum of the blood pressure signals will differ from that of the normal situation. In this investigation, intestinal artery clamping operations were conducted on a pig to simulate the situation of intestinal blocking during a surgical operation. Similarity theory is a convenient and easy tool to show that the patterns of blood pressure signals during intestinal artery blocking and unblocking are different. The Hilbert Huang Transform algorithm can then be applied to extract the characteristic parameters of the blood pressure pattern. In conclusion, the patterns of blood pressure signals of the two situations, intestinal artery blocking and unblocking, can be distinguished by the characteristic parameters defined in this paper.
Keywords: Blood pressure, spectrum, intestinal artery, similarity theory and Hilbert Huang Transform.
Downloads 1577
1266 Control-flow Complexity Measurement of Processes and Weyuker's Properties
Authors: Jorge Cardoso
Abstract:
Process measurement is the task of empirically and objectively assigning numbers to the properties of business processes in such a way as to describe them. Desirable attributes to study and measure include complexity, cost, maintainability, and reliability. In our work we will focus on investigating process complexity. We define process complexity as the degree to which a business process is difficult to analyze, understand or explain. One way to analyze a process's complexity is to use a process control-flow complexity measure. In this paper, an attempt has been made to evaluate the control-flow complexity measure in terms of Weyuker's properties. Weyuker's properties must be satisfied by any complexity measure to qualify as a good and comprehensive one.
Keywords: Business process measurement, workflow, complexity.
Downloads 2640
1265 Similarity Based Retrieval in Case Based Reasoning for Analysis of Medical Images
Authors: M. Das Gupta, S. Banerjee
Abstract:
Content Based Image Retrieval (CBIR) coupled with Case Based Reasoning (CBR) is a paradigm that is becoming increasingly popular in the diagnosis and therapy planning of medical ailments utilizing the digital content of medical images. This paper presents a survey of some of the promising approaches used in the detection of abnormalities in retina images as well as in mammographic screening and detection of regions of interest in MRI scans of the brain. We also describe our proposed algorithm to detect hard exudates in fundus images of the retina of Diabetic Retinopathy patients.
Keywords: Case based reasoning, Exudates, Retina image, Similarity based retrieval.
Downloads 2080
1264 Flow Behavior and Performances of Centrifugal Compressor Stage Vaneless Diffusers
Authors: Y. Galerkin, O. Solovieva
Abstract:
Flow parameters are calculated in vaneless diffusers with relative widths of 0.014–0.10. Inlet flow angles and similarity criteria were varied. Information is given on flow separation, boundary layer development and the configuration of streamlines. Polytropic efficiency, loss coefficient and recovery coefficient are used to compare the effectiveness of the diffusers. An example of the optimization of a narrow diffuser with conical walls is presented. Three wide diffusers with narrowing walls are compared. The work was carried out in the R&D laboratory “Gas dynamics of turbo machines” of the TU SPb.
Keywords: Vaneless diffuser, relative width, flow angle, flow separation, loss coefficient, similarity criteria.
Downloads 2216
1263 Comparative Studies on Vertical Stratification, Floristic Composition, and Woody Species Diversity of Subtropical Evergreen Broadleaf Forests Between the Ryukyu Archipelago, Japan, and South China
Authors: M. Wu, S. M. Feroz, A. Hagihara, L. Xue, Z. L. Huang
Abstract:
In order to compare vertical stratification, floristic composition, and woody species diversity of subtropical evergreen broadleaf forests between the Ryukyu Archipelago, Japan, and South China, tree censuses in a 400 m² plot in Ishigaki Island and a 1225 m² plot in Dinghushan Nature Reserve were performed. Both of the subtropical forests consisted of five vertical strata. The floristic composition of the Ishigaki forest was quite different from that of the Dinghushan forest in terms of similarity on a species level (Kuno's similarity index r0 = 0.05). The values of Shannon's index H' and Pielou's index J' tended to increase from the bottom stratum upward in both forests, except H' for the top stratum in the Ishigaki forest and the upper two strata in the Dinghushan forest. The woody species diversity in the Dinghushan forest (H' = 3.01 bit) was much lower than that in the Ishigaki forest (H' = 4.36 bit).
Keywords: Floristic similarity, subtropical evergreen broadleaf forest, vertical stratification, woody species diversity.
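The two diversity indices quoted in this abstract have standard definitions: Shannon's H' is the entropy of the species proportions (in bits when the logarithm is base 2), and Pielou's J' divides H' by its maximum, log2 of the number of species. The sketch below computes both from per-species individual counts; the counts are hypothetical and are not taken from the paper's plots.

```python
import math

def shannon_index_bits(counts):
    """Shannon's H' in bits from a list of per-species individual counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Pielou's J' = H' / log2(S), where S is the number of species present."""
    species = sum(1 for c in counts if c > 0)
    if species <= 1:
        return 0.0  # evenness is undefined for a single species; 0 by convention here
    return shannon_index_bits(counts) / math.log2(species)

if __name__ == "__main__":
    plot_counts = [40, 25, 20, 10, 5]  # hypothetical tree counts per species
    print(f"H' = {shannon_index_bits(plot_counts):.3f} bit")
    print(f"J' = {pielou_evenness(plot_counts):.3f}")
```

A perfectly even community of four species gives H' = 2 bit and J' = 1, which is a quick sanity check on the units reported above.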
Downloads 1616
1262 Issue Reorganization Using the Measure of Relevance
Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim
Abstract:
The need to extract R&D keywords from issues and use them to retrieve R&D information is increasing rapidly. However, it is difficult to identify related issues or distinguish them. Although the similarity between issues cannot be identified, with an R&D lexicon, issues that always share the same R&D keywords can be determined. In detail, the R&D keywords that are associated with a particular issue imply the key technology elements that are needed to solve a particular issue. Furthermore, the relationship among issues that share the same R&D keywords can be shown in a more systematic way by clustering them according to keywords. Thus, sharing R&D results and reusing R&D technology can be facilitated. Indirectly, redundant investment in R&D can be reduced as the relevant R&D information can be shared among corresponding issues and the reusability of related R&D can be improved. Therefore, a methodology to cluster issues from the perspective of common R&D keywords is proposed to satisfy these demands.
Keywords: Clustering, Social Network Analysis, Text Mining, Topic Analysis.
Downloads 2003
1261 Weighted Clustering Coefficient for Identifying Modular Formations in Protein-Protein Interaction Networks
Authors: Zelmina Lubovac, Björn Olsson, Jonas Gamalielsson
Abstract:
This paper describes a novel approach for deriving modules from protein-protein interaction networks, which combines functional information with topological properties of the network. This approach is based on a weighted clustering coefficient, which uses weights representing the functional similarities between the proteins. These weights are calculated according to the semantic similarity between the proteins, which is based on their Gene Ontology terms. We recently proposed an algorithm for identification of functional modules, called SWEMODE (Semantic WEights for MODule Elucidation), that identifies dense sub-graphs containing functionally similar proteins. The rationale underlying this approach is that each module can be reduced to a set of triangles (protein triplets connected to each other). Here, we propose considering semantic similarity weights of all triangle-forming edges between proteins. We also apply varying semantic similarity thresholds between neighbours of each node that are not neighbours of each other (and hence do not form a triangle), to derive new potential triangles to include in the module-defining procedure. The results show an improvement over the purely topological approach in terms of the number of predicted modules that match known complexes.
Keywords: Modules, systems biology, protein interaction networks, yeast.
Downloads 2053
1260 Using Suffix Tree Document Representation in Hierarchical Agglomerative Clustering
Authors: Daniel I. Morariu, Radu G. Cretulescu, Lucian N. Vintan
Abstract:
In the text categorization problem, the most widely used method for document representation is based on word frequency vectors, called the VSM (Vector Space Model). This representation is based only on the words in the documents and therefore loses any “word context” information found in the document. In this article we compare the classical method of document representation with a method called the Suffix Tree Document Model (STDM), which is based on representing documents in the suffix tree format. For the STDM model we propose a new approach for document representation and a new formula for computing the similarity between two documents. We propose to build the suffix tree only for any two documents at a time. This approach is faster, has lower memory consumption and uses the entire document representation without using methods for disposing of nodes. The proposed formula for computing the similarity between documents substantially improves the clustering quality. This representation method was validated using HAC (Hierarchical Agglomerative Clustering). In this context we also examine the influence of stemming in the document preprocessing step and highlight the difference between similarity and dissimilarity measures for finding “closer” documents.
Keywords: Text Clustering, Suffix tree document representation, Hierarchical Agglomerative Clustering
Downloads 1866
1259 Analysis of Diverse Clustering Tools in Data Mining
Authors: S. Sarumathi, N. Shanthi, M. Sharmila
Abstract:
Clustering in data mining is an unsupervised learning technique for aggregating data objects into meaningful groups such that the intra-cluster similarity of objects is maximized and the inter-cluster similarity of objects is minimized. Over the past decades, several clustering tools have emerged in which clustering algorithms are built in, making it easier to use them and extract the expected results. Data mining mainly deals with huge databases, which imposes additional rigorous computational constraints on cluster analysis. These challenges pave the way for the emergence of powerful, expansive data mining clustering software. In this survey, a variety of clustering tools used in data mining are elucidated along with the pros and cons of each.
Keywords: Cluster Analysis, Clustering Algorithms, Clustering Techniques, Association, Visualization.
Downloads 2151
1258 Unsteady Reversed Stagnation-Point Flow over a Flat Plate
Authors: Vai Kuong Sin, Chon Kit Chio
Abstract:
This paper investigates the nature of the development of two-dimensional laminar flow of an incompressible fluid at the reversed stagnation-point. In this study, we revisit the problem of reversed stagnation-point flow over a flat plate. Proudman and Johnson (1962) first studied the flow and obtained an asymptotic solution by neglecting the viscous terms. However, neglecting the viscous terms is not valid within the total flow field. In particular, it is pointed out that for a plate impulsively accelerated from rest to a constant velocity V0, a similarity solution to the self-similar ODE is obtained which is, notably, completely analytical.
Keywords: reversed stagnation-point flow, similarity solutions, analytical solution, numerical solution
Downloads 1415
1257 Isolation and Identification of Diacylglycerol Acyltransferase Type-2 (DGAT2) Genes from Three Egyptian Olive Cultivars
Authors: Yahia I. Mohamed, Ahmed I. Marzouk, Mohamed A. Yacout
Abstract:
The aim of this work was to study the genetic basis for oil accumulation in olive fruit by tracking the DGAT2 (Diacylglycerol acyltransferase type-2) gene in three olive cultivars of Egyptian origin, namely Toffahi, Hamed and Maraki, using molecular marker techniques and bioinformatics tools. The results show that, first, a specific genomic band of the Maraki cultivar was identified as DGAT2 and was identical to this gene in Olea europaea, with 100% similarity. Second, a differential genomic band of the Maraki cultivar, produced by the RAPD fingerprinting technique, yielded a distinct predicted sequence identified as DGAT2 in Fragaria vesca subsp. vesca, with 76% sequence similarity. Third, a specific genomic band of the Hamed cultivar was identified as two fragments: (1) Olea europaea cultivar Koroneiki diacylglycerol acyltransferase type 2 mRNA, complete cds, with two matching regions with 99% similarity, or (2) Predicted: Fragaria vesca subsp. vesca diacylglycerol O-acyltransferase 2-like (LOC101313050), mRNA, with 86% similarity.
Keywords: Olea europaea, fingerprinting, Diacylglycerol acyltransferase type-2 (DGAT2).
Downloads 2377
1256 Automatic Image Alignment and Stitching of Medical Images with Seam Blending
Authors: Abhinav Kumar, Raja Sekhar Bandaru, B Madhusudan Rao, Saket Kulkarni, Nilesh Ghatpande
Abstract:
This paper proposes an algorithm which automatically aligns and stitches component medical images (fluoroscopic) with varying degrees of overlap into a single composite image. The alignment method is based on a similarity measure between the component images. As applied here, the technique is intensity based rather than feature based. It works well in domains where feature based methods have difficulty, yet it is more robust than traditional correlation. Component images are stitched together using a new triangular averaging based blending algorithm. The quality of the resultant image is tested for photometric inconsistencies and geometric misalignments. This method cannot correct rotational, scale and perspective artifacts.
Keywords: Histogram Matching, Image Alignment, Image Stitching, Medical Imaging.
Downloads 3702
1255 Wasp Venom Peptides may play a role in the Pathogenesis of Acute Disseminated Encephalomyelitis in Humans: A Structural Similarity Analysis
Authors: Permphan Dharmasaroja
Abstract:
Acute disseminated encephalomyelitis (ADEM) has been reported to develop after a hymenoptera sting, but its pathogenesis is not known in detail. Myelin basic protein (MBP)-specific T cells have been detected in the blood of patients with ADEM, and a proportion of these patients develop multiple sclerosis (MS). In an attempt to understand the mechanisms underlying ADEM, molecular mimicry between hymenoptera venom peptides and the human immunodominant MBP peptide was examined, based on sequence and structural similarities, to determine whether it was the root of the disease. The results suggest that the three wasp venom peptides have low sequence homology with the human immunodominant MBP residues 85-99. Structural similarity analysis among the three venom peptides and the MS-related HLA-DR2b (DRA, DRB1*1501)-associated immunodominant MHC binding/TCR contact residues 88-93, VVHFFK, showed that hyaluronidase residues 7-12, phospholipase A1 residues 98-103, and antigen 5 residues 109-114 had a high degree of similarity: 83.3%, 100%, and 83.3%, respectively. In conclusion, some wasp venom peptides, particularly phospholipase A1, may potentially act as molecular mimics of the human HLA-DR2b-associated immunodominant MBP88-93, and possibly present a mechanism for induction of wasp sting-associated ADEM.
Keywords: central nervous system, Hymenoptera, myelin basic protein, molecular mimicry.
Downloads 1577
1254 Image Indexing Using a Color Similarity Metric based on the Human Visual System
Authors: Angelo Nodari, Ignazio Gallo
Abstract:
The novelty proposed in this study is twofold: the development of a new color similarity metric based on the human visual system, and a new color indexing method based on a textual approach. The new color similarity metric is based on the color perception of the human visual system. Consequently, the results returned by the indexing system can fulfill the user's expectations as much as possible. We developed a web application to collect users' judgments about the similarities between colors, and the results are used to estimate the metric proposed in this study. In order to index an image's colors, we used a text indexing engine to facilitate the integration of visual features in a database of text documents. The textual signature is built by weighting the image's colors according to their occurrence in the image. The use of a textual indexing engine provides a simple, fast and robust solution for indexing images. A typical use of the system proposed in this study is the development of applications whose data type is both visual and textual. In order to evaluate the proposed method, we chose a price comparison engine as a case study, collecting a series of commercial offers, each containing a textual description and an image representing the specific offer.
Keywords: Color Extraction, Content-Based Image Retrieval, Indexing
Downloads 2986
1253 An Optimization Algorithm Based on Dynamic Schema with Dissimilarities and Similarities of Chromosomes
Authors: Radhwan Yousif Sedik Al-Jawadi
Abstract:
Optimization is necessary for finding appropriate solutions to a range of real-life problems. In particular, genetic (or more generally, evolutionary) algorithms have proved very useful in solving many problems for which analytical solutions are not available. In this paper, we present an optimization algorithm called Dynamic Schema with Dissimilarity and Similarity of Chromosomes (DSDSC) which is a variant of the classical genetic algorithm. This approach constructs new chromosomes from a schema and pairs of existing ones by exploring their dissimilarities and similarities. To show the effectiveness of the algorithm, it is tested and compared with the classical GA on 15 two-dimensional optimization problems taken from the literature. We have found that, in most cases, our method is better than the classical genetic algorithm.
Keywords: Genetic algorithm, similarity and dissimilarity, chromosome injection, dynamic schema.
Downloads 1256
1252 A Similarity Metric for Assessment of Image Fusion Algorithms
Authors: Nedeljko Cvejic, Artur Łoza, David Bull, Nishan Canagarajah
Abstract:
In this paper, we present a novel objective nonreference performance assessment algorithm for image fusion. It takes into account local measurements to estimate how well the important information in the source images is represented by the fused image. The metric is based on the Universal Image Quality Index and uses the similarity between blocks of pixels in the input images and the fused image as the weighting factors for the metrics. Experimental results confirm that the values of the proposed metrics correlate well with the subjective quality of the fused images, giving a significant improvement over standard measures based on mean squared error and mutual information.
Keywords: Fusion performance measures, image fusion, nonreference quality measures, objective quality measures.
Downloads 2413
1251 Generation of Sets of Synthetic Classifiers for the Evaluation of Abstract-Level Combination Methods
Authors: N. Greco, S. Impedovo, R. Modugno, G. Pirlo
Abstract:
This paper presents a new technique for generating sets of synthetic classifiers to evaluate abstract-level combination methods. The sets differ in terms of both the recognition rates of the individual classifiers and their degree of similarity. For this purpose, each abstract-level classifier is considered as a random variable producing one class label as the output for an input pattern. From the initial set of classifiers, new slightly different sets are generated by applying specific operators, which are defined for this purpose. Finally, the sets of synthetic classifiers have been used to estimate the performance of combination methods for abstract-level classifiers. The experimental results demonstrate the effectiveness of the proposed approach.
Keywords: Abstract-level Classifier, Dempster-Shafer Rule, Multi-expert Systems, Similarity Index, System Evaluation
Downloads 1436
1250 Using Spectral Vectors and M-Tree for Graph Clustering and Searching in Graph Databases of Protein Structures
Authors: Do Phuc, Nguyen Thi Kim Phung
Abstract:
In this paper, we represent protein structures using graphs, so that a protein structure database becomes a graph database. Each graph is represented by a spectral vector. We use the Jacobi rotation algorithm to calculate the eigenvalues of the normalized Laplacian representation of the adjacency matrix of the graph. To measure the similarity between two graphs, we calculate the Euclidean distance between their spectral vectors. To cluster the graphs, we use an M-tree with the Euclidean distance to cluster the spectral vectors. In addition, the M-tree can be used for graph searching in the graph database. Our proposed method was tested with a graph database of 100 graphs representing 100 protein structures downloaded from the Protein Data Bank (PDB), and we compare the results with the SCOP hierarchical structure.
Keywords: Eigenvalues, m-tree, graph database, protein structure, spectral graph theory.
Downloads 1602
1249 MIBiClus: Mutual Information based Biclustering Algorithm
Authors: Neelima Gupta, Seema Aggarwal
Abstract:
Most of the biclustering/projected clustering algorithms are based either on the Euclidean distance or correlation coefficient which capture only linear relationships. However, in many applications, like gene expression data and word-document data, non linear relationships may exist between the objects. Mutual Information between two variables provides a more general criterion to investigate dependencies amongst variables. In this paper, we improve upon our previous algorithm that uses mutual information for biclustering in terms of computation time and also the type of clusters identified. The algorithm is able to find biclusters with mixed relationships and is faster than the previous one. To the best of our knowledge, none of the other existing algorithms for biclustering have used mutual information as a similarity measure. We present the experimental results on synthetic data as well as on the yeast expression data. Biclusters on the yeast data were found to be biologically and statistically significant using GO Tool Box and FuncAssociate.
Keywords: Biclustering, mutual information.
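The abstract's point that mutual information captures dependencies a linear measure misses can be illustrated with the standard plug-in estimate for discrete variables. The sketch below is not the MIBiClus algorithm; it only computes mutual information (in bits) from empirical frequencies, and the sample data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two equal-length discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal length")
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        p_indep = (count_x[x] / n) * (count_y[y] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi

if __name__ == "__main__":
    # y = x**2 is fully determined by x, yet its linear correlation with x is zero.
    xs = [-2, -1, 0, 1, 2]
    ys = [x * x for x in xs]
    print(f"I(X; Y) = {mutual_information(xs, ys):.3f} bits")
```

For continuous data such as gene expression values, some discretization or density estimate would be needed first; that step is left out here because the abstract does not describe how the paper handles it.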
Downloads 1576
1248 Application of Data Envelopment Analysis to Assess Quality Management Efficiency
Authors: Chuen Tse Kuah, Kuan Yew Wong, Farzad Behrouzi
Abstract:
This paper aims to illustrate the application of Data Envelopment Analysis (DEA) as a tool to assess Quality Management (QM) efficiency. A variant of DEA, the slack based measure (SBM), is used for this purpose. This study finds that DEA is suitable for measuring QM efficiency and for giving improvement suggestions for inefficient QM.
Keywords: Quality Management, Data Envelopment Analysis, Slack Based Measure, Efficiency Measurement.
Downloads 2040
1247 Genetic Characterization of Barley Genotypes via Inter-Simple Sequence Repeat
Authors: Mustafa Yorgancılar, Emine Atalay, Necdet Akgün, Ali Topal
Abstract:
In this study, the polymerase chain reaction based inter-simple sequence repeat (ISSR) DNA fingerprinting technique was used to investigate the genetic relationships among barley crossbreed genotypes in Turkey. Selection based on the genetic base via ISSR is important in breeding programs in terms of breeding time. 14 ISSR primers generated a total of 97 bands, of which 81 (83.35%) were polymorphic. The highest total resolution power (RP) values were obtained from the F2 (0.53) and M16 (0.51) primers. According to the ISSR results, the genetic similarity index ranged between 0.64 and 0.95; the Line 3 and Line 6 genotypes were the closest, while Line 36 was the most distant. The ISSR markers were found to be promising for assessing genetic diversity in barley crossbreed genotypes.
Keywords: Barley, crossbreed, genetic similarity, ISSR.
Downloads 864
1246 Project Complexity Indices based on Topology Features
Authors: Amer A. Boushaala
Abstract:
The heuristic decision rules used for project scheduling will vary depending upon the project's size, complexity, duration, personnel, and owner requirements. The concept of project complexity has received little detailed attention. The need to differentiate between easy and hard problem instances, and the interest in isolating the fundamental factors that determine the computing effort required by these procedures, inspired a number of researchers to develop various complexity measures. In this study, the most common measures of project complexity are presented, and a new measure of project complexity is developed. The main advantage of the proposed measure is that it considers size, shape and logic characteristics, time characteristics, resource demands and availability characteristics, as well as the number of critical activities and critical paths. The sensitivity of the proposed measure to the complexity of project networks has been tested and evaluated against the other complexity measures on the fifty project networks considered in the current study. The developed measure showed more sensitivity to changes in the network data and gives accurate quantified results when comparing the complexities of networks.
Keywords: Activity networks, Complexity index, Network complexity measure, Network topology, Project Network.
Downloads 1630
1245 Using Morphological and Microsatellite (SSR) Markers to Assess the Genetic Diversity in Alfalfa (Medicago sativa L.)
Authors: T. Cholastova, D. Knotova
Abstract:
Utilization of diverse germplasm is needed to enhance the genetic diversity of cultivars. The objective of this study was to evaluate the genetic relationships of 98 alfalfa germplasm accessions using morphological traits and SSR markers. From the 98 tested populations, 81 were locals originating in Europe, 17 were introduced from USA, Australia, New Zealand and Canada. Three primers generated 67 polymorphic bands. The average polymorphic information content (PIC) was very high (> 0.90) over all three used primer combinations. Cluster analysis using Unweighted Pair Group Method with Arithmetic Means (UPGMA) and Jaccard's coefficient grouped the accessions into 2 major clusters with 4 sub-clusters with no correlation between genetic and morphological diversity. The SSR analysis clearly indicated that even with three polymorphic primers, reliable estimation of genetic diversity could be obtained.
Keywords: genetic diversity, Medicago sativa L., morphological traits, SSR markers
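Jaccard's coefficient, the measure this search targets, compares two presence/absence profiles by dividing the number of bands present in both by the number present in at least one; shared absences are ignored. The sketch below applies it to hypothetical 0/1 marker band profiles. The accession names and band data are invented for illustration, and the UPGMA clustering step mentioned in the abstract is not reproduced.

```python
from itertools import combinations

def jaccard_coefficient(a, b):
    """Jaccard similarity of two equal-length 0/1 band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of bands")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Convention chosen here: two profiles with no bands at all score 0.
    return both / either if either else 0.0

if __name__ == "__main__":
    # Hypothetical presence (1) / absence (0) of five marker bands per accession.
    profiles = {
        "acc_A": [1, 1, 0, 1, 0],
        "acc_B": [1, 0, 0, 1, 1],
        "acc_C": [0, 0, 1, 0, 1],
    }
    for (name_1, p1), (name_2, p2) in combinations(profiles.items(), 2):
        print(f"{name_1} vs {name_2}: {jaccard_coefficient(p1, p2):.2f}")
```

A clustering method such as UPGMA would typically take 1 minus this coefficient as the pairwise distance.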
Downloads 3040
1244 Documents Emotions Classification Model Based on TF-IDF Weighting Measure
Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees
Abstract:
Emotion classification of text documents is applied to reveal whether a document expresses a given emotion of its writer. While different supervised methods have previously been used for classifying documents by emotion, in this research we present a novel model that supports the classification algorithms in producing more accurate results by means of the TF-IDF measure. Different experiments have been applied to reveal the applicability of the proposed model. The model succeeds in raising the accuracy according to the chosen metrics (precision, recall, and f-measure) by refining the lexicon, integrating lexicons using different perspectives, and applying the TF-IDF weighting measure over the classifying features. The proposed model has also been compared with other research to demonstrate its competence in raising the accuracy of the results.
Keywords: Emotion detection, TF-IDF, WEKA tool, classification algorithms.
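For readers unfamiliar with the weighting measure named here, the following is a minimal sketch of one common TF-IDF variant: term frequency normalized by document length, multiplied by the natural log of (number of documents / number of documents containing the term). The abstract does not say which variant the paper or the WEKA tool uses, so this formula and the toy documents are assumptions.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

if __name__ == "__main__":
    # Hypothetical tokenized documents.
    docs = [["happy", "joy", "happy"], ["sad", "joy"]]
    for i, w in enumerate(tf_idf(docs)):
        print(i, {t: round(v, 3) for t, v in w.items()})
```

A term that occurs in every document, like "joy" in this toy corpus, gets weight 0, which is the property that makes TF-IDF useful for down-weighting words that do not discriminate between classes.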
Downloads 1661
1243 XML Schema Automatic Matching Solution
Authors: Huynh Quyet Thang, Vo Sy Nam
Abstract:
Schema matching plays a key role in many different applications, such as schema integration, data integration, data warehousing, data transformation, E-commerce, peer-to-peer data management, ontology matching and integration, semantic Web, semantic query processing, etc. Manual matching is expensive and error-prone, so it is important to develop techniques to automate the schema matching process. In this paper, we present a solution to the XML schema automated matching problem which produces semantic mappings between corresponding schema elements of given source and target schemas. This solution contributes to solving the XML schema automated matching problem more comprehensively and efficiently. Our solution is based on combining the linguistic similarity, data type compatibility and structural similarity of XML schema elements. After describing our solution, we present experimental results that demonstrate the effectiveness of this approach.
Keywords: XML Schema, Schema Matching, Semantic Matching, Automatic XML Schema Matching.
Downloads 1785
1242 The Mutated Distance between Two Mixture Trees
Authors: Wan Chian Li, Justie Su-Tzu Juan, Yi-Chun Wang, Shu-Chuan Chen
Abstract:
The evolutionary tree is an important topic in bioinformatics. In 2006, Chen and Lindsay proposed a new method to build the mixture tree from DNA sequences. The mixture tree is a new type of evolutionary tree that carries two pieces of information beyond those of an ordinary evolutionary tree: a time parameter and the set of mutated sites. In 2008, Lin and Juan proposed an algorithm to compute the distance between two mixture trees; their algorithm considers only the time parameter. In this paper, we propose a method to measure the similarity of two mixture trees that takes the set of mutated sites into account, and we develop two algorithms to compute the distance between two mixture trees. The time complexities of the two proposed algorithms are O(n^2 × max{h(T1), h(T2)}) and O(n^2), respectively.
Keywords: evolutionary tree, mixture tree, mutated site, distance.
Downloads 1373
1241 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes
Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani
Abstract:
Developing a method to estimate gene functions is an important task in bioinformatics. One approach to annotation is to identify the metabolic pathway in which a gene is involved. Since gene expression data reflect various intracellular phenomena, they are considered to be related to gene functions. However, it has been difficult to estimate gene function with high accuracy. The low accuracy is thought to stem from the difficulty of measuring gene expression accurately: even under the same conditions, measured gene expression usually varies. In this study, we propose a feature extraction method that focuses on the variability of gene expression in order to estimate a gene's metabolic pathway accurately. First, we estimated the distribution of each gene's expression from replicate data. Next, we calculated the similarity between all gene pairs using KL divergence, a measure of how one distribution differs from another. Finally, we used the similarity vectors as feature vectors and trained a multiclass SVM to identify each gene's metabolic pathway. To evaluate the method, we applied it to budding yeast and trained the multiclass SVM to identify seven metabolic pathways. The accuracy obtained with our method was higher than that obtained from the raw gene expression data. Thus, our method combined with KL divergence is useful for identifying a gene's metabolic pathway.
Keywords: Metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning.
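As an illustration of the "estimate a distribution from replicates, then compare gene pairs by KL divergence" step described above, the sketch below assumes each gene's expression is modelled as a univariate Gaussian. That modelling choice, the function names, and the replicate values are assumptions made here for illustration; the abstract does not say which distribution family the authors fitted.

```python
import math
from statistics import mean, stdev


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for univariate Gaussians P = N(mu_p, sigma_p^2), Q = N(mu_q, sigma_q^2)."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)


def fit(replicates):
    """Estimate (mean, sample standard deviation) from replicate measurements."""
    return mean(replicates), stdev(replicates)


# Made-up replicate expression values for two hypothetical genes.
gene_a = [2.1, 2.4, 1.9, 2.2]
gene_b = [3.0, 3.3, 2.8, 3.1]

kl_ab = gaussian_kl(*fit(gene_a), *fit(gene_b))
kl_ba = gaussian_kl(*fit(gene_b), *fit(gene_a))  # generally differs from kl_ab
```

KL divergence is zero for identical distributions and is not symmetric, so a feature vector built from it depends on the direction chosen for each gene pair.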
Downloads 2271
1240 Fighter Aircraft Selection Using Technique for Order Preference by Similarity to Ideal Solution with Multiple Criteria Decision Making Analysis
Authors: C. Ardil
Abstract:
This paper presents a multiple criteria decision making analysis technique for selecting fighter aircraft for a national air force. The selection of military aircraft is a process involving contradictory goals and objectives. For a modern air force that needs to choose fighter aircraft to upgrade its existing fleet, a multiple criteria decision making analysis and scenario planning for defense acquisition are put forward. The selection of fighter aircraft for the air defense force is a strategic decision making process, since purchasing or leasing fighter jets, maintaining and operating them, and sustaining a fleet constitute the largest cost for the air force. Multiple criteria decision making analysis methods are effectively applied to facilitate decision making among the available options. The selection criteria were determined from the literature on the fighter aircraft selection problem. The selection of fighter aircraft to be purchased for the air defense forces is handled using a multiple criteria decision making analysis technique that also determines a suitable methodological approach for the defense procurement and fleet upgrade planning process. The aim of this study is to develop an approach for evaluating the fighter aircraft alternatives Su-35, F-35, and TF-X (MMU), based on the technique for order preference by similarity to ideal solution (TOPSIS).
Keywords: Fighter Aircraft, Fighter Aircraft Selection, Technique for Order Preference by Similarity to Ideal Solution, TOPSIS, Multiple Criteria Decision Making, Multiple Criteria Decision Making Analysis, MCDMA, Su-35, F-35, TF-X (MMU)
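For readers unfamiliar with TOPSIS, the following is a minimal sketch of the standard procedure: vector-normalize the decision matrix, apply criterion weights, locate the ideal and anti-ideal points, and rank alternatives by relative closeness to the ideal. The scores, weights, and criteria below are invented placeholders for three unnamed alternatives; they are not taken from the paper and do not describe any real aircraft.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the closeness coefficient (0..1, higher is better) of each alternative.

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if it is a cost
    Assumes no criterion column is all zeros and the alternatives are not all identical.
    """
    n_crit = len(weights)
    # Vector normalization: divide each column by its Euclidean norm, then weight it.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    columns = list(zip(*v))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical decision matrix: two benefit criteria and one cost criterion.
matrix = [
    [9, 7, 120],
    [8, 9, 100],
    [6, 6, 60],
]
weights = [0.4, 0.3, 0.3]
benefit = [True, True, False]
scores = topsis(matrix, weights, benefit)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

`math.dist` requires Python 3.8 or later. An alternative that is best on every criterion would score 1 and one that is worst on every criterion would score 0.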
Downloads 564
1239 A New Approach for Controlling Overhead Traveling Crane Using Rough Controller
Authors: Mazin Z. Othman
Abstract:
This paper presents the idea of a rough controller and applies it to the control of an overhead traveling crane system. The structure of the controller is based on a suggested concept of a fuzzy logic controller. A measure of fuzziness in rough sets is introduced. The fuzzy logic controller and the rough controller are compared, and simulation results comparing the performance of both controllers are shown. From these results we infer that the performance of the proposed rough controller is satisfactory.
Keywords: Accuracy measure, Fuzzy Logic Controller (FLC), Overhead Traveling Crane (OTC), Rough Set Theory (RST), Roughness measure
Downloads 1657