Search results for: phylogenetic trees

202 Comparison of Phylogenetic Trees of Multiple Protein Sequence Alignment Methods

Authors: Khaddouja Boujenfa, Nadia Essoussi, Mohamed Limam

Abstract:

Multiple sequence alignment is a fundamental part of many bioinformatics applications such as phylogenetic analysis. Many alignment methods have been proposed. Each method gives a different result for the same data set, and consequently generates a different phylogenetic tree. Hence, the chosen alignment method affects the resulting tree. However, the literature contains no evaluation of multiple alignment methods based on the comparison of their phylogenetic trees. This work evaluates the following eight aligners: ClustalX, T-Coffee, SAGA, MUSCLE, MAFFT, DIALIGN, ProbCons and Align-m, based on the phylogenetic trees (test trees) they produce on a given data set. The Neighbor-Joining method is used to estimate trees. Three criteria, namely the dNNI, the dRF and the Id_Tree, are established to test the ability of different alignment methods to produce a test tree close to the reference one (true tree). Results show that the method which produces the most accurate alignment gives the test tree nearest to the reference tree. MUSCLE outperforms all aligners with respect to the three criteria and for all datasets, performing particularly well when sequence identities are within 10-20%. It is followed by T-Coffee at lower sequence identity (<10%), Align-m at 20-30% identity, and ClustalX and ProbCons at 30-50% identity. Also, when sequence identities are higher (>30%), the tree scores of all methods become similar.
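
As an illustration of the Robinson-Foulds criterion named in the keywords, the following is a minimal sketch of the standard, unnormalized RF distance between two trees on the same taxa. It is not taken from the paper: the nested-tuple tree encoding and the function names `leaf_set`, `splits` and `robinson_foulds` are assumptions made for this example, and the paper's dRF may use a different normalization.

```python
def leaf_set(tree):
    """Return the set of leaf names in a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions induced by the internal edges.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always gets the same representation.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = below if ref not in below else all_leaves - below
        # Skip trivial splits (empty side, one leaf against all the others).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Count the splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("both trees must be built on the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))


reference = ((("A", "B"), "C"), ("D", "E"))
test_tree = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(reference, test_tree))
```

A distance of zero means the two topologies agree on every internal edge; larger values mean the test tree is further from the reference tree.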

Keywords: Multiple alignment methods, phylogenetic trees, Neighbor-Joining method, Robinson-Foulds distance.

Downloads: 1826

201 A Maximum Parsimony Model to Reconstruct Phylogenetic Network in Honey Bee Evolution

Authors: Usha Chouhan, K. R. Pardasani

Abstract:

Phylogenies, the evolutionary histories of groups of species, are one of the most widely used tools throughout the life sciences, as well as objects of research within systematics and evolutionary biology. Every phylogenetic reconstruction produces trees. These trees represent the evolutionary histories of many groups of organisms, bacteria due to horizontal gene transfer and plants due to the process of hybridization. The processes of gene transfer in bacteria and hybridization in plants lead to reticulate networks; therefore, tree construction methods fail to construct reticulate networks. In this paper, a model is employed to reconstruct a phylogenetic network for the honey bee. This network represents reticulate evolution in the honey bee. The maximum parsimony approach is used to obtain this reticulate network.

Keywords: Hybridization, HGT, Reticulate networks, Recombination, Species, Parsimony.

Downloads: 1606

200 Predicting Protein-Protein Interactions from Protein Sequences Using Phylogenetic Profiles

Authors: Omer Nebil Yaveroglu, Tolga Can

Abstract:

In this study, a high-accuracy protein-protein interaction prediction method is developed. The importance of the proposed method is that it uses only the sequence information of proteins when predicting interactions. The method extracts phylogenetic profiles of proteins from their sequence information. By combining the phylogenetic profiles of two proteins, checking for the existence of homologs in different species, and fitting this combined profile to a statistical model, it is possible to make predictions about the interaction status of the two proteins. For this purpose, we apply a collection of pattern recognition techniques to the dataset of combined phylogenetic profiles of protein pairs. Support Vector Machines, Feature Extraction using ReliefF, Naive Bayes Classification, K-Nearest Neighborhood Classification, Decision Trees, and Random Forest Classification are the methods we applied to find the classification method that best predicts the interaction status of protein pairs. Random Forest Classification outperformed all other methods with a prediction accuracy of 76.93%.

Keywords: Protein Interaction Prediction, Phylogenetic Profile, SVM, ReliefF, Decision Trees, Random Forest Classification

Downloads: 1612

199 Enhanced Character Based Algorithm for Small Parsimony

Authors: Parvinder Singh Sandhu, Sumeet Kaur Sehra, Karmjit Kaur

Abstract:

A phylogenetic tree is a graphical representation of the evolutionary relationship among three or more genes or organisms. These trees show the relatedness of data sets, the divergence time of species or genes, and the nature of their common ancestors. The quality of a phylogenetic tree is judged by a parsimony criterion. Various approaches have been proposed for constructing the most parsimonious trees. This paper is concerned with calculating and optimizing the number of state changes needed, the task addressed by small parsimony algorithms. It proposes an enhanced small parsimony algorithm that gives a better score, based on the number of evolutionary changes needed to produce the observed sequence changes in the tree, and also gives the ancestor of the given input.
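
For orientation, the small parsimony problem can be illustrated with the classic Fitch algorithm, which the paper's enhanced variants build on. The sketch below is the textbook bottom-up pass for one character on a rooted binary tree, not the enhanced algorithm proposed in the paper; the nested-tuple tree encoding and the function name `fitch` are assumptions made for this example.

```python
def fitch(tree, states):
    """Classic Fitch bottom-up pass for a single character.

    tree   : nested 2-tuples with leaf names (strings) at the tips
    states : dict mapping each leaf name to its observed state
    Returns (candidate states at the root, minimum number of changes).
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            # The children agree on at least one state: no change needed here.
            return common
        # The children disagree: one change is required on this subtree.
        changes += 1
        return left | right

    root_states = visit(tree)
    return root_states, changes


tree = (("A", "B"), ("C", "D"))
observed = {"A": "G", "B": "G", "C": "T", "D": "G"}
print(fitch(tree, observed))
```

The returned count is the parsimony score for that character; summing it over all alignment columns gives the score of the whole tree.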

Keywords: Phylogenetic Analysis, Small Parsimony, Enhanced Fitch Algorithm, Enhanced Sankoff Algorithm.

Downloads: 1348

198 Heritage Tree Expert Assessment and Classification: Malaysian Perspective

Authors: B.-Y.-S. Lau, Y.-C.-T. Jonathan, M.-S. Alias

Abstract:

Heritage trees are large, individual natural trees of exceptional value due to their association with age, an event, or distinguished people. In Malaysia, there is an abundance of tropical heritage trees throughout the country. It is essential to set up a repository of heritage trees to prevent valuable trees from being cut down. In this cross-domain study, a web-based expert system, the Heritage Tree Expert Assessment and Classification (HTEAC), is developed and deployed for the public to nominate potential heritage trees. Based on the nominations, tree care experts or arborists evaluate and verify the nominated trees as heritage trees. The expert system automatically rates the approved heritage trees according to pre-defined grades via the Delphi technique. The features and a usability test of the expert system are presented. Preliminary results are promising for the system to be used as a full-scale public system.

Keywords: Arboriculture, Delphi, expert system, heritage tree, urban forestry.

Downloads: 1430

197 Evaluation of Hazardous Status of Avenue Trees in University of Port Harcourt

Authors: F. S. Eguakun, T. C. Nkwor

Abstract:

Trees in the university environment are uniquely positioned; however, they can also be a millstone to the infrastructure and people they coexist with. The numerous benefits of trees can be negated by poor tree health and anthropogenic activities, and trees can as a result become hazardous. The study aims to evaluate the hazard status of avenue trees in the University of Port Harcourt. Data were collected from all the avenue trees along the selected major roads in the University. Tree growth variables were measured, and the health condition of the avenue trees was assessed as an indicator of structural defects. The hazard status of the avenue trees was determined. Several tree species were used as avenue trees in the University; however, Azadirachta indica (81%) was found to be the most abundant. The results show that only 0.3% of avenue tree species were found to pose a severe hazard in the Abuja part of the University. Most avenue trees (55.2%) were rated as having medium hazard status. Due to the danger and risk associated with hazardous trees, the study recommends that good and effective management strategies be implemented to prevent future damage from trees with small or medium hazard status.

Keywords: Avenue tree, hazard status, inventory, urban.

Downloads: 716

196 Evolutionary Decision Trees and Software Metrics for Module Defects Identification

Authors: Monica Chiş

Abstract:

A software metric is a measure of some property of a piece of software or its specification. The aim of this paper is to present an application of evolutionary decision trees in software engineering that classifies software modules according to whether or not they have one or more reported defects. For this purpose, some metrics are used to detect the class of modules with defects or without defects.

Keywords: Evolutionary decision trees, decision trees, software metrics.

Downloads: 1751

195 Molecular Evolutionary Analysis of Yeast Protein Interaction Network

Authors: Soichi Ogishima, Takeshi Hase, So Nakagawa, Yasuhiro Suzuki, Hiroshi Tanaka

Abstract:

To understand life as a biological system, evolutionary understanding is indispensable. Protein interaction data are rapidly accumulating and are suitable for system-level evolutionary analysis. We have analyzed the yeast protein interaction network using both mathematical and biological approaches. In this poster presentation, we inferred the evolutionary birth periods of yeast proteins by reconstructing phylogenetic profiles. It has been thought that hub proteins, which have a high connection degree, are evolutionarily old. However, our analysis showed that hub proteins are entirely evolutionarily new. We also examined the evolutionary processes of protein complexes. The member proteins of complexes tended to have appeared in the same evolutionary period. Our results suggest that the protein interaction network evolved by modules that form functional units. We also reconstructed standardized phylogenetic trees and calculated the evolutionary rates of yeast proteins. There was no obvious correlation between the evolutionary rates and connection degrees of yeast proteins.

Keywords: Protein interaction network, evolution, modularity, evolutionary rate, connection degrees.

Downloads: 1363

194 The Efficiency of Cytochrome Oxidase Subunit 1 Gene (cox1) in Reconstruction of Phylogenetic Relations among Some Crustacean Species

Authors: Yasser M. Saad, Heba El-Sebaie Abd El-Sadek

Abstract:

Some Metapenaeus monoceros cox1 gene fragments were isolated, purified, sequenced, and comparatively analyzed with some other Crustacean Cox1 gene sequences (obtained from National Center for Biotechnology Information). This work was designed for testing the efficiency of this system in reconstruction of phylogenetic relations among some Crustacean species belonging to four genera (Metapenaeus, Artemia, Daphnia and Calanus). The single nucleotide polymorphism and haplotype diversity were calculated for all estimated mt-DNA fragments. The genetic distance values were 0.292, 0.015, 0.151, and 0.09 within Metapenaeus species, Calanus species, Artemia species, and Daphnia species, respectively. The reconstructed phylogenetic tree is clustered into some unique clades. Cytochrome oxidase subunit 1 gene (cox1) was a powerful system in reconstruction of phylogenetic relations among evaluated crustacean species.

Keywords: Crustacean, Genetics, cox1, phylogeny.

Downloads: 1294

193 Phylogenetic Inference from 18S rRNA Gene Sequences of Horseshoe Crabs, Tachypleus gigas between Tanjung Dawai, Kedah and Cherating, Pahang, Peninsular Malaysia

Authors: Ismail, N., Sarijan, S

Abstract:

Phylogenetic analysis using the most conservative portions of the 18S rRNA gene revealed the phylogenetic relationship between the two populations. DNA divergence showed that the nucleotide diversity values were -0.00838 for the Tanjung Dawai, Kedah population and -0.00708 for the Cherating, Pahang population. The net nucleotide divergence among populations (Da) was -0.0073, indicating low polymorphism among the populations studied. The total number of mutations was higher in the Tanjung Dawai, Kedah samples than in the Cherating, Pahang samples (73 and 59, respectively), while 8 mutations were shared across the populations, revealing evolution in the genome of Malaysian T. gigas. The tree topology of both populations, inferred using the Neighbour-joining method by comparing 1791 bp of partial 18S rRNA sequence, revealed that T. gigas haplotypes were clustered into seven clades, suggesting that the populations are genetically diverse and derived from a common ancestor.

Keywords: Horseshoe crabs, Tachypleus gigas, 18S rRNA gene sequences, phylogenetic analysis

Downloads: 1842

192 The Mutated Distance between Two Mixture Trees

Authors: Wan Chian Li, Justie Su-Tzu Juan, Yi-Chun Wang, Shu-Chuan Chen

Abstract:

The evolutionary tree is an important topic in bioinformatics. In 2006, Chen and Lindsay proposed a new method to build the mixture tree from DNA sequences. The mixture tree is a new type of evolutionary tree, and it carries two pieces of information in addition to those of an ordinary evolutionary tree: the time parameter and the set of mutated sites. In 2008, Lin and Juan proposed an algorithm to compute the distance between two mixture trees. Their algorithm computes the distance considering only the time parameter of the two mixture trees. In this paper, we propose a method to measure the similarity of two mixture trees that considers the set of mutated sites, and develop two algorithms to compute the distance between two mixture trees. The time complexities of these two proposed algorithms are O(n^2 × max{h(T1), h(T2)}) and O(n^2), respectively.

Keywords: evolutionary tree, mixture tree, mutated site, distance.

Downloads: 1416

191 Systematics of Water Lilies (Genus Nymphaea L.) Using 18S rDNA Sequences

Authors: M. Nakkuntod, S. Srinarang, K.W. Hilu

Abstract:

Water lily (Nymphaea L.) is the largest genus of Nymphaeaceae. This family is composed of six genera (Nuphar, Ondinea, Euryale, Victoria, Barclaya, Nymphaea). Its members are distributed nearly worldwide in tropical and temperate regions. The classification of some species in Nymphaea is ambiguous due to high variation in leaf and flower parts such as the leaf margin and stamen appendage. Therefore, phylogenetic relationships based on 18S rDNA were constructed to delimit this genus. DNA from 52 specimens belonging to the water lily family was extracted using a modified conventional method containing cetyltrimethyl ammonium bromide (CTAB). The results showed that the amplified fragment is about 1600 base pairs in size. After analysis, the aligned sequences presented 9.36% variable characters, comprising 2.66% parsimony-informative sites and 6.70% singleton sites. Moreover, there are 6 regions of 1-2 base(s) of insertion/deletion. The phylogenetic trees based on maximum parsimony and maximum likelihood, with high bootstrap support, indicated that the genus Nymphaea was a paraphyletic group because of the disruption by Ondinea, Victoria and Euryale. Within the genus Nymphaea, subgenus Nymphaea is a basal lineage group which cooperated with Euryale and Victoria. The other four subgenera, namely Lotos, Hydrocallis, Brachyceras and Anecphya, were included in the same large clade, in which Ondinea was placed within the Anecphya clade due to geographical sharing.

Keywords: nrDNA, phylogeny, taxonomy, Waterlily.

Downloads: 1129

190 Generating Concept Trees from Dynamic Self-organizing Map

Authors: Norashikin Ahmad, Damminda Alahakoon

Abstract:

The self-organizing map (SOM) provides both clustering and visualization capabilities in mining data. Dynamic self-organizing maps such as the Growing Self-organizing Map (GSOM) have been developed to overcome the problem of fixed structure in the SOM and to enable better representation of the discovered patterns. However, in mining large datasets or historical data, the hierarchical structure of the data is also useful for viewing the cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of trees from different spread factor values of the GSOM is also investigated and the quality of the trees analyzed. The results show that concept trees can be generated from the GSOM, thus eliminating the need to re-cluster the data from scratch to obtain a hierarchical view of the data under study.

Keywords: dynamic self-organizing map, concept formation, clustering.

Downloads: 1458

189 Independent Spanning Trees on Systems-on-chip Hypercubes Routing

Authors: Eduardo Sant'Ana da Silva, Andre Luiz Pires Guedes, Eduardo Todt

Abstract:

Independent spanning trees (ISTs) provide a number of advantages in data broadcasting. One can cite the use in fault tolerance network protocols for distributed computing and bandwidth. However, the problem of constructing multiple ISTs is considered hard for arbitrary graphs. In this paper we present an efficient algorithm to construct ISTs on hypercubes that requires minimum resources to be performed.

Keywords: Hypercube, Independent Spanning Trees, Networks On Chip, Systems On Chip.

Downloads: 1886

188 An Accurate Method for Phylogeny Tree Reconstruction Based on a Modified Wild Dog Algorithm

Authors: Essam Al Daoud

Abstract:

This study solves a phylogeny problem by using modified wild dog pack optimization. The least squares error is considered as a cost function that needs to be minimized. Therefore, in each iteration, new distance matrices based on the constructed trees are calculated and used to select the alpha dog. To test the suggested algorithm, ten homologous genes are selected and collected from National Center for Biotechnology Information (NCBI) databanks (i.e., 16S, 18S, 28S, Cox 1, ITS1, ITS2, ETS, ATPB, Hsp90, and STN). The data are divided into three categories: 50 taxa, 100 taxa and 500 taxa. The empirical results show that the proposed algorithm is more reliable and accurate than other implemented methods.
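
The least squares cost mentioned above can be made concrete with a small sketch. It assumes the usual unweighted definition, the sum over taxon pairs of the squared difference between the observed distance and the path length in the candidate tree; the paper may weight the terms differently. The edge-list encoding and the function names `path_lengths` and `least_squares_error` are assumptions made for this example, and the optimization itself (the wild dog pack search) is not shown.

```python
from itertools import combinations


def path_lengths(edges, leaves):
    """Path length from every leaf to every node of a weighted tree."""
    adjacency = {}
    for u, v, weight in edges:
        adjacency.setdefault(u, []).append((v, weight))
        adjacency.setdefault(v, []).append((u, weight))

    def distances_from(source):
        dist = {source: 0.0}
        stack = [source]
        while stack:
            node = stack.pop()
            for neighbour, weight in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + weight
                    stack.append(neighbour)
        return dist

    return {leaf: distances_from(leaf) for leaf in leaves}


def least_squares_error(observed, edges):
    """Sum of squared differences between observed and tree-induced distances.

    observed : dict mapping sorted leaf pairs (a, b) with a < b to a distance
    edges    : list of (node, node, branch_length) describing the candidate tree
    """
    leaves = sorted({name for pair in observed for name in pair})
    tree_dist = path_lengths(edges, leaves)
    return sum(
        (observed[(a, b)] - tree_dist[a][b]) ** 2
        for a, b in combinations(leaves, 2)
    )


# Hypothetical three-taxon star tree with internal node "X".
candidate = [("A", "X", 1.0), ("B", "X", 2.0), ("C", "X", 3.0)]
observed = {("A", "B"): 3.0, ("A", "C"): 4.0, ("B", "C"): 6.0}
print(least_squares_error(observed, candidate))
```

A search procedure would evaluate this cost for each candidate tree and keep the one with the smallest value.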

Keywords: Least squares, neighbor joining, phylogenetic tree, wild dog pack.

Downloads: 1392

187 Restoring Trees Damaged by Cyclone Hudhud at Visakhapatnam, India

Authors: Mohan Kotamrazu

Abstract:

Cyclone Hudhud, which battered the city of Visakhapatnam on 12th October 2014, damaged many buildings, public amenities and infrastructure facilities along the Visakha-Bheemili coastal corridor. More than half the green cover of the city was wiped out. The majority of the trees along the coastal corridor suffered complete or partial damage. In order to understand the different ways in which trees incurred damage during the cyclone, a damage assessment study was carried out by the author. The areas covered by this study included two university campuses, several parks and residential colonies which bore the brunt of the cyclone. Post-disaster attempts have been made to restore many of the trees that suffered partial or complete damage from the effects of extreme winds. This paper examines the various ways in which trees incurred damage from Cyclone Hudhud and presents some examples of the restoration efforts carried out by educational institutions, public parks and religious institutions of the city of Visakhapatnam in the aftermath of the devastating cyclone.

Keywords: Defoliation, restoration, salt spray damage, wind throw.

Downloads: 1844

186 Data Mining in Oral Medicine Using Decision Trees

Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson, Göran Falkman

Abstract:

Data mining has been used very frequently to extract hidden information from large databases. This paper suggests the use of decision trees for continuously extracting the clinical reasoning, in the form of medical experts' actions, that is inherent in large numbers of EMRs (Electronic Medical Records). In this way the extracted data could be used to teach students of oral medicine a number of orderly processes for dealing with patients who present with different problems within the practice context over time.

Keywords: Data mining, Oral Medicine, Decision Trees, WEKA.

Downloads: 2500

185 Measuring the Structural Similarity of Web-based Documents: A Novel Approach

Authors: Matthias Dehmer, Frank Emmert Streib, Alexander Mehler, Jürgen Kilian

Abstract:

Most known methods for measuring the structural similarity of document structures are based on, e.g., tag measures, path metrics and tree measures in terms of their DOM-Trees. Other methods measure the similarity in the framework of the well-known vector space model. In contrast to these, we present a new approach to measuring the structural similarity of web-based documents represented by so-called generalized trees, which are more general than DOM-Trees, which represent only directed rooted trees. We design a new similarity measure for graphs representing web-based hypertext structures. Our similarity measure is mainly based on a novel representation of a graph as linear integer strings, whose components represent structural properties of the graph. The similarity of two graphs is then defined as the optimal alignment of the underlying property strings. In this paper we apply the well-known technique of sequence alignment to solve a novel and challenging problem: measuring the structural similarity of generalized trees. More precisely, we first transform our graphs, considered as high-dimensional objects, into linear structures. Then we derive similarity values from the alignments of the property strings in order to measure the structural similarity of generalized trees. Hence, we transform a graph similarity problem into a string similarity problem. We demonstrate that our similarity measure captures important structural information by applying it to two different test sets consisting of graphs representing web-based documents.

Keywords: Graph similarity, hierarchical and directed graphs, hypertext, generalized trees, web structure mining.

Downloads: 2556

184 Ranking and Unranking Algorithms for k-ary Trees in Gray Code Order

Authors: Fateme Ashari-Ghomi, Najme Khorasani, Abbas Nowzari-Dalini

Abstract:

In this paper, we present two new ranking and unranking algorithms for k-ary trees represented by x-sequences in Gray code order. These algorithms are based on a Gray code generation algorithm developed by Ahrabian et al. In that paper, a recursive backtracking generation algorithm for x-sequences corresponding to k-ary trees in Gray code order was presented. This generation algorithm is based on Vajnovszki's algorithm for generating binary trees in Gray code order. To our knowledge, no ranking and unranking algorithms have been given for x-sequences in this ordering. We present ranking and unranking algorithms with O(kn^2) time complexity for x-sequences in this Gray code ordering.

Keywords: k-ary Tree Generation, Ranking, Unranking, Gray Code.

Downloads: 2106

183 The Influence of Forest Management Histories on Dead Wood and Habitat Trees in the Old Growth Forest in Northern Iran

Authors: Kiomars Sefidi

Abstract:

Dead wood and habitat trees, such as fallen logs, snags, stumps, and trees with cracks or loose bark, are regarded as an important ecological component of forests, on whose presence many forest-dwelling species depend. However, their relation to management history in the Caspian forest has gone unreported. The aim of this research was to compare the amounts of dead wood and habitat trees in forests with historically different intensities of management: forests with long-term management (PS) and short-term management (NS) were compared with semi-virgin forest (GS). A total of 405 individual dead and habitat trees were recorded and measured at 109 sampling locations. ANOVA revealed that the volume of dead trees across form and decay classes differed significantly within sites, and that dead wood volume in the semi-virgin forest was significantly higher than in the managed sites. Comparing the amounts of dead and habitat trees in the three sites showed that dead tree volume was related to management history and differed significantly among the three study sites. The frequency of habitat trees was also significantly different within sites. The highest number of habitat trees, including trees with cavities, cracks and loose bark, and fork split trees, was recorded in the virgin site and the lowest in the sites with long-term management. It can be concluded that forest management reduces the amount of dead and habitat trees, especially large ones; managing this forest according to ecologically sustainable principles therefore requires a commitment to maintaining a stand structure that allows the continued generation of dead trees in a full range of sizes.

Keywords: Cracks trees, forest biodiversity, fork split trees, nature conservation, sustainable management.

Downloads: 1720

182 A Novel Methodology for Synthesis of Fault Trees from MATLAB-Simulink Model

Authors: F. Tajarrod, G. Latif-Shabgahi

Abstract:

Fault tree analysis is a well-known method for reliability and safety assessment of engineering systems. In the last three decades, a number of methods have been introduced in the literature for the automatic construction of fault trees. The main difference between these methods is the starting model from which the tree is constructed. This paper presents a new methodology for the construction of static and dynamic fault trees from a system Simulink model. The method is introduced and explained in detail, and its correctness and completeness are experimentally validated using an example taken from the literature. Advantages of the method are also mentioned.

Keywords: Fault tree, Simulink, Standby Sparing and Redundancy

Downloads: 3000

181 The Game of Col on Complete K-ary Trees

Authors: Alessandro Cincotti, Timothee Bossart

Abstract:

Col is a classic combinatorial game played on graphs, and solving a general instance is a PSPACE-complete problem. However, winning strategies can be found for some specific graph instances. In this paper, the solution of Col on complete k-ary trees is presented.

Keywords: Combinatorial game, Complete k-ary tree, Map-coloring game.

Downloads: 1187

180 Learning and Evaluating Possibilistic Decision Trees using Information Affinity

Authors: Ilyes Jenhani, Salem Benferhat, Zied Elouedi

Abstract:

This paper investigates the issue of building decision trees from data with imprecise class values, where imprecision is encoded in the form of possibility distributions. The Information Affinity similarity measure is introduced into the well-known gain ratio criterion in order to assess the homogeneity of a set of possibility distributions representing instances' classes belonging to a given training partition. For the experimental study, we propose an information affinity based performance criterion, which we use to show the performance of the approach on well-known benchmarks.
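
For readers unfamiliar with the gain ratio criterion that the paper extends, the sketch below shows the classic C4.5 form for ordinary (precise) class labels. It does not implement the possibilistic variant or the Information Affinity measure described in the abstract; the function names `entropy` and `gain_ratio` are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Classic gain ratio of one categorical attribute with respect to the labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


attribute = ["a", "a", "b", "b"]
classes = ["yes", "yes", "no", "no"]
print(gain_ratio(attribute, classes))
```

At each node, a tree learner of this kind would compute the ratio for every candidate attribute and split on the one with the highest value.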

Keywords: Data mining from uncertain data, Decision Trees, Possibility Theory.

Downloads: 1514

179 On Reversal and Transposition Medians

Authors: Martin Bader

Abstract:

During the last years, the genomes of more and more species have been sequenced, providing data for phylogenetic reconstruction based on genome rearrangement measures. A main task in all phylogenetic reconstruction algorithms is to solve the median of three problem. Although this problem is NP-hard even for the simplest distance measures, there are exact algorithms for the breakpoint median and the reversal median that are fast enough for practical use. In this paper, this approach is extended to the transposition median as well as to the weighted reversal and transposition median. Although there is no exact polynomial algorithm known even for the pairwise distances, we will show that it is in most cases possible to solve these problems exactly within reasonable time by using a branch and bound algorithm.

Keywords: Comparative genomics, genome rearrangements, median, reversals, transpositions.

Downloads: 1687

178 Spatial Data Mining by Decision Trees

Authors: S. Oujdi, H. Belbachir

Abstract:

Existing data mining methods cannot be applied directly to spatial data, because such data require spatial specificities, such as spatial relationships, to be taken into account. This paper focuses on classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches: join materialization and querying the different tables on the fly. Similar works have been done on these two main approaches: the first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, favors memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial join index that contains the possible spatial relationships between the objects in the target table and those in the neighbor table. The proposed algorithms are applied to a spatial data set in the accidentology domain. A comparative study of our approach with other works on classification by spatial decision trees will be detailed.

Keywords: C4.5 Algorithm, Decision trees, S-CART, Spatial data mining.

Downloads: 2986

177 Calcification Classification in Mammograms Using Decision Trees

Authors: S. Usha, S. Arumugam

Abstract:

Cancer affects people globally with breast cancer being a leading killer. Breast cancer is due to the uncontrollable multiplication of cells resulting in a tumour or neoplasm. Tumours are called ‘benign’ when cancerous cells do not ravage other body tissues and ‘malignant’ if they do so. As mammography is an effective breast cancer detection tool at an early stage which is the most treatable stage it is the primary imaging modality for screening and diagnosis of this cancer type. This paper presents an automatic mammogram classification technique using wavelet and Gabor filter. Correlation feature selection is used to reduce the feature set and selected features are classified using different decision trees.

Keywords: Breast Cancer, Mammogram, Symlet Wavelets, Gabor Filters, Decision Trees

Downloads: 1751

176 Angular-Coordinate Driven Radial Tree Drawing

Authors: Farshad Ghassemi Toosi, Nikola S. Nikolov

Abstract:

We present a visualization technique for radial drawing of trees consisting of two slightly different algorithms. Both of them make use of node-link diagrams for visual encoding. This visualization creates clear drawings without edge crossing. One of the algorithms is suitable for real-time visualization of large trees, as it requires minimal recalculation of the layout if leaves are inserted or removed from the tree; while the other algorithm makes better utilization of the drawing space. The algorithms are very similar and follow almost the same procedure but with different parameters. Both algorithms assign angular coordinates for all nodes which are then converted into 2D Cartesian coordinates for visualization. We present both algorithms and discuss how they compare to each other.
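
The last step described above, assigning an angular coordinate to every node and converting it to 2D Cartesian coordinates, can be sketched as follows. This is a generic wedge-based radial layout, not either of the two algorithms from the paper: the rule that each subtree receives an angular wedge proportional to its leaf count, the use of depth as radius, and the function name `radial_layout` are all assumptions made for this example.

```python
import math


def radial_layout(children, root):
    """Place tree nodes on concentric circles.

    children : dict mapping a node to the list of its child nodes
    root     : the root node, placed at the origin
    Returns a dict mapping each node to its (x, y) position.
    """
    def count_leaves(node):
        kids = children.get(node, [])
        return 1 if not kids else sum(count_leaves(kid) for kid in kids)

    total = count_leaves(root)
    angle, depth = {}, {}

    def assign(node, start, level):
        # Each subtree owns a wedge proportional to its number of leaves.
        width = 2 * math.pi * count_leaves(node) / total
        angle[node] = start + width / 2
        depth[node] = level
        for kid in children.get(node, []):
            assign(kid, start, level + 1)
            start += 2 * math.pi * count_leaves(kid) / total

    assign(root, 0.0, 0)
    # Polar (radius = depth, angle) to Cartesian conversion.
    return {
        node: (depth[node] * math.cos(angle[node]),
               depth[node] * math.sin(angle[node]))
        for node in angle
    }


layout = radial_layout({"r": ["a", "b"]}, "r")
for name, (x, y) in sorted(layout.items()):
    print(name, round(x, 3), round(y, 3))
```

In this sketch, adding or removing a leaf changes the wedge widths of every subtree, so the whole layout is recomputed; the paper's real-time variant is described as needing only minimal recalculation.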

Keywords: Radial Tree Drawing, Real-Time Visualization, Angular Coordinates, Large Trees.

Downloads 2601

175 Error-Robust Nature of Genome Profiling Applied for Clustering of Species Demonstrated by Computer Simulation

Authors: Shamim Ahmed, Koichi Nishigaki

Abstract:

Genome profiling (GP), a genotype-based technology that exploits random PCR and temperature gradient gel electrophoresis, has been successful in the identification/classification of organisms. In this technology, spiddos (species identification dots) and PaSS (pattern similarity score) are employed to measure the closeness (or distance) between genomes. Based on this closeness (PaSS), we can build phylogenetic trees of the organisms. We noticed that the topology of the tree is rather robust against the experimental fluctuation conveyed by spiddos. This study confirmed that observation quantitatively by computer simulation, establishing the limit of reliability of this powerful methodology. As a result, we could demonstrate the effectiveness of the GP approach for the identification/classification of organisms.
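The abstract gives no formula for PaSS or details of the simulation, so the sketch below rests on stated assumptions: spiddos are treated as normalized 2D points, PaSS is taken in the form 1 - (1/n) Σ |P_i - P'_i| / (|P_i| + |P'_i|), and experimental fluctuation is modelled as Gaussian noise. The names `pass_score` and `perturb`, the noise level, and the example coordinates are hypothetical.

```python
import math
import random

def pass_score(spiddos_a, spiddos_b):
    """Assumed PaSS form: 1 - mean(|P - P'| / (|P| + |P'|)) over paired spiddos."""
    if len(spiddos_a) != len(spiddos_b) or not spiddos_a:
        raise ValueError("need two non-empty spiddos lists of equal length")
    total = 0.0
    for (xa, ya), (xb, yb) in zip(spiddos_a, spiddos_b):
        denom = math.hypot(xa, ya) + math.hypot(xb, yb)
        if denom > 0:  # skip a pair only if both points are at the origin
            total += math.hypot(xa - xb, ya - yb) / denom
    return 1.0 - total / len(spiddos_a)

def perturb(spiddos, sigma, rng):
    """Add Gaussian noise to each coordinate (an assumed fluctuation model)."""
    return [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in spiddos]

# Hypothetical spiddos for one genome, in normalized coordinates.
rng = random.Random(0)
reference = [(0.2, 0.5), (0.4, 0.6), (0.7, 0.3)]
noisy = perturb(reference, sigma=0.01, rng=rng)

# Closeness between the clean and the perturbed profile; 1 - PaSS can serve
# as a distance when building a tree from several genomes.
score = pass_score(reference, noisy)
distance = 1.0 - score
```

Repeating the perturbation many times, rebuilding the tree from the resulting distances, and counting how often the topology changes would be one way to probe the robustness the abstract reports.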

Keywords: Fluctuation, Genome profiling (GP), Pattern similarity score (PaSS), Robustness, Spiddos-shift.

Downloads 1538

174 Comparative Study of Decision Trees and Rough Sets Theory as Knowledge Extraction Tools for Design and Control of Industrial Processes

Authors: Marcin Perzyk, Artur Soroczynski

Abstract:

General requirements for knowledge representation in the form of logic rules, applicable to design and control of industrial processes, are formulated. Characteristic behavior of decision trees (DTs) and rough sets theory (RST) in rule extraction from recorded data is discussed and illustrated with simple examples. The significance of the models' drawbacks was evaluated using simulated and industrial data sets. It is concluded that the performance of DTs may be considerably poorer than that of RST in several important aspects, particularly when not only a characterization of a problem is required but also detailed and precise rules are needed for the specific problems to be solved.

Keywords: Knowledge extraction, decision trees, rough sets theory, industrial processes.

Downloads 1632

173 Structure Based Computational Analysis and Molecular Phylogeny of C-Phycocyanin Gene from the Selected Cyanobacteria

Authors: N. Reehana, A. Parveez Ahamed, D. Mubarak Ali, A. Suresh, R. Arvind Kumar, N. Thajuddin

Abstract:

Cyanobacteria play a vital role in the production of phycobiliproteins, which include the pigments phycocyanin and phycoerythrin. Phycocyanin and related phycobiliproteins have a wide variety of applications in the food, biotechnology and cosmetic industries because of their color, fluorescence and antioxidant properties. The present study aims to understand the pigment at the molecular level in the cyanobacteria Oscillatoria terebriformis NTRI05 and Oscillatoria foreaui NTRI06. After extraction of genomic DNA, the C-Phycocyanin gene was amplified with the primers PCβF and PCαR and then sequenced. Structural and phylogenetic analyses were performed using the sequence to develop a molecular model.

Keywords: Cyanobacteria, C-Phycocyanin gene, Phylogenetic analysis, Structural analysis.

Downloads 3060