Search results for: decision tree
1676 Semantic Spatial Objects Data Structure for Spatial Access Method
Authors: Kalum Priyanath Udagepola, Zuo Decheng, Wu Zhibo, Yang Xiaozong
Abstract:
Modern spatial database management systems require a unique Spatial Access Method (SAM) in order solve complex spatial quires efficiently. In this case the spatial data structure takes a prominent place in the SAM. Inadequate data structure leads forming poor algorithmic choices and forging deficient understandings of algorithm behavior on the spatial database. A key step in developing a better semantic spatial object data structure is to quantify the performance effects of semantic and outlier detections that are not reflected in the previous tree structures (R-Tree and its variants). This paper explores a novel SSRO-Tree on SAM to the Topo-Semantic approach. The paper shows how to identify and handle the semantic spatial objects with outlier objects during page overflow/underflow, using gain/loss metrics. We introduce a new SSRO-Tree algorithm which facilitates the achievement of better performance in practice over algorithms that are superior in the R*-Tree and RO-Tree by considering selection queries.
Keywords: Outlier, semantic spatial object, spatial objects, SSRO-Tree, topo-semantic.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16941675 Accelerating GLA with an M-Tree
Authors: Olli Luoma, Johannes Tuikkala, Olli Nevalainen
Abstract:
In this paper, we propose a novel improvement for the generalized Lloyd Algorithm (GLA). Our algorithm makes use of an M-tree index built on the codebook which makes it possible to reduce the number of distance computations when the nearest code words are searched. Our method does not impose the use of any specific distance function, but works with any metric distance, making it more general than many other fast GLA variants. Finally, we present the positive results of our performance experiments.Keywords: Clustering, GLA, M-Tree, Vector Quantization .
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15231674 Patient-Specific Modeling Algorithm for Medical Data Based on AUC
Authors: Guilherme Ribeiro, Alexandre Oliveira, Antonio Ferreira, Shyam Visweswaran, Gregory Cooper
Abstract:
Patient-specific models are instance-based learning algorithms that take advantage of the particular features of the patient case at hand to predict an outcome. We introduce two patient-specific algorithms based on decision tree paradigm that use AUC as a metric to select an attribute. We apply the patient specific algorithms to predict outcomes in several datasets, including medical datasets. Compared to the patient-specific decision path (PSDP) entropy-based and CART methods, the AUC-based patient-specific decision path models performed equivalently on area under the ROC curve (AUC). Our results provide support for patient-specific methods being a promising approach for making clinical predictions.Keywords: Approach instance-based, area Under the ROC Curve, Patient-specific Decision Path, clinical predictions.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15801673 The Optimization of an Intelligent Traffic Congestion Level Classification from Motorists- Judgments on Vehicle's Moving Patterns
Authors: Thammasak Thianniwet, Satidchoke Phosaard, Wasan Pattara-Atikom
Abstract:
We proposed a technique to identify road traffic congestion levels from velocity of mobile sensors with high accuracy and consistent with motorists- judgments. The data collection utilized a GPS device, a webcam, and an opinion survey. Human perceptions were used to rate the traffic congestion levels into three levels: light, heavy, and jam. Then the ratings and velocity were fed into a decision tree learning model (J48). We successfully extracted vehicle movement patterns to feed into the learning model using a sliding windows technique. The parameters capturing the vehicle moving patterns and the windows size were heuristically optimized. The model achieved accuracy as high as 99.68%. By implementing the model on the existing traffic report systems, the reports will cover comprehensive areas. The proposed method can be applied to any parts of the world.Keywords: intelligent transportation system (ITS), traffic congestion level, human judgment, decision tree (J48), geographic positioning system (GPS).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18211672 Remarks on Some Properties of Decision Rules
Authors: Songlin Yang, Ying Ge
Abstract:
This paper shows that some properties of the decision rules in the literature do not hold by presenting a counterexample. We give sufficient and necessary conditions under which these properties are valid. These results will be helpful when one tries to choose the right decision rules in the research of rough set theory.Keywords: set, Decision table, Decision rule, coverage factor.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14121671 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach
Authors: Rajvir Kaur, Jeewani Anupama Ginige
Abstract:
With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.Keywords: Artificial neural networks, breast cancer, cancer dataset, classifiers, cervical cancer, F-score, logistic regression, machine learning, precision, recall, support vector machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15531670 A Straightforward Approach for Determining the Weights of Decision Makers Based on Angle Cosine and Projection Method
Authors: Qiang Yang, Ping-An Du
Abstract:
Group decision making with multiple attribute has attracted intensive concern in the decision analysis area. This paper assumes that the contributions of all the decision makers (DMs) are not equal to the decision process based on different knowledge and experience in group setting. The aim of this paper is to develop a novel approach to determine weights of DMs in the group decision making problems. In this paper, the weights of DMs are determined in the group decision environment via angle cosine and projection method. First of all, the average decision of all individual decisions is defined as the ideal decision. After that, we define the weight of each decision maker (DM) by aggregating the angle cosine and projection between individual decision and ideal decision with associated direction indicator μ. By using the weights of DMs, all individual decisions are aggregated into a collective decision. Further, the preference order of alternatives is ranked in accordance with the overall row value of collective decision. Finally, an example in a chemical company is provided to illustrate the developed approach.Keywords: Angel cosine, ideal decision, projection method, weights of decision makers.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18591669 Orchestra/Percussion Classification Algorithm for United Speech Audio Coding System
Authors: Yueming Wang, Rendong Ying, Sumxin Jiang, Peilin Liu
Abstract:
Unified Speech Audio Coding (USAC), the latest MPEG standardization for unified speech and audio coding, uses a speech/audio classification algorithm to distinguish speech and audio segments of the input signal. The quality of the recovered audio can be increased by well-designed orchestra/percussion classification and subsequent processing. However, owing to the shortcoming of the system, introducing an orchestra/percussion classification and modifying subsequent processing can enormously increase the quality of the recovered audio. This paper proposes an orchestra/percussion classification algorithm for the USAC system which only extracts 3 scales of Mel-Frequency Cepstral Coefficients (MFCCs) rather than traditional 13 scales of MFCCs and use Iterative Dichotomiser 3 (ID3) Decision Tree rather than other complex learning method, thus the proposed algorithm has lower computing complexity than most existing algorithms. Considering that frequent changing of attributes may lead to quality loss of the recovered audio signal, this paper also design a modified subsequent process to help the whole classification system reach an accurate rate as high as 97% which is comparable to classical 99%.
Keywords: ID3 Decision Tree, MFCC, Orchestra/Percussion Classification, USAC
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16731668 Decision Support Framework in Managerial Learning Environment for Organization
Authors: M. Mazhar Manzoor, Nasar.A, A. Sattar
Abstract:
In the open space of decision support system the mental impression of a manager-s decision has been the subject of large importance than the ordinary famous one, when helped by decision support system. Much of this study is an attempt to realize the relation of decision support system usage and decision outcomes that governs the system. For example, several researchers have proposed so many different models to analyze the linkage between decision support system processes and results of decision making. This study draws the important relation of manager-s mental approach with the use of decision support system. The findings of this paper are theoretical attempts to provide Decision Support System (DSS) in a way to exhibit and promote the learning in semi structured area. The proposed model shows the points of one-s learning improvements and maintains a theoretical approach in order to explore the DSS contribution in enhancing the decision forming and governing the system.Keywords: Decision Support System , Learning Organization,
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14611667 Development of the Academic Model to Predict Student Success at VUT-FSASEC Using Decision Trees
Authors: Langa Hendrick Musawenkosi, Twala Bhekisipho
Abstract:
The success or failure of students is a concern for every academic institution, college, university, governments and students themselves. Several approaches have been researched to address this concern. In this paper, a view is held that when a student enters a university or college or an academic institution, he or she enters an academic environment. The academic environment is unique concept used to develop the solution for making predictions effectively. This paper presents a model to determine the propensity of a student to succeed or fail in the French South African Schneider Electric Education Center (FSASEC) at the Vaal University of Technology (VUT). The Decision Tree algorithm is used to implement the model at FSASEC.
Keywords: Academic environment model, decision trees, FSASEC, K-nearest neighbor, machine learning, popularity index, support vector machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11371666 Oil Palm Empty Fruit Bunch as a New Organic Filler for Electrical Tree Inhibition
Authors: M. H. Ahmad, A. A. A. Jamil, H. Ahmad, M. A. M. Piah, A. Darus, Y. Z. Arief, N. Bashir
Abstract:
The use of synthetic retardants in polymeric insulated cables is not uncommon in the high voltage engineering to study electrical treeing phenomenon. However few studies on organic materials for the same investigation have been carried. .This paper describes the study on the effects of Oil Palm Empty Fruit Bunch (OPEFB) microfiller on the tree initiation and propagation in silicone rubber with different weight percentages (wt %) of filler to insulation bulk material. The weight percentages used were 0 wt % and 1 wt % respectively. It was found that the OPEFB retards the propagation of the electrical treeing development. For tree inception study, the addition of 1(wt %) OPEFB has increase the tree inception voltage of silicone rubber. So, OPEFB is a potential retardant to the initiation and growth of electrical treeing occurring in polymeric materials for high voltage application. However more studies on the effects of physical and electrical properties of OPEFB as a tree retardant material are required.Keywords: Oil palm empty fruit bunch, electrical tree, siliconerubber, fillers.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23631665 Decision Trees for Predicting Risk of Mortality using Routinely Collected Data
Authors: Tessy Badriyah, Jim S. Briggs, Dave R. Prytherch
Abstract:
It is well known that Logistic Regression is the gold standard method for predicting clinical outcome, especially predicting risk of mortality. In this paper, the Decision Tree method has been proposed to solve specific problems that commonly use Logistic Regression as a solution. The Biochemistry and Haematology Outcome Model (BHOM) dataset obtained from Portsmouth NHS Hospital from 1 January to 31 December 2001 was divided into four subsets. One subset of training data was used to generate a model, and the model obtained was then applied to three testing datasets. The performance of each model from both methods was then compared using calibration (the χ2 test or chi-test) and discrimination (area under ROC curve or c-index). The experiment presented that both methods have reasonable results in the case of the c-index. However, in some cases the calibration value (χ2) obtained quite a high result. After conducting experiments and investigating the advantages and disadvantages of each method, we can conclude that Decision Trees can be seen as a worthy alternative to Logistic Regression in the area of Data Mining.Keywords: Decision Trees, Logistic Regression, clinical outcome, risk of mortality.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25231664 Choosing R-tree or Quadtree Spatial DataIndexing in One Oracle Spatial Database System to Make Faster Showing Geographical Map in Mobile Geographical Information System Technology
Authors: Maruto Masserie Sardadi, Mohd Shafry bin Mohd Rahim, Zahabidin Jupri, Daut bin Daman
Abstract:
The latest Geographic Information System (GIS) technology makes it possible to administer the spatial components of daily “business object," in the corporate database, and apply suitable geographic analysis efficiently in a desktop-focused application. We can use wireless internet technology for transfer process in spatial data from server to client or vice versa. However, the problem in wireless Internet is system bottlenecks that can make the process of transferring data not efficient. The reason is large amount of spatial data. Optimization in the process of transferring and retrieving data, however, is an essential issue that must be considered. Appropriate decision to choose between R-tree and Quadtree spatial data indexing method can optimize the process. With the rapid proliferation of these databases in the past decade, extensive research has been conducted on the design of efficient data structures to enable fast spatial searching. Commercial database vendors like Oracle have also started implementing these spatial indexing to cater to the large and diverse GIS. This paper focuses on the decisions to choose R-tree and quadtree spatial indexing using Oracle spatial database in mobile GIS application. From our research condition, the result of using Quadtree and R-tree spatial data indexing method in one single spatial database can save the time until 42.5%.Keywords: Indexing, Mobile GIS, MapViewer, Oracle SpatialDatabase.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 40341663 Evaluation of Hazardous Status of Avenue Trees in University of Port Harcourt
Authors: F. S. Eguakun, T. C. Nkwor
Abstract:
Trees in the university environment are uniquely position; however, they can also present a millstone to the infrastructure and humans they coexist with. The numerous benefits of trees can be negated due to poor tree health and anthropogenic activities and as such can become hazardous. The study aims at evaluating the hazardous status of avenue trees in University of Port Harcourt. Data were collected from all the avenue trees within the selected major roads in the University. Tree growth variables were measured and health condition of the avenue trees were assessed as an indicator of some structural defects. The hazard status of the avenue trees was determined. Several tree species were used as avenue trees in the University however, Azadirachta indica (81%) was found to be most abundant. The result shows that only 0.3% avenue tree species was found to pose severe harzard in Abuja part of the University. Most avenue trees (55.2%) were rated as medium hazard status. Due to the danger and risk associated with hazardous trees, the study recommends that good and effective management strategies be implemented so as to prevent future damages from trees with small or medium hazard status.
Keywords: Avenue tree, hazard status, inventory, urban.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7161662 A Comparison between Reagents Extracted from Tree Leaves for Spectrophotometric Determination of Hafnium(IV)
Authors: A. Boveiri Monji, H. Yousefnia, S. Zolghadri, B. Salimi
Abstract:
The main goal of this paper was to make use of green reagents as a substitute of perilous synthetic reagents and organic solvents for spectrophotometric determination of hafnium(IV). The extracts taken from six different kinds of tree leaves including Acer negundo, Ficus carica, Cerasus avium, Chimonanthus, Salix babylonica and Pinus brutia, were applied as green reagents for the experiments. In 6-M hydrochloric acid, hafnium reacted with the reagent to form a yellow product and showed maximum absorbance at 421 nm. Among tree leaves, Chimonanthus showed satisfactory results with a molar absorptivity value of 0.61 × 104 l mol-1 cm-1 and the method was linear in the 0.3-9 µg mL -1 concentration range. The detection limit value was 0.064 µg mL-1. The proposed method was simple, low cost, clean, and selective.
Keywords: Spectrophotometric determination, tree leaves, synthetic reagents, hafnium.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7891661 Heritage Tree Expert Assessment and Classification: Malaysian Perspective
Authors: B.-Y.-S. Lau, Y.-C.-T. Jonathan, M.-S. Alias
Abstract:
Heritage trees are natural large, individual trees with exceptionally value due to association with age or event or distinguished people. In Malaysia, there is an abundance of tropical heritage trees throughout the country. It is essential to set up a repository of heritage trees to prevent valuable trees from being cut down. In this cross domain study, a web-based online expert system namely the Heritage Tree Expert Assessment and Classification (HTEAC) is developed and deployed for public to nominate potential heritage trees. Based on the nomination, tree care experts or arborists would evaluate and verify the nominated trees as heritage trees. The expert system automatically rates the approved heritage trees according to pre-defined grades via Delphi technique. Features and usability test of the expert system are presented. Preliminary result is promising for the system to be used as a full scale public system.Keywords: Arboriculture, Delphi, expert system, heritage tree, urban forestry.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14301660 Fuzzy Shortest Paths Approximation for Solving the Fuzzy Steiner Tree Problem in Graphs
Authors: Miloš Šeda
Abstract:
In this paper, we deal with the Steiner tree problem (STP) on a graph in which a fuzzy number, instead of a real number, is assigned to each edge. We propose a modification of the shortest paths approximation based on the fuzzy shortest paths (FSP) evaluations. Since a fuzzy min operation using the extension principle leads to nondominated solutions, we propose another approach to solving the FSP using Cheng's centroid point fuzzy ranking method.Keywords: Steiner tree, single shortest path problem, fuzzyranking, binary heap, priority queue.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16941659 Using Spectral Vectors and M-Tree for Graph Clustering and Searching in Graph Databases of Protein Structures
Authors: Do Phuc, Nguyen Thi Kim Phung
Abstract:
In this paper, we represent protein structure by using graph. A protein structure database will become a graph database. Each graph is represented by a spectral vector. We use Jacobi rotation algorithm to calculate the eigenvalues of the normalized Laplacian representation of adjacency matrix of graph. To measure the similarity between two graphs, we calculate the Euclidean distance between two graph spectral vectors. To cluster the graphs, we use M-tree with the Euclidean distance to cluster spectral vectors. Besides, M-tree can be used for graph searching in graph database. Our proposal method was tested with graph database of 100 graphs representing 100 protein structures downloaded from Protein Data Bank (PDB) and we compare the result with the SCOP hierarchical structure.Keywords: Eigenvalues, m-tree, graph database, protein structure, spectra graph theory.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16551658 Revised PLWAP Tree with Non-frequent Items for Mining Sequential Pattern
Authors: R. Vishnu Priya, A. Vadivel
Abstract:
Sequential pattern mining is a challenging task in data mining area with large applications. One among those applications is mining patterns from weblog. Recent times, weblog is highly dynamic and some of them may become absolute over time. In addition, users may frequently change the threshold value during the data mining process until acquiring required output or mining interesting rules. Some of the recently proposed algorithms for mining weblog, build the tree with two scans and always consume large time and space. In this paper, we build Revised PLWAP with Non-frequent Items (RePLNI-tree) with single scan for all items. While mining sequential patterns, the links related to the nonfrequent items are not considered. Hence, it is not required to delete or maintain the information of nodes while revising the tree for mining updated transactions. The algorithm supports both incremental and interactive mining. It is not required to re-compute the patterns each time, while weblog is updated or minimum support changed. The performance of the proposed tree is better, even the size of incremental database is more than 50% of existing one. For evaluation purpose, we have used the benchmark weblog dataset and found that the performance of proposed tree is encouraging compared to some of the recently proposed approaches.
Keywords: Sequential pattern mining, weblog, frequent and non-frequent items, incremental and interactive mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19301657 A Novel Methodology for Synthesis of Fault Trees from MATLAB-Simulink Model
Authors: F. Tajarrod, G. Latif-Shabgahi
Abstract:
Fault tree analysis is a well-known method for reliability and safety assessment of engineering systems. In the last 3 decades, a number of methods have been introduced, in the literature, for automatic construction of fault trees. The main difference between these methods is the starting model from which the tree is constructed. This paper presents a new methodology for the construction of static and dynamic fault trees from a system Simulink model. The method is introduced and explained in detail, and its correctness and completeness is experimentally validated by using an example, taken from literature. Advantages of the method are also mentioned.Keywords: Fault tree, Simulink, Standby Sparing and Redundancy
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30001656 Game-Tree Simplification by Pattern Matching and Its Acceleration Approach using an FPGA
Authors: Suguru Ochiai, Toru Yabuki, Yoshiki Yamaguchi, Yuetsu Kodama
Abstract:
In this paper, we propose a Connect6 solver which adopts a hybrid approach based on a tree-search algorithm and image processing techniques. The solver must deal with the complicated computation and provide high performance in order to make real-time decisions. The proposed approach enables the solver to be implemented on a single Spartan-6 XC6SLX45 FPGA produced by XILINX without using any external devices. The compact implementation is achieved through image processing techniques to optimize a tree-search algorithm of the Connect6 game. The tree search is widely used in computer games and the optimal search brings the best move in every turn of a computer game. Thus, many tree-search algorithms such as Minimax algorithm and artificial intelligence approaches have been widely proposed in this field. However, there is one fundamental problem in this area; the computation time increases rapidly in response to the growth of the game tree. It means the larger the game tree is, the bigger the circuit size is because of their highly parallel computation characteristics. Here, this paper aims to reduce the size of a Connect6 game tree using image processing techniques and its position symmetric property. The proposed solver is composed of four computational modules: a two-dimensional checkmate strategy checker, a template matching module, a skilful-line predictor, and a next-move selector. These modules work well together in selecting next moves from some candidates and the total amount of their circuits is small. The details of the hardware design for an FPGA implementation are described and the performance of this design is also shown in this paper.Keywords: Connect6, pattern matching, game-tree reduction, hardware direct computation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19731655 Pattern Matching Based on Regular Tree Grammars
Authors: Riad S. Jabri
Abstract:
Pattern matching based on regular tree grammars have been widely used in many areas of computer science. In this paper, we propose a pattern matcher within the framework of code generation, based on a generic and a formalized approach. According to this approach, parsers for regular tree grammars are adapted to a general pattern matching solution, rather than adapting the pattern matching according to their parsing behavior. Hence, we first formalize the construction of the pattern matches respective to input trees drawn from a regular tree grammar in a form of the so-called match trees. Then, we adopt a recently developed generic parser and tightly couple its parsing behavior with such construction. In addition to its generality, the resulting pattern matcher is characterized by its soundness and efficient implementation. This is demonstrated by the proposed theory and by the derived algorithms for its implementation. A comparison with similar and well-known approaches, such as the ones based on tree automata and LR parsers, has shown that our pattern matcher can be applied to a broader class of grammars, and achieves better approximation of pattern matches in one pass. Furthermore, its use as a machine code selector is characterized by a minimized overhead, due to the balanced distribution of the cost computations into static ones, during parser generation time, and into dynamic ones, during parsing time.
Keywords: Bottom-up automata, Code selection, Pattern matching, Regular tree grammars, Match trees.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12681654 Hybrid Approach for Software Defect Prediction Using Machine Learning with Optimization Technique
Authors: C. Manjula, Lilly Florence
Abstract:
Software technology is developing rapidly which leads to the growth of various industries. Now-a-days, software-based applications have been adopted widely for business purposes. For any software industry, development of reliable software is becoming a challenging task because a faulty software module may be harmful for the growth of industry and business. Hence there is a need to develop techniques which can be used for early prediction of software defects. Due to complexities in manual prediction, automated software defect prediction techniques have been introduced. These techniques are based on the pattern learning from the previous software versions and finding the defects in the current version. These techniques have attracted researchers due to their significant impact on industrial growth by identifying the bugs in software. Based on this, several researches have been carried out but achieving desirable defect prediction performance is still a challenging task. To address this issue, here we present a machine learning based hybrid technique for software defect prediction. First of all, Genetic Algorithm (GA) is presented where an improved fitness function is used for better optimization of features in data sets. Later, these features are processed through Decision Tree (DT) classification model. Finally, an experimental study is presented where results from the proposed GA-DT based hybrid approach is compared with those from the DT classification technique. The results show that the proposed hybrid approach achieves better classification accuracy.
Keywords: Decision tree, genetic algorithm, machine learning, software defect prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14651653 Computing Entropy for Ortholog Detection
Authors: Hsing-Kuo Pao, John Case
Abstract:
Biological sequences from different species are called or-thologs if they evolved from a sequence of a common ancestor species and they have the same biological function. Approximations of Kolmogorov complexity or entropy of biological sequences are already well known to be useful in extracting similarity information between such sequences -in the interest, for example, of ortholog detection. As is well known, the exact Kolmogorov complexity is not algorithmically computable. In prac-tice one can approximate it by computable compression methods. How-ever, such compression methods do not provide a good approximation to Kolmogorov complexity for short sequences. Herein is suggested a new ap-proach to overcome the problem that compression approximations may notwork well on short sequences. This approach is inspired by new, conditional computations of Kolmogorov entropy. A main contribution of the empir-ical work described shows the new set of entropy-based machine learning attributes provides good separation between positive (ortholog) and nega-tive (non-ortholog) data - better than with good, previously known alter-natives (which do not employ some means to handle short sequences well).Also empirically compared are the new entropy based attribute set and a number of other, more standard similarity attributes sets commonly used in genomic analysis. The various similarity attributes are evaluated by cross validation, through boosted decision tree induction C5.0, and by Receiver Operating Characteristic (ROC) analysis. The results point to the conclu-sion: the new, entropy based attribute set by itself is not the one giving the best prediction; however, it is the best attribute set for use in improving the other, standard attribute sets when conjoined with them.
Keywords: compression, decision tree, entropy, ortholog, ROC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18261652 Tree Sign Patterns of Small Order that Allow an Eventually Positive Matrix
Authors: Ber-Lin Yu, Jie Cui, Hong Cheng, Zhengfeng Yu
Abstract:
A sign pattern is a matrix whose entries belong to the set {+,−, 0}. An n-by-n sign pattern A is said to allow an eventually positive matrix if there exist some real matrices A with the same sign pattern as A and a positive integer k0 such that Ak > 0 for all k ≥ k0. It is well known that identifying and classifying the n-by-n sign patterns that allow an eventually positive matrix are posed as two open problems. In this article, the tree sign patterns of small order that allow an eventually positive matrix are classified completely.Keywords: Eventually positive matrix, sign pattern, tree.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12671651 Comparison of Phylogenetic Trees of Multiple Protein Sequence Alignment Methods
Authors: Khaddouja Boujenfa, Nadia Essoussi, Mohamed Limam
Abstract:
Multiple sequence alignment is a fundamental part in many bioinformatics applications such as phylogenetic analysis. Many alignment methods have been proposed. Each method gives a different result for the same data set, and consequently generates a different phylogenetic tree. Hence, the chosen alignment method affects the resulting tree. However in the literature, there is no evaluation of multiple alignment methods based on the comparison of their phylogenetic trees. This work evaluates the following eight aligners: ClustalX, T-Coffee, SAGA, MUSCLE, MAFFT, DIALIGN, ProbCons and Align-m, based on their phylogenetic trees (test trees) produced on a given data set. The Neighbor-Joining method is used to estimate trees. Three criteria, namely, the dNNI, the dRF and the Id_Tree are established to test the ability of different alignment methods to produce closer test tree compared to the reference one (true tree). Results show that the method which produces the most accurate alignment gives the nearest test tree to the reference tree. MUSCLE outperforms all aligners with respect to the three criteria and for all datasets, performing particularly better when sequence identities are within 10-20%. It is followed by T-Coffee at lower sequence identity (<10%), Align-m at 20-30% identity, and ClustalX and ProbCons at 30-50% identity. Also, it is noticed that when sequence identities are higher (>30%), trees scores of all methods become similar.Keywords: Multiple alignment methods, phylogenetic trees, Neighbor-Joining method, Robinson-Foulds distance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18261650 Binary Classification Tree with Tuned Observation-based Clustering
Authors: Maythapolnun Athimethphat, Boontarika Lerteerawong
Abstract:
There are several approaches for handling multiclass classification. Aside from one-against-one (OAO) and one-against-all (OAA), hierarchical classification technique is also commonly used. A binary classification tree is a hierarchical classification structure that breaks down a k-class problem into binary sub-problems, each solved by a binary classifier. In each node, a set of classes is divided into two subsets. A good class partition should be able to group similar classes together. Many algorithms measure similarity in term of distance between class centroids. Classes are grouped together by a clustering algorithm when distances between their centroids are small. In this paper, we present a binary classification tree with tuned observation-based clustering (BCT-TOB) that finds a class partition by performing clustering on observations instead of class centroids. A merging step is introduced to merge any insignificant class split. The experiment shows that performance of BCT-TOB is comparable to other algorithms.
Keywords: multiclass classification, hierarchical classification, binary classification tree, clustering, observation-based clustering
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17301649 Tree Based Decomposition of Sunspot Images
Authors: Hossein Mirzaee, Farhad Besharati
Abstract:
Solar sunspot rotation, latitudinal bands are studied based on intelligent computation methods. A combination of image fusion method with together tree decomposition is used to obtain quantitative values about the latitudes of trajectories on sun surface that sunspots rotate around them. Daily solar images taken with SOlar and Heliospheric (SOHO) satellite are fused for each month separately .The result of fused image is decomposed with Quad Tree decomposition method in order to achieve the precise information about latitudes of sunspot trajectories. Such analysis is useful for gathering information about the regions on sun surface and coordinates in space that is more expose to solar geomagnetic storms, tremendous flares and hot plasma gases permeate interplanetary space and help human to serve their technical systems. Here sunspot images in September, November and October in 2001 are used for studying the magnetic behavior of sun.Keywords: Quad tree decomposition, sunspot image.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12491648 A Hybrid Scheme for on-Line Diagnostic Decision Making Using Optimal Data Representation and Filtering Technique
Authors: Hyun-Woo Cho
Abstract:
The early diagnostic decision making in industrial processes is absolutely necessary to produce high quality final products. It helps to provide early warning for a special event in a process, and finding its assignable cause can be obtained. This work presents a hybrid diagnostic schmes for batch processes. Nonlinear representation of raw process data is combined with classification tree techniques. The nonlinear kernel-based dimension reduction is executed for nonlinear classification decision boundaries for fault classes. In order to enhance diagnosis performance for batch processes, filtering of the data is performed to get rid of the irrelevant information of the process data. For the diagnosis performance of several representation, filtering, and future observation estimation methods, four diagnostic schemes are evaluated. In this work, the performance of the presented diagnosis schemes is demonstrated using batch process data.
Keywords: Diagnostics, batch process, nonlinear representation, data filtering, multivariate statistical approach
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13161647 Image Processing Approach for Detection of Three-Dimensional Tree-Rings from X-Ray Computed Tomography
Authors: Jorge Martinez-Garcia, Ingrid Stelzner, Joerg Stelzner, Damian Gwerder, Philipp Schuetz
Abstract:
Tree-ring analysis is an important part of the quality assessment and the dating of (archaeological) wood samples. It provides quantitative data about the whole anatomical ring structure, which can be used, for example, to measure the impact of the fluctuating environment on the tree growth, for the dendrochronological analysis of archaeological wooden artefacts and to estimate the wood mechanical properties. Despite advances in computer vision and edge recognition algorithms, detection and counting of annual rings are still limited to 2D datasets and performed in most cases manually, which is a time consuming, tedious task and depends strongly on the operator’s experience. This work presents an image processing approach to detect the whole 3D tree-ring structure directly from X-ray computed tomography imaging data. The approach relies on a modified Canny edge detection algorithm, which captures fully connected tree-ring edges throughout the measured image stack and is validated on X-ray computed tomography data taken from six wood species.
Keywords: Ring recognition, edge detection, X-ray computed tomography, dendrochronology.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 806