Search results for: belief decision tree
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1814

1724 A Novel Methodology for Synthesis of Fault Trees from MATLAB-Simulink Model

Authors: F. Tajarrod, G. Latif-Shabgahi

Abstract:

Fault tree analysis is a well-known method for reliability and safety assessment of engineering systems. Over the last three decades, a number of methods have been introduced in the literature for the automatic construction of fault trees. The main difference between these methods is the starting model from which the tree is constructed. This paper presents a new methodology for constructing static and dynamic fault trees from a system's Simulink model. The method is introduced and explained in detail, and its correctness and completeness are validated experimentally using an example taken from the literature. Advantages of the method are also mentioned.

Keywords: Fault tree, Simulink, Standby Sparing and Redundancy

Downloads: 3002
1723 Game-Tree Simplification by Pattern Matching and Its Acceleration Approach using an FPGA

Authors: Suguru Ochiai, Toru Yabuki, Yoshiki Yamaguchi, Yuetsu Kodama

Abstract:

In this paper, we propose a Connect6 solver that adopts a hybrid approach based on a tree-search algorithm and image processing techniques. The solver must handle complicated computation and deliver high performance in order to make real-time decisions. The proposed approach enables the solver to be implemented on a single Xilinx Spartan-6 XC6SLX45 FPGA without any external devices. The compact implementation is achieved by using image processing techniques to optimize the tree-search algorithm for the Connect6 game. Tree search is widely used in computer games, and an optimal search yields the best move in every turn. Thus, many tree-search algorithms, such as the Minimax algorithm, and artificial intelligence approaches have been proposed in this field. However, there is one fundamental problem in this area: the computation time increases rapidly as the game tree grows. In hardware, this means that the larger the game tree, the larger the circuit, because of the highly parallel nature of the computation. This paper therefore aims to reduce the size of a Connect6 game tree using image processing techniques and the game's positional symmetry property. The proposed solver is composed of four computational modules: a two-dimensional checkmate strategy checker, a template matching module, a skilful-line predictor, and a next-move selector. These modules work together to select the next move from a set of candidates, and their total circuit size is small. The details of the hardware design for an FPGA implementation are described, and the performance of the design is also reported.

Keywords: Connect6, pattern matching, game-tree reduction, hardware direct computation

Downloads: 1974
1722 Pattern Matching Based on Regular Tree Grammars

Authors: Riad S. Jabri

Abstract:

Pattern matching based on regular tree grammars has been widely used in many areas of computer science. In this paper, we propose a pattern matcher within the framework of code generation, based on a generic and formalized approach. According to this approach, parsers for regular tree grammars are adapted to a general pattern matching solution, rather than adapting the pattern matching to their parsing behavior. Hence, we first formalize the construction of the pattern matches with respect to input trees drawn from a regular tree grammar in the form of so-called match trees. Then, we adopt a recently developed generic parser and tightly couple its parsing behavior with this construction. In addition to its generality, the resulting pattern matcher is characterized by its soundness and efficient implementation. This is demonstrated by the proposed theory and by the derived algorithms for its implementation. A comparison with similar and well-known approaches, such as those based on tree automata and LR parsers, has shown that our pattern matcher can be applied to a broader class of grammars and achieves a better approximation of pattern matches in one pass. Furthermore, its use as a machine code selector is characterized by minimized overhead, due to the balanced distribution of the cost computations into static ones, performed at parser generation time, and dynamic ones, performed at parsing time.

Keywords: Bottom-up automata, Code selection, Pattern matching, Regular tree grammars, Match trees.

Downloads: 1269
1721 Hybrid Approach for Software Defect Prediction Using Machine Learning with Optimization Technique

Authors: C. Manjula, Lilly Florence

Abstract:

Software technology is developing rapidly, which leads to the growth of various industries. Nowadays, software-based applications are widely adopted for business purposes. For any software company, developing reliable software is a challenging task, because a faulty software module may harm the growth of industry and business. Hence, there is a need for techniques that can be used for early prediction of software defects. Because of the complexities of manual prediction, automated software defect prediction techniques have been introduced. These techniques learn patterns from previous software versions and use them to find defects in the current version. They have attracted researchers because identifying bugs in software has a significant impact on industrial growth. Several studies have been carried out on this basis, but achieving the desired defect prediction performance is still a challenging task. To address this issue, we present a machine learning based hybrid technique for software defect prediction. First, a Genetic Algorithm (GA) is presented in which an improved fitness function is used for better optimization of features in data sets. These features are then processed by a Decision Tree (DT) classification model. Finally, an experimental study is presented in which results from the proposed GA-DT hybrid approach are compared with those from the DT classification technique. The results show that the proposed hybrid approach achieves better classification accuracy.

Keywords: Decision tree, genetic algorithm, machine learning, software defect prediction.

Downloads: 1465
1720 Computing Entropy for Ortholog Detection

Authors: Hsing-Kuo Pao, John Case

Abstract:

Biological sequences from different species are called orthologs if they evolved from a sequence of a common ancestor species and they have the same biological function. Approximations of Kolmogorov complexity or entropy of biological sequences are already well known to be useful in extracting similarity information between such sequences, in the interest, for example, of ortholog detection. As is well known, the exact Kolmogorov complexity is not algorithmically computable. In practice one can approximate it by computable compression methods. However, such compression methods do not provide a good approximation to Kolmogorov complexity for short sequences. Herein is suggested a new approach to overcome the problem that compression approximations may not work well on short sequences. This approach is inspired by new, conditional computations of Kolmogorov entropy. A main contribution of the empirical work described shows the new set of entropy-based machine learning attributes provides good separation between positive (ortholog) and negative (non-ortholog) data, better than with good, previously known alternatives (which do not employ some means to handle short sequences well). Also empirically compared are the new entropy-based attribute set and a number of other, more standard similarity attribute sets commonly used in genomic analysis. The various similarity attributes are evaluated by cross validation, through boosted decision tree induction C5.0, and by Receiver Operating Characteristic (ROC) analysis. The results point to the conclusion: the new, entropy-based attribute set by itself is not the one giving the best prediction; however, it is the best attribute set for use in improving the other, standard attribute sets when conjoined with them.

Keywords: compression, decision tree, entropy, ortholog, ROC.
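As a rough illustration of approximating Kolmogorov complexity with a computable compressor, the sketch below computes a normalized compression distance between two sequences using zlib. This is a generic, well-known technique swapped in for illustration; it is not the conditional entropy method proposed in the paper, and the function names and the randomly generated sequences are assumptions. For very short inputs the compressor's fixed overhead dominates the result, which is the weakness the abstract refers to.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a stand-in for complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: lower values suggest more shared structure."""
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical test data: two unrelated pseudo-random nucleotide strings.
rng = random.Random(0)
seq_a = "".join(rng.choice("ACGT") for _ in range(500))
seq_b = "".join(rng.choice("ACGT") for _ in range(500))

print(ncd(seq_a, seq_a))  # a sequence compared with itself
print(ncd(seq_a, seq_b))  # two unrelated sequences
```

A sequence compared with itself should score much lower than two unrelated sequences, since the second copy compresses almost for free.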

Downloads: 1827
1719 Tree Sign Patterns of Small Order that Allow an Eventually Positive Matrix

Authors: Ber-Lin Yu, Jie Cui, Hong Cheng, Zhengfeng Yu

Abstract:

A sign pattern is a matrix whose entries belong to the set {+, −, 0}. An n-by-n sign pattern $\mathcal{A}$ is said to allow an eventually positive matrix if there exist a real matrix $A$ with the same sign pattern as $\mathcal{A}$ and a positive integer $k_0$ such that $A^k > 0$ for all $k \ge k_0$. It is well known that identifying and classifying the n-by-n sign patterns that allow an eventually positive matrix are posed as two open problems. In this article, the tree sign patterns of small order that allow an eventually positive matrix are classified completely.

Keywords: Eventually positive matrix, sign pattern, tree.
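The definition of eventual positivity can be checked numerically for a given matrix. The sketch below is only an illustration: the 2-by-2 matrix and the helper names are assumptions, not taken from the paper. It relies on the fact that if $A^k > 0$ for every $k$ from $k_0$ to $2k_0 - 1$, then $A^k > 0$ for all $k \ge k_0$, because any larger power is a product of those positive powers.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_positive(m):
    """True if every entry of the matrix is strictly positive."""
    return all(x > 0 for row in m for x in row)


def eventually_positive_from(a, k0):
    """Check a^k > 0 for k = k0 .. 2*k0 - 1, which implies a^k > 0 for all k >= k0."""
    power = a
    for _ in range(k0 - 1):
        power = matmul(power, a)  # after the loop, power == a^k0
    for _ in range(k0):
        if not is_positive(power):
            return False
        power = matmul(power, a)
    return True


# Hypothetical example with sign pattern [[+, +], [+, -]].
A = [[1.0, 1.0],
     [1.0, -0.1]]

print(eventually_positive_from(A, 1))  # False: A itself has a negative entry
print(eventually_positive_from(A, 2))  # True: A^2 and A^3 are entrywise positive
```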

Downloads: 1267
1718 Comparison of Phylogenetic Trees of Multiple Protein Sequence Alignment Methods

Authors: Khaddouja Boujenfa, Nadia Essoussi, Mohamed Limam

Abstract:

Multiple sequence alignment is a fundamental part of many bioinformatics applications such as phylogenetic analysis. Many alignment methods have been proposed. Each method gives a different result for the same data set, and consequently generates a different phylogenetic tree. Hence, the chosen alignment method affects the resulting tree. However, in the literature there is no evaluation of multiple alignment methods based on the comparison of their phylogenetic trees. This work evaluates the following eight aligners: ClustalX, T-Coffee, SAGA, MUSCLE, MAFFT, DIALIGN, ProbCons and Align-m, based on the phylogenetic trees (test trees) they produce on a given data set. The Neighbor-Joining method is used to estimate trees. Three criteria, namely the dNNI, the dRF and the Id_Tree, are established to test the ability of the different alignment methods to produce a test tree close to the reference one (true tree). Results show that the method which produces the most accurate alignment gives the test tree nearest to the reference tree. MUSCLE outperforms all aligners with respect to the three criteria and for all datasets, performing particularly well when sequence identities are within 10-20%. It is followed by T-Coffee at lower sequence identity (<10%), Align-m at 20-30% identity, and ClustalX and ProbCons at 30-50% identity. It is also noted that when sequence identities are higher (>30%), the tree scores of all methods become similar.

Keywords: Multiple alignment methods, phylogenetic trees, Neighbor-Joining method, Robinson-Foulds distance.

Downloads: 1827
1717 Binary Classification Tree with Tuned Observation-based Clustering

Authors: Maythapolnun Athimethphat, Boontarika Lerteerawong

Abstract:

There are several approaches for handling multiclass classification. Aside from one-against-one (OAO) and one-against-all (OAA), the hierarchical classification technique is also commonly used. A binary classification tree is a hierarchical classification structure that breaks down a k-class problem into binary sub-problems, each solved by a binary classifier. In each node, a set of classes is divided into two subsets. A good class partition should be able to group similar classes together. Many algorithms measure similarity in terms of the distance between class centroids. Classes are grouped together by a clustering algorithm when the distances between their centroids are small. In this paper, we present a binary classification tree with tuned observation-based clustering (BCT-TOB) that finds a class partition by performing clustering on observations instead of class centroids. A merging step is introduced to merge any insignificant class split. The experiment shows that the performance of BCT-TOB is comparable to that of other algorithms.

Keywords: multiclass classification, hierarchical classification, binary classification tree, clustering, observation-based clustering

Downloads: 1732
1716 Tree Based Decomposition of Sunspot Images

Authors: Hossein Mirzaee, Farhad Besharati

Abstract:

Solar sunspot rotation and latitudinal bands are studied using intelligent computation methods. A combination of image fusion and quad tree decomposition is used to obtain quantitative values for the latitudes of the trajectories on the sun's surface around which sunspots rotate. Daily solar images taken by the Solar and Heliospheric Observatory (SOHO) satellite are fused for each month separately. The fused image is then decomposed with the quad tree decomposition method in order to obtain precise information about the latitudes of sunspot trajectories. Such analysis is useful for gathering information about the regions on the sun's surface, and the coordinates in space, that are more exposed to solar geomagnetic storms, in which tremendous flares and hot plasma gases permeate interplanetary space, and it can help people safeguard their technical systems. Here, sunspot images from September, October and November 2001 are used to study the magnetic behavior of the sun.

Keywords: Quad tree decomposition, sunspot image.
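A minimal sketch of quad tree decomposition may help make the idea concrete. It assumes a square grayscale image whose side is a power of two, stored as a list of rows, and a simple homogeneity test (intensity range within a threshold); the function name, the threshold rule and the tiny test image are assumptions for illustration, not the authors' implementation.

```python
def quadtree(img, x, y, size, threshold):
    """Return leaf blocks (x, y, size) of a quad tree over a square image region.

    A block is kept whole when its intensity range is within `threshold`;
    otherwise it is split into four equal quadrants, recursively.
    """
    block = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size == 1 or max(block) - min(block) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree(img, x + dx, y + dy, half, threshold)
    return leaves


# Hypothetical 4x4 image: uniform background with one bright pixel at the top-left.
image = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

leaves = quadtree(image, 0, 0, 4, threshold=0)
for leaf in leaves:
    print(leaf)
```

Small leaf blocks cluster around the bright pixel while uniform regions stay as large blocks, which is why the positions of the smallest blocks can be used to localize features such as sunspot tracks.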

Downloads: 1250
1715 A Hybrid Scheme for on-Line Diagnostic Decision Making Using Optimal Data Representation and Filtering Technique

Authors: Hyun-Woo Cho

Abstract:

Early diagnostic decision making in industrial processes is essential for producing high-quality final products. It provides early warning of a special event in a process and helps to find its assignable cause. This work presents a hybrid diagnostic scheme for batch processes. A nonlinear representation of raw process data is combined with classification tree techniques. Nonlinear kernel-based dimension reduction is performed to obtain nonlinear classification decision boundaries for the fault classes. To enhance diagnosis performance for batch processes, the data are filtered to remove irrelevant information. Four diagnostic schemes are evaluated to compare the diagnosis performance of several representation, filtering, and future observation estimation methods. The performance of the presented diagnosis schemes is demonstrated using batch process data.

Keywords: Diagnostics, batch process, nonlinear representation, data filtering, multivariate statistical approach

Downloads: 1317
1714 Image Processing Approach for Detection of Three-Dimensional Tree-Rings from X-Ray Computed Tomography

Authors: Jorge Martinez-Garcia, Ingrid Stelzner, Joerg Stelzner, Damian Gwerder, Philipp Schuetz

Abstract:

Tree-ring analysis is an important part of the quality assessment and the dating of (archaeological) wood samples. It provides quantitative data about the whole anatomical ring structure, which can be used, for example, to measure the impact of the fluctuating environment on the tree growth, for the dendrochronological analysis of archaeological wooden artefacts and to estimate the wood mechanical properties. Despite advances in computer vision and edge recognition algorithms, detection and counting of annual rings are still limited to 2D datasets and performed in most cases manually, which is a time consuming, tedious task and depends strongly on the operator’s experience. This work presents an image processing approach to detect the whole 3D tree-ring structure directly from X-ray computed tomography imaging data. The approach relies on a modified Canny edge detection algorithm, which captures fully connected tree-ring edges throughout the measured image stack and is validated on X-ray computed tomography data taken from six wood species.

Keywords: Ring recognition, edge detection, X-ray computed tomography, dendrochronology.

Downloads: 807
1713 Using Time-Series NDVI to Model Land Cover Change: A Case Study in the Berg River Catchment Area, Western Cape, South Africa

Authors: A. S. Adesuyi, Z. Munch

Abstract:

This study investigates the use of a time series of MODIS NDVI data to identify agricultural land cover change on an annual time step (2007-2012) and to characterize the trend. Following an ISODATA classification of the MODIS imagery to selectively mask areas that are not agricultural or semi-natural, NDVI signatures were created to identify areas of cereals and vineyards with the aid of ancillary, pictometry and field sample data for 2010. The NDVI signature curve and training samples were used to create a decision tree model in WEKA 3.6.9 using the J48 decision tree classifier algorithm, with Model 1 including the ISODATA classification and Model 2 not. These two models were then used to classify all data for the study area for 2010, producing land cover maps with classification accuracies of 77% and 80% for Model 1 and Model 2, respectively. Model 2 was subsequently used to create land cover classification and change detection maps for all other years. Subtle changes and areas of consistency (unchanged) were observed in the agricultural classes and crop practices over the years, as predicted by the land cover classification. Forty-one percent of the catchment comprised cereals, with 35% possibly following a crop rotation system. Vineyards largely remained constant, with only one percent conversion to vineyard from other land cover classes.

Keywords: Change detection, Land cover, NDVI, time-series.

Downloads: 2291
1712 Urban and Rural Children’s Knowledge on Biodiversity in Bizkaia: Tree Identification Skills and Animal and Plant Listing

Authors: Joserra Díez, Ainhoa Meñika, Iñaki Sanz-Azkue, Arritokieta Ortuzar

Abstract:

Biodiversity provides humans with a great range of ecosystem services; it is therefore an indispensable resource and a legacy for coming generations. However, in recent decades the increasing exploitation of the planet has caused a great loss of biodiversity, and familiarity with it has decreased remarkably, especially in urbanized areas, because of the decreasing attachment of humans to nature. Yet the Primary Education curriculum prioritizes the identification of flora and fauna to ensure that children know their surroundings, so that they care for the environment as well as for themselves. In order to produce effective didactic material that meets the needs of both teachers and pupils, it is fundamental to diagnose the current situation. The present work discusses the knowledge of biodiversity among 3rd cycle Primary Education students in Biscay (n=98) and its relation to the size of the town or city of their school. Two tests were used for this purpose: one for tree identification and another in which the students listed the species of trees and animals they knew. Results reveal that the students' knowledge of tree identification is scarce regardless of the size of the town or city of their school. On the other hand, animal species are better known than tree species.

Keywords: Biodiversity, population, tree identification, animal identification.

Downloads: 1188
1711 Auto Regressive Tree Modeling for Parametric Optimization in Fuzzy Logic Control System

Authors: Arshia Azam, J. Amarnath, Ch. D. V. Paradesi Rao

Abstract:

The advantage of solving complex nonlinear problems with fuzzy logic methodologies is that experience or expert knowledge, described as a fuzzy rule base, can be directly embedded into the systems that deal with these problems. This paper focuses on the current limitations in the appropriate and automated design of fuzzy controllers. The structure discovery and parameter adjustment of the Branched T-S fuzzy model are addressed by a hybrid technique of type-constrained sparse tree algorithms. Simulation results for different system models are evaluated, and the identification error is observed to be minimal.

Keywords: Fuzzy logic, branch T-S fuzzy model, tree modeling, complex nonlinear system.

Downloads: 1389
1710 Pressure Losses on Realistic Geometry of Tracheobronchial Tree

Authors: Michaela Chovancova, Jakub Elcner

Abstract:

The real bronchial tree is a very complicated piping system, and analysis of flow and pressure losses in this system is very difficult. Because of the complex geometry and the very small dimensions in the lower generations, examination by CFD is possible only in the central part of the bronchial tree. To specify the pressure losses in the lower generations, a mathematical equation is needed. Determining mathematical formulas for calculating pressure losses in real lungs is a time-consuming and inefficient process because of their complexity and diversity. For these calculations, it is necessary either to slightly simplify the geometry of the lungs (the same cross-section over the length of each generation) or to use one of the idealized lung models (Horsfield, Weibel). The article compares the values of pressure losses obtained from a CFD simulation of air flow in the central part of the real bronchial tree with the values calculated for slightly simplified real lungs using a mathematical relationship derived from the Bernoulli and continuity equations. The aim of the article is to analyse the accuracy of the analytical method and the possibility of using it to calculate pressure losses in the lower generations, which are difficult to solve numerically because of the small geometry.

Keywords: Pressure gradient, airways resistance, real geometry of bronchial tree, breathing.

Downloads: 1878
1709 Remote-Sensing Sunspot Images to Obtain the Sunspot Roads

Authors: Hossein Mirzaee, Farhad Besharati

Abstract:

A combination of image fusion and quad tree decomposition is used to detect the sunspot trajectories in each month and to compute the latitudes of these trajectories in each solar hemisphere. Daily solar images taken by the SOHO satellite are fused for each month, and the fused image is decomposed with the quad tree decomposition method in order to classify the sunspot trajectories and then obtain precise information about their latitudes. Fusion also allows us to draw some physically notable conclusions about the behavior of the sun's magnetic fields. Using quad tree decomposition, we give information about the regions on the sun's surface, and the space angle, from which tremendous flares and hot plasma gases permeate interplanetary space and strike satellites and human technical systems. Here, sunspot images from June, July and August 2001 are studied, and a method is given for computing the latitude of sunspot trajectories in each month from sunspot images.

Keywords: Quad Tree Decomposition, Sunspot.

Downloads: 1209
1708 Evaluation of the Impact of Dataset Characteristics for Classification Problems in Biological Applications

Authors: Kanthida Kusonmano, Michael Netzer, Bernhard Pfeifer, Christian Baumgartner, Klaus R. Liedl, Armin Graber

Abstract:

The availability of high-dimensional biological datasets, such as those from gene expression, proteomic, and metabolic experiments, can be leveraged for the diagnosis and prognosis of diseases. Many classification methods in this area have been studied to predict disease states and separate predefined classes, such as patients with a particular disease versus healthy controls. However, most existing research focuses on a single dataset. There is a lack of generic comparisons between classifiers, which might provide a guideline for biologists or bioinformaticians in selecting the proper algorithm for new datasets. In this study, we compare the performance of popular classifiers, namely Support Vector Machine (SVM), Logistic Regression, k-Nearest Neighbor (k-NN), Naive Bayes, Decision Tree, and Random Forest, on mock datasets. We mimic common biological scenarios by simulating various proportions of real discriminating biomarkers and different effect sizes thereof. The results show that SVM performs quite stably and reaches a higher AUC than the other methods. This may be explained by the ability of SVM to minimize the probability of error. Moreover, Decision Tree, with its good applicability for diagnosis and prognosis, shows good performance in our experimental setup. Logistic Regression and Random Forest, however, depend strongly on the ratio of discriminators and perform better with a higher number of discriminators.

Keywords: Classification, High dimensional data, Machine learning

Downloads: 2384
1707 An Optimized Design of Non-uniform Filterbank

Authors: Ram Kumar Soni, Alok Jain, Rajiv Saxena

Abstract:

The tree-structured approach to the non-uniform filterbank (NUFB) is normally used for perfect reconstruction (PR). PR is not always feasible because of certain limitations, i.e., constraints in selecting design parameters and design complexity, and sometimes the output is severely affected by aliasing error if the necessary and sufficient conditions for PR are not satisfied exactly. Therefore, researchers have shown general interest in near perfect reconstruction (NPR). In this work, an optimized tree structure technique is used for the design of an NPR non-uniform filterbank. Window functions of the Blackman family are used to design the prototype FIR filter. A single-variable linear optimization is used to minimize the amplitude distortion. The main feature of the proposed design is its simplicity together with its linear phase property.

Keywords: Tree structure, NUFB, QMF, NPR.

Downloads: 1738
1706 About the Case Portfolio Management Algorithms and Their Applications

Authors: M. Chumburidze, N. Salia, T. Namchevadze

Abstract:

This work deals with case processing problems in business. The task of strategic credit requirements management for a portfolio of cases is discussed. An information model of credit requirements in the form of a binary tree diagram is considered. Algorithms for prioritizing clusters of cases in business have been investigated. An implementation of priority queues to support case management operations has been presented. The corresponding pseudocode for the programming application has been constructed. The tools applied in this development are based on binary tree ordering algorithms, optimization theory, and business management methods.

Keywords: Credit network, case portfolio, binary tree, priority queue, stack.
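To illustrate how a priority queue can support case management operations, here is a minimal sketch built on Python's binary-heap module `heapq`. The `CaseQueue` class, the case identifiers and the priority values are hypothetical and are not taken from the paper's pseudocode; a lower number means a more urgent case, and a running counter keeps cases with equal priority in arrival order.

```python
import heapq
import itertools


class CaseQueue:
    """Priority queue of cases backed by a binary heap (lower number = more urgent)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker preserving arrival order

    def push(self, case_id, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), case_id))

    def pop(self):
        """Remove and return the most urgent case identifier."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


# Hypothetical cases and priorities.
queue = CaseQueue()
queue.push("loan-17", priority=2)
queue.push("loan-03", priority=1)
queue.push("loan-42", priority=2)

order = [queue.pop() for _ in range(len(queue))]
print(order)
```

Both insertion and removal take logarithmic time in the number of queued cases, which is the usual reason for choosing a heap over a sorted list here.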

Downloads: 78
1705 Decision Making under Strict Uncertainty: Case Study in Sewer Network Planning

Authors: Zhen Wu, David Lupien St-Pierre, Georges Abdul-Nour

Abstract:

In decision making under strict uncertainty, decision makers have to choose a decision without any information about the states of nature. The classic criteria of Laplace, Wald, Savage, Hurwicz and Starr are introduced and compared in a case study of sewer network planning. Furthermore, results from different criteria are discussed and analyzed. Moreover, this paper discusses the idea that decision making under strict uncertainty (DMUSU) can be viewed as a two-player game and thus be solved by a solution concept in game theory: Nash equilibrium.

Keywords: Decision criteria, decision making, sewer network planning, strict uncertainty.
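The classic criteria named in the abstract can be sketched for a small payoff table. The code below assumes payoffs where higher is better, and the alternatives, payoff values and the Hurwicz optimism coefficient are hypothetical, not data from the sewer network case study. Starr's criterion is omitted from this sketch.

```python
def laplace(payoffs):
    """Pick the alternative with the highest average payoff (equally likely states)."""
    return max(payoffs, key=lambda a: sum(payoffs[a]) / len(payoffs[a]))


def wald(payoffs):
    """Maximin: pick the alternative whose worst payoff is best."""
    return max(payoffs, key=lambda a: min(payoffs[a]))


def hurwicz(payoffs, alpha):
    """Weight the best payoff by alpha (optimism) and the worst by 1 - alpha."""
    return max(payoffs,
               key=lambda a: alpha * max(payoffs[a]) + (1 - alpha) * min(payoffs[a]))


def savage(payoffs):
    """Minimax regret: minimize the largest shortfall from the best payoff per state."""
    n_states = len(next(iter(payoffs.values())))
    best = [max(row[j] for row in payoffs.values()) for j in range(n_states)]
    return min(payoffs,
               key=lambda a: max(best[j] - payoffs[a][j] for j in range(n_states)))


# Hypothetical payoff table: one row per alternative, one column per state of nature.
payoffs = {
    "A": [10, 2, 4],
    "B": [6, 5, 6],
    "C": [12, 0, 3],
}

print("Laplace:", laplace(payoffs))
print("Wald:", wald(payoffs))
print("Hurwicz (alpha=0.8):", hurwicz(payoffs, 0.8))
print("Savage:", savage(payoffs))
```

With this table the criteria disagree, which reflects the point of comparing them: the recommended decision depends on the attitude toward uncertainty encoded by each criterion.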

Downloads: 1496
1704 Applying Fuzzy Decision Making Approach to IT Outsourcing Supplier Selection

Authors: Gülcin Büyüközkan, Mehmet Sakir Ersoy

Abstract:

The decision to outsource information technology (IT) requires close attention to the supplier selection process, because the selection decision involves multiple conflicting criteria and is replete with complex decision making problems. Selecting the most appropriate suppliers is considered an important strategic decision that may affect the performance of outsourcing engagements. The objective of this paper is to help decision makers evaluate and assess possible IT outsourcing suppliers. An axiomatic design based fuzzy group decision making approach is adopted to evaluate supplier alternatives. Finally, a case study is given to demonstrate the potential of the methodology.

Keywords: IT outsourcing, Supplier selection, Multi-criteria decision making, Axiomatic design, Fuzzy logic

Downloads: 1952
1703 Using Suffix Tree Document Representation in Hierarchical Agglomerative Clustering

Authors: Daniel I. Morariu, Radu G. Cretulescu, Lucian N. Vintan

Abstract:

In text categorization, the most widely used method for document representation is based on word frequency vectors and is called VSM (Vector Space Model). This representation is based only on the words in the documents and therefore loses any "word context" information found in the document. In this article we compare the classical method of document representation with a method called Suffix Tree Document Model (STDM), which is based on representing documents in the suffix tree format. For the STDM model we propose a new approach to document representation and a new formula for computing the similarity between two documents. Specifically, we propose to build the suffix tree for only two documents at a time. This approach is faster, has lower memory consumption and uses the entire document representation without using methods for disposing of nodes. For this method we also propose a formula for computing the similarity between documents, which substantially improves the clustering quality. This representation method was validated using HAC (Hierarchical Agglomerative Clustering). In this context we also examine the influence of stemming in the document preprocessing step and highlight the difference between similarity and dissimilarity measures for finding "closer" documents.

Keywords: Text Clustering, Suffix tree document representation, Hierarchical Agglomerative Clustering

Downloads: 1911
1702 Bayesian Belief Networks for Test Driven Development

Authors: Vijayalakshmy Periaswamy S., Kevin McDaid

Abstract:

Testing accounts for the major percentage of technical contribution in the software development process. Typically, it consumes more than 50 percent of the total cost of developing a piece of software. The selection of software tests is a very important activity within this process to ensure the software reliability requirements are met. Generally tests are run to achieve maximum coverage of the software code and very little attention is given to the achieved reliability of the software. Using an existing methodology, this paper describes how to use Bayesian Belief Networks (BBNs) to select unit tests based on their contribution to the reliability of the module under consideration. In particular the work examines how the approach can enhance test-first development by assessing the quality of test suites resulting from this development methodology and providing insight into additional tests that can significantly reduce the achieved reliability. In this way the method can produce an optimal selection of inputs and the order in which the tests are executed to maximize the software reliability. To illustrate this approach, a belief network is constructed for a modern software system incorporating the expert opinion, expressed through probabilities of the relative quality of the elements of the software, and the potential effectiveness of the software tests. The steps involved in constructing the Bayesian Network are explained as is a method to allow for the test suite resulting from test-driven development.

Keywords: Software testing, Test Driven Development, Bayesian Belief Networks.

Downloads 1887
1701 Modeling of Reinforcement in Concrete Beams Using Machine Learning Tools

Authors: Yogesh Aggarwal

Abstract:

The paper discusses the results obtained in predicting the reinforcement in a singly reinforced beam using Neural Networks (NN), Support Vector Machines (SVMs) and Tree Based Models. A major advantage of SVMs over NN is that they minimize a bound on the generalization error of the model rather than a bound on the mean square error over the data set, as is done in NN. The Tree Based approach divides the problem into a small number of sub-problems to reach a conclusion. Data were generated for different beam parameters, with the reinforcement calculated by the limit state method, for the creation and validation of the models. The results of this study suggest a remarkably good performance of the tree based and SVM models. Further, this study found that these two techniques work well, and even better than Neural Network methods. A comparison of predicted values with actual values suggests a very good correlation coefficient with all four techniques.

Keywords: Linear Regression, M5 Model Tree, Neural Network, Support Vector Machines.

Downloads 2035
1700 A Proposed Technique for Software Development Risks Identification by using FTA Model

Authors: Hatem A. Khater, A. Baith Mohamed, Sara M. Kamel

Abstract:

Software Development Risks Identification (SDRI), using Fault Tree Analysis (FTA), is a proposed technique to identify not only the risk factors but also the causes of the appearance of those risk factors in the software development life cycle. The method is based on analyzing the probable causes of software development failures before they become problems and adversely affect a project. It uses Fault Tree Analysis (FTA) to determine the probability of particular system-level failures, as defined by A Taxonomy for Sources of Software Development Risk. FTA is a deductive failure analysis in which an undesired state of a system is analyzed using Boolean logic to combine a series of lower-level events. The major purpose of this paper is to use the probabilistic calculations of the Fault Tree Analysis approach to determine all possible causes that lead to the occurrence of software development risk.
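The probabilistic side of a fault tree can be illustrated with a minimal sketch. It is not taken from the paper: it assumes statistically independent basic events, and the event names and probabilities below are hypothetical values chosen only for illustration.

```python
from math import prod

def and_gate(probs):
    """AND gate: all input events must occur (independence assumed)."""
    return prod(probs)

def or_gate(probs):
    """OR gate: at least one input event occurs (independence assumed)."""
    return 1.0 - prod(1.0 - p for p in probs)

# Hypothetical basic-event probabilities (illustrative, not from the paper).
p_unclear_requirements = 0.2
p_staff_turnover = 0.1
p_schedule_pressure = 0.3

# Intermediate event: a team-related risk needs both turnover and pressure.
p_team_risk = and_gate([p_staff_turnover, p_schedule_pressure])

# Top event: the project is at risk if either cause is present.
p_top = or_gate([p_unclear_requirements, p_team_risk])
print(f"P(top event) = {p_top:.3f}")
```

If the basic events are correlated, these gate formulas no longer hold and the joint probabilities would have to be modeled explicitly.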

Keywords: Software Development Risks Identification (SDRI), Fault Tree Analysis (FTA), Taxonomy for Software Development Risks (TSDR), Probabilistic Risk Assessment (PRA).

Downloads 2217
1699 Theoretical Appraisal of Satisfactory Decisions: Uncertainty, Evolutionary Ideas and Beliefs, and Satisfactory Time Use

Authors: Okay Gunes

Abstract:

Unsatisfactory experiences due to an information shortage regarding the future pay-offs of actual choices yield satisficing decision-making. This research examines, for the first time in the literature, the motivation behind suboptimal decisions due to uncertainty by scrutinizing Adam Smith’s and Jeremy Bentham’s assumptions about the nature of the actions that lead to satisficing behavior, in order to clarify the theoretical background of a “consumption-based satisfactory time” concept. The contribution of this paper with respect to the existing literature is threefold. Firstly, it is shown that Adam Smith’s uncertainty is related to the problem of the constancy of ideas and not directly to beliefs. Secondly, possessions, as in Jeremy Bentham’s oeuvre, are assumed to be just as pleasing, in protecting and improving the actual or expected quality of life, so long as they reduce any displeasure due to the undesired outcomes of uncertainty. Finally, each consumption decision incurs its own satisfactory time period, owed to not feeling hungry, being healthy, not having transportation, etc. This reveals that the level of satisfaction is indeed a behavioral phenomenon whose value depends on the simultaneous satisfaction derived from all activities.

Keywords: Decision-making, idea and belief, satisficing, uncertainty.

Downloads 1930
1698 Classification and Analysis of Risks in Software Engineering

Authors: Hooman Hoodat, Hassan Rashidi

Abstract:

Despite the various methods that exist in software risk management, software projects have a high rate of failure. As the complexity and size of projects increase, managing software development becomes more difficult. In such projects, more analysis and risk assessment are vital. In this paper, a classification of software risks is specified. The relations between these risks are then presented using a risk tree structure. Analysis and assessment of these risks are carried out using probabilistic calculations. This analysis supports qualitative and quantitative assessment of the risk of failure. Moreover, it can help the software risk management process. This classification and risk tree structure can be applied to some software tools.

Keywords: Risk analysis, risk assessment, risk classification, risk tree.

Downloads 9031
1697 Forest Growth Simulation: Tropical Rain Forest Stand Table Projection

Authors: Yasmin Yahya, Roslan Ismail, Samreth Vanna, Khorn Saret

Abstract:

A study of tree growth for four species groups of commercial timber in the tropical rainforest of Koh Kong province, Cambodia, is described. The simulation for these four groups was successfully developed at 5-year intervals through year 60. Data were obtained from twenty permanent sample plots over a period of thirteen years. The aim of this study was to develop a stand table simulation system of tree growth by species group. There were five steps involved in the development of the tree growth simulation: aggregate the tree species into meaningful groups by using cluster analysis; allocate the trees to diameter classes by species group; observe the diameter movement of the species group. The diameter growth rate, mortality rate and recruitment rate were calculated using mathematical formulas. A simulation equation was created by combining those parameters. The results showed dissimilarity in diameter growth among the species groups.

Keywords: cluster analysis, diameter growth, simulation

Downloads 2213
1696 Text Mining Technique for Data Mining Application

Authors: M. Govindarajan

Abstract:

Applying knowledge discovery techniques to unstructured text is termed knowledge discovery in text (KDT), text data mining, or text mining. The decision tree approach is most useful in classification problems. With this technique, a tree is constructed to model the classification process. There are two basic steps in the technique: building the tree and applying the tree to the database. This paper describes a proposed C5.0 classifier that applies rulesets, cross-validation and boosting to the original C5.0 in order to reduce the error ratio. The feasibility and the benefits of the proposed approach are demonstrated by means of a medical data set, hypothyroid. It is shown that the performance of a classifier on the training cases from which it was constructed gives a poor estimate of its accuracy. Accuracy can instead be estimated by sampling or by using a separate test file; either way, the classifier is evaluated on cases that were not used to build it, and the estimate is dependable only when the numbers of cases used to build and evaluate the classifier are both large. If the cases in hypothyroid.data and hypothyroid.test were to be shuffled and divided into a new 2772-case training set and a 1000-case test set, C5.0 might construct a different classifier with a lower or higher error rate on the test cases. An important feature of See5 is its ability to generate classifiers called rulesets. The ruleset has an error rate of 0.5% on the test cases. The standard errors of the means provide an estimate of the variability of results. One way to get a more reliable estimate of predictive accuracy is f-fold cross-validation. The error rate of a classifier produced from all the cases is estimated as the ratio of the total number of errors on the hold-out cases to the total number of cases. The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%.
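The f-fold cross-validation estimate described in this abstract, total hold-out errors divided by total cases, can be sketched as follows. This is only an illustration: it does not use C5.0 or See5, the majority-class learner is a stand-in, and all function names and the toy data are hypothetical.

```python
from collections import Counter

def cross_val_error(cases, labels, f, train, predict):
    """Estimate the error rate as total hold-out errors / total cases."""
    n = len(cases)
    errors = 0
    for k in range(f):
        held_out = set(range(k, n, f))  # every f-th case forms fold k
        train_x = [cases[i] for i in range(n) if i not in held_out]
        train_y = [labels[i] for i in range(n) if i not in held_out]
        model = train(train_x, train_y)
        # Count mistakes only on cases the model has not seen.
        errors += sum(predict(model, cases[i]) != labels[i] for i in held_out)
    return errors / n

# Stand-in learner (an assumption, not C5.0): always predict the majority class.
def train_majority(xs, ys):
    return Counter(ys).most_common(1)[0][0]

def predict_majority(model, x):
    return model

# Toy data, purely illustrative.
cases = list(range(10))
labels = ["negative"] * 8 + ["positive"] * 2
err = cross_val_error(cases, labels, 5, train_majority, predict_majority)
print(f"cross-validated error rate: {err:.2f}")
```

Each case is held out exactly once, so the denominator is the full number of cases, as in the ratio stated above.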

Keywords: C5.0, Error Ratio, text mining, training data, test data.

Downloads 2489
1695 Taking People, Process and Partnership on Board for Participatory Decision Making

Authors: B. Mikulskienė

Abstract:

Public administration institutions, in cooperation with politicians, are no longer the sole policy decision makers in the full sense. Instead, a special role, namely steering the decision making process, could be delegated to them. Despite wide scientific discussion of the various aspects that directly affect policy creation, there is a lack of holistic, practical managerial advice that could integrate the infrastructure of policy decision making with intellectual capital and with the interconnection of partnerships. The proposed harmonized decision making model of process, people and partnership, denoted by the acronym HM-3P, is analyzed as a framework for implementing the steering role of public administration in seeking coherent social involvement in policy decision making.

Keywords: participatory decision making, partnership, stakeholders.

Downloads 1454