Search results for: naïve Bayesian tree.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 549

Search results for: naïve Bayesian tree.

369 A Minimum Spanning Tree-Based Method for Initializing the K-Means Clustering Algorithm

Authors: J. Yang, Y. Ma, X. Zhang, S. Li, Y. Zhang

Abstract:

The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the algorithm often converges to local minima for the reason that it is sensitive to the initial cluster centers. In this paper, an algorithm for selecting initial cluster centers on the basis of minimum spanning tree (MST) is presented. The set of vertices in MST with same degree are regarded as a whole which is used to find the skeleton data points. Furthermore, a distance measure between the skeleton data points with consideration of degree and Euclidean distance is presented. Finally, MST-based initialization method for the k-means algorithm is presented, and the corresponding time complexity is analyzed as well. The presented algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate the effectiveness of the presented algorithm compared to three existing initialization methods.

Keywords: Degree, initial cluster center, k-means, minimum spanning tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1493
368 Biodiversity and Climate Change: Consequences for Norway Spruce Mountain Forests in Slovakia

Authors: Jozef Mindas, Jaroslav Skvarenina, Jana Skvareninova

Abstract:

Study of the effects of climate change on Norway Spruce (Picea abies) forests has mainly focused on the diversity of tree species diversity of tree species as a result of the ability of species to tolerate temperature and moisture changes as well as some effects of disturbance regime changes. The tree species’ diversity changes in spruce forests due to climate change have been analyzed via gap model. Forest gap model is a dynamic model for calculation basic characteristics of individual forest trees. Input ecological data for model calculations have been taken from the permanent research plots located in primeval forests in mountainous regions in Slovakia. The results of regional scenarios of the climatic change for the territory of Slovakia have been used, from which the values are according to the CGCM3.1 (global) model, KNMI and MPI (regional) models. Model results for conditions of the climate change scenarios suggest a shift of the upper forest limit to the region of the present subalpine zone, in supramontane zone. N. spruce representation will decrease at the expense of beech and precious broadleaved species (Acer sp., Sorbus sp., Fraxinus sp.). The most significant tree species diversity changes have been identified for the upper tree line and current belt of dwarf pine (Pinus mugo) occurrence. The results have been also discussed in relation to most important disturbances (wind storms, snow and ice storms) and phenological changes which consequences are little known. Special discussion is focused on biomass production changes in relation to carbon storage diversity in different carbon pools.

Keywords: Biodiversity, climate change, Norway spruce forests, gap model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1585
367 Dynamic Bayesian Networks Modeling for Inferring Genetic Regulatory Networks by Search Strategy: Comparison between Greedy Hill Climbing and MCMC Methods

Authors: Huihai Wu, Xiaohui Liu

Abstract:

Using Dynamic Bayesian Networks (DBN) to model genetic regulatory networks from gene expression data is one of the major paradigms for inferring the interactions among genes. Averaging a collection of models for predicting network is desired, rather than relying on a single high scoring model. In this paper, two kinds of model searching approaches are compared, which are Greedy hill-climbing Search with Restarts (GSR) and Markov Chain Monte Carlo (MCMC) methods. The GSR is preferred in many papers, but there is no such comparison study about which one is better for DBN models. Different types of experiments have been carried out to try to give a benchmark test to these approaches. Our experimental results demonstrated that on average the MCMC methods outperform the GSR in accuracy of predicted network, and having the comparable performance in time efficiency. By proposing the different variations of MCMC and employing simulated annealing strategy, the MCMC methods become more efficient and stable. Apart from comparisons between these approaches, another objective of this study is to investigate the feasibility of using DBN modeling approaches for inferring gene networks from few snapshots of high dimensional gene profiles. Through synthetic data experiments as well as systematic data experiments, the experimental results revealed how the performances of these approaches can be influenced as the target gene network varies in the network size, data size, as well as system complexity.

Keywords: Genetic regulatory network, Dynamic Bayesian network, GSR, MCMC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1841
366 Faults Forecasting System

Authors: Hanaa E.Sayed, Hossam A. Gabbar, Shigeji Miyazaki

Abstract:

This paper presents Faults Forecasting System (FFS) that utilizes statistical forecasting techniques in analyzing process variables data in order to forecast faults occurrences. FFS is proposing new idea in detecting faults. Current techniques used in faults detection are based on analyzing the current status of the system variables in order to check if the current status is fault or not. FFS is using forecasting techniques to predict future timing for faults before it happens. Proposed model is applying subset modeling strategy and Bayesian approach in order to decrease dimensionality of the process variables and improve faults forecasting accuracy. A practical experiment, designed and implemented in Okayama University, Japan, is implemented, and the comparison shows that our proposed model is showing high forecasting accuracy and BEFORE-TIME.

Keywords: Bayesian Techniques, Faults Detection, Forecasting techniques, Multivariate Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1505
365 A Decision Tree Approach to Estimate Permanent Residents Using Remote Sensing Data in Lebanese Municipalities

Authors: K. Allaw, J. Adjizian Gerard, M. Chehayeb, A. Raad, W. Fahs, A. Badran, A. Fakherdin, H. Madi, N. Badaro Saliba

Abstract:

Population estimation using Geographic Information System (GIS) and remote sensing faces many obstacles such as the determination of permanent residents. A permanent resident is an individual who stays and works during all four seasons in his village. So, all those who move towards other cities or villages are excluded from this category. The aim of this study is to identify the factors affecting the percentage of permanent residents in a village and to determine the attributed weight to each factor. To do so, six factors have been chosen (slope, precipitation, temperature, number of services, time to Central Business District (CBD) and the proximity to conflict zones) and each one of those factors has been evaluated using one of the following data: the contour lines map of 50 m, the precipitation map, four temperature maps and data collected through surveys. The weighting procedure has been done using decision tree method. As a result of this procedure, temperature (50.8%) and percentage of precipitation (46.5%) are the most influencing factors.

Keywords: Remote sensing and GIS, permanent residence, decision tree, Lebanon.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 950
364 Comparison between Associative Classification and Decision Tree for HCV Treatment Response Prediction

Authors: Enas M. F. El Houby, Marwa S. Hassan

Abstract:

Combined therapy using Interferon and Ribavirin is the standard treatment in patients with chronic hepatitis C. However, the number of responders to this treatment is low, whereas its cost and side effects are high. Therefore, there is a clear need to predict patient’s response to the treatment based on clinical information to protect the patients from the bad drawbacks, Intolerable side effects and waste of money. Different machine learning techniques have been developed to fulfill this purpose. From these techniques are Associative Classification (AC) and Decision Tree (DT). The aim of this research is to compare the performance of these two techniques in the prediction of virological response to the standard treatment of HCV from clinical information. 200 patients treated with Interferon and Ribavirin; were analyzed using AC and DT. 150 cases had been used to train the classifiers and 50 cases had been used to test the classifiers. The experiment results showed that the two techniques had given acceptable results however the best accuracy for the AC reached 92% whereas for DT reached 80%.

Keywords: Associative Classification, Data mining, Decision tree, HCV, interferon.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1845
363 Unsupervised Segmentation by Hidden Markov Chain with Bi-dimensional Observed Process

Authors: Abdelali Joumad, Abdelaziz Nasroallah

Abstract:

In unsupervised segmentation context, we propose a bi-dimensional hidden Markov chain model (X,Y) that we adapt to the image segmentation problem. The bi-dimensional observed process Y = (Y 1, Y 2) is such that Y 1 represents the noisy image and Y 2 represents a noisy supplementary information on the image, for example a noisy proportion of pixels of the same type in a neighborhood of the current pixel. The proposed model can be seen as a competitive alternative to the Hilbert-Peano scan. We propose a bayesian algorithm to estimate parameters of the considered model. The performance of this algorithm is globally favorable, compared to the bi-dimensional EM algorithm through numerical and visual data.

Keywords: Image segmentation, Hidden Markov chain with a bi-dimensional observed process, Peano-Hilbert scan, Bayesian approach, MCMC methods, Bi-dimensional EM algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1554
362 A Reasoning Method of Cyber-Attack Attribution Based on Threat Intelligence

Authors: Li Qiang, Yang Ze-Ming, Liu Bao-Xu, Jiang Zheng-Wei

Abstract:

With the increasing complexity of cyberspace security, the cyber-attack attribution has become an important challenge of the security protection systems. The difficult points of cyber-attack attribution were forced on the problems of huge data handling and key data missing. According to this situation, this paper presented a reasoning method of cyber-attack attribution based on threat intelligence. The method utilizes the intrusion kill chain model and Bayesian network to build attack chain and evidence chain of cyber-attack on threat intelligence platform through data calculation, analysis and reasoning. Then, we used a number of cyber-attack events which we have observed and analyzed to test the reasoning method and demo system, the result of testing indicates that the reasoning method can provide certain help in cyber-attack attribution.

Keywords: Reasoning, Bayesian networks, cyber-attack attribution, kill chain, threat intelligence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2621
361 A Tree Based Association Rule Approach for XML Data with Semantic Integration

Authors: D. Sasikala, K. Premalatha

Abstract:

The use of eXtensible Markup Language (XML) in web, business and scientific databases lead to the development of methods, techniques and systems to manage and analyze XML data. Semi-structured documents suffer due to its heterogeneity and dimensionality. XML structure and content mining represent convergence for research in semi-structured data and text mining. As the information available on the internet grows drastically, extracting knowledge from XML documents becomes a harder task. Certainly, documents are often so large that the data set returned as answer to a query may also be very big to convey the required information. To improve the query answering, a Semantic Tree Based Association Rule (STAR) mining method is proposed. This method provides intentional information by considering the structure, content and the semantics of the content. The method is applied on Reuter’s dataset and the results show that the proposed method outperforms well.

Keywords: Semi--structured Document, Tree based Association Rule (TAR), Semantic Association Rule Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2292
360 Bayes Net Classifiers for Prediction of Renal Graft Status and Survival Period

Authors: Jiakai Li, Gursel Serpen, Steven Selman, Matt Franchetti, Mike Riesen, Cynthia Schneider

Abstract:

This paper presents the development of a Bayesian belief network classifier for prediction of graft status and survival period in renal transplantation using the patient profile information prior to the transplantation. The objective was to explore feasibility of developing a decision making tool for identifying the most suitable recipient among the candidate pool members. The dataset was compiled from the University of Toledo Medical Center Hospital patients as reported to the United Network Organ Sharing, and had 1228 patient records for the period covering 1987 through 2009. The Bayes net classifiers were developed using the Weka machine learning software workbench. Two separate classifiers were induced from the data set, one to predict the status of the graft as either failed or living, and a second classifier to predict the graft survival period. The classifier for graft status prediction performed very well with a prediction accuracy of 97.8% and true positive values of 0.967 and 0.988 for the living and failed classes, respectively. The second classifier to predict the graft survival period yielded a prediction accuracy of 68.2% and a true positive rate of 0.85 for the class representing those instances with kidneys failing during the first year following transplantation. Simulation results indicated that it is feasible to develop a successful Bayesian belief network classifier for prediction of graft status, but not the graft survival period, using the information in UNOS database.

Keywords: Bayesian network classifier, renal transplantation, graft survival period, United Network for Organ Sharing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2057
359 An Automatic Bayesian Classification System for File Format Selection

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the classification of an unstructured format description for identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection with just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format indentification method employs file format classifier and associated configurations to support digital preservation experts with an estimation of required file format. Our goal is to make use of a format specification knowledge base aggregated from a different Web sources in order to select file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert, the file format for his institution. The proposed methods facilitate the selection of file format and the quality of a digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises most common terms that are characteristic for all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results including probabilities are presented in the evaluation section.

Keywords: Data mining, digital libraries, digital preservation, file format.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1612
358 Experiments on Element and Document Statistics for XML Retrieval

Authors: Mohamed Ben Aouicha, Mohamed Tmar, Mohand Boughanem, Mohamed Abid

Abstract:

This paper presents an information retrieval model on XML documents based on tree matching. Queries and documents are represented by extended trees. An extended tree is built starting from the original tree, with additional weighted virtual links between each node and its indirect descendants allowing to directly reach each descendant. Therefore only one level separates between each node and its indirect descendants. This allows to compare the user query and the document with flexibility and with respect to the structural constraints of the query. The content of each node is very important to decide weither a document element is relevant or not, thus the content should be taken into account in the retrieval process. We separate between the structure-based and the content-based retrieval processes. The content-based score of each node is commonly based on the well-known Tf × Idf criteria. In this paper, we compare between this criteria and another one we call Tf × Ief. The comparison is based on some experiments into a dataset provided by INEX1 to show the effectiveness of our approach on one hand and those of both weighting functions on the other.

Keywords: XML retrieval, INEX, Tf × Idf, Tf × Ief

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1276
357 Tree Based Data Fusion Clustering Routing Algorithm for Illimitable Network Administration in Wireless Sensor Network

Authors: Y. Harold Robinson, M. Rajaram, E. Golden Julie, S. Balaji

Abstract:

In wireless sensor networks, locality and positioning information can be captured using Global Positioning System (GPS). This message can be congregated initially from spot to identify the system. Users can retrieve information of interest from a wireless sensor network (WSN) by injecting queries and gathering results from the mobile sink nodes. Routing is the progression of choosing optimal path in a mobile network. Intermediate node employs permutation of device nodes into teams and generating cluster heads that gather the data from entity cluster’s node and encourage the collective data to base station. WSNs are widely used for gathering data. Since sensors are power-constrained devices, it is quite vital for them to reduce the power utilization. A tree-based data fusion clustering routing algorithm (TBDFC) is used to reduce energy consumption in wireless device networks. Here, the nodes in a tree use the cluster formation, whereas the elevation of the tree is decided based on the distance of the member nodes to the cluster-head. Network simulation shows that this scheme improves the power utilization by the nodes, and thus considerably improves the lifetime.

Keywords: WSN, TBDFC, LEACH, PEGASIS, TREEPSI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1073
356 Influence of Noise on the Inference of Dynamic Bayesian Networks from Short Time Series

Authors: Frank Emmert Streib, Matthias Dehmer, Gökhan H. Bakır, Max Mühlhauser

Abstract:

In this paper we investigate the influence of external noise on the inference of network structures. The purpose of our simulations is to gain insights in the experimental design of microarray experiments to infer, e.g., transcription regulatory networks from microarray experiments. Here external noise means, that the dynamics of the system under investigation, e.g., temporal changes of mRNA concentration, is affected by measurement errors. Additionally to external noise another problem occurs in the context of microarray experiments. Practically, it is not possible to monitor the mRNA concentration over an arbitrary long time period as demanded by the statistical methods used to learn the underlying network structure. For this reason, we use only short time series to make our simulations more biologically plausible.

Keywords: Dynamic Bayesian networks, structure learning, gene networks, Markov chain Monte Carlo, microarray data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1563
355 Application of Rapidly Exploring Random Tree Star-Smart and G2 Quintic Pythagorean Hodograph Curves to the UAV Path Planning Problem

Authors: Luiz G. Véras, Felipe L. Medeiros, Lamartine F. Guimarães

Abstract:

This work approaches the automatic planning of paths for Unmanned Aerial Vehicles (UAVs) through the application of the Rapidly Exploring Random Tree Star-Smart (RRT*-Smart) algorithm. RRT*-Smart is a sampling process of positions of a navigation environment through a tree-type graph. The algorithm consists of randomly expanding a tree from an initial position (root node) until one of its branches reaches the final position of the path to be planned. The algorithm ensures the planning of the shortest path, considering the number of iterations tending to infinity. When a new node is inserted into the tree, each neighbor node of the new node is connected to it, if and only if the extension of the path between the root node and that neighbor node, with this new connection, is less than the current extension of the path between those two nodes. RRT*-smart uses an intelligent sampling strategy to plan less extensive routes by spending a smaller number of iterations. This strategy is based on the creation of samples/nodes near to the convex vertices of the navigation environment obstacles. The planned paths are smoothed through the application of the method called quintic pythagorean hodograph curves. The smoothing process converts a route into a dynamically-viable one based on the kinematic constraints of the vehicle. This smoothing method models the hodograph components of a curve with polynomials that obey the Pythagorean Theorem. Its advantage is that the obtained structure allows computation of the curve length in an exact way, without the need for quadratural techniques for the resolution of integrals.

Keywords: Path planning, path smoothing, Pythagorean hodograph curve, RRT*-Smart.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 838
354 Algorithm for Path Recognition in-between Tree Rows for Agricultural Wheeled-Mobile Robots

Authors: Anderson Rocha, Pedro Miguel de Figueiredo Dinis Oliveira Gaspar

Abstract:

Machine vision has been widely used in recent years in agriculture, as a tool to promote the automation of processes and increase the levels of productivity. The aim of this work is the development of a path recognition algorithm based on image processing to guide a terrestrial robot in-between tree rows. The proposed algorithm was developed using the software MATLAB, and it uses several image processing operations, such as threshold detection, morphological erosion, histogram equalization and the Hough transform, to find edge lines along tree rows on an image and to create a path to be followed by a mobile robot. To develop the algorithm, a set of images of different types of orchards was used, which made possible the construction of a method capable of identifying paths between trees of different heights and aspects. The algorithm was evaluated using several images with different characteristics of quality and the results showed that the proposed method can successfully detect a path in different types of environments.

Keywords: Agricultural mobile robot, image processing, path recognition, Hough transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1717
353 Data Oriented Modeling of Uniform Random Variable: Applied Approach

Authors: Ahmad Habibizad Navin, Mehdi Naghian Fesharaki, Mirkamal Mirnia, Mohamad Teshnelab, Ehsan Shahamatnia

Abstract:

In this paper we introduce new data oriented modeling of uniform random variable well-matched with computing systems. Due to this conformity with current computers structure, this modeling will be efficiently used in statistical inference.

Keywords: Uniform random variable, Data oriented modeling, Statistical inference, Prodigraph, Statistically complete tree, Uniformdigital probability digraph, Uniform n-complete probability tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1579
352 Impact of Exchange Rate on Macroeconomic Indicators

Authors: Aleksandre Ergeshidze

Abstract:

The exchange rate is a pivotal pricing instrument that simultaneously impacts various components of the economy. Depreciation of nominal exchange rate is export promoting, which might be a desired export-led growth policy, and particularly critical to closing-down the widening current account imbalance. However, negative effects resulting from high dollarization and high share of imported intermediate inputs can outweigh positive effect. The aim of this research is to quantify impact of change in nominal exchange rate and test contractionary depreciation hypothesis on Georgian economy using structural and Bayesian vector autoregression. According to the acquired results, appreciation of nominal exchange rate is expected to decrease inflation, monetary policy rate, interest rate on domestic currency loans and economic growth in the medium run; however, impact on economic growth in the short run is statistically not significant.

Keywords: Bayesian vector autoregression, contractionary depreciation, dollarization, nominal exchange rate, structural vector autoregression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1126
351 Real-Time Testing of Steel Strip Welds based on Bayesian Decision Theory

Authors: Julio Molleda, Daniel F. García, Juan C. Granda, Francisco J. Suárez

Abstract:

One of the main trouble in a steel strip manufacturing line is the breakage of whatever weld carried out between steel coils, that are used to produce the continuous strip to be processed. A weld breakage results in a several hours stop of the manufacturing line. In this process the damages caused by the breakage must be repaired. After the reparation and in order to go on with the production it will be necessary a restarting process of the line. For minimizing this problem, a human operator must inspect visually and manually each weld in order to avoid its breakage during the manufacturing process. The work presented in this paper is based on the Bayesian decision theory and it presents an approach to detect, on real-time, steel strip defective welds. This approach is based on quantifying the tradeoffs between various classification decisions using probability and the costs that accompany such decisions.

Keywords: Classification, Pattern Recognition, ProbabilisticReasoning, Statistical Data Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1364
350 The Influence of Forest Management Histories on Dead Wood and Habitat Trees in the Old Growth Forest in Northern Iran

Authors: Kiomars Sefidi

Abstract:

Dead wood and habitat tree such as fallen logs, snags, stumps and cracks and loos bark etc. are regarded as an important ecological component of forests on which many forest dwelling species depend on presence of them within forest ecosystems. Meanwhile its relation to management history in Caspian forest has gone unreported. The aim of research was to compare the amounts of dead wood and habitat trees in the forests with historically different intensities of management, including: forests with the long term implication of management (PS), the short term implication of management (NS) which were compared with semi virgin forest (GS). The number of 405 individual dead and habitat trees were recorded and measured at 109 sampling locations. ANOVA revealed volume of dead tree in the form and decay classes significantly differ within sites and dead volume in the semi virgin forest significantly higher than managed sites. Comparing the amount of dead and habitat tree in three sites showed that, dead tree volume related with management history and significantly differ in three study sites. Meanwhile, frequency of habitat trees was significantly different within sites. The highest amount of habitat trees including cavities, cracks and loose bark and fork split trees was recorded in virgin site and lowest recorded in the sites with the long term implication of management. It can be concluded that forest management cause reduction of the amount of dead and habitat tree specially in a large size, thus managing this forest according to ecological sustainable principles require a commitment to maintaining stand structure that allow, continued generation of dead trees in a full range of size.

Keywords: Cracks trees, forest biodiversity, fork split trees, nature conservation, sustainable management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1685
349 Attacks Classification in Adaptive Intrusion Detection using Decision Tree

Authors: Dewan Md. Farid, Nouria Harbi, Emna Bahri, Mohammad Zahidur Rahman, Chowdhury Mofizur Rahman

Abstract:

Recently, information security has become a key issue in information technology as the number of computer security breaches are exposed to an increasing number of security threats. A variety of intrusion detection systems (IDS) have been employed for protecting computers and networks from malicious network-based or host-based attacks by using traditional statistical methods to new data mining approaches in last decades. However, today's commercially available intrusion detection systems are signature-based that are not capable of detecting unknown attacks. In this paper, we present a new learning algorithm for anomaly based network intrusion detection system using decision tree algorithm that distinguishes attacks from normal behaviors and identifies different types of intrusions. Experimental results on the KDD99 benchmark network intrusion detection dataset demonstrate that the proposed learning algorithm achieved 98% detection rate (DR) in comparison with other existing methods.

Keywords: Detection rate, decision tree, intrusion detectionsystem, network security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3559
348 Spatio-Temporal Analysis and Mapping of Malaria in Thailand

Authors: Krisada Lekdee, Sunee Sammatat, Nittaya Boonsit

Abstract:

This paper proposes a GLMM with spatial and temporal effects for malaria data in Thailand. A Bayesian method is used for parameter estimation via Gibbs sampling MCMC. A conditional autoregressive (CAR) model is assumed to present the spatial effects. The temporal correlation is presented through the covariance matrix of the random effects. The malaria quarterly data have been extracted from the Bureau of Epidemiology, Ministry of Public Health of Thailand. The factors considered are rainfall and temperature. The result shows that rainfall and temperature are positively related to the malaria morbidity rate. The posterior means of the estimated morbidity rates are used to construct the malaria maps. The top 5 highest morbidity rates (per 100,000 population) are in Trat (Q3, 111.70), Chiang Mai (Q3, 104.70), Narathiwat (Q4, 97.69), Chiang Mai (Q2, 88.51), and Chanthaburi (Q3, 86.82). According to the DIC criterion, the proposed model has a better performance than the GLMM with spatial effects but without temporal terms.

Keywords: Bayesian method, generalized linear mixed model (GLMM), malaria, spatial effects, temporal correlation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2101
347 Development of an Automatic Calibration Framework for Hydrologic Modelling Using Approximate Bayesian Computation

Authors: A. Chowdhury, P. Egodawatta, J. M. McGree, A. Goonetilleke

Abstract:

Hydrologic models are increasingly used as tools to predict stormwater quantity and quality from urban catchments. However, due to a range of practical issues, most models produce gross errors in simulating complex hydraulic and hydrologic systems. Difficulty in finding a robust approach for model calibration is one of the main issues. Though automatic calibration techniques are available, they are rarely used in common commercial hydraulic and hydrologic modelling software e.g. MIKE URBAN. This is partly due to the need for a large number of parameters and large datasets in the calibration process. To overcome this practical issue, a framework for automatic calibration of a hydrologic model was developed in R platform and presented in this paper. The model was developed based on the time-area conceptualization. Four calibration parameters, including initial loss, reduction factor, time of concentration and time-lag were considered as the primary set of parameters. Using these parameters, automatic calibration was performed using Approximate Bayesian Computation (ABC). ABC is a simulation-based technique for performing Bayesian inference when the likelihood is intractable or computationally expensive to compute. To test the performance and usefulness, the technique was used to simulate three small catchments in Gold Coast. For comparison, simulation outcomes from the same three catchments using commercial modelling software, MIKE URBAN were used. The graphical comparison shows strong agreement of MIKE URBAN result within the upper and lower 95% credible intervals of posterior predictions as obtained via ABC. Statistical validation for posterior predictions of runoff result using coefficient of determination (CD), root mean square error (RMSE) and maximum error (ME) was found reasonable for three study catchments. The main benefit of using ABC over MIKE URBAN is that ABC provides a posterior distribution for runoff flow prediction, and therefore associated uncertainty in predictions can be obtained. In contrast, MIKE URBAN just provides a point estimate. Based on the results of the analysis, it appears as though ABC the developed framework performs well for automatic calibration.

Keywords: Automatic calibration framework, approximate Bayesian computation, hydrologic and hydraulic modelling, MIKE URBAN software, R platform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1664
346 Web Personalization to Build Trust in E-Commerce: A Design Science Approach

Authors: Choon Ling Sia, Yani Shi, Jiaqi Yan, Huaping Chen

Abstract:

With the development of the Internet, E-commerce is growing at an exponential rate, and lots of online stores are built up to sell their goods online. A major factor influencing the successful adoption of E-commerce is consumer-s trust. For new or unknown Internet business, consumers- lack of trust has been cited as a major barrier to its proliferation. As web sites provide key interface for consumer use of E-Commerce, we investigate the design of web site to build trust in E-Commerce from a design science approach. A conceptual model is proposed in this paper to describe the ontology of online transaction and human-computer interaction. Based on this conceptual model, we provide a personalized webpage design approach using Bayesian networks learning method. Experimental evaluation are designed to show the effectiveness of web personalization in improving consumer-s trust in new or unknown online store.

Keywords: Trust, Web site design, Human-ComputerInteraction, E-Commerce, Design science, Bayesian network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1952
345 Architectural Stratification and Woody Species Diversity of a Subtropical Forest Grown in a Limestone Habitat in Okinawa Island, Japan

Authors: S. M. Feroz, K. Yoshimura, A. Hagihara

Abstract:

The forest stand consisted of four layers. The species composition between the third and the bottom layers was almost similar, whereas it was almost exclusive between the top and the lower three layers. The values of Shannon-s index H' and Pielou-s index J ' tended to increase from the bottom layer upward, except for H' -value of the top layer. The values of H' and J ' were 4.21 bit and 0.73, respectively, for the total stand. High woody species diversity of the forest depended on large trees in the upper layers, which trend was different from a subtropical evergreen broadleaf forest grown in silicate habitat in the northern part of Okinawa Island. The spatial distribution of trees was overlapped between the third and the bottom layers, whereas it was independent or slightly exclusive between the top and the lower three layers. Mean tree weight of each layer decreased from the top toward the bottom layer, whereas the corresponding tree density increased from the top downward. This relationship was analogous to the process of self-thinning plant populations.

Keywords: Canopy multi-layering, limestone habitat, mean tree weight-density relationship, species diversity, subtropical forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1178
344 Synthesis of Analogue to Camptothecine

Authors: Abdulkareem Hamid, Adam Daïch

Abstract:

Camptothecin (CPT) is a cytotoxic quinoline alkaloid, which inhibits the DNA enzyme topoisomerase I (topo I). It was discovered in 1966 by M. E. Wall and M. C. Wani in systematic screening of natural products for anticancer drugs. It was isolated from the bark and stem of Camptotheca acuminata (Camptotheca, Happy tree), a tree native in China. CPT showed remarkable anticancer activity in preliminary clinical trials but also low solubility and (high) adverse drug reaction. Because of these disadvantages synthetic and medicinal chemists have developed numerous syntheses of Camptothecine [1][2][3] and various derivatives to increase the benefits of the chemical, with good results. In our method CPT analogues has be six steps starting from available material DL Malic acid.

Keywords: Camptothecine, synthesis, analogue.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1565
343 Tree Based Data Aggregation to Resolve Funneling Effect in Wireless Sensor Network

Authors: G. Rajesh, B. Vinayaga Sundaram, C. Aarthi

Abstract:

In wireless sensor network, sensor node transmits the sensed data to the sink node in multi-hop communication periodically. This high traffic induces congestion at the node which is present one-hop distance to the sink node. The packet transmission and reception rate of these nodes should be very high, when compared to other sensor nodes in the network. Therefore, the energy consumption of that node is very high and this effect is known as the “funneling effect”. The tree based-data aggregation technique (TBDA) is used to reduce the energy consumption of the node. The throughput of the overall performance shows a considerable decrease in the number of packet transmissions to the sink node. The proposed scheme, TBDA, avoids the funneling effect and extends the lifetime of the wireless sensor network. The average case time complexity for inserting the node in the tree is O(n log n) and for the worst case time complexity is O(n2).

Keywords: Data Aggregation, Funneling Effect, Traffic Congestion, Wireless Sensor Network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1269
342 Hybrid Anomaly Detection Using Decision Tree and Support Vector Machine

Authors: Elham Serkani, Hossein Gharaee Garakani, Naser Mohammadzadeh, Elaheh Vaezpour

Abstract:

Intrusion detection systems (IDS) are the main components of network security. These systems analyze the network events for intrusion detection. The design of an IDS is through the training of normal traffic data or attack. The methods of machine learning are the best ways to design IDSs. In the method presented in this article, the pruning algorithm of C5.0 decision tree is being used to reduce the features of traffic data used and training IDS by the least square vector algorithm (LS-SVM). Then, the remaining features are arranged according to the predictor importance criterion. The least important features are eliminated in the order. The remaining features of this stage, which have created the highest level of accuracy in LS-SVM, are selected as the final features. The features obtained, compared to other similar articles which have examined the selected features in the least squared support vector machine model, are better in the accuracy, true positive rate, and false positive. The results are tested by the UNSW-NB15 dataset.

Keywords: Intrusion detection system, decision tree, support vector machine, feature selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1177
341 Segmentation of Piecewise Polynomial Regression Model by Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

Piecewise polynomial regression model is very flexible model for modeling the data. If the piecewise polynomial regression model is matched against the data, its parameters are not generally known. This paper studies the parameter estimation problem of piecewise polynomial regression model. The method which is used to estimate the parameters of the piecewise polynomial regression model is Bayesian method. Unfortunately, the Bayes estimator cannot be found analytically. Reversible jump MCMC algorithm is proposed to solve this problem. Reversible jump MCMC algorithm generates the Markov chain that converges to the limit distribution of the posterior distribution of piecewise polynomial regression model parameter. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of piecewise polynomial regression model.

Keywords: Piecewise, Bayesian, reversible jump MCMC, segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1623
340 Speaker Identification by Joint Statistical Characterization in the Log Gabor Wavelet Domain

Authors: Suman Senapati, Goutam Saha

Abstract:

Real world Speaker Identification (SI) application differs from ideal or laboratory conditions causing perturbations that leads to a mismatch between the training and testing environment and degrade the performance drastically. Many strategies have been adopted to cope with acoustical degradation; wavelet based Bayesian marginal model is one of them. But Bayesian marginal models cannot model the inter-scale statistical dependencies of different wavelet scales. Simple nonlinear estimators for wavelet based denoising assume that the wavelet coefficients in different scales are independent in nature. However wavelet coefficients have significant inter-scale dependency. This paper enhances this inter-scale dependency property by a Circularly Symmetric Probability Density Function (CS-PDF) related to the family of Spherically Invariant Random Processes (SIRPs) in Log Gabor Wavelet (LGW) domain and corresponding joint shrinkage estimator is derived by Maximum a Posteriori (MAP) estimator. A framework is proposed based on these to denoise speech signal for automatic speaker identification problems. The robustness of the proposed framework is tested for Text Independent Speaker Identification application on 100 speakers of POLYCOST and 100 speakers of YOHO speech database in three different noise environments. Experimental results show that the proposed estimator yields a higher improvement in identification accuracy compared to other estimators on popular Gaussian Mixture Model (GMM) based speaker model and Mel-Frequency Cepstral Coefficient (MFCC) features.

Keywords: Speaker Identification, Log Gabor Wavelet, Bayesian Bivariate Estimator, Circularly Symmetric Probability Density Function, SIRP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1591