Search results for: binary classification tree
1191 A Spanning Tree for Enhanced Cluster Based Routing in Wireless Sensor Network
Authors: M. Saravanan, M. Madheswaran
Abstract:
Wireless Sensor Network (WSN) clustering architecture enables features like network scalability, communication overhead reduction, and fault tolerance. After clustering, aggregated data is transferred to data sink and reducing unnecessary, redundant data transfer. It reduces nodes transmitting, and so saves energy consumption. Also, it allows scalability for many nodes, reduces communication overhead, and allows efficient use of WSN resources. Clustering based routing methods manage network energy consumption efficiently. Building spanning trees for data collection rooted at a sink node is a fundamental data aggregation method in sensor networks. The problem of determining Cluster Head (CH) optimal number is an NP-Hard problem. In this paper, we combine cluster based routing features for cluster formation and CH selection and use Minimum Spanning Tree (MST) for intra-cluster communication. The proposed method is based on optimizing MST using Simulated Annealing (SA). In this work, normalized values of mobility, delay, and remaining energy are considered for finding optimal MST. Simulation results demonstrate the effectiveness of the proposed method in improving the packet delivery ratio and reducing the end to end delay.
Keywords: Wireless sensor network, clustering, minimum spanning tree, genetic algorithm, low energy adaptive clustering hierarchy, simulated annealing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17861190 Allometric Models for Biomass Estimation in Savanna Woodland Area, Niger State, Nigeria
Authors: Abdullahi Jibrin, Aishetu Abdulkadir
Abstract:
The development of allometric models is crucial to accurate forest biomass/carbon stock assessment. The aim of this study was to develop a set of biomass prediction models that will enable the determination of total tree aboveground biomass for savannah woodland area in Niger State, Nigeria. Based on the data collected through biometric measurements of 1816 trees and destructive sampling of 36 trees, five species specific and one site specific models were developed. The sample size was distributed equally between the five most dominant species in the study site (Vitellaria paradoxa, Irvingia gabonensis, Parkia biglobosa, Anogeissus leiocarpus, Pterocarpus erinaceous). Firstly, the equations were developed for five individual species. Secondly these five species were mixed and were used to develop an allometric equation of mixed species. Overall, there was a strong positive relationship between total tree biomass and the stem diameter. The coefficient of determination (R2 values) ranging from 0.93 to 0.99 P < 0.001 were realised for the models; with considerable low standard error of the estimates (SEE) which confirms that the total tree above ground biomass has a significant relationship with the dbh. F-test values for the biomass prediction models were also significant at p < 0.001 which indicates that the biomass prediction models are valid. This study recommends that for improved biomass estimates in the study site, the site specific biomass models should preferably be used instead of using generic models.
Keywords: Allometriy, biomass, carbon stock, model, regression equation, woodland, inventory.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27881189 A Comparative Study of Malware Detection Techniques Using Machine Learning Methods
Authors: Cristina Vatamanu, Doina Cosovan, Dragoş Gavriluţ, Henri Luchian
Abstract:
In the past few years, the amount of malicious software increased exponentially and, therefore, machine learning algorithms became instrumental in identifying clean and malware files through (semi)-automated classification. When working with very large datasets, the major challenge is to reach both a very high malware detection rate and a very low false positive rate. Another challenge is to minimize the time needed for the machine learning algorithm to do so. This paper presents a comparative study between different machine learning techniques such as linear classifiers, ensembles, decision trees or various hybrids thereof. The training dataset consists of approximately 2 million clean files and 200.000 infected files, which is a realistic quantitative mixture. The paper investigates the above mentioned methods with respect to both their performance (detection rate and false positive rate) and their practicability.Keywords: Detection Rate, False Positives, Perceptron, One Side Class, Ensembles, Decision Tree, Hybrid methods, Feature Selection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32801188 A CTL Specification of Serializability for Transactions Accessing Uniform Data
Authors: Rafat Alshorman, Walter Hussak
Abstract:
Existing work in temporal logic on representing the execution of infinitely many transactions, uses linear-time temporal logic (LTL) and only models two-step transactions. In this paper, we use the comparatively efficient branching-time computational tree logic CTL and extend the transaction model to a class of multistep transactions, by introducing distinguished propositional variables to represent the read and write steps of n multi-step transactions accessing m data items infinitely many times. We prove that the well known correspondence between acyclicity of conflict graphs and serializability for finite schedules, extends to infinite schedules. Furthermore, in the case of transactions accessing the same set of data items in (possibly) different orders, serializability corresponds to the absence of cycles of length two. This result is used to give an efficient encoding of the serializability condition into CTL.Keywords: computational tree logic, serializability, multi-step transactions.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11761187 Combining Fuzzy Logic and Data Miningto Predict the Result of an EIA Review
Authors: Kevin Fong-Rey Liu, Jia-Shen Chen, Han-Hsi Liang, Cheng-Wu Chen, Yung-Shuen Shen
Abstract:
The purpose of determining impact significance is to place value on impacts. Environmental impact assessment review is a process that judges whether impact significance is acceptable or not in accordance with the scientific facts regarding environmental, ecological and socio-economical impacts described in environmental impact statements (EIS) or environmental impact assessment reports (EIAR). The first aim of this paper is to summarize the criteria of significance evaluation from the past review results and accordingly utilize fuzzy logic to incorporate these criteria into scientific facts. The second aim is to employ data mining technique to construct an EIS or EIAR prediction model for reviewing results which can assist developers to prepare and revise better environmental management plans in advance. The validity of the previous prediction model proposed by authors in 2009 is 92.7%. The enhanced validity in this study can attain 100.0%.Keywords: Environmental impact assessment review, impactsignificance, fuzzy logic, data mining, classification tree.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19441186 Medical Image Segmentation Using Deformable Model and Local Fitting Binary: Thoracic Aorta
Authors: B. Bagheri Nakhjavanlo, T. S. Ellis, P.Raoofi, Sh.ziari
Abstract:
This paper presents an application of level sets for the segmentation of abdominal and thoracic aortic aneurysms in CTA datasets. An important challenge in reliably detecting aortic is the need to overcome problems associated with intensity inhomogeneities. Level sets are part of an important class of methods that utilize partial differential equations (PDEs) and have been extensively applied in image segmentation. A kernel function in the level set formulation aids the suppression of noise in the extracted regions of interest and then guides the motion of the evolving contour for the detection of weak boundaries. The speed of curve evolution has been significantly improved with a resulting decrease in segmentation time compared with previous implementations of level sets, and are shown to be more effective than other approaches in coping with intensity inhomogeneities. We have applied the Courant Friedrichs Levy (CFL) condition as stability criterion for our algorithm.Keywords: Image segmentation, Level-sets, Local fitting binary, Thoracic aorta.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14561185 A New Classification of Risk-Reduction Options to Improve the Risk-Reduction Readiness of the Railway Industry
Authors: Eberechi Weli, Michael Todinov
Abstract:
The gap between the selection of risk-reduction options in the railway industry and the task of their effective implementation results in compromised safety and substantial losses. An effective risk management must necessarily integrate the evaluation phases with the implementation phase. This paper proposes an essential categorisation of risk reduction measures that best addresses a standard railway industry portfolio. By categorising the risk reduction options into design, operational, procedural and technical options, it is guaranteed that the efforts of the implementation facilitators (people, processes and supporting systems) are systematically harmonised. The classification is based on an integration of fundamental principles of risk reduction in the railway industry with the systems engineering approach.
This paper argues that the use of a similar classification approach is an attribute of organisations possessing a superior level of risk-reduction readiness. The integration of the proposed rational classification structure provides a solid ground for effective risk reduction.
Keywords: Cost effectiveness, organisational readiness, risk reduction, railway, system engineering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18021184 Characterisation and Classification of Natural Transients
Authors: Ernst D. Schmitter
Abstract:
Monitoring lightning electromagnetic pulses (sferics) and other terrestrial as well as extraterrestrial transient radiation signals is of considerable interest for practical and theoretical purposes in astro- and geophysics as well as meteorology. Managing a continuous flow of data, automisation of the detection and classification process is important. Features based on a combination of wavelet and statistical methods proved efficient for analysis and characterisation of transients and as input into a radial basis function network that is trained to discriminate transients from pulse like to wave like.Keywords: transient signals, statistics, wavelets, neural networks
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14501183 Development of Low-cost OCDMA Encoder Based On Arrayed Waveguide Gratings(AWGs) and Optical Switches
Authors: Mohammad Syuhaimi Ab-Rahman, Boon Chuan Ng, Norshilawati Mohamad Ibrahim, Sahbudin Shaari
Abstract:
This paper describes the development of a 16-ports optical code division multiple access (OCDMA) encoder prototype based on Arrayed Waveguide Grating (AWG) and optical switches. It is potentially to provide a high security for data transmission due to all data will be transmitted in binary code form. The output signals from AWG are coded with a binary code that given to an optical switch before it signal modulate with the carrier and transmitted to the receiver. The 16-ports encoder used 16 double pole double throw (DPDT) toggle switches to control the polarization of voltage source from +5 V to -5 V for 16 optical switches. When +5 V is given, the optical switch will give code '1' and vice versa. The experimental results showed the insertion loss, crosstalk, uniformity, and optical signal-noise-ratio (OSNR) for the developed prototype are <12 dB, 9.77 dB, <1.63dB, and ≥20dB.
Keywords: AWG, encoder, OCDMA, optical switch.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15811182 Firing Angle Range Control For Minimising Harmonics in TCR Employed in SVC-s
Authors: D. R. Patil, U. Gudaru
Abstract:
Most electrical distribution systems are incurring large losses as the loads are wide spread, inadequate reactive power compensation facilities and their improper control. A typical static VAR compensator consists of capacitor bank in binary sequential steps operated in conjunction with a thyristor controlled reactor of the smallest step size. This SVC facilitates stepless control of reactive power closely matching with load requirements so as to maintain power factor nearer to unity. This type of SVC-s requiring a appropriately controlled TCR. This paper deals with an air cored reactor suitable for distribution transformer of 3phase, 50Hz, Dy11, 11KV/433V, 125 KVA capacity. Air cored reactors are designed, built, tested and operated in conjunction with capacitor bank in five binary sequential steps. It is established how the delta connected TCR minimizes the harmonic components and the operating range for various electrical quantities as a function of firing angle is investigated. In particular firing angle v/s line & phase currents, D.C. components, THD-s, active and reactive powers, odd and even triplen harmonics, dominant characteristic harmonics are all investigated and range of firing angle is fixed for satisfactory operation. The harmonic spectra for phase and line quantities at specified firing angles are given. In case the TCR is operated within the bound specified in this paper established through simulation studies are yielding the best possible operating condition particularly free from all dominant harmonics.Keywords: Binary Sequential switched capacitor bank, TCR, Nontriplen harmonics, step less Q control, Active and Reactivepower, Simulink
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 59961181 Dynamic Load Balancing Strategy for Grid Computing
Authors: Belabbas Yagoubi, Yahya Slimani
Abstract:
Workload and resource management are two essential functions provided at the service level of the grid software infrastructure. To improve the global throughput of these software environments, workloads have to be evenly scheduled among the available resources. To realize this goal several load balancing strategies and algorithms have been proposed. Most strategies were developed in mind, assuming homogeneous set of sites linked with homogeneous and fast networks. However for computational grids we must address main new issues, namely: heterogeneity, scalability and adaptability. In this paper, we propose a layered algorithm which achieve dynamic load balancing in grid computing. Based on a tree model, our algorithm presents the following main features: (i) it is layered; (ii) it supports heterogeneity and scalability; and, (iii) it is totally independent from any physical architecture of a grid.
Keywords: Grid computing, load balancing, workload, tree based model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31381180 Estimation of Skew Angle in Binary Document Images Using Hough Transform
Authors: Nandini N., Srikanta Murthy K., G. Hemantha Kumar
Abstract:
This paper includes two novel techniques for skew estimation of binary document images. These algorithms are based on connected component analysis and Hough transform. Both these methods focus on reducing the amount of input data provided to Hough transform. In the first method, referred as word centroid approach, the centroids of selected words are used for skew detection. In the second method, referred as dilate & thin approach, the selected characters are blocked and dilated to get word blocks and later thinning is applied. The final image fed to Hough transform has the thinned coordinates of word blocks in the image. The methods have been successful in reducing the computational complexity of Hough transform based skew estimation algorithms. Promising experimental results are also provided to prove the effectiveness of the proposed methods.Keywords: Dilation, Document processing, Hough transform, Optical Character Recognition, Skew estimation, and Thinning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32661179 Automatic Classification of Lung Diseases from CT Images
Authors: Abobaker Mohammed Qasem Farhan, Shangming Yang, Mohammed Al-Nehari
Abstract:
Pneumonia is a kind of lung disease that creates congestion in the chest. Such pneumonic conditions lead to loss of life due to the severity of high congestion. Pneumonic lung disease is caused by viral pneumonia, bacterial pneumonia, or COVID-19 induced pneumonia. The early prediction and classification of such lung diseases help reduce the mortality rate. We propose the automatic Computer-Aided Diagnosis (CAD) system in this paper using the deep learning approach. The proposed CAD system takes input from raw computerized tomography (CT) scans of the patient's chest and automatically predicts disease classification. We designed the Hybrid Deep Learning Algorithm (HDLA) to improve accuracy and reduce processing requirements. The raw CT scans are pre-processed first to enhance their quality for further analysis. We then applied a hybrid model that consists of automatic feature extraction and classification. We propose the robust 2D Convolutional Neural Network (CNN) model to extract the automatic features from the pre-processed CT image. This CNN model assures feature learning with extremely effective 1D feature extraction for each input CT image. The outcome of the 2D CNN model is then normalized using the Min-Max technique. The second step of the proposed hybrid model is related to training and classification using different classifiers. The simulation outcomes using the publicly available dataset prove the robustness and efficiency of the proposed model compared to state-of-art algorithms.
Keywords: CT scans, COVID-19, deep learning, image processing, pneumonia, lung disease.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6101178 Urban Land Cover Change of Olomouc City Using LANDSAT Images
Authors: Miloš Marjanović, Jaroslav Burian, Ja kub Miřijovský, Jan Harbula
Abstract:
This paper regards the phenomena of intensive suburbanization and urbanization in Olomouc city and in Olomouc region in general for the period of 1986–2009. A Remote Sensing approach that involves tracking of changes in Land Cover units is proposed to quantify the urbanization state and trends in temporal and spatial aspects. It actually consisted of two approaches, Experiment 1 and Experiment 2 which implied two different image classification solutions in order to provide Land Cover maps for each 1986–2009 time split available in the Landsat image set. Experiment 1 dealt with the unsupervised classification, while Experiment 2 involved semi- supervised classification, using a combination of object-based and pixel-based classifiers. The resulting Land Cover maps were subsequently quantified for the proportion of urban area unit and its trend through time, and also for the urban area unit stability, yielding the relation of spatial and temporal development of the urban area unit. Some outcomes seem promising but there is indisputably room for improvements of source data and also processing and filtering.
Keywords: Change detection, image classification, land cover, Landsat images, Olomouc city, urbanization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18301177 An Improved Preprocessing for Biosonar Target Classification
Authors: Turgay Temel, John Hallam
Abstract:
An improved processing description to be employed in biosonar signal processing in a cochlea model is proposed and examined. It is compared to conventional models using a modified discrimination analysis and both are tested. Their performances are evaluated with echo data captured from natural targets (trees).Results indicate that the phase characteristics of low-pass filters employed in the echo processing have a significant effect on class separability for this data.
Keywords: Cochlea model, discriminant analysis, neurospikecoding, classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14921176 PM10 Prediction and Forecasting Using CART: A Case Study for Pleven, Bulgaria
Authors: Snezhana G. Gocheva-Ilieva, Maya P. Stoimenova
Abstract:
Ambient air pollution with fine particulate matter (PM10) is a systematic permanent problem in many countries around the world. The accumulation of a large number of measurements of both the PM10 concentrations and the accompanying atmospheric factors allow for their statistical modeling to detect dependencies and forecast future pollution. This study applies the classification and regression trees (CART) method for building and analyzing PM10 models. In the empirical study, average daily air data for the city of Pleven, Bulgaria for a period of 5 years are used. Predictors in the models are seven meteorological variables, time variables, as well as lagged PM10 variables and some lagged meteorological variables, delayed by 1 or 2 days with respect to the initial time series, respectively. The degree of influence of the predictors in the models is determined. The selected best CART models are used to forecast future PM10 concentrations for two days ahead after the last date in the modeling procedure and show very accurate results.Keywords: Cross-validation, decision tree, lagged variables, short-term forecasting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7371175 Gene Expression Data Classification Using Discriminatively Regularized Sparse Subspace Learning
Authors: Chunming Xu
Abstract:
Sparse representation which can represent high dimensional data effectively has been successfully used in computer vision and pattern recognition problems. However, it doesn-t consider the label information of data samples. To overcome this limitation, we develop a novel dimensionality reduction algorithm namely dscriminatively regularized sparse subspace learning(DR-SSL) in this paper. The proposed DR-SSL algorithm can not only make use of the sparse representation to model the data, but also can effective employ the label information to guide the procedure of dimensionality reduction. In addition,the presented algorithm can effectively deal with the out-of-sample problem.The experiments on gene-expression data sets show that the proposed algorithm is an effective tool for dimensionality reduction and gene-expression data classification.Keywords: sparse representation, dimensionality reduction, labelinformation, sparse subspace learning, gene-expression data classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14471174 Gene Selection Guided by Feature Interdependence
Authors: Hung-Ming Lai, Andreas Albrecht, Kathleen Steinhöfel
Abstract:
Cancers could normally be marked by a number of differentially expressed genes which show enormous potential as biomarkers for a certain disease. Recent years, cancer classification based on the investigation of gene expression profiles derived by high-throughput microarrays has widely been used. The selection of discriminative genes is, therefore, an essential preprocess step in carcinogenesis studies. In this paper, we have proposed a novel gene selector using information-theoretic measures for biological discovery. This multivariate filter is a four-stage framework through the analyses of feature relevance, feature interdependence, feature redundancy-dependence and subset rankings, and having been examined on the colon cancer data set. Our experimental result show that the proposed method outperformed other information theorem based filters in all aspect of classification errors and classification performance.
Keywords: Colon cancer, feature interdependence, feature subset selection, gene selection, microarray data analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21441173 A New Weighted LDA Method in Comparison to Some Versions of LDA
Authors: Delaram Jarchi, Reza Boostani
Abstract:
Linear Discrimination Analysis (LDA) is a linear solution for classification of two classes. In this paper, we propose a variant LDA method for multi-class problem which redefines the between class and within class scatter matrices by incorporating a weight function into each of them. The aim is to separate classes as much as possible in a situation that one class is well separated from other classes, incidentally, that class must have a little influence on classification. It has been suggested to alleviate influence of classes that are well separated by adding a weight into between class scatter matrix and within class scatter matrix. To obtain a simple and effective weight function, ordinary LDA between every two classes has been used in order to find Fisher discrimination value and passed it as an input into two weight functions and redefined between class and within class scatter matrices. Experimental results showed that our new LDA method improved classification rate, on glass, iris and wine datasets, in comparison to different versions of LDA.Keywords: Discriminant vectors, weighted LDA, uncorrelation, principle components, Fisher-face method, Bootstarp method.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15231172 Classification of the Latin Alphabet as Pattern on ARToolkit Markers for Augmented Reality Applications
Authors: Mohamed Badeche, Mohamed Benmohammed
Abstract:
augmented reality is a technique used to insert virtual objects in real scenes. One of the most used libraries in the area is the ARToolkit library. It is based on the recognition of the markers that are in the form of squares with a pattern inside. This pattern which is mostly textual is source of confusing. In this paper, we present the results of a classification of Latin characters as a pattern on the ARToolkit markers to know the most distinguishable among them.
Keywords: ARToolkit library, augmented reality, K-means, patterns
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18401171 Genetic Algorithm for Feature Subset Selection with Exploitation of Feature Correlations from Continuous Wavelet Transform: a real-case Application
Authors: G. Van Dijck, M. M. Van Hulle, M. Wevers
Abstract:
A genetic algorithm (GA) based feature subset selection algorithm is proposed in which the correlation structure of the features is exploited. The subset of features is validated according to the classification performance. Features derived from the continuous wavelet transform are potentially strongly correlated. GA-s that do not take the correlation structure of features into account are inefficient. The proposed algorithm forms clusters of correlated features and searches for a good candidate set of clusters. Secondly a search within the clusters is performed. Different simulations of the algorithm on a real-case data set with strong correlations between features show the increased classification performance. Comparison is performed with a standard GA without use of the correlation structure.Keywords: Classification, genetic algorithm, hierarchicalagglomerative clustering, wavelet transform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12241170 Medical Image Segmentation Using Deformable Models and Local Fitting Binary
Authors: B. Bagheri Nakhjavanlo, T. J. Ellis, P. Raoofi, J. Dehmeshki
Abstract:
This paper presents a customized deformable model for the segmentation of abdominal and thoracic aortic aneurysms in CTA datasets. An important challenge in reliably detecting aortic aneurysm is the need to overcome problems associated with intensity inhomogeneities and image noise. Level sets are part of an important class of methods that utilize partial differential equations (PDEs) and have been extensively applied in image segmentation. A Gaussian kernel function in the level set formulation, which extracts the local intensity information, aids the suppression of noise in the extracted regions of interest and then guides the motion of the evolving contour for the detection of weak boundaries. The speed of curve evolution has been significantly improved with a resulting decrease in segmentation time compared with previous implementations of level sets. The results indicate the method is more effective than other approaches in coping with intensity inhomogeneities.Keywords: Abdominal and thoracic aortic aneurysms, intensityinhomogeneity, level sets, local fitting binary.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18161169 Mining of Interesting Prediction Rules with Uniform Two-Level Genetic Algorithm
Authors: Bilal Alatas, Ahmet Arslan
Abstract:
The main goal of data mining is to extract accurate, comprehensible and interesting knowledge from databases that may be considered as large search spaces. In this paper, a new, efficient type of Genetic Algorithm (GA) called uniform two-level GA is proposed as a search strategy to discover truly interesting, high-level prediction rules, a difficult problem and relatively little researched, rather than discovering classification knowledge as usual in the literatures. The proposed method uses the advantage of uniform population method and addresses the task of generalized rule induction that can be regarded as a generalization of the task of classification. Although the task of generalized rule induction requires a lot of computations, which is usually not satisfied with the normal algorithms, it was demonstrated that this method increased the performance of GAs and rapidly found interesting rules.
Keywords: Classification rule mining, data mining, genetic algorithms.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15941168 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients
Authors: Karina Zaccari, Ernesto Cordeiro Marujo
Abstract:
This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of a blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including: Adaptative Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy for training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.
Keywords: Machine learning, medical diagnosis, meningitis detection, gradient boosting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11101167 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles
Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis
Abstract:
Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.
Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21431166 Clustering in WSN Based on Minimum Spanning Tree Using Divide and Conquer Approach
Authors: Uttam Vijay, Nitin Gupta
Abstract:
Due to heavy energy constraints in WSNs clustering is an efficient way to manage the energy in sensors. There are many methods already proposed in the area of clustering and research is still going on to make clustering more energy efficient. In our paper we are proposing a minimum spanning tree based clustering using divide and conquer approach. The MST based clustering was first proposed in 1970’s for large databases. Here we are taking divide and conquer approach and implementing it for wireless sensor networks with the constraints attached to the sensor networks. This Divide and conquer approach is implemented in a way that we don’t have to construct the whole MST before clustering but we just find the edge which will be the part of the MST to a corresponding graph and divide the graph in clusters there itself if that edge from the graph can be removed judging on certain constraints and hence saving lot of computation.
Keywords: Algorithm, Clustering, Edge-Weighted Graph, Weighted-LEACH.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24751165 Computational Identification of MicroRNAs and their Targets in two Species of Evergreen Spruce Tree (Picea)
Authors: Muhammad Y.K. Barozai, Ifthikhar A. Baloch, M. Din
Abstract:
MicroRNAs (miRNAs) are small, non-coding and regulatory RNAs about 20 to 24 nucleotides long. Their conserved nature among the various organisms makes them a good source of new miRNAs discovery by comparative genomics approach. The study resulted in 21 miRNAs of 20 pre-miRNAs belonging to 16 families (miR156, 157, 158, 164, 165, 168, 169, 172, 319, 390, 393, 394, 395, 400, 472 and 861) in evergreen spruce tree (Picea). The miRNA families; miR 157, 158, 164, 165, 168, 169, 319, 390, 393, 394, 400, 472 and 861 are reported for the first time in the Picea. All 20 miRNA precursors form stable minimum free energy stem-loop structure as their orthologues form in Arabidopsis and the mature miRNA reside in the stem portion of the stem loop structure. Sixteen (16) miRNAs are from Picea glauca and five (5) belong to Picea sitchensis. Their targets consist of transcription factors, growth related, stressed related and hypothetical proteins.Keywords: BLAST, Comparative Genomics, Micro-RNAs, Spruce
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20531164 Adaptive Hierarchical Key Structure Generation for Key Management in Wireless Sensor Networks using A*
Authors: Jin Myoung Kim, Tae Ho Cho
Abstract:
Wireless Sensor networks have a wide spectrum of civil and military applications that call for secure communication such as the terrorist tracking, target surveillance in hostile environments. For the secure communication in these application areas, we propose a method for generating a hierarchical key structure for the efficient group key management. In this paper, we apply A* algorithm in generating a hierarchical key structure by considering the history data of the ratio of addition and eviction of sensor nodes in a location where sensor nodes are deployed. Thus generated key tree structure provides an efficient way of managing the group key in terms of energy consumption when addition and eviction event occurs. A* algorithm tries to minimize the number of messages needed for group key management by the history data. The experimentation with the tree shows efficiency of the proposed method.
Keywords: Heuristic search, key management, security, sensor network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16841163 Effects of Hidden Unit Sizes and Autoregressive Features in Mental Task Classification
Authors: Ramaswamy Palaniappan, Nai-Jen Huan
Abstract:
Classification of electroencephalogram (EEG) signals extracted during mental tasks is a technique that is actively pursued for Brain Computer Interfaces (BCI) designs. In this paper, we compared the classification performances of univariateautoregressive (AR) and multivariate autoregressive (MAR) models for representing EEG signals that were extracted during different mental tasks. Multilayer Perceptron (MLP) neural network (NN) trained by the backpropagation (BP) algorithm was used to classify these features into the different categories representing the mental tasks. Classification performances were also compared across different mental task combinations and 2 sets of hidden units (HU): 2 to 10 HU in steps of 2 and 20 to 100 HU in steps of 20. Five different mental tasks from 4 subjects were used in the experimental study and combinations of 2 different mental tasks were studied for each subject. Three different feature extraction methods with 6th order were used to extract features from these EEG signals: AR coefficients computed with Burg-s algorithm (ARBG), AR coefficients computed with stepwise least square algorithm (ARLS) and MAR coefficients computed with stepwise least square algorithm. The best results were obtained with 20 to 100 HU using ARBG. It is concluded that i) it is important to choose the suitable mental tasks for different individuals for a successful BCI design, ii) higher HU are more suitable and iii) ARBG is the most suitable feature extraction method.Keywords: Autoregressive, Brain-Computer Interface, Electroencephalogram, Neural Network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18031162 A Comprehensive Method of Fault Detection and Isolation Based On Testability Modeling Data
Authors: Junyou Shi, Weiwei Cui
Abstract:
Testability modeling is a commonly used method in testability design and analysis of system. A dependency matrix will be obtained from testability modeling, and we will give a quantitative evaluation about fault detection and isolation. Based on the dependency matrix, we can obtain the diagnosis tree. The tree provides the procedures of the fault detection and isolation. But the dependency matrix usually includes built-in test (BIT) and manual test in fact. BIT runs the test automatically and is not limited by the procedures. The method above cannot give a more efficient diagnosis and use the advantages of the BIT. A Comprehensive method of fault detection and isolation is proposed. This method combines the advantages of the BIT and Manual test by splitting the matrix. The result of the case study shows that the method is effective.
Keywords: BIT, fault detection, fault isolation, testability modeling.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1666