Search results for: random forest classifier

797 Brain Image Segmentation Using Conditional Random Field Based On Modified Artificial Bee Colony Optimization Algorithm

Authors: B. Thiagarajan, R. Bremananth

Abstract:

A tumor is an uncontrolled growth of tissue in any part of the body. Tumors are of different types, with different characteristics and treatments. A brain tumor is inherently serious and life-threatening because it grows within the limited space of the intracranial cavity (the space formed inside the skull). Locating the tumor within a magnetic resonance (MR) image of the brain is an integral part of brain tumor treatment. This segmentation task requires classifying each voxel as either tumor or non-tumor, based on the description of the voxel under consideration. Many studies in the medical field use Markov Random Fields (MRF) to segment MR images. Even though the segmentation results are better, computing the probabilities and estimating the parameters is difficult. To overcome these issues, this paper uses a Conditional Random Field (CRF) for segmentation, along with modified artificial bee colony optimization and the modified fuzzy possibility c-means (MFPCM) algorithm. The work focuses mainly on reducing the computational complexity found in existing methods and aims at higher accuracy. Its efficiency is evaluated using region non-uniformity, correlation, and computation time. The experimental results are compared with existing methods such as MRF with an improved Genetic Algorithm (GA) and the MRF-Artificial Bee Colony (MRF-ABC) algorithm.

Keywords: Conditional random field, Magnetic resonance, Markov random field, Modified artificial bee colony.

Downloads: 2948

796 A Nano-Scaled SRAM Guard Band Design with Gaussian Mixtures Model of Complex Long Tail RTN Distributions

Authors: Worawit Somha, Hiroyuki Yamauchi

Abstract:

This paper proposes, for the first time, how to address the challenges facing guard-band designs, including the margin assist-circuit scheme for the screening test, in the coming process generations. The impact of increased screening errors is discussed based on the proposed statistical analysis models. It is shown that the yield loss caused by misjudgment in the screening test would become 5 orders of magnitude larger than that of the conventional one when the amplitude of the variations caused by random telegraph noise (RTN) approaches that of random dopant fluctuation. Three fitting methods that approximate the complex Gamma mixture distributions caused by RTN with a simple Gaussian mixture model (GMM) are proposed and compared. It has been verified that the proposed methods can reduce the error of the fail-bit predictions by 4 orders of magnitude.

Keywords: Mixtures of Gaussian, Random telegraph noise, EM algorithm, Long-tail distribution, Fail-bit analysis, Static random access memory, Guard band design.

Downloads: 1841

795 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients

Authors: Karina Zaccari, Ernesto Cordeiro Marujo

Abstract:

This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including Adaptive Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest, and Support Vector Machines (SVM). The Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy on the training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in sparing some children expensive and painful procedures.

Keywords: Machine learning, medical diagnosis, meningitis detection, gradient boosting.

Downloads: 1110

794 Estimating 3D-Position of A Stationary Random Acoustic Source Using Bispectral Analysis of 4-Point Detected Signals

Authors: Katsumi Hirata

Abstract:

To develop a useful acoustic environmental recognition system, a method of estimating the 3D position of a stationary random acoustic source using bispectral analysis of 4-point detected signals is proposed. The method uses information about amplitude attenuation and propagation delay extracted from the amplitude ratios and angles of the auto- and cross-bispectra of the detected signals. Bispectral analysis is expected to be less influenced by Gaussian noise than conventional power spectral analysis. In this paper, the basic principle of the method is described first, and its validity and features are then examined using the results of fundamental experiments that assume ideal circumstances.

Keywords: 4-point detection, a stationary random acoustic source, auto- and cross-bispectra, estimation of 3D-position.

Downloads: 1437

793 SNR Classification Using Multiple CNNs

Authors: Thinh Ngo, Paul Rad, Brian Kelley

Abstract:

Noise estimation is essential in today's wireless systems for power control, adaptive modulation, interference suppression, and quality of service. Deep learning (DL) has already been applied in the physical layer for modulation and signal classification. However, an unacceptably low accuracy of less than 50% is found to undermine the traditional application of DL classification to SNR prediction. In this paper, we use a divide-and-conquer algorithm and a classifier fusion method to simplify SNR classification and thereby enhance DL learning and prediction. Specifically, multiple CNNs are used for classification rather than a single CNN. Each CNN performs a binary classification of a single SNR with two labels: "less than" and "greater than or equal to". Together, the CNNs are combined to classify effectively over the range −20 ≤ SNR ≤ 32 dB. We use pre-trained CNNs to predict SNR over a wide range of joint channel parameters, including multiple Doppler shifts (0, 60, 120 Hz), power-delay profiles, and signal-modulation types (QPSK, 16-QAM, 64-QAM). The approach achieves an individual SNR prediction accuracy of 92%, a composite accuracy of 70%, and prediction convergence one order of magnitude faster than that of traditional estimation.
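
The fusion step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 4 dB threshold spacing, the function names, and the "fewest disagreements" fusion rule are all assumptions made for illustration, and the trained CNNs are replaced by ideal comparators.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical decision thresholds in dB. The abstract gives only the overall
# range (-20 to 32 dB), so the 4 dB spacing here is an assumption.
THRESHOLDS_DB: List[int] = list(range(-20, 33, 4))


def fuse_binary_decisions(votes: Sequence[bool]) -> int:
    """Return how many thresholds the SNR is judged to meet.

    votes[i] is the i-th classifier's answer to "is SNR >= threshold i?",
    with thresholds sorted in ascending order. A consistent set of votes
    looks like True, ..., True, False, ..., False. We pick the consistent
    pattern that disagrees with the fewest votes, so an isolated wrong
    vote is outvoted. Ties resolve to the lower count.
    """
    best_count, best_cost = 0, len(votes) + 1
    for count in range(len(votes) + 1):
        # Ideal pattern for this count: first `count` votes True, rest False.
        cost = sum(1 for i, vote in enumerate(votes) if vote != (i < count))
        if cost < best_cost:
            best_count, best_cost = count, cost
    return best_count


def snr_interval(
    count: int, thresholds: Sequence[float]
) -> Tuple[Optional[float], Optional[float]]:
    """Map a fused count to a [low, high) SNR interval; None means unbounded."""
    low = thresholds[count - 1] if count > 0 else None
    high = thresholds[count] if count < len(thresholds) else None
    return low, high


# Stand-in for the trained CNNs: each "classifier" is an ideal comparator.
votes = [10.0 >= t for t in THRESHOLDS_DB]
votes[12] = True  # simulate one classifier making a mistake
count = fuse_binary_decisions(votes)
print(snr_interval(count, THRESHOLDS_DB))  # prints (8, 12)
```

Splitting the task into one binary question per threshold keeps each classifier's job simple, and the fusion rule then recovers a multi-class answer that tolerates a small number of inconsistent votes.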

Keywords: Classification, classifier fusion, CNN, Deep Learning, prediction, SNR.

Downloads: 720

792 Constant Factor Approximation Algorithm for p-Median Network Design Problem with Multiple Cable Types

Authors: Chaghoub Soraya, Zhang Xiaoyan

Abstract:

This research presents the first constant approximation algorithm for the p-median network design problem with multiple cable types. The problem has previously been addressed with a single cable type, for which a bifactor approximation algorithm exists. To the best of our knowledge, the algorithm proposed in this paper is the first constant approximation algorithm for p-median network design with multiple cable types. The problem combines two well-studied problems: the p-median problem and the network design problem. The introduced algorithm is a constant-factor random sampling approximation algorithm, built using random sampling techniques from the literature. It is based on a redistribution lemma from the literature and uses a Steiner tree problem as a subproblem. The algorithm is simple and relies on the notions of random sampling and probability. The proposed approach gives an approximate solution with a single constant ratio without violating any of the constraints, in contrast to the one proposed in the literature. This paper provides a (21 + 2)-approximation algorithm for the p-median network design problem with multiple cable types using random sampling techniques.

Keywords: Approximation algorithms, buy-at-bulk, combinatorial optimization, network design, p-median.

Downloads: 595

791 A Distance Function for Data with Missing Values and Its Application

Authors: Loai AbdAllah, Ilan Shimshoni

Abstract:

Missing values in data are common in real-world applications. Since the performance of many data mining algorithms depends critically on being given a good metric over the input space, we define in this paper a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. Under this distance, the distance between two points without missing attribute values is simply the Mahalanobis distance. When, on the other hand, one of the coordinates has a missing value, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because its performance depends strongly on the chosen distance measure, we chose the k nearest neighbor (kNN) classifier to evaluate the distance's ability to accurately reflect object similarity. We experimented on standard numerical datasets from different fields in the UCI repository. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance to three other basic methods. Our experiments show that kNN using our distance function outperforms kNN using the other methods. Moreover, the runtime of our method is only slightly higher than that of the other methods.
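
For orientation, the Bhattacharyya distance that the abstract builds on has a closed form for two univariate normal distributions. The sketch below shows only that standard formula. It is not the authors' distance function, whose exact construction the abstract does not give, and the function name is hypothetical.

```python
from math import log


def bhattacharyya_normal(mu1: float, sigma1: float, mu2: float, sigma2: float) -> float:
    """Bhattacharyya distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).

    D_B = (mu1 - mu2)^2 / (4 (s1^2 + s2^2)) + 0.5 * ln((s1^2 + s2^2) / (2 s1 s2))
    Both standard deviations must be strictly positive.
    """
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    var_sum = sigma1 ** 2 + sigma2 ** 2
    mean_term = (mu1 - mu2) ** 2 / (4.0 * var_sum)
    spread_term = 0.5 * log(var_sum / (2.0 * sigma1 * sigma2))
    return mean_term + spread_term


# Identical distributions are at distance 0; the distance grows as the means
# separate or as the spreads diverge.
print(bhattacharyya_normal(0.0, 1.0, 0.0, 1.0))
print(bhattacharyya_normal(0.0, 1.0, 2.0, 1.0))
```

The first term compares the means relative to the combined variance, and the second penalizes a mismatch in spread even when the means coincide.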

Keywords: Missing values, Distance metric, Bhattacharyya distance.

Downloads: 2751

790 Solving Process Planning and Scheduling with Number of Operation Plus Processing Time Due-Date Assignment Concurrently Using a Genetic Search

Authors: Halil Ibrahim Demir, Alper Goksu, Onur Canpolat, Caner Erden, Melek Nur

Abstract:

Traditionally, process planning, scheduling, and due-date assignment are performed sequentially and separately. The high interrelation between these functions makes integration very useful. Although there are numerous works on integrated process planning and scheduling, and many on scheduling with due-date assignment, only a few address the integration of all three functions. Here we tested different levels of integration of these three functions and found the fully integrated version to be the best. We applied genetic search and random search, and genetic search performed better than random search. We penalized all earliness, tardiness, and due-date related costs: since all three are undesirable, it is better to penalize all of them.

Keywords: Process planning, scheduling, due-date assignment, genetic algorithm, random search.

Downloads: 839

789 On the Central Limit Theorems for Forward and Backward Martingales

Authors: Yilun Shang

Abstract:

Let $\{X_i\}_{i \ge 1}$ be a martingale difference sequence with $X_i = S_i - S_{i-1}$. Under some regularity conditions, we show that $(X_1^2 + \cdots + X_{N_n}^2)^{-1/2} S_{N_n}$ is asymptotically normal, where $\{N_i\}_{i \ge 1}$ is a sequence of positive integer-valued random variables tending to infinity. In a similar manner, a backward (or reverse) martingale central limit theorem with random indices is provided.

Keywords: central limit theorem, martingale difference sequence, backward martingale.

Downloads: 2781

788 Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification

Authors: Dewan Md. Farid, Jerome Darmont, Nouria Harbi, Nguyen Huu Hoa, Mohammad Zahidur Rahman

Abstract:

In this paper, a new learning approach for network intrusion detection using the naïve Bayesian classifier and the ID3 algorithm is presented. It identifies effective attributes from the training dataset, calculates the conditional probabilities for the best attribute values, and then correctly classifies all the examples of the training and testing datasets. Most current intrusion detection datasets are dynamic and complex and contain a large number of attributes. Some of the attributes may be redundant or contribute little to detection. Tests confirm that selecting significant attributes is important when designing a real-world intrusion detection system (IDS). The purpose of this study is to identify effective attributes from the training dataset in order to build a classifier for network intrusion detection using data mining algorithms. The experimental results on the KDD99 benchmark intrusion detection dataset demonstrate that this new approach achieves high classification rates and reduces false positives using limited computational resources.
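
Attribute selection in ID3 is conventionally driven by information gain, which also appears in this entry's keywords. The sketch below shows that standard calculation on a tiny made-up table. The records, attribute positions, and function names are hypothetical, and the authors' actual selection procedure may differ.

```python
from collections import Counter
from math import log2
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows: Sequence[Sequence[Hashable]],
                     labels: Sequence[Hashable],
                     attr_index: int) -> float:
    """Entropy reduction obtained by splitting on one categorical attribute."""
    groups: Dict[Hashable, List[Hashable]] = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr_index], []).append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Made-up connection records: (protocol, service). Not taken from KDD99.
rows = [("tcp", "http"), ("tcp", "ftp"), ("udp", "http"), ("udp", "ftp")]
labels = ["normal", "normal", "attack", "attack"]

# Rank attributes by gain; ID3 would split on the highest-gain attribute first.
gains = {i: information_gain(rows, labels, i) for i in range(2)}
print(gains)
```

In this toy table the protocol attribute separates the classes perfectly while the service attribute carries no information, so an attribute selector based on gain would keep the former and could discard the latter.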

Keywords: Attributes selection, Conditional probabilities, information gain, network intrusion detection.

Downloads: 2698

787 Hierarchical PSO-Adaboost Based Classifiers for Fast and Robust Face Detection

Authors: Hong Pan, Yaping Zhu, Liang Zheng Xia

Abstract:

We propose a fast and robust hierarchical face detection system which finds and localizes face images with a cascade of classifiers. Three modules contribute to the efficiency of our detector. First, heterogeneous feature descriptors are exploited to enrich the types and number of features used for face representation. Second, a PSO-Adaboost algorithm is proposed to efficiently select discriminative features from a large pool of available features and combine them into the final ensemble classifier. Compared with standard exhaustive Adaboost feature selection, the new PSO-Adaboost algorithm reduces the training time by up to 20 times. Finally, a three-stage hierarchical classifier framework is developed for rapid background removal. In particular, candidate face regions are detected more quickly by using a large window in the first stage. In the last stage, nonlinear SVM classifiers are used instead of decision stump functions to remove the remaining complex nonface patterns that cannot be rejected in the previous two stages. Experimental results show that our detector achieves superior performance on the CMU+MIT frontal face dataset.

Keywords: Adaboost, Face detection, Feature selection, PSO

Downloads: 2199

786 Random Oracle Model of Information Hiding System

Authors: Nan Jiang, Jian Wang

Abstract:

The Random Oracle Model (ROM) is an effective method for measuring the practical security of cryptographic schemes. In this paper, we apply it to information hiding systems (IHS). Because an IHS has its own properties, the ROM must be modified before it can be applied. First, we discuss in full why and how to modify each part of the ROM. The main changes are: 1) divide the attacks that an IHS may suffer into two phases, and divide the attacks of each phase into several kinds; 2) distinguish clearly between Oracles and Black-boxes; 3) define the Oracle and the four Black-boxes that an IHS uses; 4) propose a formalized adversary model; and 5) give the definition of the judge. Second, based on the ROM of IHS, security against the known original cover attack (KOCA-KOCA-security) is defined. Then, we give an actual information hiding scheme and prove that it is KOCA-KOCA-secure. Finally, we conclude the paper and propose open problems for further research.

Keywords: Attack, Information Hiding, Provable Security, Random Oracle Model.

Downloads: 1348

785 Electromyography Pattern Classification with Laplacian Eigenmaps in Human Running

Authors: Elnaz Lashgari, Emel Demircan

Abstract:

Electromyography (EMG) is one of the most important interfaces between humans and robots for rehabilitation. Decoding this signal helps to recognize muscle activation and converts it into smooth motion for the robots. Detecting each muscle’s pattern during walking and running is vital for improving the quality of a patient’s life. In this study, EMG data from 10 muscles in 10 subjects at 4 different speeds were analyzed. EMG signals are nonlinear with high dimensionality. To deal with this challenge, we extracted some features in time-frequency domain and used manifold learning and Laplacian Eigenmaps algorithm to find the intrinsic features that represent data in low-dimensional space. We then used the Bayesian classifier to identify various patterns of EMG signals for different muscles across a range of running speeds. The best result for vastus medialis muscle corresponds to 97.87±0.69 for sensitivity and 88.37±0.79 for specificity with 97.07±0.29 accuracy using Bayesian classifier. The results of this study provide important insight into human movement and its application for robotics research.

Keywords: Electrocardiogram, manifold learning, Laplacian Eigenmaps, running pattern.

Downloads: 1120

784 Random Access in IoT Using Naïve Bayes Classification

Authors: Alhusein Almahjoub, Dongyu Qiu

Abstract:

This paper deals with the random access procedure in next-generation networks and presents the solution to reduce total service time (TST) which is one of the most important performance metrics in current and future internet of things (IoT) based networks. The proposed solution focuses on the calculation of optimal transmission probability which maximizes the success probability and reduces TST. It uses the information of several idle preambles in every time slot, and based on it, it estimates the number of backlogged IoT devices using Naïve Bayes estimation which is a type of supervised learning in the machine learning domain. The estimation of backlogged devices is necessary since optimal transmission probability depends on it and the eNodeB does not have information about it. The simulations are carried out in MATLAB which verify that the proposed solution gives excellent performance.

Keywords: Random access, LTE/LTE-A, 5G, machine learning, Naïve Bayes estimation.

Downloads: 448

783 Land Use Changes in Two Mediterranean Coastal Regions: Do Urban Areas Matter?

Authors: L. Salvati, D. Smiraglia, S. Bajocco, M. Munafò

Abstract:

This paper focuses on the Land Use and Land Cover Changes (LULCC) that occurred in the urban coastal regions of the Mediterranean basin over the last thirty years. LULCC were assessed diachronically (1975-2006) in two urban areas, Rome (Italy) and Athens (Greece), using CORINE land cover maps. In strictly coastal territories, a persistent growth of built-up areas at the expense of both agricultural and forest land uses was found. By contrast, a different pattern was observed in the surrounding inland areas, where a high rate of conversion of agricultural land to both urban and forest land uses was recorded. Finally, the impact of city growth on the complex pattern of coastal LULCC is discussed.

Keywords: Land use changes, coastal region, Rome, Attica, southern Europe.

Downloads: 2431

782 Machine Learning Approach for Identifying Dementia from MRI Images

Authors: S. K. Aruna, S. Chitra

Abstract:

This research paper presents a framework for classifying Magnetic Resonance Imaging (MRI) images for dementia. Dementia, an age-related cognitive decline, is indicated by degeneration of cortical and sub-cortical structures. Characterizing morphological changes helps in understanding disease development and contributes to early prediction and prevention of the disease. Modelling that captures the brain’s structural variability and is valid for disease classification and interpretation is very challenging. Features are extracted using a Gabor filter with 0°, 30°, 60°, and 90° orientations and the Gray Level Co-occurrence Matrix (GLCM). The features are then normalized and fused, and Independent Component Analysis (ICA) is used to select features. A Support Vector Machine (SVM) classifier with different kernels is evaluated for its efficiency in classifying dementia. The framework is evaluated using MRI images from the OASIS dataset. Results showed that the proposed feature fusion classifier achieves higher classification accuracy.

Keywords: Magnetic resonance imaging, dementia, Gabor filter, gray level co-occurrence matrix, support vector machine.

Downloads: 2116

781 Machine Learning Framework: Competitive Intelligence and Key Drivers Identification of Market Share Trends among Healthcare Facilities

Authors: A. Appe, B. Poluparthi, L. Kasivajjula, U. Mv, S. Bagadi, P. Modi, A. Singh, H. Gunupudi, S. Troiano, J. Paul, J. Stovall, J. Yamamoto

Abstract:

The need for data-driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework that helps identify the factors affecting the market share of a healthcare provider facility or hospital (hereafter, a facility) is of key importance. This pilot study aims to develop a data-driven machine learning regression framework that helps strategists make key decisions to improve a facility’s market share, which in turn improves the quality of healthcare services. The US (United States) healthcare business is chosen for the study, using about 3 years of historical data spanning 60 key facilities in Washington State. In the current analysis, market share is defined as the ratio of the facility’s encounters to the total encounters among the group of potential competitor facilities. The study proposes a two-pronged approach: competitor identification, followed by regression to evaluate and predict market share. A model-agnostic technique, SHAP (SHapley Additive exPlanations), is used to quantify the relative importance of the features affecting market share. Typical techniques in the literature for quantifying the degree of competitiveness among facilities use an empirical method to calculate a competitive factor that expresses the severity of competition. The proposed method identifies a pool of competitors, develops Directed Acyclic Graphs (DAGs) and feature-level word vectors, and evaluates the key connected components at the facility level. This technique is robust because it is data-driven, which minimizes the bias of empirical techniques. The DAGs factor in partial correlations at various segregations and key demographics of facilities, along with a placeholder for various business rules (e.g., quantifying patient exchanges, provider references, and sister facilities). Multiple groups of competitors among facilities are identified. Using the identified competitors, a Random Forest Regression model was developed and fine-tuned to predict market share. To identify the key drivers of market share at an overall level, the permutation feature importance of the attributes was calculated. For relative quantification of features at the facility level, SHAP, a model-agnostic explainer, was incorporated. This helped to identify and rank the attributes at each facility that affect its market share. The approach combines two popular and efficient modeling practices, namely machine learning with graphs and tree-based regression techniques, to reduce bias. With these, we helped to drive strategic business decisions.
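
The market-share definition used in this abstract (a facility's encounters divided by the total encounters of its competitor group) can be written out as a short sketch. The facility names and encounter counts are invented, and the sketch assumes the competitor group's total includes the facility itself, which the abstract does not state explicitly.

```python
from typing import Iterable, Mapping


def market_share(encounters: Mapping[str, int],
                 facility: str,
                 competitors: Iterable[str]) -> float:
    """Facility encounters divided by total encounters in its competitor group.

    Assumption: the group total includes the facility itself, so the shares
    of all members of one group sum to 1.
    """
    group = set(competitors) | {facility}
    total = sum(encounters[name] for name in group)
    if total == 0:
        raise ValueError("competitor group has no encounters")
    return encounters[facility] / total


# Invented encounter counts for four hypothetical facilities.
encounters = {"A": 300, "B": 500, "C": 200, "D": 1000}

# Facility D is outside A's competitor group, so it does not affect A's share.
print(market_share(encounters, "A", ["B", "C"]))  # prints 0.3
```

Because the denominator depends on which facilities are counted as competitors, the competitor identification step described in the abstract directly changes the target variable that the regression model is asked to predict.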

Keywords: Competition, DAGs, hospital, healthcare, machine learning, market share, random forest, SHAP.

Downloads: 284

780 Using Machine Learning Techniques for Autism Spectrum Disorder Analysis and Detection in Children

Authors: Norah Alshahrani, Abdulaziz Almaleh

Abstract:

Autism Spectrum Disorder (ASD) is a condition related to brain development that affects how a person recognises and communicates with others, resulting in difficulties with social interaction and communication, and its incidence is constantly growing. Early recognition of ASD allows children to lead safe and healthy lives and helps doctors with accurate diagnosis and management of the condition. It is therefore crucial to develop a method that achieves good results with high accuracy for the measurement of ASD in children. In this paper, ASD datasets of toddlers and children have been analyzed. We employed the following machine learning techniques to explore ASD: Random Forest (RF), Decision Tree (DT), Naïve Bayes (NB), and Support Vector Machine (SVM). Feature selection was then used to reduce the number of attributes in the ASD datasets while preserving model performance. We found that SVM provided the best result, achieving 0.98% in the toddler dataset and 0.99% in the children dataset.

Keywords: Autism Spectrum Disorder, ASD, Machine Learning, ML, Feature Selection, Support Vector Machine, SVM.

Downloads: 598

779 Reliability Based Performance Evaluation of Stone Column Improved Soft Ground

Authors: A. GuhaRay, C. V. S. P. Kiranmayi, S. Rudraraju

Abstract:

The present study considers the effect of variation in different geotechnical random variables on the design of stone column-foundation systems, in order to assess the bearing capacity and consolidation settlement of highly compressible soil. The soil and stone column properties and the spacing, diameter, and arrangement of the stone columns are treated as random variables. The probability of failure (P_f) is computed for a target degree of consolidation and a target safe load by Monte Carlo Simulation (MCS). The study shows that variations in the coefficient of radial consolidation (c_r) and the cohesion of soil (c_s) are the two most important factors influencing P_f. If the coefficient of variation (COV) of c_r exceeds 20%, P_f exceeds 0.001, which is unsafe according to the guidelines of the US Army Corps of Engineers. The bearing capacity also exceeds its safe value for a COV of c_s > 30%. It is also observed that as the spacing between the stone columns increases, the probability of reaching a target degree of consolidation decreases. Accordingly, design guidelines that consider both consolidation and bearing capacity of the improved ground are proposed for different spacings and diameters of stone columns and for different geotechnical random variables.
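
The Monte Carlo estimate of P_f mentioned in this abstract follows a generic pattern: sample the random variables, evaluate a limit state, and count the failures. The sketch below shows only that pattern. The single normally distributed "capacity", the fixed "demand", and every numeric value are hypothetical stand-ins, not the geotechnical model used in the study.

```python
import random


def probability_of_failure(mean_capacity: float,
                           cov: float,
                           demand: float,
                           n_samples: int = 100_000,
                           seed: int = 0) -> float:
    """Estimate P_f = P(capacity - demand < 0) by Monte Carlo simulation.

    Assumption: capacity is normally distributed with the given mean and
    coefficient of variation (COV = standard deviation / mean).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    std = cov * mean_capacity
    failures = 0
    for _ in range(n_samples):
        capacity = rng.gauss(mean_capacity, std)
        if capacity - demand < 0.0:  # limit state g = capacity - demand
            failures += 1
    return failures / n_samples


# A larger COV widens the capacity distribution and raises P_f for the same
# mean capacity and demand.
for cov in (0.1, 0.2, 0.3):
    print(cov, probability_of_failure(100.0, cov, 60.0))
```

This illustrates why the abstract reports COV thresholds: for a fixed mean, the failure probability is driven by how much scatter the input variable has, so a design that is safe at a low COV can become unsafe as the COV grows.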

Keywords: Bearing capacity, consolidation, geotechnical random variables, probability of failure, stone columns.

Downloads: 1177

778 Application of Geo-Informatic Technology in Studying of Land Tenure and Land Use for Cultivation of Cash Crops by Local Communities in the Local Administration Organizations of Phailuang and Maepoon in Lublae District, Uttaradit Province

Authors: Kunchit Pirapake

Abstract:

This study applies Geo-Informatic technology to land tenure and land use in economic crop areas, in order to support sustainable land, access to the area, and sustainable food production that meets the demand of people in the community. The research objectives are to 1) apply Geo-Informatic Technology to land ownership and agricultural land use (cash crops) in the research area, 2) create a GIS database on land ownership and land use, and 3) create a database for an online Geo-information system on land tenure and land use. The results reveal the following. First, the study area consists of steep slopes, mountains, and valleys. The land is mainly in the forest zone covered by the Forest Act 1941 and the National Conserved Forest 1964. Residents gained the right to exploit the land from their ancestors, a practice recognized by the communities. The land was suitable for cultivating a wide variety of economic crops, which were the main source of family income. At present, local residents keep expanding the land used to grow cash crops. Second, the geographic information system database consisted of the area range, announcements from the Interior Ministry, interpretation of satellite images, transportation routes, waterways, and plots of land with a title deed available at the provincial land office. Most pieces of land without a title deed are located in the forest and national reserve areas. Data were created from a field study and a land zone determined by GPS. Last, the online Geo-Informatic System can show land tenure and land use information for each economic crop, together with high-resolution satellite data that can be updated and checked on the system simultaneously.

Keywords: Geo-Informatic Technology, Land Tenure, Online Geo-Informatic System, Land Use of cash crops.

Downloads: 1469

777 Scaling up Detection Rates and Reducing False Positives in Intrusion Detection using NBTree

Authors: Dewan Md. Farid, Nguyen Huu Hoa, Jerome Darmont, Nouria Harbi, Mohammad Zahidur Rahman

Abstract:

In this paper, we present a new learning algorithm for anomaly-based network intrusion detection using an improved self-adaptive naïve Bayesian tree (NBTree), which induces a hybrid of a decision tree and a naïve Bayesian classifier. The proposed approach scales up balanced detection across different attack types and keeps false positives at an acceptable level. On complex, dynamic, and large intrusion detection datasets, the detection accuracy of the naïve Bayesian classifier does not scale up as well as that of the decision tree. Tests in other problem domains have shown that the naïve Bayesian tree improves classification rates on large datasets. In a naïve Bayesian tree, the internal nodes split as in regular decision trees, but the leaves contain naïve Bayesian classifiers. The experimental results on the KDD99 benchmark network intrusion detection dataset demonstrate that this new approach scales up the detection rates for different attack types and reduces false positives in network intrusion detection.

Keywords: Detection rates, false positives, network intrusion detection, naïve Bayesian tree.

Downloads: 2281

776 Reduction of Plants Biodiversity in Hyrcanian Forest by Coal Mining Activities

Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch

Abstract:

Coal mining is an important industrial activity, but it may damage the environment. To the best of the authors’ knowledge, the effect of traditional coal mining activities on plant biodiversity has not been investigated in the Hyrcanian forests. Therefore, this study investigated the effect of coal mining activities on vegetation and tree diversity in the Hyrcanian forest, North Iran. After a field visit to select the mine, 16 plots (20×20 m²) were established by systematic-random sampling (60×60 m²) in an area of 4 ha (200×200 m², with the mine entrance at the center). An adjacent area not affected by the mining activity was used as the control area. In each plot, tree data such as the number and type of species were recorded. The biodiversity of the vegetation cover was assessed in 5 square sub-plots (1 m²) in each plot. PAST software and Ecological Methodology were used to calculate the biodiversity indices. The Shannon Wiener and Simpson diversity indices for tree cover were significantly higher in the control area (1.04±0.34 and 0.62±0.20) than in the mining area (0.78±0.27 and 0.45±0.14). The evenness indices for tree cover were significantly lower in the mining area than in the control area. The Shannon Wiener and Simpson diversity indices for vegetation cover were significantly higher in the control area (1.37±0.06 and 0.69±0.02) than in the mining area (1.02±0.13 and 0.50±0.07). The evenness index was significantly higher in the control area than in the mining area. Plant communities are a good indicator of changes at a site. Studying changes in vegetation biodiversity and plant dynamics on degraded land can provide the information needed for forest management and reforestation of these areas.
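
The diversity indices reported in this abstract have standard textbook definitions, sketched below for a single plot. This is an illustration, not the computation performed by PAST: it assumes the natural logarithm for Shannon Wiener, the 1 − Σp² form of the Simpson index, and Pielou's evenness, and the species counts are invented.

```python
from math import log
from typing import Sequence


def shannon_wiener(counts: Sequence[int]) -> float:
    """Shannon Wiener index H = -sum(p_i * ln p_i) over species present."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def simpson_diversity(counts: Sequence[int]) -> float:
    """Simpson diversity in the 1 - sum(p_i^2) form (an assumed variant)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def pielou_evenness(counts: Sequence[int]) -> float:
    """Pielou evenness J = H / ln(S), where S is the number of species present."""
    species = sum(1 for c in counts if c > 0)
    return shannon_wiener(counts) / log(species) if species > 1 else 0.0


# Invented stem counts per species for two hypothetical plots.
control_plot = [10, 10, 10, 10]   # four equally abundant species
disturbed_plot = [34, 4, 1, 1]    # one dominant species

for name, plot in (("control", control_plot), ("disturbed", disturbed_plot)):
    print(name, shannon_wiener(plot), simpson_diversity(plot), pielou_evenness(plot))
```

Both indices fall when one species dominates, even if the number of species is unchanged, which is why evenness is reported alongside them.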

Keywords: Vegetation biodiversity, species composition, traditional coal mining, Caspian forest.

Downloads 897

775 The Effect of Forest Fires on Physical Properties and Magnetic Susceptibility of Semi-Arid Soils in North-Eastern Libya

Authors: G. S. Eldiabani, W. H. G. Hale, C. P. Heron

Abstract:

Forest areas are particularly susceptible to fires, which are often manmade. One of the most fire-affected forest regions in the world is the Mediterranean. Libya, in the Mediterranean region, has soils that are considered to be arid except in a small area called Aljabal Alakhdar (Green Mountain), which is the geographic area covered by this study. Like other forests in the Mediterranean, it has suffered extreme degradation. This is mainly due to people removing firewood, or sometimes converting forested areas to agricultural use, as well as fires, which may alter several soil chemical and physical properties. The purpose of this study was to evaluate the effects of fires on the physical properties of the soil of Aljabal Alakhdar forest in the north-east of Libya. The physical properties of soil following fire were determined at one coastal and one mountain site, with burned areas compared to adjacent unburned areas. The physical properties studied were soil particle size (soil texture), soil water content, soil porosity and soil particle density. For the first time in Libyan soils, the effect of burning on the magnetic susceptibility of soils was also tested. The results showed that the soils at both study sites, irrespective of burning or depth, fell into the silt loam texture category and had low water content, homogeneous porosity across the soil profiles, and relatively high soil particle density values. Soil magnetic susceptibility was much greater in the top layer at both sites. Except for soil water content and magnetic susceptibility, fire has not had a clear effect on the soils' physical properties.

Keywords: Aljabal Alakhdar, the coastal site, the mountain site, fire effect, soil particle size, soil water content, soil porosity, soil particle density, soil magnetic susceptibility.

Downloads 2650

774 Analyzing the Changing Pattern of Nigerian Vegetation Zones and Its Ecological and Socio-Economic Implications Using Spot-Vegetation Sensor

Authors: B. L. Gadiga

Abstract:

This study assesses the major ecological zones in Nigeria with a view to understanding the spatial pattern of vegetation zones and the implications for conservation over a period of sixteen (16) years. Satellite images used for this study were acquired from SPOT-VEGETATION between 1998 and 2013. The annual NDVI images selected for this study were derived from the SPOT-4 sensor and were acquired within the same season (November) in order to reduce differences in spectral reflectance due to seasonal variations. The images were sliced into five classes based on the literature and knowledge of the area (i.e. <0.16 Non-Vegetated areas; 0.16-0.22 Sahel Savannah; 0.22-0.40 Sudan Savannah; 0.40-0.47 Guinea Savannah; and >0.47 Forest Zone). Classification of the 1998 and 2013 images into forested and non-forested areas showed that the forested area decreased from 511,691 km² in 1998 to 478,360 km² in 2013. A differencing change detection method was applied to the 1998 and 2013 NDVI images to identify areas of ecological concern. The result shows that areas undergoing vegetation degradation cover 73,062 km², while areas witnessing some form of restoration cover 86,315 km². The result also shows that there is a weak correlation between rainfall and the vegetation zones. The non-vegetated areas have a correlation coefficient (r) of 0.0088, the Sahel Savannah belt 0.1988, the Sudan Savannah belt -0.3343, the Guinea Savannah belt 0.0328 and the Forest belt 0.2635. The low correlation can be associated with the encroachment of the Sudan Savannah belt into the forest belt of the South-eastern part of the country, as revealed by the image analysis. The degradation of the forest vegetation is therefore responsible for the serious erosion problems witnessed in the South-east. The study recommends constant monitoring of vegetation and strict enforcement of environmental laws in the country.
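The NDVI slicing and image differencing described in this abstract can be sketched as follows. The class breaks are taken from the abstract. Everything else is an assumption for illustration: the handling of values that fall exactly on a break, the `tolerance` used to call a pixel changed, the reading of a negative difference as degradation, and the small grids, which are hypothetical rather than data from the study.

```python
from bisect import bisect_right

# Class breaks and labels as listed in the abstract
BREAKS = [0.16, 0.22, 0.40, 0.47]
CLASSES = [
    "Non-Vegetated",
    "Sahel Savannah",
    "Sudan Savannah",
    "Guinea Savannah",
    "Forest Zone",
]


def classify_ndvi(value):
    """Map one NDVI value to its vegetation zone.

    Assumption: a value exactly on a break falls into the upper class.
    """
    return CLASSES[bisect_right(BREAKS, value)]


def label_change(delta, tolerance=0.05):
    """Label an NDVI difference (later minus earlier).

    The tolerance is a hypothetical threshold, not a value from the study.
    """
    if delta < -tolerance:
        return "degradation"
    if delta > tolerance:
        return "restoration"
    return "stable"


# Hypothetical 2x3 NDVI grids for the two dates (not data from the study)
ndvi_1998 = [[0.50, 0.30, 0.12], [0.45, 0.20, 0.48]]
ndvi_2013 = [[0.30, 0.31, 0.13], [0.46, 0.35, 0.49]]

# Slice the later image into the five zones
zones_2013 = [[classify_ndvi(v) for v in row] for row in ndvi_2013]

# Pixel-wise differencing between the two dates
changes = [
    [label_change(later - earlier) for earlier, later in zip(row_a, row_b)]
    for row_a, row_b in zip(ndvi_1998, ndvi_2013)
]
```

Multiplying the count of pixels in each change label by the pixel area would give area totals of the kind reported in the abstract.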

Keywords: Vegetation, NDVI, SPOT-vegetation, ecology, degradation.

Downloads 838

773 Multi-Layer Perceptron Neural Network Classifier with Binary Particle Swarm Optimization Based Feature Selection for Brain-Computer Interfaces

Authors: K. Akilandeswari, G. M. Nasira

Abstract:

Brain-Computer Interfaces (BCIs) measure brain signal activity, intentionally and unintentionally induced by users, and provide a communication channel that does not depend on the brain's normal output pathway of peripheral nerves and muscles. Feature Selection (FS) is a global optimization problem in machine learning that reduces the number of features and removes irrelevant and noisy data while retaining acceptable recognition accuracy. It is a vital step affecting the performance of a pattern recognition system. This study presents a new Binary Particle Swarm Optimization (BPSO) based feature selection algorithm. Multi-layer Perceptron Neural Network (MLPNN) classifiers trained with the backpropagation algorithm and the Levenberg-Marquardt algorithm classify the selected features.
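For orientation, the general idea of BPSO-based feature selection can be sketched as below. This is the standard binary PSO with a sigmoid transfer function, not the new variant proposed by the authors, whose details the abstract does not give. The function name `bpso_select`, the parameter values, and the toy fitness function are all hypothetical; in a study like this one, the fitness would more plausibly be the accuracy of a classifier trained on the selected features.

```python
import math
import random


def bpso_select(fitness, n_features, n_particles=20, n_iter=50,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Standard binary PSO: maximise fitness(mask) over 0/1 feature masks."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best mask seen by each particle
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # best mask seen by the swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # clamp the velocity
                # The sigmoid turns the velocity into the probability of a 1 bit
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Toy fitness (hypothetical): reward three "relevant" features, penalise extras
RELEVANT = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in RELEVANT)
    extras = sum(mask) - hits
    return hits - 0.5 * extras


best_mask, best_fit = bpso_select(toy_fitness, n_features=8)
```

The returned mask marks the selected features with 1; only those columns would then be passed to the classifier.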

Keywords: Brain-Computer Interfaces (BCI), Feature Selection (FS), Walsh–Hadamard Transform (WHT), Binary Particle Swarm Optimization (BPSO), Multi-Layer Perceptron (MLP), Levenberg–Marquardt algorithm.

Downloads 2185

772 Stochastic Comparisons of Heterogeneous Samples with Homogeneous Exponential Samples

Authors: Nitin Gupta, Rakesh Kumar Bajaj

Abstract:

In the present communication, the stochastic comparison of a series (parallel) system having heterogeneous components with random lifetimes and a series (parallel) system having homogeneous exponential components with random lifetimes has been studied. Further, conditions under which such a comparison is possible have been established.

Keywords: Exponential distribution, Order statistics, Star ordering, Stochastic ordering.

Downloads 1564

771 New Features for Specific JPEG Steganalysis

Authors: Johann Barbier, Eric Filiol, Kichenakoumar Mayoura

Abstract:

We present in this paper a new approach for specific JPEG steganalysis and propose studying statistics of the compressed DCT coefficients. Traditionally, steganographic algorithms try to preserve statistics of the DCT and of the spatial domain, but they cannot preserve both and also control the alteration of the compressed data. We have noticed a deviation of the entropy of the compressed data after a first embedding. This deviation is greater when the image is a cover medium than when the image is a stego image. To observe this deviation, we identified new statistical features and combined them with the Multiple Embedding Method. This approach is motivated by the Avalanche Criterion of the JPEG lossless compression step. This criterion makes possible the design of detectors whose detection rates are independent of the payload. Finally, we designed a Fisher discriminant based classifier for the well-known steganographic algorithms Outguess, F5 and Hide and Seek. The experimental results we obtained show the efficiency of our classifier for these algorithms. Moreover, it is also designed to work with low embedding rates (< 10⁻⁵) and, according to the avalanche criterion of the RLE and Huffman compression step, its efficiency is independent of the quantity of hidden information.
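The entropy deviation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain Shannon entropy over the bytes of a compressed stream. The abstract does not specify the authors' actual features, so `byte_entropy` and `entropy_deviation` are hypothetical helpers, and the byte strings are made-up stand-ins for real compressed JPEG data.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def entropy_deviation(before: bytes, after: bytes) -> float:
    """Change in entropy between a stream before and after an embedding."""
    return byte_entropy(after) - byte_entropy(before)


# Made-up byte strings standing in for compressed data (not real JPEG streams)
original = bytes([0, 0, 0, 0, 1, 1, 2, 3])
modified = bytes([0, 0, 1, 1, 2, 2, 3, 3])
print(f"deviation = {entropy_deviation(original, modified):+.3f} bits/byte")
```

A detector along the lines of the abstract would compute such a deviation for a test image after one further embedding and pass it, with other features, to a classifier.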

Keywords: Compressed frequency domain, Fisher discriminant, specific JPEG steganalysis.

Downloads 2162

770 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This kind of bias, when present in text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is important. The goal of this work is to identify subjectively biased language in text at the sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as the upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as a data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network with an attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), a benchmark dataset compiled from Wikipedia edits that removed various biased instances from sentences, on which we also compare our model to existing approaches. Experimental analysis indicates improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: Subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing.

Downloads 830

769 Data Mining Classification Methods Applied in Drug Design

Authors: Mária Stachová, Lukáš Sobíšek

Abstract:

Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with great potential. They can help people understand the patterns in a given body of information, so data mining tools have a wide range of applications. For example, in theoretical chemistry, data mining tools can be used to predict molecule properties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of the contribution is to create a classification model that can handle a huge data set with high accuracy. For this purpose, logistic regression, Bayesian logistic regression and random forest models were built using R software. A Bayesian logistic regression model was also created in Latent GOLD software. These classification methods belong to supervised learning methods. It was necessary to reduce the dimension of the data matrix before constructing the models, and thus factor analysis (FA) was used. The models were applied to predict the biological activity of molecules that are potential new drug candidates.

Keywords: data mining, classification, drug design, QSAR

Downloads 2849

768 A Fitted Random Sampling Scheme for Load Distribution in Grid Networks

Authors: O. A. Rahmeh, P. Johnson, S. Lehmann

Abstract:

Grid networks provide the ability to perform higher throughput computing by taking advantage of many networked computers' resources to solve large-scale computation problems. As the popularity of Grid networks has increased, there is a need to efficiently distribute the load among the resources accessible on the network. In this paper, we present a stochastic network system that gives a distributed load-balancing scheme by generating almost regular networks. This network system is self-organized and depends only on local information for load distribution and resource discovery. The in-degree of each node refers to its free resources, and the job assignment and resource discovery processes required for load balancing are accomplished by using fitted random sampling. Simulation results show that the generated network system provides an effective, scalable, and reliable load-balancing scheme for the distributed resources accessible on Grid networks.

Keywords: Complex networks, grid networks, load-balancing, random sampling.

Downloads 1785