Search results for: naive Bayes classifier
97 A Simple Olfactometer for Odour and Lateralization Thresholds of Chemical Vapours
Authors: Lena Ernstgård, Aishwarya M. Dwivedi, Johan Lundström, Gunnar Johanson
Abstract:
A simple inexpensive olfactometer was constructed to enable valid measures of detection threshold of low concentrations of vapours of chemicals. The delivery system consists of seven syringe pumps, each connected to a Tedlar bag containing a predefined concentration of the test chemical in the air. The seven pumps are connected to a 8-way mixing valve which in turn connects to a birhinal nose piece. Chemical vapor of known concentration is generated by injection of an appropriate amount of the test chemical into a Tedlar bag with a known volume of clean air. Complete vaporization is assured by gentle heating of the bag from the outside with a heat flow. The six test concentrations are obtained by adding different volumes from the starting bag to six new Tedlar bags with known volumes of clean air. One bag contains clean air only. Thus, six different test concentrations and clean air can easily be tested in series by shifting the valve to new positions. Initial in-line measurement with a photoionization detector showed that the delivery system quickly responded to a shift in valve position. Thus 90% of the desired concentration was reached within 15 seconds. The concentrations in the bags are verified daily by gas chromatography. The stability of the system in terms of chemical concentration is monitored in real time by means of a photo-ionization detector. To determine lateralization thresholds, an additional pump supplying clean air is added to the delivery system in a way so that the nostrils can be separately and interchangeably be exposed to clean air and test chemical. Odor and lateralization thresholds were determined for three aldehydes; acrolein, crotonaldehyde, and hexanal in 20 healthy naïve individuals. Aldehydes generally have a strong odour, and the selected aldehydes are also considered to be irritating to mucous membranes. The median odor thresholds of the three aldehydes were 0.017, 0.0008, and 0.097 ppm, respectively. No lateralization threshold could be identified for acrolein, whereas the medians for crotonaldehyde and hexanal were 0.003 and 0.39 ppm, respectively. In conclusion, we constructed a simple, inexpensive olfactometer that allows for stable and easily measurable concentrations of vapors of the test chemical. Our test with aldehydes demonstrates that the system produces valid detection among volunteers in terms of odour and lateralization thresholds.Keywords: irritation, odour delivery, olfactometer, smell
Procedia PDF Downloads 21596 Markov Random Field-Based Segmentation Algorithm for Detection of Land Cover Changes Using Uninhabited Aerial Vehicle Synthetic Aperture Radar Polarimetric Images
Authors: Mehrnoosh Omati, Mahmod Reza Sahebi
Abstract:
The information on land use/land cover changing plays an essential role for environmental assessment, planning and management in regional development. Remotely sensed imagery is widely used for providing information in many change detection applications. Polarimetric Synthetic aperture radar (PolSAR) image, with the discrimination capability between different scattering mechanisms, is a powerful tool for environmental monitoring applications. This paper proposes a new boundary-based segmentation algorithm as a fundamental step for land cover change detection. In this method, first, two PolSAR images are segmented using integration of marker-controlled watershed algorithm and coupled Markov random field (MRF). Then, object-based classification is performed to determine changed/no changed image objects. Compared with pixel-based support vector machine (SVM) classifier, this novel segmentation algorithm significantly reduces the speckle effect in PolSAR images and improves the accuracy of binary classification in object-based level. The experimental results on Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) polarimetric images show a 3% and 6% improvement in overall accuracy and kappa coefficient, respectively. Also, the proposed method can correctly distinguish homogeneous image parcels.Keywords: coupled Markov random field (MRF), environment, object-based analysis, polarimetric SAR (PolSAR) images
Procedia PDF Downloads 21695 Analysis of Anti-Tuberculosis Immune Response Induced in Lungs by Intranasal Immunization with Mycobacterium indicus pranii
Authors: Ananya Gupta, Sangeeta Bhaskar
Abstract:
Mycobacterium indicus pranii (MIP) is a saprophytic mycobacterium. It is a predecessor of M. avium complex (MAC). Whole genome analysis and growth kinetics studies have placed MIP in between pathogenic and non-pathogenic species. It shares significant antigenic repertoire with M. tuberculosis and have unique immunomodulatory properties. MIP provides better protection than BCG against pulmonary tuberculosis in animal models. Immunization with MIP by aerosol route provides significantly higher protection as compared to immunization by subcutaneous (s.c.) route. However, mechanism behind differential protection has not been studied. In this study, using mice model we have evaluated and compared the M.tb specific immune response in lung compartments (airway lumen / lung interstitium) as well as spleen following MIP immunization via nasal (i.n.) and s.c. route. MIP i.n. vaccination resulted in increased seeding of memory T cells (CD4+ and CD8+ T-cells) in the airway lumen. Frequency of CD4+ T cells expressing Th1 migratory marker (CXCR3) and activation marker (CD69) were also high in airway lumen of MIP i.n. group. Significantly high ex vivo secretion of cytokines- IFN-, IL-12, IL-17 and TNF- from cells of airway luminal spaces provides evidence of antigen-specific lung immune response, besides generating systemic immunity comparable to MIP s.c. group. Analysis of T cell response on per cell basis revealed that antigen specific T-cells of MIP i.n. group were functionally superior as higher percentage of these cells simultaneously secreted IFN-gamma, IL-2 and TNF-alpha cytokines as compared to MIP s.c. group. T-cells secreting more than one of the cytokines simultaneously are believed to have robust effector response and crucial for protection, compared with single cytokine secreting T-cells. Adoptive transfer of airway luminal T-cells from MIP i.n. group into trachea of naive B6 mice revealed that MIP induced CD8 T-cells play crucial role in providing long term protection. Thus the study demonstrates that MIP intranasal vaccination induces M.tb specific memory T-cells in the airway lumen that results in an early and robust recall response against M.tb infection.Keywords: airway lumen, Mycobacterium indicus pranii, Th1 migratory markers, vaccination
Procedia PDF Downloads 18694 Vehicle Speed Estimation Using Image Processing
Authors: Prodipta Bhowmik, Poulami Saha, Preety Mehra, Yogesh Soni, Triloki Nath Jha
Abstract:
In India, the smart city concept is growing day by day. So, for smart city development, a better traffic management and monitoring system is a very important requirement. Nowadays, road accidents increase due to more vehicles on the road. Reckless driving is mainly responsible for a huge number of accidents. So, an efficient traffic management system is required for all kinds of roads to control the traffic speed. The speed limit varies from road to road basis. Previously, there was a radar system but due to high cost and less precision, the radar system is unable to become favorable in a traffic management system. Traffic management system faces different types of problems every day and it has become a researchable topic on how to solve this problem. This paper proposed a computer vision and machine learning-based automated system for multiple vehicle detection, tracking, and speed estimation of vehicles using image processing. Detection of vehicles and estimating their speed from a real-time video is tough work to do. The objective of this paper is to detect vehicles and estimate their speed as accurately as possible. So for this, a real-time video is first captured, then the frames are extracted from that video, then from that frames, the vehicles are detected, and thereafter, the tracking of vehicles starts, and finally, the speed of the moving vehicles is estimated. The goal of this method is to develop a cost-friendly system that can able to detect multiple types of vehicles at the same time.Keywords: OpenCV, Haar Cascade classifier, DLIB, YOLOV3, centroid tracker, vehicle detection, vehicle tracking, vehicle speed estimation, computer vision
Procedia PDF Downloads 8293 Intrusion Detection in Cloud Computing Using Machine Learning
Authors: Faiza Babur Khan, Sohail Asghar
Abstract:
With an emergence of distributed environment, cloud computing is proving to be the most stimulating computing paradigm shift in computer technology, resulting in spectacular expansion in IT industry. Many companies have augmented their technical infrastructure by adopting cloud resource sharing architecture. Cloud computing has opened doors to unlimited opportunities from application to platform availability, expandable storage and provision of computing environment. However, from a security viewpoint, an added risk level is introduced from clouds, weakening the protection mechanisms, and hardening the availability of privacy, data security and on demand service. Issues of trust, confidentiality, and integrity are elevated due to multitenant resource sharing architecture of cloud. Trust or reliability of cloud refers to its capability of providing the needed services precisely and unfailingly. Confidentiality is the ability of the architecture to ensure authorization of the relevant party to access its private data. It also guarantees integrity to protect the data from being fabricated by an unauthorized user. So in order to assure provision of secured cloud, a roadmap or model is obligatory to analyze a security problem, design mitigation strategies, and evaluate solutions. The aim of the paper is twofold; first to enlighten the factors which make cloud security critical along with alleviation strategies and secondly to propose an intrusion detection model that identifies the attackers in a preventive way using machine learning Random Forest classifier with an accuracy of 99.8%. This model uses less number of features. A comparison with other classifiers is also presented.Keywords: cloud security, threats, machine learning, random forest, classification
Procedia PDF Downloads 31992 Rasagiline Improves Metabolic Function and Reduces Tissue Injury in the Substantia Nigra in Parkinson's Disease: A Longitudinal In-Vivo Advanced MRI Study
Authors: Omar Khan, Shana Krstevska, Edwin George, Veronica Gorden, Fen Bao, Christina Caon, NP-C, Carla Santiago, Imad Zak, Navid Seraji-Bozorgzad
Abstract:
Objective: To quantify cellular injury in the substantia nigra (SN) in patients with Parkinson's disease (PD) and to examine the effect of rasagiline of tissue injury in the SN in patients with PD. Background: N-acetylaspartate (NAA) quantified with MRS is a reliable marker of neuronal metabolic function. Fractional anisotropy (FA) and mean diffusivity (MD) obtained with DTI, characterize tissue alignment and integrity. Rasagline, has been shown to exert anti-apototic effect. We applied these advanced MRI techniques to examine: (i) the effect of rasagiline on cellular injury and metabolism in patients with early PD, and (ii) longitudinal changes seen over time in PD. Methods: We conducted a prospective longitudinal study in patients with mild PD, naive to dopaminergic treatment. The imaging protocol included multi-voxel proton-MRS and DTI of the SN, acquired on a 3T scanner. Scans were performed at baseline and month 3, during which the patient was on no treatment. At that point, rasagiline 1 mg orally daily was initiated and MRI scans are were obtained at 6 and 12 months after starting rasagiline. The primary objective was to compare changes during the 3-month period of “no treatment” to the changes observed “on treatment” with rasagiline at month 12. Age-matched healthy controls were also imaged. Image analysis was performed blinded to treatment allocation and period. Results: 25 patients were enrolled in this study. Compared to the period of “no treatment”, there was significant increase in the NAA “on treatment” period (-3.04 % vs +10.95 %, p= 0.0006). Compared to the period of “no treatment”, there was significant increase in following 12 month in the FA “on treatment” (-4.8% vs +15.3%, p<0.0001). The MD increased during “no treatment” and decreased in “on treatment” (+2.8% vs -7.5%, p=0.0056). Further analysis and clinical correlation are ongoing. Conclusions: Advanced MRI techniques quantifying cellular injury in the SN in PD is a feasible approach to investigate dopaminergic neuronal injury and could be developed as an outcome in exploratory studies. Rasagiline appears to have a stabilizing effect on dopaminergic cell loss and metabolism in the SN in PD, that warrants further investigation in long-term studies.Keywords: substantia nigra, Parkinson's disease, MRI, neuronal loss, biomarker
Procedia PDF Downloads 31391 The Impact of the Application of Blockchain Technology in Accounting and Auditing
Authors: Yusuf Adebayo Oduwole
Abstract:
The evaluation of blockchain technology's potential effects on the accounting and auditing fields is the main objective of this essay. It also adds to the existing body of work by examining how these practices alter technological concerns, including cryptocurrency accounting, regulation, governance, accounting practices, and technical challenges. Examples of this advancement include the growth of the concept of blockchain and its application in accounting. This technology is being considered one of the digital revolutions that could disrupt the world and civilization as it can transfer large volumes of virtual currencies like cryptocurrencies with the help of a third party. The basis for this research is a systematic review of the articles using Vosviewer to display and reflect on the bibliometric information of the articles accessible on the Scopus database. Also, as the practice of using blockchain technology in the field of accounting and auditing is still in its infancy, it may be useful to carry out a more thorough analysis of any implications for accounting and auditing regarding aspects of governance, regulation, and cryptocurrency that have not yet been discussed or addressed to any significant extent. The main findings on the relationship between blockchain and accounting show that the application of smart contracts, such as triple-entry accounting, has increased the quality of accounting records as well as reliance on the information available. This results in fewer cyclical assignments, no need for resolution, and real-time accounting, among others. Thereby, to integrate blockchain through a computer system, one must continuously learn and remain naive when using blockchain-integrated accounting software. This includes learning about how cryptocurrencies are accounted for and regulated. In this study, three original and contributed efforts are presented. To offer a transparent view of the state of previous relevant studies and research works in accounting and auditing that focus on blockchain, it begins by using bibliographic visibility analysis and a Scopus narrative analysis. Second, it highlights legislative, governance, and ethical concerns, such as education, where it tackles the use of blockchain in accounting and auditing. Lastly, it examines the impact of blockchain technologies on the accounting recognition of cryptocurrencies. Users of the technology should, therefore, take their time and learn how it works, as well as keep abreast of the different developments. In addition, the accounting industry must integrate blockchain certification and practice, most likely offline or as part of university education for those intending to become auditors or accountants.Keywords: blockchain, crypto assets, governance, regulation & smart contracts
Procedia PDF Downloads 2690 Iris Recognition Based on the Low Order Norms of Gradient Components
Authors: Iman A. Saad, Loay E. George
Abstract:
Iris pattern is an important biological feature of human body; it becomes very hot topic in both research and practical applications. In this paper, an algorithm is proposed for iris recognition and a simple, efficient and fast method is introduced to extract a set of discriminatory features using first order gradient operator applied on grayscale images. The gradient based features are robust, up to certain extents, against the variations may occur in contrast or brightness of iris image samples; the variations are mostly occur due lightening differences and camera changes. At first, the iris region is located, after that it is remapped to a rectangular area of size 360x60 pixels. Also, a new method is proposed for detecting eyelash and eyelid points; it depends on making image statistical analysis, to mark the eyelash and eyelid as a noise points. In order to cover the features localization (variation), the rectangular iris image is partitioned into N overlapped sub-images (blocks); then from each block a set of different average directional gradient densities values is calculated to be used as texture features vector. The applied gradient operators are taken along the horizontal, vertical and diagonal directions. The low order norms of gradient components were used to establish the feature vector. Euclidean distance based classifier was used as a matching metric for determining the degree of similarity between the features vector extracted from the tested iris image and template features vectors stored in the database. Experimental tests were performed using 2639 iris images from CASIA V4-Interival database, the attained recognition accuracy has reached up to 99.92%.Keywords: iris recognition, contrast stretching, gradient features, texture features, Euclidean metric
Procedia PDF Downloads 33389 Fake News Detection Based on Fusion of Domain Knowledge and Expert Knowledge
Authors: Yulan Wu
Abstract:
The spread of fake news on social media has posed significant societal harm to the public and the nation, with its threats spanning various domains, including politics, economics, health, and more. News on social media often covers multiple domains, and existing models studied by researchers and relevant organizations often perform well on datasets from a single domain. However, when these methods are applied to social platforms with news spanning multiple domains, their performance significantly deteriorates. Existing research has attempted to enhance the detection performance of multi-domain datasets by adding single-domain labels to the data. However, these methods overlook the fact that a news article typically belongs to multiple domains, leading to the loss of domain knowledge information contained within the news text. To address this issue, research has found that news records in different domains often use different vocabularies to describe their content. In this paper, we propose a fake news detection framework that combines domain knowledge and expert knowledge. Firstly, it utilizes an unsupervised domain discovery module to generate a low-dimensional vector for each news article, representing domain embeddings, which can retain multi-domain knowledge of the news content. Then, a feature extraction module uses the domain embeddings discovered through unsupervised domain knowledge to guide multiple experts in extracting news knowledge for the total feature representation. Finally, a classifier is used to determine whether the news is fake or not. Experiments show that this approach can improve multi-domain fake news detection performance while reducing the cost of manually labeling domain labels.Keywords: fake news, deep learning, natural language processing, multiple domains
Procedia PDF Downloads 7288 Roof and Road Network Detection through Object Oriented SVM Approach Using Low Density LiDAR and Optical Imagery in Misamis Oriental, Philippines
Authors: Jigg L. Pelayo, Ricardo G. Villar, Einstine M. Opiso
Abstract:
The advances of aerial laser scanning in the Philippines has open-up entire fields of research in remote sensing and machine vision aspire to provide accurate timely information for the government and the public. Rapid mapping of polygonal roads and roof boundaries is one of its utilization offering application to disaster risk reduction, mitigation and development. The study uses low density LiDAR data and high resolution aerial imagery through object-oriented approach considering the theoretical concept of data analysis subjected to machine learning algorithm in minimizing the constraints of feature extraction. Since separating one class from another in distinct regions of a multi-dimensional feature-space, non-trivial computing for fitting distribution were implemented to formulate the learned ideal hyperplane. Generating customized hybrid feature which were then used in improving the classifier findings. Supplemental algorithms for filtering and reshaping object features are develop in the rule set for enhancing the final product. Several advantages in terms of simplicity, applicability, and process transferability is noticeable in the methodology. The algorithm was tested in the different random locations of Misamis Oriental province in the Philippines demonstrating robust performance in the overall accuracy with greater than 89% and potential to semi-automation. The extracted results will become a vital requirement for decision makers, urban planners and even the commercial sector in various assessment processes.Keywords: feature extraction, machine learning, OBIA, remote sensing
Procedia PDF Downloads 36087 Credit Card Fraud Detection with Ensemble Model: A Meta-Heuristic Approach
Authors: Gong Zhilin, Jing Yang, Jian Yin
Abstract:
The purpose of this paper is to develop a novel system for credit card fraud detection based on sequential modeling of data using hybrid deep learning models. The projected model encapsulates five major phases are pre-processing, imbalance-data handling, feature extraction, optimal feature selection, and fraud detection with an ensemble classifier. The collected raw data (input) is pre-processed to enhance the quality of the data through alleviation of the missing data, noisy data as well as null values. The pre-processed data are class imbalanced in nature, and therefore they are handled effectively with the K-means clustering-based SMOTE model. From the balanced class data, the most relevant features like improved Principal Component Analysis (PCA), statistical features (mean, median, standard deviation) and higher-order statistical features (skewness and kurtosis). Among the extracted features, the most optimal features are selected with the Self-improved Arithmetic Optimization Algorithm (SI-AOA). This SI-AOA model is the conceptual improvement of the standard Arithmetic Optimization Algorithm. The deep learning models like Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and optimized Quantum Deep Neural Network (QDNN). The LSTM and CNN are trained with the extracted optimal features. The outcomes from LSTM and CNN will enter as input to optimized QDNN that provides the final detection outcome. Since the QDNN is the ultimate detector, its weight function is fine-tuned with the Self-improved Arithmetic Optimization Algorithm (SI-AOA).Keywords: credit card, data mining, fraud detection, money transactions
Procedia PDF Downloads 12886 Low-Cost Image Processing System for Evaluating Pavement Surface Distress
Authors: Keerti Kembhavi, M. R. Archana, V. Anjaneyappa
Abstract:
Most asphalt pavement condition evaluation use rating frameworks in which asphalt pavement distress is estimated by type, extent, and severity. Rating is carried out by the pavement condition rating (PCR), which is tedious and expensive. This paper presents the development of a low-cost technique for image pavement distress analysis that permits the identification of pothole and cracks. The paper explores the application of image processing tools for the detection of potholes and cracks. Longitudinal cracking and pothole are detected using Fuzzy-C- Means (FCM) and proceeded with the Spectral Theory algorithm. The framework comprises three phases, including image acquisition, processing, and extraction of features. A digital camera (Gopro) with the holder is used to capture pavement distress images on a moving vehicle. FCM classifier and Spectral Theory algorithms are used to compute features and classify the longitudinal cracking and pothole. The Matlab2016Ra Image preparing tool kit utilizes performance analysis to identify the viability of pavement distress on selected urban stretches of Bengaluru city, India. The outcomes of image evaluation with the utilization semi-computerized image handling framework represented the features of longitudinal crack and pothole with an accuracy of about 80%. Further, the detected images are validated with the actual dimensions, and it is seen that dimension variability is about 0.46. The linear regression model y=1.171x-0.155 is obtained using the existing and experimental / image processing area. The R2 correlation square obtained from the best fit line is 0.807, which is considered in the linear regression model to be ‘large positive linear association’.Keywords: crack detection, pothole detection, spectral clustering, fuzzy-c-means
Procedia PDF Downloads 18185 Leveraging xAPI in a Corporate e-Learning Environment to Facilitate the Tracking, Modelling, and Predictive Analysis of Learner Behaviour
Authors: Libor Zachoval, Daire O Broin, Oisin Cawley
Abstract:
E-learning platforms, such as Blackboard have two major shortcomings: limited data capture as a result of the limitations of SCORM (Shareable Content Object Reference Model), and lack of incorporation of Artificial Intelligence (AI) and machine learning algorithms which could lead to better course adaptations. With the recent development of Experience Application Programming Interface (xAPI), a large amount of additional types of data can be captured and that opens a window of possibilities from which online education can benefit. In a corporate setting, where companies invest billions on the learning and development of their employees, some learner behaviours can be troublesome for they can hinder the knowledge development of a learner. Behaviours that hinder the knowledge development also raise ambiguity about learner’s knowledge mastery, specifically those related to gaming the system. Furthermore, a company receives little benefit from their investment if employees are passing courses without possessing the required knowledge and potential compliance risks may arise. Using xAPI and rules derived from a state-of-the-art review, we identified three learner behaviours, primarily related to guessing, in a corporate compliance course. The identified behaviours are: trying each option for a question, specifically for multiple-choice questions; selecting a single option for all the questions on the test; and continuously repeating tests upon failing as opposed to going over the learning material. These behaviours were detected on learners who repeated the test at least 4 times before passing the course. These findings suggest that gauging the mastery of a learner from multiple-choice questions test scores alone is a naive approach. Thus, next steps will consider the incorporation of additional data points, knowledge estimation models to model knowledge mastery of a learner more accurately, and analysis of the data for correlations between knowledge development and identified learner behaviours. Additional work could explore how learner behaviours could be utilised to make changes to a course. For example, course content may require modifications (certain sections of learning material may be shown to not be helpful to many learners to master the learning outcomes aimed at) or course design (such as the type and duration of feedback).Keywords: artificial intelligence, corporate e-learning environment, knowledge maintenance, xAPI
Procedia PDF Downloads 12184 Lexical Based Method for Opinion Detection on Tripadvisor Collection
Authors: Faiza Belbachir, Thibault Schienhinski
Abstract:
The massive development of online social networks allows users to post and share their opinions on various topics. With this huge volume of opinion, it is interesting to extract and interpret these information for different domains, e.g., product and service benchmarking, politic, system of recommendation. This is why opinion detection is one of the most important research tasks. It consists on differentiating between opinion data and factual data. The difficulty of this task is to determine an approach which returns opinionated document. Generally, there are two approaches used for opinion detection i.e. Lexical based approaches and Machine Learning based approaches. In Lexical based approaches, a dictionary of sentimental words is used, words are associated with weights. The opinion score of document is derived by the occurrence of words from this dictionary. In Machine learning approaches, usually a classifier is trained using a set of annotated document containing sentiment, and features such as n-grams of words, part-of-speech tags, and logical forms. Majority of these works are based on documents text to determine opinion score but dont take into account if these texts are really correct. Thus, it is interesting to exploit other information to improve opinion detection. In our work, we will develop a new way to consider the opinion score. We introduce the notion of trust score. We determine opinionated documents but also if these opinions are really trustable information in relation with topics. For that we use lexical SentiWordNet to calculate opinion and trust scores, we compute different features about users like (numbers of their comments, numbers of their useful comments, Average useful review). After that, we combine opinion score and trust score to obtain a final score. We applied our method to detect trust opinions in TRIPADVISOR collection. Our experimental results report that the combination between opinion score and trust score improves opinion detection.Keywords: Tripadvisor, opinion detection, SentiWordNet, trust score
Procedia PDF Downloads 19883 Using Time Series NDVI to Model Land Cover Change: A Case Study in the Berg River Catchment Area, Western Cape, South Africa
Authors: Adesuyi Ayodeji Steve, Zahn Munch
Abstract:
This study investigates the use of MODIS NDVI to identify agricultural land cover change areas on an annual time step (2007 - 2012) and characterize the trend in the study area. An ISODATA classification was performed on the MODIS imagery to select only the agricultural class producing 3 class groups namely: agriculture, agriculture/semi-natural, and semi-natural. NDVI signatures were created for the time series to identify areas dominated by cereals and vineyards with the aid of ancillary, pictometry and field sample data. The NDVI signature curve and training samples aided in creating a decision tree model in WEKA 3.6.9. From the training samples two classification models were built in WEKA using decision tree classifier (J48) algorithm; Model 1 included ISODATA classification and Model 2 without, both having accuracies of 90.7% and 88.3% respectively. The two models were used to classify the whole study area, thus producing two land cover maps with Model 1 and 2 having classification accuracies of 77% and 80% respectively. Model 2 was used to create change detection maps for all the other years. Subtle changes and areas of consistency (unchanged) were observed in the agricultural classes and crop practices over the years as predicted by the land cover classification. 41% of the catchment comprises of cereals with 35% possibly following a crop rotation system. Vineyard largely remained constant over the years, with some conversion to vineyard (1%) from other land cover classes. Some of the changes might be as a result of misclassification and crop rotation system.Keywords: change detection, land cover, modis, NDVI
Procedia PDF Downloads 40082 A Machine Learning-Based Model to Screen Antituberculosis Compound Targeted against LprG Lipoprotein of Mycobacterium tuberculosis
Authors: Syed Asif Hassan, Syed Atif Hassan
Abstract:
Multidrug-resistant Tuberculosis (MDR-TB) is an infection caused by the resistant strains of Mycobacterium tuberculosis that do not respond either to isoniazid or rifampicin, which are the most important anti-TB drugs. The increase in the occurrence of a drug-resistance strain of MTB calls for an intensive search of novel target-based therapeutics. In this context LprG (Rv1411c) a lipoprotein from MTB plays a pivotal role in the immune evasion of Mtb leading to survival and propagation of the bacterium within the host cell. Therefore, a machine learning method will be developed for generating a computational model that could predict for a potential anti LprG activity of the novel antituberculosis compound. The present study will utilize dataset from PubChem database maintained by National Center for Biotechnology Information (NCBI). The dataset involves compounds screened against MTB were categorized as active and inactive based upon PubChem activity score. PowerMV, a molecular descriptor generator, and visualization tool will be used to generate the 2D molecular descriptors for the actives and inactive compounds present in the dataset. The 2D molecular descriptors generated from PowerMV will be used as features. We feed these features into three different classifiers, namely, random forest, a deep neural network, and a recurring neural network, to build separate predictive models and choosing the best performing model based on the accuracy of predicting novel antituberculosis compound with an anti LprG activity. Additionally, the efficacy of predicted active compounds will be screened using SMARTS filter to choose molecule with drug-like features.Keywords: antituberculosis drug, classifier, machine learning, molecular descriptors, prediction
Procedia PDF Downloads 38981 Local Directional Encoded Derivative Binary Pattern Based Coral Image Classification Using Weighted Distance Gray Wolf Optimization Algorithm
Authors: Annalakshmi G., Sakthivel Murugan S.
Abstract:
This paper presents a local directional encoded derivative binary pattern (LDEDBP) feature extraction method that can be applied for the classification of submarine coral reef images. The classification of coral reef images using texture features is difficult due to the dissimilarities in class samples. In coral reef image classification, texture features are extracted using the proposed method called local directional encoded derivative binary pattern (LDEDBP). The proposed approach extracts the complete structural arrangement of the local region using local binary batten (LBP) and also extracts the edge information using local directional pattern (LDP) from the edge response available in a particular region, thereby achieving extra discriminative feature value. Typically the LDP extracts the edge details in all eight directions. The process of integrating edge responses along with the local binary pattern achieves a more robust texture descriptor than the other descriptors used in texture feature extraction methods. Finally, the proposed technique is applied to an extreme learning machine (ELM) method with a meta-heuristic algorithm known as weighted distance grey wolf optimizer (GWO) to optimize the input weight and biases of single-hidden-layer feed-forward neural networks (SLFN). In the empirical results, ELM-WDGWO demonstrated their better performance in terms of accuracy on all coral datasets, namely RSMAS, EILAT, EILAT2, and MLC, compared with other state-of-the-art algorithms. The proposed method achieves the highest overall classification accuracy of 94% compared to the other state of art methods.Keywords: feature extraction, local directional pattern, ELM classifier, GWO optimization
Procedia PDF Downloads 16380 A Machine Learning Approach for Detecting and Locating Hardware Trojans
Authors: Kaiwen Zheng, Wanting Zhou, Nan Tang, Lei Li, Yuanhang He
Abstract:
The integrated circuit industry has become a cornerstone of the information society, finding widespread application in areas such as industry, communication, medicine, and aerospace. However, with the increasing complexity of integrated circuits, Hardware Trojans (HTs) implanted by attackers have become a significant threat to their security. In this paper, we proposed a hardware trojan detection method for large-scale circuits. As HTs introduce physical characteristic changes such as structure, area, and power consumption as additional redundant circuits, we proposed a machine-learning-based hardware trojan detection method based on the physical characteristics of gate-level netlists. This method transforms the hardware trojan detection problem into a machine-learning binary classification problem based on physical characteristics, greatly improving detection speed. To address the problem of imbalanced data, where the number of pure circuit samples is far less than that of HTs circuit samples, we used the SMOTETomek algorithm to expand the dataset and further improve the performance of the classifier. We used three machine learning algorithms, K-Nearest Neighbors, Random Forest, and Support Vector Machine, to train and validate benchmark circuits on Trust-Hub, and all achieved good results. In our case studies based on AES encryption circuits provided by trust-hub, the test results showed the effectiveness of the proposed method. To further validate the method’s effectiveness for detecting variant HTs, we designed variant HTs using open-source HTs. The proposed method can guarantee robust detection accuracy in the millisecond level detection time for IC, and FPGA design flows and has good detection performance for library variant HTs.Keywords: hardware trojans, physical properties, machine learning, hardware security
Procedia PDF Downloads 14479 An Unsupervised Domain-Knowledge Discovery Framework for Fake News Detection
Authors: Yulan Wu
Abstract:
With the rapid development of social media, the issue of fake news has gained considerable prominence, drawing the attention of both the public and governments. The widespread dissemination of false information poses a tangible threat across multiple domains of society, including politics, economy, and health. However, much research has concentrated on supervised training models within specific domains, their effectiveness diminishes when applied to identify fake news across multiple domains. To solve this problem, some approaches based on domain labels have been proposed. By segmenting news to their specific area in advance, judges in the corresponding field may be more accurate on fake news. However, these approaches disregard the fact that news records can pertain to multiple domains, resulting in a significant loss of valuable information. In addition, the datasets used for training must all be domain-labeled, which creates unnecessary complexity. To solve these problems, an unsupervised domain knowledge discovery framework for fake news detection is proposed. Firstly, to effectively retain the multidomain knowledge of the text, a low-dimensional vector for each news text to capture domain embeddings is generated. Subsequently, a feature extraction module utilizing the unsupervisedly discovered domain embeddings is used to extract the comprehensive features of news. Finally, a classifier is employed to determine the authenticity of the news. To verify the proposed framework, a test is conducted on the existing widely used datasets, and the experimental results demonstrate that this method is able to improve the detection performance for fake news across multiple domains. Moreover, even in datasets that lack domain labels, this method can still effectively transfer domain knowledge, which can educe the time consumed by tagging without sacrificing the detection accuracy.Keywords: fake news, deep learning, natural language processing, multiple domains
Procedia PDF Downloads 9678 Lung Cancer Detection and Multi Level Classification Using Discrete Wavelet Transform Approach
Authors: V. Veeraprathap, G. S. Harish, G. Narendra Kumar
Abstract:
Uncontrolled growth of abnormal cells in the lung in the form of tumor can be either benign (non-cancerous) or malignant (cancerous). Patients with Lung Cancer (LC) have an average of five years life span expectancy provided diagnosis, detection and prediction, which reduces many treatment options to risk of invasive surgery increasing survival rate. Computed Tomography (CT), Positron Emission Tomography (PET), and Magnetic Resonance Imaging (MRI) for earlier detection of cancer are common. Gaussian filter along with median filter used for smoothing and noise removal, Histogram Equalization (HE) for image enhancement gives the best results without inviting further opinions. Lung cavities are extracted and the background portion other than two lung cavities is completely removed with right and left lungs segmented separately. Region properties measurements area, perimeter, diameter, centroid and eccentricity measured for the tumor segmented image, while texture is characterized by Gray-Level Co-occurrence Matrix (GLCM) functions, feature extraction provides Region of Interest (ROI) given as input to classifier. Two levels of classifications, K-Nearest Neighbor (KNN) is used for determining patient condition as normal or abnormal, while Artificial Neural Networks (ANN) is used for identifying the cancer stage is employed. Discrete Wavelet Transform (DWT) algorithm is used for the main feature extraction leading to best efficiency. The developed technology finds encouraging results for real time information and on line detection for future research.Keywords: artificial neural networks, ANN, discrete wavelet transform, DWT, gray-level co-occurrence matrix, GLCM, k-nearest neighbor, KNN, region of interest, ROI
Procedia PDF Downloads 15377 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis
Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan
Abstract:
Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data Technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through Vader and Roberta model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and TFIDF Vectorization, and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.Keywords: counter vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis
Procedia PDF Downloads 8576 Spectral Mixture Model Applied to Cannabis Parcel Determination
Authors: Levent Basayigit, Sinan Demir, Yusuf Ucar, Burhan Kara
Abstract:
Many research projects require accurate delineation of the different land cover type of the agricultural area. Especially it is critically important for the definition of specific plants like cannabis. However, the complexity of vegetation stands structure, abundant vegetation species, and the smooth transition between different seconder section stages make vegetation classification difficult when using traditional approaches such as the maximum likelihood classifier. Most of the time, classification distinguishes only between trees/annual or grain. It has been difficult to accurately determine the cannabis mixed with other plants. In this paper, a mixed distribution models approach is applied to classify pure and mix cannabis parcels using Worldview-2 imagery in the Lakes region of Turkey. Five different land use types (i.e. sunflower, maize, bare soil, and cannabis) were identified in the image. A constrained Gaussian mixture discriminant analysis (GMDA) was used to unmix the image. In the study, 255 reflectance ratios derived from spectral signatures of seven bands (Blue-Green-Yellow-Red-Rededge-NIR1-NIR2) were randomly arranged as 80% for training and 20% for test data. Gaussian mixed distribution model approach is proved to be an effective and convenient way to combine very high spatial resolution imagery for distinguishing cannabis vegetation. Based on the overall accuracies of the classification, the Gaussian mixed distribution model was found to be very successful to achieve image classification tasks. This approach is sensitive to capture the illegal cannabis planting areas in the large plain. This approach can also be used for monitoring and determination with spectral reflections in illegal cannabis planting areas.Keywords: Gaussian mixture discriminant analysis, spectral mixture model, Worldview-2, land parcels
Procedia PDF Downloads 19475 Applying Multiplicative Weight Update to Skin Cancer Classifiers
Authors: Animish Jain
Abstract:
This study deals with using Multiplicative Weight Update within artificial intelligence and machine learning to create models that can diagnose skin cancer using microscopic images of cancer samples. In this study, the multiplicative weight update method is used to take the predictions of multiple models to try and acquire more accurate results. Logistic Regression, Convolutional Neural Network (CNN), and Support Vector Machine Classifier (SVMC) models are employed within the Multiplicative Weight Update system. These models are trained on pictures of skin cancer from the ISIC-Archive, to look for patterns to label unseen scans as either benign or malignant. These models are utilized in a multiplicative weight update algorithm which takes into account the precision and accuracy of each model through each successive guess to apply weights to their guess. These guesses and weights are then analyzed together to try and obtain the correct predictions. The research hypothesis for this study stated that there would be a significant difference in the accuracy of the three models and the Multiplicative Weight Update system. The SVMC model had an accuracy of 77.88%. The CNN model had an accuracy of 85.30%. The Logistic Regression model had an accuracy of 79.09%. Using Multiplicative Weight Update, the algorithm received an accuracy of 72.27%. The final conclusion that was drawn was that there was a significant difference in the accuracy of the three models and the Multiplicative Weight Update system. The conclusion was made that using a CNN model would be the best option for this problem rather than a Multiplicative Weight Update system. This is due to the possibility that Multiplicative Weight Update is not effective in a binary setting where there are only two possible classifications. In a categorical setting with multiple classes and groupings, a Multiplicative Weight Update system might become more proficient as it takes into account the strengths of multiple different models to classify images into multiple categories rather than only two categories, as shown in this study. This experimentation and computer science project can help to create better algorithms and models for the future of artificial intelligence in the medical imaging field.Keywords: artificial intelligence, machine learning, multiplicative weight update, skin cancer
Procedia PDF Downloads 7874 First-Trimester Screening of Preeclampsia in a Routine Care
Authors: Tamar Grdzelishvili, Zaza Sinauridze
Abstract:
Introduction: Preeclampsia is a complication of the second trimester of pregnancy, which is characterized by high morbidity and multiorgan damage. Many complex pathogenic mechanisms are now implicated to be responsible for this disease (1). Preeclampsia is one of the leading causes of maternal mortality worldwide. Statistics are enough to convince you of the seriousness of this pathology: about 100,000 women die of preeclampsia every year. It occurs in 3-14% (varies significantly depending on racial origin or ethnicity and geographical region) of pregnant women, in 75% of cases - in a mild form, and in 25% - in a severe form. During severe pre-eclampsia-eclampsia, perinatal mortality increases by 5 times and stillbirth by 9.6 times. Considering that the only way to treat the disease is to end the pregnancy, the main thing is timely diagnosis and prevention of the disease. Identification of high-risk pregnant women for PE and giving prophylaxis would reduce the incidence of preterm PE. First-trimester screening model developed by the Fetal Medicine Foundation (FMF), which uses the Bayes-theorem to combine maternal characteristics and medical history together with measurements of mean arterial pressure, uterine artery pulsatility index, and serum placental growth factor, has been proven to be effective and have superior screening performance to that of traditional risk factor-based approach for the prediction of PE (2) Methods: Retrospective single center screening study. The study population consisted of women from the Tbilisi maternity hospital “Pineo medical ecosystem” who met the following criteria: they spoke Georgian, English, or Russian and agreed to participate in the study after discussing informed consent and answering questions. Prior to the study, the informed consent forms approved by the Institutional Review Board were obtained from the study subjects. Early assessment of preeclampsia was performed between 11-13 weeks of pregnancy. The following were evaluated: anamnesis, dopplerography of the uterine artery, mean arterial blood pressure, and biochemical parameter: Pregnancy-associated plasma protein A (PAPP-A). Individual risk assessment was performed with performed by Fast Screen 3.0 software ThermoFisher scientific. Results: A total of 513 women were recruited and through the study, 51 women were diagnosed with preeclampsia (34.5% in the pregnant women with high risk, 6.5% in the pregnant women with low risk; P<0.000 1). Conclusions: First-trimester screening combining maternal factors with uterine artery Doppler, blood pressure, and pregnancy-associated plasma protein-A is useful to predict PE in a routine care setting. More patient studies are needed for final conclusions. The research is still ongoing.Keywords: first-trimester, preeclampsia, screening, pregnancy-associated plasma protein
Procedia PDF Downloads 7573 Preliminary Study of Hand Gesture Classification in Upper-Limb Prosthetics Using Machine Learning with EMG Signals
Authors: Linghui Meng, James Atlas, Deborah Munro
Abstract:
There is an increasing demand for prosthetics capable of mimicking natural limb movements and hand gestures, but precise movement control of prosthetics using only electrode signals continues to be challenging. This study considers the implementation of machine learning as a means of improving accuracy and presents an initial investigation into hand gesture recognition using models based on electromyographic (EMG) signals. EMG signals, which capture muscle activity, are used as inputs to machine learning algorithms to improve prosthetic control accuracy, functionality and adaptivity. Using logistic regression, a machine learning classifier, this study evaluates the accuracy of classifying two hand gestures from the publicly available Ninapro dataset using two-time series feature extraction algorithms: Time Series Feature Extraction (TSFE) and Convolutional Neural Networks (CNNs). Trials were conducted using varying numbers of EMG channels from one to eight to determine the impact of channel quantity on classification accuracy. The results suggest that although both algorithms can successfully distinguish between hand gesture EMG signals, CNNs outperform TSFE in extracting useful information for both accuracy and computational efficiency. In addition, although more channels of EMG signals provide more useful information, they also require more complex and computationally intensive feature extractors and consequently do not perform as well as lower numbers of channels. The findings also underscore the potential of machine learning techniques in developing more effective and adaptive prosthetic control systems.Keywords: EMG, machine learning, prosthetic control, electromyographic prosthetics, hand gesture classification, CNN, computational neural networks, TSFE, time series feature extraction, channel count, logistic regression, ninapro, classifiers
Procedia PDF Downloads 2672 Relativity in Toddlers' Understanding of the Physical World as Key to Misconceptions in the Science Classroom
Authors: Michael Hast
Abstract:
Within their first year, infants can differentiate between objects based on their weight. By at least 5 years children hold consistent weight-related misconceptions about the physical world, such as that heavy things fall faster than lighter ones because of their weight. Such misconceptions are seen as a challenge for science education since they are often highly resistant to change through instruction. Understanding the time point of emergence of such ideas could, therefore, be crucial for early science pedagogy. The paper thus discusses two studies that jointly address the issue by examining young children’s search behaviour in hidden displacement tasks under consideration of relative object weight. In both studies, they were tested with a heavy or a light ball, and they either had information about one of the balls only or both. In Study 1, 88 toddlers aged 2 to 3½ years watched a ball being dropped into a curved tube and were then allowed to search for the ball in three locations – one straight beneath the tube entrance, one where the curved tube lead to, and one that corresponded to neither of the previous outcomes. Success and failure at the task were not impacted by weight of the balls alone in any particular way. However, from around 3 years onwards, relative lightness, gained through having tactile experience of both balls beforehand, enhanced search success. Conversely, relative heaviness increased search errors such that children increasingly searched in the location immediately beneath the tube entry – known as the gravity bias. In Study 2, 60 toddlers aged 2, 2½ and 3 years watched a ball roll down a ramp and behind a screen with four doors, with a barrier placed along the ramp after one of four doors. Toddlers were allowed to open the doors to find the ball. While search accuracy generally increased with age, relative weight did not play a role in 2-year-olds’ search behaviour. Relative lightness improved 2½-year-olds’ searches. At 3 years, both relative lightness and relative heaviness had a significant impact, with the former improving search accuracy and the latter reducing it. Taken together, both studies suggest that between 2 and 3 years of age, relative object weight is increasingly taken into consideration in navigating naïve physical concepts. In particular, it appears to contribute to the early emergence of misconceptions relating to object weight. This insight from developmental psychology research may have consequences for early science education and related pedagogy towards early conceptual change.Keywords: conceptual development, early science education, intuitive physics, misconceptions, object weight
Procedia PDF Downloads 18971 Innovative Screening Tool Based on Physical Properties of Blood
Authors: Basant Singh Sikarwar, Mukesh Roy, Ayush Goyal, Priya Ranjan
Abstract:
This work combines two bodies of knowledge which includes biomedical basis of blood stain formation and fluid communities’ wisdom that such formation of blood stain depends heavily on physical properties. Moreover biomedical research tells that different patterns in stains of blood are robust indicator of blood donor’s health or lack thereof. Based on these valuable insights an innovative screening tool is proposed which can act as an aide in the diagnosis of diseases such Anemia, Hyperlipidaemia, Tuberculosis, Blood cancer, Leukemia, Malaria etc., with enhanced confidence in the proposed analysis. To realize this powerful technique, simple, robust and low-cost micro-fluidic devices, a micro-capillary viscometer and a pendant drop tensiometer are designed and proposed to be fabricated to measure the viscosity, surface tension and wettability of various blood samples. Once prognosis and diagnosis data has been generated, automated linear and nonlinear classifiers have been applied into the automated reasoning and presentation of results. A support vector machine (SVM) classifies data on a linear fashion. Discriminant analysis and nonlinear embedding’s are coupled with nonlinear manifold detection in data and detected decisions are made accordingly. In this way, physical properties can be used, using linear and non-linear classification techniques, for screening of various diseases in humans and cattle. Experiments are carried out to validate the physical properties measurement devices. This framework can be further developed towards a real life portable disease screening cum diagnostics tool. Small-scale production of screening cum diagnostic devices is proposed to carry out independent test.Keywords: blood, physical properties, diagnostic, nonlinear, classifier, device, surface tension, viscosity, wettability
Procedia PDF Downloads 37470 Selection of Optimal Reduced Feature Sets of Brain Signal Analysis Using Heuristically Optimized Deep Autoencoder
Authors: Souvik Phadikar, Nidul Sinha, Rajdeep Ghosh
Abstract:
In brainwaves research using electroencephalogram (EEG) signals, finding the most relevant and effective feature set for identification of activities in the human brain is a big challenge till today because of the random nature of the signals. The feature extraction method is a key issue to solve this problem. Finding those features that prove to give distinctive pictures for different activities and similar for the same activities is very difficult, especially for the number of activities. The performance of a classifier accuracy depends on this quality of feature set. Further, more number of features result in high computational complexity and less number of features compromise with the lower performance. In this paper, a novel idea of the selection of optimal feature set using a heuristically optimized deep autoencoder is presented. Using various feature extraction methods, a vast number of features are extracted from the EEG signals and fed to the autoencoder deep neural network. The autoencoder encodes the input features into a small set of codes. To avoid the gradient vanish problem and normalization of the dataset, a meta-heuristic search algorithm is used to minimize the mean square error (MSE) between encoder input and decoder output. To reduce the feature set into a smaller one, 4 hidden layers are considered in the autoencoder network; hence it is called Heuristically Optimized Deep Autoencoder (HO-DAE). In this method, no features are rejected; all the features are combined into the response of responses of the hidden layer. The results reveal that higher accuracy can be achieved using optimal reduced features. The proposed HO-DAE is also compared with the regular autoencoder to test the performance of both. The performance of the proposed method is validated and compared with the other two methods recently reported in the literature, which reveals that the proposed method is far better than the other two methods in terms of classification accuracy.Keywords: autoencoder, brainwave signal analysis, electroencephalogram, feature extraction, feature selection, optimization
Procedia PDF Downloads 11269 Index t-SNE: Tracking Dynamics of High-Dimensional Datasets with Coherent Embeddings
Authors: Gaelle Candel, David Naccache
Abstract:
t-SNE is an embedding method that the data science community has widely used. It helps two main tasks: to display results by coloring items according to the item class or feature value; and for forensic, giving a first overview of the dataset distribution. Two interesting characteristics of t-SNE are the structure preservation property and the answer to the crowding problem, where all neighbors in high dimensional space cannot be represented correctly in low dimensional space. t-SNE preserves the local neighborhood, and similar items are nicely spaced by adjusting to the local density. These two characteristics produce a meaningful representation, where the cluster area is proportional to its size in number, and relationships between clusters are materialized by closeness on the embedding. This algorithm is non-parametric. The transformation from a high to low dimensional space is described but not learned. Two initializations of the algorithm would lead to two different embeddings. In a forensic approach, analysts would like to compare two or more datasets using their embedding. A naive approach would be to embed all datasets together. However, this process is costly as the complexity of t-SNE is quadratic and would be infeasible for too many datasets. Another approach would be to learn a parametric model over an embedding built with a subset of data. While this approach is highly scalable, points could be mapped at the same exact position, making them indistinguishable. This type of model would be unable to adapt to new outliers nor concept drift. This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved. The optimization process minimizes two costs, one relative to the embedding shape and the second relative to the support embedding’ match. The embedding with the support process can be repeated more than once, with the newly obtained embedding. The successive embedding can be used to study the impact of one variable over the dataset distribution or monitor changes over time. This method has the same complexity as t-SNE per embedding, and memory requirements are only doubled. For a dataset of n elements sorted and split into k subsets, the total embedding complexity would be reduced from O(n²) to O(n²=k), and the memory requirement from n² to 2(n=k)², which enables computation on recent laptops. The method showed promising results on a real-world dataset, allowing to observe the birth, evolution, and death of clusters. The proposed approach facilitates identifying significant trends and changes, which empowers the monitoring high dimensional datasets’ dynamics.Keywords: concept drift, data visualization, dimension reduction, embedding, monitoring, reusability, t-SNE, unsupervised learning
Procedia PDF Downloads 14168 A Proposed Optimized and Efficient Intrusion Detection System for Wireless Sensor Network
Authors: Abdulaziz Alsadhan, Naveed Khan
Abstract:
In recent years intrusions on computer network are the major security threat. Hence, it is important to impede such intrusions. The hindrance of such intrusions entirely relies on its detection, which is primary concern of any security tool like Intrusion Detection System (IDS). Therefore, it is imperative to accurately detect network attack. Numerous intrusion detection techniques are available but the main issue is their performance. The performance of IDS can be improved by increasing the accurate detection rate and reducing false positive. The existing intrusion detection techniques have the limitation of usage of raw data set for classification. The classifier may get jumble due to redundancy, which results incorrect classification. To minimize this problem, Principle Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Local Binary Pattern (LBP) can be applied to transform raw features into principle features space and select the features based on their sensitivity. Eigen values can be used to determine the sensitivity. To further classify, the selected features greedy search, back elimination, and Particle Swarm Optimization (PSO) can be used to obtain a subset of features with optimal sensitivity and highest discriminatory power. These optimal feature subset used to perform classification. For classification purpose, Support Vector Machine (SVM) and Multilayer Perceptron (MLP) used due to its proven ability in classification. The Knowledge Discovery and Data mining (KDD’99) cup dataset was considered as a benchmark for evaluating security detection mechanisms. The proposed approach can provide an optimal intrusion detection mechanism that outperforms the existing approaches and has the capability to minimize the number of features and maximize the detection rates.Keywords: Particle Swarm Optimization (PSO), Principle Component Analysis (PCA), Linear Discriminant Analysis (LDA), Local Binary Pattern (LBP), Support Vector Machine (SVM), Multilayer Perceptron (MLP)
Procedia PDF Downloads 364