Search results for: Naive Bayes classifier
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 517

Search results for: Naive Bayes classifier

157 Gait Biometric for Person Re-Identification

Authors: Lavanya Srinivasan

Abstract:

Biometric identification is to identify unique features in a person like fingerprints, iris, ear, and voice recognition that need the subject's permission and physical contact. Gait biometric is used to identify the unique gait of the person by extracting moving features. The main advantage of gait biometric to identify the gait of a person at a distance, without any physical contact. In this work, the gait biometric is used for person re-identification. The person walking naturally compared with the same person walking with bag, coat, and case recorded using longwave infrared, short wave infrared, medium wave infrared, and visible cameras. The videos are recorded in rural and in urban environments. The pre-processing technique includes human identified using YOLO, background subtraction, silhouettes extraction, and synthesis Gait Entropy Image by averaging the silhouettes. The moving features are extracted from the Gait Entropy Energy Image. The extracted features are dimensionality reduced by the principal component analysis and recognised using different classifiers. The comparative results with the different classifier show that linear discriminant analysis outperforms other classifiers with 95.8% for visible in the rural dataset and 94.8% for longwave infrared in the urban dataset.

Keywords: biometric, gait, silhouettes, YOLO

Procedia PDF Downloads 172
156 A Novel Breast Cancer Detection Algorithm Using Point Region Growing Segmentation and Pseudo-Zernike Moments

Authors: Aileen F. Wang

Abstract:

Mammography has been one of the most reliable methods for early detection and diagnosis of breast cancer. However, mammography misses about 17% and up to 30% of breast cancers due to the subtle and unstable appearances of breast cancer in their early stages. Recent computer-aided diagnosis (CADx) technology using Zernike moments has improved detection accuracy. However, it has several drawbacks: it uses manual segmentation, Zernike moments are not robust, and it still has a relatively high false negative rate (FNR)–17.6%. This project will focus on the development of a novel breast cancer detection algorithm to automatically segment the breast mass and further reduce FNR. The algorithm consists of automatic segmentation of a single breast mass using Point Region Growing Segmentation, reconstruction of the segmented breast mass using Pseudo-Zernike moments, and classification of the breast mass using the root mean square (RMS). A comparative study among the various algorithms on the segmentation and reconstruction of breast masses was performed on randomly selected mammographic images. The results demonstrated that the newly developed algorithm is the best in terms of accuracy and cost effectiveness. More importantly, the new classifier RMS has the lowest FNR–6%.

Keywords: computer aided diagnosis, mammography, point region growing segmentation, pseudo-zernike moments, root mean square

Procedia PDF Downloads 452
155 Qualitative Inquiry on Existential Concerns and Well-Being among the Youth of Higher Education Institutions in Ethiopia: Case Study of Addis Ababa University

Authors: Ezgiamn Abraha Hagos

Abstract:

Higher education is important for college students to develop their authentic identity by means of getting exposure to diverse ideas and experiences. However, current college students are not successfully achieving a satisfying sense of meaning and purpose in their lives, which often places them in a state of existential vacuum. Thus, this study uncovers the existential concerns of youth in higher education by means of assessing their view on meaningful life and integration of it as a guide into their lives and challenges faced in doing so. Data were procured from thirty undergraduate students of Addis Ababa University, Ethiopia via interview, naïve sketch method, and content analysis of selected magazines and newspapers. Data were analyzed using organization, immersion, generating themes, coding, offering interpretation as well as checking the data. Relationship, education, and belief were found to be main sources of meaning. But, many of the study participants failed to articulate their meaning in life explicitly and identified to be in a state of drifting. Moreover, hopelessness, economic problems and quality of training impinge their sense of meaning in life negatively. The content analysis principally embodied the youth in higher education as a group of people confronted with rafts of challenges such as debauchery, moral crisis, self-destructive behaviors and hankering for support and direction. Thus, crafting the asset-based approach and counseling services that will prepare the youth for the future and develop holistically in terms of body and mind are tremendously vital.

Keywords: higher education institutions; meaning in life; youth

Procedia PDF Downloads 106
154 Expanding Trading Strategies By Studying Sentiment Correlation With Data Mining Techniques

Authors: Ved Kulkarni, Karthik Kini

Abstract:

This experiment aims to understand how the media affects the power markets in the mainland United States and study the duration of reaction time between news updates and actual price movements. it have taken into account electric utility companies trading in the NYSE and excluded companies that are more politically involved and move with higher sensitivity to Politics. The scrapper checks for any news related to keywords, which are predefined and stored for each specific company. Based on this, the classifier will allocate the effect into five categories: positive, negative, highly optimistic, highly negative, or neutral. The effect on the respective price movement will be studied to understand the response time. Based on the response time observed, neural networks would be trained to understand and react to changing market conditions, achieving the best strategy in every market. The stock trader would be day trading in the first phase and making option strategy predictions based on the black holes model. The expected result is to create an AI-based system that adjusts trading strategies within the market response time to each price movement.

Keywords: data mining, language processing, artificial neural networks, sentiment analysis

Procedia PDF Downloads 17
153 Electroencephalogram Based Approach for Mental Stress Detection during Gameplay with Level Prediction

Authors: Priyadarsini Samal, Rajesh Singla

Abstract:

Many mobile games come with the benefits of entertainment by introducing stress to the human brain. In recognizing this mental stress, the brain-computer interface (BCI) plays an important role. It has various neuroimaging approaches which help in analyzing the brain signals. Electroencephalogram (EEG) is the most commonly used method among them as it is non-invasive, portable, and economical. Here, this paper investigates the pattern in brain signals when introduced with mental stress. Two healthy volunteers played a game whose aim was to search hidden words from the grid, and the levels were chosen randomly. The EEG signals during gameplay were recorded to investigate the impacts of stress with the changing levels from easy to medium to hard. A total of 16 features of EEG were analyzed for this experiment which includes power band features with relative powers, event-related desynchronization, along statistical features. Support vector machine was used as the classifier, which resulted in an accuracy of 93.9% for three-level stress analysis; for two levels, the accuracy of 92% and 98% are achieved. In addition to that, another game that was similar in nature was played by the volunteers. A suitable regression model was designed for prediction where the feature sets of the first and second game were used for testing and training purposes, respectively, and an accuracy of 73% was found.

Keywords: brain computer interface, electroencephalogram, regression model, stress, word search

Procedia PDF Downloads 187
152 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of the miRNA data is proposed. Due to the variety of cancers and high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features corresponding to the number of samples is high and the data suffer from being imbalanced. The feature selection method has been used to select features having more ability to distinguish classes and eliminating obscures features. Afterward, a Convolutional Neural Network (CNN) classifier for classification of cancer types is utilized, which employs a Genetic Algorithm to highlight optimized hyper-parameters of CNN. In order to make the process of classification by CNN faster, Graphics Processing Unit (GPU) is recommended for calculating the mathematic equation in a parallel way. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 111
151 EEG-Based Classification of Psychiatric Disorders: Bipolar Mood Disorder vs. Schizophrenia

Authors: Han-Jeong Hwang, Jae-Hyun Jo, Fatemeh Alimardani

Abstract:

An accurate diagnosis of psychiatric diseases is a challenging issue, in particular when distinct symptoms for different diseases are overlapped, such as delusions appeared in bipolar mood disorder (BMD) and schizophrenia (SCH). In the present study, we propose a useful way to discriminate BMD and SCH using electroencephalography (EEG). A total of thirty BMD and SCH patients (15 vs. 15) took part in our experiment. EEG signals were measured with nineteen electrodes attached on the scalp using the international 10-20 system, while they were exposed to a visual stimulus flickering at 16 Hz for 95 s. The flickering visual stimulus induces a certain brain signal, known as steady-state visual evoked potential (SSVEP), which is differently observed in patients with BMD and SCH, respectively, in terms of SSVEP amplitude because they process the same visual information in own unique way. For classifying BDM and SCH patients, machine learning technique was employed in which leave-one-out-cross validation was performed. The SSVEPs induced at the fundamental (16 Hz) and second harmonic (32 Hz) stimulation frequencies were extracted using fast Fourier transformation (FFT), and they were used as features. The most discriminative feature was selected using the Fisher score, and support vector machine (SVM) was used as a classifier. From the analysis, we could obtain a classification accuracy of 83.33 %, showing the feasibility of discriminating patients with BMD and SCH using EEG. We expect that our approach can be utilized for psychiatrists to more accurately diagnose the psychiatric disorders, BMD and SCH.

Keywords: bipolar mood disorder, electroencephalography, schizophrenia, machine learning

Procedia PDF Downloads 419
150 Stock Market Prediction Using Convolutional Neural Network That Learns from a Graph

Authors: Mo-Se Lee, Cheol-Hwi Ahn, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Over the past decade, deep learning has been in spotlight among various machine learning algorithms. In particular, CNN (Convolutional Neural Network), which is known as effective solution for recognizing and classifying images, has been popularly applied to classification and prediction problems in various fields. In this study, we try to apply CNN to stock market prediction, one of the most challenging tasks in the machine learning research. In specific, we propose to apply CNN as the binary classifier that predicts stock market direction (up or down) by using a graph as its input. That is, our proposal is to build a machine learning algorithm that mimics a person who looks at the graph and predicts whether the trend will go up or down. Our proposed model consists of four steps. In the first step, it divides the dataset into 5 days, 10 days, 15 days, and 20 days. And then, it creates graphs for each interval in step 2. In the next step, CNN classifiers are trained using the graphs generated in the previous step. In step 4, it optimizes the hyper parameters of the trained model by using the validation dataset. To validate our model, we will apply it to the prediction of KOSPI200 for 1,986 days in eight years (from 2009 to 2016). The experimental dataset will include 14 technical indicators such as CCI, Momentum, ROC and daily closing price of KOSPI200 of Korean stock market.

Keywords: convolutional neural network, deep learning, Korean stock market, stock market prediction

Procedia PDF Downloads 425
149 The Ameliorative Effects of the Histamine H3 Receptor Antagonist/Inverse Agonist DL77 on MK801-Induced Memory Deficits in Rats

Authors: B. Sadek, N. Khan, Shreesh K. Ojha, Adel Sadeq, D. Lazewska, K. Kiec-Kononowicz

Abstract:

The involvement of Histamine H3 receptors (H3Rs) in memory and the potential role of H3R antagonists in pharmacological control of neurodegenerative disorders, e.g., Alzheimer disease (AD) is well established. Therefore, the memory-enhancing effects of the H3R antagonist DL77 on MK801-induced cognitive deficits were evaluated in passive avoidance paradigm (PAP) and novel object recognition (NOR) tasks in adult male rats, applying donepezil (DOZ) as a reference drug. Animals pretreated with acute systemic administration of DL77 (2.5, 5, and 10 mg/kg, i.p.) were significantly ameliorated in regard to MK801-induced memory deficits in PAP. The ameliorative effect of most effective dose of DL77 (5 mg/kg, i.p.) was abrogated when animals were pretreated with a co-injection with the H3R agonist R-(α)-methylhistamine (RAMH, 10 mg/kg, i.p.). Moreover, and in the NOR paradigm, DL77 (5 mg/kg, i.p.) reversed MK801-induced deficits long-term memory (LTM), and the DL77-provided procognitive effect was comparable to that of reference drug DOZ, and was reversed when animals were co-injected with RAMH (10 mg/kg, i.p.). However, DL77(5 mg/kg, i.p.) failed to alter short-term memory (STM) impairment in NOR test. Furthermore, DL77 (5 mg/kg) failed to induce any alterations of anxiety and locomotor behaviors of animals naive to elevated-plus maze (EPM), indicating that the ameliorative effects observed in PAP or NOR tests were not associated to alterations in emotions or in natural locomotion of tested animals. These results reveal the potential contribution of H3Rs in modulating CNS neurotransmission systems associated with neurodegenerative disorders, e.g., AD.

Keywords: histamine H3 receptor, antagonist, learning and memory, Alzheimer's disease, neurodegeneration, passive avoidance paradigm, novel object recognition, behavioral research

Procedia PDF Downloads 155
148 Adaptive Swarm Balancing Algorithms for Rare-Event Prediction in Imbalanced Healthcare Data

Authors: Jinyan Li, Simon Fong, Raymond Wong, Mohammed Sabah, Fiaidhi Jinan

Abstract:

Clinical data analysis and forecasting have make great contributions to disease control, prevention and detection. However, such data usually suffer from highly unbalanced samples in class distributions. In this paper, we target at the binary imbalanced dataset, where the positive samples take up only the minority. We investigate two different meta-heuristic algorithms, particle swarm optimization and bat-inspired algorithm, and combine both of them with the synthetic minority over-sampling technique (SMOTE) for processing the datasets. One approach is to process the full dataset as a whole. The other is to split up the dataset and adaptively process it one segment at a time. The experimental results reveal that while the performance improvements obtained by the former methods are not scalable to larger data scales, the later one, which we call Adaptive Swarm Balancing Algorithms, leads to significant efficiency and effectiveness improvements on large datasets. We also find it more consistent with the practice of the typical large imbalanced medical datasets. We further use the meta-heuristic algorithms to optimize two key parameters of SMOTE. Leading to more credible performances of the classifier, and shortening the running time compared with the brute-force method.

Keywords: Imbalanced dataset, meta-heuristic algorithm, SMOTE, big data

Procedia PDF Downloads 441
147 Blood Glucose Level Measurement from Breath Analysis

Authors: Tayyab Hassan, Talha Rehman, Qasim Abdul Aziz, Ahmad Salman

Abstract:

The constant monitoring of blood glucose level is necessary for maintaining health of patients and to alert medical specialists to take preemptive measures before the onset of any complication as a result of diabetes. The current clinical monitoring of blood glucose uses invasive methods repeatedly which are uncomfortable and may result in infections in diabetic patients. Several attempts have been made to develop non-invasive techniques for blood glucose measurement. In this regard, the existing methods are not reliable and are less accurate. Other approaches claiming high accuracy have not been tested on extended dataset, and thus, results are not statistically significant. It is a well-known fact that acetone concentration in breath has a direct relation with blood glucose level. In this paper, we have developed the first of its kind, reliable and high accuracy breath analyzer for non-invasive blood glucose measurement. The acetone concentration in breath was measured using MQ 138 sensor in the samples collected from local hospitals in Pakistan involving one hundred patients. The blood glucose levels of these patients are determined using conventional invasive clinical method. We propose a linear regression classifier that is trained to map breath acetone level to the collected blood glucose level achieving high accuracy.

Keywords: blood glucose level, breath acetone concentration, diabetes, linear regression

Procedia PDF Downloads 171
146 MhAGCN: Multi-Head Attention Graph Convolutional Network for Web Services Classification

Authors: Bing Li, Zhi Li, Yilong Yang

Abstract:

Web classification can promote the quality of service discovery and management in the service repository. It is widely used to locate developers desired services. Although traditional classification methods based on supervised learning models can achieve classification tasks, developers need to manually mark web services, and the quality of these tags may not be enough to establish an accurate classifier for service classification. With the doubling of the number of web services, the manual tagging method has become unrealistic. In recent years, the attention mechanism has made remarkable progress in the field of deep learning, and its huge potential has been fully demonstrated in various fields. This paper designs a multi-head attention graph convolutional network (MHAGCN) service classification method, which can assign different weights to the neighborhood nodes without complicated matrix operations or relying on understanding the entire graph structure. The framework combines the advantages of the attention mechanism and graph convolutional neural network. It can classify web services through automatic feature extraction. The comprehensive experimental results on a real dataset not only show the superior performance of the proposed model over the existing models but also demonstrate its potentially good interpretability for graph analysis.

Keywords: attention mechanism, graph convolutional network, interpretability, service classification, service discovery

Procedia PDF Downloads 135
145 A Neural Network Classifier for Estimation of the Degree of Infestation by Late Blight on Tomato Leaves

Authors: Gizelle K. Vianna, Gabriel V. Cunha, Gustavo S. Oliveira

Abstract:

Foliage diseases in plants can cause a reduction in both quality and quantity of agricultural production. Intelligent detection of plant diseases is an essential research topic as it may help monitoring large fields of crops by automatically detecting the symptoms of foliage diseases. This work investigates ways to recognize the late blight disease from the analysis of tomato digital images, collected directly from the field. A pair of multilayer perceptron neural network analyzes the digital images, using data from both RGB and HSL color models, and classifies each image pixel. One neural network is responsible for the identification of healthy regions of the tomato leaf, while the other identifies the injured regions. The outputs of both networks are combined to generate the final classification of each pixel from the image and the pixel classes are used to repaint the original tomato images by using a color representation that highlights the injuries on the plant. The new images will have only green, red or black pixels, if they came from healthy or injured portions of the leaf, or from the background of the image, respectively. The system presented an accuracy of 97% in detection and estimation of the level of damage on the tomato leaves caused by late blight.

Keywords: artificial neural networks, digital image processing, pattern recognition, phytosanitary

Procedia PDF Downloads 327
144 Fused Structure and Texture (FST) Features for Improved Pedestrian Detection

Authors: Hussin K. Ragb, Vijayan K. Asari

Abstract:

In this paper, we present a pedestrian detection descriptor called Fused Structure and Texture (FST) features based on the combination of the local phase information with the texture features. Since the phase of the signal conveys more structural information than the magnitude, the phase congruency concept is used to capture the structural features. On the other hand, the Center-Symmetric Local Binary Pattern (CSLBP) approach is used to capture the texture information of the image. The dimension less quantity of the phase congruency and the robustness of the CSLBP operator on the flat images, as well as the blur and illumination changes, lead the proposed descriptor to be more robust and less sensitive to the light variations. The proposed descriptor can be formed by extracting the phase congruency and the CSLBP values of each pixel of the image with respect to its neighborhood. The histogram of the oriented phase and the histogram of the CSLBP values for the local regions in the image are computed and concatenated to construct the FST descriptor. Several experiments were conducted on INRIA and the low resolution DaimlerChrysler datasets to evaluate the detection performance of the pedestrian detection system that is based on the FST descriptor. A linear Support Vector Machine (SVM) is used to train the pedestrian classifier. These experiments showed that the proposed FST descriptor has better detection performance over a set of state of the art feature extraction methodologies.

Keywords: pedestrian detection, phase congruency, local phase, LBP features, CSLBP features, FST descriptor

Procedia PDF Downloads 488
143 A Replicon-Baculovirus Model for Efficient Packaging of Hepatitis E Virus RNA and Production of Infectious Virions

Authors: Mohammad K. Parvez, Mohammed S. Al-Dosari

Abstract:

Hepatitis E virus (HEV) is an emerging RNA virus that causes acute and chronic liver disease with a global mortality rate of about 2%. Despite milestone developments in understanding of HEV biology, there is still lack of a robust culture system or animal model. Therefore, in a novel approach, two recombinant-baculoviruses (vBac-ORF2 and vBac-ORF3) that could overexpress HEV ORF2 (structural/capsid) and ORF3 (nonstructural/regulatory) proteins, respectively were constructed. The established HEV-SAR55 (genotype 1) replicon that contained GFP gene, in place of ORF2/ORF3 sequences was in vitro transcribed, and GFP production in RNA transfected S10-3 cells was scored by FACS. Enhanced infectivity, if any, of nascent virions produced by exogenously-supplied ORF2 and viral RNA by co-expression of ORF3 was tested on naïve HepG2 cells. Co-transduction with vBac-ORF2/vBac-ORF3 (108 pfu/microL) produced high amounts of native ORF2/ORF3 in approximately 60% of S10-3 cells, determined by immunofluorescence microscopy and Western analysis. FACS analysis showed about 9% GFP positivity of S10-3 cells on day6 post-transfection (i.e, day5 post-transduction). Further, FACS scoring indicated that lysates from S10-3 cultures receiving the RNA plus vBac-ORF2 were capable of producing HEV particles with about 4% infectivity in HepG2 cells. However, lysates of cultures co-transduced with vBac-ORF3, were found to further enhance virion infectivity by approximately 17%. This supported a previously proposed role of ORF3 as a minor-structural protein in HEV virion assembly and infectivity. In conclusion, the present model for efficient genomic RNA packaging and production of infectious virions could be a valuable tool to study various aspects of HEV molecular biology, in vitro.

Keywords: chronic liver disease, hepatitis E virus, ORF2, ORF3, replicon

Procedia PDF Downloads 255
142 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: anomaly detection, autoencoder, data centers, deep learning

Procedia PDF Downloads 194
141 Intelligent Transport System: Classification of Traffic Signs Using Deep Neural Networks in Real Time

Authors: Anukriti Kumar, Tanmay Singh, Dinesh Kumar Vishwakarma

Abstract:

Traffic control has been one of the most common and irritating problems since the time automobiles have hit the roads. Problems like traffic congestion have led to a significant time burden around the world and one significant solution to these problems can be the proper implementation of the Intelligent Transport System (ITS). It involves the integration of various tools like smart sensors, artificial intelligence, position technologies and mobile data services to manage traffic flow, reduce congestion and enhance driver's ability to avoid accidents during adverse weather. Road and traffic signs’ recognition is an emerging field of research in ITS. Classification problem of traffic signs needs to be solved as it is a major step in our journey towards building semi-autonomous/autonomous driving systems. The purpose of this work focuses on implementing an approach to solve the problem of traffic sign classification by developing a Convolutional Neural Network (CNN) classifier using the GTSRB (German Traffic Sign Recognition Benchmark) dataset. Rather than using hand-crafted features, our model addresses the concern of exploding huge parameters and data method augmentations. Our model achieved an accuracy of around 97.6% which is comparable to various state-of-the-art architectures.

Keywords: multiclass classification, convolution neural network, OpenCV

Procedia PDF Downloads 176
140 Diagnosis and Analysis of Automated Liver and Tumor Segmentation on CT

Authors: R. R. Ramsheeja, R. Sreeraj

Abstract:

For view the internal structures of the human body such as liver, brain, kidney etc have a wide range of different modalities for medical images are provided nowadays. Computer Tomography is one of the most significant medical image modalities. In this paper use CT liver images for study the use of automatic computer aided techniques to calculate the volume of the liver tumor. Segmentation method is used for the detection of tumor from the CT scan is proposed. Gaussian filter is used for denoising the liver image and Adaptive Thresholding algorithm is used for segmentation. Multiple Region Of Interest(ROI) based method that may help to characteristic the feature different. It provides a significant impact on classification performance. Due to the characteristic of liver tumor lesion, inherent difficulties appear selective. For a better performance, a novel proposed system is introduced. Multiple ROI based feature selection and classification are performed. In order to obtain of relevant features for Support Vector Machine(SVM) classifier is important for better generalization performance. The proposed system helps to improve the better classification performance, reason in which we can see a significant reduction of features is used. The diagnosis of liver cancer from the computer tomography images is very difficult in nature. Early detection of liver tumor is very helpful to save the human life.

Keywords: computed tomography (CT), multiple region of interest(ROI), feature values, segmentation, SVM classification

Procedia PDF Downloads 509
139 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 44
138 A Kernel-Based Method for MicroRNA Precursor Identification

Authors: Bin Liu

Abstract:

MicroRNAs (miRNAs) are small non-coding RNA molecules, functioning in transcriptional and post-transcriptional regulation of gene expression. The discrimination of the real pre-miRNAs from the false ones (such as hairpin sequences with similar stem-loops) is necessary for the understanding of miRNAs’ role in the control of cell life and death. Since both their small size and sequence specificity, it cannot be based on sequence information alone but requires structure information about the miRNA precursor to get satisfactory performance. Kmers are convenient and widely used features for modeling the properties of miRNAs and other biological sequences. However, Kmers suffer from the inherent limitation that if the parameter K is increased to incorporate long range effects, some certain Kmer will appear rarely or even not appear, as a consequence, most Kmers absent and a few present once. Thus, the statistical learning approaches using Kmers as features become susceptible to noisy data once K becomes large. In this study, we proposed a Gapped k-mer approach to overcome the disadvantages of Kmers, and applied this method to the field of miRNA prediction. Combined with the structure status composition, a classifier called imiRNA-GSSC was proposed. We show that compared to the original imiRNA-kmer and alternative approaches. Trained on human miRNA precursors, this predictor can achieve an accuracy of 82.34 for predicting 4022 pre-miRNA precursors from eleven species.

Keywords: gapped k-mer, imiRNA-GSSC, microRNA precursor, support vector machine

Procedia PDF Downloads 161
137 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about learning process, it can hold information about advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such a feedback can form a key point for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end of unit general textual feedback using part of speech feature (PoS) with four machine learning algorithms: support vector machines, decision tree, random forest, and naive bays. The proposed framework has two tasks: first, to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets, one subset is tailored to assessment practices (assessment related), and the other one is the non-assessment related data. Second task to use the same algorithms to build an optimal model for whole data set, and the new data subsets to automatically detect their sentiment. The significance of this paper is to compare the performance of the above four algorithms using part of speech feature to the performance of the same algorithms using n-grams feature. The paper follows Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models, which is understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The results of this paper experiments show that both models which used both features performed very well regarding first task. But regarding the second task, models that used part of speech feature has underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 142
136 Enhancement Method of Network Traffic Anomaly Detection Model Based on Adversarial Training With Category Tags

Authors: Zhang Shuqi, Liu Dan

Abstract:

For the problems in intelligent network anomaly traffic detection models, such as low detection accuracy caused by the lack of training samples, poor effect with small sample attack detection, a classification model enhancement method, F-ACGAN(Flow Auxiliary Classifier Generative Adversarial Network) which introduces generative adversarial network and adversarial training, is proposed to solve these problems. Generating adversarial data with category labels could enhance the training effect and improve classification accuracy and model robustness. FACGAN consists of three steps: feature preprocess, which includes data type conversion, dimensionality reduction and normalization, etc.; A generative adversarial network model with feature learning ability is designed, and the sample generation effect of the model is improved through adversarial iterations between generator and discriminator. The adversarial disturbance factor of the gradient direction of the classification model is added to improve the diversity and antagonism of generated data and to promote the model to learn from adversarial classification features. The experiment of constructing a classification model with the UNSW-NB15 dataset shows that with the enhancement of FACGAN on the basic model, the classification accuracy has improved by 8.09%, and the score of F1 has improved by 6.94%.

Keywords: data imbalance, GAN, ACGAN, anomaly detection, adversarial training, data augmentation

Procedia PDF Downloads 104
135 Laboratory-Based Monitoring of Hepatitis B Virus Vaccination Status in North Central Nigeria

Authors: Nwadioha Samuel Iheanacho, Abah Paul, Odimayo Simidele Michael

Abstract:

Background: The World Health Assembly through the Global Health Sector Strategy on viral hepatitis calls for the elimination of viral hepatitis as a public health threat by 2030. All hands are on deck to actualize this goal through an effective and active vaccination and monitoring tool. Aim: To combine the Epidemiologic with Laboratory Hepatitis B Virus vaccination monitoring tools. Method: Laboratory results analysis of subjects recruited during the World Hepatitis week from July 2020 to July 2021 was done after obtaining their epidemiologic data on Hepatitis B virus risk factors, in the Medical Microbiology Laboratory of Benue State University Teaching Hospital, Nigeria. Result: A total of 500 subjects comprising males 60.0%(n=300/500) and females 40.0%(n=200/500) were recruited. A fifty-three percent majority was of the age range of 26 to 36 years. Serologic profiles were as follows, 15.0%(n=75/500) HBsAg; 7.0% (n=35/500) HBeAg; 8.0% (n=40/500) Anti-Hbe; 20.0% (n=100/500) Anti-HBc and 38.0% (n=190/500) Anti-HBs. Immune responses to vaccination were as follows, 47.0%(n=235/500) Immune naïve {no serologic marker + normal ALT}; 33%(n=165/500) Immunity by vaccination {Anti-HBs + normal ALT}; 5%(n=25/500) Immunity to previous infection {Anti-HBs, Anti-HBc, +/- Anti-HBe + normal ALT}; 8%(n=40/500) Carriers {HBsAg, Anti-HBc, Anti-HBe +normal ALT} and 7% (35/500) Anti-HBe serum- negative infections {HBsAg, HBeAg, Anti-HBc +elevated ALT}. Conclusion: The present 33.0% immunity by vaccination coverage in Central Nigeria was much lower than the 41.0% national peak in 2013, and a far cry from the global expectation of attainment of a Global Health Sector Strategy on the elimination of viral hepatitis as a public health threat by 2030. Therefore, more creative ideas and collective effort are needed to attain this goal of the World Health Assembly.

Keywords: Hepatitis B, vaccination status, laboratory tools, resource-limited settings

Procedia PDF Downloads 74
134 Tensor Deep Stacking Neural Networks and Bilinear Mapping Based Speech Emotion Classification Using Facial Electromyography

Authors: P. S. Jagadeesh Kumar, Yang Yung, Wenli Hu

Abstract:

Speech emotion classification is a dominant research field in finding a sturdy and profligate classifier appropriate for different real-life applications. This effort accentuates on classifying different emotions from speech signal quarried from the features related to pitch, formants, energy contours, jitter, shimmer, spectral, perceptual and temporal features. Tensor deep stacking neural networks were supported to examine the factors that influence the classification success rate. Facial electromyography signals were composed of several forms of focuses in a controlled atmosphere by means of audio-visual stimuli. Proficient facial electromyography signals were pre-processed using moving average filter, and a set of arithmetical features were excavated. Extracted features were mapped into consistent emotions using bilinear mapping. With facial electromyography signals, a database comprising diverse emotions will be exposed with a suitable fine-tuning of features and training data. A success rate of 92% can be attained deprived of increasing the system connivance and the computation time for sorting diverse emotional states.

Keywords: speech emotion classification, tensor deep stacking neural networks, facial electromyography, bilinear mapping, audio-visual stimuli

Procedia PDF Downloads 254
133 Predictive Value of Hepatitis B Core-Related Antigen (HBcrAg) during Natural History of Hepatitis B Virus Infection

Authors: Yanhua Zhao, Yu Gou, Shu Feng, Dongdong Li, Chuanmin Tao

Abstract:

The natural history of HBV infection could experience immune tolerant (IT), immune clearance (IC), HBeAg-negative inactive/quienscent carrier (ENQ), and HBeAg-negative hepatitis (ENH). As current biomarkers for discriminating these four phases have some weaknesses, additional serological indicators are needed. Hepatits B core-related antigen (HBcrAg) encoded with precore/core gene contains denatured HBeAg, HBV core antigen (HBcAg) and a 22KDa precore protein (p22cr), which was demonstrated to have a close association with natural history of hepatitis B infection, but no specific cutoff values and diagnostic parameters to evaluate the diagnostic efficacy. This study aimed to clarify the distribution of HBcrAg levels and evaluate its diagnostic performance during the natural history of infection from a Western Chinese perspective. 294 samples collected from treatment-naïve chronic hepatitis B (CHB) patients in different phases (IT=64; IC=72; ENQ=100, and ENH=58). We detected the HBcrAg values and analyzed the relationship between HBcrAg and HBV DNA. HBsAg and other clinical parameters were quantitatively tested. HBcrAg levels of four phases were 9.30 log U/mL, 8.80 log U/mL, 3.00 log U/mL, and 5.10 logU/mL, respectively (p < 0.0001). Receiver operating characteristic curve analysis demonstrated that the area under curves (AUCs) of HBcrAg and quantitative HBsAg at cutoff values of 9.25 log U/mL and 4.355 log IU/mL for distinguishing IT from IC phases were 0.704 and 0.694, with sensitivity 76.39% and 59.72%, specificity 53.13% and 79.69%, respectively. AUCs of HBcrAg and quantitative HBsAg at cutoff values of 4.15 log U/mlmL and 2.395 log IU/mlmL for discriminating between ENQ and ENH phases were 0.931 and 0.653, with sensitivity 87.93% and 84%, specificity 91.38% and 39%, respectively. Therefore, HBcrAg levels varied significantly among four natural phases of HBV infection. It had higher predictive performance than quantitative HBsAg for distinguishing between ENQ-patients and ENH-patients and similar performance with HBsAg for the discrimination between IT and IC phases, which indicated that HBcrAg could be a potential serological marker for CHB.

Keywords: chronic hepatitis B, hepatitis B core-related antigen, hepatitis B surface antigens, hepatitis B virus

Procedia PDF Downloads 417
132 Diversity Indices as a Tool for Evaluating Quality of Water Ways

Authors: Khadra Ahmed, Khaled Kheireldin

Abstract:

In this paper, we present a pedestrian detection descriptor called Fused Structure and Texture (FST) features based on the combination of the local phase information with the texture features. Since the phase of the signal conveys more structural information than the magnitude, the phase congruency concept is used to capture the structural features. On the other hand, the Center-Symmetric Local Binary Pattern (CSLBP) approach is used to capture the texture information of the image. The dimension less quantity of the phase congruency and the robustness of the CSLBP operator on the flat images, as well as the blur and illumination changes, lead the proposed descriptor to be more robust and less sensitive to the light variations. The proposed descriptor can be formed by extracting the phase congruency and the CSLBP values of each pixel of the image with respect to its neighborhood. The histogram of the oriented phase and the histogram of the CSLBP values for the local regions in the image are computed and concatenated to construct the FST descriptor. Several experiments were conducted on INRIA and the low resolution DaimlerChrysler datasets to evaluate the detection performance of the pedestrian detection system that is based on the FST descriptor. A linear Support Vector Machine (SVM) is used to train the pedestrian classifier. These experiments showed that the proposed FST descriptor has better detection performance over a set of state of the art feature extraction methodologies.

Keywords: planktons, diversity indices, water quality index, water ways

Procedia PDF Downloads 518
131 Major Depressive Disorder: Diagnosis based on Electroencephalogram Analysis

Authors: Wajid Mumtaz, Aamir Saeed Malik, Syed Saad Azhar Ali, Mohd Azhar Mohd Yasin

Abstract:

In this paper, a technique based on electroencephalogram (EEG) analysis is presented, aiming for diagnosing major depressive disorder (MDD) among a potential population of MDD patients and healthy controls. EEG is recognized as a clinical modality during applications such as seizure diagnosis, index for anesthesia, detection of brain death or stroke. However, its usability for psychiatric illnesses such as MDD is less studied. Therefore, in this study, for the sake of diagnosis, 2 groups of study participants were recruited, 1) MDD patients, 2) healthy people as controls. EEG data acquired from both groups were analyzed involving inter-hemispheric asymmetry and composite permutation entropy index (CPEI). To automate the process, derived quantities from EEG were utilized as inputs to classifier such as logistic regression (LR) and support vector machine (SVM). The learning of these classification models was tested with a test dataset. Their learning efficiency is provided as accuracy of classifying MDD patients from controls, their sensitivities and specificities were reported, accordingly (LR =81.7 % and SVM =81.5 %). Based on the results, it is concluded that the derived measures are indicators for diagnosing MDD from a potential population of normal controls. In addition, the results motivate further exploring other measures for the same purpose.

Keywords: major depressive disorder, diagnosis based on EEG, EEG derived features, CPEI, inter-hemispheric asymmetry

Procedia PDF Downloads 546
130 Iris Feature Extraction and Recognition Based on Two-Dimensional Gabor Wavelength Transform

Authors: Bamidele Samson Alobalorun, Ifedotun Roseline Idowu

Abstract:

Biometrics technologies apply the human body parts for their unique and reliable identification based on physiological traits. The iris recognition system is a biometric–based method for identification. The human iris has some discriminating characteristics which provide efficiency to the method. In order to achieve this efficiency, there is a need for feature extraction of the distinct features from the human iris in order to generate accurate authentication of persons. In this study, an approach for an iris recognition system using 2D Gabor for feature extraction is applied to iris templates. The 2D Gabor filter formulated the patterns that were used for training and equally sent to the hamming distance matching technique for recognition. A comparison of results is presented using two iris image subjects of different matching indices of 1,2,3,4,5 filter based on the CASIA iris image database. By comparing the two subject results, the actual computational time of the developed models, which is measured in terms of training and average testing time in processing the hamming distance classifier, is found with best recognition accuracy of 96.11% after capturing the iris localization or segmentation using the Daughman’s Integro-differential, the normalization is confined to the Daugman’s rubber sheet model.

Keywords: Daugman rubber sheet, feature extraction, Hamming distance, iris recognition system, 2D Gabor wavelet transform

Procedia PDF Downloads 65
129 A Pattern Recognition Neural Network Model for Detection and Classification of SQL Injection Attacks

Authors: Naghmeh Moradpoor Sheykhkanloo

Abstract:

Structured Query Language Injection (SQLI) attack is a code injection technique in which malicious SQL statements are inserted into a given SQL database by simply using a web browser. Losing data, disclosing confidential information or even changing the value of data are the severe damages that SQLI attack can cause on a given database. SQLI attack has also been rated as the number-one attack among top ten web application threats on Open Web Application Security Project (OWASP). OWASP is an open community dedicated to enabling organisations to consider, develop, obtain, function, and preserve applications that can be trusted. In this paper, we propose an effective pattern recognition neural network model for detection and classification of SQLI attacks. The proposed model is built from three main elements of: a Uniform Resource Locator (URL) generator in order to generate thousands of malicious and benign URLs, a URL classifier in order to: 1) classify each generated URL to either a benign URL or a malicious URL and 2) classify the malicious URLs into different SQLI attack categories, and an NN model in order to: 1) detect either a given URL is a malicious URL or a benign URL and 2) identify the type of SQLI attack for each malicious URL. The model is first trained and then evaluated by employing thousands of benign and malicious URLs. The results of the experiments are presented in order to demonstrate the effectiveness of the proposed approach.

Keywords: neural networks, pattern recognition, SQL injection attacks, SQL injection attack classification, SQL injection attack detection

Procedia PDF Downloads 469
128 Dynamic Fault Diagnosis for Semi-Batch Reactor Under Closed-Loop Control via Independent RBFNN

Authors: Abdelkarim M. Ertiame, D. W. Yu, D. L. Yu, J. B. Gomm

Abstract:

In this paper, a new robust fault detection and isolation (FDI) scheme is developed to monitor a multivariable nonlinear chemical process called the Chylla-Haase polymerization reactor when it is under the cascade PI control. The scheme employs a radial basis function neural network (RBFNN) in an independent mode to model the process dynamics and using the weighted sum-squared prediction error as the residual. The recursive orthogonal Least Squares algorithm (ROLS) is employed to train the model to overcome the training difficulty of the independent mode of the network. Then, another RBFNN is used as a fault classifier to isolate faults from different features involved in the residual vector. The several actuator and sensor faults are simulated in a nonlinear simulation of the reactor in Simulink. The scheme is used to detect and isolate the faults on-line. The simulation results show the effectiveness of the scheme even the process is subjected to disturbances and uncertainties including significant changes in the monomer feed rate, fouling factor, impurity factor, ambient temperature and measurement noise. The simulation results are presented to illustrate the effectiveness and robustness of the proposed method.

Keywords: Robust fault detection, cascade control, independent RBF model, RBF neural networks, Chylla-Haase reactor, FDI under closed-loop control

Procedia PDF Downloads 496