Search results for: string classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2231

1721 Protein Remote Homology Detection and Fold Recognition by Combining Profiles with Kernel Methods

Authors: Bin Liu

Abstract:

Protein remote homology detection and fold recognition are two of the most important tasks in protein sequence analysis and are critical for protein structure and function studies. In this study, we combined profile-based features with various string kernels and constructed several computational predictors for protein remote homology detection and fold recognition. Experimental results on two widely used benchmark datasets showed that these methods outperformed the competing methods, indicating that these predictors are useful computational tools for protein sequence analysis. Analysis of the discriminative features of the trained models revealed some interesting patterns that reflect the characteristics of protein superfamilies and folds, which are important for researchers interested in finding the patterns of protein folds.

Keywords: protein remote homology detection, protein fold recognition, profile-based features, Support Vector Machines (SVMs)

Procedia PDF Downloads 154
1720 Multivariate Analysis of Spectroscopic Data for Agriculture Applications

Authors: Asmaa M. Hussein, Amr Wassal, Ahmed Farouk Al-Sadek, A. F. Abd El-Rahman

Abstract:

In this study, a multivariate analysis of potato spectroscopic data is presented to detect the presence or absence of brown rot disease. Near-Infrared (NIR) spectroscopy (1,350-2,500 nm) combined with multivariate analysis was used as a rapid, non-destructive technique for the detection of brown rot disease in potatoes. Spectral measurements were performed on 565 samples, which were chosen randomly at the infection site in the potato slice. Of these, 254 infected and 311 uninfected (brown rot-free) samples were analyzed using different advanced statistical analysis techniques. The discrimination performance of different multivariate analysis techniques, including classification, pre-processing, and dimension reduction, was compared. Applying a random forest classifier with different pre-processing techniques to raw spectra gave the best performance, achieving a total classification accuracy of 98.7% in discriminating infected potatoes from controls.

Keywords: Brown rot disease, NIR spectroscopy, potato, random forest

Procedia PDF Downloads 183
1719 Application of Change Detection Techniques in Monitoring Environmental Phenomena: A Review

Authors: T. Garba, Y. Y. Babanyara, T. O. Quddus, A. K. Mukatari

Abstract:

Human activities cause environmental parameters to keep changing globally. While some changes are necessary and beneficial to flora and fauna, others have serious consequences that threaten the survival of their natural habitat if they are not properly monitored and mitigated. In-situ assessments face many challenges because time series data are often absent and the areas to be observed or monitored are sometimes inaccessible. Satellite remote sensing provides digital images of the same geographic areas at pre-defined intervals. This makes it possible to monitor and detect changes in environmental phenomena. This paper, therefore, reviews the change detection techniques commonly used globally, such as image differencing, image ratioing, image regression, vegetation index differencing, change vector analysis, principal components analysis, multidate classification, post-classification comparison, and visual interpretation. The paper concludes by suggesting the use of more than one technique.
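The two simplest techniques named in this review, image differencing and image ratioing, can be sketched in a few lines. The following is only an illustration under stated assumptions: the images are small co-registered single-band grids stored as nested lists, and the function names and the fixed threshold are hypothetical choices, not taken from the paper.

```python
def image_difference(before, after):
    """Pixel-wise difference (after - before) of two co-registered images."""
    return [[a - b for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]


def image_ratio(before, after):
    """Pixel-wise ratio (after / before); None where the ratio is undefined."""
    return [[(a / b) if b != 0 else None for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]


def change_mask(before, after, threshold):
    """Flag pixels whose absolute difference exceeds an assumed threshold."""
    return [[abs(d) > threshold for d in row]
            for row in image_difference(before, after)]


# Toy 2x2 scenes: only the top-right pixel changes between the two dates.
before = [[10, 10], [10, 10]]
after = [[10, 30], [10, 10]]
print(image_difference(before, after))
print(image_ratio(before, after))
print(change_mask(before, after, threshold=5))
```

In practice the threshold is usually chosen from the statistics of the difference or ratio image rather than fixed by hand.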

Keywords: environmental phenomena, change detection, monitor, techniques

Procedia PDF Downloads 271
1718 Classification of Computer Generated Images from Photographic Images Using Convolutional Neural Networks

Authors: Chaitanya Chawla, Divya Panwar, Gurneesh Singh Anand, M. P. S Bhatia

Abstract:

This paper presents a deep-learning mechanism for distinguishing computer generated images from photographic images. The proposed method includes a convolutional layer capable of automatically learning the correlation between neighbouring pixels. In its current form, a Convolutional Neural Network (CNN) learns features based on an image's content instead of the structural features of the image. The layer is specifically designed to suppress an image's content and robustly learn the sensor pattern noise features (usually inherited from image processing in a camera) as well as the statistical properties of images. The method was assessed on recent natural and computer generated images, and it was concluded that it performs better than the current state-of-the-art methods.

Keywords: image forensics, computer graphics, classification, deep learning, convolutional neural networks

Procedia PDF Downloads 328
1717 Internal Combustion Engine Fuel Composition Detection by Analysing Vibration Signals Using ANFIS Network

Authors: M. N. Khajavi, S. Nasiri, E. Farokhi, M. R. Bavir

Abstract:

Alcohol fuels are renewable, produce low pollution, and have a high octane number; therefore, they are important as fuels in internal combustion engines. Determining the percentage of these alcohol fuels blended with gasoline is a complicated, time-consuming, and expensive process. Nowadays, this is done in equipped laboratories, based on international standards. The aim of this research is to determine the percentage of different fuels based on vibration analysis of engine block signals. By doing so, considerable savings in time and cost can be achieved. Five different fuels were prepared, consisting of pure gasoline (G) as the base fuel and blends of this fuel with different percentages of ethanol and methanol. For example, a volumetric blend of pure gasoline with 10 percent ethanol is called E10. By this convention, M10 (10% methanol plus 90% pure gasoline), E30 (30% ethanol plus 70% pure gasoline), and M30 (30% methanol plus 70% pure gasoline) were also prepared. To simulate real working conditions for this experiment, the vehicle was mounted on a chassis dynamometer and run at 1900 rpm under a 30 kW load. To measure the engine block vibration, a three-axis accelerometer was mounted between cylinders 2 and 3. After acquisition of the vibration signals, eight time-domain features of these signals were used as inputs to an Adaptive Neuro Fuzzy Inference System (ANFIS). The designed ANFIS was trained to classify these five different fuels. The results show that the designed ANFIS network has suitable classification ability, with 96.3 percent correct classification.

Keywords: internal combustion engine, vibration signal, fuel composition, classification, ANFIS

Procedia PDF Downloads 396
1716 Plant Identification Using Convolution Neural Network and Vision Transformer-Based Models

Authors: Virender Singh, Mathew Rees, Simon Hampton, Sivaram Annadurai

Abstract:

Plant identification is a challenging task that aims to identify the family, genus, and species according to plant morphological features. Automated deep learning-based computer vision algorithms are widely used for identifying plants and can help users narrow down the possibilities. However, numerous morphological similarities between and within species render correct classification difficult. In this paper, we tested custom convolution neural network (CNN) and vision transformer (ViT) based models using the PyTorch framework to classify plants. We used a large dataset of 88,000 images provided by the Royal Horticultural Society (RHS) and a smaller dataset of 16,000 images from the PlantClef 2015 dataset for classifying plants at genus and species levels, respectively. Our results show that for classifying plants at the genus level, ViT models perform better than the CNN-based models ResNet50 and ResNet-RS-420 and other state-of-the-art CNN-based models suggested in previous studies on a similar dataset. The ViT model achieved a top accuracy of 83.3% for classifying plants at the genus level. For classifying plants at the species level, ViT models also perform better than the CNN-based models ResNet50 and ResNet-RS-420, with a top accuracy of 92.5%. We show that the correct set of augmentation techniques plays an important role in classification success. In conclusion, these results could help end users, professionals and the general public alike, to identify plants more quickly and with improved accuracy.

Keywords: plant identification, CNN, image processing, vision transformer, classification

Procedia PDF Downloads 91
1715 Text Emotion Recognition by Multi-Head Attention based Bidirectional LSTM Utilizing Multi-Level Classification

Authors: Vishwanath Pethri Kamath, Jayantha Gowda Sarapanahalli, Vishal Mishra, Siddhesh Balwant Bandgar

Abstract:

Recognition of emotional information is essential in any form of communication. The growth of HCI (Human-Computer Interaction) in recent times indicates the importance of understanding the emotions expressed, which becomes crucial for improving the system or the interaction itself. In this research work, textual data is used for emotion recognition. Text, being the least expressive of the multimodal resources, poses various challenges, such as contextual information and the sequential nature of language construction. We propose a neural architecture to resolve no fewer than 8 emotions from textual data sources derived from multiple datasets, using Google pre-trained word2vec word embeddings and a multi-head attention-based bidirectional LSTM model with one-vs-all multi-level classification. The emotions targeted in this research are Anger, Disgust, Fear, Guilt, Joy, Sadness, Shame, and Surprise. Textual data from multiple datasets, such as the ISEAR, Go Emotions, and Affect datasets, were used to create the emotions dataset. Overlaps and conflicts between data samples were handled with careful preprocessing. Our results show a significant improvement with the modeling architecture, with as much as a 10-point improvement in recognizing some emotions.

Keywords: text emotion recognition, bidirectional LSTM, multi-head attention, multi-level classification, Google word2vec word embeddings

Procedia PDF Downloads 171
1714 A Taxonomy of Routing Protocols in Wireless Sensor Networks

Authors: A. Kardi, R. Zagrouba, M. Alqahtani

Abstract:

The Internet of Everything (IoE) is today a very attractive and motivating field of research. It is based on Wireless Sensor Networks (WSNs), in which routing is the major topic of analysis. In fact, routing directly affects the effectiveness and the lifetime of the network. This paper, developed from recent works and based on extensive research, proposes a taxonomy of routing protocols in WSNs. Our main contribution is a classification model based on nine classes, namely application type, delivery mode, initiator of communication, network architecture, path establishment (route discovery), network topology (structure), protocol operation, next hop selection, and latency-awareness and energy-efficient routing protocols. In order to provide a complete classification pattern to serve as a reference for network designers, each class is subdivided into possible subclasses, which are presented and discussed using different parameters such as purposes and characteristics.

Keywords: routing, sensor, survey, wireless sensor networks, WSNs

Procedia PDF Downloads 175
1713 A Heart Arrhythmia Prediction Using Machine Learning’s Classification Approach and the Concept of Data Mining

Authors: Roshani S. Golhar, Neerajkumar S. Sathawane, Snehal Dongre

Abstract:

Background and objectives: Cardiovascular illnesses are increasing and becoming a major cause of mortality worldwide, killing a large number of people each year. Arrhythmia is a type of cardiac illness characterized by a change in the linearity of the heartbeat. The goal of this study is to develop novel deep learning algorithms for successfully interpreting arrhythmia using a single second segment. Because the ECG signal indicates unique electrical heart activity across time, considerable changes between time intervals are detected. Such variances, as well as the limited amount of training data available for each arrhythmia, make standard learning methods difficult, and so impede its exaggeration. Conclusions: The proposed method was able to outperform several state-of-the-art methods. The proposed technique is also an effective and convenient deep learning approach to heartbeat interpretation that could probably be used in real-time healthcare monitoring systems.

Keywords: electrocardiogram, ECG classification, neural networks, convolutional neural networks, portable document format

Procedia PDF Downloads 63
1712 Medical Neural Classifier Based on Improved Genetic Algorithm

Authors: Fadzil Ahmad, Noor Ashidi Mat Isa

Abstract:

This study introduces an improved genetic algorithm procedure that focuses the search around near-optimal solutions corresponding to a group of elite chromosomes. This is achieved through a novel crossover technique known as Segmented Multi Chromosome Crossover. It preserves the highly important information contained in a gene segment of an elite chromosome and allows an offspring to carry information from gene segments of multiple chromosomes. In this way, the algorithm has a better chance of effectively exploring the solution space. The improved GA is applied to the automatic and simultaneous parameter optimization and feature selection of an artificial neural network for pattern recognition in two medical problems, cancer and diabetes. The experimental results show that the average classification accuracy on the cancer and diabetes datasets improved by 0.1% and 0.3%, respectively, using the new algorithm.

Keywords: genetic algorithm, artificial neural network, pattern classification, classification accuracy

Procedia PDF Downloads 467
1711 A Computer-Aided System for Detection and Classification of Liver Cirrhosis

Authors: Abdel Hadi N. Ebraheim, Eman Azomi, Nefisa A. Fahmy

Abstract:

This paper designs and implements a computer-aided system (CAS) to help detect and diagnose liver cirrhosis in patients with Chronic Hepatitis C. Our system reduces the features (tests) required of the patient to a minimal, most informative subset, with a diagnostic accuracy above 99%, hence saving both time and costs. We use a Support Vector Machine (SVM) with cross-validation, a Multilayer Perceptron Neural Network (MLP), and a Generalized Regression Neural Network (GRNN) that employs a base of radial functions for functional approximation, as classifiers. Our system is tested on 199 subjects, 99 of whom have Chronic Hepatitis C. The subjects were selected from the outpatient clinic of the National Hepatology and Tropical Medicine Research Institute (NHTMRI).

Keywords: liver cirrhosis, artificial neural network, support vector machine, multi-layer perceptron, classification, accuracy

Procedia PDF Downloads 454
1710 Applying Unmanned Aerial Vehicle on Agricultural Damage: A Case Study of the Meteorological Disaster on Taiwan Paddy Rice

Authors: Chiling Chen, Chiaoying Chou, Siyang Wu

Abstract:

Taiwan is located in the western Pacific Ocean, at the intersection of continental and marine climates. Typhoons frequently strike Taiwan and bring meteorological disasters, e.g., heavy flooding, landslides, and loss of life and property. Global climate change brings more extreme meteorological disasters. Therefore, techniques to improve disaster prevention and mitigation need to be developed, and improving rescue processes and rehabilitation is important as well. In this study, UAVs (Unmanned Aerial Vehicles) are applied to take instant images to improve disaster investigation and rescue processes. Paddy rice fields in central Taiwan are the study area. They were struck by heavy rain during the monsoon season in June 2016. UAV images provide high ground resolution (3.5 cm) with 3D point clouds to develop image discrimination techniques and a digital surface model (DSM) of rice lodging. Firstly, supervised image classification with the Maximum Likelihood Method (MLD) is used to delineate the area of rice lodging. Secondly, 3D point clouds generated by Pix4D Mapper are used to develop a DSM for classifying the lodging levels of paddy rice. As a result, the discrimination accuracy of rice lodging is 85% by supervised image classification, and the classification accuracy of lodging level is 87% by DSM. Therefore, UAVs not only provide instant images of agricultural damage after a meteorological disaster, but the image discrimination of rice lodging also reaches acceptable accuracy (>85%). In the future, UAV and image discrimination technologies will be applied to different crop fields. The results of image discrimination will be overlaid with the administrative boundaries of paddy rice fields to establish a GIS-based assistance system for agricultural damage discrimination. The time and labor required for damage detection and monitoring would thereby be greatly reduced.

Keywords: Monsoon, supervised classification, Pix4D, 3D point clouds, discriminate accuracy

Procedia PDF Downloads 298
1709 A Gene Selection Algorithm for Microarray Cancer Classification Using an Improved Particle Swarm Optimization

Authors: Arfan Ali Nagra, Tariq Shahzad, Meshal Alharbi, Khalid Masood Khan, Muhammad Mugees Asif, Taher M. Ghazal, Khmaies Ouahada

Abstract:

Gene selection is an essential step for the classification of microarray cancer data. Gene expression cancer data (DNA microarray) facilitates computing the robust and concurrent expression of various genes. Particle swarm optimization (PSO) requires simple operators and fewer parameters for tuning the model in gene selection. The selection of a prognostic gene with small redundancy is a great challenge for the researcher, as there are a few complications in the PSO-based selection method. In this research, a new variant of PSO (self-inertia weight adaptive PSO) is proposed. In the proposed algorithm, SIW-APSO-ELM is explored to achieve gene selection prediction accuracies. This new algorithm balances the exploration capabilities of the improved inertia weight adaptive particle swarm optimization with exploitation. The self-inertia weight adaptive particle swarm optimization (SIW-APSO) is used to search for the solution. The SIW-APSO is updated with an evolutionary process in such a way that each particle iteratively improves its velocity and position. The extreme learning machine (ELM) has been designed for the selection procedure. The proposed method has been used to identify a number of genes in the cancer dataset. The classification stage uses ELM, K-centroid nearest neighbor (KCNN), and support vector machine (SVM) classifiers to attain high prediction accuracy compared to the state-of-the-art methods on microarray cancer datasets, which shows the effectiveness of the proposed method.
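The iterative velocity and position update mentioned in this abstract can be illustrated with the textbook inertia-weight PSO step. This sketch is not the SIW-APSO variant itself: the abstract does not say how the inertia weight is adapted, so `w` is a fixed, assumed value here, and the function name `pso_step` and the coefficients `c1` and `c2` are hypothetical.

```python
import random


def pso_step(position, velocity, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """One standard PSO update for a single particle (all arguments are lists)."""
    new_position, new_velocity = [], []
    for x, v, pb, gb in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        # inertia term + pull toward the particle's own best + pull toward the swarm best
        v_next = w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity


# One step for a 2-dimensional particle with a seeded generator.
rng = random.Random(0)
pos, vel = pso_step([0.0, 1.0], [0.1, -0.1], [0.5, 0.5], [1.0, 0.0], rng=rng)
print(pos, vel)
```

An adaptive scheme would replace the constant `w` with a value recomputed per particle or per iteration; for gene selection the continuous position is typically mapped to a binary mask of selected genes, which is also left out of this sketch.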

Keywords: microarray cancer, improved PSO, ELM, SVM, evolutionary algorithms

Procedia PDF Downloads 77
1708 Detection of Phoneme [S] Mispronunciation for Sigmatism Diagnosis in Adults

Authors: Michal Krecichwost, Zauzanna Miodonska, Pawel Badura

Abstract:

The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular in the oral cavity of the patient. Speech processing can make the therapy more objective and simplify the verification of its progress. In the described study, a methodology for the classification of the incorrectly pronounced phoneme [s] is proposed. The recordings come from adults. They were registered with a speech recorder at a sampling rate of 44.1 kHz and a resolution of 16 bits. A database of pathological and normative speech has been collected for the study, including reference assessments provided by speech therapy experts. Ten adult subjects were asked to simulate a certain type of sigmatism under the supervision of a speech therapy expert. In the recordings, the analyzed phone [s] was surrounded by vowels, viz: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame that is part of the analyzed phoneme. Additionally, 3 fricative formants along with corresponding amplitudes are determined for the entire segment. In order to aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed method of feature aggregation reduces the size of the feature vector used in the classification. A binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. The first group consists of pathological phones, while the other consists of the normative ones. The proposed feature vector yields classification sensitivity and specificity measures above the 90% level in the case of individual logo phones. The employment of fricative formant-based information improves the sole-MFCC classification results by an average of 5 percentage points.
The study shows that the employment of specific parameters for the selected phones improves the efficiency of pathology detection compared with the traditional methods of speech signal parameterization.
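The aggregation step described in this abstract (mean of each MFCC coefficient across frames, 75th percentile for the remaining per-frame features) might look roughly like the sketch below. The function names, the frame layout as lists of numbers, and the linear-interpolation percentile rule are all assumptions made for illustration; the abstract does not specify which percentile definition was used.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def aggregate_segment(mfcc_frames, other_frames):
    """Collapse per-frame features of one phoneme segment into a single vector.

    mfcc_frames:  one list of MFCC coefficients per frame
    other_frames: one list of the remaining per-frame features (e.g. RMS) per frame
    """
    mfcc_means = [mean(column) for column in zip(*mfcc_frames)]
    other_p75 = [percentile(column, 75) for column in zip(*other_frames)]
    return mfcc_means + other_p75


# Toy segment: 2 MFCC coefficients over 2 frames, 1 RMS value over 5 frames.
print(aggregate_segment([[1.0, 2.0], [3.0, 4.0]],
                        [[0.0], [1.0], [2.0], [3.0], [4.0]]))
```

Whatever the number of frames, the output has one value per feature, which is how the aggregation keeps the feature vector passed to the SVM small.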

Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing

Procedia PDF Downloads 276
1707 Transformation of Positron Emission Tomography Raw Data into Images for Classification Using Convolutional Neural Network

Authors: Paweł Konieczka, Lech Raczyński, Wojciech Wiślicki, Oleksandr Fedoruk, Konrad Klimaszewski, Przemysław Kopka, Wojciech Krzemień, Roman Shopa, Jakub Baran, Aurélien Coussat, Neha Chug, Catalina Curceanu, Eryk Czerwiński, Meysam Dadgar, Kamil Dulski, Aleksander Gajos, Beatrix C. Hiesmayr, Krzysztof Kacprzak, Łukasz Kapłon, Grzegorz Korcyl, Tomasz Kozik, Deepak Kumar, Szymon Niedźwiecki, Dominik Panek, Szymon Parzych, Elena Pérez Del Río, Sushil Sharma, Shivani Shivani, Magdalena Skurzok, Ewa Łucja Stępień, Faranak Tayefi, Paweł Moskal

Abstract:

This paper develops the transformation of non-image data into 2-dimensional matrices as a preparation stage for classification based on convolutional neural networks (CNNs). In positron emission tomography (PET) studies, a CNN may be applied directly to the reconstructed distribution of radioactive tracers injected into the patient's body, as a pattern recognition tool. Nonetheless, much PET data still exists in non-image format, which raises the question of whether it can be used for training a CNN. The main focus of this contribution is the problem of processing vectors with a small number of features in comparison to the number of pixels in the output images. The proposed methodology was applied to the classification of PET coincidence events.

Keywords: convolutional neural network, kernel principal component analysis, medical imaging, positron emission tomography

Procedia PDF Downloads 136
1706 Using Probabilistic Neural Network (PNN) for Extracting Acoustic Microwaves (Bulk Acoustic Waves) in Piezoelectric Material

Authors: Hafdaoui Hichem, Mehadjebia Cherifa, Benatia Djamel

Abstract:

In this paper, we propose a new method for detecting Bulk waves in an acoustic microwave signal during the propagation of acoustic microwaves in a piezoelectric substrate (Lithium Niobate, LiNbO3). We used classification by a probabilistic neural network (PNN) as a means of numerical analysis, in which we classify all the values of the real part and the imaginary part of the attenuation coefficient together with the acoustic velocity in order to build a model from which the Bulk waves can be identified easily. These singularities indicate the presence of Bulk waves in piezoelectric materials. In this way, we obtain accurate values of the attenuation coefficient and the acoustic velocity for Bulk waves. This study will be of interest for the modeling and realization of acoustic microwave devices (ultrasound) based on the propagation of acoustic microwaves.

Keywords: piezoelectric material, probabilistic neural network (PNN), classification, acoustic microwaves, bulk waves, the attenuation coefficient

Procedia PDF Downloads 425
1705 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection

Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew

Abstract:

The task of detecting email spam is a very important one in the era of digital technology that needs effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique assists in providing interpretable explanations for specific classifications of emails to help users understand the decision-making process by the model. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve understanding of categorization decisions by users. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.

Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression.

Procedia PDF Downloads 40
1704 Early Stage Suicide Ideation Detection Using Supervised Machine Learning and Neural Network Classifier

Authors: Devendra Kr Tayal, Vrinda Gupta, Aastha Bansal, Khushi Singh, Sristi Sharma, Hunny Gaur

Abstract:

In today's world, suicide is a serious problem. In order to save lives, early detection and prevention of suicide attempts should be addressed. A good number of at-risk people use social media platforms to talk about their issues or find information on related matters. Twitter and Reddit are two of the most common platforms used for expressing oneself. Extensive research has already been done in this field. Using supervised classification techniques such as Naive Bayes, Bernoulli Naive Bayes, and Multilayer Perceptron on a Reddit dataset, we demonstrate the early recognition of suicidal ideation. We also performed a comparative analysis of these approaches using accuracy, recall, F1 score, and precision.

Keywords: machine learning, suicide ideation detection, supervised classification, natural language processing

Procedia PDF Downloads 87
1703 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses issues in the textile industry using data mining techniques. Data mining was applied to data on the stitching of garment products obtained from a textile company. The CHAID algorithm, the CART algorithm, regression analysis, and artificial neural networks were applied to the data. Classification-based analyses were used for data mining, and a decision model of the production per person and the variables affecting production was obtained by this method. The results show that as the daily working time increases, the production per person decreases. In addition, the relationship between total daily working time and production per person is negative and is the strongest relationship observed for production per person.

Keywords: data mining, textile production, decision trees, classification

Procedia PDF Downloads 344
1702 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets in themselves and can now be accepted as a primary intellectual output of research. The quality and usage of datasets depend mainly on the context in which they have been collected, processed, analyzed, validated, and interpreted. This paper presents a collection of program educational objectives mapped to student outcomes, collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time-consuming process. In addition, it requires experts in the area, who are mostly not available. The operational settings under which the collection was produced are described. The collection has been cleansed and preprocessed, some features have been selected, and a preliminary exploratory data analysis has been performed to illustrate the properties and usefulness of the collection. Finally, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets achieved encouraging performance compared to the other multi-label classification methods tested. The Classifier Chains method showed the worst performance. To recap, the benchmark has achieved promising results by utilizing the preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.
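Three of the measures named in this abstract (Hamming Loss, Micro-F, Macro-F) have standard multi-label definitions, which the following sketch computes on binary label matrices. It assumes the usual definitions with F taken as F1; the abstract does not state the exact variants used, and the function names are illustrative.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for row_t, row_p in zip(y_true, y_pred)
                for t, p in zip(row_t, row_p))
    return wrong / total


def _f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro-F1, macro-F1) over the label columns."""
    counts = []
    for j in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    # macro: average the per-label F1 scores; micro: pool the counts first
    macro = sum(_f1(*c) for c in counts) / len(counts)
    micro = _f1(*(sum(col) for col in zip(*counts)))
    return micro, macro


# Two samples, three labels; only the third label is predicted wrongly.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
print(hamming_loss(y_true, y_pred))
print(micro_macro_f1(y_true, y_pred))
```

Micro-averaging weights every label decision equally, while macro-averaging weights every label equally, so the two diverge when label frequencies are unbalanced.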

Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining

Procedia PDF Downloads 164
1701 Early Diagnosis of Myocardial Ischemia Based on Support Vector Machine and Gaussian Mixture Model by Using Features of ECG Recordings

Authors: Merve Begum Terzi, Orhan Arikan, Adnan Abaci, Mustafa Candemir

Abstract:

Acute myocardial infarction is a major cause of death in the world. Therefore, its fast and reliable diagnosis is a major clinical need. ECG is the most important diagnostic methodology used to make decisions about the management of cardiovascular diseases. In patients with acute myocardial ischemia, temporary chest pains together with changes in the ST segment and T wave of the ECG occur shortly before the start of myocardial infarction. In this study, a technique which detects changes in the ST/T sections of the ECG is developed for the early diagnosis of acute myocardial ischemia. For this purpose, a database of real ECG recordings is constructed that contains a set of records from 75 patients presenting symptoms of chest pain who underwent elective percutaneous coronary intervention (PCI). 12-lead ECGs of the patients were recorded before and during the PCI procedure. Two ECG epochs are analyzed for each patient: the pre-inflation ECG, which is acquired before any catheter insertion, and the occlusion ECG, which is acquired during balloon inflation. By using pre-inflation and occlusion recordings, ECG features that are critical in the detection of acute myocardial ischemia are identified and the most discriminative features for the detection of acute myocardial ischemia are extracted. A classification technique is developed based on a support vector machine (SVM) approach operating with linear and radial basis function (RBF) kernels to detect ischemic events by using ST-T derived joint features from non-ischemic and ischemic states of the patients. The dataset is randomly divided into training and testing sets, and the training set is used to optimize SVM hyperparameters by using the grid-search method and 10-fold cross-validation. SVMs are designed specifically for each patient by tuning the kernel parameters in order to obtain the optimal classification performance.
As a result of applying the developed classification technique to real ECG recordings, it is shown that the proposed technique provides highly reliable detection of anomalies in ECG signals. Furthermore, to develop a detection technique that can be used in the absence of an ECG recording obtained during the healthy stage, the detection of acute myocardial ischemia based on ECG recordings of the patients obtained during ischemia is also investigated. For this purpose, a Gaussian mixture model (GMM) is used to represent the joint pdf of the most discriminating ECG features of myocardial ischemia. Then, a Neyman-Pearson type of approach is developed to detect outliers that would correspond to acute myocardial ischemia. The Neyman-Pearson decision strategy is applied by computing the average log-likelihood values of ECG segments and comparing them with a range of different threshold values. For different discrimination threshold values and numbers of ECG segments, the probability of detection and the probability of false alarm are computed, and the corresponding ROC curves are obtained. The results indicate that increasing the number of ECG segments provides higher performance for GMM-based classification. Moreover, the comparison between the performances of SVM- and GMM-based classification showed that SVM provides higher classification performance over the ECG recordings of a considerable number of patients.
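
The thresholding step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional feature, a mixture whose parameters are already fitted (the values below are hypothetical), and the convention that the mixture models a reference class so that a low average log-likelihood flags an outlier. The function names and the toy recordings are illustrative only.

```python
import math

def gmm_log_pdf(x, weights, means, stds):
    """Log-density of a 1-D Gaussian mixture at x (log-sum-exp for stability)."""
    terms = [
        math.log(w) - math.log(s * math.sqrt(2.0 * math.pi)) - 0.5 * ((x - m) / s) ** 2
        for w, m, s in zip(weights, means, stds)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def average_log_likelihood(segments, weights, means, stds):
    """Mean log-likelihood over the feature values of one recording's segments."""
    return sum(gmm_log_pdf(x, weights, means, stds) for x in segments) / len(segments)

def roc_points(reference_scores, anomalous_scores, thresholds):
    """Flag a recording when its score falls below the threshold.

    Returns one (threshold, P_false_alarm, P_detection) tuple per threshold.
    """
    points = []
    for t in thresholds:
        p_fa = sum(s < t for s in reference_scores) / len(reference_scores)
        p_det = sum(s < t for s in anomalous_scores) / len(anomalous_scores)
        points.append((t, p_fa, p_det))
    return points

# Hypothetical mixture parameters and made-up feature values, for illustration only.
WEIGHTS, MEANS, STDS = [0.6, 0.4], [0.0, 1.0], [0.5, 0.5]
reference = [[0.1, -0.2, 0.9], [0.0, 1.1, 0.4]]
anomalous = [[3.0, 3.5, 2.8], [4.0, 3.2, 3.9]]

ref_scores = [average_log_likelihood(r, WEIGHTS, MEANS, STDS) for r in reference]
anom_scores = [average_log_likelihood(a, WEIGHTS, MEANS, STDS) for a in anomalous]
roc = roc_points(ref_scores, anom_scores, [-50.0, -5.0, 0.0])
```

Sweeping the threshold from very low to very high moves the operating point from (0, 0) to (1, 1) on the ROC plane; averaging over more segments per recording reduces the variance of the score, which is one plausible reading of why more segments helped in the study.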

Keywords: ECG classification, Gaussian mixture model, Neyman–Pearson approach, support vector machine

Procedia PDF Downloads 156
1700 Modular Robotics and Terrain Detection Using Inertial Measurement Unit Sensor

Authors: Shubhakar Gupta, Dhruv Prakash, Apoorv Mehta

Abstract:

In this project, we design a modular robot capable of using and switching between multiple methods of propulsion and of classifying terrain based on Inertial Measurement Unit (IMU) input. We wanted to make a robot that is not only intelligent in its functioning but also versatile in its physical design. The advantage of a modular robot is that it can be designed to hold several movement apparatuses, such as wheels, legs for a hexapod or a quadpod setup, propellers for underwater locomotion, and any other solution that may be needed. The robot takes roughness input from a gyroscope and an accelerometer in the IMU, and based on the terrain classification from an artificial neural network, it decides which method of propulsion would best optimize its movement. This gives the bot adaptability over a set of terrains, which means it can optimize its locomotion on a terrain based on its roughness. A feature like this would be a great asset in autonomous exploration or research drones.

Keywords: modular robotics, terrain detection, terrain classification, neural network

Procedia PDF Downloads 140
1699 ICanny: CNN Modulation Recognition Algorithm

Authors: Jingpeng Gao, Xinrui Mao, Zhibin Deng

Abstract:

To address the low recognition rate of composite signal modulation at low signal-to-noise ratio (SNR), this paper proposes a modulation recognition algorithm based on ICanny-CNN. Firstly, the radar signal is transformed into a time-frequency image by the Choi-Williams Distribution (CWD). Secondly, we propose an image processing algorithm using the Guided Filter and a threshold selection method, combined with hole filling and a mask operation. Finally, a shallow convolutional neural network (CNN) is combined with the idea of depth-wise convolution (Dw Conv) and point-wise convolution (Pw Conv). The proposed CNN is designed to complete image classification and realize modulation recognition of the radar signal. The simulation results show that the proposed algorithm can reach a recognition rate of 90.83% at 0 dB and 71.52% at -8 dB. Therefore, the proposed algorithm has good classification and anti-noise performance in radar signal modulation recognition and other fields.
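
One reason to pair a depth-wise convolution with a point-wise convolution in a shallow network is the reduction in weights. The following sketch counts weights for a single layer under stated assumptions: square kernels, no bias terms, and layer sizes (3x3 kernel, 32 input channels, 64 output channels) chosen purely for illustration, since the abstract does not give the network's dimensions.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one kernel x kernel filter per (input, output) channel pair."""
    return kernel * kernel * c_in * c_out

def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depth-wise convolution followed by a 1x1 point-wise convolution."""
    depthwise = kernel * kernel * c_in   # one spatial filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution that mixes channels
    return depthwise + pointwise

# Illustrative layer sizes (assumed, not taken from the paper).
standard = standard_conv_params(3, 32, 64)
separable = separable_conv_params(3, 32, 64)
ratio = separable / standard
```

For these sizes the ratio equals 1/c_out + 1/kernel^2, so the separable layer uses roughly an eighth of the weights of the standard one.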

Keywords: modulation recognition, image processing, composite signal, improved Canny algorithm

Procedia PDF Downloads 184
1698 Efficient Manageability and Intelligent Classification of Web Browsing History Using Machine Learning

Authors: Suraj Gururaj, Sumantha Udupa U.

Abstract:

Browsing the Web has emerged as the de facto activity performed on the Internet. Although browsing gets tracked, the manageability of Web browsing history is very poor. In this paper, we present a workable solution, implemented using machine learning and natural language processing techniques, for efficient management of a user’s browsing history. The significance of adding such a capability to a Web browser is that it ensures efficient and quick information retrieval from browsing history, which currently is very challenging. Our solution ensures that important websites visited in the past remain easily accessible because of the intelligent and automatic classification. In a nutshell, our paper provides an implementation as a browser extension that automatically classifies the browsing history into the most relevant category without any user intervention. This ensures that no information is lost and increases productivity by saving the time spent revisiting important websites.

Keywords: ad hoc retrieval, Chrome extension, supervised learning, tile, Web personalization

Procedia PDF Downloads 367
1697 Review of Cyber Security in Oil and Gas Industry with Cloud Computing Perspective: Taxonomy, Issues and Future Direction

Authors: Irfan Mohiuddin, Ahmad Al Mogren

Abstract:

In recent years, cloud computing has earned substantial attention in the oil and gas industry and provides services in all the phases of the industry lifecycle. Oil and gas supply infrastructure, in particular, is more vulnerable to accidental, natural and intentional threats because of its widespread distribution. Numerous surveys have been conducted on cloud security and privacy. However, to the best of our knowledge, hardly any survey has been carried out that reviews cyber security in all phases from a cloud computing perspective. Moreover, a distinctive classification is performed for all the cloud-based cyber security measures based on the cloud component in use. The classification approach will enable researchers to identify the technique required to enhance security in specific cloud components. Also, the limitations of each component will allow researchers to design optimal algorithms. Lastly, future directions are given to point out the imminent challenges that can pave the way for researchers to further enhance resilience to cyber security threats in the oil and gas industry.

Keywords: cyber security, cloud computing, safety and security, oil and gas industry, security threats, oil and gas pipelines

Procedia PDF Downloads 136
1696 Time "And" Dimension(s) - Visualizing the 4th and 4+ Dimensions

Authors: Siddharth Rana

Abstract:

As we know so far, there are 3 dimensions that we are capable of interpreting and perceiving, and there is a 4th dimension, called time, about which we don’t know much yet. We, as humans, live in the 4th dimension, not the 3rd. We travel 3 dimensionally but cannot yet travel 4 dimensionally; perhaps if we could, then visiting the past and the future would be like climbing a mountain or going down a road. So far, we humans are not even capable of imagining any higher dimensions than the three dimensions in which we can travel. We are the beings of the 4th dimension; we are the beings of time; that is why we can travel 3 dimensionally; however, if, say, there were beings of the 5th dimension, then they would easily be able to travel 4 dimensionally, i.e., they could travel in the 4th dimension as well. Beings of the 5th dimension can easily time travel. However, beings of the 4th dimension, like us, cannot time travel because we live in a 4-D world, traveling 3 dimensionally. That means to ever do time travel, we just need to go to a higher dimension and not only perceive it but also be able to travel in it. However, traveling to the past is not very possible, unlike traveling to the future. Even if traveling to the past were possible, it would be very unlikely that an event in the past would be changed. In this paper, some approaches are provided to define time, our movement in time to the future, some aspects of time travel using dimensions, and how we can perceive a higher dimension.

Keywords: time, dimensions, String theory, relativity

Procedia PDF Downloads 97
1695 Profiling of the Cell-Cycle Related Genes in Response to Efavirenz, a Non-Nucleoside Reverse Transcriptase Inhibitor in Human Lung Cancer

Authors: Rahaba Marima, Clement Penny

Abstract:

Health-related quality of life (HRQoL) for HIV-positive patients has improved since the introduction of highly active antiretroviral treatment (HAART). However, in the present HAART era, HIV co-morbidities such as lung cancer, a non-AIDS (NAIDS) defining cancer, have been documented to be on the rise. Under normal physiological conditions, cells grow, repair and proliferate through the cell-cycle, as cellular homeostasis is important in the maintenance and proper regulation of tissues and organs. In contrast, the deregulation of the cell-cycle is a hallmark of cancer, including lung cancer. The association between lung cancer and the use of HAART components such as Efavirenz (EFV) is poorly understood. This study aimed at elucidating the effects of EFV on the expression of cell-cycle genes in lung cancer. For this purpose, a human cell-cycle gene array composed of 84 genes was evaluated on both normal lung fibroblast (MRC-5) cells and lung adenocarcinoma (A549) cells, in response to 13 µM EFV or 0.01% vehicle. A ±2 up or down fold change was used as the basis of target selection, with p < 0.05. Additionally, RT-qPCR was done to validate the gene array results. Next, in silico bioinformatics tools, namely the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING), Reactome, the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Ingenuity Pathway Analysis (IPA), were used for gene/gene interaction studies as well as to map the molecular and biological pathways influenced by the identified targets. Interestingly, the DNA damage response (DDR) pathway genes such as p53, Ataxia telangiectasia mutated and Rad3 related (ATR), Growth arrest and DNA damage inducible alpha (GADD45A), HUS1 checkpoint homolog (HUS1) and Role of radiation (RAD) genes were shown to be upregulated following EFV treatment, as revealed by STRING analysis.
Additionally, functional enrichment analysis by the KEGG pathway revealed that most of the differentially expressed gene targets, such as p21, Aurora kinase B (AURKB) and Mitotic Arrest Deficient-Like 2 (MAD2L2), function at the cell-cycle checkpoints. Core analysis by IPA revealed that p53 downstream targets such as survivin, Bcl2, and cyclin/cyclin-dependent kinase (CDK) complexes are down-regulated following exposure to EFV. Furthermore, Reactome analysis showed a significant increase in cellular response to stress genes, DNA repair genes, and apoptosis genes, as observed in both normal and cancerous cells. These findings implicate the genotoxic effects of EFV on lung cells, provoking the DDR pathway. Notably, the constitutive expression of this pathway (DDR) often leads to uncontrolled cell proliferation and eventually tumourigenesis, which could account for the effect of HAART components (such as EFV) on human cancers. Targeting the cell-cycle and its regulation holds promise as a therapeutic intervention for potential HAART-associated carcinogenesis, particularly lung cancer.
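
The target-selection rule stated in this abstract (a fold change of at least ±2 with p < 0.05) can be sketched as a simple filter. The sketch assumes the signed fold-change convention in which a negative value denotes down-regulation by that factor; the gene labels and numbers are placeholders, not results from the study, and `select_targets` is a hypothetical helper name.

```python
def select_targets(results, fold_cutoff=2.0, p_cutoff=0.05):
    """Keep genes whose signed fold change is at least +/- fold_cutoff with p below p_cutoff.

    `results` maps a gene label to (signed_fold_change, p_value).
    Returns a dict mapping each selected gene to "up" or "down".
    """
    selected = {}
    for gene, (fold_change, p_value) in results.items():
        if abs(fold_change) >= fold_cutoff and p_value < p_cutoff:
            selected[gene] = "up" if fold_change > 0 else "down"
    return selected

# Placeholder array results (made-up values for illustration only).
array_results = {
    "GENE_A": (3.1, 0.01),    # passes both criteria, up-regulated
    "GENE_B": (-2.4, 0.03),   # passes both criteria, down-regulated
    "GENE_C": (1.5, 0.001),   # significant, but fold change too small
    "GENE_D": (-4.0, 0.20),   # large fold change, but not significant
}
targets = select_targets(array_results)
```

Requiring both criteria guards against two different failure modes: small but statistically significant shifts, and large shifts that are not reproducible.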

Keywords: cell-cycle, DNA damage response, Efavirenz, lung cancer

Procedia PDF Downloads 150
1694 Analysis on Prediction Models of TBM Performance and Selection of Optimal Input Parameters

Authors: Hang Lo Lee, Ki Il Song, Hee Hwan Ryu

Abstract:

Accurate prediction of TBM (Tunnel Boring Machine) performance, needed for reliable estimation of the construction period and cost in the preconstruction stage, is very difficult. The aim of this study is therefore to analyze the evaluation process of various prediction models for TBM performance published since 2000, and to select the optimal input parameters for the prediction model. A classification system for TBM performance prediction models and the applied methodology are proposed in this research. The input and output parameters applied in the prediction models are also presented. Based on these results, a statistical analysis is performed using data collected from a shield TBM tunnel in South Korea. By performing a simple regression and residual analysis using the statistical program R, the optimal input parameters are selected. These results are expected to be used for the development of a prediction model of TBM performance.
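
A simple regression with residual analysis, the screening step named in this abstract, can be sketched as follows. The study used R; this sketch uses plain Python instead, and the data points are invented for illustration (they are not TBM measurements). The idea is to fit one candidate input parameter at a time and inspect the coefficient of determination and the residuals.

```python
def simple_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, residuals, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, residuals, r_squared

# Invented example: one candidate input parameter against a performance measure.
candidate_input = [1.0, 2.0, 3.0, 4.0, 5.0]
performance = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, residuals, r_squared = simple_regression(candidate_input, performance)
```

Residuals from a least-squares fit with an intercept always sum to zero, so the useful diagnostics are their spread and any visible pattern against the input, which would indicate that a straight line is a poor description of that parameter's effect.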

Keywords: TBM performance prediction model, classification system, simple regression analysis, residual analysis, optimal input parameters

Procedia PDF Downloads 303
1693 lncRNA Gene Expression Profiling Analysis by TCGA RNA-Seq Data of Breast Cancer

Authors: Xiaoping Su, Gabriel G. Malouf

Abstract:

Introduction: Breast cancer is a heterogeneous disease that can be classified into 4 subgroups using transcriptional profiling. The role of lncRNA expression in human breast cancer biology, prognosis, and molecular classification remains unknown. Methods and results: Using an integrative comprehensive analysis of lncRNA, mRNA and DNA methylation in 900 breast cancer patients from The Cancer Genome Atlas (TCGA) project, we unraveled the molecular portraits of 1,700 expressed lncRNAs. Some of those lncRNAs (e.g., HOTAIR) were previously reported and others are novel (e.g., HOTAIRM1, MAPT-AS1). The lncRNA classification correlated well with the PAM50 classification for the basal-like, Her-2 enriched and luminal B subgroups, in contrast to the luminal A subgroup, which behaved differently. Importantly, estrogen receptor (ESR1) expression was associated with distinct lncRNA networks in lncRNA clusters III and IV. Gene set enrichment analysis for cis- and trans-acting lncRNAs showed enrichment for breast cancer signatures driven by breast cancer master regulators. Almost two thirds of those lncRNAs were marked by enhancer chromatin modifications (i.e., H3K27ac), suggesting that lncRNA expression may result in increased activity of neighboring genes. Differential analysis of gene expression profiling data showed that lncRNA HOTAIRM1 was significantly down-regulated in the basal-like subtype, and DNA methylation profiling data showed that lncRNA HOTAIRM1 was highly methylated in the basal-like subtype. Thus, our integrative analysis of gene expression and DNA methylation strongly suggested that lncRNA HOTAIRM1 is likely to be a tumor suppressor in the basal-like subtype. Conclusion and significance: Our study depicts the first lncRNA molecular portrait of breast cancer and shows that lncRNA HOTAIRM1 might be a novel tumor suppressor.

Keywords: lncRNA profiling, breast cancer, HOTAIRM1, tumor suppressor

Procedia PDF Downloads 100
1692 National Assessment for Schools in Saudi Arabia: Score Reliability and Plausible Values

Authors: Dimiter M. Dimitrov, Abdullah Sadaawi

Abstract:

The National Assessment for Schools (NAFS) in Saudi Arabia consists of standardized tests in Mathematics, Reading, and Science for school grade levels 3, 6, and 9. One main goal is to classify students into four categories of NAFS performance (minimal, basic, proficient, and advanced) for each school and for the entire national sample. The NAFS scoring and equating is performed on a bounded scale (D-scale: ranging from 0 to 1) in the framework of the recently developed “D-scoring method of measurement.” The specificity of the NAFS measurement framework and data complexity presented both challenges and opportunities for (a) the estimation of score reliability for schools, (b) setting cut-scores for the classification of students into categories of performance, and (c) generating plausible values for distributions of student performance on the D-scale. The estimation of score reliability at the school level was performed in the framework of generalizability theory (GT), with students “nested” within schools and test items “nested” within test forms. The GT design was executed via a multilevel modeling syntax code in R. Cut-scores (on the D-scale) for the classification of students into performance categories were derived via a recently developed method of standard setting, referred to as the “Response Vector for Mastery” (RVM) method. For each school, the classification of students into categories of NAFS performance was based on distributions of plausible values for the students’ scores on NAFS tests by grade level (3, 6, and 9) and subject (Mathematics, Reading, and Science). Plausible values (on the D-scale) for each individual student were generated via random selection from a statistical logit-normal distribution with parameters derived from the student’s D-score and its conditional standard error, SE(D).
All procedures related to D-scoring, equating, generating plausible values, and classification of students into performance levels were executed via a computer program in R developed for the purpose of NAFS data analysis.
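
Drawing plausible values from a logit-normal distribution can be sketched as below. The abstract does not say how the distribution's parameters are derived from D and SE(D); this sketch assumes a delta-method mapping (location logit(D), scale SE(D) / (D(1 - D))), which is an illustrative choice and not necessarily the authors' procedure. The study used R; the cut-scores, function names, and example values here are hypothetical.

```python
import math
import random

def logit(p):
    """Map a value in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Map a real number back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def plausible_values(d_score, se_d, n_draws=5, seed=None):
    """Draw plausible values on the D-scale from a logit-normal distribution.

    Assumption (not from the source): the location is logit(D) and the scale
    is SE(D) / (D * (1 - D)), i.e. a delta-method approximation.
    """
    rng = random.Random(seed)
    mu = logit(d_score)
    sigma = se_d / (d_score * (1.0 - d_score))
    return [inv_logit(rng.gauss(mu, sigma)) for _ in range(n_draws)]

def classify(d_value, cut_scores, labels):
    """Return the performance label for a D value given ascending cut-scores."""
    return labels[sum(d_value >= c for c in cut_scores)]

LABELS = ["minimal", "basic", "proficient", "advanced"]  # categories named in the abstract
CUTS = [0.3, 0.5, 0.7]                                   # hypothetical cut-scores

draws = plausible_values(0.62, 0.05, n_draws=5, seed=1)
categories = [classify(v, CUTS, LABELS) for v in draws]
```

Because the draws are transformed through the inverse logit, every plausible value stays strictly inside the bounded D-scale, which is the practical reason for choosing a logit-normal rather than a normal distribution on a 0-to-1 scale.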

Keywords: large-scale assessment, reliability, generalizability theory, plausible values

Procedia PDF Downloads 8