Search results for: classification accuracies
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2204

Search results for: classification accuracies

1514 Transportation Mode Classification Using GPS Coordinates and Recurrent Neural Networks

Authors: Taylor Kolody, Farkhund Iqbal, Rabia Batool, Benjamin Fung, Mohammed Hussaeni, Saiqa Aleem

Abstract:

The rising threat of climate change has led to an increase in public awareness and care about our collective and individual environmental impact. A key component of this impact is our use of cars and other polluting forms of transportation, but it is often difficult for an individual to know how severe this impact is. While there are applications that offer this feedback, they require manual entry of what transportation mode was used for a given trip, which can be burdensome. In order to alleviate this shortcoming, a data from the 2016 TRIPlab datasets has been used to train a variety of machine learning models to automatically recognize the mode of transportation. The accuracy of 89.6% is achieved using single deep neural network model with Gated Recurrent Unit (GRU) architecture applied directly to trip data points over 4 primary classes, namely walking, public transit, car, and bike. These results are comparable in accuracy to results achieved by others using ensemble methods and require far less computation when classifying new trips. The lack of trip context data, e.g., bus routes, bike paths, etc., and the need for only a single set of weights make this an appropriate methodology for applications hoping to reach a broad demographic and have responsive feedback.

Keywords: classification, gated recurrent unit, recurrent neural network, transportation

Procedia PDF Downloads 137
1513 Geoinformation Technology of Agricultural Monitoring Using Multi-Temporal Satellite Imagery

Authors: Olena Kavats, Dmitry Khramov, Kateryna Sergieieva, Vladimir Vasyliev, Iurii Kavats

Abstract:

Geoinformation technologies of space agromonitoring are a means of operative decision making support in the tasks of managing the agricultural sector of the economy. Existing technologies use satellite images in the optical range of electromagnetic spectrum. Time series of optical images often contain gaps due to the presence of clouds and haze. A geoinformation technology is created. It allows to fill gaps in time series of optical images (Sentinel-2, Landsat-8, PROBA-V, MODIS) with radar survey data (Sentinel-1) and use information about agrometeorological conditions of the growing season for individual monitoring years. The technology allows to perform crop classification and mapping for spring-summer (winter and spring crops) and autumn-winter (winter crops) periods of vegetation, monitoring the dynamics of crop state seasonal changes, crop yield forecasting. Crop classification is based on supervised classification algorithms, takes into account the peculiarities of crop growth at different vegetation stages (dates of sowing, emergence, active vegetation, and harvesting) and agriculture land state characteristics (row spacing, seedling density, etc.). A catalog of samples of the main agricultural crops (Ukraine) is created and crop spectral signatures are calculated with the preliminary removal of row spacing, cloud cover, and cloud shadows in order to construct time series of crop growth characteristics. The obtained data is used in grain crop growth tracking and in timely detection of growth trends deviations from reference samples of a given crop for a selected date. Statistical models of crop yield forecast are created in the forms of linear and nonlinear interconnections between crop yield indicators and crop state characteristics (temperature, precipitation, vegetation indices, etc.). Predicted values of grain crop yield are evaluated with an accuracy up to 95%. The developed technology was used for agricultural areas monitoring in a number of Great Britain and Ukraine regions using EOS Crop Monitoring Platform (https://crop-monitoring.eos.com). The obtained results allow to conclude that joint use of Sentinel-1 and Sentinel-2 images improve separation of winter crops (rapeseed, wheat, barley) in the early stages of vegetation (October-December). It allows to separate successfully the soybean, corn, and sunflower sowing areas that are quite similar in their spectral characteristics.

Keywords: geoinformation technology, crop classification, crop yield prediction, agricultural monitoring, EOS Crop Monitoring Platform

Procedia PDF Downloads 456
1512 The Optimum Mel-Frequency Cepstral Coefficients (MFCCs) Contribution to Iranian Traditional Music Genre Classification by Instrumental Features

Authors: M. Abbasi Layegh, S. Haghipour, K. Athari, R. Khosravi, M. Tafkikialamdari

Abstract:

An approach to find the optimum mel-frequency cepstral coefficients (MFCCs) for the Radif of Mirzâ Ábdollâh, which is the principal emblem and the heart of Persian music, performed by most famous Iranian masters on two Iranian stringed instruments ‘Tar’ and ‘Setar’ is proposed. While investigating the variance of MFCC for each record in themusic database of 1500 gushe of the repertoire belonging to 12 modal systems (dastgâh and âvâz), we have applied the Fuzzy C-Mean clustering algorithm on each of the 12 coefficient and different combinations of those coefficients. We have applied the same experiment while increasing the number of coefficients but the clustering accuracy remained the same. Therefore, we can conclude that the first 7 MFCCs (V-7MFCC) are enough for classification of The Radif of Mirzâ Ábdollâh. Classical machine learning algorithms such as MLP neural networks, K-Nearest Neighbors (KNN), Gaussian Mixture Model (GMM), Hidden Markov Model (HMM) and Support Vector Machine (SVM) have been employed. Finally, it can be realized that SVM shows a better performance in this study.

Keywords: radif of Mirzâ Ábdollâh, Gushe, mel frequency cepstral coefficients, fuzzy c-mean clustering algorithm, k-nearest neighbors (KNN), gaussian mixture model (GMM), hidden markov model (HMM), support vector machine (SVM)

Procedia PDF Downloads 446
1511 Performance Analysis of Traffic Classification with Machine Learning

Authors: Htay Htay Yi, Zin May Aye

Abstract:

Network security is role of the ICT environment because malicious users are continually growing that realm of education, business, and then related with ICT. The network security contravention is typically described and examined centrally based on a security event management system. The firewalls, Intrusion Detection System (IDS), and Intrusion Prevention System are becoming essential to monitor or prevent of potential violations, incidents attack, and imminent threats. In this system, the firewall rules are set only for where the system policies are needed. Dataset deployed in this system are derived from the testbed environment. The traffic as in DoS and PortScan traffics are applied in the testbed with firewall and IDS implementation. The network traffics are classified as normal or attacks in the existing testbed environment based on six machine learning classification methods applied in the system. It is required to be tested to get datasets and applied for DoS and PortScan. The dataset is based on CICIDS2017 and some features have been added. This system tested 26 features from the applied dataset. The system is to reduce false positive rates and to improve accuracy in the implemented testbed design. The system also proves good performance by selecting important features and comparing existing a dataset by machine learning classifiers.

Keywords: false negative rate, intrusion detection system, machine learning methods, performance

Procedia PDF Downloads 118
1510 Reliability of Using Standard Penetration Test (SPT) in Evaluation of Soil Properties

Authors: Hossein Alimohammadi, Mohsen Amirmojahedi, Mehrdad Rowhani

Abstract:

Soil properties are used by geotechnical engineers to evaluate and analyze site conditions for designing purposes. Although basic soil classification tests are easy to perform and provide useful information to determine the properties of soils, it may take time to get the result and add some costs to the projects. Standard Penetration Test (SPT) provides an opportunity to evaluate soil parameters without performing laboratory tests. In addition to its simplicity and cheapness, the results become available immediately. This research provides a guideline on the application of the SPT test method, reliability of adapting the SPT test results in evaluating soil physical and mechanical properties such as Atterberg limits, shear strength, and compressive strength compressibility parameters. A total of 70 boreholes were investigated in this study by taking soil samples between depths of 1.2 to 15.25 meters. The project site was located in Morrow County, Ohio. A regression-based formula was proposed based on Tobit regression with a stepwise variable selection analysis conducted between SPT and other typical soil properties obtained from soil tests. The results of the research illustrated that the shear strength and physical properties of the soil affect the SPT number. The proposed correlation can help engineers to use SPT test results in their design with higher accuracy.

Keywords: standard penetration test, soil properties, soil classification, regression method

Procedia PDF Downloads 188
1509 Detecting Venomous Files in IDS Using an Approach Based on Data Mining Algorithm

Authors: Sukhleen Kaur

Abstract:

In security groundwork, Intrusion Detection System (IDS) has become an important component. The IDS has received increasing attention in recent years. IDS is one of the effective way to detect different kinds of attacks and malicious codes in a network and help us to secure the network. Data mining techniques can be implemented to IDS, which analyses the large amount of data and gives better results. Data mining can contribute to improving intrusion detection by adding a level of focus to anomaly detection. So far the study has been carried out on finding the attacks but this paper detects the malicious files. Some intruders do not attack directly, but they hide some harmful code inside the files or may corrupt those file and attack the system. These files are detected according to some defined parameters which will form two lists of files as normal files and harmful files. After that data mining will be performed. In this paper a hybrid classifier has been used via Naive Bayes and Ripper classification methods. The results show how the uploaded file in the database will be tested against the parameters and then it is characterised as either normal or harmful file and after that the mining is performed. Moreover, when a user tries to mine on harmful file it will generate an exception that mining cannot be made on corrupted or harmful files.

Keywords: data mining, association, classification, clustering, decision tree, intrusion detection system, misuse detection, anomaly detection, naive Bayes, ripper

Procedia PDF Downloads 414
1508 Classification of Foliar Nitrogen in Common Bean (Phaseolus Vulgaris L.) Using Deep Learning Models and Images

Authors: Marcos Silva Tavares, Jamile Raquel Regazzo, Edson José de Souza Sardinha, Murilo Mesquita Baesso

Abstract:

Common beans are a widely cultivated and consumed legume globally, serving as a staple food for humans, especially in developing countries, due to their nutritional characteristics. Nitrogen (N) is the most limiting nutrient for productivity, and foliar analysis is crucial to ensure balanced nitrogen fertilization. Excessive N applications can cause, either isolated or cumulatively, soil and water contamination, plant toxicity, and increase their susceptibility to diseases and pests. However, the quantification of N using conventional methods is time-consuming and costly, demanding new technologies to optimize the adequate supply of N to plants. Thus, it becomes necessary to establish constant monitoring of the foliar content of this macronutrient in plants, mainly at the V4 stage, aiming at precision management of nitrogen fertilization. In this work, the objective was to evaluate the performance of a deep learning model, Resnet-50, in the classification of foliar nitrogen in common beans using RGB images. The BRS Estilo cultivar was sown in a greenhouse in a completely randomized design with four nitrogen doses (T1 = 0 kg N ha-1, T2 = 25 kg N ha-1, T3 = 75 kg N ha-1, and T4 = 100 kg N ha-1) and 12 replications. Pots with 5L capacity were used with a substrate composed of 43% soil (Neossolo Quartzarênico), 28.5% crushed sugarcane bagasse, and 28.5% cured bovine manure. The water supply of the plants was done with 5mm of water per day. The application of urea (45% N) and the acquisition of images occurred 14 and 32 days after sowing, respectively. A code developed in Matlab© R2022b was used to cut the original images into smaller blocks, originating an image bank composed of 4 folders representing the four classes and labeled as T1, T2, T3, and T4, each containing 500 images of 224x224 pixels obtained from plants cultivated under different N doses. The Matlab© R2022b software was used for the implementation and performance analysis of the model. The evaluation of the efficiency was done by a set of metrics, including accuracy (AC), F1-score (F1), specificity (SP), area under the curve (AUC), and precision (P). The ResNet-50 showed high performance in the classification of foliar N levels in common beans, with AC values of 85.6%. The F1 for classes T1, T2, T3, and T4 was 76, 72, 74, and 77%, respectively. This study revealed that the use of RGB images combined with deep learning can be a promising alternative to slow laboratory analyses, capable of optimizing the estimation of foliar N. This can allow rapid intervention by the producer to achieve higher productivity and less fertilizer waste. Future approaches are encouraged to develop mobile devices capable of handling images using deep learning for the classification of the nutritional status of plants in situ.

Keywords: convolutional neural network, residual network 50, nutritional status, artificial intelligence

Procedia PDF Downloads 19
1507 Conformance to Spatial Planning between the Kampala Physical Development Plan of 2012 and the Existing Land Use in 2021

Authors: Brendah Nagula, Omolo Fredrick Okalebo, Ronald Ssengendo, Ivan Bamweyana

Abstract:

The Kampala Physical Development Plan (KPDP) was developed in 2012 and projected both long term and short term developments within the City .The purpose of the plan was to not only shape the city into a spatially planned area but also to control the urban sprawl trends that had expanded with pronounced instances of informal settlements. This plan was approved by the National Physical Planning Board and a signature was appended by the Minister in 2013. Much as the KPDP plan has been implemented using different approaches such as detailed planning, development control, subdivision planning, carrying out construction inspections, greening and beautification, there is still limited knowledge on the level of conformance towards this plan. Therefore, it is yet to be determined whether it has been effective in shaping the City into an ideal spatially planned area. Attaining a clear picture of the level of conformance towards the KPDP 2012 through evaluation between the planned and the existing land use in Kampala City was performed. Methods such as Supervised Classification and Post Classification Change Detection were adopted to perform this evaluation. Scrutiny of findings revealed Central Division registered the lowest level of conformance to the planning standards specified in the KPDP 2012 followed by Nakawa, Rubaga, Kawempe, and Makindye. Furthermore, mixed-use development was identified as the land use with the highest level of non-conformity of 25.11% and institutional land use registered the highest level of conformance of 84.45 %. The results show that the aspect of location was not carefully considered while allocating uses in the KPDP whereby areas located near the Central Business District have higher land rents and hence require uses that ensure profit maximization. Also, the prominence of development towards mixed-use denotes an increased demand for land towards compact development that was not catered for in the plan. Therefore in order to transform Kampala city into a spatially planned area, there is need to carefully develop detailed plans especially for all the Central Division planning precincts indicating considerations for land use densification.

Keywords: spatial plan, post classification change detection, Kampala city, landuse

Procedia PDF Downloads 92
1506 High-Resolution ECG Automated Analysis and Diagnosis

Authors: Ayad Dalloo, Sulaf Dalloo

Abstract:

Electrocardiogram (ECG) recording is prone to complications, on analysis by physicians, due to noise and artifacts, thus creating ambiguity leading to possible error of diagnosis. Such drawbacks may be overcome with the advent of high resolution Methods, such as Discrete Wavelet Analysis and Digital Signal Processing (DSP) techniques. This ECG signal analysis is implemented in three stages: ECG preprocessing, features extraction and classification with the aim of realizing high resolution ECG diagnosis and improved detection of abnormal conditions in the heart. The preprocessing stage involves removing spurious artifacts (noise), due to such factors as muscle contraction, motion, respiration, etc. ECG features are extracted by applying DSP and suggested sloping method techniques. These measured features represent peak amplitude values and intervals of P, Q, R, S, R’, and T waves on ECG, and other features such as ST elevation, QRS width, heart rate, electrical axis, QR and QT intervals. The classification is preformed using these extracted features and the criteria for cardiovascular diseases. The ECG diagnostic system is successfully applied to 12-lead ECG recordings for 12 cases. The system is provided with information to enable it diagnoses 15 different diseases. Physician’s and computer’s diagnoses are compared with 90% agreement, with respect to physician diagnosis, and the time taken for diagnosis is 2 seconds. All of these operations are programmed in Matlab environment.

Keywords: ECG diagnostic system, QRS detection, ECG baseline removal, cardiovascular diseases

Procedia PDF Downloads 297
1505 A Decision Support System to Detect the Lumbar Disc Disease on the Basis of Clinical MRI

Authors: Yavuz Unal, Kemal Polat, H. Erdinc Kocer

Abstract:

In this study, a decision support system comprising three stages has been proposed to detect the disc abnormalities of the lumbar region. In the first stage named the feature extraction, T2-weighted sagittal and axial Magnetic Resonance Images (MRI) were taken from 55 people and then 27 appearance and shape features were acquired from both sagittal and transverse images. In the second stage named the feature weighting process, k-means clustering based feature weighting (KMCBFW) proposed by Gunes et al. Finally, in the third stage named the classification process, the classifier algorithms including multi-layer perceptron (MLP- neural network), support vector machine (SVM), Naïve Bayes, and decision tree have been used to classify whether the subject has lumbar disc or not. In order to test the performance of the proposed method, the classification accuracy (%), sensitivity, specificity, precision, recall, f-measure, kappa value, and computation times have been used. The best hybrid model is the combination of k-means clustering based feature weighting and decision tree in the detecting of lumbar disc disease based on both sagittal and axial MR images.

Keywords: lumbar disc abnormality, lumbar MRI, lumbar spine, hybrid models, hybrid features, k-means clustering based feature weighting

Procedia PDF Downloads 520
1504 Evaluating Machine Learning Techniques for Activity Classification in Smart Home Environments

Authors: Talal Alshammari, Nasser Alshammari, Mohamed Sedky, Chris Howard

Abstract:

With the widespread adoption of the Internet-connected devices, and with the prevalence of the Internet of Things (IoT) applications, there is an increased interest in machine learning techniques that can provide useful and interesting services in the smart home domain. The areas that machine learning techniques can help advance are varied and ever-evolving. Classifying smart home inhabitants’ Activities of Daily Living (ADLs), is one prominent example. The ability of machine learning technique to find meaningful spatio-temporal relations of high-dimensional data is an important requirement as well. This paper presents a comparative evaluation of state-of-the-art machine learning techniques to classify ADLs in the smart home domain. Forty-two synthetic datasets and two real-world datasets with multiple inhabitants are used to evaluate and compare the performance of the identified machine learning techniques. Our results show significant performance differences between the evaluated techniques. Such as AdaBoost, Cortical Learning Algorithm (CLA), Decision Trees, Hidden Markov Model (HMM), Multi-layer Perceptron (MLP), Structured Perceptron and Support Vector Machines (SVM). Overall, neural network based techniques have shown superiority over the other tested techniques.

Keywords: activities of daily living, classification, internet of things, machine learning, prediction, smart home

Procedia PDF Downloads 357
1503 Fatigue Life Estimation of Tubular Joints - A Comparative Study

Authors: Jeron Maheswaran, Sudath C. Siriwardane

Abstract:

In fatigue analysis, the structural detail of tubular joint has taken great attention among engineers. The DNV-RP-C203 is covering this topic quite well for simple and clear joint cases. For complex joint and geometry, where joint classification isn’t available and limitation on validity range of non-dimensional geometric parameters, the challenges become a fact among engineers. The classification of joint is important to carry out through the fatigue analysis. These joint configurations are identified by the connectivity and the load distribution of tubular joints. To overcome these problems to some extent, this paper compare the fatigue life of tubular joints in offshore jacket according to the stress concentration factors (SCF) in DNV-RP-C203 and finite element method employed Abaqus/CAE. The paper presents the geometric details, material properties and considered load history of the jacket structure. Describe the global structural analysis and identification of critical tubular joints for fatigue life estimation. Hence fatigue life is determined based on the guidelines provided by design codes. Fatigue analysis of tubular joints is conducted using finite element employed Abaqus/CAE [4] as next major step. Finally, obtained SCFs and fatigue lives are compared and their significances are discussed.

Keywords: fatigue life, stress-concentration factor, finite element analysis, offshore jacket structure

Procedia PDF Downloads 453
1502 Investigating the Causes of Human Error-Induced Incidents in the Maintenance Operations of Petrochemical Industry by Using Human Factors Analysis and Classification System

Authors: Omid Kalatpour, Mohammadreza Ajdari

Abstract:

This article studied the possible causes of human error-induced incidents in the petrochemical industry maintenance activities by using Human Factors Analysis and Classification System (HFACS). The purpose of the study was anticipating and identifying these causes and proposing corrective and preventive actions. Maintenance department in a petrochemical company was selected for research. A checklist of human error-induced incidents was developed based on four HFACS main levels and nineteen sub-groups. Hierarchical task analysis (HTA) technique was used to identify maintenance activities and tasks. The main causes of possible incidents were identified by checklist and recorded. Corrective and preventive actions were defined depending on priority. Analyzing the worksheets of 444 activities in four levels of HFACS showed 37.6% of the causes were at the level of unsafe actions, 27.5% at the level of unsafe supervision, 20.9% at the level of preconditions for unsafe acts and 14% of the causes were at the level of organizational effects. The HFACS sub-groups showed errors (24.36%) inadequate supervision (14.89%) and violations (13.26%) with the most frequency. According to findings of this study, increasing the training effectiveness of operators and supervision improvement respectively are the most important measures in decreasing the human error-induced incidents in petrochemical industry maintenance.

Keywords: human error, petrochemical industry, maintenance, HFACS

Procedia PDF Downloads 242
1501 SCNet: A Vehicle Color Classification Network Based on Spatial Cluster Loss and Channel Attention Mechanism

Authors: Fei Gao, Xinyang Dong, Yisu Ge, Shufang Lu, Libo Weng

Abstract:

Vehicle color recognition plays an important role in traffic accident investigation. However, due to the influence of illumination, weather, and noise, vehicle color recognition still faces challenges. In this paper, a vehicle color classification network based on spatial cluster loss and channel attention mechanism (SCNet) is proposed for vehicle color recognition. A channel attention module is applied to extract the features of vehicle color representative regions and reduce the weight of nonrepresentative color regions in the channel. The proposed loss function, called spatial clustering loss (SC-loss), consists of two channel-specific components, such as a concentration component and a diversity component. The concentration component forces all feature channels belonging to the same class to be concentrated through the channel cluster. The diversity components impose additional constraints on the channels through the mean distance coefficient, making them mutually exclusive in spatial dimensions. In the comparison experiments, the proposed method can achieve state-of-the-art performance on the public datasets, VCD, and VeRi, which are 96.1% and 96.2%, respectively. In addition, the ablation experiment further proves that SC-loss can effectively improve the accuracy of vehicle color recognition.

Keywords: feature extraction, convolutional neural networks, intelligent transportation, vehicle color recognition

Procedia PDF Downloads 183
1500 DeepNIC a Method to Transform Each Tabular Variable into an Independant Image Analyzable by Basic CNNs

Authors: Nguyen J. M., Lucas G., Ruan S., Digonnet H., Antonioli D.

Abstract:

Introduction: Deep Learning (DL) is a very powerful tool for analyzing image data. But for tabular data, it cannot compete with machine learning methods like XGBoost. The research question becomes: can tabular data be transformed into images that can be analyzed by simple CNNs (Convolutional Neuron Networks)? Will DL be the absolute tool for data classification? All current solutions consist in repositioning the variables in a 2x2 matrix using their correlation proximity. In doing so, it obtains an image whose pixels are the variables. We implement a technology, DeepNIC, that offers the possibility of obtaining an image for each variable, which can be analyzed by simple CNNs. Material and method: The 'ROP' (Regression OPtimized) model is a binary and atypical decision tree whose nodes are managed by a new artificial neuron, the Neurop. By positioning an artificial neuron in each node of the decision trees, it is possible to make an adjustment on a theoretically infinite number of variables at each node. From this new decision tree whose nodes are artificial neurons, we created the concept of a 'Random Forest of Perfect Trees' (RFPT), which disobeys Breiman's concepts by assembling very large numbers of small trees with no classification errors. From the results of the RFPT, we developed a family of 10 statistical information criteria, Nguyen Information Criterion (NICs), which evaluates in 3 dimensions the predictive quality of a variable: Performance, Complexity and Multiplicity of solution. A NIC is a probability that can be transformed into a grey level. The value of a NIC depends essentially on 2 super parameters used in Neurops. By varying these 2 super parameters, we obtain a 2x2 matrix of probabilities for each NIC. We can combine these 10 NICs with the functions AND, OR, and XOR. The total number of combinations is greater than 100,000. In total, we obtain for each variable an image of at least 1166x1167 pixels. The intensity of the pixels is proportional to the probability of the associated NIC. The color depends on the associated NIC. This image actually contains considerable information about the ability of the variable to make the prediction of Y, depending on the presence or absence of other variables. A basic CNNs model was trained for supervised classification. Results: The first results are impressive. Using the GSE22513 public data (Omic data set of markers of Taxane Sensitivity in Breast Cancer), DEEPNic outperformed other statistical methods, including XGBoost. We still need to generalize the comparison on several databases. Conclusion: The ability to transform any tabular variable into an image offers the possibility of merging image and tabular information in the same format. This opens up great perspectives in the analysis of metadata.

Keywords: tabular data, CNNs, NICs, DeepNICs, random forest of perfect trees, classification

Procedia PDF Downloads 125
1499 A Mechanical Diagnosis Method Based on Vibration Fault Signal down-Sampling and the Improved One-Dimensional Convolutional Neural Network

Authors: Bowei Yuan, Shi Li, Liuyang Song, Huaqing Wang, Lingli Cui

Abstract:

Convolutional neural networks (CNN) have received extensive attention in the field of fault diagnosis. Many fault diagnosis methods use CNN for fault type identification. However, when the amount of raw data collected by sensors is massive, the neural network needs to perform a time-consuming classification task. In this paper, a mechanical fault diagnosis method based on vibration signal down-sampling and the improved one-dimensional convolutional neural network is proposed. Through the robust principal component analysis, the low-rank feature matrix of a large amount of raw data can be separated, and then down-sampling is realized to reduce the subsequent calculation amount. In the improved one-dimensional CNN, a smaller convolution kernel is used to reduce the number of parameters and computational complexity, and regularization is introduced before the fully connected layer to prevent overfitting. In addition, the multi-connected layers can better generalize classification results without cumbersome parameter adjustments. The effectiveness of the method is verified by monitoring the signal of the centrifugal pump test bench, and the average test accuracy is above 98%. When compared with the traditional deep belief network (DBN) and support vector machine (SVM) methods, this method has better performance.

Keywords: fault diagnosis, vibration signal down-sampling, 1D-CNN

Procedia PDF Downloads 131
1498 Quantitative Structure–Activity Relationship Analysis of Some Benzimidazole Derivatives by Linear Multivariate Method

Authors: Strahinja Z. Kovačević, Lidija R. Jevrić, Sanja O. Podunavac Kuzmanović

Abstract:

The relationship between antibacterial activity of eighteen different substituted benzimidazole derivatives and their molecular characteristics was studied using chemometric QSAR (Quantitative Structure–Activity Relationships) approach. QSAR analysis has been carried out on inhibitory activity towards Staphylococcus aureus, by using molecular descriptors, as well as minimal inhibitory activity (MIC). Molecular descriptors were calculated from the optimized structures. Principal component analysis (PCA) followed by hierarchical cluster analysis (HCA) and multiple linear regression (MLR) was performed in order to select molecular descriptors that best describe the antibacterial behavior of the compounds investigated, and to determine the similarities between molecules. The HCA grouped the molecules in separated clusters which have the similar inhibitory activity. PCA showed very similar classification of molecules as the HCA, and displayed which descriptors contribute to that classification. MLR equations, that represent MIC as a function of the in silico molecular descriptors were established. The statistical significance of the estimated models was confirmed by standard statistical measures and cross-validation parameters (SD = 0.0816, F = 46.27, R = 0.9791, R2CV = 0.8266, R2adj = 0.9379, PRESS = 0.1116). These parameters indicate the possibility of application of the established chemometric models in prediction of the antibacterial behaviour of studied derivatives and structurally very similar compounds.

Keywords: antibacterial, benzimidazole, molecular descriptors, QSAR

Procedia PDF Downloads 364
1497 Feature Selection Approach for the Classification of Hydraulic Leakages in Hydraulic Final Inspection using Machine Learning

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Manufacturing companies are facing global competition and enormous cost pressure. The use of machine learning applications can help reduce production costs and create added value. Predictive quality enables the securing of product quality through data-supported predictions using machine learning models as a basis for decisions on test results. Furthermore, machine learning methods are able to process large amounts of data, deal with unfavourable row-column ratios and detect dependencies between the covariates and the given target as well as assess the multidimensional influence of all input variables on the target. Real production data are often subject to highly fluctuating boundary conditions and unbalanced data sets. Changes in production data manifest themselves in trends, systematic shifts, and seasonal effects. Thus, Machine learning applications require intensive pre-processing and feature selection. Data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets. Within the used real data set of Bosch hydraulic valves, the comparability of the same production conditions in the production of hydraulic valves within certain time periods can be identified by applying the concept drift method. Furthermore, a classification model is developed to evaluate the feature importance in different subsets within the identified time periods. By selecting comparable and stable features, the number of features used can be significantly reduced without a strong decrease in predictive power. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. In this research, the ada boosting classifier is used to predict the leakage of hydraulic valves based on geometric gauge blocks from machining, mating data from the assembly, and hydraulic measurement data from end-of-line testing. In addition, the most suitable methods are selected and accurate quality predictions are achieved.

Keywords: classification, achine learning, predictive quality, feature selection

Procedia PDF Downloads 162
1496 Roughness Discrimination Using Bioinspired Tactile Sensors

Authors: Zhengkun Yi

Abstract:

Surface texture discrimination using artificial tactile sensors has attracted increasing attentions in the past decade as it can endow technical and robot systems with a key missing ability. However, as a major component of texture, roughness has rarely been explored. This paper presents an approach for tactile surface roughness discrimination, which includes two parts: (1) design and fabrication of a bioinspired artificial fingertip, and (2) tactile signal processing for tactile surface roughness discrimination. The bioinspired fingertip is comprised of two polydimethylsiloxane (PDMS) layers, a polymethyl methacrylate (PMMA) bar, and two perpendicular polyvinylidene difluoride (PVDF) film sensors. This artificial fingertip mimics human fingertips in three aspects: (1) Elastic properties of epidermis and dermis in human skin are replicated by the two PDMS layers with different stiffness, (2) The PMMA bar serves the role analogous to that of a bone, and (3) PVDF film sensors emulate Meissner’s corpuscles in terms of both location and response to the vibratory stimuli. Various extracted features and classification algorithms including support vector machines (SVM) and k-nearest neighbors (kNN) are examined for tactile surface roughness discrimination. Eight standard rough surfaces with roughness values (Ra) of 50 μm, 25 μm, 12.5 μm, 6.3 μm 3.2 μm, 1.6 μm, 0.8 μm, and 0.4 μm are explored. The highest classification accuracy of (82.6 ± 10.8) % can be achieved using solely one PVDF film sensor with kNN (k = 9) classifier and the standard deviation feature.

Keywords: bioinspired fingertip, classifier, feature extraction, roughness discrimination

Procedia PDF Downloads 312
1495 Preliminary Evaluation of Decommissioning Wastes for the First Commercial Nuclear Power Reactor in South Korea

Authors: Kyomin Lee, Joohee Kim, Sangho Kang

Abstract:

The commercial nuclear power reactor in South Korea, Kori Unit 1, which was a 587 MWe pressurized water reactor that started operation since 1978, was permanently shut down in June 2017 without an additional operating license extension. The Kori 1 Unit is scheduled to become the nuclear power unit to enter the decommissioning phase. In this study, the preliminary evaluation of the decommissioning wastes for the Kori Unit 1 was performed based on the following series of process: firstly, the plant inventory is investigated based on various documents (i.e., equipment/ component list, construction records, general arrangement drawings). Secondly, the radiological conditions of systems, structures and components (SSCs) are established to estimate the amount of radioactive waste by waste classification. Third, the waste management strategies for Kori Unit 1 including waste packaging are established. Forth, selection of the proper decontamination and dismantling (D&D) technologies is made considering the various factors. Finally, the amount of decommissioning waste by classification for Kori 1 is estimated using the DeCAT program, which was developed by KEPCO-E&C for a decommissioning cost estimation. The preliminary evaluation results have shown that the expected amounts of decommissioning wastes were less than about 2% and 8% of the total wastes generated (i.e., sum of clean wastes and radwastes) before/after waste processing, respectively, and it was found that the majority of contaminated material was carbon or alloy steel and stainless steel. In addition, within the range of availability of information, the results of the evaluation were compared with the results from the various decommissioning experiences data or international/national decommissioning study. The comparison results have shown that the radioactive waste amount from Kori Unit 1 decommissioning were much less than those from the plants decommissioned in U.S. and were comparable to those from the plants in Europe. This result comes from the difference of disposal cost and clearance criteria (i.e., free release level) between U.S. and non-U.S. The preliminary evaluation performed using the methodology established in this study will be useful as a important information in establishing the decommissioning planning for the decommissioning schedule and waste management strategy establishment including the transportation, packaging, handling, and disposal of radioactive wastes.

Keywords: characterization, classification, decommissioning, decontamination and dismantling, Kori 1, radioactive waste

Procedia PDF Downloads 209
1494 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features

Authors: Kyi Pyar Zaw, Zin Mar Kyu

Abstract:

Character recognition is the process of converting a text image file into editable and searchable text file. Feature Extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed paper, 25 features for one character are used in character recognition. Basically, there are three steps of character recognition such as character segmentation, feature extraction and classification. In segmentation step, horizontal cropping method is used for line segmentation and vertical cropping method is used for character segmentation. In the Feature extraction step, features are extracted in two ways. The first way is that the 8 features are extracted from the entire input character using eight direction chain code frequency extraction. The second way is that the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a feature for one block. Therefore, 16 features are extracted from that 16 blocks in the second way. We use the number of holes feature to cluster the similar characters. We can recognize the almost Myanmar common characters with various font sizes by using these features. All these 25 features are used in both training part and testing part. In the classification step, the characters are classified by matching the all features of input character with already trained features of characters.

Keywords: chain code frequency, character recognition, feature extraction, features matching, segmentation

Procedia PDF Downloads 320
1493 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 49
1492 Classification of Multiple Cancer Types with Deep Convolutional Neural Network

Authors: Nan Deng, Zhenqiu Liu

Abstract:

Thousands of patients with metastatic tumors were diagnosed with cancers of unknown primary sites each year. The inability to identify the primary cancer site may lead to inappropriate treatment and unexpected prognosis. Nowadays, a large amount of genomics and transcriptomics cancer data has been generated by next-generation sequencing (NGS) technologies, and The Cancer Genome Atlas (TCGA) database has accrued thousands of human cancer tumors and healthy controls, which provides an abundance of resource to differentiate cancer types. Meanwhile, deep convolutional neural networks (CNNs) have shown high accuracy on classification among a large number of image object categories. Here, we utilize 25 cancer primary tumors and 3 normal tissues from TCGA and convert their RNA-Seq gene expression profiling to color images; train, validate and test a CNN classifier directly from these images. The performance result shows that our CNN classifier can archive >80% test accuracy on most of the tumors and normal tissues. Since the gene expression pattern of distant metastases is similar to their primary tumors, the CNN classifier may provide a potential computational strategy on identifying the unknown primary origin of metastatic cancer in order to plan appropriate treatment for patients.

Keywords: bioinformatics, cancer, convolutional neural network, deep leaning, gene expression pattern

Procedia PDF Downloads 299
1491 Semantic Differences between Bug Labeling of Different Repositories via Machine Learning

Authors: Pooja Khanal, Huaming Zhang

Abstract:

Labeling of issues/bugs, also known as bug classification, plays a vital role in software engineering. Some known labels/classes of bugs are 'User Interface', 'Security', and 'API'. Most of the time, when a reporter reports a bug, they try to assign some predefined label to it. Those issues are reported for a project, and each project is a repository in GitHub/GitLab, which contains multiple issues. There are many software project repositories -ranging from individual projects to commercial projects. The labels assigned for different repositories may be dependent on various factors like human instinct, generalization of labels, label assignment policy followed by the reporter, etc. While the reporter of the issue may instinctively give that issue a label, another person reporting the same issue may label it differently. This way, it is not known mathematically if a label in one repository is similar or different to the label in another repository. Hence, the primary goal of this research is to find the semantic differences between bug labeling of different repositories via machine learning. Independent optimal classifiers for individual repositories are built first using the text features from the reported issues. The optimal classifiers may include a combination of multiple classifiers stacked together. Then, those classifiers are used to cross-test other repositories which leads the result to be deduced mathematically. The produce of this ongoing research includes a formalized open-source GitHub issues database that is used to deduce the similarity of the labels pertaining to the different repositories.

Keywords: bug classification, bug labels, GitHub issues, semantic differences

Procedia PDF Downloads 200
1490 Object-Scene: Deep Convolutional Representation for Scene Classification

Authors: Yanjun Chen, Chuanping Hu, Jie Shao, Lin Mei, Chongyang Zhang

Abstract:

Traditional image classification is based on encoding scheme (e.g. Fisher Vector, Vector of Locally Aggregated Descriptor) with low-level image features (e.g. SIFT, HoG). Compared to these low-level local features, deep convolutional features obtained at the mid-level layer of convolutional neural networks (CNN) have richer information but lack of geometric invariance. For scene classification, there are scattered objects with different size, category, layout, number and so on. It is crucial to find the distinctive objects in scene as well as their co-occurrence relationship. In this paper, we propose a method to take advantage of both deep convolutional features and the traditional encoding scheme while taking object-centric and scene-centric information into consideration. First, to exploit the object-centric and scene-centric information, two CNNs that trained on ImageNet and Places dataset separately are used as the pre-trained models to extract deep convolutional features at multiple scales. This produces dense local activations. By analyzing the performance of different CNNs at multiple scales, it is found that each CNN works better in different scale ranges. A scale-wise CNN adaption is reasonable since objects in scene are at its own specific scale. Second, a fisher kernel is applied to aggregate a global representation at each scale and then to merge into a single vector by using a post-processing method called scale-wise normalization. The essence of Fisher Vector lies on the accumulation of the first and second order differences. Hence, the scale-wise normalization followed by average pooling would balance the influence of each scale since different amount of features are extracted. Third, the Fisher vector representation based on the deep convolutional features is followed by a linear Supported Vector Machine, which is a simple yet efficient way to classify the scene categories. Experimental results show that the scale-specific feature extraction and normalization with CNNs trained on object-centric and scene-centric datasets can boost the results from 74.03% up to 79.43% on MIT Indoor67 when only two scales are used (compared to results at single scale). The result is comparable to state-of-art performance which proves that the representation can be applied to other visual recognition tasks.

Keywords: deep convolutional features, Fisher Vector, multiple scales, scale-specific normalization

Procedia PDF Downloads 331
1489 Machine Learning Approach for Automating Electronic Component Error Classification and Detection

Authors: Monica Racha, Siva Chandrasekaran, Alex Stojcevski

Abstract:

The engineering programs focus on promoting students' personal and professional development by ensuring that students acquire technical and professional competencies during four-year studies. The traditional engineering laboratory provides an opportunity for students to "practice by doing," and laboratory facilities aid them in obtaining insight and understanding of their discipline. Due to rapid technological advancements and the current COVID-19 outbreak, the traditional labs were transforming into virtual learning environments. Aim: To better understand the limitations of the physical laboratory, this research study aims to use a Machine Learning (ML) algorithm that interfaces with the Augmented Reality HoloLens and predicts the image behavior to classify and detect the electronic components. The automated electronic components error classification and detection automatically detect and classify the position of all components on a breadboard by using the ML algorithm. This research will assist first-year undergraduate engineering students in conducting laboratory practices without any supervision. With the help of HoloLens, and ML algorithm, students will reduce component placement error on a breadboard and increase the efficiency of simple laboratory practices virtually. Method: The images of breadboards, resistors, capacitors, transistors, and other electrical components will be collected using HoloLens 2 and stored in a database. The collected image dataset will then be used for training a machine learning model. The raw images will be cleaned, processed, and labeled to facilitate further analysis of components error classification and detection. For instance, when students conduct laboratory experiments, the HoloLens captures images of students placing different components on a breadboard. The images are forwarded to the server for detection in the background. A hybrid Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) algorithm will be used to train the dataset for object recognition and classification. The convolution layer extracts image features, which are then classified using Support Vector Machine (SVM). By adequately labeling the training data and classifying, the model will predict, categorize, and assess students in placing components correctly. As a result, the data acquired through HoloLens includes images of students assembling electronic components. It constantly checks to see if students appropriately position components in the breadboard and connect the components to function. When students misplace any components, the HoloLens predicts the error before the user places the components in the incorrect proportion and fosters students to correct their mistakes. This hybrid Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) algorithm automating electronic component error classification and detection approach eliminates component connection problems and minimizes the risk of component damage. Conclusion: These augmented reality smart glasses powered by machine learning provide a wide range of benefits to supervisors, professionals, and students. It helps customize the learning experience, which is particularly beneficial in large classes with limited time. It determines the accuracy with which machine learning algorithms can forecast whether students are making the correct decisions and completing their laboratory tasks.

Keywords: augmented reality, machine learning, object recognition, virtual laboratories

Procedia PDF Downloads 134
1488 Hacking's 'Between Goffman and Foucault': A Theoretical Frame for Criminology

Authors: Tomás Speziale

Abstract:

This paper aims to analyse how Ian Hacking states the theoretical basis of his research on the classification of people. Although all his early philosophical education had been based in Foucault, it is also true that Erving Goffman’s perspective provided him with epistemological and methodological tools for understanding face-to-face relationships. Hence, all his works must be thought of as social science texts that combine the research on how the individuals are constituted ‘top-down’ (as in Foucault), with the inquiry into how people renegotiate ‘bottom-up’ the classifications about them. Thus, Hacking´s proposal constitutes a middle ground between the French Philosopher and the American Sociologist. Placing himself between both authors allows Hacking to build a frame that is expected to adjust to Social Sciences’ main particularity: the fact that they study interactive kinds. These are kinds of people, which imply that those who are classified can change in certain ways that prompt the need for changing previous classifications themselves. It is all about the interaction between the labelling of people and the people who are classified. Consequently, understanding the way in which Hacking uses Foucault’s and Goffman’s theories is essential to fully comprehend the social dynamic between individuals and concepts, what Bert Hansen had called dialectical realism. His theoretical proposal, therefore, is not only valuable because it combines diverse perspectives, but also because it constitutes an utterly original and relevant framework for Sociological theory and particularly for Criminology.

Keywords: classification of people, Foucault's archaeology, Goffman's interpersonal sociology, interactive kinds

Procedia PDF Downloads 343
1487 Technologic Information about Photovoltaic Applied in Urban Residences

Authors: Stephanie Fabris Russo, Daiane Costa Guimarães, Jonas Pedro Fabris, Maria Emilia Camargo, Suzana Leitão Russo, José Augusto Andrade Filho

Abstract:

Among renewable energy sources, solar energy is the one that has stood out. Solar radiation can be used as a thermal energy source and can also be converted into electricity by means of effects on certain materials, such as thermoelectric and photovoltaic panels. These panels are often used to generate energy in homes, buildings, arenas, etc., and have low pollution emissions. Thus, a technological prospecting was performed to find patents related to the use of photovoltaic plates in urban residences. The patent search was based on ESPACENET, associating the keywords photovoltaic and home, where we found 136 patent documents in the period of 1994-2015 in the fields title and abstract. Note that the years 2009, 2010, 2011, 2012, 2013 and 2014 had the highest number of applicants, with respectively, 11, 13, 23, 29, 15 and 21. Regarding the country that deposited about this technology, it is clear that China leads with 67 patent deposits, followed by Japan with 38 patents applications. It is important to note that most depositors, 50% are companies, 44% are individual inventors and only 6% are universities. On the International Patent classification (IPC) codes, we noted that the most present classification in results was H02J3/38, which represents provisions in parallel to feed a single network by two or more generators, converters or transformers. Among all categories, there is the H session, which means Electricity, with 70% of the patents.

Keywords: photovoltaic, urban residences, technology forecasting, prospecting

Procedia PDF Downloads 300
1486 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) who are in need of a data-based and analytical approach to stock proper movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data and their test error-in-sample were compared. Comparison of error-out-sample was also made under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use machine learning approach to predict customers’ preferences with a small data set and design prediction tools for these enterprises.

Keywords: computational social science, movie preference, machine learning, SVM

Procedia PDF Downloads 260
1485 An Improved Parallel Algorithm of Decision Tree

Authors: Jiameng Wang, Yunfei Yin, Xiyu Deng

Abstract:

Parallel optimization is one of the important research topics of data mining at this stage. Taking Classification and Regression Tree (CART) parallelization as an example, this paper proposes a parallel data mining algorithm based on SSP-OGini-PCCP. Aiming at the problem of choosing the best CART segmentation point, this paper designs an S-SP model without data association; and in order to calculate the Gini index efficiently, a parallel OGini calculation method is designed. In addition, in order to improve the efficiency of the pruning algorithm, a synchronous PCCP pruning strategy is proposed in this paper. In this paper, the optimal segmentation calculation, Gini index calculation, and pruning algorithm are studied in depth. These are important components of parallel data mining. By constructing a distributed cluster simulation system based on SPARK, data mining methods based on SSP-OGini-PCCP are tested. Experimental results show that this method can increase the search efficiency of the best segmentation point by an average of 89%, increase the search efficiency of the Gini segmentation index by 3853%, and increase the pruning efficiency by 146% on average; and as the size of the data set increases, the performance of the algorithm remains stable, which meets the requirements of contemporary massive data processing.

Keywords: classification, Gini index, parallel data mining, pruning ahead

Procedia PDF Downloads 123