Search results for: classification of filters
1655 The Optimum Mel-Frequency Cepstral Coefficients (MFCCs) Contribution to Iranian Traditional Music Genre Classification by Instrumental Features
Authors: M. Abbasi Layegh, S. Haghipour, K. Athari, R. Khosravi, M. Tafkikialamdari
Abstract:
An approach is proposed for finding the optimum mel-frequency cepstral coefficients (MFCCs) for the Radif of Mirzâ Ábdollâh, the principal emblem and heart of Persian music, as performed by the most famous Iranian masters on two Iranian stringed instruments, the ‘Tar’ and the ‘Setar’. While investigating the variance of the MFCCs for each record in a music database of 1500 gushe of the repertoire belonging to 12 modal systems (dastgâh and âvâz), we applied the fuzzy c-means clustering algorithm to each of the 12 coefficients and to different combinations of those coefficients. We repeated the experiment while increasing the number of coefficients, but the clustering accuracy remained the same. We therefore conclude that the first 7 MFCCs (V-7MFCC) are sufficient for classification of the Radif of Mirzâ Ábdollâh. Classical machine learning algorithms such as MLP neural networks, K-Nearest Neighbors (KNN), Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), and Support Vector Machine (SVM) were employed. Of these, SVM performed better than the other algorithms in this study.
Keywords: radif of Mirzâ Ábdollâh, Gushe, mel frequency cepstral coefficients, fuzzy c-mean clustering algorithm, k-nearest neighbors (KNN), gaussian mixture model (GMM), hidden markov model (HMM), support vector machine (SVM)
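The clustering step named in this abstract can be sketched as follows. This is a minimal fuzzy c-means implementation in plain Python, not the authors' code: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, and the toy two-dimensional vectors standing in for MFCC features are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Cluster `points` (lists of floats) and return (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial membership matrix; every row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([value / total for value in row])

    centers = []
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(n)]
            w_sum = sum(weights)
            centers.append(
                [sum(weights[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Memberships follow from the relative distances to all centers.
        new_u = []
        for i in range(n):
            dists = [max(math.dist(points[i], c), 1e-12) for c in centers]
            row = []
            for k in range(n_clusters):
                denom = sum((dists[k] / dists[j]) ** (2.0 / (m - 1.0))
                            for j in range(n_clusters))
                row.append(1.0 / denom)
            new_u.append(row)

        change = max(abs(new_u[i][k] - u[i][k])
                     for i in range(n) for k in range(n_clusters))
        u = new_u
        if change < tol:
            break
    return centers, u


# Hypothetical feature vectors (stand-ins for per-record MFCC statistics).
toy_vectors = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
               [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
toy_centers, toy_memberships = fuzzy_c_means(toy_vectors, n_clusters=2)
toy_labels = [row.index(max(row)) for row in toy_memberships]
```

Unlike hard k-means, each record keeps a degree of membership in every cluster; taking the largest membership per row, as `toy_labels` does, gives a hard assignment when one is needed.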
Procedia PDF Downloads 445
1654 Performance Analysis of Traffic Classification with Machine Learning
Authors: Htay Htay Yi, Zin May Aye
Abstract:
Network security plays an essential role in the ICT environment because the number of malicious users continues to grow in education, business, and other ICT-related domains. Network security violations are typically described and examined centrally through a security event management system. Firewalls, Intrusion Detection Systems (IDS), and Intrusion Prevention Systems are becoming essential for monitoring or preventing potential violations, attack incidents, and imminent threats. In this system, firewall rules are set only where the system policies require them. The datasets used in this system are derived from the testbed environment. DoS and PortScan traffic is applied in the testbed, which implements a firewall and an IDS. Network traffic in the testbed is classified as normal or attack using six machine learning classification methods. Tests are required to obtain the datasets for DoS and PortScan traffic. The dataset is based on CICIDS2017, and some features have been added. This system tested 26 features from the applied dataset. The system aims to reduce false positive rates and to improve accuracy in the implemented testbed design. It also demonstrates good performance by selecting important features and comparing against an existing dataset using machine learning classifiers.
Keywords: false negative rate, intrusion detection system, machine learning methods, performance
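The two quantities this abstract targets, false positive rate and accuracy, can be computed from predicted and true traffic labels as in the sketch below. The label strings and the toy label lists are invented for illustration and do not come from the paper's testbed.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Count true/false positives and negatives for a binary labelling."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1  # normal traffic flagged as an attack
        else:
            if truth == positive:
                fn += 1  # attack traffic that was missed
            else:
                tn += 1
    return tp, fp, tn, fn


def detection_metrics(tp, fp, tn, fn):
    """Accuracy, false positive rate and false negative rate."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Hypothetical labels for eight flows.
toy_true = ["normal", "normal", "attack", "attack",
            "normal", "attack", "normal", "normal"]
toy_pred = ["normal", "attack", "attack", "attack",
            "normal", "normal", "normal", "normal"]
toy_metrics = detection_metrics(*confusion_counts(toy_true, toy_pred))
```

The false positive rate is taken over normal flows only, so a classifier can have high accuracy on an attack-heavy dataset and still raise many false alarms.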
Procedia PDF Downloads 116
1653 Optimum Design of Attenuator of Spun-Bond Production System
Authors: Nasser Ghassembaglou, Abdullah Bolek, Oktay Yilmaz, Ertan Oznergiz, Hikmet Kocabas, Safak Yilmaz
Abstract:
Nanofibers are an effective material that has frequently been investigated for producing high-quality air filters. As an environmentally friendly approach, our aim is to produce nanofibers by melting. Spun-bond systems use an extruder, a spin pump, a nozzle package, and an attenuator. The molten polymer flowing from the extruder is made steady by the spin pump. The regulated melt passes through the nozzle holes and forms fibers under high pressure. The fibers pulled from the nozzle are reduced to micron size by the attenuator; after solidification, they are collected on a conveyor. In this research, different attenuator designs were studied and analyzed with CFD. One of these designs was then tested, and finally some optimizations were made to reduce pressure loss and increase air velocity.
Keywords: attenuator, nanofiber, spun-bond, extruder
Procedia PDF Downloads 412
1652 The Direct Deconvolution Model for the Large Eddy Simulation of Turbulence
Authors: Ning Chang, Zelong Yuan, Yunpeng Wang, Jianchun Wang
Abstract:
Large eddy simulation (LES) has been extensively used in the investigation of turbulence. LES calculates the grid-resolved large-scale motions and leaves the small scales to be modeled by subfilter-scale (SFS) models. Among the existing SFS models, the deconvolution model has been used successfully in the LES of engineering and geophysical flows. Despite the wide application of deconvolution models, the effects of subfilter-scale dynamics and filter anisotropy on the accuracy of SFS modeling have not been investigated in depth. The results of LES are highly sensitive to the selection of filters and the anisotropy of the grid, which has been overlooked in previous research. In the current study, two critical aspects of LES are investigated. Firstly, we analyze the influence of SFS dynamics on the accuracy of direct deconvolution models (DDM) at varying filter-to-grid ratios (FGR) in isotropic turbulence. An array of invertible filters is employed, encompassing Gaussian, Helmholtz I and II, Butterworth, Chebyshev I and II, Cauchy, Pao, and rapidly decaying filters. The significance of the FGR becomes evident, as it acts as a pivotal factor in error control for precise SFS stress prediction. When the FGR is set to 1, the DDMs cannot accurately reconstruct the SFS stress because the SFS dynamics are insufficiently resolved. Notably, prediction capabilities are enhanced at an FGR of 2, resulting in accurate SFS stress reconstruction, except for cases involving Helmholtz I and II filters. A remarkable precision close to 100% is achieved at an FGR of 4 for all DDMs. Secondly, the study extends to filter anisotropy to assess its impact on the SFS dynamics and LES accuracy. Using the dynamic Smagorinsky model (DSM), the dynamic mixed model (DMM), and the direct deconvolution model (DDM) with anisotropic filters, aspect ratios (AR) ranging from 1 to 16 in LES filters are evaluated.
The findings highlight the DDM's proficiency in accurately predicting SFS stresses under highly anisotropic filtering conditions. High correlation coefficients exceeding 90% are observed in the a priori study for the DDM's reconstructed SFS stresses, surpassing those of the DSM and DMM models. However, these correlations tend to decrease as filter anisotropy increases. In the a posteriori studies, the DDM consistently outperforms the DSM and DMM models across various turbulence statistics, encompassing velocity spectra, probability density functions related to vorticity, SFS energy flux, velocity increments, strain-rate tensors, and SFS stress. It is observed that as filter anisotropy intensifies, the results of the DSM and DMM become worse, while the DDM continues to deliver satisfactory results across all filter-anisotropy scenarios. The findings emphasize the DDM framework's potential as a valuable tool for advancing the development of sophisticated SFS models for LES of turbulence.
Keywords: deconvolution model, large eddy simulation, subfilter scale modeling, turbulence
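The idea of an invertible filter and a filter-to-grid ratio can be illustrated in one dimension. The sketch below is only an assumption-laden toy, not the authors' method: it uses the Gaussian transfer function exp(-k²Δ²/24) that is a common LES convention (the abstract does not state the formula), a periodic signal with two Fourier modes, and a plain discrete Fourier transform. It shows that filtering followed by the exact inverse filter recovers the signal; it does not reconstruct SFS stresses.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of `dft`."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * j / n)
                for k in range(n)) / n
            for j in range(n)]


def gaussian_transfer(k, delta):
    """Assumed Gaussian filter transfer function for filter width `delta`."""
    return math.exp(-(k * delta) ** 2 / 24.0)


def apply_filter(u, delta, length, inverse=False):
    """Filter a periodic signal in Fourier space, or undo the filter."""
    n = len(u)
    spectrum = dft(u)
    out = []
    for idx in range(n):
        mode = idx if idx <= n // 2 else idx - n   # signed mode number
        k = 2.0 * math.pi * mode / length          # physical wavenumber
        g = gaussian_transfer(k, delta)
        out.append(spectrum[idx] / g if inverse else spectrum[idx] * g)
    return [value.real for value in idft(out)]


n_points = 32
length = 2.0 * math.pi
h = length / n_points            # grid spacing
fgr = 2.0                        # filter-to-grid ratio (illustrative value)
delta = fgr * h                  # filter width
u = [math.sin(j * h) + 0.5 * math.sin(3 * j * h) for j in range(n_points)]

filtered = apply_filter(u, delta, length)
recovered = apply_filter(filtered, delta, length, inverse=True)
```

Because the Gaussian transfer function never reaches zero, the division in the inverse step is always defined, which is what makes the filter invertible on the resolved wavenumbers.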
Procedia PDF Downloads 75
1651 Reliability of Using Standard Penetration Test (SPT) in Evaluation of Soil Properties
Authors: Hossein Alimohammadi, Mohsen Amirmojahedi, Mehrdad Rowhani
Abstract:
Soil properties are used by geotechnical engineers to evaluate and analyze site conditions for design purposes. Although basic soil classification tests are easy to perform and provide useful information for determining soil properties, they take time to produce results and add cost to projects. The Standard Penetration Test (SPT) provides an opportunity to evaluate soil parameters without performing laboratory tests. In addition to its simplicity and low cost, its results are available immediately. This research provides a guideline on the application of the SPT method and on the reliability of using SPT results to evaluate soil physical and mechanical properties such as Atterberg limits, shear strength, compressive strength, and compressibility parameters. A total of 70 boreholes were investigated in this study, with soil samples taken at depths between 1.2 and 15.25 meters. The project site was located in Morrow County, Ohio. A regression-based formula was proposed based on Tobit regression with a stepwise variable selection analysis conducted between SPT and other typical soil properties obtained from soil tests. The results of the research illustrated that the shear strength and physical properties of the soil affect the SPT number. The proposed correlation can help engineers use SPT results in their designs with higher accuracy.
Keywords: standard penetration test, soil properties, soil classification, regression method
Procedia PDF Downloads 188
1650 Detecting Venomous Files in IDS Using an Approach Based on Data Mining Algorithm
Authors: Sukhleen Kaur
Abstract:
In security groundwork, the Intrusion Detection System (IDS) has become an important component and has received increasing attention in recent years. An IDS is one of the effective ways to detect different kinds of attacks and malicious code in a network and helps to secure the network. Data mining techniques can be applied to IDS to analyze large amounts of data and give better results. Data mining can contribute to improving intrusion detection by adding a level of focus to anomaly detection. So far, studies have concentrated on finding attacks, whereas this paper detects malicious files. Some intruders do not attack directly; instead, they hide harmful code inside files, or corrupt those files, and attack the system. These files are detected according to defined parameters, which produce two lists of files: normal files and harmful files. After that, data mining is performed. In this paper, a hybrid classifier based on the Naive Bayes and Ripper classification methods has been used. The results show how a file uploaded to the database is tested against the parameters and characterised as either a normal or a harmful file, after which the mining is performed. Moreover, when a user tries to mine a harmful file, the system generates an exception stating that mining cannot be performed on corrupted or harmful files.
Keywords: data mining, association, classification, clustering, decision tree, intrusion detection system, misuse detection, anomaly detection, naive Bayes, ripper
Procedia PDF Downloads 413
1649 Classification of Foliar Nitrogen in Common Bean (Phaseolus Vulgaris L.) Using Deep Learning Models and Images
Authors: Marcos Silva Tavares, Jamile Raquel Regazzo, Edson José de Souza Sardinha, Murilo Mesquita Baesso
Abstract:
Common beans are a widely cultivated and consumed legume globally, serving as a staple food for humans, especially in developing countries, due to their nutritional characteristics. Nitrogen (N) is the most limiting nutrient for productivity, and foliar analysis is crucial to ensure balanced nitrogen fertilization. Excessive N applications can cause, either in isolation or cumulatively, soil and water contamination and plant toxicity, and can increase plant susceptibility to diseases and pests. However, the quantification of N using conventional methods is time-consuming and costly, demanding new technologies to optimize the supply of N to plants. Thus, it becomes necessary to establish constant monitoring of the foliar content of this macronutrient in plants, mainly at the V4 stage, aiming at precision management of nitrogen fertilization. In this work, the objective was to evaluate the performance of a deep learning model, ResNet-50, in the classification of foliar nitrogen in common beans using RGB images. The BRS Estilo cultivar was sown in a greenhouse in a completely randomized design with four nitrogen doses (T1 = 0 kg N ha⁻¹, T2 = 25 kg N ha⁻¹, T3 = 75 kg N ha⁻¹, and T4 = 100 kg N ha⁻¹) and 12 replications. Pots with 5 L capacity were used with a substrate composed of 43% soil (Neossolo Quartzarênico), 28.5% crushed sugarcane bagasse, and 28.5% cured bovine manure. The plants were supplied with 5 mm of water per day. The application of urea (45% N) and the acquisition of images occurred 14 and 32 days after sowing, respectively. A code developed in Matlab© R2022b was used to cut the original images into smaller blocks, producing an image bank composed of 4 folders representing the four classes, labeled T1, T2, T3, and T4, each containing 500 images of 224x224 pixels obtained from plants cultivated under different N doses. The Matlab© R2022b software was also used for the implementation and performance analysis of the model.
The efficiency was evaluated with a set of metrics, including accuracy (AC), F1-score (F1), specificity (SP), area under the curve (AUC), and precision (P). ResNet-50 showed high performance in the classification of foliar N levels in common beans, with an AC of 85.6%. The F1 for classes T1, T2, T3, and T4 was 76, 72, 74, and 77%, respectively. This study revealed that the use of RGB images combined with deep learning can be a promising alternative to slow laboratory analyses, capable of optimizing the estimation of foliar N. This can allow rapid intervention by the producer to achieve higher productivity and less fertilizer waste. Future approaches are encouraged to develop mobile devices capable of handling images using deep learning for the classification of the nutritional status of plants in situ.
Keywords: convolutional neural network, residual network 50, nutritional status, artificial intelligence
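Per-class metrics such as those listed in this abstract are usually derived from a confusion matrix. The sketch below shows one way to do that for four classes; the matrix values are invented for illustration and are not the paper's results, and how the authors averaged their metrics is not stated in the abstract.

```python
def per_class_metrics(cm, class_names):
    """Return overall accuracy and per-class precision, recall, SP and F1.

    `cm[i][j]` counts samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total

    report = {}
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                          # missed members of class k
        fp = sum(cm[r][k] for r in range(n)) - tp     # others labelled as k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        report[class_names[k]] = {
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
        }
    return accuracy, report


# Hypothetical confusion matrix for the four dose classes (10 images each).
toy_cm = [
    [8, 2, 0, 0],
    [1, 7, 2, 0],
    [0, 2, 7, 1],
    [0, 0, 1, 9],
]
toy_accuracy, toy_report = per_class_metrics(toy_cm, ["T1", "T2", "T3", "T4"])
```

Reporting F1 per class alongside overall accuracy makes it visible when neighbouring dose levels are confused with each other, which a single accuracy figure hides.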
Procedia PDF Downloads 16
1648 Conformance to Spatial Planning between the Kampala Physical Development Plan of 2012 and the Existing Land Use in 2021
Authors: Brendah Nagula, Omolo Fredrick Okalebo, Ronald Ssengendo, Ivan Bamweyana
Abstract:
The Kampala Physical Development Plan (KPDP) was developed in 2012 and projected both long-term and short-term developments within the City. The purpose of the plan was not only to shape the city into a spatially planned area but also to control the urban sprawl trends that had expanded with pronounced instances of informal settlements. The plan was approved by the National Physical Planning Board and signed by the Minister in 2013. Although the KPDP has been implemented using different approaches such as detailed planning, development control, subdivision planning, construction inspections, greening, and beautification, there is still limited knowledge of the level of conformance to this plan. Therefore, it is yet to be determined whether it has been effective in shaping the City into an ideal spatially planned area. To obtain a clear picture of the level of conformance to the KPDP 2012, the planned and existing land uses in Kampala City were evaluated against each other. Supervised Classification and Post Classification Change Detection were adopted to perform this evaluation. The findings revealed that Central Division registered the lowest level of conformance to the planning standards specified in the KPDP 2012, followed by Nakawa, Rubaga, Kawempe, and Makindye. Furthermore, mixed-use development was identified as the land use with the highest level of non-conformity, at 25.11%, and institutional land use registered the highest level of conformance, at 84.45%. The results show that location was not carefully considered when allocating uses in the KPDP: areas located near the Central Business District have higher land rents and hence require uses that ensure profit maximization. Also, the prominence of development towards mixed use denotes an increased demand for land for compact development that was not catered for in the plan.
Therefore, in order to transform Kampala city into a spatially planned area, there is a need to carefully develop detailed plans, especially for all the Central Division planning precincts, indicating considerations for land use densification.
Keywords: spatial plan, post classification change detection, Kampala city, landuse
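A per-class conformance percentage of the kind reported in this abstract can be sketched as a cell-by-cell comparison of a planned land-use map with a classified one. The abstract does not define its conformance measure, so the definition used here (share of planned cells whose observed class matches) is an assumption, and the class lists are invented.

```python
from collections import Counter


def conformance_by_class(planned, observed):
    """Percentage of cells per planned class whose observed class matches."""
    totals = Counter(planned)
    matches = Counter(p for p, o in zip(planned, observed) if p == o)
    return {cls: 100.0 * matches[cls] / totals[cls] for cls in totals}


def overall_conformance(planned, observed):
    """Percentage of all cells where plan and observation agree."""
    agree = sum(1 for p, o in zip(planned, observed) if p == o)
    return 100.0 * agree / len(planned)


# Hypothetical land-use classes for eight map cells.
toy_planned = ["residential", "residential", "institutional", "institutional",
               "mixed", "mixed", "mixed", "mixed"]
toy_observed = ["residential", "mixed", "institutional", "institutional",
                "mixed", "residential", "mixed", "mixed"]

toy_by_class = conformance_by_class(toy_planned, toy_observed)
toy_overall = overall_conformance(toy_planned, toy_observed)
```

Non-conformity per class is then simply 100 minus the conformance value, under the same assumed definition.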
Procedia PDF Downloads 90
1647 Teaching Light Polarization by Putting Art and Physics Together
Authors: Fabrizio Logiurato
Abstract:
Light polarization has many technological applications, and its discovery was crucial in revealing the transverse nature of electromagnetic waves. However, despite its fundamental and practical importance, this property of light is often neglected in high school. This is a pity not only because of its conceptual relevance, but also because polarization makes it possible to perform many brilliant experiments with low-cost materials. Moreover, the topic lends itself very well to an interdisciplinary approach between art, biology, and technology, which usually makes things more interesting to students. For these reasons, we have developed, and in this work introduce, a laboratory on light polarization for high school and undergraduate students. They can see beautiful pictures when birefringent materials are set between two crossed polarizing filters. Pupils are fascinated and drawn in by what they observe. The colourful images remind them of abstract paintings or alien landscapes. With this multidisciplinary teaching method, students are more engaged and participative, and the learning of the respective physics concepts is more effective.
Keywords: light polarization, optical activity, multidisciplinary education, science and art
Procedia PDF Downloads 210
1646 High-Resolution ECG Automated Analysis and Diagnosis
Authors: Ayad Dalloo, Sulaf Dalloo
Abstract:
Electrocardiogram (ECG) recordings are difficult for physicians to analyze because of noise and artifacts, which create ambiguity and can lead to diagnostic errors. Such drawbacks may be overcome with high-resolution methods, such as Discrete Wavelet Analysis and Digital Signal Processing (DSP) techniques. This ECG signal analysis is implemented in three stages: ECG preprocessing, feature extraction, and classification, with the aim of realizing high-resolution ECG diagnosis and improved detection of abnormal conditions in the heart. The preprocessing stage involves removing spurious artifacts (noise) caused by factors such as muscle contraction, motion, and respiration. ECG features are extracted by applying DSP and the suggested sloping method. These measured features represent the peak amplitude values and intervals of the P, Q, R, S, R’, and T waves on the ECG, along with other features such as ST elevation, QRS width, heart rate, electrical axis, and QR and QT intervals. The classification is performed using these extracted features and the criteria for cardiovascular diseases. The ECG diagnostic system is successfully applied to 12-lead ECG recordings for 12 cases. The system is provided with information that enables it to diagnose 15 different diseases. The physician's and the computer's diagnoses are compared, with 90% agreement with respect to the physician's diagnosis, and the time taken for diagnosis is 2 seconds. All of these operations are programmed in the Matlab environment.
Keywords: ECG diagnostic system, QRS detection, ECG baseline removal, cardiovascular diseases
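One of the features named in this abstract, heart rate, follows directly from the spacing of detected R peaks. The sketch below assumes the R-peak sample indices are already available from some detector (the paper's sloping method is not reproduced here); the sampling rate and peak positions are invented.

```python
def rr_intervals(r_peak_samples, fs_hz):
    """R-R intervals in seconds from R-peak sample indices."""
    return [(later - earlier) / fs_hz
            for earlier, later in zip(r_peak_samples, r_peak_samples[1:])]


def heart_rate_bpm(r_peak_samples, fs_hz):
    """Mean heart rate in beats per minute."""
    intervals = rr_intervals(r_peak_samples, fs_hz)
    if not intervals:
        raise ValueError("at least two R peaks are needed")
    mean_rr = sum(intervals) / len(intervals)
    return 60.0 / mean_rr


# Hypothetical R-peak positions in a recording sampled at 500 Hz.
toy_fs_hz = 500.0
toy_r_peaks = [100, 500, 900, 1300]
toy_rate = heart_rate_bpm(toy_r_peaks, toy_fs_hz)
```

Interval features such as QRS width or QT would be computed the same way, as a difference between two detected sample positions divided by the sampling rate.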
Procedia PDF Downloads 295
1645 A Decision Support System to Detect the Lumbar Disc Disease on the Basis of Clinical MRI
Authors: Yavuz Unal, Kemal Polat, H. Erdinc Kocer
Abstract:
In this study, a decision support system comprising three stages is proposed to detect disc abnormalities of the lumbar region. In the first stage, feature extraction, T2-weighted sagittal and axial Magnetic Resonance Images (MRI) were taken from 55 people, and 27 appearance and shape features were acquired from both sagittal and transverse images. In the second stage, feature weighting, the k-means clustering based feature weighting (KMCBFW) method proposed by Gunes et al. was used. Finally, in the third stage, classification, classifier algorithms including the multi-layer perceptron (MLP neural network), support vector machine (SVM), Naïve Bayes, and decision tree were used to classify whether or not the subject has lumbar disc disease. To test the performance of the proposed method, the classification accuracy (%), sensitivity, specificity, precision, recall, f-measure, kappa value, and computation times were used. The best hybrid model for detecting lumbar disc disease based on both sagittal and axial MR images is the combination of k-means clustering based feature weighting and the decision tree.
Keywords: lumbar disc abnormality, lumbar MRI, lumbar spine, hybrid models, hybrid features, k-means clustering based feature weighting
Procedia PDF Downloads 517
1644 Evaluating Machine Learning Techniques for Activity Classification in Smart Home Environments
Authors: Talal Alshammari, Nasser Alshammari, Mohamed Sedky, Chris Howard
Abstract:
With the widespread adoption of Internet-connected devices, and with the prevalence of Internet of Things (IoT) applications, there is increased interest in machine learning techniques that can provide useful and interesting services in the smart home domain. The areas that machine learning techniques can help advance are varied and ever-evolving. Classifying smart home inhabitants’ Activities of Daily Living (ADLs) is one prominent example. The ability of machine learning techniques to find meaningful spatio-temporal relations in high-dimensional data is an important requirement as well. This paper presents a comparative evaluation of state-of-the-art machine learning techniques for classifying ADLs in the smart home domain. Forty-two synthetic datasets and two real-world datasets with multiple inhabitants are used to evaluate and compare the performance of the identified machine learning techniques. Our results show significant performance differences between the evaluated techniques, such as AdaBoost, Cortical Learning Algorithm (CLA), Decision Trees, Hidden Markov Model (HMM), Multi-layer Perceptron (MLP), Structured Perceptron, and Support Vector Machines (SVM). Overall, neural network based techniques have shown superiority over the other tested techniques.
Keywords: activities of daily living, classification, internet of things, machine learning, prediction, smart home
Procedia PDF Downloads 355
1643 Fatigue Life Estimation of Tubular Joints - A Comparative Study
Authors: Jeron Maheswaran, Sudath C. Siriwardane
Abstract:
In fatigue analysis, the structural detail of tubular joints has received great attention among engineers. DNV-RP-C203 covers this topic quite well for simple and clear joint cases. For complex joints and geometries, where no joint classification is available and the validity range of the non-dimensional geometric parameters is limited, engineers face real challenges. Joint classification is important for carrying out the fatigue analysis. Joint configurations are identified by the connectivity and the load distribution of the tubular joints. To overcome these problems to some extent, this paper compares the fatigue life of tubular joints in an offshore jacket according to the stress concentration factors (SCF) in DNV-RP-C203 and those from the finite element method using Abaqus/CAE. The paper presents the geometric details, material properties, and considered load history of the jacket structure. It describes the global structural analysis and the identification of critical tubular joints for fatigue life estimation. Fatigue life is then determined based on the guidelines provided by design codes. As the next major step, fatigue analysis of the tubular joints is conducted using the finite element method in Abaqus/CAE [4]. Finally, the obtained SCFs and fatigue lives are compared and their significance is discussed.
Keywords: fatigue life, stress-concentration factor, finite element analysis, offshore jacket structure
Procedia PDF Downloads 451
1642 [Keynote Talk]: Photocatalytic Cleaning Performance of Air Filters for a Binary Mixture
Authors: Lexuan Zhong, Chang-Seo Lee, Fariborz Haghighat, Stuart Batterman, John C. Little
Abstract:
Ultraviolet photocatalytic oxidation (UV-PCO) technology has been recommended as a green approach to healthy indoor environments when it is integrated into mechanical ventilation systems, both for removing inorganic and organic compounds and for saving energy through reduced outdoor air intake. Although much research has been devoted to UV-PCO, limited information is available in the literature on UV-PCO behavior tested with mixtures. This project investigated UV-PCO performance and by-product generation using acetone and MEK, individually and as a mixture, at 100 ppb each in a single-pass duct system, in an effort to obtain knowledge about the competitive photochemical reactions involved. The experiments were performed at 20% RH, 22 °C, and a gas flow rate of 128 m³/h (75 cfm). Results show that acetone and MEK mutually reduced each other’s PCO removal efficiency, with acetone in particular showing negative removal efficiency. These findings differ from the previous observation of facilitatory effects on the adsorption of acetone and MEK on photocatalyst surfaces.
Keywords: by-products, inhibitory effect, mixture, photocatalytic oxidation
Procedia PDF Downloads 496
1641 Investigating the Causes of Human Error-Induced Incidents in the Maintenance Operations of Petrochemical Industry by Using Human Factors Analysis and Classification System
Authors: Omid Kalatpour, Mohammadreza Ajdari
Abstract:
This article studied the possible causes of human error-induced incidents in petrochemical industry maintenance activities by using the Human Factors Analysis and Classification System (HFACS). The purpose of the study was to anticipate and identify these causes and to propose corrective and preventive actions. The maintenance department of a petrochemical company was selected for the research. A checklist of human error-induced incidents was developed based on the four main HFACS levels and nineteen sub-groups. The hierarchical task analysis (HTA) technique was used to identify maintenance activities and tasks. The main causes of possible incidents were identified with the checklist and recorded. Corrective and preventive actions were defined according to priority. Analysis of the worksheets of 444 activities across the four HFACS levels showed that 37.6% of the causes were at the level of unsafe actions, 27.5% at the level of unsafe supervision, 20.9% at the level of preconditions for unsafe acts, and 14% at the level of organizational effects. Among the HFACS sub-groups, errors (24.36%), inadequate supervision (14.89%), and violations (13.26%) were the most frequent. According to the findings of this study, increasing the effectiveness of operator training and improving supervision, respectively, are the most important measures for decreasing human error-induced incidents in petrochemical industry maintenance.
Keywords: human error, petrochemical industry, maintenance, HFACS
Procedia PDF Downloads 240
1640 Musical Tesla Coil Controlled by an Audio Signal Processed in Matlab
Authors: Sandra Cuenca, Danilo Santana, Anderson Reyes
Abstract:
This project is based on the manipulation of audio signals in Matlab. An audio signal is modified, and the result, output through the computer's auxiliary port, is passed through a signal amplifier. The amplified signal is fed to a Tesla coil that behaves like a VU meter: the flashes at the output of the Tesla coil increase and decrease in intensity depending on the audio signal in the computer and on the voltage source that feeds it. The output is displayed in a plasma sphere with the corresponding flashes. This activation is governed by the parameters specified in the MATLAB algorithm, which contains the digital filters used to manipulate the audio signal sent to the Tesla coil. The flashes in the plasma sphere show a combination of colors, commonly pink and purple, that varies according to the tone of the song.
Keywords: auxiliary port, tesla coil, vumeter, plasma sphere
Procedia PDF Downloads 88
1639 SCNet: A Vehicle Color Classification Network Based on Spatial Cluster Loss and Channel Attention Mechanism
Authors: Fei Gao, Xinyang Dong, Yisu Ge, Shufang Lu, Libo Weng
Abstract:
Vehicle color recognition plays an important role in traffic accident investigation. However, due to the influence of illumination, weather, and noise, vehicle color recognition still faces challenges. In this paper, a vehicle color classification network based on spatial cluster loss and a channel attention mechanism (SCNet) is proposed for vehicle color recognition. A channel attention module is applied to extract the features of regions representative of the vehicle color and to reduce the weight of non-representative color regions in the channel. The proposed loss function, called spatial clustering loss (SC-loss), consists of two channel-specific components: a concentration component and a diversity component. The concentration component forces all feature channels belonging to the same class to be concentrated through the channel cluster. The diversity component imposes additional constraints on the channels through the mean distance coefficient, making them mutually exclusive in the spatial dimensions. In the comparison experiments, the proposed method achieves state-of-the-art performance on the public datasets VCD and VeRi, reaching 96.1% and 96.2%, respectively. In addition, the ablation experiment further shows that SC-loss can effectively improve the accuracy of vehicle color recognition.
Keywords: feature extraction, convolutional neural networks, intelligent transportation, vehicle color recognition
Procedia PDF Downloads 181
1638 DeepNIC a Method to Transform Each Tabular Variable into an Independant Image Analyzable by Basic CNNs
Authors: Nguyen J. M., Lucas G., Ruan S., Digonnet H., Antonioli D.
Abstract:
Introduction: Deep Learning (DL) is a very powerful tool for analyzing image data. For tabular data, however, it cannot compete with machine learning methods like XGBoost. The research question becomes: can tabular data be transformed into images that can be analyzed by simple CNNs (Convolutional Neural Networks)? Will DL be the absolute tool for data classification? All current solutions consist of repositioning the variables in a 2x2 matrix using their correlation proximity. In doing so, one obtains an image whose pixels are the variables. We implement a technology, DeepNIC, that offers the possibility of obtaining an image for each variable, which can be analyzed by simple CNNs. Material and method: The 'ROP' (Regression OPtimized) model is a binary and atypical decision tree whose nodes are managed by a new artificial neuron, the Neurop. By positioning an artificial neuron in each node of the decision trees, it is possible to make an adjustment on a theoretically infinite number of variables at each node. From this new decision tree whose nodes are artificial neurons, we created the concept of a 'Random Forest of Perfect Trees' (RFPT), which disobeys Breiman's concepts by assembling very large numbers of small trees with no classification errors. From the results of the RFPT, we developed a family of 10 statistical information criteria, the Nguyen Information Criteria (NICs), which evaluate in 3 dimensions the predictive quality of a variable: Performance, Complexity, and Multiplicity of solution. A NIC is a probability that can be transformed into a grey level. The value of a NIC depends essentially on 2 super parameters used in Neurops. By varying these 2 super parameters, we obtain a 2x2 matrix of probabilities for each NIC. We can combine these 10 NICs with the functions AND, OR, and XOR. The total number of combinations is greater than 100,000. In total, we obtain for each variable an image of at least 1166x1167 pixels.
The intensity of the pixels is proportional to the probability of the associated NIC. The color depends on the associated NIC. This image actually contains considerable information about the ability of the variable to make the prediction of Y, depending on the presence or absence of other variables. A basic CNN model was trained for supervised classification. Results: The first results are impressive. Using the GSE22513 public data (Omic data set of markers of Taxane Sensitivity in Breast Cancer), DeepNIC outperformed other statistical methods, including XGBoost. We still need to generalize the comparison on several databases. Conclusion: The ability to transform any tabular variable into an image offers the possibility of merging image and tabular information in the same format. This opens up great perspectives in the analysis of metadata.
Keywords: tabular data, CNNs, NICs, DeepNICs, random forest of perfect trees, classification
Procedia PDF Downloads 124
1637 A Mechanical Diagnosis Method Based on Vibration Fault Signal down-Sampling and the Improved One-Dimensional Convolutional Neural Network
Authors: Bowei Yuan, Shi Li, Liuyang Song, Huaqing Wang, Lingli Cui
Abstract:
Convolutional neural networks (CNN) have received extensive attention in the field of fault diagnosis. Many fault diagnosis methods use CNN for fault type identification. However, when the amount of raw data collected by sensors is massive, the neural network needs to perform a time-consuming classification task. In this paper, a mechanical fault diagnosis method based on vibration signal down-sampling and the improved one-dimensional convolutional neural network is proposed. Through the robust principal component analysis, the low-rank feature matrix of a large amount of raw data can be separated, and then down-sampling is realized to reduce the subsequent calculation amount. In the improved one-dimensional CNN, a smaller convolution kernel is used to reduce the number of parameters and computational complexity, and regularization is introduced before the fully connected layer to prevent overfitting. In addition, the multi-connected layers can better generalize classification results without cumbersome parameter adjustments. The effectiveness of the method is verified by monitoring the signal of the centrifugal pump test bench, and the average test accuracy is above 98%. When compared with the traditional deep belief network (DBN) and support vector machine (SVM) methods, this method has better performance.
Keywords: fault diagnosis, vibration signal down-sampling, 1D-CNN
Procedia PDF Downloads 130
1636 Equity Risk Premiums and Risk Free Rates in Modelling and Prediction of Financial Markets
Authors: Mohammad Ghavami, Reza S. Dilmaghani
Abstract:
This paper presents an adaptive framework for modelling financial markets using equity risk premiums, risk-free rates and volatilities. The recorded economic factors are initially used to train four adaptive filters over a limited period of time in the past. Once the systems are trained, the adjusted coefficients are used for modelling and prediction of an important financial market index. Two different approaches based on the least mean squares (LMS) and recursive least squares (RLS) algorithms are investigated. A performance analysis of each method in terms of the mean squared error (MSE) is presented and the results are discussed. Computer simulations carried out using recorded data show MSEs of 4% and 3.4% for next-month prediction using the LMS and RLS adaptive algorithms, respectively. For twelve-month prediction, the RLS method gives a better trend estimate than the LMS algorithm.
Keywords: adaptive methods, LSE, MSE, prediction of financial Markets
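To make the two update rules named in the abstract concrete, the following is a minimal sketch of one-step-ahead prediction with LMS and RLS adaptive filters. It is not the authors' implementation: the synthetic noisy sinusoid, the filter order, the step size `mu`, the forgetting factor `lam` and the initialisation constant `delta` are all assumptions chosen for illustration.

```python
import math
import random


def lms_errors(x, order=4, mu=0.05):
    """One-step-ahead LMS prediction; returns the squared a priori errors."""
    w = [0.0] * order
    errors = []
    for n in range(order, len(x)):
        u = x[n - order:n][::-1]  # regressor, most recent sample first
        e = x[n] - sum(wi * ui for wi, ui in zip(w, u))
        w = [wi + mu * e * ui for wi, ui in zip(w, u)]  # LMS weight update
        errors.append(e * e)
    return errors


def rls_errors(x, order=4, lam=0.99, delta=100.0):
    """One-step-ahead RLS prediction; returns the squared a priori errors."""
    w = [0.0] * order
    # P approximates the inverse input correlation matrix (kept symmetric).
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    errors = []
    for n in range(order, len(x)):
        u = x[n - order:n][::-1]
        Pu = [sum(P[i][j] * u[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(ui * pi for ui, pi in zip(u, Pu))
        k = [pi / denom for pi in Pu]  # gain vector
        e = x[n] - sum(wi * ui for wi, ui in zip(w, u))
        w = [wi + ki * e for wi, ki in zip(w, k)]
        # P <- (P - k u^T P) / lam, using u^T P = (P u)^T for symmetric P
        P = [[(P[i][j] - k[i] * Pu[j]) / lam for j in range(order)]
             for i in range(order)]
        errors.append(e * e)
    return errors


def tail_mse(errors, tail=100):
    """Mean squared error over the last `tail` predictions."""
    return sum(errors[-tail:]) / tail


# Synthetic stand-in for a recorded series (assumption, not market data).
random.seed(0)
signal = [math.sin(1.0 * n) + 0.05 * random.gauss(0.0, 1.0) for n in range(400)]
print("LMS tail MSE:", tail_mse(lms_errors(signal)))
print("RLS tail MSE:", tail_mse(rls_errors(signal)))
```

LMS needs only a dot product and a scaled vector addition per sample, whereas RLS also maintains a matrix estimate of the inverse input correlation, which costs more per step but typically converges faster.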
Procedia PDF Downloads 334
1635 Quantitative Structure–Activity Relationship Analysis of Some Benzimidazole Derivatives by Linear Multivariate Method
Authors: Strahinja Z. Kovačević, Lidija R. Jevrić, Sanja O. Podunavac Kuzmanović
Abstract:
The relationship between the antibacterial activity of eighteen different substituted benzimidazole derivatives and their molecular characteristics was studied using a chemometric QSAR (Quantitative Structure–Activity Relationships) approach. QSAR analysis was carried out on inhibitory activity towards Staphylococcus aureus, using molecular descriptors as well as the minimum inhibitory concentration (MIC). Molecular descriptors were calculated from the optimized structures. Principal component analysis (PCA) followed by hierarchical cluster analysis (HCA) and multiple linear regression (MLR) was performed in order to select the molecular descriptors that best describe the antibacterial behavior of the compounds investigated, and to determine the similarities between molecules. The HCA grouped the molecules into separate clusters with similar inhibitory activity. PCA showed a classification of molecules very similar to that of the HCA, and displayed which descriptors contribute to that classification. MLR equations that represent MIC as a function of the in silico molecular descriptors were established. The statistical significance of the estimated models was confirmed by standard statistical measures and cross-validation parameters (SD = 0.0816, F = 46.27, R = 0.9791, R²CV = 0.8266, R²adj = 0.9379, PRESS = 0.1116). These parameters indicate that the established chemometric models can be applied to predict the antibacterial behaviour of the studied derivatives and structurally very similar compounds.
Keywords: antibacterial, benzimidazole, molecular descriptors, QSAR
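The cross-validation parameters quoted above can be illustrated with a small sketch. It assumes the common leave-one-out definitions, PRESS as the sum of squared leave-one-out prediction errors and R²CV = 1 − PRESS / SS_tot; the abstract does not state which definitions were used. The descriptor values and activities below are made-up numbers, not the paper's data, and the helper names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_mlr(X, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)


def predict(coef, row):
    """Evaluate the fitted linear model on one descriptor row."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def press_and_r2cv(X, y):
    """Leave-one-out PRESS and the cross-validated R^2 derived from it."""
    press = 0.0
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return press, 1.0 - press / ss_tot


# Hypothetical descriptors (two per molecule) and activities, for illustration only.
X = [[1.2, 0.5], [2.3, 1.1], [3.1, 0.7], [4.8, 1.9],
     [5.0, 0.2], [6.7, 1.4], [7.1, 0.9], [8.4, 1.6]]
y = [2.1, 3.4, 4.0, 6.2, 5.6, 7.9, 8.0, 9.6]
press, r2cv = press_and_r2cv(X, y)
print("PRESS:", press, "R2CV:", r2cv)
```

A low PRESS relative to the total sum of squares gives an R²CV close to 1, which is the sense in which such parameters support predictive use of a model.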
Procedia PDF Downloads 362
1634 Feature Selection Approach for the Classification of Hydraulic Leakages in Hydraulic Final Inspection using Machine Learning
Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter
Abstract:
Manufacturing companies are facing global competition and enormous cost pressure. The use of machine learning applications can help reduce production costs and create added value. Predictive quality enables the securing of product quality through data-supported predictions using machine learning models as a basis for decisions on test results. Furthermore, machine learning methods are able to process large amounts of data, deal with unfavourable row-column ratios and detect dependencies between the covariates and the given target as well as assess the multidimensional influence of all input variables on the target. Real production data are often subject to highly fluctuating boundary conditions and unbalanced data sets. Changes in production data manifest themselves in trends, systematic shifts, and seasonal effects. Thus, machine learning applications require intensive pre-processing and feature selection. Data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets. Within the real data set of Bosch hydraulic valves used here, time periods with comparable production conditions can be identified by applying the concept drift method. Furthermore, a classification model is developed to evaluate the feature importance in different subsets within the identified time periods. By selecting comparable and stable features, the number of features used can be significantly reduced without a strong decrease in predictive power. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. In this research, the AdaBoost classifier is used to predict the leakage of hydraulic valves based on geometric gauge blocks from machining, mating data from the assembly, and hydraulic measurement data from end-of-line testing. 
In addition, the most suitable methods are selected and accurate quality predictions are achieved.
Keywords: classification, machine learning, predictive quality, feature selection
Procedia PDF Downloads 161
1633 Roughness Discrimination Using Bioinspired Tactile Sensors
Authors: Zhengkun Yi
Abstract:
Surface texture discrimination using artificial tactile sensors has attracted increasing attention in the past decade, as it can endow technical and robot systems with a key missing ability. However, as a major component of texture, roughness has rarely been explored. This paper presents an approach for tactile surface roughness discrimination, which includes two parts: (1) design and fabrication of a bioinspired artificial fingertip, and (2) tactile signal processing for tactile surface roughness discrimination. The bioinspired fingertip comprises two polydimethylsiloxane (PDMS) layers, a polymethyl methacrylate (PMMA) bar, and two perpendicular polyvinylidene difluoride (PVDF) film sensors. This artificial fingertip mimics human fingertips in three aspects: (1) the elastic properties of the epidermis and dermis in human skin are replicated by the two PDMS layers with different stiffness, (2) the PMMA bar plays a role analogous to that of a bone, and (3) the PVDF film sensors emulate Meissner’s corpuscles in terms of both location and response to vibratory stimuli. Various extracted features and classification algorithms, including support vector machines (SVM) and k-nearest neighbors (kNN), are examined for tactile surface roughness discrimination. Eight standard rough surfaces with roughness values (Ra) of 50 μm, 25 μm, 12.5 μm, 6.3 μm, 3.2 μm, 1.6 μm, 0.8 μm, and 0.4 μm are explored. The highest classification accuracy of (82.6 ± 10.8)% is achieved using only one PVDF film sensor with a kNN (k = 9) classifier and the standard deviation feature.
Keywords: bioinspired fingertip, classifier, feature extraction, roughness discrimination
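The best-performing combination reported above, a single standard deviation feature fed to a kNN classifier with k = 9, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's pipeline: the synthetic signals, the class labels and the premise that rougher surfaces produce larger vibration amplitudes are all made up for the example.

```python
import math
import random
from collections import Counter


def std_feature(signal):
    """Population standard deviation of one sensor recording."""
    mean = sum(signal) / len(signal)
    return math.sqrt(sum((s - mean) ** 2 for s in signal) / len(signal))


def knn_predict(train_feats, train_labels, query, k=9):
    """Majority vote among the k training samples closest to the query feature."""
    ranked = sorted(zip(train_feats, train_labels),
                    key=lambda pair: abs(pair[0] - query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def fake_recording(amplitude, length=200):
    """Synthetic stand-in for a PVDF signal (assumption: amplitude grows with roughness)."""
    return [amplitude * random.gauss(0.0, 1.0) for _ in range(length)]


random.seed(0)
train_feats, train_labels = [], []
for label, amplitude in [("Ra 0.4 um", 0.2), ("Ra 50 um", 1.0)]:
    for _ in range(15):
        train_feats.append(std_feature(fake_recording(amplitude)))
        train_labels.append(label)

query = std_feature(fake_recording(0.9))
print("predicted class:", knn_predict(train_feats, train_labels, query))
```

With a one-dimensional feature the distance reduces to an absolute difference, so the classifier is cheap enough to run per touch.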
Procedia PDF Downloads 310
1632 Preliminary Evaluation of Decommissioning Wastes for the First Commercial Nuclear Power Reactor in South Korea
Authors: Kyomin Lee, Joohee Kim, Sangho Kang
Abstract:
The first commercial nuclear power reactor in South Korea, Kori Unit 1, a 587 MWe pressurized water reactor that began operation in 1978, was permanently shut down in June 2017 without an additional operating license extension. Kori Unit 1 is scheduled to become the first nuclear power unit to enter the decommissioning phase. In this study, the preliminary evaluation of the decommissioning wastes for Kori Unit 1 was performed using the following series of steps. First, the plant inventory is investigated based on various documents (i.e., equipment/component list, construction records, general arrangement drawings). Second, the radiological conditions of systems, structures and components (SSCs) are established to estimate the amount of radioactive waste by waste classification. Third, the waste management strategies for Kori Unit 1, including waste packaging, are established. Fourth, the proper decontamination and dismantling (D&D) technologies are selected considering various factors. Finally, the amount of decommissioning waste by classification for Kori Unit 1 is estimated using the DeCAT program, which was developed by KEPCO-E&C for decommissioning cost estimation. The preliminary evaluation results have shown that the expected amounts of decommissioning wastes were less than about 2% and 8% of the total wastes generated (i.e., sum of clean wastes and radwastes) before/after waste processing, respectively, and that the majority of contaminated material was carbon or alloy steel and stainless steel. In addition, within the range of available information, the results of the evaluation were compared with data from various decommissioning experiences and with international/national decommissioning studies. The comparison has shown that the amount of radioactive waste from Kori Unit 1 decommissioning was much less than that from the plants decommissioned in the U.S. and was comparable to that from the plants in Europe. This result stems from differences in disposal cost and clearance criteria (i.e., free release level) between the U.S. and other countries. The preliminary evaluation performed using the methodology established in this study will provide useful information for decommissioning planning, including the decommissioning schedule and the waste management strategy covering transportation, packaging, handling, and disposal of radioactive wastes.
Keywords: characterization, classification, decommissioning, decontamination and dismantling, Kori 1, radioactive waste
Procedia PDF Downloads 208
1631 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features
Authors: Kyi Pyar Zaw, Zin Mar Kyu
Abstract:
Character recognition is the process of converting a text image file into an editable and searchable text file. Feature extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed method, 25 features per character are used. Character recognition has three basic steps: character segmentation, feature extraction and classification. In the segmentation step, a horizontal cropping method is used for line segmentation and a vertical cropping method for character segmentation. In the feature extraction step, features are extracted in two ways. First, 8 features are extracted from the entire input character using eight-direction chain code frequency extraction. Second, the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through the eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a single feature for that block. Therefore, 16 features are extracted from the 16 blocks. We use the number-of-holes feature to cluster similar characters. Using these features, we can recognize almost all common Myanmar characters in various font sizes. All 25 features are used in both the training and testing parts. In the classification step, characters are classified by matching all features of the input character against the already trained features of characters.
Keywords: chain code frequency, character recognition, feature extraction, features matching, segmentation
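The eight-direction chain code frequency feature can be sketched in a few lines. The sketch below assumes the Freeman convention (code 0 points east, codes increase counter-clockwise, y axis pointing up) and an already traced contour given as a list of neighbouring pixel coordinates; the abstract does not specify either, and the function names are hypothetical.

```python
# Freeman 8-direction codes keyed by the step (dx, dy) between consecutive
# contour pixels. Convention assumed here: 0 = east, counter-clockwise, y up.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(contour):
    """Convert an ordered list of 8-connected (x, y) pixels into direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        codes.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return codes


def chain_code_frequency(codes):
    """Relative frequency of each of the 8 directions (an 8-element feature vector)."""
    total = len(codes)
    return [codes.count(d) / total for d in range(8)]


# Closed contour of a unit square, traced counter-clockwise from the origin.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
codes = chain_code(square)
print(codes)                        # [0, 2, 4, 6]
print(chain_code_frequency(codes))  # [0.25, 0.0, 0.25, 0.0, 0.25, 0.0, 0.25, 0.0]
```

Because the histogram ignores the order of the steps, the feature vector does not depend on where the contour tracing starts.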
Procedia PDF Downloads 318
1630 Partial Least Square Regression for High-Dimentional and High-Correlated Data
Authors: Mohammed Abdullah Alshahrani
Abstract:
The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. 
Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.
Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data
Procedia PDF Downloads 49
1629 Classification of Multiple Cancer Types with Deep Convolutional Neural Network
Authors: Nan Deng, Zhenqiu Liu
Abstract:
Each year, thousands of patients with metastatic tumors are diagnosed with cancers of unknown primary site. The inability to identify the primary cancer site may lead to inappropriate treatment and unexpected prognosis. A large amount of genomics and transcriptomics cancer data has now been generated by next-generation sequencing (NGS) technologies, and The Cancer Genome Atlas (TCGA) database has accrued thousands of human cancer tumors and healthy controls, which provides an abundant resource for differentiating cancer types. Meanwhile, deep convolutional neural networks (CNNs) have shown high accuracy in classification among a large number of image object categories. Here, we use 25 cancer primary tumors and 3 normal tissues from TCGA, convert their RNA-Seq gene expression profiles to color images, and train, validate and test a CNN classifier directly on these images. The results show that our CNN classifier can achieve >80% test accuracy on most of the tumors and normal tissues. Since the gene expression pattern of distant metastases is similar to that of their primary tumors, the CNN classifier may provide a potential computational strategy for identifying the unknown primary origin of metastatic cancer in order to plan appropriate treatment for patients.
Keywords: bioinformatics, cancer, convolutional neural network, deep learning, gene expression pattern
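The abstract does not say how expression profiles are mapped to pixels, so the following is only one plausible sketch of the general idea: min-max scale a profile to 8-bit intensities and lay it out row by row in the smallest square that fits. The grayscale output, the zero padding and the function name `expression_to_image` are assumptions; the paper itself reports color images.

```python
import math


def expression_to_image(values):
    """Map a 1-D expression profile to a square grid of 0-255 intensities.

    Assumptions for illustration: min-max scaling, row-major layout, and
    zero padding of the unused cells in the last row(s).
    """
    side = math.ceil(math.sqrt(len(values)))
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant profile
    pixels = [round(255 * (v - lo) / span) for v in values]
    pixels += [0] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Toy profile with three "genes"; real profiles have thousands of entries.
for row in expression_to_image([0, 2, 10]):
    print(row)
```

An image built this way can be passed to any standard 2-D CNN input layer, which is what makes the conversion attractive despite the arbitrary pixel ordering.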
Procedia PDF Downloads 299
1628 Semantic Differences between Bug Labeling of Different Repositories via Machine Learning
Authors: Pooja Khanal, Huaming Zhang
Abstract:
Labeling of issues/bugs, also known as bug classification, plays a vital role in software engineering. Some known labels/classes of bugs are 'User Interface', 'Security', and 'API'. Most of the time, when a reporter reports a bug, they try to assign some predefined label to it. Those issues are reported for a project, and each project is a repository in GitHub/GitLab, which contains multiple issues. There are many software project repositories, ranging from individual projects to commercial projects. The labels assigned in different repositories may depend on various factors such as human instinct, generalization of labels, and the label assignment policy followed by the reporter. While the reporter of an issue may instinctively give that issue one label, another person reporting the same issue may label it differently. As a result, there is no mathematical basis for knowing whether a label in one repository is similar to or different from a label in another repository. Hence, the primary goal of this research is to find the semantic differences between bug labeling of different repositories via machine learning. Independent optimal classifiers for individual repositories are built first using the text features from the reported issues. The optimal classifiers may include a combination of multiple classifiers stacked together. Then, those classifiers are used to cross-test other repositories, which allows the result to be deduced mathematically. The output of this ongoing research includes a formalized open-source GitHub issues database that is used to deduce the similarity of the labels pertaining to the different repositories.
Keywords: bug classification, bug labels, GitHub issues, semantic differences
Procedia PDF Downloads 198
1627 Object-Scene: Deep Convolutional Representation for Scene Classification
Authors: Yanjun Chen, Chuanping Hu, Jie Shao, Lin Mei, Chongyang Zhang
Abstract:
Traditional image classification is based on an encoding scheme (e.g. Fisher Vector, Vector of Locally Aggregated Descriptors) with low-level image features (e.g. SIFT, HoG). Compared to these low-level local features, deep convolutional features obtained at the mid-level layers of convolutional neural networks (CNNs) carry richer information but lack geometric invariance. In scene classification, there are scattered objects that differ in size, category, layout, number and so on. It is crucial to find the distinctive objects in a scene as well as their co-occurrence relationships. In this paper, we propose a method that takes advantage of both deep convolutional features and the traditional encoding scheme while taking object-centric and scene-centric information into consideration. First, to exploit the object-centric and scene-centric information, two CNNs trained on the ImageNet and Places datasets, respectively, are used as pre-trained models to extract deep convolutional features at multiple scales. This produces dense local activations. By analyzing the performance of the different CNNs at multiple scales, it is found that each CNN works better in a different scale range. A scale-wise CNN adaptation is reasonable since each object in a scene has its own specific scale. Second, a Fisher kernel is applied to aggregate a global representation at each scale, and these representations are then merged into a single vector using a post-processing method called scale-wise normalization. The essence of the Fisher Vector lies in the accumulation of the first- and second-order differences. Hence, scale-wise normalization followed by average pooling balances the influence of each scale, since different numbers of features are extracted at each scale. Third, the Fisher Vector representation based on the deep convolutional features is followed by a linear Support Vector Machine, which is a simple yet efficient way to classify the scene categories. 
Experimental results show that the scale-specific feature extraction and normalization with CNNs trained on object-centric and scene-centric datasets can boost the results from 74.03% up to 79.43% on MIT Indoor67 when only two scales are used (compared to results at a single scale). The result is comparable to state-of-the-art performance, which indicates that the representation can be applied to other visual recognition tasks.
Keywords: deep convolutional features, Fisher Vector, multiple scales, scale-specific normalization
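The merging step described in this abstract, normalizing the representation at each scale and then average pooling across scales, can be sketched as below. The abstract does not name the norm, so L2 normalization is an assumption here, as are the function names; the toy vectors stand in for per-scale Fisher Vectors, which would be far longer in practice.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def merge_scales(per_scale_vectors):
    """Normalize each scale's vector, then average element-wise across scales."""
    normalized = [l2_normalize(v) for v in per_scale_vectors]
    count = len(normalized)
    return [sum(column) / count for column in zip(*normalized)]


# Two toy "Fisher Vectors"; the second has a much larger magnitude because
# (by assumption) more local features were accumulated at that scale.
small_scale = [3.0, 4.0]
large_scale = [0.0, 10.0]
print(merge_scales([small_scale, large_scale]))  # approximately [0.3, 0.9]
```

Without the per-scale normalization, the scale that contributes more local features would dominate the pooled vector purely through its magnitude.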
Procedia PDF Downloads 331
1626 Online Prediction of Nonlinear Signal Processing Problems Based Kernel Adaptive Filtering
Authors: Hamza Nejib, Okba Taouali
Abstract:
This paper presents two of the best-known kernel adaptive filtering (KAF) approaches, kernel least mean squares (KLMS) and kernel recursive least squares (KRLS), for predicting the output of nonlinear signal processing systems. Both methods implement a nonlinear transfer function using kernel methods in a reproducing kernel Hilbert space (RKHS), where the model is a linear combination of kernel functions that map the observed data from the input space to a high-dimensional feature space; this idea is known as the kernel trick. KAF thus amounts to developing filters in an RKHS. We use two nonlinear signal processing problems, Mackey-Glass chaotic time series prediction and nonlinear channel equalization, to assess the performance of the approaches presented and to determine which of them is better suited.
Keywords: online prediction, KAF, signal processing, RKHS, Kernel methods, KRLS, KLMS
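As an illustration of the "linear combination of kernel functions" mentioned above, here is a minimal KLMS sketch with a Gaussian kernel. It is not the authors' code: the step size, the kernel width `sigma`, the class name and the toy target d = sin(3u) are assumptions, and the toy problem stands in for the Mackey-Glass and channel equalization benchmarks named in the abstract.

```python
import math
import random


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two input vectors."""
    dist2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-dist2 / (2.0 * sigma ** 2))


class KLMS:
    """Kernel least mean squares: one new kernel centre per training sample."""

    def __init__(self, step=0.5, sigma=1.0):
        self.step = step
        self.sigma = sigma
        self.centers = []  # stored inputs
        self.coeffs = []   # step * a priori error for each stored input

    def predict(self, u):
        return sum(a * gaussian_kernel(c, u, self.sigma)
                   for a, c in zip(self.coeffs, self.centers))

    def update(self, u, d):
        """Predict, then add u as a new centre weighted by the prediction error."""
        error = d - self.predict(u)
        self.centers.append(list(u))
        self.coeffs.append(self.step * error)
        return error


# Toy online regression problem (assumption): learn d = sin(3u) sample by sample.
random.seed(1)
model = KLMS(step=0.5, sigma=0.5)
squared_errors = []
for _ in range(200):
    u = [random.uniform(-1.0, 1.0)]
    squared_errors.append(model.update(u, math.sin(3.0 * u[0])) ** 2)

print("mean squared error, first 50 samples:", sum(squared_errors[:50]) / 50)
print("mean squared error, last 50 samples: ", sum(squared_errors[-50:]) / 50)
```

The model grows by one centre per sample, which is why practical KAF implementations usually add a sparsification rule; none is included in this sketch.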
Procedia PDF Downloads 397