Search results for: oversampling
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 10

10 Hard Disk Failure Predictions in Supercomputing System Based on CNN-LSTM and Oversampling Technique

Authors: Yingkun Huang, Li Guo, Zekang Lan, Kai Tian

Abstract:

Hard disk drive (HDD) failure in an exascale supercomputing system may lead to service interruption, invalidate previous calculations, and cause permanent data loss. Therefore, initiating corrective action before hard drive failures materialize is critical to the continued operation of jobs. In this paper, a highly accurate analysis model based on CNN-LSTM and an oversampling technique is proposed, which can correctly predict the need for a disk replacement even ten days in advance. Learning-based methods generally perform poorly on training datasets with a long-tail distribution, and fault prediction is a classic case because failure data are scarce. To overcome this problem, a new oversampling method is employed to augment the data, and an improved CNN-LSTM with a shortcut connection is built to learn more effective features. The shortcut carries the output of the previous CNN layer, which is fused by weighting with the output of the next layer and then used as the input of the LSTM model. Finally, a detailed empirical comparison of 6 prediction methods on a public dataset is presented and discussed. The experiments indicate that the proposed method predicts disk failure with 0.91 Precision, 0.91 Recall, 0.91 F-measure, and 0.90 MCC for a 10-day prediction horizon. Thus, the proposed algorithm is an efficient algorithm for predicting HDD failure in supercomputing.
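The four scores quoted in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "mcc": mcc}


# Hypothetical counts for an imbalanced test set (failures are the rare class).
example = binary_metrics(tp=90, fp=10, fn=10, tn=890)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often reported for imbalanced problems such as failure prediction.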

Keywords: HDD replacement, failure, CNN-LSTM, oversampling, prediction

Procedia PDF Downloads 59
9 Application of Model Tree in the Prediction of TBM Rate of Penetration with Synthetic Minority Oversampling Technique

Authors: Ehsan Mehryaar

Abstract:

The rate of penetration (RoP) is one of the vital factors in the cost and time of tunnel boring projects; therefore, predicting it can lead to a substantial increase in the efficiency of the project. RoP depends heavily on the geological properties of the project site and on the properties of the TBM. In this study, a dataset of 151 points is collected from the Queen’s water tunnel, which includes unconfined compression strength, peak slope index, angle with weak planes, and distance between planes of weakness. Because the dataset is small, it was observed to be imbalanced. To solve that problem, the synthetic minority oversampling technique is utilized. A model based on the model tree is proposed, in which each leaf consists of a support vector machine model. The performance of the proposed model is then compared to existing empirical equations in the literature.
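For readers unfamiliar with the synthetic minority oversampling technique, its core step can be sketched as follows: pick a minority sample, pick one of its k nearest minority neighbours, and create a new point somewhere on the line segment between them. This is a simplified illustration of the general technique, not the author's pipeline; the function name `smote`, its parameters, and the toy data are assumptions.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy two-feature minority class (hypothetical values).
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority_points, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real samples, the new data stay inside the region already occupied by the minority class.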

Keywords: Model tree, SMOTE, rate of penetration, TBM (tunnel boring machine), SVM

Procedia PDF Downloads 157
8 An Adaptive Oversampling Technique for Imbalanced Datasets

Authors: Shaukat Ali Shahee, Usha Ananthakumar

Abstract:

A data set exhibits the class imbalance problem when one class has very few examples compared to the other class; this is also referred to as between-class imbalance. Traditional classifiers fail to classify minority class examples correctly because of their bias towards the majority class. Apart from between-class imbalance, imbalance within classes, where classes are composed of different numbers of sub-clusters that in turn contain different numbers of examples, also degrades the performance of the classifier. Many methods have previously been proposed for handling the imbalanced dataset problem. These methods can be classified into four categories: data preprocessing, algorithm-based methods, cost-based methods, and ensembles of classifiers. Data preprocessing techniques have shown great potential as they attempt to improve the data distribution rather than the classifier. A data preprocessing technique handles class imbalance either by increasing the minority class examples or by decreasing the majority class examples. Decreasing the majority class examples leads to loss of information, and when the minority class is rare in absolute terms, removing majority class examples is generally not recommended. Existing methods for handling class imbalance do not address between-class imbalance and within-class imbalance simultaneously. In this paper, we propose a method that handles between-class imbalance and within-class imbalance simultaneously for binary classification problems. Removing both kinds of imbalance simultaneously eliminates the bias of the classifier towards bigger sub-clusters by minimizing the domination of bigger sub-clusters in the total error. The proposed method uses model-based clustering to find the presence of sub-clusters or sub-concepts in the dataset. The number of examples oversampled among the sub-clusters is determined based on the complexity of the sub-clusters.
The method also takes into consideration the scatter of the data in the feature space and adaptively copes with unseen test data using the Lowner-John ellipsoid to increase the accuracy of the classifier. In this study, a neural network is used because it is a classifier in which the total error is minimized, and removing between-class imbalance and within-class imbalance simultaneously helps the classifier give equal weight to all sub-clusters irrespective of class. The proposed method is validated on 9 publicly available data sets and compared with three existing oversampling techniques that rely on the spatial location of minority class examples in the Euclidean feature space. The experimental results show the proposed method to be statistically significantly superior to the other methods in terms of various accuracy measures. Thus, the proposed method can serve as a good alternative for handling various problem domains, such as credit scoring, customer churn prediction, and financial distress, that typically involve imbalanced data sets.

Keywords: classification, imbalanced dataset, Lowner-John ellipsoid, model based clustering, oversampling

Procedia PDF Downloads 394
7 Sigma-Delta ADCs Converter a Study Case

Authors: Thiago Brito Bezerra, Mauro Lopes de Freitas, Waldir Sabino da Silva Júnior

Abstract:

Sigma-Delta A/D converters have been proposed as a practical approach to A/D conversion at high rates because of their simplicity and robustness to circuit imperfections, and because traditional converters are more difficult to implement in VLSI technology. Conventional conversion methods need precise analog components in their filters and conversion circuits and are more vulnerable to noise and interference. This paper analyzes the architecture, operation, and application of Sigma-Delta analog-to-digital (A/D) converters, which overcome these difficulties, and shows some simulations using the Simulink and Multisim software.
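As a rough illustration of the principle behind such converters, a first-order, 1-bit sigma-delta modulator can be simulated in a few lines: an integrator accumulates the difference between the input and the fed-back output bit, and a comparator quantizes the integrator state. This is a behavioural sketch under idealized assumptions (ideal integrator and comparator, input within ±1), not a reproduction of the paper's Simulink or Multisim models; the function name is hypothetical.

```python
def sigma_delta_first_order(samples):
    """Idealized first-order 1-bit sigma-delta modulator.

    Returns a stream of +1/-1 values whose local average tracks the input.
    """
    integrator = 0.0
    feedback = 0.0  # previous output bit, fed back through an ideal 1-bit DAC
    bits = []
    for x in samples:
        # Accumulate the error between the input and the previous output.
        integrator += x - feedback
        # 1-bit quantizer (comparator).
        bit = 1 if integrator >= 0 else -1
        bits.append(bit)
        feedback = float(bit)
    return bits


# A constant input of 0.5, heavily oversampled (1000 modulator samples).
bitstream = sigma_delta_first_order([0.5] * 1000)
average = sum(bitstream) / len(bitstream)
```

Averaging the bitstream is the simplest possible decimation filter; the mean of the ±1 stream converges to the input level because the integrator keeps the accumulated error bounded.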

Keywords: analysis, oversampling modulator, A/D converters, sigma-delta

Procedia PDF Downloads 305
6 Design of Decimation Filter Using Cascade Structure for Sigma Delta ADC

Authors: Misbahuddin Mahammad, P. Chandra Sekhar, Metuku Shyamsunder

Abstract:

The oversampled output of a sigma-delta modulator is decimated to the Nyquist sampling rate by decimation filters. The decimation filters serve two purposes: they reduce the sampling rate by a factor of the OSR (oversampling ratio), and they remove the out-of-band quantization noise, resulting in an increase in resolution. The speed, area, and power consumption of oversampled sigma-delta A/D converters are governed largely by the decimation filters. The scope of this work is the design of a decimation filter for a sigma-delta ADC and its simulation using MATLAB. The decimation filter structure is based on the cascaded integrator-comb (CIC) filter. A second decimation filter uses a CIC stage for large rate changes and cascaded FIR filters for small rate changes, to improve the frequency response. The proposed structure is even more hardware efficient.
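The CIC structure mentioned above can be sketched behaviourally as a chain of integrators running at the input rate, a downsampler, and a chain of combs running at the output rate. The sketch below assumes a differential delay of 1 and integer input samples; the function name and the default number of stages are illustrative choices, not the paper's design parameters.

```python
def cic_decimate(samples, rate, stages=3):
    """Behavioural CIC decimator: `stages` integrators at the input rate,
    downsampling by `rate`, then `stages` combs (differential delay 1)."""
    integrators = [0] * stages
    comb_delays = [0] * stages
    output = []
    for n, x in enumerate(samples):
        # Integrator section, updated on every input sample.
        acc = x
        for i in range(stages):
            integrators[i] += acc
            acc = integrators[i]
        # Keep one sample out of every `rate`, then run the comb section.
        if (n + 1) % rate == 0:
            for i in range(stages):
                previous = comb_delays[i]
                comb_delays[i] = acc
                acc -= previous
            output.append(acc)
    return output


# A constant input of 1 settles to the DC gain rate**stages (8**3 = 512).
decimated = cic_decimate([1] * 64, rate=8, stages=3)
```

The filter needs no multipliers, which is why it suits large rate changes in hardware; the output is normally rescaled by `rate**stages` to undo the DC gain, and a following FIR stage can compensate the passband droop.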

Keywords: sigma delta modulator, CIC filter, decimation filter, compensation filter, noise shaping

Procedia PDF Downloads 441
5 Performance Analysis of PAPR Reduction in OFDM Systems based on Partial Transmit Sequence (PTS) Technique

Authors: Alcardo Alex Barakabitze, Tan Xiaoheng

Abstract:

Orthogonal Frequency Division Multiplexing (OFDM) is a special case of the Multi-Carrier Modulation (MCM) technique, which transmits a stream of data over a number of lower-data-rate subcarriers. OFDM splits the total transmission bandwidth into a number of orthogonal and non-overlapping subcarriers and transmits collections of bits, called symbols, in parallel using these subcarriers. This paper explores Peak-to-Average Power Ratio (PAPR) reduction using the Partial Transmit Sequence (PTS) technique. We provide the distribution analysis and the basics of OFDM signals and then show how the PAPR increases as the number of subcarriers increases. We provide a performance analysis of the CCDF and of the PAPR, expressed in decibels, through MATLAB simulations. The simulation results show that, with the PTS technique, the performance of PAPR reduction in OFDM systems improves significantly as the number of sub-blocks increases. However, with the same number of sub-blocks, oversampling factor, and number of OFDM block iterations used for generating the CCDF, OFDM systems with 128 subcarriers show better PAPR reduction performance than OFDM systems with 256, 512, or more than 512 subcarriers.
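To make the PAPR quantity concrete, the sketch below computes the PAPR in decibels of a single OFDM symbol, using zero-padding in the frequency domain to oversample the time-domain signal. It covers only the PAPR measurement, not the PTS reduction scheme or the CCDF estimation studied in the paper; the function name, the naive inverse DFT, and the example symbols are assumptions chosen to keep the example self-contained.

```python
import cmath
import math


def ofdm_papr_db(symbols, oversampling=4):
    """PAPR in dB of one OFDM symbol, oversampled by zero-padding the
    middle of the spectrum before a (naive) inverse DFT."""
    n = len(symbols)
    size = n * oversampling
    half = n // 2
    padded = symbols[:half] + [0j] * (size - n) + symbols[half:]
    # Naive O(size^2) inverse DFT; adequate for small illustrative sizes.
    time_signal = [
        sum(padded[k] * cmath.exp(2j * math.pi * k * t / size)
            for k in range(size)) / size
        for t in range(size)
    ]
    powers = [abs(v) ** 2 for v in time_signal]
    return 10 * math.log10(max(powers) / (sum(powers) / size))


# One active subcarrier gives a constant envelope, i.e. about 0 dB.
papr_single = ofdm_papr_db([1 + 0j] + [0j] * 7)
# Eight identical subcarriers add coherently: PAPR = 10*log10(8) dB.
papr_worst = ofdm_papr_db([1 + 0j] * 8)
```

The second example shows why PAPR grows with the number of subcarriers: when N subcarriers align in phase, the peak power scales with N squared while the mean power scales with N.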

Keywords: OFDM, peak to average power reduction (PAPR), bit error rate (BER), subcarriers, wireless communications

Procedia PDF Downloads 492
4 Using Machine Learning to Classify Human Fetal Health and Analyze Feature Importance

Authors: Yash Bingi, Yiqiao Yin

Abstract:

Reduction of child mortality is an ongoing struggle and a commonly used factor in determining progress in the medical field. Under-5 mortality stands at around 5 million deaths worldwide, many of them preventable. In light of this issue, Cardiotocograms (CTGs) have emerged as a leading tool to determine fetal health. By using ultrasound pulses and reading the responses, CTGs help healthcare professionals assess the overall health of the fetus to determine the risk of child mortality. However, interpreting the results of CTGs is time-consuming and inefficient, especially in underdeveloped areas where an expert obstetrician is hard to come by. Using a support vector machine (SVM) and oversampling, this paper proposes a model that classifies fetal health with an accuracy of 99.59%. To further explain the CTG measurements, an algorithm based on Randomized Input Sampling for Explanation (RISE) of Black-box Models was created, called Feature Alteration for explanation of Black Box Models (FAB), and its findings were compared to those of Shapley Additive Explanations (SHAP) and Local Interpretable Model Agnostic Explanations (LIME). This allows doctors and medical professionals to classify fetal health with high accuracy and determine which features were most influential in the process.

Keywords: machine learning, fetal health, gradient boosting, support vector machine, Shapley values, local interpretable model agnostic explanations

Procedia PDF Downloads 123
3 Defect Classification of Hydrogen Fuel Pressure Vessels using Deep Learning

Authors: Dongju Kim, Youngjoo Suh, Hyojin Kim, Gyeongyeong Kim

Abstract:

Acoustic Emission Testing (AET) is widely used to test the structural integrity of an operational hydrogen storage container, and clustering algorithms are frequently used in pattern recognition methods to interpret AET results. However, the interpretation of AET results can vary from user to user as the tuning of the relevant parameters relies on the user's experience and knowledge of AET. Therefore, it is necessary to use a deep learning model to identify patterns in acoustic emission (AE) signal data that can be used to classify defects instead. In this paper, a deep learning-based model for classifying the types of defects in hydrogen storage tanks, using AE sensor waveforms, is proposed. As hydrogen storage tanks are commonly constructed using carbon fiber reinforced polymer composite (CFRP), a defect classification dataset is collected through a tensile test on a specimen of CFRP with an AE sensor attached. The performance of the classification model, using one-dimensional convolutional neural network (1-D CNN) and synthetic minority oversampling technique (SMOTE) data augmentation, achieved 91.09% accuracy for each defect. It is expected that the deep learning classification model in this paper, used with AET, will help in evaluating the operational safety of hydrogen storage containers.

Keywords: acoustic emission testing, carbon fiber reinforced polymer composite, one-dimensional convolutional neural network, SMOTE data augmentation

Procedia PDF Downloads 69
2 Critically Sampled Hybrid Trigonometry Generalized Discrete Fourier Transform for Multistandard Receiver Platform

Authors: Temidayo Otunniyi

Abstract:

This paper presents a low-computation channelization algorithm for multi-standard platforms using a polyphase implementation of a critically sampled hybrid trigonometry generalized Discrete Fourier Transform (HGDFT). The HGDFT channelization algorithm exploits the orthogonality of two trigonometric Fourier functions, together with the properties of the Quadrature Mirror Filter Bank (QMFB) and the Exponential Modulated Filter Bank (EMFB), respectively. HGDFT shows improvement in its implementation in terms of high reconfigurability, lower filter length, parallelism, and moderate computational activity. Type I and Type III polyphase structures are derived for real-valued HGDFT modulation. The design specifications cover both critically decimated and oversampled operation for single-standard and multi-standard receiver platforms. For oversampled single-standard receiver channels, the HGDFT algorithm achieved a 40% complexity reduction, compared to 34% and 38% for the Discrete Fourier Transform (DFT) and tree quadrature mirror filter (TQMF) algorithms, respectively. In oversampled multi-standard mode, the parallel generalized discrete Fourier transform (PGDFT) and recombined generalized discrete Fourier transform (RGDFT) had a 41% complexity reduction, and HGDFT had a 46% reduction. In the critically sampled multi-standard receiver channels, HGDFT had a complexity reduction of 70%, while both PGDFT and RGDFT had a 34% reduction.

Keywords: software defined radio, channelization, critical sample rate, over-sample rate

Procedia PDF Downloads 103
1 Knowledge of Sexually Transmitted Infections and Socio-Demographic Factors Affecting High Risk Sex among Unmarried Youths in Nigeria

Authors: Obasanjo Afolabi Bolarinwa

Abstract:

This study assesses the level of knowledge of sexually transmitted infections among unmarried youths in Nigeria; examines the pattern of high-risk sex among unmarried youths in Nigeria; investigates the socio-demographic factors (age, place of residence, religion, level of education, wealth index and employment status) affecting the practice of high-risk sexual behaviour; and ascertains the relationship between knowledge of sexually transmitted infections and the practice of high-risk sex. The goal of the study is to identify the factors associated with the practice of high-risk sex among youths, with a view to identifying critical actions needed to reduce high-risk sexual behaviour in this group. The study employed secondary data, extracted from the 2013 Nigeria Demographic and Health Survey (NDHS). The 2013 NDHS collected information from 38,948 women aged 15-49 years and 17,359 men aged 15-49 years. A total of 7,744 female and 6,027 male respondents were used in the study. To adjust for the effect of oversampling of the population, the weighting factor provided by Measure DHS was applied. The data were analysed using frequency distribution and logistic regression. The results show that both male (92.2%) and female (93.6%) respondents have accurate knowledge of sexually transmitted infections. The study also revealed that the prevalence of high-risk sexual behaviour is high among Nigerian youths: 77.7% of females and 78.4% of males engage in high-risk sexual behaviour.
The bivariate analysis shows that age of respondent (χ²=294.2; p < 0.05), religion (χ²=136.64; p < 0.05), wealth index (χ²=17.38; p < 0.05), level of education (χ²=34.73; p < 0.05) and employment status (χ²=94.54; p < 0.05) were individual factors significantly associated with high-risk sexual behaviour among males, while age of respondent (χ²=327.07; p < 0.05), place of residence (χ²=6.71; p < 0.05), religion (χ²=81.04; p < 0.05), wealth index (χ²=7.41; p < 0.05), level of education (χ²=18.12; p < 0.05) and employment status (χ²=51.02; p < 0.05) were individual factors significantly associated with high-risk sexual behaviour among females. Furthermore, the study shows that there is a relationship between knowledge of sexually transmitted infections and high-risk sex among males (χ²=38.32; p < 0.05) and females (χ²=18.37; p < 0.05). At the multivariate level, the study revealed that individual characteristics such as age, religion, place of residence, wealth index, level of education and employment status were statistically significantly related to high-risk sexual behaviour among males and females (p < 0.05). Lastly, the study shows that knowledge of sexually transmitted infections was significantly related to high-risk sexual behaviour among youths (p < 0.05). The study concludes that there is a high level of knowledge of sexually transmitted infections among unmarried youths in Nigeria. The practice of high-risk sex is high among unmarried youths but higher among male youths. The prevalence of high-risk sexual activity is higher for males when they are at a disadvantage and higher for females when they are at an advantage. Socio-demographic factors such as age of respondent, religion, wealth index, place of residence, employment status and highest level of education influence high-risk sexual behaviour among youths.
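The weighting step described in this abstract, which corrects for groups that were oversampled in the survey design, amounts to replacing simple counts with sums of sampling weights. The sketch below illustrates the idea with a weighted proportion; the function name, the responses, and the weights are hypothetical and are not NDHS data.

```python
def weighted_proportion(responses, weights):
    """Share of positive responses, with each respondent counted by weight."""
    total = sum(weights)
    positive = sum(w for r, w in zip(responses, weights) if r)
    return positive / total


# Hypothetical data: the first two respondents come from an oversampled
# group, so each carries a smaller weight than the last two.
responses = [True, True, False, False]
weights = [0.5, 0.5, 1.5, 1.5]
unweighted = sum(responses) / len(responses)        # 0.5
weighted = weighted_proportion(responses, weights)  # 0.25
```

Without the weights, the oversampled group would count for half of the estimate; with them, it contributes only in proportion to its share of the population.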

Keywords: high risk sex, wealth index, sexual behaviour, knowledge

Procedia PDF Downloads 238