Search results for: Gaussian process for regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 17794

Search results for: Gaussian process for regression

17734 Additive White Gaussian Noise Filtering from ECG by Wiener Filter and Median Filter: A Comparative Study

Authors: Hossein Javidnia, Salehe Taheri

Abstract:

The Electrocardiogram (ECG) is the recording of the heart’s electrical potential versus time. ECG signals are often contaminated with noise such as baseline wander and muscle noise. As these signals have been widely used in clinical studies to detect heart diseases, it is essential to filter these noises. In this paper we compare performance of Wiener Filtering and Median Filtering methods to filter Additive White Gaussian (AWG) noise with the determined signal to noise ratio (SNR) ranging from 3 to 5 dB applied to long-term ECG recordings samples. Root mean square error (RMSE) and coefficient of determination (R2) between the filtered ECG and original ECG was used as the filter performance indicator. Experimental results show that Wiener filter has better noise filtering performance than Median filter.

Keywords: ECG noise filtering, Wiener filtering, median filtering, Gaussian noise, filtering performance

Procedia PDF Downloads 497
17733 Identifying Degradation Patterns of LI-Ion Batteries from Impedance Spectroscopy Using Machine Learning

Authors: Yunwei Zhang, Qiaochu Tang, Yao Zhang, Jiabin Wang, Ulrich Stimming, Alpha Lee

Abstract:

Forecasting the state of health and remaining useful life of Li-ion batteries is an unsolved challenge that limits technologies such as consumer electronics and electric vehicles. Here we build an accurate battery forecasting system by combining electrochemical impedance spectroscopy (EIS) -- a real-time, non-invasive and information-rich measurement that is hitherto underused in battery diagnosis -- with Gaussian process machine learning. We collect over 20,000 EIS spectra of commercial Li-ion batteries at different states of health, states of charge and temperatures -- the largest dataset to our knowledge of its kind. Our Gaussian process model takes the entire spectrum as input, without further feature engineering, and automatically determines which spectral features predict degradation. Our model accurately predicts the remaining useful life, even without complete knowledge of past operating conditions of the battery. Our results demonstrate the value of EIS signals in battery management systems.

Keywords: battery degradation, machine learning method, electrochemical impedance spectroscopy, battery diagnosis

Procedia PDF Downloads 111
17732 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously; there is a large number of kinetic parameters involved and sometimes the chemical and physical phenomena for mixtures involving polymers are poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of machine learning field. They have the ability to learn the trends of a process without any knowledge about its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which can select more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.

Keywords: batch bulk methyl methacrylate polymerization, adaptive sampling, machine learning, large margin nearest neighbor regression

Procedia PDF Downloads 270
17731 Comparative Study od Three Artificial Intelligence Techniques for Rain Domain in Precipitation Forecast

Authors: Nabilah Filzah Mohd Radzuan, Andi Putra, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Precipitation forecast is important to avoid natural disaster incident which can cause losses in the involved area. This paper reviews three techniques logistic regression, decision tree, and random forest which are used in making precipitation forecast. These combination techniques through the vector auto-regression (VAR) model help in finding the advantages and strengths of each technique in the forecast process. The data-set contains variables of the rain’s domain. Adaptation of artificial intelligence techniques involved in rain domain enables the forecast process to be easier and systematic for precipitation forecast.

Keywords: logistic regression, decisions tree, random forest, VAR model

Procedia PDF Downloads 417
17730 An Automatic Speech Recognition Tool for the Filipino Language Using the HTK System

Authors: John Lorenzo Bautista, Yoon-Joong Kim

Abstract:

This paper presents the development of a Filipino speech recognition tool using the HTK System. The system was trained from a subset of the Filipino Speech Corpus developed by the DSP Laboratory of the University of the Philippines-Diliman. The speech corpus was both used in training and testing the system by estimating the parameters for phonetic HMM-based (Hidden-Markov Model) acoustic models. Experiments on different mixture-weights were incorporated in the study. The phoneme-level word-based recognition of a 5-state HMM resulted in an average accuracy rate of 80.13 for a single-Gaussian mixture model, 81.13 after implementing a phoneme-alignment, and 87.19 for the increased Gaussian-mixture weight model. The highest accuracy rate of 88.70% was obtained from a 5-state model with 6 Gaussian mixtures.

Keywords: Filipino language, Hidden Markov Model, HTK system, speech recognition

Procedia PDF Downloads 442
17729 Relative Entropy Used to Determine the Divergence of Cells in Single Cell RNA Sequence Data Analysis

Authors: An Chengrui, Yin Zi, Wu Bingbing, Ma Yuanzhu, Jin Kaixiu, Chen Xiao, Ouyang Hongwei

Abstract:

Single cell RNA sequence (scRNA-seq) is one of the effective tools to study transcriptomics of biological processes. Recently, similarity measurement of cells is Euclidian distance or its derivatives. However, the process of scRNA-seq is a multi-variate Bernoulli event model, thus we hypothesize that it would be more efficient when the divergence between cells is valued with relative entropy than Euclidian distance. In this study, we compared the performances of Euclidian distance, Spearman correlation distance and Relative Entropy using scRNA-seq data of the early, medial and late stage of limb development generated in our lab. Relative Entropy is better than other methods according to cluster potential test. Furthermore, we developed KL-SNE, an algorithm modifying t-SNE whose definition of divergence between cells Euclidian distance to Kullback–Leibler divergence. Results showed that KL-SNE was more effective to dissect cell heterogeneity than t-SNE, indicating the better performance of relative entropy than Euclidian distance. Specifically, the chondrocyte expressing Comp was clustered together with KL-SNE but not with t-SNE. Surprisingly, cells in early stage were surrounded by cells in medial stage in the processing of KL-SNE while medial cells neighbored to late stage with the process of t-SNE. This results parallel to Heatmap which showed cells in medial stage were more heterogenic than cells in other stages. In addition, we also found that results of KL-SNE tend to follow Gaussian distribution compared with those of the t-SNE, which could also be verified with the analysis of scRNA-seq data from another study on human embryo development. Therefore, it is also an effective way to convert non-Gaussian distribution to Gaussian distribution and facilitate the subsequent statistic possesses. Thus, relative entropy is potentially a better way to determine the divergence of cells in scRNA-seq data analysis.

Keywords: Single cell RNA sequence, Similarity measurement, Relative Entropy, KL-SNE, t-SNE

Procedia PDF Downloads 317
17728 The Use of Geographically Weighted Regression for Deforestation Analysis: Case Study in Brazilian Cerrado

Authors: Ana Paula Camelo, Keila Sanches

Abstract:

The Geographically Weighted Regression (GWR) was proposed in geography literature to allow relationship in a regression model to vary over space. In Brazil, the agricultural exploitation of the Cerrado Biome is the main cause of deforestation. In this study, we propose a methodology using geostatistical methods to characterize the spatial dependence of deforestation in the Cerrado based on agricultural production indicators. Therefore, it was used the set of exploratory spatial data analysis tools (ESDA) and confirmatory analysis using GWR. It was made the calibration a non-spatial model, evaluation the nature of the regression curve, election of the variables by stepwise process and multicollinearity analysis. After the evaluation of the non-spatial model was processed the spatial-regression model, statistic evaluation of the intercept and verification of its effect on calibration. In an analysis of Spearman’s correlation the results between deforestation and livestock was +0.783 and with soybeans +0.405. The model presented R²=0.936 and showed a strong spatial dependence of agricultural activity of soybeans associated to maize and cotton crops. The GWR is a very effective tool presenting results closer to the reality of deforestation in the Cerrado when compared with other analysis.

Keywords: deforestation, geographically weighted regression, land use, spatial analysis

Procedia PDF Downloads 331
17727 Confidence Intervals for Process Capability Indices for Autocorrelated Data

Authors: Jane A. Luke

Abstract:

Persistent pressure passed on to manufacturers from escalating consumer expectations and the ever growing global competitiveness have produced a rapidly increasing interest in the development of various manufacturing strategy models. Academic and industrial circles are taking keen interest in the field of manufacturing strategy. Many manufacturing strategies are currently centered on the traditional concepts of focused manufacturing capabilities such as quality, cost, dependability and innovation. Process capability indices was conducted assuming that the process under study is in statistical control and independent observations are generated over time. However, in practice, it is very common to come across processes which, due to their inherent natures, generate autocorrelated observations. The degree of autocorrelation affects the behavior of patterns on control charts. Even, small levels of autocorrelation between successive observations can have considerable effects on the statistical properties of conventional control charts. When observations are autocorrelated the classical control charts exhibit nonrandom patterns and lack of control. Many authors have considered the effect of autocorrelation on the performance of statistical process control charts. In this paper, the effect of autocorrelation on confidence intervals for different PCIs was included. Stationary Gaussian processes is explained. Effect of autocorrelation on PCIs is described in detail. Confidence intervals for Cp and Cpk are constructed for PCIs when data are both independent and autocorrelated. Confidence intervals for Cp and Cpk are computed. Approximate lower confidence limits for various Cpk are computed assuming AR(1) model for the data. Simulation studies and industrial examples are considered to demonstrate the results.

Keywords: autocorrelation, AR(1) model, Bissell’s approximation, confidence intervals, statistical process control, specification limits, stationary Gaussian processes

Procedia PDF Downloads 359
17726 Developing Variable Repetitive Group Sampling Control Chart Using Regression Estimator

Authors: Liaquat Ahmad, Muhammad Aslam, Muhammad Azam

Abstract:

In this article, we propose a control chart based on repetitive group sampling scheme for the location parameter. This charting scheme is based on the regression estimator; an estimator that capitalize the relationship between the variables of interest to provide more sensitive control than the commonly used individual variables. The control limit coefficients have been estimated for different sample sizes for less and highly correlated variables. The monitoring of the production process is constructed by adopting the procedure of the Shewhart’s x-bar control chart. Its performance is verified by the average run length calculations when the shift occurs in the average value of the estimator. It has been observed that the less correlated variables have rapid false alarm rate.

Keywords: average run length, control charts, process shift, regression estimators, repetitive group sampling

Procedia PDF Downloads 536
17725 Mixed Sub-Fractional Brownian Motion

Authors: Mounir Zili

Abstract:

We will introduce a new extension of the Brownian motion, that could serve to get a good model of many natural phenomena. It is a linear combination of a finite number of sub-fractional Brownian motions; that is why we will call it the mixed sub-fractional Brownian motion. We will present some basic properties of this process. Among others, we will check that our process is non-Markovian and that it has non-stationary increments. We will also give the conditions under which it is a semimartingale. Finally, the main features of its sample paths will be specified.

Keywords: mixed Gaussian processes, Sub-fractional Brownian motion, sample paths

Procedia PDF Downloads 460
17724 Gaussian Mixture Model Based Identification of Arterial Wall Movement for Computation of Distension Waveform

Authors: Ravindra B. Patil, P. Krishnamoorthy, Shriram Sethuraman

Abstract:

This work proposes a novel Gaussian Mixture Model (GMM) based approach for accurate tracking of the arterial wall and subsequent computation of the distension waveform using Radio Frequency (RF) ultrasound signal. The approach was evaluated on ultrasound RF data acquired using a prototype ultrasound system from an artery mimicking flow phantom. The effectiveness of the proposed algorithm is demonstrated by comparing with existing wall tracking algorithms. The experimental results show that the proposed method provides 20% reduction in the error margin compared to the existing approaches in tracking the arterial wall movement. This approach coupled with ultrasound system can be used to estimate the arterial compliance parameters required for screening of cardiovascular related disorders.

Keywords: distension waveform, Gaussian Mixture Model, RF ultrasound, arterial wall movement

Procedia PDF Downloads 475
17723 A Segmentation Method for Grayscale Images Based on the Firefly Algorithm and the Gaussian Mixture Model

Authors: Donatella Giuliani

Abstract:

In this research, we propose an unsupervised grayscale image segmentation method based on a combination of the Firefly Algorithm and the Gaussian Mixture Model. Firstly, the Firefly Algorithm has been applied in a histogram-based research of cluster means. The Firefly Algorithm is a stochastic global optimization technique, centered on the flashing characteristics of fireflies. In this context it has been performed to determine the number of clusters and the related cluster means in a histogram-based segmentation approach. Successively these means are used in the initialization step for the parameter estimation of a Gaussian Mixture Model. The parametric probability density function of a Gaussian Mixture Model is represented as a weighted sum of Gaussian component densities, whose parameters are evaluated applying the iterative Expectation-Maximization technique. The coefficients of the linear super-position of Gaussians can be thought as prior probabilities of each component. Applying the Bayes rule, the posterior probabilities of the grayscale intensities have been evaluated, therefore their maxima are used to assign each pixel to the clusters, according to their gray-level values. The proposed approach appears fairly solid and reliable when applied even to complex grayscale images. The validation has been performed by using different standard measures, more precisely: the Root Mean Square Error (RMSE), the Structural Content (SC), the Normalized Correlation Coefficient (NK) and the Davies-Bouldin (DB) index. The achieved results have strongly confirmed the robustness of this gray scale segmentation method based on a metaheuristic algorithm. Another noteworthy advantage of this methodology is due to the use of maxima of responsibilities for the pixel assignment that implies a consistent reduction of the computational costs.

Keywords: clustering images, firefly algorithm, Gaussian mixture model, meta heuristic algorithm, image segmentation

Procedia PDF Downloads 189
17722 Optimised Path Recommendation for a Real Time Process

Authors: Likewin Thomas, M. V. Manoj Kumar, B. Annappa

Abstract:

Traditional execution process follows the path of execution drawn by the process analyst without observing the behaviour of resource and other real-time constraints. Identifying process model, predicting the behaviour of resource and recommending the optimal path of execution for a real time process is challenging. The proposed AlfyMiner: αyM iner gives a new dimension in process execution with the novel techniques Process Model Analyser: PMAMiner and Resource behaviour Analyser: RBAMiner for recommending the probable path of execution. PMAMiner discovers next probable activity for currently executing activity in an online process using variant matching technique to identify the set of next probable activity, among which the next probable activity is discovered using decision tree model. RBAMiner identifies the resource suitable for performing the discovered next probable activity and observe the behaviour based on; load and performance using polynomial regression model, and waiting time using queueing theory. Based on the observed behaviour αyM iner recommend the probable path of execution with; next probable activity and the best suitable resource for performing it. Experiments were conducted on process logs of CoSeLoG Project1 and 72% of accuracy is obtained in identifying and recommending next probable activity and the efficiency of resource performance was optimised by 59% by decreasing their load.

Keywords: cross-organization process mining, process behaviour, path of execution, polynomial regression model

Procedia PDF Downloads 308
17721 Reliability Analysis of Dam under Quicksand Condition

Authors: Manthan Patel, Vinit Ahlawat, Anshh Singh Claire, Pijush Samui

Abstract:

This paper focuses on the analysis of quicksand condition for a dam foundation. The quicksand condition occurs in cohesion less soil when effective stress of soil becomes zero. In a dam, the saturated sediment may appear quite solid until a sudden change in pressure or shock initiates liquefaction. This causes the sand to form a suspension and lose strength hence resulting in failure of dam. A soil profile shows different properties at different points and the values obtained are uncertain thus reliability analysis is performed. The reliability is defined as probability of safety of a system in a given environment and loading condition and it is assessed as Reliability Index. The reliability analysis of dams under quicksand condition is carried by Gaussian Process Regression (GPR). Reliability index and factor of safety relating to liquefaction of soil is analysed using GPR. The results of reliability analysis by GPR is compared to that of conventional method and it is demonstrated that on applying GPR the probabilistic analysis reduces the computational time and efforts.

Keywords: factor of safety, GPR, reliability index, quicksand

Procedia PDF Downloads 456
17720 Orthogonal Regression for Nonparametric Estimation of Errors-In-Variables Models

Authors: Anastasiia Yu. Timofeeva

Abstract:

Two new algorithms for nonparametric estimation of errors-in-variables models are proposed. The first algorithm is based on penalized regression spline. The spline is represented as a piecewise-linear function and for each linear portion orthogonal regression is estimated. This algorithm is iterative. The second algorithm involves locally weighted regression estimation. When the independent variable is measured with error such estimation is a complex nonlinear optimization problem. The simulation results have shown the advantage of the second algorithm under the assumption that true smoothing parameters values are known. Nevertheless the use of some indexes of fit to smoothing parameters selection gives the similar results and has an oversmoothing effect.

Keywords: grade point average, orthogonal regression, penalized regression spline, locally weighted regression

Procedia PDF Downloads 383
17719 Control of a Stewart Platform for Minimizing Impact Energy in Simulating Spacecraft Docking Operations

Authors: Leonardo Herrera, Shield B. Lin, Stephen J. Montgomery-Smith, Ziraguen O. Williams

Abstract:

Three control algorithms: Proportional-Integral-Derivative, Linear-Quadratic-Gaussian, and Linear-Quadratic-Gaussian with the shift, were applied to the computer simulation of a one-directional dynamic model of a Stewart Platform. The goal was to compare the dynamic system responses under the three control algorithms and to minimize the impact energy when simulating spacecraft docking operations. Equations were derived for the control algorithms and the input and output of the feedback control system. Using MATLAB, Simulink diagrams were created to represent the three control schemes. A switch selector was used for the convenience of changing among different controllers. The simulation demonstrated the controller using the algorithm of Linear-Quadratic-Gaussian with the shift resulting in the lowest impact energy.

Keywords: controller, Stewart platform, docking operation, spacecraft

Procedia PDF Downloads 8
17718 Mixed-Sub Fractional Brownian Motion

Authors: Mounir Zili

Abstract:

We will introduce a new extension of the Brownian motion, that could serve to get a good model of many natural phenomena. It is a linear combination of a finite number of sub-fractional Brownian motions; that is why we will call it the mixed sub-fractional Brownian motion. We will present some basic properties of this process. Among others, we will check that our process is non-markovian and that it has non-stationary increments. We will also give the conditions under which it is a semi-martingale. Finally, the main features of its sample paths will be specified.

Keywords: fractal dimensions, mixed gaussian processes, sample paths, sub-fractional brownian motion

Procedia PDF Downloads 387
17717 MapReduce Logistic Regression Algorithms with RHadoop

Authors: Byung Ho Jung, Dong Hoon Lim

Abstract:

Logistic regression is a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating parameters in the logistic regression based on MapReduce framework with RHadoop that integrates R and Hadoop environment applicable to large scale data. There exist three learning algorithms for logistic regression, namely Gradient descent method, Cost minimization method and Newton-Rhapson's method. The Newton-Rhapson's method does not require a learning rate, while gradient descent and cost minimization methods need to manually pick a learning rate. The experimental results demonstrated that our learning algorithms using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Rhapson's method with gradient descent and cost minimization methods. The results showed that our newton's method appeared to be the most robust to all data tested.

Keywords: big data, logistic regression, MapReduce, RHadoop

Procedia PDF Downloads 247
17716 Cyclostationary Gaussian Linearization for Analyzing Nonlinear System Response Under Sinusoidal Signal and White Noise Excitation

Authors: R. J. Chang

Abstract:

A cyclostationary Gaussian linearization method is formulated for investigating the time average response of nonlinear system under sinusoidal signal and white noise excitation. The quantitative measure of cyclostationary mean, variance, spectrum of mean amplitude, and mean power spectral density of noise is analyzed. The qualitative response behavior of stochastic jump and bifurcation are investigated. The validity of the present approach in predicting the quantitative and qualitative statistical responses is supported by utilizing Monte Carlo simulations. The present analysis without imposing restrictive analytical conditions can be directly derived by solving non-linear algebraic equations. The analytical solution gives reliable quantitative and qualitative prediction of mean and noise response for the Duffing system subjected to both sinusoidal signal and white noise excitation.

Keywords: cyclostationary, duffing system, Gaussian linearization, sinusoidal, white noise

Procedia PDF Downloads 462
17715 Forecasting of Grape Juice Flavor by Using Support Vector Regression

Authors: Ren-Jieh Kuo, Chun-Shou Huang

Abstract:

The research of juice flavor forecasting has become more important in China. Due to the fast economic growth in China, many different kinds of juices have been introduced to the market. If a beverage company can understand their customers’ preference well, the juice can be served more attractively. Thus, this study intends to introduce the basic theory and computing process of grapes juice flavor forecasting based on support vector regression (SVR). Applying SVR, BPN and LR to forecast the flavor of grapes juice in real data, the result shows that SVR is more suitable and effective at predicting performance.

Keywords: flavor forecasting, artificial neural networks, Support Vector Regression, China

Procedia PDF Downloads 453
17714 Tracking the Effect of Ibutilide on Amplitude and Frequency of Fibrillatory Intracardiac Electrograms Using the Regression Analysis

Authors: H. Hajimolahoseini, J. Hashemi, D. Redfearn

Abstract:

Background: Catheter ablation is an effective therapy for symptomatic atrial fibrillation (AF). The intracardiac electrocardiogram (IEGM) collected during this procedure contains precious information that has not been explored to its full capacity. Novel processing techniques allow looking at these recordings from different perspectives which can lead to improved therapeutic approaches. In our previous study, we showed that variation in amplitude measured through Shannon Entropy could be used as an AF recurrence risk stratification factor in patients who received Ibutilide before the electrograms were recorded. The aim of this study is to further investigate the effect of Ibutilide on characteristics of the recorded signals from the left atrium (LA) of a patient with persistent AF before and after administration of the drug. Methods: The IEGMs collected from different intra-atrial sites of 12 patients were studied and compared before and after Ibutilide administration. First, the before and after Ibutilide IEGMs that were recorded within a Euclidian distance of 3 mm in LA were selected as pairs for comparison. For every selected pair of IEGMs, the Probability Distribution Function (PDF) of the amplitude in time domain and magnitude in frequency domain was estimated using the regression analysis. The PDF represents the relative likelihood of a variable falling within a specific range of values. Results: Our observations showed that in time domain, the PDF of amplitudes was fitted to a Gaussian distribution while in frequency domain, it was fitted to a Rayleigh distribution. Our observations also revealed that after Ibutilide administration, the IEGMs would have significantly narrower short-tailed PDFs both in time and frequency domains. Conclusion: This study shows that the PDFs of the IEGMs before and after administration of Ibutilide represents significantly different properties, both in time and frequency domains. Hence, by fitting the PDF of IEGMs in time domain to a Gaussian distribution or in frequency domain to a Rayleigh distribution, the effect of Ibutilide can easily be tracked using the statistics of their PDF (e.g., standard deviation) while this is difficult through the waveform of IEGMs itself.

Keywords: atrial fibrillation, catheter ablation, probability distribution function, time-frequency characteristics

Procedia PDF Downloads 141
17713 Prediction of Energy Storage Areas for Static Photovoltaic System Using Irradiation and Regression Modelling

Authors: Kisan Sarda, Bhavika Shingote

Abstract:

This paper aims to evaluate regression modelling for prediction of Energy storage of solar photovoltaic (PV) system using Semi parametric regression techniques because there are some parameters which are known while there are some unknown parameters like humidity, dust etc. Here irradiation of solar energy is different for different places on the basis of Latitudes, so by finding out areas which give more storage we can implement PV systems at those places and our need of energy will be fulfilled. This regression modelling is done for daily, monthly and seasonal prediction of solar energy storage. In this, we have used R modules for designing the algorithm. This algorithm will give the best comparative results than other regression models for the solar PV cell energy storage.

Keywords: semi parametric regression, photovoltaic (PV) system, regression modelling, irradiation

Procedia PDF Downloads 350
17712 A Comparative Asessment of Some Algorithms for Modeling and Forecasting Horizontal Displacement of Ialy Dam, Vietnam

Authors: Kien-Trinh Thi Bui, Cuong Manh Nguyen

Abstract:

In order to simulate and reproduce the operational characteristics of a dam visually, it is necessary to capture the displacement at different measurement points and analyze the observed movement data promptly to forecast the dam safety. The accuracy of forecasts is further improved by applying machine learning methods to data analysis progress. In this study, the horizontal displacement monitoring data of the Ialy hydroelectric dam was applied to machine learning algorithms: Gaussian processes, multi-layer perceptron neural networks, and the M5-rules algorithm for modelling and forecasting of horizontal displacement of the Ialy hydropower dam (Vietnam), respectively, for analysing. The database which used in this research was built by collecting time series of data from 2006 to 2021 and divided into two parts: training dataset and validating dataset. The final results show all three algorithms have high performance for both training and model validation, but the MLPs is the best model. The usability of them are further investigated by comparison with a benchmark models created by multi-linear regression. The result show the performance which obtained from all the GP model, the MLPs model and the M5-Rules model are much better, therefore these three models should be used to analyze and predict the horizontal displacement of the dam.

Keywords: Gaussian processes, horizontal displacement, hydropower dam, Ialy dam, M5-Rules, multi-layer perception neural networks

Procedia PDF Downloads 174
17711 3D Shape Knitting: Loop Alignment on a Surface with Positive Gaussian Curvature

Authors: C. T. Cheung, R. K. P. Ng, T. Y. Lo, Zhou Jinyun

Abstract:

This paper aims at manipulating loop alignment in knitting a three-dimensional (3D) shape by its geometry. Two loop alignment methods are introduced to handle a surface with positive Gaussian curvature. As weft knitting is a two-dimensional (2D) knitting mechanism that the knitting cam carrying the feeders moves in two directions only, left and right, the knitted fabric generated grows in width and length but not in depth. Therefore, a 3D shape is required to be flattened to a 2D plane with surface area preserved for knitting. On this flattened plane, dimensional measurements are taken for loop alignment. The way these measurements being taken derived two different loop alignment methods. In this paper, only plain knitted structure was considered. Each knitted loop was taken as a basic unit for loop alignment in order to achieve the required geometric dimensions, without the inclusion of other stitches which give textural dimensions to the fabric. Two loop alignment methods were experimented and compared. Only one of these two can successfully preserve the dimensions of the shape.

Keywords: 3D knitting, 3D shape, loop alignment, positive Gaussian curvature

Procedia PDF Downloads 321
17710 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection

Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew

Abstract:

The task of detecting email spam is a very important one in the era of digital technology that needs effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique assists in providing interpretable explanations for specific classifications of emails to help users understand the decision-making process by the model. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve understanding of categorization decisions by users. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.

Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression.

Procedia PDF Downloads 18
17709 Evaluation of the CRISP-DM Business Understanding Step: An Approach for Assessing the Predictive Power of Regression versus Classification for the Quality Prediction of Hydraulic Test Results

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Digitalisation in production technology is a driver for the application of machine learning methods. Through the application of predictive quality, the great potential for saving necessary quality control can be exploited through the data-based prediction of product quality and states. However, the serial use of machine learning applications is often prevented by various problems. Fluctuations occur in real production data sets, which are reflected in trends and systematic shifts over time. To counteract these problems, data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets to extract stable features. Successful process control of the target variables aims to centre the measured values around a mean and minimise variance. Competitive leaders claim to have mastered their processes. As a result, much of the real data has a relatively low variance. For the training of prediction models, the highest possible generalisability is required, which is at least made more difficult by this data availability. The implementation of a machine learning application can be interpreted as a production process. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that describes the life cycle of data science. As in any process, the costs to eliminate errors increase significantly with each advancing process phase. For the quality prediction of hydraulic test steps of directional control valves, the question arises in the initial phase whether a regression or a classification is more suitable. In the context of this work, the initial phase of the CRISP-DM, the business understanding, is critically compared for the use case at Bosch Rexroth with regard to regression and classification. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. Suitable methods for leakage volume flow regression and classification for inspection decision are applied. Impressively, classification is clearly superior to regression and achieves promising accuracies.

Keywords: classification, CRISP-DM, machine learning, predictive quality, regression

Procedia PDF Downloads 116
17708 Adaptive CFAR Analysis for Non-Gaussian Distribution

Authors: Bouchemha Amel, Chachoui Takieddine, H. Maalem

Abstract:

Automatic detection of targets in a modern communication system RADAR is based primarily on the concept of adaptive CFAR detector. To have an effective detection, we must minimize the influence of disturbances due to the clutter. The detection algorithm adapts the CFAR detection threshold which is proportional to the average power of the clutter, maintaining a constant probability of false alarm. In this article, we analyze the performance of two variants of adaptive algorithms CA-CFAR and OS-CFAR and we compare the thresholds of these detectors in the marine environment (no-Gaussian) with a Weibull distribution.

Keywords: CFAR, threshold, clutter, distribution, Weibull, detection

Procedia PDF Downloads 550
17707 Video Foreground Detection Based on Adaptive Mixture Gaussian Model for Video Surveillance Systems

Authors: M. A. Alavianmehr, A. Tashk, A. Sodagaran

Abstract:

Modeling background and moving objects are significant techniques for video surveillance and other video processing applications. This paper presents a foreground detection algorithm that is robust against illumination changes and noise based on adaptive mixture Gaussian model (GMM), and provides a novel and practical choice for intelligent video surveillance systems using static cameras. In the previous methods, the image of still objects (background image) is not significant. On the contrary, this method is based on forming a meticulous background image and exploiting it for separating moving objects from their background. The background image is specified either manually, by taking an image without vehicles, or is detected in real-time by forming a mathematical or exponential average of successive images. The proposed scheme can offer low image degradation. The simulation results demonstrate high degree of performance for the proposed method.

Keywords: image processing, background models, video surveillance, foreground detection, Gaussian mixture model

Procedia PDF Downloads 487
17706 New Segmentation of Piecewise Linear Regression Models Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

Piecewise linear regression models are very flexible models for modeling the data. If the piecewise linear regression models are matched against the data, then the parameters are generally not known. This paper studies the problem of parameter estimation of piecewise linear regression models. The method used to estimate the parameters of picewise linear regression models is Bayesian method. But the Bayes estimator can not be found analytically. To overcome these problems, the reversible jump MCMC algorithm is proposed. Reversible jump MCMC algorithm generates the Markov chain converges to the limit distribution of the posterior distribution of the parameters of picewise linear regression models. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of picewise linear regression models.

Keywords: regression, piecewise, Bayesian, reversible Jump MCMC

Procedia PDF Downloads 489
17705 Application Difference between Cox and Logistic Regression Models

Authors: Idrissa Kayijuka

Abstract:

The logistic regression and Cox regression models (proportional hazard model) at present are being employed in the analysis of prospective epidemiologic research looking into risk factors in their application on chronic diseases. However, a theoretical relationship between the two models has been studied. By definition, Cox regression model also called Cox proportional hazard model is a procedure that is used in modeling data regarding time leading up to an event where censored cases exist. Whereas the Logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values while the resultant variable is binary (dichotomous). Arguments and findings of many researchers focused on the overview of Cox and Logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data whose source is SPSS exercise data on BREAST CANCER with a sample size of 1121 women where the main objective is to show the application difference between Cox regression model and logistic regression model based on factors that cause women to die due to breast cancer. Thus we did some analysis manually i.e. on lymph nodes status, and SPSS software helped to analyze the mentioned data. This study found out that there is an application difference between Cox and Logistic regression models which is Cox regression model is used if one wishes to analyze data which also include the follow-up time whereas Logistic regression model analyzes data without follow-up-time. Also, they have measurements of association which is different: hazard ratio and odds ratio for Cox and logistic regression models respectively. A similarity between the two models is that they are both applicable in the prediction of the upshot of a categorical variable i.e. a variable that can accommodate only a restricted number of categories. In conclusion, Cox regression model differs from logistic regression by assessing a rate instead of proportion. The two models can be applied in many other researches since they are suitable methods for analyzing data but the more recommended is the Cox, regression model.

Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio

Procedia PDF Downloads 424