Search results for: least square regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1340

Search results for: least square regression

1280 Force Statistics and Wake Structure Mechanism of Flow around a Square Cylinder at Low Reynolds Numbers

Authors: Shams-Ul-Islam, Waqas Sarwar Abbasi, Hamid Rahman

Abstract:

Numerical investigation of flow around a square cylinder are presented using the multi-relaxation-time lattice Boltzmann methods at different Reynolds numbers. A detail analysis are given in terms of time-trace analysis of drag and lift coefficients, power spectra analysis of lift coefficient, vorticity contours visualizations, streamlines and phase diagrams. A number of physical quantities mean drag coefficient, drag coefficient, Strouhal number and root-mean-square values of drag and lift coefficients are calculated and compared with the well resolved experimental data and numerical results available in open literature. The Reynolds numbers affected the physical quantities.

Keywords: Code validation, Force statistics, Multi-relaxation-time lattice Boltzmann method, Reynolds numbers, Square cylinder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3122
1279 Bipolar Square Wave Pulses for Liquid Food Sterilization using Cascaded H-Bridge Multilevel Inverter

Authors: Hanifah Jambari, Naziha A. Azli, M. Afendi M. Piah

Abstract:

This paper presents the generation of bipolar square wave pulses with characteristics that are suitable for liquid food sterilization using a Cascaded H-bridge Multilevel Inverter (CHMI). Bipolar square waves pulses have been reported as stable for a longer time during the sterilization process with minimum heat emission and increased efficiency. The CHMI allows the system to produce bipolar square wave pulses and yielding high output voltage without using a transformer while fulfilling the pulse requirements for effective liquid food sterilization. This in turn can reduce power consumption and cost of the overall liquid food sterilization system. The simulation results have shown that pulses with peak output voltage of 2.4 kV, pulse width of between 1 2s and 1 ms at frequencies of 50 Hz and 100 Hz can be generated by a 7-level CHMI. Results from the experimental set-up based on a 5-level CHMI has indicated the potential of the proposed circuit in producing bipolar square wave output pulses with peak values that depends on the DC source level supplied to the CHMI modules, pulse width of between 12.5 2s and 1 ms at frequencies of 50 Hz and 100 Hz.

Keywords: pulsed electric field, multilevel inverter, bipolarsquare wave, food sterilization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2542
1278 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila, V. Mahesh

Abstract:

Pulmonary Function Tests are important non-invasive diagnostic tests to assess respiratory impairments and provides quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients resulting in incomplete data sets. This paper presents, a nonlinear model built using Multivariate adaptive regression splines and Random forest regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N= 198 subjects. The ranked order of feature importance index calculated by the random forests model shows that the spirometric features FVC, FEF25, PEF, FEF25-75, FEF50 and the demographic parameter height are the important descriptors. A comparison of performance assessment of both models prove that, the prediction ability of MARS with the `top two ranked features namely the FVC and FEF25 is higher, yielding a model fit of R2= 0.96 and R2= 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects) with the aforementioned input features. It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, as well as yielding better prediction performance. This analysis can assist clinicians with a intelligence support system in the medical diagnosis and improvement of clinical care.

Keywords: FEV1, Multivariate Adaptive Regression Splines Pulmonary Function Test, Random Forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3737
1277 Enhancing Temporal Extrapolation of Wind Speed Using a Hybrid Technique: A Case Study in West Coast of Denmark

Authors: B. Elshafei, X. Mao

Abstract:

The demand for renewable energy is significantly increasing, major investments are being supplied to the wind power generation industry as a leading source of clean energy. The wind energy sector is entirely dependable and driven by the prediction of wind speed, which by the nature of wind is very stochastic and widely random. This s0tudy employs deep multi-fidelity Gaussian process regression, used to predict wind speeds for medium term time horizons. Data of the RUNE experiment in the west coast of Denmark were provided by the Technical University of Denmark, which represent the wind speed across the study area from the period between December 2015 and March 2016. The study aims to investigate the effect of pre-processing the data by denoising the signal using empirical wavelet transform (EWT) and engaging the vector components of wind speed to increase the number of input data layers for data fusion using deep multi-fidelity Gaussian process regression (GPR). The outcomes were compared using root mean square error (RMSE) and the results demonstrated a significant increase in the accuracy of predictions which demonstrated that using vector components of the wind speed as additional predictors exhibits more accurate predictions than strategies that ignore them, reflecting the importance of the inclusion of all sub data and pre-processing signals for wind speed forecasting models.

Keywords: Data fusion, Gaussian process regression, signal denoise, temporal extrapolation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 501
1276 Quality of Service Evaluation using a Combination of Fuzzy C-Means and Regression Model

Authors: Aboagela Dogman, Reza Saatchi, Samir Al-Khayatt

Abstract:

In this study, a network quality of service (QoS) evaluation system was proposed. The system used a combination of fuzzy C-means (FCM) and regression model to analyse and assess the QoS in a simulated network. Network QoS parameters of multimedia applications were intelligently analysed by FCM clustering algorithm. The QoS parameters for each FCM cluster centre were then inputted to a regression model in order to quantify the overall QoS. The proposed QoS evaluation system provided valuable information about the network-s QoS patterns and based on this information, the overall network-s QoS was effectively quantified.

Keywords: Fuzzy C-means; regression model, network quality of service

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1720
1275 Defect Cause Modeling with Decision Tree and Regression Analysis

Authors: B. Bakır, İ. Batmaz, F. A. Güntürkün, İ. A. İpekçi, G. Köksal, N. E. Özdemirel

Abstract:

The main aim of this study is to identify the most influential variables that cause defects on the items produced by a casting company located in Turkey. To this end, one of the items produced by the company with high defective percentage rates is selected. Two approaches-the regression analysis and decision treesare used to model the relationship between process parameters and defect types. Although logistic regression models failed, decision tree model gives meaningful results. Based on these results, it can be claimed that the decision tree approach is a promising technique for determining the most important process variables.

Keywords: Casting industry, decision tree algorithm C5.0, logistic regression, quality improvement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2517
1274 Stability of Square Plate with Concentric Cutout

Authors: B. S. Jayashankarbabu, Karisiddappa

Abstract:

The finite element method is used to obtain the elastic buckling load factor for square isotropic plate containing circular, square and rectangular cutouts. ANSYS commercial finite element software had been used in the study. The applied inplane loads considered are uniaxial and biaxial compressions. In all the cases the load is distributed uniformly along the plate outer edges. The effects of the size and shape of concentric cutouts with different plate thickness ratios and the influence of plate edge conditions, such as SSSS, CCCC and mixed boundary condition SCSC on the plate buckling strength have been considered in the analysis.

Keywords: Concentric cutout, Elastic buckling, Finite element method, Inplane loads, Thickness ratio.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3226
1273 Mathematical Determination of Tall Square Building Height under Peak Wind Loads

Authors: Debojyoti Mitra

Abstract:

The present study concentrates on solving the along wind oscillation problem of a tall square building from first principles and across wind oscillation problem of the same from empirical relations obtained by experiments. The criterion for human comfort at the worst condition at the top floor of the building is being considered and a limiting value of height of a building for a given cross section is predicted. Numerical integrations are carried out as and when required. The results show severeness of across wind oscillations in comparison to along wind oscillation. The comfort criterion is combined with across wind oscillation results to determine the maximum allowable height of a building for a given square cross-section.

Keywords: Tall Building, Along-wind Response, Across-wind Response, Human Comfort.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1476
1272 Performance Analysis of Adaptive LMS Filter through Regression Analysis using SystemC

Authors: Hyeong-Geon Lee, Jae-Young Park, Suk-ki Lee, Jong-Tae Kim

Abstract:

The LMS adaptive filter has several parameters which can affect their performance. From among these parameters, most papers handle the step size parameter for controlling the performance. In this paper, we approach three parameters: step-size, filter tap-size and filter form. The regression analysis is used for defining the relation between parameters and performance of LMS adaptive filter with using the system level simulation results. The results present that all parameters have performance trends in each own particular form, which can be estimated from equations drawn by regression analysis.

Keywords: System level model, adaptive LMS FIR filter, regression analysis, systemC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2800
1271 Simulation of Natural Convection in Concentric Annuli between an Outer Inclined Square Enclosure and an Inner Horizontal Cylinder

Authors: Sattar Al-Jabair, Laith J. Habeeb

Abstract:

In this work, the natural convection in a concentric annulus between a cold outer inclined square enclosure and heated inner circular cylinder is simulated for two-dimensional steady state. The Boussinesq approximation was applied to model the buoyancy-driven effect and the governing equations were solved using the time marching approach staggered by body fitted coordinates. The coordinate transformation from the physical domain to the computational domain is set up by an analytical expression. Numerical results for Rayleigh numbers 103 , 104 , 105 and 106, aspect ratios 1.5 , 3.0 and 4.5 for seven different inclination angles for the outer square enclosure 0o , -30o , -45o , -60o , -90o , -135o , -180o are presented as well. The computed flow and temperature fields were demonstrated in the form of streamlines, isotherms and Nusselt numbers variation. It is found that both the aspect ratio and the Rayleigh number are critical to the patterns of flow and thermal fields. At all Rayleigh numbers angle of inclination has nominal effect on heat transfer.

Keywords: natural convection, concentric annulus, square inclined enclosure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2852
1270 Density Estimation using Generalized Linear Model and a Linear Combination of Gaussians

Authors: Aly Farag, Ayman El-Baz, Refaat Mohamed

Abstract:

In this paper we present a novel approach for density estimation. The proposed approach is based on using the logistic regression model to get initial density estimation for the given empirical density. The empirical data does not exactly follow the logistic regression model, so, there will be a deviation between the empirical density and the density estimated using logistic regression model. This deviation may be positive and/or negative. In this paper we use a linear combination of Gaussian (LCG) with positive and negative components as a model for this deviation. Also, we will use the expectation maximization (EM) algorithm to estimate the parameters of LCG. Experiments on real images demonstrate the accuracy of our approach.

Keywords: Logistic regression model, Expectationmaximization, Segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1732
1269 Multiple Regression based Graphical Modeling for Images

Authors: Pavan S., Sridhar G., Sridhar V.

Abstract:

Super resolution is one of the commonly referred inference problems in computer vision. In the case of images, this problem is generally addressed using a graphical model framework wherein each node represents a portion of the image and the edges between the nodes represent the statistical dependencies. However, the large dimensionality of images along with the large number of possible states for a node makes the inference problem computationally intractable. In this paper, we propose a representation wherein each node can be represented as acombination of multiple regression functions. The proposed approach achieves a tradeoff between the computational complexity and inference accuracy by varying the number of regression functions for a node.

Keywords: Belief propagation, Graphical model, Regression, Super resolution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1546
1268 Empirical Statistical Modeling of Rainfall Prediction over Myanmar

Authors: Wint Thida Zaw, Thinn Thu Naing

Abstract:

One of the essential sectors of Myanmar economy is agriculture which is sensitive to climate variation. The most important climatic element which impacts on agriculture sector is rainfall. Thus rainfall prediction becomes an important issue in agriculture country. Multi variables polynomial regression (MPR) provides an effective way to describe complex nonlinear input output relationships so that an outcome variable can be predicted from the other or others. In this paper, the modeling of monthly rainfall prediction over Myanmar is described in detail by applying the polynomial regression equation. The proposed model results are compared to the results produced by multiple linear regression model (MLR). Experiments indicate that the prediction model based on MPR has higher accuracy than using MLR.

Keywords: Polynomial Regression, Rainfall Forecasting, Statistical forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2633
1267 Profitability Assessment of Granite Aggregate Production and the Development of a Profit Assessment Model

Authors: Melodi Mbuyi Mata, Blessing Olamide Taiwo, Afolabi Ayodele David

Abstract:

The purpose of this research is to create empirical models for assessing the profitability of granite aggregate production in Akure, Ondo state aggregate quarries. In addition, an Artificial Neural Network (ANN) model and multivariate predicting models for granite profitability were developed in the study. A formal survey questionnaire was used to collect data for the study. The data extracted from the case study mine for this study include granite marketing operations, royalty, production costs, and mine production information. The following methods were used to achieve the goal of this study: descriptive statistics, MATLAB 2017, and SPSS16.0 software in analyzing and modeling the data collected from granite traders in the study areas. The ANN and Multi Variant Regression models' prediction accuracy was compared using a coefficient of determination (R2), Root Mean Square Error (RMSE), and mean square error (MSE). Due to the high prediction error, the model evaluation indices revealed that the ANN model was suitable for predicting generated profit in a typical quarry. More quarries in Nigeria's southwest region and other geopolitical zones should be considered to improve ANN prediction accuracy.

Keywords: National development, granite, profitability assessment, ANN models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 82
1266 Studding of Number of Dataset on Precision of Estimated Saturated Hydraulic Conductivity

Authors: M. Siosemarde, M. Byzedi

Abstract:

Saturated hydraulic conductivity of Soil is an important property in processes involving water and solute flow in soils. Saturated hydraulic conductivity of soil is difficult to measure and can be highly variable, requiring a large number of replicate samples. In this study, 60 sets of soil samples were collected at Saqhez region of Kurdistan province-IRAN. The statistics such as Correlation Coefficient (R), Root Mean Square Error (RMSE), Mean Bias Error (MBE) and Mean Absolute Error (MAE) were used to evaluation the multiple linear regression models varied with number of dataset. In this study the multiple linear regression models were evaluated when only percentage of sand, silt, and clay content (SSC) were used as inputs, and when SSC and bulk density, Bd, (SSC+Bd) were used as inputs. The R, RMSE, MBE and MAE values of the 50 dataset for method (SSC), were calculated 0.925, 15.29, -1.03 and 12.51 and for method (SSC+Bd), were calculated 0.927, 15.28,-1.11 and 12.92, respectively, for relationship obtained from multiple linear regressions on data. Also the R, RMSE, MBE and MAE values of the 10 dataset for method (SSC), were calculated 0.725, 19.62, - 9.87 and 18.91 and for method (SSC+Bd), were calculated 0.618, 24.69, -17.37 and 22.16, respectively, which shows when number of dataset increase, precision of estimated saturated hydraulic conductivity, increases.

Keywords: dataset, precision, saturated hydraulic conductivity, soil and statistics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1791
1265 Estimation of Time -Varying Linear Regression with Unknown Time -Volatility via Continuous Generalization of the Akaike Information Criterion

Authors: Elena Ezhova, Vadim Mottl, Olga Krasotkina

Abstract:

The problem of estimating time-varying regression is inevitably concerned with the necessity to choose the appropriate level of model volatility - ranging from the full stationarity of instant regression models to their absolute independence of each other. In the stationary case the number of regression coefficients to be estimated equals that of regressors, whereas the absence of any smoothness assumptions augments the dimension of the unknown vector by the factor of the time-series length. The Akaike Information Criterion is a commonly adopted means of adjusting a model to the given data set within a succession of nested parametric model classes, but its crucial restriction is that the classes are rigidly defined by the growing integer-valued dimension of the unknown vector. To make the Kullback information maximization principle underlying the classical AIC applicable to the problem of time-varying regression estimation, we extend it onto a wider class of data models in which the dimension of the parameter is fixed, but the freedom of its values is softly constrained by a family of continuously nested a priori probability distributions.

Keywords: Time varying regression, time-volatility of regression coefficients, Akaike Information Criterion (AIC), Kullback information maximization principle.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1533
1264 Comparison of Neural Network and Logistic Regression Methods to Predict Xerostomia after Radiotherapy

Authors: Hui-Min Ting, Tsair-Fwu Lee, Ming-Yuan Cho, Pei-Ju Chao, Chun-Ming Chang, Long-Chang Chen, Fu-Min Fang

Abstract:

To evaluate the ability to predict xerostomia after radiotherapy, we constructed and compared neural network and logistic regression models. In this study, 61 patients who completed a questionnaire about their quality of life (QoL) before and after a full course of radiation therapy were included. Based on this questionnaire, some statistical data about the condition of the patients’ salivary glands were obtained, and these subjects were included as the inputs of the neural network and logistic regression models in order to predict the probability of xerostomia. Seven variables were then selected from the statistical data according to Cramer’s V and point-biserial correlation values and were trained by each model to obtain the respective outputs which were 0.88 and 0.89 for AUC, 9.20 and 7.65 for SSE, and 13.7% and 19.0% for MAPE, respectively. These parameters demonstrate that both neural network and logistic regression methods are effective for predicting conditions of parotid glands.

Keywords: NPC, ANN, logistic regression, xerostomia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1636
1263 Bioprocess Optimization Based On Relevance Vector Regression Models and Evolutionary Programming Technique

Authors: R. Simutis, V. Galvanauskas, D. Levisauskas, J. Repsyte

Abstract:

This paper proposes a bioprocess optimization procedure based on Relevance Vector Regression models and evolutionary programming technique. Relevance Vector Regression scheme allows developing a compact and stable data-based process model avoiding time-consuming modeling expenses. The model building and process optimization procedure could be done in a half-automated way and repeated after every new cultivation run. The proposed technique was tested in a simulated mammalian cell cultivation process. The obtained results are promising and could be attractive for optimization of industrial bioprocesses.

Keywords: Bioprocess optimization, Evolutionary programming, Relevance Vector Regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2194
1262 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm

Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian

Abstract:

The present paper addresses to the research in the area of regression testing with emphasis on automated tools as well as prioritization of test cases. The uniqueness of regression testing and its cyclic nature is pointed out. The difference in approach between industry, with business model as basis, and academia, with focus on data mining, is highlighted. Test Metrics are discussed as a prelude to our formula for prioritization; a case study is further discussed to illustrate this methodology. An industrial case study is also described in the paper, where the number of test cases is so large that they have to be grouped as Test Suites. In such situations, a genetic algorithm proposed by us can be used to reconfigure these Test Suites in each cycle of regression testing. The comparison is made between a proprietary tool and an open source tool using the above-mentioned metrics. Our approach is clarified through several tables.

Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 917
1261 An EWMA p Chart Based On Improved Square Root Transformation

Authors: S. Sukparungsee

Abstract:

Generally, the traditional Shewhart p chart has been developed by for charting the binomial data. This chart has been developed using the normal approximation with condition as low defect level and the small to moderate sample size. In real applications, however, are away from these assumptions due to skewness in the exact distribution. In this paper, a modified Exponentially Weighted Moving Average (EWMA) control chat for detecting a change in binomial data by improving square root transformations, namely ISRT p EWMA control chart. The numerical results show that ISRT p EWMA chart is superior to ISRT p chart for small to moderate shifts, otherwise, the latter is better for large shifts.

Keywords: Number of defects, Exponentially Weighted Moving Average, Average Run Length, Square root transformations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2485
1260 A Fuzzy Nonlinear Regression Model for Interval Type-2 Fuzzy Sets

Authors: O. Poleshchuk, E.Komarov

Abstract:

This paper presents a regression model for interval type-2 fuzzy sets based on the least squares estimation technique. Unknown coefficients are assumed to be triangular fuzzy numbers. The basic idea is to determine aggregation intervals for type-1 fuzzy sets, membership functions of whose are low membership function and upper membership function of interval type-2 fuzzy set. These aggregation intervals were called weighted intervals. Low and upper membership functions of input and output interval type-2 fuzzy sets for developed regression models are considered as piecewise linear functions.

Keywords: Interval type-2 fuzzy sets, fuzzy regression, weighted interval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2218
1259 Combined Analysis of Sudoku Square Designs with Same Treatments

Authors: A. Danbaba

Abstract:

Several experiments are conducted at different environments such as locations or periods (seasons) with identical treatments to each experiment purposely to study the interaction between the treatments and environments or between the treatments and periods (seasons). The commonly used designs of experiments for this purpose are randomized block design, Latin square design, balanced incomplete block design, Youden design, and one or more factor designs. The interest is to carry out a combined analysis of the data from these multi-environment experiments, instead of analyzing each experiment separately. This paper proposed combined analysis of experiments conducted via Sudoku square design of odd order with same experimental treatments.

Keywords: Sudoku designs, combined analysis, multi-environment experiments, common treatments.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1539
1258 Mean Square Exponential Synchronization of Stochastic Neutral Type Chaotic Neural Networks with Mixed Delay

Authors: Zixin Liu, Huawei Yang, Fangwei Chen

Abstract:

This paper studies the mean square exponential synchronization problem of a class of stochastic neutral type chaotic neural networks with mixed delay. On the Basis of Lyapunov stability theory, some sufficient conditions ensuring the mean square exponential synchronization of two identical chaotic neural networks are obtained by using stochastic analysis and inequality technique. These conditions are expressed in the form of linear matrix inequalities (LMIs), whose feasibility can be easily checked by using Matlab LMI Toolbox. The feedback controller used in this paper is more general than those used in previous literatures. One simulation example is presented to demonstrate the effectiveness of the derived results.

Keywords: Exponential synchronization, stochastic analysis, chaotic neural networks, neutral type system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1557
1257 Comparison of Machine Learning Techniques for Single Imputation on Audiograms

Authors: Sarah Beaver, Renee Bryce

Abstract:

Audiograms detect hearing impairment, but missing values pose problems. This work explores imputations in an attempt to improve accuracy. This work implements Linear Regression, Lasso, Linear Support Vector Regression, Bayesian Ridge, K Nearest Neighbors (KNN), and Random Forest machine learning techniques to impute audiogram frequencies ranging from 125 Hz to 8000 Hz. The data contain patients who had or were candidates for cochlear implants. Accuracy is compared across two different Nested Cross-Validation k values. Over 4000 audiograms were used from 800 unique patients. Additionally, training on data combines and compares left and right ear audiograms versus single ear side audiograms. The accuracy achieved using Root Mean Square Error (RMSE) values for the best models for Random Forest ranges from 4.74 to 6.37. The R2 values for the best models for Random Forest ranges from .91 to .96. The accuracy achieved using RMSE values for the best models for KNN ranges from 5.00 to 7.72. The R2 values for the best models for KNN ranges from .89 to .95. The best imputation models received R2 between .89 to .96 and RMSE values less than 8dB. We also show that the accuracy of classification predictive models performed better with our imputation models versus constant imputations by a two percent increase.

Keywords: Machine Learning, audiograms, data imputations, single imputations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 158
1256 Using Artificial Neural Network to Predict Collisions on Horizontal Tangents of 3D Two-Lane Highways

Authors: Omer F. Cansiz, Said M. Easa

Abstract:

The purpose of this study is mainly to predict collision frequency on the horizontal tangents combined with vertical curves using artificial neural network methods. The proposed ANN models are compared with existing regression models. First, the variables that affect collision frequency were investigated. It was found that only the annual average daily traffic, section length, access density, the rate of vertical curvature, smaller curve radius before and after the tangent were statistically significant according to related combinations. Second, three statistical models (negative binomial, zero inflated Poisson and zero inflated negative binomial) were developed using the significant variables for three alignment combinations. Third, ANN models are developed by applying the same variables for each combination. The results clearly show that the ANN models have the lowest mean square error value than those of the statistical models. Similarly, the AIC values of the ANN models are smaller to those of the regression models for all the combinations. Consequently, the ANN models have better statistical performances than statistical models for estimating collision frequency. The ANN models presented in this paper are recommended for evaluating the safety impacts 3D alignment elements on horizontal tangents.

Keywords: Collision frequency, horizontal tangent, 3D two-lane highway, negative binomial, zero inflated Poisson, artificial neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1633
1255 Predicting Bridge Pier Scour Depth with SVM

Authors: Arun Goel

Abstract:

Prediction of maximum local scour is necessary for the safety and economical design of the bridges. A number of equations have been developed over the years to predict local scour depth using laboratory data and a few pier equations have also been proposed using field data. Most of these equations are empirical in nature as indicated by the past publications. In this paper attempts have been made to compute local depth of scour around bridge pier in dimensional and non-dimensional form by using linear regression, simple regression and SVM (Poly & Rbf) techniques along with few conventional empirical equations. The outcome of this study suggests that the SVM (Poly & Rbf) based modeling can be employed as an alternate to linear regression, simple regression and the conventional empirical equations in predicting scour depth of bridge piers. The results of present study on the basis of non-dimensional form of bridge pier scour indicate the improvement in the performance of SVM (Poly & Rbf) in comparison to dimensional form of scour.

Keywords: Modeling, pier scour, regression, prediction, SVM (Poly & Rbf kernels).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1543
1254 Analytical Authentication of Butter Using Fourier Transform Infrared Spectroscopy Coupled with Chemometrics

Authors: M. Bodner, M. Scampicchio

Abstract:

Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the content of margarine in adulterated butter samples was investigated. Fingerprinting region (1400-800 cm–1) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprinting region shows a clustering of the two sample types. All samples were classified in their rightful class by SIMCA approach; however, nine adulterated samples (between 1% and 30% w/w of margarine) were classified as belonging both at the butter class and at the non-butter one. In the two-class PLS-DA model’s (R2 = 0.73, RMSEP, Root Mean Square Error of Prediction = 0.26% w/w) sensitivity was 71.4% and Positive Predictive Value (PPV) 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, PLS-R model (R2 = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. Results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to identify pure butter samples from adulterated ones and to determine the grade of adulteration of margarine in butter samples.

Keywords: Adulterated butter, margarine, PCA, PLS-DA, PLS-R, SIMCA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 780
1253 Computational Aspects of Regression Analysis of Interval Data

Authors: Michal Cerny

Abstract:

We consider linear regression models where both input data (the values of independent variables) and output data (the observations of the dependent variable) are interval-censored. We introduce a possibilistic generalization of the least squares estimator, so called OLS-set for the interval model. This set captures the impact of the loss of information on the OLS estimator caused by interval censoring and provides a tool for quantification of this effect. We study complexity-theoretic properties of the OLS-set. We also deal with restricted versions of the general interval linear regression model, in particular the crisp input – interval output model. We give an argument that natural descriptions of the OLS-set in the crisp input – interval output cannot be computed in polynomial time. Then we derive easily computable approximations for the OLS-set which can be used instead of the exact description. We illustrate the approach by an example.

Keywords: Linear regression, interval-censored data, computational complexity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1469
1252 Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison

Authors: Xiangtuo Chen, Paul-Henry Cournéde

Abstract:

Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find out efficient way to predict the yield of corn based meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to the different modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with their environment as dynamical systems. But the calibration process of the dynamic system comes up with much difficulty, because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is to propose a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of datasets (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction is free of the complex biophysical process. But it has some strict requirements about the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods, methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-folds cross-validation process and two accuracy metrics: root mean square error of prediction(RMSEP), mean absolute error of prediction(MAEP) were used to evaluate the crop prediction capacity. The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method to calibrate the mechanistic model from dataset easy to access offers several side-perspectives. The mechanistic model can potentially help to underline the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.

Keywords: Crop yield prediction, crop model, sensitivity analysis, paramater estimation, particle swarm optimization, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1174
1251 Numerical Simulations of Shear Driven Square and Triangular Cavity by Using Lattice Boltzmann Scheme

Authors: A. M. Fudhail, N. A. C. Sidik, M. Z. M. Rody, H. M. Zahir, M.T. Musthafah

Abstract:

In this paper, fluid flow patterns of steady incompressible flow inside shear driven cavity are studied. The numerical simulations are conducted by using lattice Boltzmann method (LBM) for different Reynolds numbers. In order to simulate the flow, derivation of macroscopic hydrodynamics equations from the continuous Boltzmann equation need to be performed. Then, the numerical results of shear-driven flow inside square and triangular cavity are compared with results found in literature review. Present study found that flow patterns are affected by the geometry of the cavity and the Reynolds numbers used.

Keywords: Lattice Boltzmann method, shear driven cavity, square cavity, triangular cavity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1959