Search results for: Multivariate regression.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 876

Search results for: Multivariate regression.

756 Multi-Linear Regression Based Prediction of Mass Transfer by Multiple Plunging Jets

Authors: S. Deswal, M. Pal

Abstract:

The paper aims to compare the performance of vertical and inclined multiple plunging jets and to model and predict their mass transfer capacity by multi-linear regression based approach. The multiple vertical plunging jets have jet impact angle of θ = 90O; whereas, multiple inclined plunging jets have jet impact angle of θ = 60O. The results of the study suggests that mass transfer is higher for multiple jets, and inclined multiple plunging jets have up to 1.6 times higher mass transfer than vertical multiple plunging jets under similar conditions. The derived relationship, based on multi-linear regression approach, has successfully predicted the volumetric mass transfer coefficient (KLa) from operational parameters of multiple plunging jets with a correlation coefficient of 0.973, root mean square error of 0.002 and coefficient of determination of 0.946. The results suggests that predicted overall mass transfer coefficient is in good agreement with actual experimental values; thereby, suggesting the utility of derived relationship based on multi-linear regression based approach and can be successfully employed in modeling mass transfer by multiple plunging jets.

Keywords: Mass transfer, multiple plunging jets, multi-linear regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2150
755 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race

Authors: Joonas Pääkkönen

Abstract:

In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.

Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 756
754 Fault Detection of Drinking Water Treatment Process Using PCA and Hotelling's T2 Chart

Authors: Joval P George, Dr. Zheng Chen, Philip Shaw

Abstract:

This paper deals with the application of Principal Component Analysis (PCA) and the Hotelling-s T2 Chart, using data collected from a drinking water treatment process. PCA is applied primarily for the dimensional reduction of the collected data. The Hotelling-s T2 control chart was used for the fault detection of the process. The data was taken from a United Utilities Multistage Water Treatment Works downloaded from an Integrated Program Management (IPM) dashboard system. The analysis of the results show that Multivariate Statistical Process Control (MSPC) techniques such as PCA, and control charts such as Hotelling-s T2, can be effectively applied for the early fault detection of continuous multivariable processes such as Drinking Water Treatment. The software package SIMCA-P was used to develop the MSPC models and Hotelling-s T2 Chart from the collected data.

Keywords: Principal component analysis, hotelling's t2 chart, multivariate statistical process control, drinking water treatment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2728
753 Drainage Prediction for Dam using Fuzzy Support Vector Regression

Authors: S. Wiriyarattanakun, A. Ruengsiriwatanakun, S. Noimanee

Abstract:

The drainage Estimating is an important factor in dam management. In this paper, we use fuzzy support vector regression (FSVR) to predict the drainage of the Sirikrit Dam at Uttaradit province, Thailand. The results show that the FSVR is a suitable method in drainage estimating.

Keywords: Drainage Estimation, Prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1220
752 On Estimating the Headcount Index by Using the Logistic Regression Estimator

Authors: Encarnación Álvarez, Rosa M. García-Fernández, Juan F. Muñoz, Francisco J. Blanco-Encomienda

Abstract:

The problem of estimating a proportion has important applications in the field of economics, and in general, in many areas such as social sciences. A common application in economics is the estimation of the headcount index. In this paper, we define the general headcount index as a proportion. Furthermore, we introduce a new quantitative method for estimating the headcount index. In particular, we suggest to use the logistic regression estimator for the problem of estimating the headcount index. Assuming a real data set, results derived from Monte Carlo simulation studies indicate that the logistic regression estimator can be more accurate than the traditional estimator of the headcount index.

Keywords: Poverty line, poor, risk of poverty, sample, Monte Carlo simulations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2045
751 Decision Trees for Predicting Risk of Mortality using Routinely Collected Data

Authors: Tessy Badriyah, Jim S. Briggs, Dave R. Prytherch

Abstract:

It is well known that Logistic Regression is the gold standard method for predicting clinical outcome, especially predicting risk of mortality. In this paper, the Decision Tree method has been proposed to solve specific problems that commonly use Logistic Regression as a solution. The Biochemistry and Haematology Outcome Model (BHOM) dataset obtained from Portsmouth NHS Hospital from 1 January to 31 December 2001 was divided into four subsets. One subset of training data was used to generate a model, and the model obtained was then applied to three testing datasets. The performance of each model from both methods was then compared using calibration (the χ2 test or chi-test) and discrimination (area under ROC curve or c-index). The experiment presented that both methods have reasonable results in the case of the c-index. However, in some cases the calibration value (χ2) obtained quite a high result. After conducting experiments and investigating the advantages and disadvantages of each method, we can conclude that Decision Trees can be seen as a worthy alternative to Logistic Regression in the area of Data Mining.

Keywords: Decision Trees, Logistic Regression, clinical outcome, risk of mortality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2474
750 Two New Relative Efficiencies of Linear Weighted Regression

Authors: Shuimiao Wan, Chao Yuan, Baoguang Tian

Abstract:

In statistics parameter theory, usually the parameter estimations have two kinds, one is the least-square estimation (LSE), and the other is the best linear unbiased estimation (BLUE). Due to the determining theorem of minimum variance unbiased estimator (MVUE), the parameter estimation of BLUE in linear model is most ideal. But since the calculations are complicated or the covariance is not given, people are hardly to get the solution. Therefore, people prefer to use LSE rather than BLUE. And this substitution will take some losses. To quantize the losses, many scholars have presented many kinds of different relative efficiencies in different views. For the linear weighted regression model, this paper discusses the relative efficiencies of LSE of β to BLUE of β. It also defines two new relative efficiencies and gives their lower bounds.

Keywords: Linear weighted regression, Relative efficiency, Lower bound, Parameter estimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2067
749 Forecasting the Influences of Information and Communication Technology on the Structural Changes of Japanese Industrial Sectors: A Study Using Statistical Analysis

Authors: Ubaidillah Zuhdi, Shunsuke Mori, Kazuhisa Kamegai

Abstract:

The purpose of this study is to forecast the influences of information and communication technology (ICT) on the structural changes of Japanese economies. In this study, input-output (IO) and statistical approaches are used as analysis instruments. More specifically, this study employs Leontief IO coefficients and constrained multivariate regression (CMR) model in order to achieve the purpose. The periods of initial and forecast in this study are 2005 and 2015, respectively. In this study, ICT is represented by ICT capital stocks. This study conducts two levels of analysis, namely macro and micro. The results of macro level analysis show that the dynamics of Japanese economies on the forecast period, relative to the initial period, are not so high. We focus on (1) commerce, (2) business services and office supplies, and (3) personal services sectors when conducting the analysis of the micro level. Further, we analyze its specific IO coefficients when doing this analysis. The results of the analysis explain that ICT gives a strong influence on the changes of these coefficients from initial to forecast periods.

Keywords: Forecast, ICT, Structural changes, Japanese economies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1642
748 Assessment of EU Competitiveness Factors by Multivariate Methods

Authors: L. Melecký

Abstract:

Measurement of competitiveness between countries or regions is an important topic of many economic analysis and scientific papers. In European Union (EU), there is no mainstream approach of competitiveness evaluation and measuring. There are many opinions and methods of measurement and evaluation of competitiveness between states or regions at national and European level. The methods differ in structure of using the indicators of competitiveness and ways of their processing. The aim of the paper is to analyze main sources of competitive potential of the EU Member States with the help of Factor analysis (FA) and to classify the EU Member States to homogeneous units (clusters) according to the similarity of selected indicators of competitiveness factors by Cluster analysis (CA) in reference years 2000 and 2011. The theoretical part of the paper is devoted to the fundamental bases of competitiveness and the methodology of FA and CA methods. The empirical part of the paper deals with the evaluation of competitiveness factors in the EU Member States and cluster comparison of evaluated countries by cluster analysis. 

Keywords: Competitiveness, cluster analysis, EU, factor analysis, multivariate methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1993
747 Multivariate Statistical Analysis of Decathlon Performance Results in Olympic Athletes (1988-2008)

Authors: Jaebum Park, Vladimir M. Zatsiorsky

Abstract:

The performance results of the athletes competed in the 1988-2008 Olympic Games were analyzed (n = 166). The data were obtained from the IAAF official protocols. In the principal component analysis, the first three principal components explained 70% of the total variance. In the 1st principal component (with 43.1% of total variance explained) the largest factor loadings were for 100m (0.89), 400m (0.81), 110m hurdle run (0.76), and long jump (–0.72). This factor can be interpreted as the 'sprinting performance'. The loadings on the 2nd factor (15.3% of the total variance) presented a counter-intuitive throwing-jumping combination: the highest loadings were for throwing events (javelin throwing 0.76; shot put 0.74; and discus throwing 0.73) and also for jumping events (high jump 0.62; pole vaulting 0.58). On the 3rd factor (11.6% of total variance), the largest loading was for 1500 m running (0.88); all other loadings were below 0.4.

Keywords: Decathlon, principal component analysis, Olympic Games, multivariate statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2762
746 The Strengths and Limitations of the Statistical Modeling of Complex Social Phenomenon: Focusing on SEM, Path Analysis, or Multiple Regression Models

Authors: Jihye Jeon

Abstract:

This paper analyzes the conceptual framework of three statistical methods, multiple regression, path analysis, and structural equation models. When establishing research model of the statistical modeling of complex social phenomenon, it is important to know the strengths and limitations of three statistical models. This study explored the character, strength, and limitation of each modeling and suggested some strategies for accurate explaining or predicting the causal relationships among variables. Especially, on the studying of depression or mental health, the common mistakes of research modeling were discussed.

Keywords: Multiple regression, path analysis, structural equation models, statistical modeling, social and psychological phenomenon.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9094
745 Ensembling Adaptively Constructed Polynomial Regression Models

Authors: Gints Jekabsons

Abstract:

The approach of subset selection in polynomial regression model building assumes that the chosen fixed full set of predefined basis functions contains a subset that is sufficient to describe the target relation sufficiently well. However, in most cases the necessary set of basis functions is not known and needs to be guessed – a potentially non-trivial (and long) trial and error process. In our research we consider a potentially more efficient approach – Adaptive Basis Function Construction (ABFC). It lets the model building method itself construct the basis functions necessary for creating a model of arbitrary complexity with adequate predictive performance. However, there are two issues that to some extent plague the methods of both the subset selection and the ABFC, especially when working with relatively small data samples: the selection bias and the selection instability. We try to correct these issues by model post-evaluation using Cross-Validation and model ensembling. To evaluate the proposed method, we empirically compare it to ABFC methods without ensembling, to a widely used method of subset selection, as well as to some other well-known regression modeling methods, using publicly available data sets.

Keywords: Basis function construction, heuristic search, modelensembles, polynomial regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1629
744 Harmonics Elimination in Multilevel Inverter Using Linear Fuzzy Regression

Authors: A. K. Al-Othman, H. A. Al-Mekhaizim

Abstract:

Multilevel inverters supplied from equal and constant dc sources almost don-t exist in practical applications. The variation of the dc sources affects the values of the switching angles required for each specific harmonic profile, as well as increases the difficulty of the harmonic elimination-s equations. This paper presents an extremely fast optimal solution of harmonic elimination of multilevel inverters with non-equal dc sources using Tanaka's fuzzy linear regression formulation. A set of mathematical equations describing the general output waveform of the multilevel inverter with nonequal dc sources is formulated. Fuzzy linear regression is then employed to compute the optimal solution set of switching angles.

Keywords: Multilevel converters, harmonics, pulse widthmodulation (PWM), optimal control.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1754
743 Burning Rate Response of Solid Fuels in Laminar Boundary Layer

Authors: A. M. Tahsini

Abstract:

Solid fuel transient burning behavior under oxidizer gas flow is numerically investigated. It is done using analysis of the regression rate responses to the imposed sudden and oscillatory variation at inflow properties. The conjugate problem is considered by simultaneous solution of flow and solid phase governing equations to compute the fuel regression rate. The advection upstream splitting method is used as flow computational scheme in finite volume method. The ignition phase is completely simulated to obtain the exact initial condition for response analysis. The results show that the transient burning effects which lead to the combustion instabilities and intermittent extinctions could be observed in solid fuels as the solid propellants.

Keywords: Extinction, Oscillation, Regression rate, Response, Transient burning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2317
742 A Model for Test Case Selection in the Software-Development Life Cycle

Authors: Adtha Lawanna

Abstract:

Software maintenance is one of the essential processes of Software-Development Life Cycle. The main philosophies of retaining software concern the improvement of errors, the revision of codes, the inhibition of future errors, and the development in piece and capacity. While the adjustment has been employing, the software structure has to be retested to an upsurge a level of assurance that it will be prepared due to the requirements. According to this state, the test cases must be considered for challenging the revised modules and the whole software. A concept of resolving this problem is ongoing by regression test selection such as the retest-all selections, random/ad-hoc selection and the safe regression test selection. Particularly, the traditional techniques concern a mapping between the test cases in a test suite and the lines of code it executes. However, there are not only the lines of code as one of the requirements that can affect the size of test suite but including the number of functions and faulty versions. Therefore, a model for test case selection is developed to cover those three requirements by the integral technique which can produce the smaller size of the test cases when compared with the traditional regression selection techniques.

Keywords: Software maintenance, regression test selection, test case.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1656
741 A Model for Test Case Selection in the Software-Development Life Cycle

Authors: Adtha Lawanna

Abstract:

Software maintenance is one of the essential processes of Software-Development Life Cycle. The main philosophies of retaining software concern the improvement of errors, the revision of codes, the inhibition of future errors, and the development in piece and capacity. While the adjustment has been employing, the software structure has to be retested to an upsurge a level of assurance that it will be prepared due to the requirements. According to this state, the test cases must be considered for challenging the revised modules and the whole software. A concept of resolving this problem is ongoing by regression test selection such as the retest-all selections, random/ad-hoc selection and the safe regression test selection. Particularly, the traditional techniques concern a mapping between the test cases in a test suite and the lines of code it executes. However, there are not only the lines of code as one of the requirements that can affect the size of test suite but including the number of functions and faulty versions. Therefore, a model for test case selection is developed to cover those three requirements by the integral technique which can produce the smaller size of the test cases when compared with the traditional regression selection techniques.

Keywords: Software maintenance, regression test selection, test case.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1551
740 Mathematical Modeling to Predict Surface Roughness in CNC Milling

Authors: Ab. Rashid M.F.F., Gan S.Y., Muhammad N.Y.

Abstract:

Surface roughness (Ra) is one of the most important requirements in machining process. In order to obtain better surface roughness, the proper setting of cutting parameters is crucial before the process take place. This research presents the development of mathematical model for surface roughness prediction before milling process in order to evaluate the fitness of machining parameters; spindle speed, feed rate and depth of cut. 84 samples were run in this study by using FANUC CNC Milling α-Τ14ιE. Those samples were randomly divided into two data sets- the training sets (m=60) and testing sets(m=24). ANOVA analysis showed that at least one of the population regression coefficients was not zero. Multiple Regression Method was used to determine the correlation between a criterion variable and a combination of predictor variables. It was established that the surface roughness is most influenced by the feed rate. By using Multiple Regression Method equation, the average percentage deviation of the testing set was 9.8% and 9.7% for training data set. This showed that the statistical model could predict the surface roughness with about 90.2% accuracy of the testing data set and 90.3% accuracy of the training data set.

Keywords: Surface roughness, regression analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2075
739 Stature Prediction Model Based On Hand Anthropometry

Authors: Arunesh Chandra, Pankaj Chandna, Surinder Deswal, Rajesh Kumar Mishra, Rajender Kumar

Abstract:

The arm length, hand length, hand breadth and middle finger length of 1540 right-handed industrial workers of Haryana state was used to assess the relationship between the upper limb dimensions and stature. Initially, the data were analyzed using basic univariate analysis and independent t-tests; then simple and multiple linear regression models were used to estimate stature using SPSS (version 17). There was a positive correlation between upper limb measurements (hand length, hand breadth, arm length and middle finger length) and stature (p < 0.01), which was highest for hand length. The accuracy of stature prediction ranged from ± 54.897 mm to ± 58.307 mm. The use of multiple regression equations gave better results than simple regression equations. This study provides new forensic standards for stature estimation from the upper limb measurements of male industrial workers of Haryana (India). The results of this research indicate that stature can be determined using hand dimensions with accuracy, when only upper limb is available due to any reasons likewise explosions, train/plane crashes, mutilated bodies, etc. The regression formula derived in this study will be useful for anatomists, archaeologists, anthropologists, design engineers and forensic scientists for fairly prediction of stature using regression equations.

Keywords: Anthropometric dimensions, Forensic identification, Industrial workers, Stature prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2902
738 Use of Regression Analysis in Determining the Length of Plastic Hinge in Reinforced Concrete Columns

Authors: Mehmet Alpaslan Köroğlu, Musa Hakan Arslan, Muslu Kazım Körez

Abstract:

Basic objective of this study is to create a regression analysis method that can estimate the length of a plastic hinge which is an important design parameter, by making use of the outcomes of (lateral load-lateral displacement hysteretic curves) the experimental studies conducted for the reinforced square concrete columns. For this aim, 170 different square reinforced concrete column tests results have been collected from the existing literature. The parameters which are thought affecting the plastic hinge length such as crosssection properties, features of material used, axial loading level, confinement of the column, longitudinal reinforcement bars in the columns etc. have been obtained from these 170 different square reinforced concrete column tests. In the study, when determining the length of plastic hinge, using the experimental test results, a regression analysis have been separately tested and compared with each other. In addition, the outcome of mentioned methods on determination of plastic hinge length of the reinforced concrete columns has been compared to other methods available in the literature.

Keywords: Columns, plastic hinge length, regression analysis, reinforced concrete.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4163
737 Optimal Maintenance Policy for a Partially Observable Two-Unit System

Authors: Leila Jafari, Viliam Makis, Akram Khaleghei G.B.

Abstract:

In this paper, we present a maintenance model of a two-unit series system with economic dependence. Unit#1 which is considered to be more expensive and more important, is subject to condition monitoring (CM) at equidistant, discrete time epochs and unit#2, which is not subject to CM has a general lifetime distribution. The multivariate observation vectors obtained through condition monitoring carry partial information about the hidden state of unit#1, which can be in a healthy or a warning state while operating. Only the failure state is assumed to be observable for both units. The objective is to find an optimal opportunistic maintenance policy minimizing the long-run expected average cost per unit time. The problem is formulated and solved in the partially observable semi-Markov decision process framework. An effective computational algorithm for finding the optimal policy and the minimum average cost is developed, illustrated by a numerical example.

Keywords: Condition-Based Maintenance, Semi-Markov Decision Process, Multivariate Bayesian Control Chart, Partially Observable System, Two-unit System.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2250
736 Modeling Aeration of Sharp Crested Weirs by Using Support Vector Machines

Authors: Arun Goel

Abstract:

The present paper attempts to investigate the prediction of air entrainment rate and aeration efficiency of a free overfall jets issuing from a triangular sharp crested weir by using regression based modelling. The empirical equations, Support vector machine (polynomial and radial basis function) models and the linear regression techniques were applied on the triangular sharp crested weirs relating the air entrainment rate and the aeration efficiency to the input parameters namely drop height, discharge, and vertex angle. It was observed that there exists a good agreement between the measured values and the values obtained using empirical equations, Support vector machine (Polynomial and rbf) models and the linear regression techniques. The test results demonstrated that the SVM based (Poly & rbf) model also provided acceptable prediction of the measured values with reasonable accuracy along with empirical equations and linear regression techniques in modelling the air entrainment rate and the aeration efficiency of a free overfall jets issuing from triangular sharp crested weir. Further sensitivity analysis has also been performed to study the impact of input parameter on the output in terms of air entrainment rate and aeration efficiency.

Keywords: Air entrainment rate, dissolved oxygen, regression, SVM, weir.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1912
735 A Research on Inference from Multiple Distance Variables in Hedonic Regression – Focus on Three Variables

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban context, urban nodes such as amenity or hazard will certainly affect house price, while classic hedonic analysis will employ distance variables measured from each urban nodes. However, effects from distances to facilities on house prices generally do not represent the true price of the property. Distance variables measured on the same surface are suffering a problem called multicollinearity, which is usually presented as magnitude variance and mean value in regression, errors caused by instability. In this paper, we provided a theoretical framework to identify and gather the data with less bias, and also provided specific sampling method on locating the sample region to avoid the spatial multicollinerity problem in three distance variable’s case.

Keywords: Hedonic regression, urban node, distance variables, multicollinerity, collinearity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1941
734 In situ Real-Time Multivariate Analysis of Methanolysis Monitoring of Sunflower Oil Using FTIR

Authors: Pascal Mwenge, Tumisang Seodigeng

Abstract:

The combination of world population and the third industrial revolution led to high demand for fuels. On the other hand, the decrease of global fossil 8fuels deposits and the environmental air pollution caused by these fuels has compounded the challenges the world faces due to its need for energy. Therefore, new forms of environmentally friendly and renewable fuels such as biodiesel are needed. The primary analytical techniques for methanolysis yield monitoring have been chromatography and spectroscopy, these methods have been proven reliable but are more demanding, costly and do not provide real-time monitoring. In this work, the in situ monitoring of biodiesel from sunflower oil using FTIR (Fourier Transform Infrared) has been studied; the study was performed using EasyMax Mettler Toledo reactor equipped with a DiComp (Diamond) probe. The quantitative monitoring of methanolysis was performed by building a quantitative model with multivariate calibration using iC Quant module from iC IR 7.0 software. 15 samples of known concentrations were used for the modelling which were taken in duplicate for model calibration and cross-validation, data were pre-processed using mean centering and variance scale, spectrum math square root and solvent subtraction. These pre-processing methods improved the performance indexes from 7.98 to 0.0096, 11.2 to 3.41, 6.32 to 2.72, 0.9416 to 0.9999, RMSEC, RMSECV, RMSEP and R2Cum, respectively. The R2 value of 1 (training), 0.9918 (test), 0.9946 (cross-validation) indicated the fitness of the model built. The model was tested against univariate model; small discrepancies were observed at low concentration due to unmodelled intermediates but were quite close at concentrations above 18%. The software eliminated the complexity of the Partial Least Square (PLS) chemometrics. It was concluded that the model obtained could be used to monitor methanol of sunflower oil at industrial and lab scale.

Keywords: Biodiesel, calibration, chemometrics, FTIR, methanolysis, multivariate analysis, transesterification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 856
733 Variation in the Traditional Knowledge of Curcuma longa L. in North-Eastern Algeria

Authors: A. Bouzabata, A. Boukhari

Abstract:

Curcuma longa L. (Zingiberaceae), commonly known as turmeric, has a long history of traditional uses for culinary purposes as a spice and a food colorant. The present study aimed to document the ethnobotanical knowledge about Curcuma longa, and to assess the variation in the herbalists’ experience in Northeastern Algeria. Data were collected using semi-structured questionnaires and direct interviews with 30 herbalists. Ethnobotanical indices, including the fidelity level (FL%), the relative frequency citation (RFC), and use value (UV) were determined by quantitative methods. Diversity in the level of knowledge was analyzed using univariate, non-parametric, and multivariate statistical methods. Three main categories of uses were recorded for C. longa: for food, for medicine, and for cosmetic purposes. As a medicine, turmeric was used for the treatment of gastrointestinal, dermatological, and hepatic diseases. Medicinal and food uses were correlated with both forms of preparation (rhizome and powder). The age group did not influence the use. Multivariate analyses showed a significant variation in traditional knowledge, associated with the use value, origin, quality, and efficacy of the drug. The findings suggested that the geographical origin of C. longa affected the use in Algeria.

Keywords: Curcuma longa, curcuma indices, ethnobotanical knowledge, variation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2553
732 Implementation of Response Surface Methodology using in Small Brown Rice Peeling Machine: Part I

Authors: S. Bangphan, P. Bangphan, T.Boonkang

Abstract:

Implementation of response surface methodology (RSM) was employed to study the effects of two factor (rubber clearance and round per minute) in brown rice peeling machine of The optimal BROKENS yield (19.02, average of three repeats),.The optimized composition derived from RSM regression was analyzed using Regression analysis and Analysis of Variance (ANOVA). At a significant level α = 0.05, the values of Regression coefficient, R 2 (adj)were 97.35 % and standard deviation were 1.09513. The independent variables are initial rubber clearance, and round per minute parameters namely. The investigating responses are final rubber clearance, and round per minute (RPM). The restriction of the optimization is the designated.

Keywords: Brown rice, Response surface methodology(RSM), Rubber clearance, Round per minute (RPM), Peeling machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1915
731 Climate Change in Albania and Its Effect on Cereal Yield

Authors: L. Basha, E. Gjika

Abstract:

This study is focused on analyzing climate change in Albania and its potential effects on cereal yields. Initially, monthly temperature and rainfalls in Albania were studied for the period 1960-2021. Climacteric variables are important variables when trying to model cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. The multiple linear regression analysis and lasso regression method are applied to the data between cereal yield and each independent variable: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, data follow a normal distribution, and there is a low correlation between factors, so we do not have the problem of multicollinearity. Machine learning methods, such as Random Forest (RF), are used to predict cereal yield responses to climacteric and other variables. RF showed high accuracy compared to the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield. The coefficients of fertilizer consumption, arable land, and land under cereal production are positively affecting production. Our results show that the RF method is an effective and versatile machine-learning method for cereal yield prediction compared to the other two methods: multiple linear regression and lasso regression method.

Keywords: Cereal yield, climate change, machine learning, multiple regression model, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 93
730 Clustering Mixed Data Using Non-normal Regression Tree for Process Monitoring

Authors: Youngji Yoo, Cheong-Sool Park, Jun Seok Kim, Young-Hak Lee, Sung-Shick Kim, Jun-Geol Baek

Abstract:

In the semiconductor manufacturing process, large amounts of data are collected from various sensors of multiple facilities. The collected data from sensors have several different characteristics due to variables such as types of products, former processes and recipes. In general, Statistical Quality Control (SQC) methods assume the normality of the data to detect out-of-control states of processes. Although the collected data have different characteristics, using the data as inputs of SQC will increase variations of data, require wide control limits, and decrease performance to detect outof- control. Therefore, it is necessary to separate similar data groups from mixed data for more accurate process control. In the paper, we propose a regression tree using split algorithm based on Pearson distribution to handle non-normal distribution in parametric method. The regression tree finds similar properties of data from different variables. The experiments using real semiconductor manufacturing process data show improved performance in fault detecting ability.

Keywords: Semiconductor, non-normal mixed process data, clustering, Statistical Quality Control (SQC), regression tree, Pearson distribution system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1730
729 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously; there is a large number of kinetic parameters involved and sometimes the chemical and physical phenomena for mixtures involving polymers are poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of machine learning field. They have the ability to learn the trends of a process without any knowledge about its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which can select more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.

Keywords: Adaptive sampling, batch bulk methyl methacrylate polymerization, large margin nearest neighbor regression, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1349
728 Rapid Study on Feature Extraction and Classification Models in Healthcare Applications

Authors: S. Sowmyayani

Abstract:

The advancement of computer-aided design helps the medical force and security force. Some applications include biometric recognition, elderly fall detection, face recognition, cancer recognition, tumor recognition, etc. This paper deals with different machine learning algorithms that are more generically used for any health care system. The most focused problems are classification and regression. With the rise of big data, machine learning has become particularly important for solving problems. Machine learning uses two types of techniques: supervised learning and unsupervised learning. The former trains a model on known input and output data and predicts future outputs. Classification and regression are supervised learning techniques. Unsupervised learning finds hidden patterns in input data. Clustering is one such unsupervised learning technique. The above-mentioned models are discussed briefly in this paper.

Keywords: Supervised learning, unsupervised learning, regression, neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 273
727 Modeling Oxygen-transfer by Multiple Plunging Jets using Support Vector Machines and Gaussian Process Regression Techniques

Authors: Surinder Deswal

Abstract:

The paper investigates the potential of support vector machines and Gaussian process based regression approaches to model the oxygen–transfer capacity from experimental data of multiple plunging jets oxygenation systems. The results suggest the utility of both the modeling techniques in the prediction of the overall volumetric oxygen transfer coefficient (KLa) from operational parameters of multiple plunging jets oxygenation system. The correlation coefficient root mean square error and coefficient of determination values of 0.971, 0.002 and 0.945 respectively were achieved by support vector machine in comparison to values of 0.960, 0.002 and 0.920 respectively achieved by Gaussian process regression. Further, the performances of both these regression approaches in predicting the overall volumetric oxygen transfer coefficient was compared with the empirical relationship for multiple plunging jets. A comparison of results suggests that support vector machines approach works well in comparison to both empirical relationship and Gaussian process approaches, and could successfully be employed in modeling oxygen-transfer.

Keywords: Oxygen-transfer, multiple plunging jets, support vector machines, Gaussian process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1591