Search results for: Multi-parameter logistic regression model
7749 Enhancing Spatial Interpolation: A Multi-Layer Inverse Distance Weighting Model for Complex Regression and Classification Tasks in Spatial Data Analysis
Authors: Yakin Hajlaoui, Richard Labib, Jean-Franc¸ois Plante, Michel Gamache
Abstract:
This study presents the Multi-Layer Inverse Distance Weighting Model (ML-IDW), inspired by the mathematical formulation of both multi-layer neural networks (ML-NNs) and Inverse Distance Weighting model (IDW). ML-IDW leverages ML-NNs’ processing capabilities, characterized by compositions of learnable non-linear functions applied to input features, and incorporates IDW’s ability to learn anisotropic spatial dependencies, presenting a promising solution for nonlinear spatial interpolation and learning from complex spatial data. We employ gradient descent and backpropagation to train ML-IDW. The performance of the proposed model is compared against conventional spatial interpolation models such as Kriging and standard IDW on regression and classification tasks using simulated spatial datasets of varying complexity. Our results highlight the efficacy of ML-IDW, particularly in handling complex spatial dataset, exhibiting lower mean square error in regression and higher F1 score in classification.
Keywords: Deep Learning, Multi-Layer Neural Networks, Gradient Descent, Spatial Interpolation, Inverse Distance Weighting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 347748 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information
Authors: Haifeng Wang, Haili Zhang
Abstract:
Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) who are in need of a data-based and analytical approach to stock proper movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data and their test error-in-sample were compared. Comparison of error-out-sample was also made under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use machine learning approach to predict customers’ preferences with a small data set and design prediction tools for these enterprises.Keywords: Computational social science, movie preference, machine learning, SVM.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16497747 Functional Decomposition Based Effort Estimation Model for Software-Intensive Systems
Authors: Nermin Sökmen
Abstract:
An effort estimation model is needed for softwareintensive projects that consist of hardware, embedded software or some combination of the two, as well as high level software solutions. This paper first focuses on functional decomposition techniques to measure functional complexity of a computer system and investigates its impact on system development effort. Later, it examines effects of technical difficulty and design team capability factors in order to construct the best effort estimation model. With using traditional regression analysis technique, the study develops a system development effort estimation model which takes functional complexity, technical difficulty and design team capability factors as input parameters. Finally, the assumptions of the model are tested.
Keywords: Functional complexity, functional decomposition, development effort, technical difficulty, design team capability, regression analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22787746 Climate Change in Albania and Its Effect on Cereal Yield
Abstract:
This study is focused on analyzing climate change in Albania and its potential effects on cereal yields. Initially, monthly temperature and rainfalls in Albania were studied for the period 1960-2021. Climacteric variables are important variables when trying to model cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. The multiple linear regression analysis and lasso regression method are applied to the data between cereal yield and each independent variable: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, data follow a normal distribution, and there is a low correlation between factors, so we do not have the problem of multicollinearity. Machine learning methods, such as Random Forest (RF), are used to predict cereal yield responses to climacteric and other variables. RF showed high accuracy compared to the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield. The coefficients of fertilizer consumption, arable land, and land under cereal production are positively affecting production. Our results show that the RF method is an effective and versatile machine-learning method for cereal yield prediction compared to the other two methods: multiple linear regression and lasso regression method.
Keywords: Cereal yield, climate change, machine learning, multiple regression model, random forest.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2477745 Exploring the Determinants for Successful Collaboration of SMEs
Authors: Heeyong Noh, Sungjoo Lee
Abstract:
The goal of this research is discovering the determinants of the success or failure of external cooperation in small and medium enterprises (SMEs). For this, a survey was given to 190 SMEs that experienced external cooperation within the last 3 years. A logistic regression model was used to derive organizational or strategic characteristics that significantly influence whether external collaboration of domestic SMEs is successful or not. Results suggest that research and development (R&D) features in general characteristics (both idea creation and discovering market opportunities) that focused on and emphasized indirected-market stakeholders (such as complementary companies and affiliates) and strategies in innovative strategic characteristics raise the probability of successful external cooperation. This can be used meaningfully to build a policy or strategy for inducing successful external cooperation or to understand the innovation of SMEs.Keywords: External collaboration, Innovation strategy, Logisticregression, SMEs.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21647744 A Statistical Model for the Geotechnical Parameters of Cement-Stabilised Hightown’s Soft Soil: A Case Stufy of Liverpool, UK
Authors: Hassnen M. Jafer, Khalid S. Hashim, W. Atherton, Ali W. Alattabi
Abstract:
This study investigates the effect of two important parameters (length of curing period and percentage of the added binder) on the strength of soil treated with OPC. An intermediate plasticity silty clayey soil with medium organic content was used in this study. This soft soil was treated with different percentages of a commercially available cement type 32.5-N. laboratory experiments were carried out on the soil treated with 0, 1.5, 3, 6, 9, and 12% OPC by the dry weight to determine the effect of OPC on the compaction parameters, consistency limits, and the compressive strength. Unconfined compressive strength (UCS) test was carried out on cement-treated specimens after exposing them to different curing periods (1, 3, 7, 14, 28, and 90 days). The results of UCS test were used to develop a non-linear multi-regression model to find the relationship between the predicted and the measured maximum compressive strength of the treated soil (qu). The results indicated that there was a significant improvement in the index of plasticity (IP) by treating with OPC; IP was decreased from 20.2 to 14.1 by using 12% of OPC; this percentage was enough to increase the UCS of the treated soil up to 1362 kPa after 90 days of curing. With respect to the statistical model of the predicted qu, the results showed that the regression coefficients (R2) was equal to 0.8534 which indicates a good reproducibility for the constructed model.Keywords: Cement admixtures, soft soil stabilisation, geotechnical parameters, unconfined compressive strength, multi-regression model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13917743 Artificial Neural Network Modeling of a Closed Loop Pulsating Heat Pipe
Authors: Vipul M. Patel, Hemantkumar B. Mehta
Abstract:
Technological innovations in electronic world demand novel, compact, simple in design, less costly and effective heat transfer devices. Closed Loop Pulsating Heat Pipe (CLPHP) is a passive phase change heat transfer device and has potential to transfer heat quickly and efficiently from source to sink. Thermal performance of a CLPHP is governed by various parameters such as number of U-turns, orientations, input heat, working fluids and filling ratio. The present paper is an attempt to predict the thermal performance of a CLPHP using Artificial Neural Network (ANN). Filling ratio and heat input are considered as input parameters while thermal resistance is set as target parameter. Types of neural networks considered in the present paper are radial basis, generalized regression, linear layer, cascade forward back propagation, feed forward back propagation; feed forward distributed time delay, layer recurrent and Elman back propagation. Linear, logistic sigmoid, tangent sigmoid and Radial Basis Gaussian Function are used as transfer functions. Prediction accuracy is measured based on the experimental data reported by the researchers in open literature as a function of Mean Absolute Relative Deviation (MARD). The prediction of a generalized regression ANN model with spread constant of 4.8 is found in agreement with the experimental data for MARD in the range of ±1.81%.
Keywords: ANN models, CLPHP, filling ratio, generalized regression, spread constant.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11847742 Machine Learning Techniques in Bank Credit Analysis
Authors: Fernanda M. Assef, Maria Teresinha A. Steiner
Abstract:
The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results when the best of each technique was compared. Initially the data were coded in thermometer code (numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.
Keywords: Artificial Neural Networks, ANNs, classifier algorithms, credit risk assessment, logistic regression, machine learning, support vector machines.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12817741 Efficient System for Speech Recognition using General Regression Neural Network
Authors: Abderrahmane Amrouche, Jean Michel Rouvaen
Abstract:
In this paper we present an efficient system for independent speaker speech recognition based on neural network approach. The proposed architecture comprises two phases: a preprocessing phase which consists in segmental normalization and features extraction and a classification phase which uses neural networks based on nonparametric density estimation namely the general regression neural network (GRNN). The relative performances of the proposed model are compared to the similar recognition systems based on the Multilayer Perceptron (MLP), the Recurrent Neural Network (RNN) and the well known Discrete Hidden Markov Model (HMM-VQ) that we have achieved also. Experimental results obtained with Arabic digits have shown that the use of nonparametric density estimation with an appropriate smoothing factor (spread) improves the generalization power of the neural network. The word error rate (WER) is reduced significantly over the baseline HMM method. GRNN computation is a successful alternative to the other neural network and DHMM.Keywords: Speech Recognition, General Regression NeuralNetwork, Hidden Markov Model, Recurrent Neural Network, ArabicDigits.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21857740 Economic Dispatch Fuzzy Linear Regression and Optimization
Authors: A. K. Al-Othman
Abstract:
This study presents a new approach based on Tanaka's fuzzy linear regression (FLP) algorithm to solve well-known power system economic load dispatch problem (ELD). Tanaka's fuzzy linear regression (FLP) formulation will be employed to compute the optimal solution of optimization problem after linearization. The unknowns are expressed as fuzzy numbers with a triangular membership function that has middle and spread value reflected on the unknowns. The proposed fuzzy model is formulated as a linear optimization problem, where the objective is to minimize the sum of the spread of the unknowns, subject to double inequality constraints. Linear programming technique is employed to obtain the middle and the symmetric spread for every unknown (power generation level). Simulation results of the proposed approach will be compared with those reported in literature.Keywords: Economic Dispatch, Fuzzy Linear Regression (FLP)and Optimization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22937739 Ensemble Approach for Predicting Student's Academic Performance
Authors: L. A. Muhammad, M. S. Argungu
Abstract:
Educational data mining (EDM) has recorded substantial considerations. Techniques of data mining in one way or the other have been proposed to dig out out-of-sight knowledge in educational data. The result of the study got assists academic institutions in further enhancing their process of learning and methods of passing knowledge to students. Consequently, the performance of students boasts and the educational products are by no doubt enhanced. This study adopted a student performance prediction model premised on techniques of data mining with Students' Essential Features (SEF). SEF are linked to the learner's interactivity with the e-learning management system. The performance of the student's predictive model is assessed by a set of classifiers, viz. Bayes Network, Logistic Regression, and Reduce Error Pruning Tree (REP). Consequently, ensemble methods of Bagging, Boosting, and Random Forest (RF) are applied to improve the performance of these single classifiers. The study reveals that the result shows a robust affinity between learners' behaviors and their academic attainment. Result from the study shows that the REP Tree and its ensemble record the highest accuracy of 83.33% using SEF. Hence, in terms of the Receiver Operating Curve (ROC), boosting method of REP Tree records 0.903, which is the best. This result further demonstrates the dependability of the proposed model.
Keywords: Ensemble, bagging, Random Forest, boosting, data mining, classifiers, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7607738 Transient Population Dynamics of Phase Singularities in 2D Beeler-Reuter Model
Authors: Hidetoshi Konno, Akio Suzuki
Abstract:
The paper presented a transient population dynamics of phase singularities in 2D Beeler-Reuter model. Two stochastic modelings are examined: (i) the Master equation approach with the transition rate (i.e., λ(n, t) = λ(t)n and μ(n, t) = μ(t)n) and (ii) the nonlinear Langevin equation approach with a multiplicative noise. The exact general solution of the Master equation with arbitrary time-dependent transition rate is given. Then, the exact solution of the mean field equation for the nonlinear Langevin equation is also given. It is demonstrated that transient population dynamics is successfully identified by the generalized Logistic equation with fractional higher order nonlinear term. It is also demonstrated the necessity of introducing time-dependent transition rate in the master equation approach to incorporate the effect of nonlinearity.
Keywords: Transient population dynamics, Phase singularity, Birth-death process, Non-stationary Master equation, nonlinear Langevin equation, generalized Logistic equation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15937737 Recommender Systems Using Ensemble Techniques
Authors: Yeonjeong Lee, Kyoung-jae Kim, Youngtae Kim
Abstract:
This study proposes a novel recommender system that uses data mining and multi-model ensemble techniques to enhance the recommendation performance through reflecting the precise user’s preference. The proposed model consists of two steps. In the first step, this study uses logistic regression, decision trees, and artificial neural networks to predict customers who have high likelihood to purchase products in each product group. Then, this study combines the results of each predictor using the multi-model ensemble techniques such as bagging and bumping. In the second step, this study uses the market basket analysis to extract association rules for co-purchased products. Finally, the system selects customers who have high likelihood to purchase products in each product group and recommends proper products from same or different product groups to them through above two steps. We test the usability of the proposed system by using prototype and real-world transaction and profile data. In addition, we survey about user satisfaction for the recommended product list from the proposed system and the randomly selected product lists. The results also show that the proposed system may be useful in real-world online shopping store.
Keywords: Product recommender system, Ensemble technique, Association rules, Decision tree, Artificial neural networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 42227736 Regression Test Selection Technique for Multi-Programming Language
Authors: Walid S. Abd El-hamid, Sherif S. El-Etriby, Mohiy M. Hadhoud
Abstract:
Regression testing is a maintenance activity applied to modified software to provide confidence that the changed parts are correct and that the unchanged parts have not been adversely affected by the modifications. Regression test selection techniques reduce the cost of regression testing, by selecting a subset of an existing test suite to use in retesting modified programs. This paper presents the first general regression-test-selection technique, which based on code and allows selecting test cases for any programs written in any programming language. Then it handles incomplete program. We also describe RTSDiff, a regression-test-selection system that implements the proposed technique. The results of the empirical studied that performed in four programming languages java, C#, Cµ and Visual basic show that the efficiency and effective in reducing the size of test suit.Keywords: Regression testing, testing, test selection, softwareevolution, software maintenance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15337735 ELD79-LGD2006 Transformation Techniques Implementation and Accuracy Comparison in Tripoli Area, Libya
Authors: Jamal A. Gledan, Othman A. Azzeidani
Abstract:
During the last decade, Libya established a new Geodetic Datum called Libyan Geodetic Datum 2006 (LGD 2006) by using GPS, whereas the ground traversing method was used to establish the last Libyan datum which was called the Europe Libyan Datum 79 (ELD79). The current research paper introduces ELD79 to LGD2006 coordinate transformation technique, the accurate comparison of transformation between multiple regression equations and the three – parameters model (Bursa-Wolf). The results had been obtained show that the overall accuracy of stepwise multi regression equations is better than that can be determined by using Bursa-Wolf transformation model.
Keywords: Geodetic datum, horizontal control points, traditional similarity transformation model, unconventional transformation techniques.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27407734 Economic Loss due to Ganoderma Disease in Oil Palm
Authors: K. Assis, K. P. Chong, A. S. Idris, C. M. Ho
Abstract:
Oil palm or Elaeis guineensis is considered as the golden crop in Malaysia. But oil palm industry in this country is now facing with the most devastating disease called as Ganoderma Basal Stem Rot disease. The objective of this paper is to analyze the economic loss due to this disease. There were three commercial oil palm sites selected for collecting the required data for economic analysis. Yield parameter used to measure the loss was the total weight of fresh fruit bunch in six months. The predictors include disease severity, change in disease severity, number of infected neighbor palms, age of palm, planting generation, topography, and first order interaction variables. The estimation model of yield loss was identified by using backward elimination based regression method. Diagnostic checking was conducted on the residual of the best yield loss model. The value of mean absolute percentage error (MAPE) was used to measure the forecast performance of the model. The best yield loss model was then used to estimate the economic loss by using the current monthly price of fresh fruit bunch at mill gate.
Keywords: Ganoderma, oil palm, regression model, yield loss, economic loss.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32377733 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data
Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone
Abstract:
This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify environmental and economic elements that are contributing to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear poisson, and regularization regression methods such as ridge, lasso, and elastic net penalty were employed. Cross-validation was performed to acquire tuning parameters. The methods proposed can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease dataset, the study successfully identified key factors, and the results were consistent with previous studies.
Keywords: Lyme disease, Poisson generalized linear model, Ridge regression, Lasso Regression, elastic net regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1227732 Application of Feed-Forward Neural Networks Autoregressive Models with Genetic Algorithm in Gross Domestic Product Prediction
Authors: E. Giovanis
Abstract:
In this paper we present a Feed-Foward Neural Networks Autoregressive (FFNN-AR) model with genetic algorithms training optimization in order to predict the gross domestic product growth of six countries. Specifically we propose a kind of weighted regression, which can be used for econometric purposes, where the initial inputs are multiplied by the neural networks final optimum weights from input-hidden layer of the training process. The forecasts are compared with those of the ordinary autoregressive model and we conclude that the proposed regression-s forecasting results outperform significant those of autoregressive model. Moreover this technique can be used in Autoregressive-Moving Average models, with and without exogenous inputs, as also the training process with genetics algorithms optimization can be replaced by the error back-propagation algorithm.Keywords: Autoregressive model, Feed-Forward neuralnetworks, Genetic Algorithms, Gross Domestic Product
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16727731 Institutional Efficiency of Commonhold Industrial Parks Using a Polynomial Regression Model
Authors: Jeng-Wen Lin, Simon Chien-Yuan Chen
Abstract:
Based on assumptions of neo-classical economics and rational choice / public choice theory, this paper investigates the regulation of industrial land use in Taiwan by homeowners associations (HOAs) as opposed to traditional government administration. The comparison, which applies the transaction cost theory and a polynomial regression analysis, manifested that HOAs are superior to conventional government administration in terms of transaction costs and overall efficiency. A case study that compares Taiwan-s commonhold industrial park, NangKang Software Park, to traditional government counterparts using limited data on the costs and returns was analyzed. This empirical study on the relative efficiency of governmental and private institutions justified the important theoretical proposition. Numerical results prove the efficiency of the established model.Keywords: Homeowners Associations, Institutional Efficiency, Polynomial Regression, Transaction Cost.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15827730 Modelling Dengue Fever (DF) and Dengue Haemorrhagic Fever (DHF) Outbreak Using Poisson and Negative Binomial Model
Authors: W. Y. Wan Fairos, W. H. Wan Azaki, L. Mohamad Alias, Y. Bee Wah
Abstract:
Dengue fever has become a major concern for health authorities all over the world particularly in the tropical countries. These countries, in particular are experiencing the most worrying outbreak of dengue fever (DF) and dengue haemorrhagic fever (DHF). The DF and DHF epidemics, thus, have become the main causes of hospital admissions and deaths in Malaysia. This paper, therefore, attempts to examine the environmental factors that may influence the recent dengue outbreak. The aim of this study is twofold, firstly is to establish a statistical model to describe the relationship between the number of dengue cases and a range of explanatory variables and secondly, to identify the lag operator for explanatory variables which affect the dengue incidence the most. The explanatory variables involved include the level of cloud cover, percentage of relative humidity, amount of rainfall, maximum temperature, minimum temperature and wind speed. The Poisson and Negative Binomial regression analyses were used in this study. The results of the analyses on the 915 observations (daily data taken from July 2006 to Dec 2008), reveal that the climatic factors comprising of daily temperature and wind speed were found to significantly influence the incidence of dengue fever after 2 and 3 weeks of their occurrences. The effect of humidity, on the other hand, appears to be significant only after 2 weeks.Keywords: Dengue Fever, Dengue Hemorrhagic Fever, Negative Binomial Regression model, Poisson Regression model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28157729 Functional Food Knowledge and Perceptions among Young Consumers in Malaysia
Authors: G. Rezai, P.K.Teng, Z. Mohamed, M.N Shamsudin
Abstract:
Changing in consumers lifestyles and food consumption patterns provide a great opportunity in developing the functional food sector in Malaysia. There is only a little knowledge about whether Malaysian consumers are aware of functional food and if so what image consumers have of this product. The objective of this research is to determine the extent to which selected socioeconomic characteristics and attitudes influence consumers- awareness of functional food. A survey was conducted in the Klang Valley, Malaysia where 439 respondents were interviewed using a structured questionnaire. The result shows that most respondents have a positive attitude towards functional food. For the binary logistic estimation, the results indicate that age, income and other factors such as concern about food safety, subscribing to cooking or health magazines, being a vegetarian and consumers who have been involved in a food production company significantly influence Malaysian consumers- awareness towards functional food.Keywords: Binary logistic model, functional foods, knowledge and awareness, perception
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 57807728 Designing Early Warning System: Prediction Accuracy of Currency Crisis by Using k-Nearest Neighbour Method
Authors: Nor Azuana Ramli, Mohd Tahir Ismail, Hooy Chee Wooi
Abstract:
Developing a stable early warning system (EWS) model that is capable to give an accurate prediction is a challenging task. This paper introduces k-nearest neighbour (k-NN) method which never been applied in predicting currency crisis before with the aim of increasing the prediction accuracy. The proposed k-NN performance depends on the choice of a distance that is used where in our analysis; we take the Euclidean distance and the Manhattan as a consideration. For the comparison, we employ three other methods which are logistic regression analysis (logit), back-propagation neural network (NN) and sequential minimal optimization (SMO). The analysis using datasets from 8 countries and 13 macro-economic indicators for each country shows that the proposed k-NN method with k = 4 and Manhattan distance performs better than the other methods.
Keywords: Currency crisis, k-nearest neighbour method, logit, neural network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22977727 A Cost Optimization Model for the Construction of Bored Piles
Authors: Kenneth M. Oba
Abstract:
Adequate management, control, and optimization of cost is an essential element for a successful construction project. A multiple linear regression optimization model was formulated to address the problem of costs associated with pile construction operations. A total of 32 PVC-reinforced concrete piles with diameter of 300 mm, 5.4 m long, were studied during the construction. The soil upon which the piles were installed was mostly silty sand, and completely submerged in water at Bonny, Nigeria. The piles are friction piles installed by boring method, using a piling auger. The volumes of soil removed, the weight of reinforcement cage installed, and volumes of fresh concrete poured into the PVC void were determined. The cost of constructing each pile based on the calculated quantities was determined. A model was derived and subjected to statistical tests using Statistical Package for the Social Sciences (SPSS) software. The model turned out to be adequate, fit, and have a high predictive accuracy with an R2 value of 0.833.
Keywords: Cost optimization modelling, multiple linear models, pile construction, regression models.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1767726 Improvement of MLLR Speaker Adaptation Using a Novel Method
Authors: Ing-Jr Ding
Abstract:
This paper presents a technical speaker adaptation method called WMLLR, which is based on maximum likelihood linear regression (MLLR). In MLLR, a linear regression-based transform which adapted the HMM mean vectors was calculated to maximize the likelihood of adaptation data. In this paper, the prior knowledge of the initial model is adequately incorporated into the adaptation. A series of speaker adaptation experiments are carried out at a 30 famous city names database to investigate the efficiency of the proposed method. Experimental results show that the WMLLR method outperforms the conventional MLLR method, especially when only few utterances from a new speaker are available for adaptation.Keywords: hidden Markov model, maximum likelihood linearregression, speech recognition, speaker adaptation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18427725 A Study on Exclusive Breastfeeding using Over-dispersed Statistical Models
Authors: Naushad Mamode Khan, Cheika Jahangeer, Maleika Heenaye-Mamode Khan
Abstract:
Breastfeeding is an important concept in the maternal life of a woman. In this paper, we focus on exclusive breastfeeding. Exclusive breastfeeding is the feeding of a baby on no other milk apart from breast milk. This type of breastfeeding is very important during the first six months because it supports optimal growth and development during infancy and reduces the risk of obliterating diseases and problems. Moreover, in Mauritius, exclusive breastfeeding has decreased the incidence and/or severity of diarrhea, lower respiratory infection and urinary tract infection. In this paper, we give an overview of exclusive breastfeeding in Mauritius and the factors influencing it. We further analyze the local practices of exclusive breastfeeding using the Generalized Poisson regression model and the negative-binomial model since the data are over-dispersed.
Keywords: Exclusive breast feeding, regression model, generalized poisson, negative binomial.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16007724 Estimation of Critical Period for Weed Control in Corn in Iran
Authors: Sohrab Mahmoodi, Ali Rahimi
Abstract:
The critical period for weed control (CPWC) is the period in the crop growth cycle during which weeds must be controlled to prevent unacceptable yield losses. Field studies were conducted in 2005 and 2006 in the University of Birjand at the south east of Iran to determine CPWC of corn using a randomized complete block design with 14 treatments and four replications. The treatments consisted of two different periods of weed interference, a critical weed-free period and a critical time of weed removal, were imposed at V3, V6, V9, V12, V15, and R1 (based on phonological stages of corn development) with a weedy check and a weed-free check. The CPWC was determined with the use of 2.5, 5, 10, 15 and 20% acceptable yield loss levels by non-linear Regression method and fitting Logistic and Gompertz nonlinear equations to relative yield data. The CPWC of corn was from 5- to 15-leaf stage (19-55 DAE) to prevent yield losses of 5%. This period to prevent yield losses of 2.5, 10 and 20% was 4- to 17-leaf stage (14-59 DAE), 6- to 12-leaf stage (25-47 DAE) and 8- to 9-leaf stage (31-36 DAE) respectively. The height and leaf area index of corn were significantly decreased by weed competition in both weed free and weed infested treatments (P<0.01). Results also showed that there was a significant positive correlation between yield and LAI of corn at silk stage when competing with weeds (r= 0.97).
Keywords: Corn, Critical period, Gompertz, Logistic, Weed control.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20307723 QSAR Studies of Certain Novel Heterocycles Derived from Bis-1, 2, 4 Triazoles as Anti-Tumor Agents
Authors: Madhusudan Purohit, Stephen Philip, Bharathkumar Inturi
Abstract:
In this paper we report the quantitative structure activity relationship of novel bis-triazole derivatives for predicting the activity profile. The full model encompassed a dataset of 46 Bis- triazoles. Tripos Sybyl X 2.0 program was used to conduct CoMSIA QSAR modeling. The Partial Least-Squares (PLS) analysis method was used to conduct statistical analysis and to derive a QSAR model based on the field values of CoMSIA descriptor. The compounds were divided into test and training set. The compounds were evaluated by various CoMSIA parameters to predict the best QSAR model. An optimum numbers of components were first determined separately by cross-validation regression for CoMSIA model, which were then applied in the final analysis. A series of parameters were used for the study and the best fit model was obtained using donor, partition coefficient and steric parameters. The CoMSIA models demonstrated good statistical results with regression coefficient (r2) and the cross-validated coefficient (q2) of 0.575 and 0.830 respectively. The standard error for the predicted model was 0.16322. In the CoMSIA model, the steric descriptors make a marginally larger contribution than the electrostatic descriptors. The finding that the steric descriptor is the largest contributor for the CoMSIA QSAR models is consistent with the observation that more than half of the binding site area is occupied by steric regions.
Keywords: 3D QSAR, CoMSIA, Triazoles.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14807722 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression
Authors: Wanatchapong Kongkaew
Abstract:
This paper proposes an application of probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence of the single machine with total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is utilized to predict an optimal sequence of the SMTWT problem, and its solution is improved by using an iterated local search based on simulated annealing scheme, called GPRISA algorithm. The results show that the proposed GPRISA method achieves a very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, in the comparison of deviation from the best-known solution, the proposed mechanism noticeably outperforms the recently existing approaches.
Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22357721 Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison
Authors: Xiangtuo Chen, Paul-Henry Cournéde
Abstract:
Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find out efficient way to predict the yield of corn based meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to the different modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with their environment as dynamical systems. But the calibration process of the dynamic system comes up with much difficulty, because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is to propose a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of datasets (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction is free of the complex biophysical process. But it has some strict requirements about the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods, methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-folds cross-validation process and two accuracy metrics: root mean square error of prediction(RMSEP), mean absolute error of prediction(MAEP) were used to evaluate the crop prediction capacity. The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method to calibrate the mechanistic model from dataset easy to access offers several side-perspectives. The mechanistic model can potentially help to underline the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.Keywords: Crop yield prediction, crop model, sensitivity analysis, paramater estimation, particle swarm optimization, random forest.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11747720 Gas Detection via Machine Learning
Authors: Walaa Khalaf, Calogero Pace, Manlio Gaudioso
Abstract:
We present an Electronic Nose (ENose), which is aimed at identifying the presence of one out of two gases, possibly detecting the presence of a mixture of the two. Estimation of the concentrations of the components is also performed for a volatile organic compound (VOC) constituted by methanol and acetone, for the ranges 40-400 and 22-220 ppm (parts-per-million), respectively. Our system contains 8 sensors, 5 of them being gas sensors (of the class TGS from FIGARO USA, INC., whose sensing element is a tin dioxide (SnO2) semiconductor), the remaining being a temperature sensor (LM35 from National Semiconductor Corporation), a humidity sensor (HIH–3610 from Honeywell), and a pressure sensor (XFAM from Fujikura Ltd.). Our integrated hardware–software system uses some machine learning principles and least square regression principle to identify at first a new gas sample, or a mixture, and then to estimate the concentrations. In particular we adopt a training model using the Support Vector Machine (SVM) approach with linear kernel to teach the system how discriminate among different gases. Then we apply another training model using the least square regression, to predict the concentrations. The experimental results demonstrate that the proposed multiclassification and regression scheme is effective in the identification of the tested VOCs of methanol and acetone with 96.61% correctness. The concentration prediction is obtained with 0.979 and 0.964 correlation coefficient for the predicted versus real concentrations of methanol and acetone, respectively.Keywords: Electronic nose, Least square regression, Mixture ofgases, Support Vector Machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2539