Search results for: regularized regression
3071 A Study on Characteristics of Hedonic Price Models in Korea Based on Meta-Regression Analysis
Authors: Minseo Jo
Abstract:
The purpose of this paper is to examine the factors in hedonic price models that have a significant impact in determining the price of apartments. Many variables are employed in hedonic price models, and their effectiveness varies according to the researchers and the regions they are analysing. In order to consider these various conditions, meta-regression analysis was selected for the study. Four meta-independent variables drawn from 65 hedonic price models are analysed: the factors that influence the prices of apartments, the regions in which the research was performed (divided into two groups), the years in which the research was performed, and the coefficients of the functions employed. The covariance between the four meta-variables and the p-values of the coefficients, and between the four meta-variables and the number of data used in the 65 hedonic price models, has been analyzed in this study. The six factors that are most important in deciding the prices of apartments are the position of the apartment, noise, orientation (points of the compass) and views from the apartment, proximity to public transportation, the company that constructed the apartment, and the social environment (such as schools).
Keywords: hedonic price model, housing price, meta-regression analysis, characteristics
Procedia PDF Downloads 402
3070 Applied Bayesian Regularized Artificial Neural Network for Up-Scaling Wind Speed Profile and Distribution
Authors: Aghbalou Nihad, Charki Abderafi, Saida Rahali, Reklaoui Kamal
Abstract:
Maximizing the benefit from the wind energy potential is the main interest of wind power stakeholders. As a result, wind tower sizes are increasing radically. Nevertheless, choosing an appropriate wind turbine for a selected site requires an accurate estimate of the vertical wind profile. This is also imperative from a cost and maintenance strategy point of view. However, installing tall towers, or even more expensive devices such as LIDAR or SODAR, raises the costs of a wind power project. Various models have been developed within this framework, but they suffer from complexity, limited generalization, and a lack of accuracy. In this work, we investigate the ability of a neural network trained using the Bayesian Regularization technique to estimate the wind speed profile up to a height of 100 m based on knowledge of wind speed at lower heights. Results show that the proposed approach can achieve satisfactory predictions and prove the suitability of the proposed method for generating wind speed profiles and probability distributions based on knowledge of wind speed at lower heights.
Keywords: bayesian regularization, neural network, wind shear, accuracy
Procedia PDF Downloads 502
3069 Examining the Effects of College Education on Democratic Attitudes in China: A Regression Discontinuity Analysis
Authors: Gang Wang
Abstract:
Education is widely believed to be a prerequisite for democracy and civil society, but the causal link between education and outcome variables is usually hard to identify. This study applies a fuzzy regression discontinuity design to examine the effects of college education on democratic attitudes in the Chinese context. In the analysis, treatment assignment is determined by students’ college entry years and is thus naturally selected by subjects’ ages. Using a sample of Chinese college students collected in Beijing in 2009, this study finds that college education actually reduces undergraduates’ motivation for political development in China but promotes political loyalty to the authoritarian government. Further hypothesis tests explain these interesting findings from two perspectives. The first is related to the complexity of politics. As college students progress over time, they increasingly realize the complexity of political reform in China’s authoritarian regime and would rather stay away from politics. The second is related to students’ career opportunities. As students get close to graduation, they are immersed in job hunting and have a reduced interest in political freedom.
Keywords: china, college education, democratic attitudes, regression discontinuity
Procedia PDF Downloads 351
3068 Count Data Regression Modeling: An Application to Spontaneous Abortion in India
Authors: Prashant Verma, Prafulla K. Swain, K. K. Singh, Mukti Khetan
Abstract:
Objective: In India, around 20,000 women die every year due to abortion-related complications. In the modelling of count variables, there is sometimes a preponderance of zero counts. This article concerns the estimation of various count regression models to predict the average number of spontaneous abortions among women in the Punjab state of India. It also assesses the factors associated with the number of spontaneous abortions. Materials and methods: The study included 27,173 married women of Punjab obtained from the DLHS-4 survey (2012-13). Poisson regression (PR), negative binomial (NB) regression, zero hurdle negative binomial (ZHNB), and zero-inflated negative binomial (ZINB) models were employed to predict the average number of spontaneous abortions and to identify the determinants affecting the number of spontaneous abortions. Results: Statistical comparisons among the four estimation methods revealed that the ZINB model provides the best prediction for the number of spontaneous abortions. Antenatal care (ANC) place, place of residence, total children born to a woman, woman's education, and economic status were found to be the most significant factors affecting the occurrence of spontaneous abortion. Conclusions: The study offers a practical demonstration of techniques designed to handle count variables. Because the ZINB model provided the best prediction among the four estimation models, it is recommended for predicting the number of spontaneous abortions. The study suggests that women receive institutional antenatal care to attain limited parity. It also advocates promoting higher education among women in Punjab, India.
Keywords: count data, spontaneous abortion, Poisson model, negative binomial model, zero hurdle negative binomial, zero-inflated negative binomial, regression
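The choice among Poisson, negative binomial, and zero-inflated models in the abstract above rests on two diagnostics that are easy to compute by hand: whether the variance exceeds the mean (overdispersion) and whether there are more zeros than a Poisson model would expect. The following is a minimal sketch of both checks. The function name `dispersion_summary` and the toy counts are assumptions made for illustration; they are not taken from the study or from the DLHS-4 data.

```python
import math
from statistics import mean, variance


def dispersion_summary(counts):
    """Return (variance-to-mean ratio, observed zero share, Poisson-expected zero share)."""
    m = mean(counts)
    # For Poisson data the ratio is close to 1; a ratio well above 1 signals overdispersion.
    ratio = variance(counts) / m
    observed_zeros = counts.count(0) / len(counts)
    # P(Y = 0) for a Poisson variable with mean m.
    expected_zeros = math.exp(-m)
    return ratio, observed_zeros, expected_zeros


# Hypothetical counts, invented for illustration only.
toy_counts = [0] * 14 + [1] * 3 + [2] * 2 + [5]
ratio, observed_zeros, expected_zeros = dispersion_summary(toy_counts)
print(f"variance/mean = {ratio:.2f}")
print(f"zeros observed = {observed_zeros:.2f}, zeros expected under Poisson = {expected_zeros:.2f}")
```

A ratio above 1 points toward a negative binomial model, and an observed zero share above the Poisson expectation points toward a hurdle or zero-inflated variant. Formal model comparison, as in the study, would still rely on fitted likelihoods or information criteria.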
Procedia PDF Downloads 155
3067 Business Constraints and Growth Potential of SMEs: Case Study of Electrical Industry in Pakistan
Authors: Muhammad Waseem Akram
Abstract:
The current study attempts to analyze the impact of business constraints on the growth potential and performance of Small and Medium Enterprises (SMEs) in the electrical industry of Pakistan. Primary data collected from the electrical industry cluster in Sargodha, Pakistan, have been utilized for the study. OLS regression is used to assess the impact of business constraints on the performance of SMEs while controlling for the effect of Technology Level, Innovations, and Firm Size. To associate business constraints with the growth potential of SMEs, the study utilized Tetrachoric Correlation and Logistic Regression. Findings reveal that all the business constraints except Political Instability negatively affect the performance of SMEs in the electrical industry. Results of Tetrachoric Correlation show that all the business constraints are negatively correlated with the growth potential of SMEs. Logistic Regression results show that three business constraints, namely Energy Constraint, Inflation and Price Instability, and Bad Business Practices, reduce the probability of income growth in the sample SMEs.
Keywords: SMEs, business constraints, performance, growth potential
Procedia PDF Downloads 169
3066 Application of Nonparametric Geographically Weighted Regression to Evaluate the Unemployment Rate in East Java
Authors: Sifriyani Sifriyani, I Nyoman Budiantara, Sri Haryatmi, Gunardi Gunardi
Abstract:
East Java Province ranks first in Indonesia in the number of counties and cities and has the largest population. In 2015, the population reached 38,847,561, a figure that showed very high population growth. High population growth is feared to increase the level of unemployment. In this study, the researchers mapped and modeled the unemployment rate with 6 variables that were presumed to influence it. Modeling was done by nonparametric geographically weighted regression with a truncated spline approach. This method was chosen because the spline method is flexible: such models tend to find their own form of estimation. In this modeling, there were knot points, the points at which the behavior of the data changes. The optimum knot points were selected by choosing the minimum value of Generalized Cross Validation (GCV). Based on the research, 6 variables were found to affect the level of unemployment in East Java. They were the percentage of the population educated above high school, the rate of economic growth, the population density, the ratio of investment to the total labor force, the regional minimum wage, and the ratio of the number of large and medium-scale industries to the work force. The nonparametric geographically weighted regression model with truncated spline approach had a coefficient of determination of 98.95% and an MSE of 0.0047.
Keywords: East Java, nonparametric geographically weighted regression, spatial, spline approach, unemployed rate
Procedia PDF Downloads 321
3065 Comparative Analysis of Predictive Models for Customer Churn Prediction in the Telecommunication Industry
Authors: Deepika Christopher, Garima Anand
Abstract:
To determine the best model for churn prediction in the telecom industry, this paper compares 11 machine learning algorithms, namely Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, XGBoost, LightGBM, CatBoost, AdaBoost, Extra Trees, Deep Neural Network, and Hybrid Model (MLPClassifier). It also aims to pinpoint the top three factors that lead to customer churn and conducts customer segmentation to identify vulnerable groups. According to the data, the Logistic Regression model performs the best, with an F1 score of 0.6215, 81.76% accuracy, 68.95% precision, and 56.57% recall. The top three attributes that cause churn are found to be tenure, Internet Service Fiber optic, and Internet Service DSL; the three best-performing models in this article are Logistic Regression, Deep Neural Network, and AdaBoost. The K-means algorithm is applied to establish and analyze four different customer clusters. This study has effectively identified customers that are at risk of churn and may be utilized to develop and execute strategies that lower customer attrition.
Keywords: attrition, retention, predictive modeling, customer segmentation, telecommunications
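The F1 score quoted above is the harmonic mean of precision and recall, so the three reported figures can be checked against each other. The sketch below does only that arithmetic; the helper name `f1_score` is a hypothetical local function, not a call into any particular library, and the inputs are the precision and recall stated in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the Logistic Regression model.
reported_precision = 0.6895
reported_recall = 0.5657
reported_f1 = 0.6215

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 recomputed from precision and recall: {f1:.4f}")
```

The recomputed value agrees with the reported F1 to within rounding, which suggests the three figures describe the same confusion matrix.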
Procedia PDF Downloads 57
3064 Research on the Spatio-Temporal Evolution Pattern of Traffic Dominance in Shaanxi Province
Authors: Leng Jian-Wei, Wang Lai-Jun, Li Ye
Abstract:
In order to measure and analyze the transportation situation within the counties of Shaanxi province over a certain period of time and to promote the province's future transportation planning and development, this paper proposes a reasonable layout plan and compares model rationality. The study uses the entropy weight method to measure the transportation advantages of 107 counties in Shaanxi province in 2013 and 2021 along three dimensions: road network density, trunk line influence, and location advantage. It applies spatial autocorrelation analysis to examine the spatial layout and development trend of county-level transportation, and conducts ordinary least squares (OLS) regression on transportation impact factors and other influencing factors. The paper also compares the goodness of fit of the geographically weighted regression (GWR) model and the OLS model. The results show that, spatially, the transportation advantages of Shaanxi province generally decrease from the Weihe Plain to the surrounding areas and mainly exhibit high-high clustering. Temporally, transportation advantages show an overall upward trend, and the spatial imbalance gradually decreases. People's travel demands have changed to some extent, and the demand for rapid transportation has increased overall. The goodness of fit of the GWR model for transportation advantages is 0.74, which is higher than the 0.64 of the OLS regression model. Based on the evolution of transportation advantages, it is predicted that this trend will continue for a period of time in the future. Increasing the layout of rapid transportation can effectively enhance the transportation advantages of Shaanxi province. When analyzing spatial heterogeneity, geographic factors should be considered to establish a more reliable model.
Keywords: traffic dominance, GWR model, spatial autocorrelation analysis, temporal and spatial evolution
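The entropy weight method mentioned in the abstract above assigns larger weights to indicators whose values differ more across the units being compared. The sketch below follows the commonly used textbook formulation (column shares, Shannon entropy scaled by ln n, weights proportional to 1 minus entropy). It is an assumption that the study used this exact variant; the function name `entropy_weights` and the toy indicator table are invented for illustration, and the indicators are assumed to be already scaled to positive values.

```python
import math


def entropy_weights(rows):
    """Entropy weights for indicators.

    rows: list of observations (e.g. counties), each a list of
    non-negative indicator values of equal length.
    """
    n = len(rows)
    n_indicators = len(rows[0])
    divergences = []
    for j in range(n_indicators):
        column = [row[j] for row in rows]
        total = sum(column)
        shares = [v / total for v in column]
        # Shannon entropy of the shares, scaled to [0, 1] by ln(n).
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Indicators with lower entropy discriminate more between units.
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]


# Hypothetical table: 4 counties x 3 indicators
# (road network density, trunk line influence, location advantage).
toy_table = [
    [0.2, 0.5, 0.9],
    [0.4, 0.5, 0.1],
    [0.6, 0.5, 0.5],
    [0.8, 0.5, 0.3],
]
weights = entropy_weights(toy_table)
print([round(w, 3) for w in weights])
```

In the toy table the second indicator is identical for every county, so it carries no information and receives a weight of essentially zero; the weighted sum of indicators would then be used as the composite score.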
Procedia PDF Downloads 89
3063 Effect of Drying on the Concrete Structures
Authors: A. Brahma
Abstract:
The drying of hydraulic materials is unavoidable and leads to significant spontaneous deformations. In this study, we show that it is possible to describe the drying shrinkage of high-performance concrete by a simple expression. A multiple regression model was developed for the prediction of the drying shrinkage of high-performance concrete. The proposed model was assessed with a set of statistical tests. The model takes into consideration the main parameters of preparation and curing. There was very good agreement between the drying shrinkage predicted by the multiple regression model and the experimental results. The developed model adjusts easily to all hydraulic concrete types.
Keywords: hydraulic concretes, drying, shrinkage, prediction, modeling
Procedia PDF Downloads 368
3062 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method
Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri
Abstract:
Recent research in neural networks and neuroscience on modeling complex time series data and on statistical learning has focused mostly on learning from high-dimensional input spaces and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high-dimensional signal spaces. In this paper, different learning scenarios for one- and two-dimensional data series with different distributions are investigated in simulation. Noise is further added to the data to produce differently disordered distributions in the time series and to evaluate how well the algorithm predicts local nonlinearity. The performance of the algorithm is then simulated, and its sensitivity to the data distribution, when the spread of the data is large or the number of data points is small, is explained together with the influence of the important local validity parameter under different data distributions.
Keywords: local nonlinear estimation, LWPR algorithm, online training method, locally weighted projection regression method
Procedia PDF Downloads 502
3061 Exploration and Evaluation of the Effect of Multiple Countermeasures on Road Safety
Authors: Atheer Al-Nuaimi, Harry Evdorides
Abstract:
Every day many people die or are disabled or injured on roads around the world, which necessitates more specific treatments for transportation safety issues. The International Road Assessment Programme (iRAP) model is one of the comprehensive road safety models; it accounts for many factors that affect road safety in a cost-effective way in low and middle income countries. In the iRAP model, road safety is divided into five star ratings, from 1 star (the lowest level) to 5 stars (the highest level). These star ratings are based on a star rating score, which is calculated by the iRAP methodology depending on road attributes, traffic volumes, and operating speeds. The outcome of the iRAP methodology is the set of treatments that can be used to improve road safety and reduce the number of fatalities and serious injuries (FSI). These countermeasures can be used separately as single countermeasures or combined as multiple countermeasures for a location. There is general agreement that the effectiveness of a countermeasure is subject to diminishing returns when it is used in combination with other countermeasures. That is, crash reduction estimates of individual countermeasures cannot simply be added together. The iRAP methodology uses multiple countermeasure adjustment factors to predict reductions in the effectiveness of road safety countermeasures when more than one countermeasure is chosen. Multiple countermeasure correction factors are calculated for every 100-meter segment and for every accident type. However, limitations of this methodology include a probable over-estimation of the predicted crash reduction. This study aims to adjust this correction factor by developing new models to calculate the effect of using multiple countermeasures on the number of fatalities for a location or an entire road. Regression models have been used to establish relationships between crash frequencies and the factors that affect their rates. Multiple linear regression, negative binomial regression, and Poisson regression techniques were used to develop models that can address the effectiveness of using multiple countermeasures. Analyses conducted using the R Project for Statistical Computing showed that a model developed by the negative binomial regression technique could give more reliable predictions of the number of fatalities after the implementation of multiple road safety countermeasures than the iRAP model. The results also showed that the negative binomial regression approach gives more precise results than the multiple linear and Poisson regression techniques because of overdispersion and standard error issues.
Keywords: international road assessment program, negative binomial, road multiple countermeasures, road safety
Procedia PDF Downloads 240
3060 Rd-PLS Regression: From the Analysis of Two Blocks of Variables to Path Modeling
Authors: E. Tchandao Mangamana, V. Cariou, E. Vigneau, R. Glele Kakai, E. M. Qannari
Abstract:
A new definition of a latent variable associated with a dataset makes it possible to propose variants of the PLS2 regression and the multi-block PLS (MB-PLS). We shall refer to these variants as Rd-PLS regression and Rd-MB-PLS respectively because they are inspired by both Redundancy analysis and PLS regression. Usually, a latent variable t associated with a dataset Z is defined as a linear combination of the variables of Z with the constraint that the length of the loading weights vector equals 1. Formally, t=Zw with ‖w‖=1. Denoting by Z' the transpose of Z, we define herein a latent variable by t=ZZ'q with the constraint that the auxiliary variable q has a norm equal to 1. This new definition of a latent variable entails that, as previously, t is a linear combination of the variables in Z and, in addition, the loading vector w=Z'q is constrained to be a linear combination of the rows of Z. More importantly, t could be interpreted as a kind of projection of the auxiliary variable q onto the space generated by the variables in Z, since it is collinear to the first PLS1 component of q onto Z. Consider the situation in which we aim to predict a dataset Y from another dataset X. These two datasets relate to the same individuals and are assumed to be centered. Let us consider a latent variable u=YY'q to which we associate the variable t=XX'YY'q. Rd-PLS consists in seeking q (and therefore u and t) so that the covariance between t and u is maximum. The solution to this problem is straightforward and consists in setting q to the eigenvector of YY'XX'YY' associated with the largest eigenvalue. For the determination of higher order components, we deflate X and Y with respect to the latent variable t. Extending Rd-PLS to the context of multi-block data is relatively easy. Starting from a latent variable u=YY'q, we consider its ‘projection’ on the space generated by the variables of each block Xk (k=1, ..., K), namely tk=XkXk'YY'q. Thereafter, Rd-MB-PLS seeks q in order to maximize the average of the covariances of u with tk (k=1, ..., K). The solution to this problem is given by q, eigenvector of YY'XX'YY', where X is the dataset obtained by horizontally merging datasets Xk (k=1, ..., K). For the determination of latent variables of order higher than 1, we use a deflation of Y and Xk with respect to the variable t=XX'YY'q. In the same vein, extending Rd-MB-PLS to the path modeling setting is straightforward. Methods are illustrated on the basis of case studies, and the performance of Rd-PLS and Rd-MB-PLS in terms of prediction is compared to that of PLS2 and MB-PLS.
Keywords: multiblock data analysis, partial least squares regression, path modeling, redundancy analysis
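The first Rd-PLS component described above can be sketched numerically. Since t'u = q'YY'XX'YY'q, maximizing the covariance between t and u under a unit norm on q leads to the leading eigenvector of the symmetric matrix YY'XX'YY'. The sketch below finds that eigenvector by power iteration using only matrix-vector products. The use of power iteration, the helper names, the starting vector, and the toy data are all assumptions made for illustration; the abstract does not say how the authors compute the eigenvector, and deflation for higher-order components is not shown.

```python
import math


def transpose(a):
    return [list(col) for col in zip(*a)]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def center(a):
    """Subtract the column means, as the abstract assumes centered datasets."""
    means = [sum(col) / len(col) for col in zip(*a)]
    return [[v - m for v, m in zip(row, means)] for row in a]


def gram_apply(a, v):
    """Return A A' v without forming the n x n matrix A A'."""
    return matvec(a, matvec(transpose(a), v))


def apply_m(x, y, q):
    """Return (Y Y' X X' Y Y') q for centered x and y."""
    return gram_apply(y, gram_apply(x, gram_apply(y, q)))


def rd_pls_first_component(x, y, iterations=500):
    """Power iteration for the leading eigenvector q, then u = YY'q and t = XX'YY'q."""
    x, y = center(x), center(y)
    # Non-constant start: a constant vector is annihilated by centered Y'.
    q = [float(i + 1) for i in range(len(y))]
    for _ in range(iterations):
        q = apply_m(x, y, q)
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    u = gram_apply(y, q)
    t = gram_apply(x, u)
    return q, u, t


# Hypothetical data: 5 individuals, 2 predictors (X), 2 responses (Y).
toy_x = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
toy_y = [[1.0, 0.5], [2.1, 0.4], [2.9, 1.6], [4.2, 1.4], [5.1, 2.6]]
q, u, t = rd_pls_first_component(toy_x, toy_y)
print("t'u =", sum(a * b for a, b in zip(t, u)))
```

Working with matrix-vector products keeps the sketch short and avoids building the n x n matrix explicitly; a production implementation would more likely call a dedicated symmetric eigensolver.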
Procedia PDF Downloads 147
3059 Applying the Regression Technique for Prediction of the Acute Heart Attack
Authors: Paria Soleimani, Arezoo Neshati
Abstract:
Myocardial infarction is one of the leading causes of death in the world. Some of these deaths occur even before the patient reaches the hospital. Myocardial infarction occurs as a result of impaired blood supply. Because most of these deaths are due to coronary artery disease, awareness of the warning signs of a heart attack is essential. Some heart attacks are sudden and intense, but most of them start slowly, with mild pain or discomfort, so early detection and successful treatment of these symptoms is vital to save patients. The importance and usefulness of designing a system to assist physicians in the early diagnosis of acute heart attacks is therefore obvious. The purpose of this study is to determine how well a predictive model would perform based only on patient-reportable clinical history factors, without using diagnostic tests or physical exams. This type of prediction model might have application outside of the hospital setting, giving accurate advice to patients to influence them to seek care in appropriate situations. For this purpose, data were collected on 711 heart patients in Iranian hospitals. Twenty-eight clinical attributes that can be reported by patients were studied. Three logistic regression models were built on the basis of the 28 features to predict the risk of heart attacks. The best logistic regression model in terms of performance had a C-index of 0.955 and an accuracy of 94.9%. The variables severe chest pain, back pain, cold sweats, shortness of breath, nausea, and vomiting were selected as the main features.
Keywords: Coronary heart disease, Acute heart attacks, Prediction, Logistic regression
Procedia PDF Downloads 449
3058 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data
Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill
Abstract:
Health-care management systems are of great interest because they provide simple and fast management of all aspects relating to a patient, not necessarily medical. Moreover, there are more and more cases of pathologies in which diagnosis and treatment can only be carried out by using medical imaging techniques. With ever-increasing prevalence, medical images are directly acquired in or converted into digital form for their storage as well as subsequent retrieval and processing. Data mining is the process of extracting information from large data sets using algorithms and techniques drawn from the fields of statistics, machine learning, and database management systems. Forecasting is a prediction of what will occur in the future, and it is an uncertain process. Because of the uncertainty, the accuracy of a forecast is as important as the outcome predicted by forecasting the independent variables. A forecast control must be used to establish whether the accuracy of the forecast is within satisfactory limits. Fuzzy regression methods have commonly been used to develop consumer preference models that correlate engineering characteristics with consumer preferences regarding a new product; the consumer preference models offer a platform whereby product developers can decide the engineering characteristics needed to satisfy consumer preferences before developing the product. Recent research shows that these fuzzy regression methods are commonly used to model customer preferences. We propose testing the strength of the Exponential Regression Model over the regression toward the mean Model.
Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function
Procedia PDF Downloads 279
3057 Glucose Monitoring System Using Machine Learning Algorithms
Authors: Sangeeta Palekar, Neeraj Rangwani, Akash Poddar, Jayu Kalambe
Abstract:
Bio-medical analysis is an indispensable procedure for identifying health-related diseases like diabetes. Monitoring the glucose level in our body regularly helps us identify hyperglycemia and hypoglycemia, which can cause severe medical problems like nerve damage or kidney diseases. This paper presents a method for predicting the glucose concentration in blood samples using image processing and machine learning algorithms. The glucose solution is prepared by the glucose oxidase (GOD) and peroxidase (POD) method. An experimental database is generated based on the colorimetric technique. The image of the glucose solution is captured by the Raspberry Pi camera and analyzed using image processing by extracting the RGB, HSV, LUX color space values. Regression algorithms like multiple linear regression, decision tree, RandomForest, and XGBoost were used to predict the unknown glucose concentration. The multiple linear regression algorithm predicts the results with 97% accuracy. The image processing and machine learning-based approach reduces the hardware complexities of existing platforms.
Keywords: artificial intelligence glucose detection, glucose oxidase, peroxidase, image processing, machine learning
Procedia PDF Downloads 203
3056 Statistical Analysis of the Impact of Maritime Transport Gross Domestic Product (GDP) on Nigeria’s Economy
Authors: Kehinde Peter Oyeduntan, Kayode Oshinubi
Abstract:
Nigeria is referred to as the ‘Giant of Africa’ due to its high population, land mass, and large economy. However, it still trails far behind many smaller economies in the continent in terms of maritime operations. Since the maritime industry is the spark plug for national growth, because it houses the most crucial infrastructure that generates wealth for a nation, it is worrisome that a nation with six seaports lags in maritime activities. In this research, we have studied how the Gross Domestic Product (GDP) of maritime transport influences the Nigerian economy. To do this, we applied Simple Linear Regression (SLR), Support Vector Machine (SVM), Polynomial Regression Model (PRM), Generalized Additive Model (GAM), and Generalized Linear Mixed Model (GLMM) to model the relationship between the nation’s Total GDP (TGDP) and the Maritime Transport GDP (MGDP) using time series data covering 20 years. The results showed that the MGDP is statistically significant to the Nigerian economy. Amongst the statistical tools applied, the PRM of order 4 describes the relationship better than the other methods. The recommendations presented in this study will guide policy makers and help improve the economy of Nigeria in terms of its GDP.
Keywords: maritime transport, economy, GDP, regression, port
Procedia PDF Downloads 153
3055 An Optimal Control Model to Determine Body Forces of Stokes Flow
Authors: Yuanhao Gao, Pin Lin, Kees Weijer
Abstract:
In this paper, we determine the external body force distribution from an analysis of Stokes fluid motion using mathematical modelling and numerical approximation. The body force distribution is regarded as the unknown variable and is determined using optimal control theory. The Stokes flow motion and its velocity are generated by given forces in a unit square domain. A regularized objective functional is built to match the numerical flow velocity with the generated velocity data, so that the force distribution can be determined by minimizing the objective functional, which measures the difference between the numerical and experimental velocity. After applying the Lagrange multiplier method, partial differential equations are formulated that constitute the optimal control system to be solved. The finite element method and the conjugate gradient method are used to discretize the equations and to derive the iterative expression for the target body force, so that the velocity and the body force distribution can be computed numerically. The programming environment FreeFEM++ supports the implementation of this model.
Keywords: optimal control model, Stokes equation, finite element method, conjugate gradient method
Procedia PDF Downloads 405
3054 The Effect of Accounting Conservatism on Cost of Capital: A Quantile Regression Approach for MENA Countries
Authors: Maha Zouaoui Khalifa, Hakim Ben Othman, Hussaney Khaled
Abstract:
Prior empirical studies have investigated the economic consequences of accounting conservatism by examining its impact on the cost of equity capital (COEC). However, findings are not conclusive. We assume that the inconsistent results for this association may be attributed to the regression models used in data analysis. To address this issue, we re-examine the effect of two dimensions of accounting conservatism, unconditional conservatism (U_CONS) and conditional conservatism (C_CONS), on the COEC for a sample of listed firms from Middle East and North Africa (MENA) countries, applying the quantile regression (QR) approach developed by Koenker and Bassett (1978). While the classical ordinary least squares (OLS) method is widely used in empirical accounting research, it may produce inefficient and biased estimates in the case of departures from normality or long-tailed error distributions. The QR method is more powerful than OLS in handling this kind of problem. It allows the coefficients on the independent variables to shift across the distribution of the dependent variable, whereas the OLS method only estimates the conditional mean effects of a response variable. We find, as predicted, that U_CONS has a significant positive effect on the COEC, whereas C_CONS has a negative impact. Findings also suggest that the effect of the two dimensions of accounting conservatism differs considerably across COEC quantiles. By comparing results from the QR method with those of OLS, this study sheds more light on the association between accounting conservatism and COEC.
Keywords: unconditional conservatism, conditional conservatism, cost of equity capital, OLS, quantile regression, emerging markets, MENA countries
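The contrast between OLS and quantile regression drawn above comes down to the loss function: OLS minimizes squared error and targets the conditional mean, while quantile regression minimizes the asymmetric "pinball" (check) loss and targets a chosen quantile. The sketch below shows this mechanism in the simplest possible case, a model with no covariates, where the best constant under pinball loss is a sample quantile. The function names and the toy values are assumptions made for illustration; a real analysis with covariates would use a dedicated quantile regression routine.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def total_loss(values, candidate, tau):
    return sum(pinball_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """Best constant fit under pinball loss.

    The loss is convex and piecewise linear in the candidate, so a
    minimizer can always be found among the observed values.
    """
    return min(sorted(values), key=lambda c: total_loss(values, c, tau))


# Hypothetical long-tailed sample: one extreme observation.
toy_values = [1.0, 2.0, 3.0, 4.0, 100.0]
mean_fit = sum(toy_values) / len(toy_values)      # what squared error targets
median_fit = best_constant(toy_values, tau=0.5)   # robust to the outlier
upper_fit = best_constant(toy_values, tau=0.9)    # tracks the upper tail
print(mean_fit, median_fit, upper_fit)
```

The mean is pulled toward the extreme value while the median fit is not, and changing tau moves the fit across the distribution, which is the property the study relies on when it reports effects that differ across COEC quantiles.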
Procedia PDF Downloads 355
3053 Optimizing the Scanning Time with Radiation Prediction Using a Machine Learning Technique
Authors: Saeed Eskandari, Seyed Rasoul Mehdikhani
Abstract:
Radiation sources are used in many industries, such as gamma sources in medical imaging. These waves have destructive effects on humans and the environment. It is very important to detect and locate the sources of these waves because they cannot be seen by the eye. A portable robot has been designed and built for the purpose of revealing radiation sources. It is able to scan a place from 5 to 20 meters away and shows the location of the sources, according to the intensity of the waves, on a two-dimensional digital image. The robot operates by measuring the pixels separately. Increasing the image measurement resolution gives a more accurate scan of the environment, and more points are detected, but it also makes scanning very time-consuming. In this paper, to overcome this challenge, we designed a method that can optimize this time. In this method, a small number of important points in the environment are measured. The remaining pixels are then predicted and estimated by regression algorithms in machine learning. The research method is based on comparing the predicted values with the actual values of all pixels. These steps have been repeated with several other radiation sources. The results of the study show that the values estimated by the regression method are very close to the real values.
Keywords: regression, machine learning, scan radiation, robot
Procedia PDF Downloads 79
3052 Chemometric Regression Analysis of Radical Scavenging Ability of Kombucha Fermented Kefir-Like Products
Authors: Strahinja Kovacevic, Milica Karadzic Banjac, Jasmina Vitas, Stefan Vukmanovic, Radomir Malbasa, Lidija Jevric, Sanja Podunavac-Kuzmanovic
Abstract:
The present study deals with chemometric regression analysis of quality parameters and the radical scavenging ability of kombucha fermented kefir-like products obtained with winter savory (WS), peppermint (P), stinging nettle (SN) and wild thyme tea (WT) kombucha inoculums. Each analyzed sample was described by milk fat content (MF, %), total unsaturated fatty acids content (TUFA, %), monounsaturated fatty acids content (MUFA, %), polyunsaturated fatty acids content (PUFA, %), the free radical scavenging ability (RSA DPPH, % and RSA ·OH, %) and pH values measured every hour from the start until the end of fermentation. The aim of the regression analysis was to establish chemometric models that can predict the radical scavenging ability (RSA DPPH, % and RSA ·OH, %) of the samples by correlating it with the MF, TUFA, MUFA, PUFA and the pH value at the beginning, in the middle and at the end of the fermentation process, which lasted between 11 and 17 hours, until a pH value of 4.5 was reached. The analysis was carried out by applying univariate linear regression (ULR) and multiple linear regression (MLR) methods to the raw data and to the data standardized by the min-max normalization method. The obtained models were characterized by very limited prediction power (poor cross-validation parameters) and weak statistical characteristics. Based on the analysis, it can be concluded that the resulting radical scavenging ability cannot be precisely predicted on the basis of MF, TUFA, MUFA, PUFA content and pH values alone; other quality parameters should be considered and included in further modeling. This study is based upon work from the project: Kombucha beverages production using alternative substrates from the territory of the Autonomous Province of Vojvodina, 142-451-2400/2019-03, supported by the Provincial Secretariat for Higher Education and Scientific Research of AP Vojvodina.
Keywords: chemometrics, regression analysis, kombucha, quality control
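The min-max normalization mentioned in the abstract rescales each variable to the range [0, 1]. A minimal sketch follows; the function name and the pH readings are hypothetical and are not taken from the study.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no spread; map it to zeros by convention.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical pH readings at the start, middle and end of a fermentation.
ph_values = [6.5, 5.5, 4.5]
scaled = min_max_normalize(ph_values)
```

Rescaling in this way changes the units of the regression coefficients but not the quality of a linear least-squares fit, which may be one reason the reported models were weak on both raw and normalized data.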
Procedia PDF Downloads 142
3051 Enhancing Spatial Interpolation: A Multi-Layer Inverse Distance Weighting Model for Complex Regression and Classification Tasks in Spatial Data Analysis
Authors: Yakin Hajlaoui, Richard Labib, Jean-François Plante, Michel Gamache
Abstract:
This study introduces the Multi-Layer Inverse Distance Weighting Model (ML-IDW), inspired by the mathematical formulation of both multi-layer neural networks (ML-NNs) and the Inverse Distance Weighting model (IDW). ML-IDW leverages ML-NNs' processing capabilities, characterized by compositions of learnable non-linear functions applied to input features, and incorporates IDW's ability to learn anisotropic spatial dependencies, presenting a promising solution for nonlinear spatial interpolation and learning from complex spatial data. Gradient descent and backpropagation are employed to train ML-IDW, and its performance is compared against conventional spatial interpolation models such as Kriging and standard IDW on regression and classification tasks using simulated spatial datasets of varying complexity. The results highlight the efficacy of ML-IDW, particularly in handling complex spatial datasets, where it exhibits lower mean square error in regression and a higher F1 score in classification.
Keywords: deep learning, multi-layer neural networks, gradient descent, spatial interpolation, inverse distance weighting
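For orientation, the standard IDW baseline that the abstract compares against can be sketched in a few lines. This is the classical, non-learned interpolator, not the ML-IDW model itself; the function name, the default `power=2.0` and the sample points are assumptions made for illustration.

```python
import math


def idw_predict(known_points, known_values, query, power=2.0):
    """Interpolate at `query` as a weighted mean with weights 1 / distance**power."""
    weighted = []
    for point, value in zip(known_points, known_values):
        distance = math.dist(point, query)
        if distance == 0.0:
            return value  # the query coincides with a sampled location
        weighted.append((1.0 / distance ** power, value))
    total_weight = sum(weight for weight, _ in weighted)
    return sum(weight * value for weight, value in weighted) / total_weight


# Two hypothetical sampled locations and their observed values.
points = [(0.0, 0.0), (2.0, 0.0)]
values = [10.0, 20.0]
midpoint = idw_predict(points, values, (1.0, 0.0))  # equidistant from both samples
```

Because the weights depend only on Euclidean distance, this baseline is isotropic and has no trainable parameters, which is the limitation a learned, multi-layer variant aims to address.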
Procedia PDF Downloads 52
3050 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models
Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti
Abstract:
In the realm of cricket, particularly within the context of the Indian Premier League (IPL), the ability to predict team scores accurately holds significant importance for both cricket enthusiasts and stakeholders alike. This paper presents a comprehensive study on IPL score prediction utilizing various machine learning algorithms, including Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through meticulous data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Leveraging this data, we employed state-of-the-art machine learning techniques to train and evaluate the performance of each model. Notably, Multiple Regression emerged as the top-performing algorithm, achieving an impressive accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to the advancement of sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of advanced predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research endeavors aimed at enhancing the accuracy and interpretability of IPL score prediction models.
Keywords: indian premier league (IPL), cricket, score prediction, machine learning, support vector machines (SVM), xgboost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics
Procedia PDF Downloads 53
3049 The Impact of Unconditional and Conditional Conservatism on Cost of Equity Capital: A Quantile Regression Approach for MENA Countries
Authors: Khalifa Maha, Ben Othman Hakim, Khaled Hussainey
Abstract:
Prior empirical studies have investigated the economic consequences of accounting conservatism by examining its impact on the cost of equity capital (COEC). However, the findings are not conclusive. We assume that the inconsistent results may be attributable to the regression models used in data analysis. To address this issue, we re-examine the effect of two dimensions of accounting conservatism, unconditional conservatism (U_CONS) and conditional conservatism (C_CONS), on the COEC for a sample of listed firms from Middle East and North Africa (MENA) countries, applying the quantile regression (QR) approach developed by Koenker and Bassett (1978). Although the classical ordinary least squares (OLS) method is widely used in empirical accounting research, it may produce inefficient and biased estimates when the error distribution departs from normality or has long tails. The QR method is better suited than OLS to handling this kind of problem. It allows the coefficients on the independent variables to shift across the distribution of the dependent variable, whereas OLS estimates only the conditional mean effects on the response variable. As predicted, we find that U_CONS has a significant positive effect on the COEC, whereas C_CONS has a negative impact. The findings also suggest that the effect of the two dimensions of accounting conservatism differs considerably across COEC quantiles. By comparing the QR results with those of OLS, this study sheds more light on the association between accounting conservatism and COEC.
Keywords: unconditional conservatism, conditional conservatism, cost of equity capital, OLS, quantile regression, emerging markets, MENA countries
Procedia PDF Downloads 359
3048 Approach to Formulate Intuitionistic Fuzzy Regression Models
Authors: Liang-Hsuan Chen, Sheng-Shing Nien
Abstract:
This study aims to develop approaches to formulate intuitionistic fuzzy regression (IFR) models for decision-making applications in fuzzy environments using intuitionistic fuzzy observations. Intuitionistic fuzzy numbers (IFNs) are used to characterize the fuzzy input and output variables in the IFR formulation processes. A mathematical programming problem (MPP) is built to optimally determine the IFR parameters. Each parameter in the MPP is defined as a pair of alternative numerical variables with opposite signs, and an intuitionistic fuzzy error term is added to the MPP to characterize the uncertainty of the model. The IFR model is formulated based on a distance measure so as to minimize the total distance error between estimated and observed intuitionistic fuzzy responses in the MPP resolution processes. The proposed approaches are simple to formulate and efficient to solve, and the sign of each parameter is determined during resolution, so the problem of predetermining the signs of the parameters is avoided. Furthermore, the proposed approach has the advantage that the spread of the predicted IFN response will not be over-increased, since the parameters in the established IFR model are crisp. The performance of the obtained models is evaluated and compared with that of existing approaches.
Keywords: fuzzy sets, intuitionistic fuzzy number, intuitionistic fuzzy regression, mathematical programming method
Procedia PDF Downloads 138
3047 A Preliminary Study of the Subcontractor Evaluation System for the International Construction Market
Authors: Hochan Seok, Woosik Jang, Seung-Heon Han
Abstract:
The stagnant global construction market has intensified competition since 2008 among firms that aim to win overseas contracts. Against this backdrop, subcontractor selection is identified as one of the most critical success factors in overseas construction projects. However, it is difficult to select qualified subcontractors due to the lack of evaluation standards and reliability. This study aims to identify the problems associated with existing subcontractor evaluations using a correlation analysis and a multiple regression analysis of the pre-qualification and performance evaluations of 121 firms in six countries.
Keywords: subcontractor evaluation system, pre-qualification, performance evaluation, correlation analysis, multiple regression analysis
Procedia PDF Downloads 368
3046 Liquid Chromatography Microfluidics for Detection and Quantification of Urine Albumin Using Linear Regression Method
Authors: Patricia B. Cruz, Catrina Jean G. Valenzuela, Analyn N. Yumang
Abstract:
Nearly a hundred per million of the Filipino population is diagnosed with Chronic Kidney Disease (CKD). The early stage of CKD has no symptoms and can only be discovered once the patient undergoes urinalysis. Over the years, different methods have been developed and used for the quantification of urinary albumin. These include immunochemical assays, most of which require large machinery with high maintenance and resource costs, and the dipstick test, whose reliability in detecting early stages of microalbuminuria is yet to be proven and is still debated. This research study uses the liquid chromatography concept in microfluidic instruments, with a biosensor, as the means of separation and detection respectively, and linear regression to quantify human urinary albumin. The researchers' main objective was to create a miniature system that detects and quantifies patients' urinary albumin while reducing the volume used per five test samples. For this study, 30 urine samples of unknown albumin concentrations were tested using the VITROS Analyzer and the microfluidic system for comparison. Based on the data from both methods, the regression of actual vs. predicted values showed a positive linear relationship with an R² of 0.9995 and a linear equation of y = 1.09x + 0.07, indicating that the predicted values and actual values are approximately equal. Furthermore, the microfluidic instrument uses 75% less total volume, sample and reagents combined, than the VITROS Analyzer per five test samples.
Keywords: Chronic Kidney Disease, Linear Regression, Microfluidics, Urinary Albumin
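The reported agreement line (y = 1.09x + 0.07 with R² = 0.9995) is the kind of result a simple least-squares fit produces. The sketch below shows how a slope, an intercept and R² are computed. The data are synthetic points generated from the reported equation, not the study's measurements, and which method sits on which axis is an assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic values built from the reported line; hypothetical, not study data.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.09 * r + 0.07 for r in reference]

slope, intercept = fit_line(reference, predicted)
r2 = r_squared(reference, predicted, slope, intercept)
```

A slope near 1 and an intercept near 0 indicate close agreement between two methods; a high R² alone shows only a strong linear relationship, not equality of the readings.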
Procedia PDF Downloads 136
3045 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations
Authors: Kuei-Ling Sun, Emily Chia-Yu Su
Abstract:
Allergy is a hypersensitive overreaction of the immune system to environmental stimuli, and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthma, shock, or other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal wool, and other allergens. The development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body's immune system. Immunoglobulin E antibodies bind to an allergen and then to a receptor on mast cells or basophils, triggering the release of inflammatory chemicals such as histamine. Given the increasingly serious problems of environmental change, lifestyle changes, air pollution, and other factors, in this study we collect both allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT) and neural networks (NN), to compare the models and determine the permutations of allergen amino acid sequences.
Keywords: allergy, classification, decision tree, logistic regression, machine learning
Procedia PDF Downloads 303
3044 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second
Authors: P. V. Pramila, V. Mahesh
Abstract:
Pulmonary Function Tests are important non-invasive diagnostic tests that assess respiratory impairments and provide quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients, resulting in incomplete data sets. This paper presents a nonlinear model built using Multivariate Adaptive Regression Splines (MARS) and a Random Forest (RF) regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N = 198 subjects. The ranked order of the feature importance index calculated by the random forest model shows that the spirometric features FVC, FEF 25, PEF, FEF 25-75, FEF 50, and the demographic parameter height are the important descriptors. A comparison of the performance of both models shows that the prediction ability of MARS with the top two ranked features, namely FVC and FEF 25, is higher, yielding a model fit of R² = 0.96 and R² = 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with an intelligent support system for medical diagnosis and the improvement of clinical care.
Keywords: FEV, multivariate adaptive regression splines, pulmonary function test, random forest
Procedia PDF Downloads 310
3043 On Improving Breast Cancer Prediction Using GRNN-CP
Authors: Kefaya Qaddoum
Abstract:
The aim of this study is to predict breast cancer and to construct a supportive model that yields more reliable predictions, which is fundamental for public health. In this study, we utilize general regression neural networks (GRNN) and replace the usual point predictions with prediction intervals that achieve a reasonable level of confidence. The mechanism employed here utilises a machine learning framework called conformal prediction (CP) to assign consistent confidence measures to predictions, and combines it with GRNN. We apply the resulting algorithm to the problem of breast cancer diagnosis. The results show that the predictions constructed by this method are reasonable and could be useful in practice.
Keywords: neural network, conformal prediction, cancer classification, regression
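The combination of a GRNN with conformal prediction can be sketched for a one-dimensional regression problem. This is a generic illustration under stated assumptions, not the paper's GRNN-CP algorithm: the GRNN is written as a Gaussian-kernel weighted average, the conformal step is the split (inductive) variant based on absolute residuals, and all data, names and the `sigma` and `alpha` values are hypothetical.

```python
import math


def grnn_predict(train_x, train_y, x, sigma=1.0):
    """GRNN point prediction: Gaussian-kernel weighted average of training targets."""
    weights = [math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)


def conformal_halfwidth(residuals, alpha):
    """Split-conformal half-width: the ceil((n + 1)(1 - alpha))-th smallest residual."""
    n = len(residuals)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this confidence level
    return sorted(residuals)[rank - 1]


# Hypothetical training and calibration sets (illustrative numbers only).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.1, 0.9, 2.2, 2.8, 4.1]
calib_x = [0.5, 1.5, 2.5, 3.5, 1.0, 2.0, 3.0]
calib_y = [0.4, 1.6, 2.4, 3.6, 1.1, 1.9, 3.2]

residuals = [abs(y - grnn_predict(train_x, train_y, x)) for x, y in zip(calib_x, calib_y)]
halfwidth = conformal_halfwidth(residuals, alpha=0.25)

point = grnn_predict(train_x, train_y, 2.2)
lower, upper = point - halfwidth, point + halfwidth
```

If the calibration and test points are exchangeable, an interval built this way covers the true value with probability of at least 1 - alpha. A classification setting such as diagnosis would use prediction sets over labels rather than intervals.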
Procedia PDF Downloads 291
3042 Multiple Linear Regression for Rapid Estimation of Subsurface Resistivity from Apparent Resistivity Measurements
Authors: Sabiu Bala Muhammad, Rosli Saad
Abstract:
Multiple linear regression (MLR) models for fast estimation of true subsurface resistivity from apparent resistivity field measurements are developed and assessed in this study. The parameters investigated were apparent resistivity (ρₐ), horizontal location (X) and depth (Z) of measurement as the independent variables, and true resistivity (ρₜ) as the dependent variable. To achieve linearity in both resistivity variables, the datasets were first transformed into the logarithmic domain, following diagnostic checks of normality of the dependent variable and of heteroscedasticity to ensure accurate models. Four MLR models were developed based on hierarchical combinations of the independent variables. The generated MLR coefficients were applied to another data set to estimate ρₜ values for validation. Contours of the estimated ρₜ values were plotted and compared to the observed data plots at the same colour scale and blanking for visual assessment. The accuracy of the models was assessed using the coefficient of determination (R²), standard error (SE) and weighted mean absolute percentage error (wMAPE). It is concluded that the MLR models can estimate ρₜ with a high level of accuracy.
Keywords: apparent resistivity, depth, horizontal location, multiple linear regression, true resistivity
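A log-domain MLR of the kind described can be sketched as log10(ρₜ) = b0 + b1·log10(ρₐ) + b2·X + b3·Z. The model form is an assumption (the abstract does not say whether X and Z were also transformed), the coefficients are hypothetical and serve only to generate synthetic data, and the fit uses the normal equations solved by Gaussian elimination.

```python
import itertools
import math


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_mlr(rows, targets):
    """Least-squares coefficients [b0, b1, ...] from the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical coefficients, used only to generate noise-free synthetic data.
true_coeffs = [0.5, 0.8, 0.01, -0.02]
rows, targets = [], []
for rho_a, x, z in itertools.product([10.0, 50.0, 100.0, 500.0], [0.0, 10.0, 20.0], [1.0, 5.0]):
    features = (math.log10(rho_a), x, z)
    rows.append(features)
    targets.append(true_coeffs[0] + sum(c * f for c, f in zip(true_coeffs[1:], features)))

coeffs = fit_mlr(rows, targets)


def estimate_true_resistivity(rho_a, x, z):
    """Apply the fitted log-domain model and convert back to resistivity units."""
    log_rho_t = coeffs[0] + coeffs[1] * math.log10(rho_a) + coeffs[2] * x + coeffs[3] * z
    return 10.0 ** log_rho_t
```

Validation in the sense of the abstract would apply `coeffs` to a separate data set and compare the estimates with observed values, for example through R² or a percentage error.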
Procedia PDF Downloads 276