Search results for: locally weighted regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4138

Search results for: locally weighted regression

3778 The Effect of Accounting Conservatism on Cost of Capital: A Quantile Regression Approach for MENA Countries

Authors: Maha Zouaoui Khalifa, Hakim Ben Othman, Hussaney Khaled

Abstract:

Prior empirical studies have investigated the economic consequences of accounting conservatism by examining its impact on the cost of equity capital (COEC). However, findings are not conclusive. We assume that inconsistent results of such association may be attributed to the regression models used in data analysis. To address this issue, we re-examine the effect of different dimension of accounting conservatism: unconditional conservatism (U_CONS) and conditional conservatism (C_CONS) on the COEC for a sample of listed firms from Middle Eastern and North Africa (MENA) countries, applying quantile regression (QR) approach developed by Koenker and Basset (1978). While classical ordinary least square (OLS) method is widely used in empirical accounting research, however it may produce inefficient and bias estimates in the case of departures from normality or long tail error distribution. QR method is more powerful than OLS to handle this kind of problem. It allows the coefficient on the independent variables to shift across the distribution of the dependent variable whereas OLS method only estimates the conditional mean effects of a response variable. We find as predicted that U_CONS has a significant positive effect on the COEC however, C_CONS has a negative impact. Findings suggest also that the effect of the two dimensions of accounting conservatism differs considerably across COEC quantiles. Comparing results from QR method with those of OLS, this study throws more lights on the association between accounting conservatism and COEC.

Keywords: unconditional conservatism, conditional conservatism, cost of equity capital, OLS, quantile regression, emerging markets, MENA countries

Procedia PDF Downloads 333
3777 Optimizing the Scanning Time with Radiation Prediction Using a Machine Learning Technique

Authors: Saeed Eskandari, Seyed Rasoul Mehdikhani

Abstract:

Radiation sources have been used in many industries, such as gamma sources in medical imaging. These waves have destructive effects on humans and the environment. It is very important to detect and find the source of these waves because these sources cannot be seen by the eye. A portable robot has been designed and built with the purpose of revealing radiation sources that are able to scan the place from 5 to 20 meters away and shows the location of the sources according to the intensity of the waves on a two-dimensional digital image. The operation of the robot is done by measuring the pixels separately. By increasing the image measurement resolution, we will have a more accurate scan of the environment, and more points will be detected. But this causes a lot of time to be spent on scanning. In this paper, to overcome this challenge, we designed a method that can optimize this time. In this method, a small number of important points of the environment are measured. Hence the remaining pixels are predicted and estimated by regression algorithms in machine learning. The research method is based on comparing the actual values of all pixels. These steps have been repeated with several other radiation sources. The obtained results of the study show that the values estimated by the regression method are very close to the real values.

Keywords: regression, machine learning, scan radiation, robot

Procedia PDF Downloads 61
3776 Chemometric Regression Analysis of Radical Scavenging Ability of Kombucha Fermented Kefir-Like Products

Authors: Strahinja Kovacevic, Milica Karadzic Banjac, Jasmina Vitas, Stefan Vukmanovic, Radomir Malbasa, Lidija Jevric, Sanja Podunavac-Kuzmanovic

Abstract:

The present study deals with chemometric regression analysis of quality parameters and the radical scavenging ability of kombucha fermented kefir-like products obtained with winter savory (WS), peppermint (P), stinging nettle (SN) and wild thyme tea (WT) kombucha inoculums. Each analyzed sample was described by milk fat content (MF, %), total unsaturated fatty acids content (TUFA, %), monounsaturated fatty acids content (MUFA, %), polyunsaturated fatty acids content (PUFA, %), the ability of free radicals scavenging (RSA Dₚₚₕ, % and RSA.ₒₕ, %) and pH values measured after each hour from the start until the end of fermentation. The aim of the conducted regression analysis was to establish chemometric models which can predict the radical scavenging ability (RSA Dₚₚₕ, % and RSA.ₒₕ, %) of the samples by correlating it with the MF, TUFA, MUFA, PUFA and the pH value at the beginning, in the middle and at the end of fermentation process which lasted between 11 and 17 hours, until pH value of 4.5 was reached. The analysis was carried out applying univariate linear (ULR) and multiple linear regression (MLR) methods on the raw data and the data standardized by the min-max normalization method. The obtained models were characterized by very limited prediction power (poor cross-validation parameters) and weak statistical characteristics. Based on the conducted analysis it can be concluded that the resulting radical scavenging ability cannot be precisely predicted only on the basis of MF, TUFA, MUFA, PUFA content, and pH values, however, other quality parameters should be considered and included in the further modeling. This study is based upon work from project: Kombucha beverages production using alternative substrates from the territory of the Autonomous Province of Vojvodina, 142-451-2400/2019-03, supported by Provincial Secretariat for Higher Education and Scientific Research of AP Vojvodina.

Keywords: chemometrics, regression analysis, kombucha, quality control

Procedia PDF Downloads 120
3775 Enhancing Spatial Interpolation: A Multi-Layer Inverse Distance Weighting Model for Complex Regression and Classification Tasks in Spatial Data Analysis

Authors: Yakin Hajlaoui, Richard Labib, Jean-François Plante, Michel Gamache

Abstract:

This study introduces the Multi-Layer Inverse Distance Weighting Model (ML-IDW), inspired by the mathematical formulation of both multi-layer neural networks (ML-NNs) and Inverse Distance Weighting model (IDW). ML-IDW leverages ML-NNs' processing capabilities, characterized by compositions of learnable non-linear functions applied to input features, and incorporates IDW's ability to learn anisotropic spatial dependencies, presenting a promising solution for nonlinear spatial interpolation and learning from complex spatial data. it employ gradient descent and backpropagation to train ML-IDW, comparing its performance against conventional spatial interpolation models such as Kriging and standard IDW on regression and classification tasks using simulated spatial datasets of varying complexity. the results highlight the efficacy of ML-IDW, particularly in handling complex spatial datasets, exhibiting lower mean square error in regression and higher F1 score in classification.

Keywords: deep learning, multi-layer neural networks, gradient descent, spatial interpolation, inverse distance weighting

Procedia PDF Downloads 19
3774 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models

Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti

Abstract:

In the realm of cricket, particularly within the context of the Indian Premier League (IPL), the ability to predict team scores accurately holds significant importance for both cricket enthusiasts and stakeholders alike. This paper presents a comprehensive study on IPL score prediction utilizing various machine learning algorithms, including Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through meticulous data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Leveraging this data, we employed state-of-the-art machine learning techniques to train and evaluate the performance of each model. Notably, Multiple Regression emerged as the top-performing algorithm, achieving an impressive accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to the advancement of sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of advanced predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research endeavors aimed at enhancing the accuracy and interpretability of IPL score prediction models.

Keywords: indian premier league (IPL), cricket, score prediction, machine learning, support vector machines (SVM), xgboost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics

Procedia PDF Downloads 27
3773 The Impact of Unconditional and Conditional Conservatism on Cost of Equity Capital: A Quantile Regression Approach for MENA Countries

Authors: Khalifa Maha, Ben Othman Hakim, Khaled Hussainey

Abstract:

Prior empirical studies have investigated the economic consequences of accounting conservatism by examining its impact on the cost of equity capital (COEC). However, findings are not conclusive. We assume that inconsistent results of such association may be attributed to the regression models used in data analysis. To address this issue, we re-examine the effect of different dimension of accounting conservatism: unconditional conservatism (U_CONS) and conditional conservatism (C_CONS) on the COEC for a sample of listed firms from Middle Eastern and North Africa (MENA) countries, applying quantile regression (QR) approach developed by Koenker and Basset (1978). While classical ordinary least square (OLS) method is widely used in empirical accounting research, however it may produce inefficient and bias estimates in the case of departures from normality or long tail error distribution. QR method is more powerful than OLS to handle this kind of problem. It allows the coefficient on the independent variables to shift across the distribution of the dependent variable whereas OLS method only estimates the conditional mean effects of a response variable. We find as predicted that U_CONS has a significant positive effect on the COEC however, C_CONS has a negative impact. Findings suggest also that the effect of the two dimensions of accounting conservatism differs considerably across COEC quantiles. Comparing results from QR method with those of OLS, this study throws more lights on the association between accounting conservatism and COEC.

Keywords: unconditional conservatism, conditional conservatism, cost of equity capital, OLS, quantile regression, emerging markets, MENA countries

Procedia PDF Downloads 343
3772 Approach to Formulate Intuitionistic Fuzzy Regression Models

Authors: Liang-Hsuan Chen, Sheng-Shing Nien

Abstract:

This study aims to develop approaches to formulate intuitionistic fuzzy regression (IFR) models for many decision-making applications in the fuzzy environments using intuitionistic fuzzy observations. Intuitionistic fuzzy numbers (IFNs) are used to characterize the fuzzy input and output variables in the IFR formulation processes. A mathematical programming problem (MPP) is built up to optimally determine the IFR parameters. Each parameter in the MPP is defined as a couple of alternative numerical variables with opposite signs, and an intuitionistic fuzzy error term is added to the MPP to characterize the uncertainty of the model. The IFR model is formulated based on the distance measure to minimize the total distance errors between estimated and observed intuitionistic fuzzy responses in the MPP resolution processes. The proposed approaches are simple/efficient in the formulation/resolution processes, in which the sign of parameters can be determined so that the problem to predetermine the sign of parameters is avoided. Furthermore, the proposed approach has the advantage that the spread of the predicted IFN response will not be over-increased, since the parameters in the established IFR model are crisp. The performance of the obtained models is evaluated and compared with the existing approaches.

Keywords: fuzzy sets, intuitionistic fuzzy number, intuitionistic fuzzy regression, mathematical programming method

Procedia PDF Downloads 117
3771 A Preliminary Study of the Subcontractor Evaluation System for the International Construction Market

Authors: Hochan Seok, Woosik Jang, Seung-Heon Han

Abstract:

The stagnant global construction market has intensified competition since 2008 among firms that aim to win overseas contracts. Against this backdrop, subcontractor selection is identified as one of the most critical success factors in overseas construction project. However, it is difficult to select qualified subcontractors due to the lack of evaluation standards and reliability. This study aims to identify the problems associated with existing subcontractor evaluations using a correlations analysis and a multiple regression analysis with pre-qualification and performance evaluation of 121 firms in six countries.

Keywords: subcontractor evaluation system, pre-qualification, performance evaluation, correlation analysis, multiple regression analysis

Procedia PDF Downloads 344
3770 Liquid Chromatography Microfluidics for Detection and Quantification of Urine Albumin Using Linear Regression Method

Authors: Patricia B. Cruz, Catrina Jean G. Valenzuela, Analyn N. Yumang

Abstract:

Nearly a hundred per million of the Filipino population is diagnosed with Chronic Kidney Disease (CKD). The early stage of CKD has no symptoms and can only be discovered once the patient undergoes urinalysis. Over the years, different methods were discovered and used for the quantification of the urinary albumin such as the immunochemical assays where most of these methods require large machinery that has a high cost in maintenance and resources, and a dipstick test which is yet to be proven and is still debated as a reliable method in detecting early stages of microalbuminuria. This research study involves the use of the liquid chromatography concept in microfluidic instruments with biosensor as a means of separation and detection respectively, and linear regression to quantify human urinary albumin. The researchers’ main objective was to create a miniature system that quantifies and detect patients’ urinary albumin while reducing the amount of volume used per five test samples. For this study, 30 urine samples of unknown albumin concentrations were tested using VITROS Analyzer and the microfluidic system for comparison. Based on the data shared by both methods, the actual vs. predicted regression were able to create a positive linear relationship with an R2 of 0.9995 and a linear equation of y = 1.09x + 0.07, indicating that the predicted values and actual values are approximately equal. Furthermore, the microfluidic instrument uses 75% less in total volume – sample and reagents combined, compared to the VITROS Analyzer per five test samples.

Keywords: Chronic Kidney Disease, Linear Regression, Microfluidics, Urinary Albumin

Procedia PDF Downloads 116
3769 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations

Authors: Kuei-Ling Sun, Emily Chia-Yu Su

Abstract:

Allergy is a hypersensitive overreaction of the immune system to environmental stimuli, and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthmatic, shock, or other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal wool, and other allergens. Their development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body’s immune system. Immunoglobulin E antibodies will bind to an allergen and then transfer to a receptor on mast cells or basophils triggering the release of inflammatory chemicals such as histamine. Based on the increasingly serious problem of environmental change, changes in lifestyle, air pollution problem, and other factors, in this study, we both collect allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT) and neural networks (NN) to do the model comparison and determine the permutations of allergen amino acid’s sequence.

Keywords: allergy, classification, decision tree, logistic regression, machine learning

Procedia PDF Downloads 281
3768 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila , V. Mahesh

Abstract:

Pulmonary Function Tests are important non-invasive diagnostic tests to assess respiratory impairments and provides quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients esulting in incomplete data sets. This paper presents, a nonlinear model built using Multivariate adaptive regression splines and Random forest regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N= 198 subjects. The ranked order of feature importance index calculated by the random forests model shows that the spirometric features FVC, FEF 25, PEF,FEF 25-75, FEF50, and the demographic parameter height are the important descriptors. A comparison of performance assessment of both models prove that, the prediction ability of MARS with the `top two ranked features namely the FVC and FEF 25 is higher, yielding a model fit of R2= 0.96 and R2= 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with a intelligence support system in the medical diagnosis and improvement of clinical care.

Keywords: FEV, multivariate adaptive regression splines pulmonary function test, random forest

Procedia PDF Downloads 287
3767 On Improving Breast Cancer Prediction Using GRNN-CP

Authors: Kefaya Qaddoum

Abstract:

The aim of this study is to predict breast cancer and to construct a supportive model that will stimulate a more reliable prediction as a factor that is fundamental for public health. In this study, we utilize general regression neural networks (GRNN) to replace the normal predictions with prediction periods to achieve a reasonable percentage of confidence. The mechanism employed here utilises a machine learning system called conformal prediction (CP), in order to assign consistent confidence measures to predictions, which are combined with GRNN. We apply the resulting algorithm to the problem of breast cancer diagnosis. The results show that the prediction constructed by this method is reasonable and could be useful in practice.

Keywords: neural network, conformal prediction, cancer classification, regression

Procedia PDF Downloads 263
3766 Geostatistical Simulation of Carcinogenic Industrial Effluent on the Irrigated Soil and Groundwater, District Sheikhupura, Pakistan

Authors: Asma Shaheen, Javed Iqbal

Abstract:

The water resources are depleting due to an intrusion of industrial pollution. There are clusters of industries including leather tanning, textiles, batteries, and chemical causing contamination. These industries use bulk quantity of water and discharge it with toxic effluents. The penetration of heavy metals through irrigation from industrial effluent has toxic effect on soil and groundwater. There was strong positive significant correlation between all the heavy metals in three media of industrial effluent, soil and groundwater (P < 0.001). The metal to the metal association was supported by dendrograms using cluster analysis. The geospatial variability was assessed by using geographically weighted regression (GWR) and pollution model to identify the simulation of carcinogenic elements in soil and groundwater. The principal component analysis identified the metals source, 48.8% variation in factor 1 have significant loading for sodium (Na), calcium (Ca), magnesium (Mg), iron (Fe), chromium (Cr), nickel (Ni), lead (Pb) and zinc (Zn) of tannery effluent-based process. In soil and groundwater, the metals have significant loading in factor 1 representing more than half of the total variation with 51.3 % and 53.6 % respectively which showed that pollutants in soil and water were driven by industrial effluent. The cumulative eigen values for the three media were also found to be greater than 1 representing significant clustering of related heavy metals. The results showed that heavy metals from industrial processes are seeping up toxic trace metals in the soil and groundwater. The poisonous pollutants from heavy metals turned the fresh resources of groundwater into unusable water. The availability of fresh water for irrigation and domestic use is being alarming.

Keywords: groundwater, geostatistical, heavy metals, industrial effluent

Procedia PDF Downloads 214
3765 Multicollinearity and MRA in Sustainability: Application of the Raise Regression

Authors: Claudia García-García, Catalina B. García-García, Román Salmerón-Gómez

Abstract:

Much economic-environmental research includes the analysis of possible interactions by using Moderated Regression Analysis (MRA), which is a specific application of multiple linear regression analysis. This methodology allows analyzing how the effect of one of the independent variables is moderated by a second independent variable by adding a cross-product term between them as an additional explanatory variable. Due to the very specification of the methodology, the moderated factor is often highly correlated with the constitutive terms. Thus, great multicollinearity problems arise. The appearance of strong multicollinearity in a model has important consequences. Inflated variances of the estimators may appear, there is a tendency to consider non-significant regressors that they probably are together with a very high coefficient of determination, incorrect signs of our coefficients may appear and also the high sensibility of the results to small changes in the dataset. Finally, the high relationship among explanatory variables implies difficulties in fixing the individual effects of each one on the model under study. These consequences shifted to the moderated analysis may imply that it is not worth including an interaction term that may be distorting the model. Thus, it is important to manage the problem with some methodology that allows for obtaining reliable results. After a review of those works that applied the MRA among the ten top journals of the field, it is clear that multicollinearity is mostly disregarded. Less than 15% of the reviewed works take into account potential multicollinearity problems. To overcome the issue, this work studies the possible application of recent methodologies to MRA. Particularly, the raised regression is analyzed. This methodology mitigates collinearity from a geometrical point of view: the collinearity problem arises because the variables under study are very close geometrically, so by separating both variables, the problem can be mitigated. Raise regression maintains the available information and modifies the problematic variables instead of deleting variables, for example. Furthermore, the global characteristics of the initial model are also maintained (sum of squared residuals, estimated variance, coefficient of determination, global significance test and prediction). The proposal is implemented to data from countries of the European Union during the last year available regarding greenhouse gas emissions, per capita GDP and a dummy variable that represents the topography of the country. The use of a dummy variable as the moderator is a special variant of MRA, sometimes called “subgroup regression analysis.” The main conclusion of this work is that applying new techniques to the field can improve in a substantial way the results of the analysis. Particularly, the use of raised regression mitigates great multicollinearity problems, so the researcher is able to rely on the interaction term when interpreting the results of a particular study.

Keywords: multicollinearity, MRA, interaction, raise

Procedia PDF Downloads 80
3764 Landslide Hazard Zonation Using Satellite Remote Sensing and GIS Technology

Authors: Ankit Tyagi, Reet Kamal Tiwari, Naveen James

Abstract:

Landslide is the major geo-environmental problem of Himalaya because of high ridges, steep slopes, deep valleys, and complex system of streams. They are mainly triggered by rainfall and earthquake and causing severe damage to life and property. In Uttarakhand, the Tehri reservoir rim area, which is situated in the lesser Himalaya of Garhwal hills, was selected for landslide hazard zonation (LHZ). The study utilized different types of data, including geological maps, topographic maps from the survey of India, Landsat 8, and Cartosat DEM data. This paper presents the use of a weighted overlay method in LHZ using fourteen causative factors. The various data layers generated and co-registered were slope, aspect, relative relief, soil cover, intensity of rainfall, seismic ground shaking, seismic amplification at surface level, lithology, land use/land cover (LULC), normalized difference vegetation index (NDVI), topographic wetness index (TWI), stream power index (SPI), drainage buffer and reservoir buffer. Seismic analysis is performed using peak horizontal acceleration (PHA) intensity and amplification factors in the evaluation of the landslide hazard index (LHI). Several digital image processing techniques such as topographic correction, NDVI, and supervised classification were widely used in the process of terrain factor extraction. Lithological features, LULC, drainage pattern, lineaments, and structural features are extracted using digital image processing techniques. Colour, tones, topography, and stream drainage pattern from the imageries are used to analyse geological features. Slope map, aspect map, relative relief are created by using Cartosat DEM data. DEM data is also used for the detailed drainage analysis, which includes TWI, SPI, drainage buffer, and reservoir buffer. In the weighted overlay method, the comparative importance of several causative factors obtained from experience. In this method, after multiplying the influence factor with the corresponding rating of a particular class, it is reclassified, and the LHZ map is prepared. Further, based on the land-use map developed from remote sensing images, a landslide vulnerability study for the study area is carried out and presented in this paper.

Keywords: weighted overlay method, GIS, landslide hazard zonation, remote sensing

Procedia PDF Downloads 108
3763 Development of Cost Effective Ultra High Performance Concrete by Using Locally Available Materials

Authors: Mohamed Sifan, Brabha Nagaratnam, Julian Thamboo, Keerthan Poologanathan

Abstract:

Ultra high performance concrete (UHPC) is a type of cementitious material known for its exceptional strength, ductility, and durability. However, its production is often associated with high costs due to the significant amount of cementitious materials required and the use of fine powders to achieve the desired strength. The aim of this research is to explore the feasibility of developing cost-effective UHPC mixes using locally available materials. Specifically, the study aims to investigate the use of coarse limestone sand along with other sand types, namely, basalt sand, dolomite sand, and river sand for developing UHPC mixes and evaluating its performances. The study utilises the particle packing model to develop various UHPC mixes. The particle packing model involves optimising the combination of coarse limestone sand, basalt sand, dolomite sand, and river sand to achieve the desired properties of UHPC. The developed UHPC mixes are then evaluated based on their workability (measured through slump flow and mini slump value), compressive strength (at 7, 28, and 90 days), splitting tensile strength, and microstructural characteristics analysed through scanning electron microscope (SEM) analysis. The results of this study demonstrate that cost-effective UHPC mixes can be developed using locally available materials without the need for silica fume or fly ash. The UHPC mixes achieved impressive compressive strengths of up to 149 MPa at 28 days with a cement content of approximately 750 kg/m³. The mixes also exhibited varying levels of workability, with slump flow values ranging from 550 to 850 mm. Additionally, the inclusion of coarse limestone sand in the mixes effectively reduced the demand for superplasticizer and served as a filler material. By exploring the use of coarse limestone sand and other sand types, this study provides valuable insights into optimising the particle packing model for UHPC production. The findings highlight the potential to reduce costs associated with UHPC production without compromising its strength and durability. The study collected data on the workability, compressive strength, splitting tensile strength, and microstructural characteristics of the developed UHPC mixes. Workability was measured using slump flow and mini slump tests, while compressive strength and splitting tensile strength were assessed at different curing periods. Microstructural characteristics were analysed through SEM and energy dispersive X-ray spectroscopy (EDS) analysis. The collected data were then analysed and interpreted to evaluate the performance and properties of the UHPC mixes. The research successfully demonstrates the feasibility of developing cost-effective UHPC mixes using locally available materials. The inclusion of coarse limestone sand, in combination with other sand types, shows promising results in achieving high compressive strengths and satisfactory workability. The findings suggest that the use of the particle packing model can optimise the combination of materials and reduce the reliance on expensive additives such as silica fume and fly ash. This research provides valuable insights for researchers and construction practitioners aiming to develop cost-effective UHPC mixes using readily available materials and an optimised particle packing approach.

Keywords: cost-effective, limestone powder, particle packing model, ultra high performance concrete

Procedia PDF Downloads 73
3762 Empirical Modeling and Spatial Analysis of Heat-Related Morbidity in Maricopa County, Arizona

Authors: Chuyuan Wang, Nayan Khare, Lily Villa, Patricia Solis, Elizabeth A. Wentz

Abstract:

Maricopa County, Arizona, has a semi-arid hot desert climate that is one of the hottest regions in the United States. The exacerbated urban heat island (UHI) effect caused by rapid urbanization has made the urban area even hotter than the rural surroundings. The Phoenix metropolitan area experiences extremely high temperatures in the summer from June to September that can reach the daily highest of 120 °F (48.9 °C). Morbidity and mortality due to the environmental heat is, therefore, a significant public health issue in Maricopa County, especially because it is largely preventable. Public records from the Maricopa County Department of Public Health (MCDPH) revealed that between 2012 and 2016, there were 10,825 incidents of heat-related morbidity incidents, 267 outdoor environmental heat deaths, and 173 indoor heat-related deaths. A lot of research has examined heat-related death and its contributing factors around the world, but little has been done regarding heat-related morbidity issues, especially for regions that are naturally hot in the summer. The objective of this study is to examine the demographic, socio-economic, housing, and environmental factors that contribute to heat-related morbidity in Maricopa County. We obtained heat-related morbidity data between 2012 and 2016 at census tract level from MCDPH. Demographic, socio-economic, and housing variables were derived using 2012-2016 American Community Survey 5-year estimate from the U.S. Census. Remotely sensed Landsat 7 ETM+ and Landsat 8 OLI satellite images and Level-1 products were acquired for all the summer months (June to September) from 2012 and 2016. The National Land Cover Database (NLCD) 2016 percent tree canopy and percent developed imperviousness data were obtained from the U.S. Geological Survey (USGS). We used ordinary least squares (OLS) regression analysis to examine the empirical relationship between all the independent variables and heat-related morbidity rate. Results showed that higher morbidity rates are found in census tracts with higher values in population aged 65 and older, population under poverty, disability, no vehicle ownership, white non-Hispanic, population with less than high school degree, land surface temperature, and surface reflectance, but lower values in normalized difference vegetation index (NDVI) and housing occupancy. The regression model can be used to explain up to 59.4% of total variation of heat-related morbidity in Maricopa County. The multiscale geographically weighted regression (MGWR) technique was then used to examine the spatially varying relationships between heat-related morbidity rate and all the significant independent variables. The R-squared value of the MGWR model increased to 0.691, that shows a significant improvement in goodness-of-fit than the global OLS model, which means that spatial heterogeneity of some independent variables is another important factor that influences the relationship with heat-related morbidity in Maricopa County. Among these variables, population aged 65 and older, the Hispanic population, disability, vehicle ownership, and housing occupancy have much stronger local effects than other variables.

Keywords: census, empirical modeling, heat-related morbidity, spatial analysis

Procedia PDF Downloads 104
3761 Bayesian Reliability of Weibull Regression with Type-I Censored Data

Authors: Al Omari Moahmmed Ahmed

Abstract:

In the Bayesian, we developed an approach by using non-informative prior with covariate and obtained by using Gauss quadrature method to estimate the parameters of the covariate and reliability function of the Weibull regression distribution with Type-I censored data. The maximum likelihood seen that the estimators obtained are not available in closed forms, although they can be solved it by using Newton-Raphson methods. The comparison criteria are the MSE and the performance of these estimates are assessed using simulation considering various sample size, several specific values of shape parameter. The results show that Bayesian with non-informative prior is better than Maximum Likelihood Estimator.

Keywords: non-informative prior, Bayesian method, type-I censoring, Gauss quardature

Procedia PDF Downloads 481
3760 Walmart Sales Forecasting using Machine Learning in Python

Authors: Niyati Sharma, Om Anand, Sanjeev Kumar Prasad

Abstract:

Assuming future sale value for any of the organizations is one of the major essential characteristics of tactical development. Walmart Sales Forecasting is the finest illustration to work with as a beginner; subsequently, it has the major retail data set. Walmart uses this sales estimate problem for hiring purposes also. We would like to analyzing how the internal and external effects of one of the largest companies in the US can walk out their Weekly Sales in the future. Demand forecasting is the planned prerequisite of products or services in the imminent on the basis of present and previous data and different stages of the market. Since all associations is facing the anonymous future and we do not distinguish in the future good demand. Hence, through exploring former statistics and recent market statistics, we envisage the forthcoming claim and building of individual goods, which are extra challenging in the near future. As a result of this, we are producing the required products in pursuance of the petition of the souk in advance. We will be using several machine learning models to test the exactness and then lastly, train the whole data by Using linear regression and fitting the training data into it. Accuracy is 8.88%. The extra trees regression model gives the best accuracy of 97.15%.

Keywords: random forest algorithm, linear regression algorithm, extra trees classifier, mean absolute error

Procedia PDF Downloads 123
3759 Statistical Model of Water Quality in Estero El Macho, Machala-El Oro

Authors: Rafael Zhindon Almeida

Abstract:

Surface water quality is an important concern for the evaluation and prediction of water quality conditions. The objective of this study is to develop a statistical model that can accurately predict the water quality of the El Macho estuary in the city of Machala, El Oro province. The methodology employed in this study is of a basic type that involves a thorough search for theoretical foundations to improve the understanding of statistical modeling for water quality analysis. The research design is correlational, using a multivariate statistical model involving multiple linear regression and principal component analysis. The results indicate that water quality parameters such as fecal coliforms, biochemical oxygen demand, chemical oxygen demand, iron and dissolved oxygen exceed the allowable limits. The water of the El Macho estuary is determined to be below the required water quality criteria. The multiple linear regression model, based on chemical oxygen demand and total dissolved solids, explains 99.9% of the variance of the dependent variable. In addition, principal component analysis shows that the model has an explanatory power of 86.242%. The study successfully developed a statistical model to evaluate the water quality of the El Macho estuary. The estuary did not meet the water quality criteria, with several parameters exceeding the allowable limits. The multiple linear regression model and principal component analysis provide valuable information on the relationship between the various water quality parameters. The findings of the study emphasize the need for immediate action to improve the water quality of the El Macho estuary to ensure the preservation and protection of this valuable natural resource.

Keywords: statistical modeling, water quality, multiple linear regression, principal components, statistical models

Procedia PDF Downloads 67
3758 Analysis of Ferroresonant Overvoltages in Cable-fed Transformers

Authors: George Eduful, Ebenezer A. Jackson, Kingsford A. Atanga

Abstract:

This paper investigates the impacts of cable length and capacity of transformer on ferroresonant overvoltage in cable-fed transformers. The study was conducted by simulation using the EMTP RV. Results show that ferroresonance can cause dangerous overvoltages ranging from 2 to 5 per unit. These overvoltages impose stress on insulations of transformers and cables and subsequently result in system failures. Undertaking Basic Multiple Regression Analysis (BMR) on the results obtained, a statistical model was obtained in terms of cable length and transformer capacity. The model is useful for ferroresonant prediction and control in cable-fed transformers.

Keywords: ferroresonance, cable-fed transformers, EMTP RV, regression analysis

Procedia PDF Downloads 506
3757 The Impact of CSR Satisfaction on Employee Commitment

Authors: Silke Bustamante, Andrea Pelzeter, Andreas Deckmann, Rudi Ehlscheidt, Franziska Freudenberger

Abstract:

Many companies increasingly seek to enhance their attractiveness as an employer to bind their employees. At the same time, corporate responsibility for social and ecological issues seems to become a more important part of an attractive employer brand. It enables the company to match the values and expectations of its members, to signal fairness towards them and to increase its brand potential for positive psychological identification on the employees’ side. In the last decade, several empirical studies have focused this relationship, confirming a positive effect of employees’ CSR perception and their affective organizational commitment. The current paper aims to take a slightly different view by analyzing the impact of another factor on commitment: the weighted employee’s satisfaction with the employer CSR. For that purpose, it is assumed that commitment levels are rather a result of the fulfillment or disappointment of expectations. Hence, instead of merely asking how CSR perception affects commitment, a more complex independent variable is taken into account: a weighted satisfaction construct that summarizes two different factors. Therefore, the individual level of commitment contingent on CSR is conceptualized as a function of two psychological processes: (1) the individual significance that an employee ascribes to specific employer attributes and (2) the individual satisfaction based on the fulfillment of expectation that rely on preceding perceptions of employer attributes. The results presented are based on a quantitative survey that was undertaken among employees of the German service sector. Conceptually a five-dimensional CSR construct (ecology, employees, marketplace, society and corporate governance) and a two-dimensional non-CSR construct (company and workplace) were applied to differentiate employer characteristics. (1) Respondents were asked to indicate the importance of different facets of CSR-related and non-CSR-related employer attributes. By means of a conjoint analysis, the relative importance of each employer attribute was calculated from the data. (2) In addition to this, participants stated their level of satisfaction with specific employer attributes. Both indications were merged to individually weighted satisfaction indexes on the seven-dimensional levels of employer characteristics. The affective organizational commitment of employees (dependent variable) was gathered by applying the established 15-items Organizational Commitment Questionnaire (OCQ). The findings related to the relationship between satisfaction and commitment will be presented. Furthermore, the question will be addressed, how important satisfaction with CSR is in relation to the satisfaction with other attributes of the company in the creation of commitment. Practical as well as scientific implications will be discussed especially with reference to previous results that focused on CSR perception as a commitment driver.

Keywords: corporate social responsibility, organizational commitment, employee attitudes/satisfaction, employee expectations, employer brand

Procedia PDF Downloads 250
3756 Connecting MRI Physics to Glioma Microenvironment: Comparing Simulated T2-Weighted MRI Models of Fixed and Expanding Extracellular Space

Authors: Pamela R. Jackson, Andrea Hawkins-Daarud, Cassandra R. Rickertsen, Kamala Clark-Swanson, Scott A. Whitmire, Kristin R. Swanson

Abstract:

Glioblastoma Multiforme (GBM), the most common primary brain tumor, often presents with hyperintensity on T2-weighted or T2-weighted fluid attenuated inversion recovery (T2/FLAIR) magnetic resonance imaging (MRI). This hyperintensity corresponds with vasogenic edema, however there are likely many infiltrating tumor cells within the hyperintensity as well. While MRIs do not directly indicate tumor cells, MRIs do reflect the microenvironmental water abnormalities caused by the presence of tumor cells and edema. The inherent heterogeneity and resulting MRI features of GBMs complicate assessing disease response. To understand how hyperintensity on T2/FLAIR MRI may correlate with edema in the extracellular space (ECS), a multi-compartmental MRI signal equation which takes into account tissue compartments and their associated volumes with input coming from a mathematical model of glioma growth that incorporates edema formation was explored. The reasonableness of two possible extracellular space schema was evaluated by varying the T2 of the edema compartment and calculating the possible resulting T2s in tumor and peripheral edema. In the mathematical model, gliomas were comprised of vasculature and three tumor cellular phenotypes: normoxic, hypoxic, and necrotic. Edema was characterized as fluid leaking from abnormal tumor vessels. Spatial maps of tumor cell density and edema for virtual tumors were simulated with different rates of proliferation and invasion and various ECS expansion schemes. These spatial maps were then passed into a multi-compartmental MRI signal model for generating simulated T2/FLAIR MR images. Individual compartments’ T2 values in the signal equation were either from literature or estimated and the T2 for edema specifically was varied over a wide range (200 ms – 9200 ms). T2 maps were calculated from simulated images. T2 values based on simulated images were evaluated for regions of interest (ROIs) in normal appearing white matter, tumor, and peripheral edema. The ROI T2 values were compared to T2 values reported in literature. The expanding scheme of extracellular space is had T2 values similar to the literature calculated values. The static scheme of extracellular space had a much lower T2 values and no matter what T2 was associated with edema, the intensities did not come close to literature values. Expanding the extracellular space is necessary to achieve simulated edema intensities commiserate with acquired MRIs.

Keywords: extracellular space, glioblastoma multiforme, magnetic resonance imaging, mathematical modeling

Procedia PDF Downloads 215
3755 Determinants of Rural Household Effective Demand for Biogas Technology in Southern Ethiopia

Authors: Mesfin Nigussie

Abstract:

The objectives of the study were to identify factors affecting rural households’ willingness to install biogas plant and amount willingness to pay in order to examine determinants of effective demand for biogas technology. A multistage sampling technique was employed to select 120 respondents for the study. The binary probit regression model was employed to identify factors affecting rural households’ decision to install biogas technology. The probit model result revealed that household size, total household income, access to extension services related to biogas, access to credit service, proximity to water sources, perception of households about the quality of biogas, perception index about attributes of biogas, perception of households about installation cost of biogas and availability of energy source were statistically significant in determining household’s decision to install biogas. Tobit model was employed to examine determinants of rural household’s amount of willingness to pay. Based on the model result, age of the household head, total annual income of the household, access to extension service and availability of other energy source were significant variables that influence willingness to pay. Providing due considerations for extension services, availability of credit or subsidy, improving the quality of biogas technology design and minimizing cost of installation by using locally available materials are the main suggestions of this research that help to create effective demand for biogas technology.

Keywords: biogas technology, effective demand, probit model, tobit model, willingnes to pay

Procedia PDF Downloads 120
3754 Evaluation of the CRISP-DM Business Understanding Step: An Approach for Assessing the Predictive Power of Regression versus Classification for the Quality Prediction of Hydraulic Test Results

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Digitalisation in production technology is a driver for the application of machine learning methods. Through the application of predictive quality, the great potential for saving necessary quality control can be exploited through the data-based prediction of product quality and states. However, the serial use of machine learning applications is often prevented by various problems. Fluctuations occur in real production data sets, which are reflected in trends and systematic shifts over time. To counteract these problems, data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets to extract stable features. Successful process control of the target variables aims to centre the measured values around a mean and minimise variance. Competitive leaders claim to have mastered their processes. As a result, much of the real data has a relatively low variance. For the training of prediction models, the highest possible generalisability is required, which is at least made more difficult by this data availability. The implementation of a machine learning application can be interpreted as a production process. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that describes the life cycle of data science. As in any process, the costs to eliminate errors increase significantly with each advancing process phase. For the quality prediction of hydraulic test steps of directional control valves, the question arises in the initial phase whether a regression or a classification is more suitable. In the context of this work, the initial phase of the CRISP-DM, the business understanding, is critically compared for the use case at Bosch Rexroth with regard to regression and classification. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. Suitable methods for leakage volume flow regression and classification for inspection decision are applied. Impressively, classification is clearly superior to regression and achieves promising accuracies.

Keywords: classification, CRISP-DM, machine learning, predictive quality, regression

Procedia PDF Downloads 120
3753 Statistical Model to Examine the Impact of the Inflation Rate and Real Interest Rate on the Bahrain Economy

Authors: Ghada Abo-Zaid

Abstract:

Introduction: Oil is one of the most income source in Bahrain. Low oil price influence on the economy growth and the investment rate in Bahrain. For example, the economic growth was 3.7% in 2012, and it reduced to 2.9% in 2015. Investment rate was 9.8% in 2012, and it is reduced to be 5.9% and -12.1% in 2014 and 2015, respectively. The inflation rate is increased to the peak point in 2013 with 3.3 %. Objectives: The objectives here are to build statistical models to examine the effect of the interest rate inflation rate on the growth economy in Bahrain from 2000 to 2018. Methods: This study based on 18 years, and the multiple regression model is used for the analysis. All of the missing data are omitted from the analysis. Results: Regression model is used to examine the association between the Growth national product (GNP), the inflation rate, and real interest rate. We found that (i) Increase the real interest rate decrease the GNP. (ii) Increase the inflation rate does not effect on the growth economy in Bahrain since the average of the inflation rate was almost 2%, and this is considered as a low percentage. Conclusion: There is a positive impact of the real interest rate on the GNP in Bahrain. While the inflation rate does not show any negative influence on the GNP as the inflation rate was not large enough to effect negatively on the economy growth rate in Bahrain.

Keywords: growth national product, egypt, regression model, interest rate

Procedia PDF Downloads 136
3752 The Prediction of Effective Equation on Drivers' Behavioral Characteristics of Lane Changing

Authors: Khashayar Kazemzadeh, Mohammad Hanif Dasoomi

Abstract:

According to the increasing volume of traffic, lane changing plays a crucial role in traffic flow. Lane changing in traffic depends on several factors including road geometrical design, speed, drivers’ behavioral characteristics, etc. A great deal of research has been carried out regarding these fields. Despite of the other significant factors, the drivers’ behavioral characteristics of lane changing has been emphasized in this paper. This paper has predicted the effective equation based on personal characteristics of lane changing by regression models.

Keywords: effective equation, lane changing, drivers’ behavioral characteristics, regression models

Procedia PDF Downloads 425
3751 Climate Changes in Albania and Their Effect on Cereal Yield

Authors: Lule Basha, Eralda Gjika

Abstract:

This study is focused on analyzing climate change in Albania and its potential effects on cereal yields. Initially, monthly temperature and rainfalls in Albania were studied for the period 1960-2021. Climacteric variables are important variables when trying to model cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. The multiple linear regression analysis and lasso regression method are applied to the data between cereal yield and each independent variable: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, data follow a normal distribution, and there is a low correlation between factors, so we do not have the problem of multicollinearity. Machine-learning methods, such as random forest, are used to predict cereal yield responses to climacteric and other variables. Random Forest showed high accuracy compared to the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield. The coefficients of fertilizer consumption, arable land, and land under cereal production are positively affecting production. Our results show that the Random Forest method is an effective and versatile machine-learning method for cereal yield prediction compared to the other two methods.

Keywords: cereal yield, climate change, machine learning, multiple regression model, random forest

Procedia PDF Downloads 68
3750 Clustering for Detection of the Population at Risk of Anticholinergic Medication

Authors: A. Shirazibeheshti, T. Radwan, A. Ettefaghian, G. Wilson, C. Luca, Farbod Khanizadeh

Abstract:

Anticholinergic medication has been associated with events such as falls, delirium, and cognitive impairment in older patients. To further assess this, anticholinergic burden scores have been developed to quantify risk. A risk model based on clustering was deployed in a healthcare management system to cluster patients into multiple risk groups according to anticholinergic burden scores of multiple medicines prescribed to patients to facilitate clinical decision-making. To do so, anticholinergic burden scores of drugs were extracted from the literature, which categorizes the risk on a scale of 1 to 3. Given the patients’ prescription data on the healthcare database, a weighted anticholinergic risk score was derived per patient based on the prescription of multiple anticholinergic drugs. This study was conducted on over 300,000 records of patients currently registered with a major regional UK-based healthcare provider. The weighted risk scores were used as inputs to an unsupervised learning algorithm (mean-shift clustering) that groups patients into clusters that represent different levels of anticholinergic risk. To further evaluate the performance of the model, any association between the average risk score within each group and other factors such as socioeconomic status (i.e., Index of Multiple Deprivation) and an index of health and disability were investigated. The clustering identifies a group of 15 patients at the highest risk from multiple anticholinergic medication. Our findings also show that this group of patients is located within more deprived areas of London compared to the population of other risk groups. Furthermore, the prescription of anticholinergic medicines is more skewed to female than male patients, indicating that females are more at risk from this kind of multiple medications. The risk may be monitored and controlled in well artificial intelligence-equipped healthcare management systems.

Keywords: anticholinergic medicines, clustering, deprivation, socioeconomic status

Procedia PDF Downloads 179
3749 Reducing CO2 Emission Using EDA and Weighted Sum Model in Smart Parking System

Authors: Rahman Ali, Muhammad Sajjad, Farkhund Iqbal, Muhammad Sadiq Hassan Zada, Mohammed Hussain

Abstract:

Emission of Carbon Dioxide (CO2) has adversely affected the environment. One of the major sources of CO2 emission is transportation. In the last few decades, the increase in mobility of people using vehicles has enormously increased the emission of CO2 in the environment. To reduce CO2 emission, sustainable transportation system is required in which smart parking is one of the important measures that need to be established. To contribute to the issue of reducing the amount of CO2 emission, this research proposes a smart parking system. A cloud-based solution is provided to the drivers which automatically searches and recommends the most preferred parking slots. To determine preferences of the parking areas, this methodology exploits a number of unique parking features which ultimately results in the selection of a parking that leads to minimum level of CO2 emission from the current position of the vehicle. To realize the methodology, a scenario-based implementation is considered. During the implementation, a mobile application with GPS signals, vehicles with a number of vehicle features and a list of parking areas with parking features are used by sorting, multi-level filtering, exploratory data analysis (EDA, Analytical Hierarchy Process (AHP)) and weighted sum model (WSM) to rank the parking areas and recommend the drivers with top-k most preferred parking areas. In the EDA process, “2020testcar-2020-03-03”, a freely available dataset is used to estimate CO2 emission of a particular vehicle. To evaluate the system, results of the proposed system are compared with the conventional approach, which reveal that the proposed methodology supersedes the conventional one in reducing the emission of CO2 into the atmosphere.

Keywords: car parking, Co2, Co2 reduction, IoT, merge sort, number plate recognition, smart car parking

Procedia PDF Downloads 127