Search results for: statistical regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6496

6106 The Impact of Unconditional and Conditional Conservatism on Cost of Equity Capital: A Quantile Regression Approach for MENA Countries

Authors: Khalifa Maha, Ben Othman Hakim, Khaled Hussainey

Abstract:

Prior empirical studies have investigated the economic consequences of accounting conservatism by examining its impact on the cost of equity capital (COEC). However, the findings are not conclusive. We posit that the inconsistent results may be attributable to the regression models used in the data analysis. To address this issue, we re-examine the effect of two dimensions of accounting conservatism, unconditional conservatism (U_CONS) and conditional conservatism (C_CONS), on the COEC for a sample of listed firms from Middle East and North Africa (MENA) countries, applying the quantile regression (QR) approach developed by Koenker and Bassett (1978). Although the classical ordinary least squares (OLS) method is widely used in empirical accounting research, it may produce inefficient and biased estimates when the errors depart from normality or have a long-tailed distribution. The QR method is better suited than OLS to handle this kind of problem. It allows the coefficients on the independent variables to shift across the distribution of the dependent variable, whereas OLS estimates only the conditional mean effect on the response variable. As predicted, we find that U_CONS has a significant positive effect on the COEC, whereas C_CONS has a negative impact. The findings also suggest that the effect of the two dimensions of accounting conservatism differs considerably across COEC quantiles. By comparing the QR results with those of OLS, this study sheds more light on the association between accounting conservatism and COEC.
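
The contrast between OLS and quantile regression described above can be illustrated with a minimal sketch. It is not the authors' model: the data are hypothetical, the function names are ours, and the brute-force fit relies on the fact that, with a single regressor, some line minimizing the check loss passes through two data points, which is practical only for small samples.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Check (pinball) loss: tau*u for u >= 0, (tau - 1)*u for u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def quantile_line(xs, ys, tau):
    """Fit y = a + b*x at quantile tau by trying every line through two points."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical pair does not define a regression line
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        loss = sum(pinball_loss(y - a - b * x, tau) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]


def ols_line(xs, ys):
    """Ordinary least squares line, which targets the conditional mean."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b


# Hypothetical heteroscedastic sample: the spread of y grows with x.
xs = [x for x in range(1, 7) for _ in range(2)]
ys = [x * f for x in range(1, 7) for f in (0.5, 1.5)]

for tau in (0.1, 0.9):
    a, b = quantile_line(xs, ys, tau)
    print(f"tau={tau}: intercept={a:.2f}, slope={b:.2f}")
print("OLS: intercept=%.2f, slope=%.2f" % ols_line(xs, ys))
```

In this toy sample the fitted slope differs between the lower and upper quantiles, while OLS reports a single slope for the mean; that is the kind of variation across quantiles the abstract refers to.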

Keywords: unconditional conservatism, conditional conservatism, cost of equity capital, OLS, quantile regression, emerging markets, MENA countries

Procedia PDF Downloads 338
6105 Approach to Formulate Intuitionistic Fuzzy Regression Models

Authors: Liang-Hsuan Chen, Sheng-Shing Nien

Abstract:

This study aims to develop approaches to formulate intuitionistic fuzzy regression (IFR) models for many decision-making applications in the fuzzy environments using intuitionistic fuzzy observations. Intuitionistic fuzzy numbers (IFNs) are used to characterize the fuzzy input and output variables in the IFR formulation processes. A mathematical programming problem (MPP) is built up to optimally determine the IFR parameters. Each parameter in the MPP is defined as a couple of alternative numerical variables with opposite signs, and an intuitionistic fuzzy error term is added to the MPP to characterize the uncertainty of the model. The IFR model is formulated based on the distance measure to minimize the total distance errors between estimated and observed intuitionistic fuzzy responses in the MPP resolution processes. The proposed approaches are simple/efficient in the formulation/resolution processes, in which the sign of parameters can be determined so that the problem to predetermine the sign of parameters is avoided. Furthermore, the proposed approach has the advantage that the spread of the predicted IFN response will not be over-increased, since the parameters in the established IFR model are crisp. The performance of the obtained models is evaluated and compared with the existing approaches.

Keywords: fuzzy sets, intuitionistic fuzzy number, intuitionistic fuzzy regression, mathematical programming method

Procedia PDF Downloads 116
6104 A Preliminary Study of the Subcontractor Evaluation System for the International Construction Market

Authors: Hochan Seok, Woosik Jang, Seung-Heon Han

Abstract:

The stagnant global construction market has intensified competition since 2008 among firms that aim to win overseas contracts. Against this backdrop, subcontractor selection is identified as one of the most critical success factors in overseas construction projects. However, it is difficult to select qualified subcontractors due to the lack of evaluation standards and reliability. This study aims to identify the problems associated with existing subcontractor evaluations using a correlation analysis and a multiple regression analysis with pre-qualification and performance evaluation data from 121 firms in six countries.

Keywords: subcontractor evaluation system, pre-qualification, performance evaluation, correlation analysis, multiple regression analysis

Procedia PDF Downloads 342
6103 Liquid Chromatography Microfluidics for Detection and Quantification of Urine Albumin Using Linear Regression Method

Authors: Patricia B. Cruz, Catrina Jean G. Valenzuela, Analyn N. Yumang

Abstract:

Nearly a hundred per million of the Filipino population is diagnosed with chronic kidney disease (CKD). The early stage of CKD has no symptoms and can only be discovered once the patient undergoes urinalysis. Over the years, different methods have been developed and used to quantify urinary albumin. These include immunochemical assays, most of which require large machinery with high maintenance and resource costs, and the dipstick test, whose reliability in detecting early stages of microalbuminuria is still debated. This study uses the liquid chromatography concept in microfluidic instruments, with a biosensor, as the means of separation and detection, respectively, and linear regression to quantify human urinary albumin. The researchers’ main objective was to create a miniature system that detects and quantifies patients’ urinary albumin while reducing the volume used per five test samples. For this study, 30 urine samples of unknown albumin concentration were tested using both the VITROS Analyzer and the microfluidic system for comparison. Based on the data from both methods, the actual vs. predicted regression showed a positive linear relationship with an R² of 0.9995 and a linear equation of y = 1.09x + 0.07, indicating that the predicted and actual values are approximately equal. Furthermore, the microfluidic instrument uses 75% less total volume (sample and reagents combined) than the VITROS Analyzer per five test samples.
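
As a rough illustration of how a calibration line and its R² such as those reported above are obtained, the sketch below fits a least-squares line to two sets of readings. The numbers are synthetic and merely placed near the reported line; they are not the study's measurements, and treating the reference analyzer as x and the microfluidic reading as y is our assumption.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope*x + intercept, plus R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic readings (NOT the study's data), scattered around y = 1.09x + 0.07.
reference = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
noise = [0.05, -0.04, 0.03, -0.02, 0.04, -0.06]
microfluidic = [1.09 * r + 0.07 + e for r, e in zip(reference, noise)]

slope, intercept, r2 = fit_line(reference, microfluidic)
print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r2:.4f}")
```

A slope close to 1 and an intercept close to 0 indicate that the two methods agree; R² alone only measures how tightly the points follow the fitted line.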

Keywords: Chronic Kidney Disease, Linear Regression, Microfluidics, Urinary Albumin

Procedia PDF Downloads 115
6102 A Proposed Algorithm for Obtaining the Map of Subscribers’ Density Distribution for a Mobile Wireless Communication Network

Authors: C. Temaneh-Nyah, F. A. Phiri, D. Karegeya

Abstract:

This paper presents an algorithm for obtaining the map of subscribers’ density distribution for a mobile wireless communication network based on the actual subscriber traffic data obtained from the base station. This is useful in the statistical characterization of the mobile wireless network.

Keywords: electromagnetic compatibility, statistical analysis, simulation of communication network, subscriber density

Procedia PDF Downloads 291
6101 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations

Authors: Kuei-Ling Sun, Emily Chia-Yu Su

Abstract:

Allergy is a hypersensitive overreaction of the immune system to environmental stimuli and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthma, shock, or other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal wool, and other allergens. The development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body’s immune system. Immunoglobulin E antibodies bind to an allergen and then to a receptor on mast cells or basophils, triggering the release of inflammatory chemicals such as histamine. Given the increasingly serious problems of environmental change, changes in lifestyle, air pollution, and other factors, in this study we collect both allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT), and neural networks (NN), to compare the models and determine the permutations of allergen amino acid sequences.

Keywords: allergy, classification, decision tree, logistic regression, machine learning

Procedia PDF Downloads 279
6100 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila, V. Mahesh

Abstract:

Pulmonary function tests are important non-invasive diagnostic tests that assess respiratory impairments and provide quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, which depend markedly on the age of the patient, resulting in incomplete data sets. This paper presents nonlinear models built using multivariate adaptive regression splines (MARS) and random forest (RF) regression to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N = 198 subjects. The ranked order of the feature importance index calculated by the random forest model shows that the spirometric features FVC, FEF 25, PEF, FEF 25-75, FEF 50, and the demographic parameter height are the important descriptors. A comparison of the performance of both models shows that the prediction ability of MARS with the top two ranked features, namely FVC and FEF 25, is higher, yielding a model fit of R² = 0.96 and R² = 0.99 for normal and abnormal subjects, respectively. The root mean square error analysis of the RF and MARS models also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with an intelligent support system for medical diagnosis and the improvement of clinical care.

Keywords: FEV, multivariate adaptive regression splines, pulmonary function test, random forest

Procedia PDF Downloads 283
6099 Mediation Analysis of the Efficacy of the Nimotuzumab-Cisplatin-Radiation (NCR) Improve Overall Survival (OS): A HPV Negative Oropharyngeal Cancer Patient (HPVNOCP) Cohort

Authors: Akshay Patil

Abstract:

Objective: Mediation analysis identifies causal pathways by testing the relationships between nimotuzumab-cisplatin-radiation (NCR), overall survival (OS), and an intermediate variable that mediates the relationship between NCR and OS. Introduction: In randomized controlled trials, the primary interest is in the mechanisms by which an intervention exerts its effects on the outcomes. Clinicians are often interested in how the intervention works (or why it does not work) through hypothesized causal mechanisms. In this work, we highlight the value of understanding causal mechanisms in randomized trials by applying causal mediation analysis to a randomized trial in oncology. Methods: Data were obtained from a phase III randomized trial (HPVNOCP subgroup). NCR is reported to significantly improve the OS of patients with locally advanced head and neck cancer undergoing definitive chemoradiation. Here, based on the trial data, the mediating effect of NCR on overall survival was systematically quantified through progression-free survival (PFS), disease-free survival (DFS), loco-regional failure (LRF), disease control rate (DCR), and overall response rate (ORR). The effects of potential mediators on the hazard ratio (HR) for OS with NCR versus cisplatin-radiation (CR) were analyzed with Cox regression models. Statistical analyses were performed using R software version 3.6.3 (The R Foundation for Statistical Computing). Results: PFS mediated the association between NCR treatment and OS, with an indirect effect (IE) of 0.76 (0.62 – 0.95), which mediated 60.69% of the treatment effect. Taking baseline confounders into account, the overall adjusted hazard ratio of death was 0.64 (95% CI: 0.43 – 0.96; P = 0.03). DFS was also a significant mediator, with an IE of 0.77 (95% CI: 0.62 – 0.93; 58% mediated). Smaller mediation effects (maximum 27%) were observed for LRF, with an IE of 0.88 (0.74 – 1.06). DCR and ORR mediated 10% and 15%, respectively, of the effect of NCR vs. CR on OS, with IEs of 0.65 (95% CI: 0.81 – 1.08) and 0.94 (95% CI: 0.79 – 1.04). Conclusion: Our findings suggest that PFS and DFS were the most important mediators of the OS benefit of adding nimotuzumab to weekly cisplatin-radiation in HPVNOCP.

Keywords: mediation analysis, cancer data, survival, NCR, HPV negative oropharyngeal

Procedia PDF Downloads 119
6098 Dimensional Accuracy of CNTs/PMMA Parts and Holes Produced by Laser Cutting

Authors: A. Karimzad Ghavidel, M. Zadshakouyan

Abstract:

Laser cutting is a very common production method for cutting 2D polymeric parts. The development of polymer composites with nano-fibers makes their other properties, such as laser workability, important. The aim of this research is to investigate the influence of different laser cutting conditions on the dimensional accuracy of parts and holes made from poly(methyl methacrylate) (PMMA)/carbon nanotube (CNT) material. Experiments were carried out with CNT content (four levels: 0, 0.5, 1, and 1.5 wt.%), laser power (60, 80, and 100 W), and cutting speed (20, 30, and 40 mm/s) as the input variable factors. The results reveal that adding CNTs improves the laser workability of PMMA and that increasing the power has a significant effect on the part and hole size. The findings also show that cutting speed is an effective parameter for size accuracy. Finally, a statistical analysis of the results was carried out, and the mathematical equations obtained by regression are presented to determine the relation between the input and output factors.

Keywords: dimensional accuracy, PMMA, CNTs, laser cutting

Procedia PDF Downloads 283
6097 The Study of Genetic Diversity in Canola Cultivars of Kashmar-Iran Region

Authors: Seyed Habib Shojaei, Reza Eivazi, Mir Sajad Shojaei, Alireza Akbari, Pooria Mazloom, Seyede Mitra Sadati, Mir Zeinalabedin Shojaei, Farnaz Farbakhsh

Abstract:

To study the genetic diversity and agronomic traits of rapeseed, an experiment was conducted using multivariate statistical methods at the Agricultural Research Station of Kashmar in 2012-2013. In this experiment, ten genotypes of rapeseed were evaluated in a randomized complete block design with three replications. The following traits were studied: seed yield, number of days to fifty percent flowering, plant height, number of pods on the main stem, pod length, seed yield per plant, number of seeds per pod, harvest index, weight of 100 seeds, number of pods per lateral branch, and number of lateral branches. In the analysis of variance, the differences between cultivars were significant. The comparison of means revealed that, with respect to these traits, the most valuable variety was Licord, while the least valuable was Opera. In the stepwise regression, harvest index, grain yield per plant, and number of pods per lateral branch entered the model. Correlation analysis showed that grain yield has a positive and significant correlation with the number of pods per lateral branch and with seed yield per plant. In the factor analysis, the first five components explained more than 83% of the variance in the data. In the first factor, seed yield and the number of pods per lateral branch were of the highest importance. Seed yield per plant and pods per main stem were of great significance in the second factor. Moreover, in the third factor, plant height and the number of lateral branches were more important. In the fourth factor, plant height and hundred-seed weight had the highest variance. Finally, days to fifty percent flowering and hundred-seed weight were more important in the fifth factor.

Keywords: rapeseed, variance analysis, regression, factor analysis

Procedia PDF Downloads 227
6096 On Improving Breast Cancer Prediction Using GRNN-CP

Authors: Kefaya Qaddoum

Abstract:

The aim of this study is to predict breast cancer and to construct a supportive model that enables more reliable prediction, a factor that is fundamental for public health. In this study, we utilize general regression neural networks (GRNN) and replace the usual point predictions with prediction intervals in order to achieve a reasonable level of confidence. The mechanism employed here uses a machine learning framework called conformal prediction (CP), combined with GRNN, to assign consistent confidence measures to predictions. We apply the resulting algorithm to the problem of breast cancer diagnosis. The results show that the predictions constructed by this method are reasonable and could be useful in practice.
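
The combination of a GRNN with conformal prediction can be sketched as follows. This is an illustration under our own assumptions rather than the author's algorithm: the abstract does not say which conformal variant is used, so the sketch uses the split (inductive) variant with absolute residuals as nonconformity scores, a one-dimensional input, a fixed smoothing parameter `sigma`, and synthetic data.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=0.5):
    """GRNN output: Gaussian-kernel-weighted average of the training targets."""
    weights = [math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in train_x]
    return sum(w * yi for w, yi in zip(weights, train_y)) / sum(weights)


def conformal_halfwidth(calib_x, calib_y, train_x, train_y, alpha=0.2):
    """Split conformal: a high quantile of absolute residuals on held-out data."""
    scores = sorted(abs(y - grnn_predict(x, train_x, train_y))
                    for x, y in zip(calib_x, calib_y))
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))  # 1-based rank of the score to use
    if rank > n:
        return math.inf  # too few calibration points for this confidence level
    return scores[rank - 1]


# Synthetic one-dimensional data (hypothetical, for illustration only).
train_x = [i * 0.5 for i in range(10)]
train_y = [2.0 * x + 1.0 + 0.2 * (-1) ** i for i, x in enumerate(train_x)]
calib_x = [0.25 + i * 0.5 for i in range(9)]
calib_y = [2.0 * x + 1.0 + 0.1 * (-1) ** i for i, x in enumerate(calib_x)]

q = conformal_halfwidth(calib_x, calib_y, train_x, train_y, alpha=0.2)
new_x = 2.1
point = grnn_predict(new_x, train_x, train_y)
print(f"prediction {point:.3f}, 80% interval [{point - q:.3f}, {point + q:.3f}]")
```

The interval width comes entirely from held-out residuals, so the coverage guarantee does not depend on the GRNN being well specified, only on the calibration and new points being exchangeable.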

Keywords: neural network, conformal prediction, cancer classification, regression

Procedia PDF Downloads 260
6095 Multiple Linear Regression for Rapid Estimation of Subsurface Resistivity from Apparent Resistivity Measurements

Authors: Sabiu Bala Muhammad, Rosli Saad

Abstract:

Multiple linear regression (MLR) models for fast estimation of true subsurface resistivity from apparent resistivity field measurements are developed and assessed in this study. The parameters investigated were apparent resistivity (ρₐ), horizontal location (X) and depth (Z) of measurement as the independent variables, and true resistivity (ρₜ) as the dependent variable. To achieve linearity in both resistivity variables, the datasets were first transformed into the logarithmic domain following diagnostic checks of normality of the dependent variable and of heteroscedasticity to ensure accurate models. Four MLR models were developed based on hierarchical combinations of the independent variables. The generated MLR coefficients were applied to another dataset to estimate ρₜ values for validation. Contours of the estimated ρₜ values were plotted and compared with the observed data plots at the same colour scale and blanking for visual assessment. The accuracy of the models was assessed using the coefficient of determination (R²), standard error (SE), and weighted mean absolute percentage error (wMAPE). It is concluded that the MLR models can estimate ρₜ with a high level of accuracy.
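
A minimal sketch of this workflow is given below: both resistivities are log-transformed, an MLR is fitted by the normal equations, and predictions are back-transformed and scored with wMAPE. The data, the coefficients used to generate them, and the wMAPE definition (sum of absolute errors divided by sum of absolute actual values) are our assumptions; the abstract does not give the authors' exact formulas or values.

```python
import math


def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_mlr(rows, targets):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


def wmape(actual, predicted):
    """Weighted MAPE, taken here as sum|error| / sum|actual| (one common form)."""
    return (sum(abs(a - p) for a, p in zip(actual, predicted))
            / sum(abs(a) for a in actual))


# Synthetic, noise-free survey (hypothetical values, not field data).
rho_a = [10.0, 20.0, 50.0, 100.0, 200.0, 500.0, 1000.0, 30.0]
x_loc = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0]
depth = [1.0, 2.0, 1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
true_beta = [0.2, 1.1, 0.01, -0.05]  # assumed generating coefficients
rows = [(math.log10(r), x, z) for r, x, z in zip(rho_a, x_loc, depth)]
log_rho_t = [true_beta[0] + sum(b * v for b, v in zip(true_beta[1:], row))
             for row in rows]
rho_t = [10.0 ** v for v in log_rho_t]

beta = fit_mlr(rows, log_rho_t)
predicted = [10.0 ** (beta[0] + sum(b * v for b, v in zip(beta[1:], row)))
             for row in rows]
print("coefficients:", [round(b, 4) for b in beta])
print("wMAPE:", wmape(rho_t, predicted))
```

Because the fit is done in the log domain, errors are relative rather than absolute, which is why a percentage-type score such as wMAPE is a natural check after back-transforming.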

Keywords: apparent resistivity, depth, horizontal location, multiple linear regression, true resistivity

Procedia PDF Downloads 249
6094 Statistical Analysis of Cables in Long-Span Cable-Stayed Bridges

Authors: Ceshi Sun, Yueyu Zhao, Yaobing Zhao, Zhiqiang Wang, Jian Peng, Pengxin Guo

Abstract:

With the rapid development of transportation, there are more than 100 cable-stayed bridges with main span larger than 300 m in China. In order to ascertain the statistical relationships among the design parameters of stay cables and their distribution characteristics, 1500 cables were selected from 25 practical long-span cable-stayed bridges. A new relationship between the first order frequency and the length of cable was found by conducting the curve fitting. Then, based on this relationship other interesting relationships were deduced. Several probability density functions (PDFs) were used to investigate the distributions of the parameters of first order frequency, stress level and the Irvine parameter. It was found that these parameters obey the Lognormal distribution, the Weibull distribution and the generalized Pareto distribution, respectively. Scatter diagrams of the three parameters were plotted and their 95% confidence intervals were also investigated.

Keywords: cable, cable-stayed bridge, long-span, statistical analysis

Procedia PDF Downloads 603
6093 Multicollinearity and MRA in Sustainability: Application of the Raise Regression

Authors: Claudia García-García, Catalina B. García-García, Román Salmerón-Gómez

Abstract:

Much economic-environmental research includes the analysis of possible interactions by using Moderated Regression Analysis (MRA), which is a specific application of multiple linear regression analysis. This methodology allows analyzing how the effect of one of the independent variables is moderated by a second independent variable by adding a cross-product term between them as an additional explanatory variable. Owing to the very specification of the methodology, the moderated factor is often highly correlated with the constitutive terms, so serious multicollinearity problems arise. Strong multicollinearity in a model has important consequences: the variances of the estimators may be inflated; regressors that are probably relevant tend to be judged non-significant even though the coefficient of determination is very high; coefficients may take incorrect signs; and the results become highly sensitive to small changes in the dataset. Finally, the strong relationship among the explanatory variables makes it difficult to isolate the individual effect of each one on the model under study. Carried over to the moderated analysis, these consequences may imply that it is not worth including an interaction term that may be distorting the model. Thus, it is important to manage the problem with a methodology that yields reliable results. A review of the works that applied MRA in the ten top journals of the field makes clear that multicollinearity is mostly disregarded: fewer than 15% of the reviewed works take potential multicollinearity problems into account. To overcome the issue, this work studies the possible application of recent methodologies to MRA; in particular, raise regression is analyzed. This methodology mitigates collinearity from a geometrical point of view: the collinearity problem arises because the variables under study are very close geometrically, so separating the variables mitigates the problem. Raise regression keeps the available information and modifies the problematic variables instead of, for example, deleting variables. Furthermore, the global characteristics of the initial model are also maintained (sum of squared residuals, estimated variance, coefficient of determination, global significance test, and prediction). The proposal is applied to data from countries of the European Union for the last year available, covering greenhouse gas emissions, per capita GDP, and a dummy variable that represents the topography of the country. The use of a dummy variable as the moderator is a special variant of MRA, sometimes called “subgroup regression analysis.” The main conclusion of this work is that applying new techniques to the field can substantially improve the results of the analysis. In particular, the use of raise regression mitigates serious multicollinearity problems, so the researcher is able to rely on the interaction term when interpreting the results of a particular study.
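
The geometric idea can be sketched in a deliberately simplified setting. We assume the usual formulation of raise regression, in which a problematic variable is replaced by itself plus λ times the residual from regressing it on the other regressors. For brevity the sketch uses only two regressors (a dummy `d` and the cross-product `xd`), whereas a full MRA model would also include the continuous term; the data and the values of λ are hypothetical.

```python
import math


def mean(v):
    return sum(v) / len(v)


def corr(u, v):
    """Pearson correlation coefficient."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u)
                           * sum((b - mv) ** 2 for b in v))


def residuals(target, regressor):
    """Residuals of the simple regression of target on regressor (with intercept)."""
    mt, mr = mean(target), mean(regressor)
    slope = (sum((r - mr) * (t - mt) for r, t in zip(regressor, target))
             / sum((r - mr) ** 2 for r in regressor))
    return [t - (mt + slope * (r - mr)) for r, t in zip(regressor, target)]


def raise_variable(target, regressor, lam):
    """Raised variable: target + lam * e, with e orthogonal to the regressor."""
    e = residuals(target, regressor)
    return [t + lam * ei for t, ei in zip(target, e)]


def vif(r):
    """Variance inflation factor for two regressors with correlation r."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical data: a dummy moderator and its cross-product with x = 1..8.
d = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
xd = [0.0, 0.0, 0.0, 0.0, 5.0, 6.0, 7.0, 8.0]

for lam in (0.0, 2.0, 5.0):
    raised = raise_variable(xd, d, lam)
    r = corr(d, raised)
    print(f"lambda={lam}: correlation={r:.3f}, VIF={vif(r):.2f}")
```

Since the added residual is orthogonal to the other regressor, the raised variable spans the same space as before, which is why the fitted values and the sum of squared residuals are unchanged while the correlation and the VIF fall as λ grows.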

Keywords: multicollinearity, MRA, interaction, raise

Procedia PDF Downloads 76
6092 Evaluation of the Weight-Based and Fat-Based Indices in Relation to Basal Metabolic Rate-to-Weight Ratio

Authors: Orkide Donma, Mustafa M. Donma

Abstract:

Basal metabolic rate is questioned as a risk factor for weight gain. The relations between basal metabolic rate and body composition have not yet been clarified. The impact of fat mass on basal metabolic rate is also uncertain. Within this context, indices based upon total body mass as well as total body fat mass are available. In this study, the aim is to investigate the potential clinical utility of these indices in the adult population. A total of 287 individuals, aged 18 to 79 years, were included in the study. Based upon body mass index values, 10 underweight, 88 normal, 88 overweight, 81 obese, and 20 morbidly obese individuals participated. Anthropometric measurements including height (m) and weight (kg) were performed. Body mass index, diagnostic obesity notation model assessment index I, diagnostic obesity notation model assessment index II, and basal metabolic rate-to-weight ratio were calculated. Total body fat mass (kg), fat percent (%), basal metabolic rate, metabolic age, visceral adiposity, fat mass of the upper and lower extremities and trunk, and obesity degree were measured by a TANITA body composition monitor using bioelectrical impedance analysis technology. Statistical evaluations were performed using the SPSS statistical package for Windows, version 16.0. Scatterplots of individual measurements for the parameters concerning correlations were drawn. Linear regression lines were displayed. Statistical significance was accepted at p < 0.05. Strong correlations were obtained between body mass index and diagnostic obesity notation model assessment index I as well as diagnostic obesity notation model assessment index II (p < 0.001). A much stronger correlation was detected between basal metabolic rate and diagnostic obesity notation model assessment index I than between basal metabolic rate and body mass index (p < 0.001). Upon consideration of the associations between basal metabolic rate-to-weight ratio and these three indices, the best association was observed between basal metabolic rate-to-weight ratio and diagnostic obesity notation model assessment index II. In a similar manner, this index was highly correlated with fat percent (p < 0.001). Independently of the indices, a strong correlation was found between fat percent and basal metabolic rate-to-weight ratio (p < 0.001). Visceral adiposity was much more strongly correlated with metabolic age than with chronological age (p < 0.001). In conclusion, all three indices were associated with metabolic age, but not with chronological age. Diagnostic obesity notation model assessment index II values were highly correlated with body mass index values throughout all ranges, from underweight to morbid obesity. This index is the best in terms of its association with basal metabolic rate-to-weight ratio, which can be interpreted as the basal metabolic rate per unit of body weight.

Keywords: basal metabolic rate, body mass index, children, diagnostic obesity notation model assessment index, obesity

Procedia PDF Downloads 132
6091 Investigation of the Main Trends of Tourist Expenses in Georgia

Authors: Nino Abesadze, Marine Mindorashvili, Nino Paresashvili

Abstract:

The main purpose of the article is to carry out a comprehensive statistical analysis of the tourist expenses of foreign visitors. We used a mixed sampling technique that combines the rules of random and proportional selection. SPSS software was used to process the statistical data for the analysis. The tourism statistics methodology was implemented according to international standards. Important information was collected and grouped from the major Georgian airports. Techniques of statistical observation were prepared. A representative population of foreign visitors and a rule for selecting respondents were determined. The number of tourists shows a growth trend, and the share of tourists from post-Soviet countries is constantly increasing. The level of satisfaction with tourist facilities and quality of service has grown, but there is still a disparity between quality of service and prices. The structure of the tourist expenses of foreign visitors is diverse; the competitiveness of the tourist products of Georgian tourist companies is higher.

Keywords: tourist, expenses, methods, statistics, analysis

Procedia PDF Downloads 317
6090 Machine Learning Techniques for Estimating Ground Motion Parameters

Authors: Farid Khosravikia, Patricia Clayton

Abstract:

The main objective of this study is to evaluate the advantages and disadvantages of various machine learning techniques in forecasting ground-motion intensity measures given source characteristics, source-to-site distance, and local site condition. Intensity measures such as peak ground acceleration and velocity (PGA and PGV, respectively), as well as 5% damped elastic pseudospectral accelerations at different periods (PSA), are indicators of the strength of shaking at the ground surface. Estimating these variables for future earthquake events is a key step in seismic hazard assessment and, potentially, in the subsequent risk assessment of different types of structures. Typically, linear regression-based models, with pre-defined equations and coefficients, are used in ground motion prediction. However, due to the restrictions of linear regression methods, such models may not capture more complex nonlinear behaviors that exist in the data. Thus, this study comparatively investigates the potential benefits of employing other machine learning techniques, such as Artificial Neural Network, Random Forest, and Support Vector Machine, as statistical methods in ground motion prediction. The algorithms are adjusted to quantify event-to-event and site-to-site variability of the ground motions by implementing them as random effects in the proposed models to reduce the aleatory uncertainty. All the algorithms are trained using a selected database of 4,528 ground motions, including 376 seismic events with magnitudes of 3 to 5.8, recorded over the hypocentral distance range of 4 to 500 km in Oklahoma, Kansas, and Texas since 2005. The choice of this database is motivated by the recent increase in the seismicity rate of these states, attributed to petroleum production and wastewater disposal activities, which necessitates further investigation of the ground motion models developed for these states. The accuracy of the models in predicting intensity measures, the generalization capability of the models for future data, and the usability of the models are discussed in the evaluation process. The results indicate that the algorithms satisfy some physically sound characteristics, such as magnitude scaling and distance dependency, without requiring pre-defined equations or coefficients. Moreover, it is shown that, when sufficient data are available, all the alternative algorithms tend to provide more accurate estimates than the conventional linear regression-based method, and, in particular, Random Forest outperforms the other algorithms. However, the conventional method is a better tool when limited data are available.

Keywords: artificial neural network, ground-motion models, machine learning, random forest, support vector machine

Procedia PDF Downloads 102
6089 Least Squares Method Identification of Corona Current-Voltage Characteristics and Electromagnetic Field in Electrostatic Precipitator

Authors: H. Nouri, I. E. Achouri, A. Grimes, H. Ait Said, M. Aissou, Y. Zebboudj

Abstract:

This paper aims to analyse the behaviour of DC corona discharge in wire-to-plate electrostatic precipitators (ESP). Current-voltage curves are analysed in particular. Experimental results show that the discharge current is strongly affected by the applied voltage. The proposed method of current identification is the method of least squares. Least squares problems fall into two categories, linear (ordinary) least squares and non-linear least squares, depending on whether or not the residuals are linear in all unknowns. The linear least-squares problem occurs in statistical regression analysis; it has a closed-form solution. A closed-form solution (or closed-form expression) is any formula that can be evaluated in a finite number of standard operations. The non-linear problem has no closed-form solution and is usually solved iteratively.
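
The distinction between the two categories can be illustrated with a small sketch. The model y = a·exp(b·x) and the data below are hypothetical stand-ins chosen for illustration; they are not the corona current-voltage law or the measurements used in the paper. A closed-form linear fit on the logarithms supplies a starting point, and a Gauss-Newton loop with step halving then refines the non-linear fit iteratively.

```python
import math


def sse(params, xs, ys):
    """Sum of squared residuals of the model a*exp(b*x)."""
    a, b = params
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))


def linear_start(xs, ys):
    """Closed-form OLS on (x, ln y), using ln y = ln a + b*x."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mx, ml = sum(xs) / n, sum(logs) / n
    b = (sum((x - mx) * (l - ml) for x, l in zip(xs, logs))
         / sum((x - mx) ** 2 for x in xs))
    return math.exp(ml - b * mx), b


def gauss_newton(xs, ys, start, iterations=20):
    """Iterative non-linear least squares for a*exp(b*x)."""
    a, b = start
    for _ in range(iterations):
        # Residuals and Jacobian columns of the model with respect to a and b.
        r = [y - a * math.exp(b * x) for x, y in zip(xs, ys)]
        ja = [math.exp(b * x) for x in xs]
        jb = [a * x * math.exp(b * x) for x in xs]
        # 2x2 normal equations (J'J) step = J'r, solved by Cramer's rule.
        saa = sum(v * v for v in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * v for u, v in zip(ja, r))
        gb = sum(u * v for u, v in zip(jb, r))
        det = saa * sbb - sab * sab
        if det == 0.0:
            break
        da = (sbb * ga - sab * gb) / det
        db = (saa * gb - sab * ga) / det
        # Halve the step until the squared error does not increase.
        step, current = 1.0, sse((a, b), xs, ys)
        while step > 1e-8 and sse((a + step * da, b + step * db), xs, ys) > current:
            step /= 2.0
        if step <= 1e-8:
            break
        a, b = a + step * da, b + step * db
    return a, b


# Hypothetical readings scattered around y = 2*exp(0.5*x).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
factors = [1.02, 0.97, 1.03, 0.98, 1.01]
ys = [2.0 * math.exp(0.5 * x) * f for x, f in zip(xs, factors)]

start = linear_start(xs, ys)
refined = gauss_newton(xs, ys, start)
print("closed-form start:", start, "SSE:", sse(start, xs, ys))
print("iterative refinement:", refined, "SSE:", sse(refined, xs, ys))
```

The log-domain fit minimizes errors in ln y rather than in y, so it is only a starting point; the iterative stage minimizes the squared error in the original units.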

Keywords: electrostatic precipitator, current-voltage characteristics, least squares method, electric field, magnetic field

Procedia PDF Downloads 409
6088 Bayesian Reliability of Weibull Regression with Type-I Censored Data

Authors: Al Omari Moahmmed Ahmed

Abstract:

In the Bayesian framework, we develop an approach that uses a non-informative prior with a covariate and the Gauss quadrature method to estimate the covariate parameters and the reliability function of the Weibull regression distribution with Type-I censored data. The maximum likelihood estimators are not available in closed form, although they can be obtained numerically using the Newton-Raphson method. The estimators are compared in terms of mean squared error (MSE), and their performance is assessed by simulation over various sample sizes and several specific values of the shape parameter. The results show that the Bayesian estimator with a non-informative prior performs better than the maximum likelihood estimator.

Keywords: non-informative prior, Bayesian method, type-I censoring, Gauss quadrature

Procedia PDF Downloads 477
6087 Walmart Sales Forecasting using Machine Learning in Python

Authors: Niyati Sharma, Om Anand, Sanjeev Kumar Prasad

Abstract:

Estimating future sales is one of the essential elements of an organization's tactical planning. Walmart sales forecasting is a good illustration for beginners because it comes with a large retail data set. Walmart also uses this sales forecasting problem for hiring purposes. We analyze how internal and external factors affecting one of the largest companies in the US shape its future weekly sales. Demand forecasting is the estimation of future demand for products or services on the basis of present and past data and different market conditions. All organizations face an unknown future, so future demand for goods cannot be known in advance. Hence, by exploring historical and recent market data, we forecast the upcoming demand for and production of individual goods, which is particularly challenging in the near term. This makes it possible to produce the required products in advance, in line with market demand. We use several machine learning models and compare their accuracy. A linear regression model fitted to the training data achieves an accuracy of 8.88%, whereas the extra trees regression model gives the best accuracy of 97.15%.

Keywords: random forest algorithm, linear regression algorithm, extra trees classifier, mean absolute error

Procedia PDF Downloads 122
6086 The Relationship between Impaired Fasting Glucose and Serum Fibroblast Growth Factor 21 Level

Authors: Nanhee Cho, Eugene Han, Hanbyul Kim, Hochan Cho

Abstract:

Pre-diabetes includes impaired fasting glucose (IFG) and impaired glucose tolerance (IGT), and there is a strong probability that pre-diabetes will lead to diabetes mellitus (DM). Serum fibroblast growth factor 21 (FGF-21) is known to be increased as a compensatory response to metabolic imbalance under conditions such as obesity, metabolic syndrome, and DM. This study aims to identify the relationship of serum FGF-21 with pre-diabetes and with biomarkers of related metabolic diseases. Fifty-five Korean adult patients participated in a cohort study from June 2012 to December 2015. The analysis revealed that BMI, FBS levels, and serum FGF-21 levels were significantly higher in the IFG group than in the normal group. A multiple regression analysis was conducted on the correlations of serum FGF-21 levels with BMI and FBS levels, and the result did not show statistical significance. In conclusion, our results revealed that serum FGF-21 levels serve as a marker to predict IFG.

Keywords: cytokine, fibroblast growth factor 21, impaired fasting glucose, prediabetes

Procedia PDF Downloads 302
6085 The Role of Japan's Land-Use Planning in Farmland Conservation: A Statistical Study of Tokyo Metropolitan District

Authors: Ruiyi Zhang, Wanglin Yan

Abstract:

A strict land-use plan is issued under the City Planning Act to control urbanization and conserve the semi-natural landscape. The agrarian land resource in the suburbs has indispensable socio-economic value and contributes to the sustainability of the regional environment. However, the agrarian hinterland of the metropolis is witnessing severe farmland conversion and abandonment, while the contribution of land-use planning to farmland conservation remains unclear in those areas. We hypothesize that the current land-use plan contributes to farmland loss. This research therefore investigated the relationship between farmland loss and land-use planning at the municipality level to provide base data for zoning in the metropolitan suburbs and to help develop a sustainable land-use plan that will conserve the agrarian hinterland. The data and methods were as follows. 1) Farmland data from the Census of Agriculture and Forestry for 2005 to 2015 and population data for 2015 and 2018 were used to investigate the spatial distribution features of farmland loss in the Tokyo Metropolitan District (TMD) for two periods: 2005-2010 and 2010-2015. 2) The samples were divided into four urbanization scenarios. 3) DID data and zoning data for 2006 to 2018 were used to specify the urbanization level of zones in order to describe the land-use plan. 4) We then conducted multiple regression between farmland loss (both abandonment and conversion amounts) and the described land-use plan in each urbanization scenario and in each period. The results reveal that the land-use plan has a non-negligible relationship with farmland loss in the metropolitan suburbs at the ward-city-town-village level. 1) Urban promotion areas planned larger than necessary and unregulated urbanization promote both farmland conversion and abandonment, and the effect weakens from the inner suburbs to the outer suburbs. 2) The effect of the land-use plan on farmland abandonment is more obvious than that on farmland conversion.
The study suggests that optimizing the land-use plan would help conserve farmland in metropolitan suburbs, which contributes to sustainable regional policy making.

Keywords: Agrarian land resource, land-use planning, urbanization level, multiple regression

Procedia PDF Downloads 128
6084 Water Access and Food Security: A Cross-Sectional Study of SSA Countries in 2017

Authors: Davod Ahmadi, Narges Ebadi, Ethan Wang, Hugo Melgar-Quiñonez

Abstract:

Compared to the other Least Developed Countries (LDCs), major countries in sub-Saharan Africa (SSA) have limited access to clean water. People in this region, and more specifically females, suffer from acute water scarcity problems. They are compelled to spend too much of their time fetching water for domestic uses such as drinking and washing. Apart from domestic use, water contributes to the food security status of people in vulnerable regions like SSA through its effects on agriculture and livestock. Livestock needs water to grow, and agriculture requires enormous quantities of water for irrigation. The main objective of this study is to explore the association between access to water and individuals’ food security status. Data from the 2017 Gallup World Poll (GWP) for SSA were analyzed (n=35,000). The target population in the GWP is the entire civilian, non-institutionalized population aged 15 and older. All sample selection is probability-based and nationally representative. Gallup surveys an average of 1,000 individuals per country. Three questions related to water (i.e., water quality, availability of water for crops, and availability of water for livestock) were used as the exposure variables. The Food Insecurity Experience Scale (FIES) was used as the outcome variable. FIES measures individuals’ food security status, and it is composed of eight questions with simple dichotomous responses (1=Yes and 0=No). Descriptive statistics, cross-tabulations, and binary logistic regression form the basis of this study. Results from descriptive analyses showed that more than 50% of the respondents had no access to enough water for crops and livestock. More than 85% of respondents were categorized as “food insecure”. Findings from cross-tabulation analyses showed that food security status was significantly associated with water quality (0.135; P=0.000), water for crops (0.106; P=0.000) and water for livestock (0.112; P=0.000).
In regression analyses, the probability of being food insecure increased among people who expressed no satisfaction with water quality (OR=1.884 (1.768-2.008)), not enough water for crops (OR=1.721 (1.616-1.834)) and not enough water for livestock (OR=1.706 (1.819)). In conclusion, it should be noted that water access affects food security status in SSA.
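In binary logistic regression, an odds ratio is the exponential of a fitted coefficient, and a Wald-type confidence interval is obtained by exponentiating the coefficient plus or minus z times its standard error. The sketch below shows that arithmetic. The function name `odds_ratio_ci` and the input values are hypothetical, and the abstract does not state which interval method or confidence level the study used.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient and its standard error
    into an odds ratio with a Wald-type confidence interval."""
    if se < 0:
        raise ValueError("standard error must be non-negative")
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, lower, upper


# Hypothetical coefficient and standard error, not taken from the study.
or_hat, ci_low, ci_high = odds_ratio_ci(beta=math.log(2.0), se=0.1)
```

An interval built this way is symmetric on the log scale, so the odds ratio is the geometric mean of its two bounds. An interval that excludes 1 corresponds to a coefficient that differs significantly from zero at the matching level.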

Keywords: water access, agriculture, livestock, FIES

Procedia PDF Downloads 126
6083 Simulation of Government Management Model to Increase Financial Productivity System Using Govpilot

Authors: Arezou Javadi

Abstract:

The use of algorithmic models that depend on software calculations, together with the simulation of new government management assays with the help of specialized software, has recently increased the productivity and efficiency of the government management system. This has caused the management approach to change from the old bitch & fix model, which has low efficiency and less usefulness, to a more capable and efficient management model called the partnership with resident model. Using Govpilot™ software, the relationship between people in a system and the government was examined. The method of two-tailed interaction was the outsourcing of a goal in a system, which is formed in the order of goals, qualified executive people, optimal executive model, and finally, summarizing additional activities at the different statistical levels. The results showed that the participation of people in a financial implementation system with a statistical potential of P≥5% caused a significant increase in investment and initial capital in the government system, with maximum project implementation in a smart government.

Keywords: machine learning, financial income, statistical potential, govpilot

Procedia PDF Downloads 68
6081 Assessment of Level of Sedation and Associated Factors Among Intubated Critically Ill Children in Pediatric Intensive Care Unit of Jimma University Medical Center: A Fourteen Months Prospective Observation Study, 2023

Authors: Habtamu Wolde Engudai

Abstract:

Background: Sedation can be provided to facilitate a procedure or to stabilize patients admitted to the pediatric intensive care unit (PICU). Sedation is often necessary to maintain optimal care for critically ill children requiring mechanical ventilation. However, sedation that is too deep or too light has its own adverse effects; hence, it is important to monitor the level of sedation and maintain an optimal level. Objectives: To assess the level of sedation and associated factors among intubated critically ill children admitted to the PICU of JUMC, Jimma. Methods: A prospective observational study was conducted in the PICU of JUMC in September 2021 in 105 patients admitted to the PICU who were aged less than 14 years and had a GCS >8. Data were collected by residents and nurses working in the PICU. Data entry was done using EpiData Manager (version 4.6.0.2). Statistical analysis and the creation of charts were performed using SPSS version 26. Data were presented as mean, percentage, and standard deviation. The assumptions of logistic regression were checked. To find potential predictors, bi-variable logistic regression was used for each predictor and outcome variable. A p value of <0.05 was considered statistically significant. Findings are presented using figures, AOR, percentages, and a summary table. Results: The study included 105 critically ill children who were started on continuous or intermittent sedative drugs. Sedation level was assessed using a comfort scale three times per day. Based on these observations, the rate of suboptimal sedation was 44.8% at baseline, 36.2% at eight hours, and 24.8% at sixteen hours.
There was a significant association (P < 0.05) between suboptimal sedation and both the duration of stay on mechanical ventilation and the rate of unplanned extubation; the Hosmer-Lemeshow goodness-of-fit test indicated adequate model fit (p > 0.44).

Keywords: level of sedation, critically ill children, Pediatric intensive care unit, Jimma university

Procedia PDF Downloads 42
6080 Evaluation of the CRISP-DM Business Understanding Step: An Approach for Assessing the Predictive Power of Regression versus Classification for the Quality Prediction of Hydraulic Test Results

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Digitalisation in production technology is a driver for the application of machine learning methods. Through the application of predictive quality, the great potential for saving necessary quality control can be exploited through the data-based prediction of product quality and states. However, the serial use of machine learning applications is often prevented by various problems. Fluctuations occur in real production data sets, which are reflected in trends and systematic shifts over time. To counteract these problems, data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets to extract stable features. Successful process control of the target variables aims to centre the measured values around a mean and minimise variance. Competitive leaders claim to have mastered their processes. As a result, much of the real data has a relatively low variance. For the training of prediction models, the highest possible generalisability is required, which is at least made more difficult by this data availability. The implementation of a machine learning application can be interpreted as a production process. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that describes the life cycle of data science. As in any process, the costs to eliminate errors increase significantly with each advancing process phase. For the quality prediction of hydraulic test steps of directional control valves, the question arises in the initial phase whether a regression or a classification is more suitable. In the context of this work, the initial phase of the CRISP-DM, the business understanding, is critically compared for the use case at Bosch Rexroth with regard to regression and classification. 
The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. Suitable methods for leakage volume flow regression and classification for inspection decision are applied. Impressively, classification is clearly superior to regression and achieves promising accuracies.

Keywords: classification, CRISP-DM, machine learning, predictive quality, regression

Procedia PDF Downloads 118
6079 Support Vector Regression with Weighted Least Absolute Deviations

Authors: Kang-Mo Jung

Abstract:

Least squares support vector machine (LS-SVM) is a penalized regression method that considers both the fitting and the generalization ability of a model. However, the squared loss function is very sensitive to even a single outlier. We propose a weighted absolute deviation loss function to make the estimates of the least absolute deviation support vector machine more robust. The proposed estimates can be obtained by a quadratic programming algorithm. Numerical experiments on simulated datasets show that the proposed algorithm is competitive in terms of robustness to outliers.
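The sensitivity of the squared loss to a single outlier can be seen in the simplest possible setting, estimating one location value. The sketch below is not the paper's quadratic programming algorithm; it is only an illustration, with hypothetical data and helper names, that the mean minimizes the squared loss while a weighted median minimizes the weighted absolute deviation loss.

```python
def weighted_median(values, weights):
    """Return a minimizer of sum(w_i * |y_i - m|) over m."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]  # guard against floating-point round-off


def weighted_abs_loss(m, values, weights):
    """Weighted absolute deviation loss of the location estimate m."""
    return sum(w * abs(y - m) for y, w in zip(values, weights))


# Hypothetical responses with one gross outlier (100.0) and equal weights.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
w = [1.0, 1.0, 1.0, 1.0, 1.0]

mean_fit = sum(y) / len(y)        # minimizer of the squared loss
lad_fit = weighted_median(y, w)   # minimizer of the weighted absolute loss
```

With these values the mean is pulled to 22.0 by the outlier, while the weighted median stays at 3.0. Lowering the weight on suspect observations reduces their influence on the fit further.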

Keywords: least absolute deviation, quadratic programming, robustness, support vector machine, weight

Procedia PDF Downloads 498
6078 The Prediction of Effective Equation on Drivers' Behavioral Characteristics of Lane Changing

Authors: Khashayar Kazemzadeh, Mohammad Hanif Dasoomi

Abstract:

Given the increasing volume of traffic, lane changing plays a crucial role in traffic flow. Lane changing in traffic depends on several factors, including road geometrical design, speed, and drivers’ behavioral characteristics. A great deal of research has been carried out in these fields. Notwithstanding the other significant factors, this paper focuses on drivers’ behavioral characteristics in lane changing. An effective equation based on the personal characteristics associated with lane changing is derived using regression models.

Keywords: effective equation, lane changing, drivers’ behavioral characteristics, regression models

Procedia PDF Downloads 424
6077 Development Strategies for Building Smart Cities: The Case of Kalampaka, Greece

Authors: Christos Stamopoulos

Abstract:

Technological evolution has brought changes and new requirements not only to human life but also to the environment in which people live. Cities have begun to be organized in new ways that comply with contemporary living standards. The aim of this paper was to present the characteristics of smart cities around the world and to introduce good strategies for building them. In addition, a case study of the city of Kalampaka was conducted, and its residents were surveyed. More specifically, residents’ knowledge about smart cities and their opinions on future progress were examined. Statistical analysis showed that residents’ knowledge about smart cities was fairly good (48% knew the phrase 'smart city'). However, respondents believe that the appearance of the city of Kalampaka needs improvement in many areas (75% are disappointed with the current appearance of the city). Furthermore, regression analysis showed that the value of environmental sustainability is greatly influenced by energy saving and that innovation has an impact on the level of quality of life, while older people seem satisfied with the administration’s efforts for development.

Keywords: development, economy, environment, governance, quality of life, smart city

Procedia PDF Downloads 317