Search results for: multivariate regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3507

3417 Arsenic Contamination in Drinking Water Is Associated with Dyslipidemia in Pregnancy

Authors: Begum Rokeya, Rahelee Zinnat, Fatema Jebunnesa, Israt Ara Hossain, A. Rahman

Abstract:

Background and Aims: Arsenic in drinking water is a global environmental health problem, and exposure may increase dyslipidemia and cerebrovascular disease mortality, most likely by causing atherosclerosis. However, the interplay of lipid metabolism, atherosclerosis formation, and arsenic exposure, and its impact in pregnancy, is still unclear. Recent epidemiological evidence indicates a close association between inorganic arsenic exposure via drinking water and dyslipidemia. However, the exact mechanism of this arsenic-mediated increase in atherosclerosis risk factors remains enigmatic. We explore the association between arsenic exposure and serum lipid profile in pregnant subjects. Methods: A total of 200 pregnant mothers from an arsenic-exposed area were screened in this cross-sectional study. The study group included 100 exposed subjects as cases and 100 non-exposed healthy pregnant women as controls. Clinical and anthropometric measurements were done by standard techniques. Lipidemic status was assessed by the enzymatic endpoint method. Urinary arsenic was measured by inductively coupled plasma-mass spectrometry and adjusted for specific gravity. Subjects with a urinary arsenic level > 100 μg/L were categorized as arsenic exposed and those with < 100 μg/L as non-exposed. Multivariate logistic regression and Student's t-test were used for statistical analysis. Results: Both systolic and diastolic blood pressure were significantly higher in the arsenic-exposed pregnant subjects than in the non-exposed group (p<0.001). Arsenic-exposed subjects had about twice the odds of developing hypertension in pregnancy (odds ratio 2.2). In parallel, arsenic-exposed subjects showed significantly higher triglyceride, total cholesterol, and low-density lipoprotein levels than non-exposed pregnant subjects. Urinary arsenic level also correlated significantly with SBP, DBP, TG, total cholesterol, and serum LDL-cholesterol. In multivariate logistic regression, urinary arsenic had a positive association with DBP, SBP, triglyceride, and LDL-c. Conclusion: Arsenic exposure may induce dyslipidemia, as seen in atherosclerosis, by modifying reverse cholesterol transport in cholesterol metabolism. To decrease atherosclerosis-related mortality associated with arsenic, preventing exposure from environmental sources in early life is an important element.

Keywords: Arsenic Exposure, Dyslipidemia, Gestational Diabetes Mellitus, Serum lipid profile

Procedia PDF Downloads 95
3416 The Extended Skew Gaussian Process for Regression

Authors: M. T. Alodat

Abstract:

In this paper, we propose a generalization of the Gaussian process regression (GPR) model called the extended skew Gaussian process for regression (ESGPR) model. The ESGPR model works better than the GPR model when the errors are skewed. We derive the predictive distribution for the ESGPR model at a new input. We also apply the ESGPR model to FOREX data and find that it fits the data better than the GPR model.

Keywords: extended skew normal distribution, Gaussian process for regression, predictive distribution, ESGPr model

Procedia PDF Downloads 520
3415 Irrigation Water Quality Evaluation Based on Multivariate Statistical Analysis: A Case Study of Jiaokou Irrigation District

Authors: Panpan Xu, Qiying Zhang, Hui Qian

Abstract:

Groundwater is the main source of water supply in the Guanzhong Basin, China. To investigate the quality of groundwater for agricultural purposes in the Jiaokou Irrigation District, located in the east of the Guanzhong Basin, 141 groundwater samples were collected for analysis of major ions (K⁺, Na⁺, Mg²⁺, Ca²⁺, SO₄²⁻, Cl⁻, HCO₃⁻, and CO₃²⁻), pH, and total dissolved solids (TDS). Sodium percentage (Na%), residual sodium carbonate (RSC), magnesium hazard (MH), and potential salinity (PS) were applied for irrigation water quality assessment. In addition, multivariate statistical techniques were used to identify the underlying hydrogeochemical processes. Results show that the TDS content mainly depends on Cl⁻, Na⁺, Mg²⁺, and SO₄²⁻, and that the HCO₃⁻ content is generally high except in the eastern sand area. These patterns reflect complex hydrogeochemical processes, such as dissolution of carbonate minerals (dolomite and calcite), gypsum, halite, and silicate minerals, cation exchange, and evaporation and concentration. The average evaluation levels of Na%, RSC, MH, and PS for irrigation water quality are doubtful, good, unsuitable, and injurious to unsatisfactory, respectively. Therefore, decision makers need to consider the indicators comprehensively in order to evaluate the irrigation water quality reasonably.
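The four irrigation indices named in this abstract can be computed directly from the major-ion concentrations. The sketch below uses the commonly cited definitions with all concentrations in meq/L. The abstract does not state its formulas, so these definitions, the function name, and the sample values are assumptions for illustration and may differ from the variants used in the study.

```python
def irrigation_indices(na, k, ca, mg, hco3, co3, cl, so4):
    """Return Na%, RSC, MH and PS from ion concentrations given in meq/L.

    Assumed (commonly cited) definitions, not taken from the abstract:
      Na% = (Na + K) / (Ca + Mg + Na + K) * 100
      RSC = (HCO3 + CO3) - (Ca + Mg)
      MH  = Mg / (Ca + Mg) * 100
      PS  = Cl + 0.5 * SO4
    """
    cations = ca + mg + na + k
    return {
        "Na%": 100.0 * (na + k) / cations,
        "RSC": (hco3 + co3) - (ca + mg),
        "MH": 100.0 * mg / (ca + mg),
        "PS": cl + 0.5 * so4,
    }


# Hypothetical sample, values in meq/L.
sample = irrigation_indices(na=6.0, k=0.2, ca=2.0, mg=3.0,
                            hco3=4.0, co3=0.0, cl=3.0, so4=4.0)
print(sample)
```

Each index is then compared against a classification table (for example "good", "doubtful", "unsuitable"); those thresholds are not given in the abstract and are therefore left out here.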

Keywords: irrigation water quality, multivariate statistical analysis, groundwater, hydrogeochemical process

Procedia PDF Downloads 116
3414 Factor Associated with Uncertainty Undergoing Hematopoietic Stem Cell Transplantation

Authors: Sandra Adarve, Jhon Osorio

Abstract:

Uncertainty has been studied in patients with different types of cancer, but not in patients with hematologic cancer who are undergoing transplantation. The purpose of this study was to identify factors associated with uncertainty in adult patients with malignant hemato-oncologic diseases who are scheduled to undergo hematopoietic stem cell transplantation, based on Merle Mishel's uncertainty theory. This was a cross-sectional study with an analytical purpose. The study sample included 50 patients with leukemia, myeloma, and lymphoma selected by non-probability convenience and purposive sampling. Sociodemographic and clinical variables were measured. Mishel's Scale of Uncertainty in Illness was used to measure uncertainty. Bivariate and multivariate analyses were performed to explore the relationships and associations between the different variables and the uncertainty level. For this analysis, the distribution of the uncertainty scale values was evaluated with the Shapiro-Wilk normality test to identify the statistical tests to be used. The multivariate analysis was conducted through stepwise logistic regression. Patients were 18-74 years old, with a mean age of 44.8. The median disease course was 9.5 months, and the time to transplantation was < 20 days for 50% of the patients. Regarding the uncertainty scale, a mean score of 95.46 was identified. When the dimensions of the scale were analyzed, the mean score was 25.6 for the framework of stimuli, 47.4 for cognitive ability, and 22.8 for structure providers. Age correlated with the total uncertainty score (p=0.012). Additionally, statistically significant differences in uncertainty score were found by religious creed (p=0.023), education level (p=0.012), family history of cancer (p=0.001), presence of comorbidities (p=0.023), and previous radiotherapy treatment (p=0.022). After logistic regression, previous radiotherapy treatment (OR=0.04, 95% CI 0.004-0.48) and family history of cancer (OR=30.7, 95% CI 2.7-349) were found to be factors associated with a high level of uncertainty. Uncertainty is present at high levels in patients who are about to undergo bone marrow transplantation, and it is the nurse's responsibility to assess the level of uncertainty and the presence of factors that may contribute to it. Once assessed, uncertainty must be addressed through the identified associated factors, especially those related to cognitive capacity. This implies designing and implementing intervention strategies to improve knowledge of the disease and of the therapeutic procedures the patients will undergo. All interventions should favor the adaptation of these patients to their current experience and help them see uncertainty as an opportunity for growth and transcendence.
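The odds ratios and intervals reported in this abstract follow the usual logistic-regression relationship: OR = exp(β), with a Wald interval exp(β ± z·SE). The sketch below shows that conversion and its inverse. The function names are hypothetical, and the assumption that the study used Wald intervals with z = 1.96 is ours, not stated in the abstract.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Turn a logistic coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Recover the coefficient's standard error from a reported Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported values for family history of cancer: OR = 30.7, interval 2.7 to 349.
se_family = se_from_ci(2.7, 349.0)
print(odds_ratio_ci(math.log(30.7), se_family))
```

A very wide interval such as this one corresponds to a large standard error on the log-odds scale, which is expected with a sample of 50 patients.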

Keywords: hematopoietic stem cell transplantation, hematologic diseases, nursing, uncertainty

Procedia PDF Downloads 119
3413 Integrated Nested Laplace Approximations For Quantile Regression

Authors: Kajingulu Malandala, Ranganai Edmore

Abstract:

The asymmetric Laplace distribution (ADL) is commonly used as the likelihood function in Bayesian quantile regression, and it offers different families of likelihood methods for quantile regression. Notwithstanding its popularity and practicality, the ADL is not smooth, which makes its likelihood difficult to maximize. Furthermore, Bayesian inference is time consuming, and the choice of likelihood may mislead the inference, as Bayes' theorem does not automatically establish the posterior inference. In addition, the ADL does not account for greater skewness and kurtosis. This paper develops a new quantile regression approach for count data based on the inverse of the cumulative distribution function of the Poisson, binomial, and Delaporte distributions, using integrated nested Laplace approximations. Our results validate the benefit of using integrated nested Laplace approximations and support the approach for count data.
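The link between the asymmetric Laplace likelihood and quantile regression is the check (pinball) loss: the negative log-density of the asymmetric Laplace distribution is, up to constants, the check loss of the residual, and its kink at zero is the non-smoothness mentioned above. The sketch below only illustrates that loss on made-up counts; it is not the INLA-based method of the paper, and all names and data are hypothetical.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(y, q, tau):
    """Sum of check losses when every observation is predicted by the constant q."""
    return sum(check_loss(v - q, tau) for v in y)


def best_constant_quantile(y, tau):
    """Search the observed values for the constant that minimizes the check loss."""
    return min(sorted(y), key=lambda q: total_check_loss(y, q, tau))


counts = [0, 1, 1, 2, 3, 5, 8]          # hypothetical count data
median_fit = best_constant_quantile(counts, 0.5)
print(median_fit)
```

With tau = 0.5 the minimizer is the sample median, which is why maximizing an asymmetric Laplace likelihood reproduces quantile estimates.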

Keywords: quantile regression, Delaporte distribution, count data, integrated nested Laplace approximation

Procedia PDF Downloads 134
3412 The Use of Geographically Weighted Regression for Deforestation Analysis: Case Study in Brazilian Cerrado

Authors: Ana Paula Camelo, Keila Sanches

Abstract:

Geographically Weighted Regression (GWR) was proposed in the geography literature to allow relationships in a regression model to vary over space. In Brazil, the agricultural exploitation of the Cerrado Biome is the main cause of deforestation. In this study, we propose a methodology using geostatistical methods to characterize the spatial dependence of deforestation in the Cerrado based on agricultural production indicators. To this end, we used a set of exploratory spatial data analysis (ESDA) tools and confirmatory analysis using GWR. We calibrated a non-spatial model, evaluated the nature of the regression curve, selected the variables by a stepwise process, and analyzed multicollinearity. After evaluating the non-spatial model, we fitted the spatial regression model, statistically evaluated the intercept, and verified its effect on calibration. Spearman's correlation between deforestation and livestock was +0.783, and between deforestation and soybeans +0.405. The model presented R²=0.936 and showed a strong spatial dependence of soybean agricultural activity associated with maize and cotton crops. GWR is a very effective tool, presenting results closer to the reality of deforestation in the Cerrado than other analyses.
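The core idea of GWR is a separate weighted least-squares fit at each location, with observation weights that decay with distance from that location. A minimal sketch with one predictor follows; the Gaussian kernel, the bandwidth, and the toy data are assumptions for illustration and are not the study's configuration.

```python
import math


def gwr_local_fit(coords, x, y, target, bandwidth):
    """Weighted least squares for y = a + b*x with Gaussian distance weights.

    Returns the local (intercept, slope) at the location `target`.
    """
    w = [math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical data: the slope is 1 in the west cluster and 3 in the east cluster.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [1, 2, 3, 3, 6, 9]
west = gwr_local_fit(coords, x, y, target=(1, 0), bandwidth=1.0)
east = gwr_local_fit(coords, x, y, target=(11, 0), bandwidth=1.0)
print(west, east)
```

A single global regression would average the two regimes, while the local fits recover a different slope in each region, which is the spatial variation GWR is designed to expose.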

Keywords: deforestation, geographically weighted regression, land use, spatial analysis

Procedia PDF Downloads 328
3411 Dietary Pattern derived by Reduced Rank Regression is Associated with Reduced Cognitive Impairment Risk in Singaporean Older Adults

Authors: Kaisy Xinhong Ye, Su Lin Lim, Jialiang Li, Lei Feng

Abstract:

Background: Multiple healthful dietary patterns have been linked with dementia, but few studies have looked at the role of diet in cognitive health in Asians, whose eating habits are very different from those of their counterparts in the West. This study aimed to derive a dietary pattern that is associated with the risk of cognitive impairment (CI) in the Singaporean population. Method: The analysis was based on 719 community-dwelling older adults aged 60 and above. Dietary intake was measured using a validated semi-quantitative food-frequency questionnaire (FFQ). Reduced rank regression (RRR) was used to extract dietary patterns from 45 food groups, specifying sugar, dietary fiber, vitamin A, calcium, and the ratio of polyunsaturated fat to saturated fat intake (P:S ratio) as response variables. The RRR-derived dietary patterns were subsequently investigated using multivariate logistic regression models to look for associations with the risk of CI. Results: A dietary pattern characterized by greater intakes of green leafy vegetables, red-orange vegetables, wholegrains, tofu, and nuts, and lower intakes of biscuits, pastries, local sweets, coffee, poultry with skin, sugar added to beverages, malt beverages, roti, butter, and fast food was associated with reduced risk of CI [multivariable-adjusted OR comparing extreme quintiles, 0.29 (95% CI: 0.11, 0.77); P-trend = 0.03]. This pattern was positively correlated with P:S ratio, vitamin A, and dietary fiber and negatively correlated with sugar. Conclusion: A dietary pattern providing a high P:S ratio, vitamin A, and dietary fiber, and a low level of sugar may reduce the risk of cognitive impairment in old age. The findings are significant for guiding local Singaporeans toward dementia prevention through food-based dietary approaches.

Keywords: dementia, cognitive impairment, diet, nutrient, elderly

Procedia PDF Downloads 46
3410 Deriving an Index of Adoption Rate and Assessing Factors Affecting Adoption of an Agroforestry-Based Farming System in Dhanusha District, Nepal

Authors: Arun Dhakal, Geoff Cockfield, Tek Narayan Maraseni

Abstract:

This paper attempts to fill the gap in measuring adoption in agroforestry studies. It explains the derivation of an index of adoption rate in a Nepalese context and examines the factors affecting adoption of agroforestry-based land management practice (AFLMP) in the Dhanusha District of Nepal. Data about the different farm practices were collected during focus group discussions, and data about the factors (bio-physical, socio-economic) influencing adoption were collected from randomly selected households using a household survey questionnaire. A multivariate regression model was used to determine the factors. The factors (variables) found to significantly affect adoption of AFLMP were: farm size, availability of irrigation water, education of household heads, agricultural labour force, frequency of visits by extension workers, expenditure on farm input purchases, the household's experience in agroforestry, and distance from home to government forest. The regression model explained about 75% of the variation in the adoption decision. The model rejected 'erosion hazard', 'flood hazard' and 'gender' as determinants of adoption, which in the case of a single agroforestry practice were major variables and played a positive role. Of the eight variables, farm size played the most powerful role in explaining the variation in adoption, followed by availability of irrigation water and education of household heads. The results of this study suggest that policies to promote the provision of irrigation water, extension services, and motivation to obtain higher education would probably provide the incentive to adopt agroforestry elsewhere in the terai of Nepal.

Keywords: agroforestry, adoption index, determinants of adoption, step-wise linear regression, Nepal

Procedia PDF Downloads 466
3409 Ranking Effective Factors on Strategic Planning to Achieve Organization Objectives in Fuzzy Multivariate Decision-Making Technique

Authors: Elahe Memari, Ahmad Aslizadeh, Ahmad Memari

Abstract:

Today, strategic planning is counted among the most important duties of senior directors in every organization. Strategic planning allows organizations to implement their compiled strategies and gain greater competitive advantage than their competitors. The present research prepares and ranks the factors that affect strategic planning at the State Road Management and Transportation Organization, in order to show the organization's managers the role of organizational factors in the efficiency of the process. The connections between six main factors were studied: improvement of strategic thinking in senior managers, improvement of the organization's business process, rationalization of resource allocation across the organization, coordination and conformity of the strategic plan with the organization's needs, adjustment of the organization's activities to environmental changes, and reinforcement of organizational culture. All of these factors were confirmed by the tests performed and then ranked using a fuzzy multivariate decision-making technique.

Keywords: Fuzzy TOPSIS, improvement of organization business process, multivariate decision-making, strategic planning

Procedia PDF Downloads 374
3408 Weighted Rank Regression with Adaptive Penalty Function

Authors: Kang-Mo Jung

Abstract:

The use of regularization in statistical methods has become popular. The least absolute shrinkage and selection operator (LASSO) framework has become the standard tool for sparse regression. However, it is well known that the LASSO is sensitive to outliers and leverage points. We consider a new robust estimator that combines a weighted loss function of the pairwise differences of residuals with an adaptive penalty function that regulates the tuning parameter for each variable. Rank regression is resistant to regression outliers, but not to leverage points. By adopting a weighted loss function, the proposed method is robust to leverage points in the predictor variables. Furthermore, the adaptive penalty function gives good statistical properties in variable selection, such as the oracle property and consistency. We develop an efficient algorithm to compute the proposed estimator using basic functions in R. We chose the optimal tuning parameter using the Bayesian information criterion (BIC). Numerical simulations show that the proposed estimator is effective for analyzing real and contaminated data sets.

Keywords: adaptive penalty function, robust penalized regression, variable selection, weighted rank regression

Procedia PDF Downloads 431
3407 Applying Multivariate and Univariate Analysis of Variance on Socioeconomic, Health, and Security Variables in Jordan

Authors: Faisal G. Khamis, Ghaleb A. El-Refae

Abstract:

Many researchers have studied socioeconomic, health, and security variables in the developed countries; however, very few studies used multivariate analysis in developing countries. The current study contributes to the scarce literature about the determinants of the variance in socioeconomic, health, and security factors. Questions raised were whether the independent variables (IVs) of governorate and year impact the socioeconomic, health, and security dependent variables (DVs) in Jordan, whether the marginal mean of each DV in each governorate and in each year is significant, which governorates are similar in difference means of each DV, and whether these DVs vary. The main objectives were to determine the source of variances in DVs, collectively and separately, testing which governorates are similar and which diverge for each DV. The research design was time series and cross-sectional analysis. The main hypotheses are that IVs affect DVs collectively and separately. Multivariate and univariate analyses of variance were carried out to test these hypotheses. The population of 12 governorates in Jordan and the available data of 15 years (2000–2015) accrued from several Jordanian statistical yearbooks. We investigated the effect of two factors of governorate and year on the four DVs of divorce rate, mortality rate, unemployment percentage, and crime rate. All DVs were transformed to multivariate normal distribution. We calculated descriptive statistics for each DV. Based on the multivariate analysis of variance, we found a significant effect in IVs on DVs with p < .001. Based on the univariate analysis, we found a significant effect of IVs on each DV with p < .001, except the effect of the year factor on unemployment was not significant with p = .642. The grand and marginal means of each DV in each governorate and each year were significant based on a 95% confidence interval. Most governorates are not similar in DVs with p < .001. 
We concluded that the two factors produce significant effects on DVs, collectively and separately. Based on these findings, the government can distribute its financial and physical resources to governorates more efficiently. By identifying the sources of variance that contribute to the variation in DVs, insights can help inform focused variation prevention efforts.

Keywords: ANOVA, crime, divorce, governorate, hypothesis test, Jordan, MANOVA, means, mortality, unemployment, year

Procedia PDF Downloads 239
3406 Healthy Lifestyle and Risky Behaviors amongst Students of Physical Education High Schools

Authors: Amin Amani, Masomeh Reihany Shirvan, Mahla Nabizadeh Mashizi, Mohadese Khoshtinat, Mohammad Elyas Ansarinia

Abstract:

The purpose of this study is to examine the relationship between a healthy lifestyle and risky behavior in physical education students of Bojnourd schools. The study sample consisted of teenagers studying in the second and third grades of Bojnourd's high schools. Using level sampling, 604 second-grade students and 600 third-grade students from physical education schools in Bojnourd were tested. For sample selection, the population was divided into four areas: north, east, west, and south. Then, according to the number of students in each area, the sample size of each level was determined. Two questionnaires were used to collect data; they consisted of three parts: demographic data, the Iranian teenagers' risk-taking scale (IARS), and prevention methods with emphasis on the importance of the family's role. Central tendency and dispersion indices, such as standard deviation, as well as multiple variance analysis and multivariate regression analysis, were used. Results showed that the observed F is significant (P ≤ 0.01) and that 21% of the variance in risky behavior is explained by lack of awareness. Given the significance of the regression, the coefficients of teenagers' risky behavior in the prediction equation showed that each risky behavior can have an impact on a healthy lifestyle.

Keywords: healthy lifestyle, high-risk behavior, students, physical education

Procedia PDF Downloads 157
3405 MapReduce Logistic Regression Algorithms with RHadoop

Authors: Byung Ho Jung, Dong Hoon Lim

Abstract:

Logistic regression is a statistical method for analyzing a dataset in which one or more independent variables determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social sciences. In this paper, we address the problem of estimating the parameters of logistic regression on the MapReduce framework with RHadoop, which integrates the R and Hadoop environments and is applicable to large-scale data. There are three learning algorithms for logistic regression, namely the gradient descent method, the cost minimization method, and the Newton-Raphson method. The Newton-Raphson method does not require a learning rate, while the gradient descent and cost minimization methods need a manually chosen learning rate. The experimental results demonstrated that our learning algorithms using RHadoop scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Raphson method with the gradient descent and cost minimization methods. The results showed that our Newton-Raphson method appeared to be the most robust on all data tested.
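The contrast between gradient descent (which needs a learning rate) and Newton-Raphson (which does not) can be shown on a tiny single-machine example. This is a plain-Python sketch with one feature and an intercept; it is not the authors' RHadoop/MapReduce implementation, and the data, learning rate, and iteration counts are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient(b0, b1, xs, ys):
    """Gradient of the negative log-likelihood with respect to (b0, b1)."""
    g0 = sum(sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys))
    g1 = sum((sigmoid(b0 + b1 * x) - y) * x for x, y in zip(xs, ys))
    return g0, g1


def fit_gradient_descent(xs, ys, rate=0.1, steps=5000):
    """Fixed-step gradient descent; `rate` must be picked by hand."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0, g1 = gradient(b0, b1, xs, ys)
        b0, b1 = b0 - rate * g0, b1 - rate * g1
    return b0, b1


def fit_newton(xs, ys, steps=25):
    """Newton-Raphson: the step is scaled by the inverse 2x2 Hessian, so no rate."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0, g1 = gradient(b0, b1, xs, ys)
        w = [sigmoid(b0 + b1 * x) * (1.0 - sigmoid(b0 + b1 * x)) for x in xs]
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, xs))
        h11 = sum(wi * x * x for wi, x in zip(w, xs))
        det = h00 * h11 - h01 * h01
        b0 -= (h11 * g0 - h01 * g1) / det
        b1 -= (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical, non-separable data so that the maximum-likelihood estimate is finite.
xs = [-2, -1, 0, 1, 2, 3]
ys = [0, 0, 1, 0, 1, 1]
gd = fit_gradient_descent(xs, ys)
nt = fit_newton(xs, ys)
print(gd, nt)
```

Both routines reach the same estimate here; Newton-Raphson does so in a handful of iterations, while gradient descent needs many more steps and a suitable rate. In a MapReduce setting the sums inside `gradient` and the Hessian entries are the quantities that would be computed per data partition and then added together.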

Keywords: big data, logistic regression, MapReduce, RHadoop

Procedia PDF Downloads 245
3404 Effect of Serum Electrolytes on a QTc Interval and Mortality in Patients admitted to Coronary Care Unit

Authors: Thoetchai Peeraphatdit, Peter A. Brady, Suraj Kapa, Samuel J. Asirvatham, Niyada Naksuk

Abstract:

Background: Serum electrolyte abnormalities are a common cause of acquired prolonged QT syndrome, especially in the coronary care unit (CCU) setting. Optimal electrolyte ranges among CCU patients have not been sufficiently investigated. Methods: We identified 8,498 consecutive patients who were admitted to the CCU at Mayo Clinic, Rochester, USA, from 2004 through 2013. The association of first serum electrolytes with baseline corrected QT intervals (QTc) and with in-hospital mortality was tested using multivariate linear regression and logistic regression, respectively. Serum potassium 4.0- < 4.5 mEq/L, ionized calcium (iCa) 4.6-4.8 mg/dL, and magnesium 2.0- < 2.2 mg/dL were used as the reference levels. Results: There was a modest level-dependent relationship between hypokalemia ( < 4.0 mEq/L), hypocalcemia ( < 4.4 mg/dL), and a prolonged QTc interval; serum magnesium did not affect the QTc interval. The association between serum electrolytes and in-hospital mortality included a U-shaped relationship for serum potassium (adjusted odds ratio (OR) 1.53 and OR 1.91 for serum potassium 4.5- < 5.0 and ≥ 5.0 mEq/L, respectively) and an inverted J-shaped relationship for iCa (adjusted OR 2.79 and OR 2.03 for calcium < 4.4 and 4.4- < 4.6 mg/dL, respectively). For serum magnesium, mortality was greater only among patients with levels ≥ 2.4 mg/dL (adjusted OR 1.40), compared to the reference level. Findings were similar in sensitivity analyses examining the association of mean serum electrolytes with mean QTc intervals and with in-hospital mortality. Conclusions: Serum potassium 4.0- < 4.5 mEq/L, iCa ≥ 4.6 mg/dL, and magnesium < 2.4 mg/dL had a neutral effect on QTc intervals and were associated with the lowest in-hospital mortality among CCU patients.

Keywords: calcium, electrocardiography, long-QT syndrome, magnesium, mortality, potassium

Procedia PDF Downloads 365
3403 A Generalized Weighted Loss for Support Vextor Classification and Multilayer Perceptron

Authors: Filippo Portera

Abstract:

Standard algorithms usually employ a loss in which each error is the mere absolute difference between the true value and the prediction, in the case of a regression task. In this work, we present several error weighting schemes that generalize this established routine. We study both a binary classification model for Support Vector Classification and a regression net for the Multilayer Perceptron. Results show that the error is never worse than with the standard procedure, and in several cases it is better.

Keywords: loss, binary-classification, MLP, weights, regression

Procedia PDF Downloads 63
3402 Use of Multivariate Statistical Techniques for Water Quality Monitoring Network Assessment, Case of Study: Jequetepeque River Basin

Authors: Jose Flores, Nadia Gamboa

Abstract:

Proper water quality management requires the establishment of a monitoring network. Therefore, the efficiency of water quality monitoring networks must be evaluated to ensure high-quality data collection for critical chemical quality parameters. Unfortunately, in some Latin American countries, water quality monitoring programs are not sustainable in terms of recording historical data or covering environmentally representative sites, which wastes time, money, and valuable information. In this study, multivariate statistical techniques, such as principal components analysis (PCA) and hierarchical cluster analysis (HCA), are applied to identify the most significant monitoring sites as well as the critical water quality parameters in the monitoring network of the Jequetepeque River basin, in northern Peru. The Jequetepeque River basin, like others in Peru, shows socio-environmental conflicts due to the economic activities developed in the area. Water pollution by trace elements in the upper part of the basin is mainly related to mining activity, and the loss of agricultural land to salinization is caused by the extensive use of groundwater in the lower part of the basin. Since the 1980s, the water quality in the basin has been assessed non-continuously by public and private organizations, and recently the National Water Authority established permanent water quality networks in 45 basins in Peru. Although many countries use multivariate statistical techniques to assess water quality monitoring networks, those instruments have never been applied for that purpose in Peru. For this reason, the main contribution of this study is to demonstrate that multivariate statistical techniques could serve as an instrument for optimizing monitoring networks, using the smallest number of monitoring sites and the most significant water quality parameters, which would reduce costs and improve water quality management in Peru. The main socio-economic activities developed in the basin and the principal stakeholders related to its water management are also identified. Finally, water quality management programs are discussed in terms of their efficiency and sustainability.

Keywords: PCA, HCA, Jequetepeque, multivariate statistical

Procedia PDF Downloads 329
3401 Interference among Lambsquarters and Oil Rapeseed Cultivars

Authors: Reza Siyami, Bahram Mirshekari

Abstract:

Seed and oil yield of rapeseed is considerably affected by weed interference, including mustard (Sinapis arvensis L.), lambsquarters (Chenopodium album L.), and redroot pigweed (Amaranthus retroflexus L.), throughout the East Azerbaijan province in Iran. To formulate the relationship between the four independent growth variables measured in our experiment and a dependent variable, multiple regression analysis was carried out with the weed leaf number per plant (X1), green cover percentage (X2), LAI (X3), and leaf area per plant (X4) as independent variables and rapeseed oil yield as the dependent variable. The multiple regression equation is as follows: Seed essential oil yield (kg/ha) = 0.156 + 0.0325 (X1) + 0.0489 (X2) + 0.0415 (X3) + 0.133 (X4). Furthermore, stepwise regression analysis was carried out on the data to test the significance of the independent variables affecting the oil yield. The resulting stepwise regression equation is as follows: Oil yield = 4.42 + 0.0841 (X2) + 0.0801 (X3); R² = 81.5. The stepwise regression analysis verified that the green cover percentage and LAI of the weed had a marked increasing effect on the oil yield of rapeseed.
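The two fitted equations quoted in this abstract can be evaluated directly. The sketch below simply encodes the published coefficients; the function names and the input values are hypothetical, and the units of X1 to X4 are whatever the study used, which the abstract does not specify.

```python
def oil_yield_full(x1, x2, x3, x4):
    """Full model from the abstract: all four weed variables."""
    return 0.156 + 0.0325 * x1 + 0.0489 * x2 + 0.0415 * x3 + 0.133 * x4


def oil_yield_stepwise(x2, x3):
    """Stepwise model from the abstract: green cover percentage (X2) and LAI (X3) only."""
    return 4.42 + 0.0841 * x2 + 0.0801 * x3


# Hypothetical inputs, for illustration only.
print(oil_yield_full(x1=12, x2=10, x3=2, x4=50))
print(oil_yield_stepwise(x2=10, x3=2))
```

The intercepts of the two equations differ substantially (0.156 versus 4.42) because the stepwise model absorbs the average contribution of the dropped variables X1 and X4 into its constant term.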

Keywords: green cover percentage, independent variable, interference, regression

Procedia PDF Downloads 388
3400 Optimization of Electric Vehicle (EV) Charging Station Allocation Based on Multiple Data - Taking Nanjing (China) as an Example

Authors: Yue Huang, Yiheng Feng

Abstract:

Due to global pressure on climate and energy, many countries are vigorously promoting electric vehicles and building public charging facilities. Faced with the supply-demand gap of existing electric vehicle charging stations and unreasonable space usage in China, this paper takes the central city of Nanjing as an example, establishes a site selection model through multivariate data integration, conducts multiple linear regression analysis in SPSS, gives quantitative site selection results, and provides optimization models and suggestions for charging station layout planning.

Keywords: electric vehicle, charging station, allocation optimization, urban mobility, urban infrastructure, nanjing

Procedia PDF Downloads 59
3399 Low SPOP Expression and High MDM2 expression Are Associated with Tumor Progression and Predict Poor Prognosis in Hepatocellular Carcinoma

Authors: Chang Liang, Weizhi Gong, Yan Zhang

Abstract:

Purpose: Hepatocellular carcinoma (HCC) is a malignant tumor with a high mortality rate and poor prognosis worldwide. Murine double minute 2 (MDM2) regulates the tumor suppressor p53, increasing cancer risk and accelerating tumor progression. Speckle-type POX virus and zinc finger protein (SPOP), a key subunit of the Cullin-RING E3 ligase, inhibits tumorigenesis and progression through the ubiquitination of its downstream substrates. This study aimed to clarify whether SPOP and MDM2 are mutually regulated in HCC and how SPOP and MDM2 correlate with the prognosis of HCC patients. Methods: First, the expression of SPOP and MDM2 in HCC tissues was examined using the TCGA database. Then, 53 paired samples of HCC tumor and adjacent tissues were collected to evaluate the expression of SPOP and MDM2 using immunohistochemistry. The chi-square test or Fisher's exact test was used to analyze the relationship between clinicopathological features and the expression levels of SPOP and MDM2. In addition, Kaplan-Meier curve analysis and the log-rank test were used to investigate the effects of SPOP and MDM2 on the survival of HCC patients. Last, a multivariate Cox proportional hazards regression model was used to analyze whether the different expression levels of SPOP and MDM2 were independent risk factors for the prognosis of HCC patients. Results: Bioinformatics analysis revealed that low expression of SPOP and high expression of MDM2 were related to worse prognosis in HCC patients. The relationships of SPOP and MDM2 expression with tumor stem-like features showed opposite trends. Immunohistochemistry showed that the expression of SPOP protein was significantly downregulated, while MDM2 protein was significantly upregulated, in HCC tissue compared to para-cancerous tissue. Tumors with low SPOP expression were related to worse T stage and Barcelona Clinic Liver Cancer (BCLC) stage, while tumors with high MDM2 expression were related to worse T stage, M stage, and BCLC stage. Kaplan-Meier curves showed that HCC patients with high SPOP expression and low MDM2 expression had better survival than those with low SPOP expression and high MDM2 expression (P < 0.05). A multivariate Cox proportional hazards regression model confirmed that a high MDM2 expression level was an independent risk factor for poor prognosis in HCC patients (P < 0.05). Conclusion: The expression of SPOP protein was significantly downregulated, while the expression of MDM2 was significantly upregulated, in HCC. Low expression of SPOP and high expression of MDM2 were associated with malignant progression and poor prognosis in HCC patients, indicating a potential therapeutic target for HCC patients.

Keywords: hepatocellular carcinoma, murine double minute 2, speckle-type POX virus and zinc finger protein, ubiquitination

Procedia PDF Downloads 103
3398 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model

Authors: Alam Ali, Ashok Kumar Pathak

Abstract:

Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find the best fit to the data. Exogenous variables sometimes fail to show significant direct and indirect effects when the nature of the data violates the assumptions of classical regression (ordinary least squares, OLS). The main aim of this article is to investigate the efficacy of the copula-based regression approach relative to the classical regression approach and to calculate the direct and indirect effects of variables when the data violate the OLS assumptions and the variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is presented to demonstrate the superiority of the copula approach.
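As an illustration of what "direct and indirect effects" means here, the following is a minimal sketch of the classical OLS baseline for a three-variable path model (X → M → Y plus X → Y). It is not the authors' copula-based estimator; the function names and the single-mediator structure are assumptions made only for this example.

```python
def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Population covariance of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def path_effects(x, m, y):
    """OLS path coefficients for the assumed model M ~ X and Y ~ X + M.

    direct   = coefficient of X in the regression of Y on X and M
    indirect = (coefficient of X in M ~ X) * (coefficient of M in Y ~ X + M)
    """
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)

    a = sxm / sxx  # slope of M on X

    # Solve the 2x2 normal equations for Y ~ X + M on centered data.
    det = sxx * smm - sxm ** 2
    if det == 0:
        raise ValueError("X and M are perfectly collinear")
    direct = (smm * sxy - sxm * smy) / det
    b = (sxx * smy - sxm * sxy) / det

    indirect = a * b
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}
```

With OLS, the total effect returned here equals the slope of the simple regression of Y on X, which gives a quick sanity check on the decomposition.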

Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique

Procedia PDF Downloads 46
3397 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm

Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian

Abstract:

The present paper addresses research in the area of regression testing, with emphasis on automated tools and on the prioritization of test cases. The uniqueness of regression testing and its cyclic nature are pointed out. The difference in approach between industry, with the business model as its basis, and academia, with its focus on data mining, is highlighted. Test metrics are discussed as a prelude to our formula for prioritization, and a case study is discussed to illustrate this methodology. An industrial case study is also described, in which the number of test cases is so large that they have to be grouped into test suites. In such situations, the genetic algorithm we propose can be used to reconfigure these test suites in each cycle of regression testing. A proprietary tool and an open source tool are compared using the above-mentioned metrics. Our approach is clarified through several tables.

Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool

Procedia PDF Downloads 401
3396 A Hybrid Model Tree and Logistic Regression Model for Prediction of Soil Shear Strength in Clay

Authors: Ehsan Mehryaar, Seyed Armin Motahari Tabari

Abstract:

Without a doubt, soil shear strength is the most important property of the soil. The majority of fatal and catastrophic geological accidents are related to shear strength failure of the soil, so its prediction is a matter of high importance. However, measuring shear strength is usually a cumbersome task that may require complicated laboratory testing. Predicting it from common, easily obtained soil properties can therefore simplify projects substantially. In this paper, a hybrid model based on the classification and regression tree algorithm and logistic regression is proposed, in which each leaf of the tree is an independent regression model. A database of 189 points for clay soil, including moisture content, liquid limit, plastic limit, clay content, and shear strength, is collected. The performance of the developed model is compared with that of existing models and equations using root mean squared error and the coefficient of correlation.

Keywords: model tree, CART, logistic regression, soil shear strength

Procedia PDF Downloads 165
3395 A Regression Model for Residual-State Creep Failure

Authors: Deepak Raj Bhat, Ryuichi Yatabe

Abstract:

In this study, a residual-state creep failure model was developed based on the results of residual-state creep tests on clayey soils. To develop the proposed model, regression analyses were performed using R. The model results for the failure time (tf) and critical displacement (δc) were compared with experimental results and found to be in close agreement. The proposed regression model for residual-state creep failure is expected to be useful for predicting the displacement of different clayey soils in the future.

Keywords: regression model, residual-state creep failure, displacement prediction, clayey soils

Procedia PDF Downloads 375
3394 The Lopsided Burden of Non-Communicable Diseases in India: Evidences from the Decade 2004-2014

Authors: Kajori Banerjee, Laxmi Kant Dwivedi

Abstract:

India is a part of the ongoing globalization, contemporary convergence, industrialization and technical advancement taking place worldwide. Some of the manifestations of this evolution are rapid demographic, socio-economic, epidemiological and health transitions. There has been a considerable increase in non-communicable diseases due to changes in lifestyle. This study aims to assess the direction of the burden of disease and to compare the pressure of infectious diseases against that of cardio-vascular, endocrine, metabolic and nutritional diseases. The change in prevalence over a ten-year period (2004-2014) is further decomposed to determine the net contribution of various socio-economic and demographic covariates. The present study uses the 71st (2014) and 60th (2004) rounds of the National Sample Survey. The pressure of infectious diseases against cardio-vascular (CVD), endocrine, metabolic and nutritional (EMN) diseases during 2004-2014 is calculated using Prevalence Rates (PR), Hospitalization Rates (HR) and Case Fatality Rates (CFR). The prevalence of non-communicable diseases is further used as the dependent variable in a logit regression to find the effect of various social, economic and demographic factors on the chances of suffering from the particular disease. A multivariate decomposition technique further assists in determining the net contribution of socio-economic and demographic covariates. This paper provides evidence of stagnation in the burden of communicable diseases (CD) and a rapid increase in the burden of non-communicable diseases (NCD) uniformly across all population sub-groups in India. The CFR for CVD increased drastically over 2004-2014. Logit regression indicates that the chances of suffering from CVD and EMN are significantly higher among urban residents, older people, females, and widowed, divorced or separated individuals. Decomposition provides ample evidence that improvements in quality-of-life markers such as education, urbanization and longevity have contributed to the increase in the NCD prevalence rate. In India's current epidemiological phase, the compression theory of morbidity is in action, as a significant rise in the probability of contracting NCDs over the period is observed among older ages. Age is found to be a vital contributor to the increased probability of having CVD and EMN over the study decade 2004-2014 in the nationally representative sample of the National Sample Survey.

Keywords: cardio-vascular disease, case-fatality rate, communicable diseases, hospitalization rate, multivariate decomposition, non-communicable diseases, prevalence rate

Procedia PDF Downloads 287
3393 Relationship between Depression, Stress, and Life Satisfaction among Students

Authors: Rexa Pasha

Abstract:

The aim of this study was to examine the relationship of depression, stress and life satisfaction with sleep disturbance among students of Islamic Azad University, Ahvaz Branch. The sample included 230 students selected by stratified random sampling. Data were collected using the Beck Depression Inventory and measures of stress, life satisfaction and quality of sleep (PSQI), all of which have acceptable reliability and validity. The study was correlational, and the data were analyzed using Pearson correlation and multivariate regression at the significance level (p

Keywords: depression, life satisfaction, sleep disorder, sleep disturbance

Procedia PDF Downloads 402
3392 A Fuzzy Nonlinear Regression Model for Interval Type-2 Fuzzy Sets

Authors: O. Poleshchuk, E. Komarov

Abstract:

This paper presents a regression model for interval type-2 fuzzy sets based on the least squares estimation technique. Unknown coefficients are assumed to be triangular fuzzy numbers. The basic idea is to determine aggregation intervals for the type-1 fuzzy sets whose membership functions are the lower and upper membership functions of the interval type-2 fuzzy set. These aggregation intervals are called weighted intervals. The lower and upper membership functions of the input and output interval type-2 fuzzy sets in the developed regression models are taken to be piecewise linear functions.

Keywords: interval type-2 fuzzy sets, fuzzy regression, weighted interval

Procedia PDF Downloads 335
3391 Prognostic Impact of Pre-transplant Ferritinemia: A Survival Analysis Among Allograft Patients

Authors: Mekni Sabrine, Nouira Mariem

Abstract:

Background and aim: Allogeneic hematopoietic stem cell transplantation is a curative treatment for several hematological diseases; however, it carries non-negligible morbidity and mortality that depend on several prognostic factors, including pre-transplant hyperferritinemia. The aim of our study was to estimate the impact of hyperferritinemia on survival and on the occurrence of post-transplant complications. Methods: This was a longitudinal study conducted over 8 years that included all patients who received a first allograft. The impact of pre-transplant hyperferritinemia (ferritinemia ≥1500) on survival was studied using the Kaplan-Meier method and the Cox model for univariate and multivariate analysis. The chi-square test and binary logistic regression were used to study the association between pre-transplant ferritinemia and post-transplant complications. Results: One hundred forty patients were included, with an average age of 26.6 years and a sex ratio (M/F) of 1.4. Hyperferritinemia was found in 33% of patients. It had no significant impact on either overall survival (p=0.9) or event-free survival (p=0.6). In multivariate analysis, only the type of disease was independently associated with overall survival (p=0.04) and event-free survival (p=0.002). For post-allograft complications, the occurrence of early documented infections was independently associated with pre-transplant hyperferritinemia (p=0.02) and the presence of acute graft versus host disease (GVHD) (p<10⁻³). The occurrence of acute GVHD was associated with early documented infection (p=0.002) and Cytomegalovirus reactivation (p<10⁻³). The occurrence of chronic GVHD was associated with Cytomegalovirus reactivation (p=0.006) and graft source (p=0.009). Conclusion: Our study showed a significant impact of pre-transplant hyperferritinemia on the occurrence of early infections but not on survival. Earlier and more accurate assessment of iron overload by other tests, such as liver magnetic resonance imaging, with initiation of chelating treatment, could prevent the occurrence of such complications after transplantation.
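For readers unfamiliar with the Kaplan-Meier method named in the Methods, the following is a minimal sketch of the product-limit estimator, S(t) = Π (1 − d_i / n_i), written with the Python standard library only. The function name and the toy follow-up data are assumptions for illustration; they are not the study's data or code.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: five patients, two of them censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Comparing two such curves (for example, patients with and without hyperferritinemia) would additionally require a log-rank test or a Cox model, which this sketch does not implement.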

Keywords: allogeneic, transplants, ferritin, survival

Procedia PDF Downloads 43
3390 Formulating a Flexible-Spread Fuzzy Regression Model Based on Dissemblance Index

Authors: Shih-Pin Chen, Shih-Syuan You

Abstract:

This study proposes a regression model with flexible spreads for fuzzy input-output data to cope with situations in which existing measures cannot reflect the actual estimation error. The main idea is that a dissemblance index (DI) is carefully identified and defined to measure the actual estimation error precisely. Moreover, the graded mean integration (GMI) representation is adopted to determine more representative numeric regression coefficients. To compare the performance of the proposed model comprehensively with that of other models, three different criteria are adopted. The results from commonly used numerical test examples and an application to Taiwan's business monitoring indicator illustrate that the proposed dissemblance index method not only produces valid fuzzy regression models for fuzzy input-output data, but also has satisfactory and stable performance in terms of the total estimation error based on these three criteria.

Keywords: dissemblance index, forecasting, fuzzy sets, linear regression

Procedia PDF Downloads 330
3389 Prediction of Marine Ecosystem Changes Based on the Integrated Analysis of Multivariate Data Sets

Authors: Prozorkevitch D., Mishurov A., Sokolov K., Karsakov L., Pestrikova L.

Abstract:

The current body of knowledge about the marine environment and the dynamics of marine ecosystems includes a huge amount of heterogeneous data collected over decades. It generally includes a wide range of hydrological, biological and fishery data. Marine researchers collect these data and analyze how and why the ecosystem changes from past to present. Based on these historical records and linkages between the processes it is possible to predict future changes. Multivariate analysis of trends and their interconnection in the marine ecosystem may be used as an instrument for predicting further ecosystem evolution. A wide range of information about the components of the marine ecosystem for more than 50 years needs to be used to investigate how these arrays can help to predict the future.

Keywords: Barents Sea ecosystem, abiotic, biotic, data sets, trends, prediction

Procedia PDF Downloads 79
3388 Image Compression Based on Regression SVM and Biorthogonal Wavelets

Authors: Zikiou Nadia, Lahdir Mourad, Ameur Soltane

Abstract:

In this paper, we propose an effective method for image compression based on SVM regression (SVR) with three different kernels and the biorthogonal 2D discrete wavelet transform. SVM regression can learn dependencies from training data and compress the data by using fewer training points (support vectors) to represent the original data, eliminating redundancy. A biorthogonal wavelet is used to transform the image, and the resulting coefficients are then used to train SVMs with different kernels (Gaussian, polynomial, and linear). Run-length and arithmetic coders are used to encode the support vectors and their corresponding weights obtained from the SVM regression. The peak signal-to-noise ratio (PSNR) and compression ratios of several test images compressed with our algorithm using different kernels are presented. Compared with the other kernels, the Gaussian kernel achieves better image quality. Experimental results show that our method substantially improves compression performance.
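The two quality measures named in the abstract can be sketched as follows. This is a minimal illustration using the standard definitions, PSNR = 10 · log10(MAX² / MSE) and compression ratio = original size / compressed size; the function names and the 8-bit peak value of 255 are assumptions, and the code is not taken from the paper.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images.

    Images are given as lists of rows of pixel intensities.
    max_value is the assumed peak intensity (255 for 8-bit images).
    """
    flat_o = [p for row in original for p in row]
    flat_r = [p for row in reconstructed for p in row]
    if len(flat_o) != len(flat_r):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_r)) / len(flat_o)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


def compression_ratio(original_bytes, compressed_bytes):
    """Ratio of the original size to the compressed size."""
    return original_bytes / compressed_bytes
```

A higher PSNR at the same compression ratio indicates better reconstruction quality, which is the sense in which the kernels are compared above.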

Keywords: image compression, 2D discrete wavelet transform (DWT-2D), support vector regression (SVR), SVM Kernels, run-length, arithmetic coding

Procedia PDF Downloads 352