Search results for: collinearity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 14

Search results for: collinearity

14 On the Performance of Improvised Generalized M-Estimator in the Presence of High Leverage Collinearity Enhancing Observations

Authors: Habshah Midi, Mohammed A. Mohammed, Sohel Rana

Abstract:

Multicollinearity occurs when two or more independent variables in a multiple linear regression model are highly correlated. The ridge regression is the commonly used method to rectify this problem. However, the ridge regression cannot handle the problem of multicollinearity which is caused by high leverage collinearity enhancing observation (HLCEO). Since high leverage points (HLPs) are responsible for inducing multicollinearity, the effect of HLPs needs to be reduced by using Generalized M estimator. The existing GM6 estimator is based on the Minimum Volume Ellipsoid (MVE) which tends to swamp some low leverage points. Hence an improvised GM (MGM) estimator is presented to improve the precision of the GM6 estimator. Numerical example and simulation study are presented to show how HLPs can cause multicollinearity. The numerical results show that our MGM estimator is the most efficient method compared to some existing methods.

Keywords: identification, high leverage points, multicollinearity, GM-estimator, DRGP, DFFITS

Procedia PDF Downloads 218
13 A Study on Inference from Distance Variables in Hedonic Regression

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban area, several landmarks may affect housing price and rents, hedonic analysis should employ distance variables corresponding to each landmarks. Unfortunately, the effects of distances to landmarks on housing prices are generally not consistent with the true price. These distance variables may cause magnitude error in regression, pointing a problem of spatial multicollinearity. In this paper, we provided some approaches for getting the samples with less bias and method on locating the specific sampling area to avoid the multicollinerity problem in two specific landmarks case.

Keywords: landmarks, hedonic regression, distance variables, collinearity, multicollinerity

Procedia PDF Downloads 422
12 Multicollinearity and MRA in Sustainability: Application of the Raise Regression

Authors: Claudia García-García, Catalina B. García-García, Román Salmerón-Gómez

Abstract:

Much economic-environmental research includes the analysis of possible interactions by using Moderated Regression Analysis (MRA), which is a specific application of multiple linear regression analysis. This methodology allows analyzing how the effect of one of the independent variables is moderated by a second independent variable by adding a cross-product term between them as an additional explanatory variable. Due to the very specification of the methodology, the moderated factor is often highly correlated with the constitutive terms. Thus, great multicollinearity problems arise. The appearance of strong multicollinearity in a model has important consequences. Inflated variances of the estimators may appear, there is a tendency to consider non-significant regressors that they probably are together with a very high coefficient of determination, incorrect signs of our coefficients may appear and also the high sensibility of the results to small changes in the dataset. Finally, the high relationship among explanatory variables implies difficulties in fixing the individual effects of each one on the model under study. These consequences shifted to the moderated analysis may imply that it is not worth including an interaction term that may be distorting the model. Thus, it is important to manage the problem with some methodology that allows for obtaining reliable results. After a review of those works that applied the MRA among the ten top journals of the field, it is clear that multicollinearity is mostly disregarded. Less than 15% of the reviewed works take into account potential multicollinearity problems. To overcome the issue, this work studies the possible application of recent methodologies to MRA. Particularly, the raised regression is analyzed. This methodology mitigates collinearity from a geometrical point of view: the collinearity problem arises because the variables under study are very close geometrically, so by separating both variables, the problem can be mitigated. Raise regression maintains the available information and modifies the problematic variables instead of deleting variables, for example. Furthermore, the global characteristics of the initial model are also maintained (sum of squared residuals, estimated variance, coefficient of determination, global significance test and prediction). The proposal is implemented to data from countries of the European Union during the last year available regarding greenhouse gas emissions, per capita GDP and a dummy variable that represents the topography of the country. The use of a dummy variable as the moderator is a special variant of MRA, sometimes called “subgroup regression analysis.” The main conclusion of this work is that applying new techniques to the field can improve in a substantial way the results of the analysis. Particularly, the use of raised regression mitigates great multicollinearity problems, so the researcher is able to rely on the interaction term when interpreting the results of a particular study.

Keywords: multicollinearity, MRA, interaction, raise

Procedia PDF Downloads 65
11 A Research on Inference from Multiple Distance Variables in Hedonic Regression Focus on Three Variables

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban context, urban nodes such as amenity or hazard will certainly affect house price, while classic hedonic analysis will employ distance variables measured from each urban nodes. However, effects from distances to facilities on house prices generally do not represent the true price of the property. Distance variables measured on the same surface are suffering a problem called multicollinearity, which is usually presented as magnitude variance and mean value in regression, errors caused by instability. In this paper, we provided a theoretical framework to identify and gather the data with less bias, and also provided specific sampling method on locating the sample region to avoid the spatial multicollinerity problem in three distance variable’s case.

Keywords: hedonic regression, urban node, distance variables, multicollinerity, collinearity

Procedia PDF Downloads 437
10 Determining the Causality Variables in Female Genital Mutilation: A Factor Screening Approach

Authors: Ekele Alih, Enejo Jalija

Abstract:

Female Genital Mutilation (FGM) is made up of three types namely: Clitoridectomy, Excision and Infibulation. In this study, we examine the factors responsible for FGM in order to identify the causality variables in a logistic regression approach. From the result of the survey conducted by the Public Health Division, Nigeria Institute of Medical Research, Yaba, Lagos State, the tau statistic, τ was used to screen 9 factors that causes FGM in order to select few of the predictors before multiple regression equation is obtained. The need for this may be that the sample size may not be able to sustain having a regression with all the predictors or to avoid multi-collinearity. A total of 300 respondents, comprising 150 adult males and 150 adult females were selected for the household survey based on the multi-stage sampling procedure. The tau statistic,

Keywords: female genital mutilation, logistic regression, tau statistic, African society

Procedia PDF Downloads 226
9 Rigorous Photogrammetric Push-Broom Sensor Modeling for Lunar and Planetary Image Processing

Authors: Ahmed Elaksher, Islam Omar

Abstract:

Accurate geometric relation algorithms are imperative in Earth and planetary satellite and aerial image processing, particularly for high-resolution images that are used for topographic mapping. Most of these satellites carry push-broom sensors. These sensors are optical scanners equipped with linear arrays of CCDs. These sensors have been deployed on most EOSs. In addition, the LROC is equipped with two push NACs that provide 0.5 meter-scale panchromatic images over a 5 km swath of the Moon. The HiRISE carried by the MRO and the HRSC carried by MEX are examples of push-broom sensor that produces images of the surface of Mars. Sensor models developed in photogrammetry relate image space coordinates in two or more images with the 3D coordinates of ground features. Rigorous sensor models use the actual interior orientation parameters and exterior orientation parameters of the camera, unlike approximate models. In this research, we generate a generic push-broom sensor model to process imageries acquired through linear array cameras and investigate its performance, advantages, and disadvantages in generating topographic models for the Earth, Mars, and the Moon. We also compare and contrast the utilization, effectiveness, and applicability of available photogrammetric techniques and softcopies with the developed model. We start by defining an image reference coordinate system to unify image coordinates from all three arrays. The transformation from an image coordinate system to a reference coordinate system involves a translation and three rotations. For any image point within the linear array, its image reference coordinates, the coordinates of the exposure center of the array in the ground coordinate system at the imaging epoch (t), and the corresponding ground point coordinates are related through the collinearity condition that states that all these three points must be on the same line. The rotation angles for each CCD array at the epoch t are defined and included in the transformation model. The exterior orientation parameters of an image line, i.e., coordinates of exposure station and rotation angles, are computed by a polynomial interpolation function in time (t). The parameter (t) is the time at a certain epoch from a certain orbit position. Depending on the types of observations, coordinates, and parameters may be treated as knowns or unknowns differently in various situations. The unknown coefficients are determined in a bundle adjustment. The orientation process starts by extracting the sensor position and, orientation and raw images from the PDS. The parameters of each image line are then estimated and imported into the push-broom sensor model. We also define tie points between image pairs to aid the bundle adjustment model, determine the refined camera parameters, and generate highly accurate topographic maps. The model was tested on different satellite images such as IKONOS, QuickBird, and WorldView-2, HiRISE. It was found that the accuracy of our model is comparable to those of commercial and open-source software, the computational efficiency of the developed model is high, the model could be used in different environments with various sensors, and the implementation process is much more cost-and effort-consuming.

Keywords: photogrammetry, push-broom sensors, IKONOS, HiRISE, collinearity condition

Procedia PDF Downloads 35
8 Statistical Analysis with Prediction Models of User Satisfaction in Software Project Factors

Authors: Katawut Kaewbanjong

Abstract:

We analyzed a volume of data and found significant user satisfaction in software project factors. A statistical significance analysis (logistic regression) and collinearity analysis determined the significance factors from a group of 71 pre-defined factors from 191 software projects in ISBSG Release 12. The eight prediction models used for testing the prediction potential of these factors were Neural network, k-NN, Naïve Bayes, Random forest, Decision tree, Gradient boosted tree, linear regression and logistic regression prediction model. Fifteen pre-defined factors were truly significant in predicting user satisfaction, and they provided 82.71% prediction accuracy when used with a neural network prediction model. These factors were client-server, personnel changes, total defects delivered, project inactive time, industry sector, application type, development type, how methodology was acquired, development techniques, decision making process, intended market, size estimate approach, size estimate method, cost recording method, and effort estimate method. These findings may benefit software development managers considerably.

Keywords: prediction model, statistical analysis, software project, user satisfaction factor

Procedia PDF Downloads 85
7 Investigating the Relationship Between Corporate Governance and Financial Performance Considering the Moderating Role of Opinion and Internal Control Weakness

Authors: Fatemeh Norouzi

Abstract:

Today, financial performance has become one of the important issues in accounting and auditing that companies and their managers have paid attention to this issue and for this reason to the variables that are influential in this field. One of the things that can affect financial performance is corporate governance, which is examined in this research, although some things such as issues related to auditing can also moderate this relationship; Therefore, this research has been conducted with the aim of investigating the relationship between corporate governance and financial performance with regard to the moderating role of feedback and internal control weakness. The research is practical in terms of purpose, and in terms of method, it has been done in a post-event descriptive manner, in which the data has been analyzed using stock market data. Data collection has been done by using stock exchange data which has been extracted from the website of the Iraqi Stock Exchange, the statistical population of this research is all the companies admitted to the Iraqi Stock Exchange. . The statistical sample in this research is considered from 2014 to 2021, which includes 34 companies. Four different models have been considered for the research hypotheses, which are eight hypotheses, in this research, the analysis has been done using EXCEL and STATA15 software. In this article, collinearity test, integration test ,determination of fixed effects and correlation matrix results, have been used. The research results showed that the first four hypotheses were rejected and the second four hypotheses were confirmed.

Keywords: size of the board of directors, duality of the CEO, financial performance, internal control weakness

Procedia PDF Downloads 50
6 Sea Surface Temperature and Climatic Variables as Drivers of North Pacific Albacore Tuna Thunnus Alalunga Time Series

Authors: Ashneel Ajay Singh, Naoki Suzuki, Kazumi Sakuramoto, Swastika Roshni, Paras Nath, Alok Kalla

Abstract:

Albacore tuna (Thunnus alalunga) is one of the commercially important species of tuna in the North Pacific region. Despite the long history of albacore fisheries in the Pacific, its ecological characteristics are not sufficiently understood. The effects of changing climate on numerous commercially and ecologically important fish species including albacore tuna have been documented over the past decades. The objective of this study was to explore and elucidate the relationship of environmental variables with the stock parameters of albacore tuna. The relationship of the North Pacific albacore tuna recruitment (R), spawning stock biomass (SSB) and recruits per spawning biomass (RPS) from 1970 to 2012 with the environmental factors of sea surface temperature (SST), Pacific decadal oscillation (PDO), El Niño southern oscillation (ENSO) and Pacific warm pool index (PWI) was construed. SST and PDO were used as independent variables with SSB to construct stock reproduction models for R and RPS as they showed most significant relationship with the dependent variables. ENSO and PWI were excluded due to collinearity effects with SST and PDO. Model selections were based on R2 values, Akaike Information Criterion (AIC) and significant parameter estimates at p<0.05. Models with single independent variables of SST, PDO, ENSO and PWI were also constructed to illuminate their individual effect on albacore R and RPS. From the results it can be said that SST and PDO resulted in the most significant models for reproducing North Pacific albacore tuna R and RPS time series. SST has the highest impact on albacore R and RPS when comparing models with single environmental variables. It is important for fishery managers and decision makers to incorporate the findings into their albacore tuna management plans for the North Pacific Oceanic region.

Keywords: Albacore tuna, El Niño southern oscillation, Pacific decadal oscillation, sea surface temperature

Procedia PDF Downloads 202
5 Analysing Time Series for a Forecasting Model to the Dynamics of Aedes Aegypti Population Size

Authors: Flavia Cordeiro, Fabio Silva, Alvaro Eiras, Jose Luiz Acebal

Abstract:

Aedes aegypti is present in the tropical and subtropical regions of the world and is a vector of several diseases such as dengue fever, yellow fever, chikungunya, zika etc. The growth in the number of arboviruses cases in the last decades became a matter of great concern worldwide. Meteorological factors like mean temperature and precipitation are known to influence the infestation by the species through effects on physiology and ecology, altering the fecundity, mortality, lifespan, dispersion behaviour and abundance of the vector. Models able to describe the dynamics of the vector population size should then take into account the meteorological variables. The relationship between meteorological factors and the population dynamics of Ae. aegypti adult females are studied to provide a good set of predictors to model the dynamics of the mosquito population size. The time-series data of capture of adult females of a public health surveillance program from the city of Lavras, MG, Brazil had its association with precipitation, humidity and temperature analysed through a set of statistical methods for time series analysis commonly adopted in Signal Processing, Information Theory and Neuroscience. Cross-correlation, multicollinearity test and whitened cross-correlation were applied to determine in which time lags would occur the influence of meteorological variables on the dynamics of the mosquito abundance. Among the findings, the studied case indicated strong collinearity between humidity and precipitation, and precipitation was selected to form a pair of descriptors together with temperature. In the techniques used, there were observed significant associations between infestation indicators and both temperature and precipitation in short, mid and long terms, evincing that those variables should be considered in entomological models and as public health indicators. A descriptive model used to test the results exhibits a strong correlation to data.

Keywords: Aedes aegypti, cross-correlation, multicollinearity, meteorological variables

Procedia PDF Downloads 146
4 A Meta-Analysis of the Academic Achievement of Students With Emotional/Behavioral Disorders in Traditional Public Schools in the United States

Authors: Dana Page, Erica McClure, Kate Snider, Jenni Pollard, Tim Landrum, Jeff Valentine

Abstract:

Extensive research has been conducted on students with emotional and behavioral disorders (EBD) and their rates of challenging behavior. In the past, however, less attention has been given to their academic achievement and outcomes. Recent research examining outcomes for students with EBD has indicated that these students receive lower grades, are less likely to pass classes, and experience higher rates of school dropout than students without disabilities and students with other high incidence disabilities. Given that between 2% and 20% of the school-age population is likely to have EBD (though many may not be identified as such), this is no small problem. Despite the need for increased examination of this population’s academic achievement, research on the actual performance of students with EBD has been minimal. This study reports the results of a meta-analysis of the limited research examining academic achievement of students with EBD, including effect sizes of assessment scores and discussion of moderators potentially impacting academic outcomes. Researchers conducted a thorough literature search to identify potentially relevant documents before screening studies for inclusion in the systematic review. Screening identified 35 studies that reported results of academic assessment scores for students with EBD. These studies were then coded to extract descriptive data across multiple domains, including placement of students, participant demographics, and academic assessment scores. Results indicated possible collinearity between EBD disability status and lower academic assessment scores, despite a lack of association between EBD eligibility and lower cognitive ability. Quantitative analysis of assessment results yielded effect sizes for academic achievement of student participants, indicating lower performance levels and potential moderators (e.g., race, socioeconomic status, and gender) impacting student academic performance. In addition to discussing results of the meta-analysis, implications and areas for future research, policy, and practice are discussed.

Keywords: students with emotional behavioral disorders, academic achievement, systematic review, meta-analysis

Procedia PDF Downloads 32
3 Adolescent-Parent Relationship as the Most Important Factor in Preventing Mood Disorders in Adolescents: An Application of Artificial Intelligence to Social Studies

Authors: Elżbieta Turska

Abstract:

Introduction: One of the most difficult times in a person’s life is adolescence. The experiences in this period may shape the future life of this person to a large extent. This is the reason why many young people experience sadness, dejection, hopelessness, sense of worthlessness, as well as losing interest in various activities and social relationships, all of which are often classified as mood disorders. As many as 15-40% adolescents experience depressed moods and for most of them they resolve and are not carried into adulthood. However, (5-6%) of those affected by mood disorders develop the depressive syndrome and as many as (1-3%) develop full-blown clinical depression. Materials: A large questionnaire was given to 2508 students, aged 13–16 years old, and one of its parts was the Burns checklist, i.e. the standard test for identifying depressed mood. The questionnaire asked about many aspects of the student’s life, it included a total of 53 questions, most of which had subquestions. It is important to note that the data suffered from many problems, the most important of which were missing data and collinearity. Aim: In order to identify the correlates of mood disorders we built predictive models which were then trained and validated. Our aim was not to be able to predict which students suffer from mood disorders but rather to explore the factors influencing mood disorders. Methods: The problems with data described above practically excluded using all classical statistical methods. For this reason, we attempted to use the following Artificial Intelligence (AI) methods: classification trees with surrogate variables, random forests and xgboost. All analyses were carried out with the use of the mlr package for the R programming language. Resuts: The predictive model built by classification trees algorithm outperformed the other algorithms by a large margin. As a result, we were able to rank the variables (questions and subquestions from the questionnaire) from the most to least influential as far as protection against mood disorder is concerned. Thirteen out of twenty most important variables reflect the relationships with parents. This seems to be a really significant result both from the cognitive point of view and also from the practical point of view, i.e. as far as interventions to correct mood disorders are concerned.

Keywords: mood disorders, adolescents, family, artificial intelligence

Procedia PDF Downloads 72
2 Private and Public Health Sector Difference on Client Satisfaction: Results from Secondary Data Analysis in Sindh, Pakistan

Authors: Wajiha Javed, Arsalan Jabbar, Nelofer Mehboob, Muhammad Tafseer, Zahid Memon

Abstract:

Introduction: Researchers globally have strived to explore diverse factors that augment the continuation and uptake of family planning methods. Clients’ satisfaction is one of the core determinants facilitating continuation of family planning methods. There is a major debate yet scanty evidence to contrast public and private sectors with respect to client satisfaction. The objective of this study is to compare quality-of-care provided by public and private sectors of Pakistan through a client satisfaction lens. Methods: We used Pakistan Demographic Heath Survey 2012-13 dataset (Sindh province) on a total of 3133 Married Women of Reproductive Age (MWRA) aged 15-49 years. Source of family planning (public/private sector) was the main exposure variable. Outcome variable was client satisfaction judged by ten different dimensions of client satisfaction. Means and standard deviations were calculated for continuous variable while for categorical variable frequencies and percentages were computed. For univariate analysis, Chi-square/Fisher Exact test was used to find an association between clients’ satisfaction in public and private sectors. Ten different multivariate models were made. Variables were checked for multi-collinearity, confounding, and interaction, and then advanced logistic regression was used to explore the relationship between client satisfaction and dependent outcome after adjusting for all known confounding factors and results are presented as OR and AOR (95% CI). Results: Multivariate analyses showed that clients were less satisfied in contraceptive provision from private sector as compared to public sector (AOR 0.92,95% CI 0.63-1.68) even though the result was not statistically significant. Clients were more satisfied from private sector as compared to the public sector with respect to other determinants of quality-of-care (follow-up care (AOR 3.29, 95% CI 1.95-5.55), infection prevention (AOR 2.41, 95% CI 1.60-3.62), counseling services (AOR 2.01, 95% CI 1.27-3.18, timely treatment (AOR 3.37, 95% CI 2.20-5.15), attitude of staff (AOR 2.23, 95% CI 1.50-3.33), punctuality of staff (AOR 2.28, 95% CI 1.92-4.13), timely referring (AOR 2.34, 95% CI 1.63-3.35), staff cooperation (AOR 1.75, 95% CI 1.22-2.51) and complications handling (AOR 2.27, 95% CI 1.56-3.29).

Keywords: client satisfaction, family planning, public private partnership, quality of care

Procedia PDF Downloads 381
1 Determinants of Never Users of Contraception-Results from Pakistan Demographic and Health Survey 2012-13

Authors: Arsalan Jabbar, Wajiha Javed, Nelofer Mehboob, Zahid Memon

Abstract:

Introduction: There are multiple social, individual and cultural factors that influence an individual’s decision to adopt family planning methods especially among non-users in patriarchal societies like Pakistan.Non-users, if targeted efficiently, can contribute significantly to country’s CPR. A research study showed that non-users if convinced to adopt lactational amenorrhea method can shift to long-term methods in future. Research shows that if non-users are targeted efficiently a 59% reduction in unintended pregnancies in Saharan Africa and South-Central and South-East Asia is anticipated. Methods: We did secondary data analysis on Pakistan Demographic Heath Survey (2012-13) dataset. Use of contraception (never-use/ever-use) was the outcome variable. At univariate level Chi-square/Fisher Exact test was used to assess relationship of baseline covariates with contraception use. Then variables to be incorporated in the model were checked for multi-collinearity, confounding, and interaction. Then binary logistic regression (with an urban-rural stratification) was done to find the relationship between contraception use and baseline demographic and social variables. Results: The multivariate analyses of the study showed that younger women (≤ 29 years) were more prone to be never users as compared to those who were > 30 years and this trend was seen in urban areas (AOR 1.92, CI 1.453-2.536) as well as rural areas (AOR 1.809, CI 1.421-2.303). While looking at regional variation, women from urban Sindh (AOR 1.548, CI 1.142-2.099) and urban Balochistan (AOR 2.403, CI 1.504-3.839) had more never users as compared to other urban regions. Women in the rich wealth quintile were more never users and this was seen both in urban and rural localities (urban (AOR 1.106 CI .753-1.624); rural areas (AOR 1.162, CI .887-1.524)) even though these were not statistically significant. Women idealizing more children(> 4) are more never users as compared to those idealizing less children in both urban (AOR 1.854, CI 1.275-2.697) and rural areas (AOR 2.101, CI 1.514-2.916). Women who never lost a pregnancy were more inclined to be non-users in rural areas (AOR 1.394, CI 1.127-1.723) .Women familiar with only traditional or no method had more never users in rural areas (AOR 1.717, CI 1.127-1.723) but in urban areas it wasn’t significant. Women unaware of Lady Health Worker’s presence in their area were more never users especially in rural areas (AOR 1.276, CI 1.014-1.607). Women who did not visit any care provider were more never users (urban (AOR 11.738, CI 9.112-15.121) rural areas (AOR 7.832, CI 6.243-9.826)). Discussion/Conclusion: This study concluded that government, policy makers and private sector family planning programs should focus on the untapped pool of never users (younger women from underserved provinces, in higher wealth quintiles, who desire more children.). We need to make sure to cover catchment areas where there are less LHWs and less providers as ignorance to modern methods and never been visited by an LHW are important determinants of never use. This all is in sync with previous literate from similar developing countries.

Keywords: contraception, demographic and health survey, family planning, never users

Procedia PDF Downloads 375