Search results for: Tobit regression model
18188 Towards an Effective Approach for Modelling near Surface Air Temperature Combining Weather and Satellite Data
Authors: Nicola Colaninno, Eugenio Morello
Abstract:
The urban environment affects local-to-global climate and, in turn, suffers global warming phenomena, with worrying impacts on human well-being, health, social and economic activities. Physic-morphological features of the built-up space affect urban air temperature, locally, causing the urban environment to be warmer compared to surrounding rural. This occurrence, typically known as the Urban Heat Island (UHI), is normally assessed by means of air temperature from fixed weather stations and/or traverse observations or based on remotely sensed Land Surface Temperatures (LST). The information provided by ground weather stations is key for assessing local air temperature. However, the spatial coverage is normally limited due to low density and uneven distribution of the stations. Although different interpolation techniques such as Inverse Distance Weighting (IDW), Ordinary Kriging (OK), or Multiple Linear Regression (MLR) are used to estimate air temperature from observed points, such an approach may not effectively reflect the real climatic conditions of an interpolated point. Quantifying local UHI for extensive areas based on weather stations’ observations only is not practicable. Alternatively, the use of thermal remote sensing has been widely investigated based on LST. Data from Landsat, ASTER, or MODIS have been extensively used. Indeed, LST has an indirect but significant influence on air temperatures. However, high-resolution near-surface air temperature (NSAT) is currently difficult to retrieve. Here we have experimented Geographically Weighted Regression (GWR) as an effective approach to enable NSAT estimation by accounting for spatial non-stationarity of the phenomenon. The model combines on-site measurements of air temperature, from fixed weather stations and satellite-derived LST. The approach is structured upon two main steps. 
First, a GWR model has been set to estimate NSAT at low resolution, by combining air temperature from discrete observations retrieved by weather stations (dependent variable) and the LST from satellite observations (predictor). At this step, MODIS data from the Terra satellite, at 1 kilometer of spatial resolution, have been employed. Two time periods are considered according to the satellite revisit period, i.e. 10:30 am and 9:30 pm. Afterward, the results have been downscaled to 30 meters of spatial resolution by setting a GWR model between the previously retrieved near-surface air temperature (dependent variable) and two predictors, both at 30 meters: the albedo, derived from the multispectral information provided by the Landsat mission, and the Digital Elevation Model (DEM) from the Shuttle Radar Topography Mission (SRTM). The area under investigation is the Metropolitan City of Milan, which covers approximately 1,575 km² and has a population of over 3 million inhabitants. Both models, low- (1 km) and high-resolution (30 meters), have been validated by cross-validation using indicators such as R², Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). All the employed indicators give evidence of highly efficient models. In addition, an alternative network of weather stations, available for the City of Milan only, has been employed for testing the accuracy of the predicted temperatures, giving an RMSE of 0.6 and 0.7 for daytime and night-time, respectively.
Keywords: urban climate, urban heat island, geographically weighted regression, remote sensing
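The three validation indicators named in this abstract can be computed from paired observed and predicted values. The sketch below is only illustrative: the temperatures are made-up numbers, not data from the study, and R² is assumed here to be the coefficient of determination (1 minus the ratio of residual to total sum of squares), which the abstract does not specify.

```python
import math

def regression_metrics(observed, predicted):
    """Return (r2, rmse, mae) for paired observed and predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)                   # root mean squared error
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot                     # assumed definition of R^2
    return r2, rmse, mae

# Hypothetical station temperatures (deg C) and model predictions.
observed = [21.0, 23.5, 25.0, 22.0]
predicted = [21.5, 23.0, 25.5, 22.5]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.2f} MAE={mae:.2f}")
```

In a cross-validation setting, the same function would be applied to the held-out stations of each fold.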
Procedia PDF Downloads 195
18187 Effect of Pregnancy Intention, Postnatal Depressive Symptoms and Social Support on Early Childhood Stunting: Findings from India
Authors: Swati Srivastava, Ashish Kumar Upadhyay
Abstract:
Background: According to the United Nations Children’s Fund, about 165 million children worldwide were stunted in 2012, and India alone accounts for 38% of the global burden of stunting. India is home to more than 60 million stunted children. Our study aims to examine the effect of pregnancy intention and maternal postnatal depressive symptoms on early childhood stunting in India. We hypothesized that the effects of pregnancy intention and postnatal maternal depressive symptoms were mediated by social support. Methods: We used data from the first wave of the Young Lives Study India. Out of 2011 children recruited in the original cohort, 1833 children had complete information on pregnancy intention, maternal depression and other variables. A series of multivariate logistic regression models was used to examine the effect of pregnancy intention and postnatal depressive symptoms on early childhood stunting. Results: Bivariate results indicate that a higher percentage of children born after unintended pregnancy (40%) were stunted than children of intended pregnancy (26%). Likewise, the proportion of stunted children was also higher among women with high postnatal depressive symptoms (35%) than among those with a low level of depression (24%). Results of the multivariate logistic regression model indicate that children born after unintended pregnancy were significantly more likely to be stunted than children born after intended pregnancy (Coefficient: 1.70, CI: 1.17, 2.48). Likewise, early childhood stunting was also associated with maternal postnatal depressive symptoms (Coefficient: 1.48, CI: 1.16, 1.88). The effect of pregnancy intention and postnatal depressive symptoms on early childhood stunting remains unchanged after controlling for social support and other variables.
Conclusions: The findings of this study provide conclusive evidence regarding the consequences of pregnancy intention and postnatal depressive symptoms for early childhood stunting in India. Therefore, there is a need to identify women with unintended pregnancy and to incorporate the promotion of mental health into the national reproductive and child health programme.
Keywords: pregnancy intention, postnatal depressive symptoms, social support, childhood stunting, young lives study, India
Procedia PDF Downloads 302
18186 Recommender Systems Using Ensemble Techniques
Authors: Yeonjeong Lee, Kyoung-jae Kim, Youngtae Kim
Abstract:
This study proposes a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting the user’s precise preferences. The proposed model consists of two steps. In the first step, this study uses logistic regression, decision trees, and artificial neural networks to predict customers who have a high likelihood of purchasing products in each product group. Then, this study combines the results of each predictor using multi-model ensemble techniques such as bagging and bumping. In the second step, this study uses market basket analysis to extract association rules for co-purchased products. Finally, through these two steps, the system selects customers who have a high likelihood of purchasing products in each product group and recommends suitable products from the same or different product groups to them. We test the usability of the proposed system by using a prototype and real-world transaction and profile data. In addition, we survey user satisfaction with the recommended product list from the proposed system and with randomly selected product lists. The results show that the proposed system may be useful in a real-world online shopping store.
Keywords: product recommender system, ensemble technique, association rules, decision tree, artificial neural networks
Procedia PDF Downloads 294
18185 A Study of Classification Models to Predict Drill-Bit Breakage Using Degradation Signals
Authors: Bharatendra Rai
Abstract:
Cutting tools are widely used in manufacturing processes, and drilling is the most commonly used machining process. Although drill-bits used in drilling may not be expensive, their breakage can damage the expensive workpiece being drilled and at the same time has a major impact on productivity. Predicting drill-bit breakage, therefore, is important in reducing cost and improving productivity. This study uses twenty features extracted from two degradation signals, viz., thrust force and torque. The methodology involves developing and comparing decision tree, random forest, and multinomial logistic regression models for classifying and predicting drill-bit breakage using degradation signals.
Keywords: degradation signal, drill-bit breakage, random forest, multinomial logistic regression
Procedia PDF Downloads 352
18184 A Study on Reliability of Gender and Stature Determination by Odontometric and Craniofacial Anthropometric Parameters
Authors: Churamani Pokhrel, C. B. Jha, S. R. Niraula, P. R. Pokharel
Abstract:
Human identification is one of the most challenging subjects that man has confronted. The determination of adult sex and stature are two of the four key factors (sex, stature, age, and race) in the identification of an individual. Craniofacial and odontometric parameters are important tools for forensic anthropologists when it is not possible to apply advanced techniques for identification purposes. The present study provides the anthropometric correlation of the parameters with stature and gender and also devises regression formulae for the reconstruction of stature. A total of 312 Nepalese students with an equal distribution of sex, i.e., 156 male and 156 female students aged 18-35 years, were taken for the study. A total of 10 parameters were measured (age, sex, stature, head circumference, head length, head breadth, facial height, bi-zygomatic width, mesio-distal canine width and inter-canine distance of both maxilla and mandible). Correlation and regression analysis was done to find the association between the parameters. All parameters were found to be greater in males than in females, and each difference was found to be statistically significant. Across the 312 samples, the best regressors for the determination of stature were head circumference and mandibular inter-canine width, and those for gender were head circumference and right mandibular teeth. The accuracy of prediction was 83%. Regression equations and analysis generated from craniofacial and odontometric parameters can be a supplementary approach for the estimation of stature and gender when extremities are not available.
Keywords: craniofacial, gender, odontometric, stature
Procedia PDF Downloads 191
18183 Predicting College Students’ Happiness During COVID-19 Pandemic; Be optimistic and Well in College!
Authors: Michiko Iwasaki, Jane M. Endres, Julia Y. Richards, Andrew Futterman
Abstract:
The present study aimed to examine college students’ happiness during the COVID-19 pandemic. Using online survey data from 96 college students in the U.S., a regression analysis was conducted to predict college students’ happiness. The results indicated that a four-predictor model (optimism, college students’ subjective wellbeing, coronavirus stress, and spirituality) explained 57.9% of the variance in students’ subjective happiness, F(4, 77) = 26.428, p < .001, R² = .579, 95% CI [.41, .66]. The study suggests the importance of learned optimism among college students.
Keywords: COVID-19, optimism, spirituality, well-being
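The reported F statistic can be related to the reported R² through the standard identity for the overall F test of a linear regression, F = (R²/df_model) / ((1 − R²)/df_residual). The short sketch below applies that identity to the figures in the abstract; the function name is illustrative, and it is an assumption that the authors' F value comes from this overall test.

```python
def f_from_r2(r2, df_model, df_resid):
    """Overall regression F statistic implied by R^2 and the degrees of freedom."""
    return (r2 / df_model) / ((1.0 - r2) / df_resid)

# Values taken from the abstract: R^2 = .579 with F(4, 77).
f_value = f_from_r2(0.579, 4, 77)
print(round(f_value, 2))
```

The result is about 26.5, close to the reported 26.428; a gap of this size is what one would expect if R² was rounded to three decimals before being reported.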
Procedia PDF Downloads 226
18182 Equivalent Circuit Model for the Eddy Current Damping with Frequency-Dependence
Authors: Zhiguo Shi, Cheng Ning Loong, Jiazeng Shan, Weichao Wu
Abstract:
This study proposes an equivalent circuit model to simulate the eddy current damping force with shaking table tests and finite element modeling. The model is first proposed and applied to a simple eddy current damper, which is modelled in ANSYS, indicating that the proposed model can simulate the eddy current damping force under different types of excitations. Then, a non-contact and friction-free eddy current damper is designed and tested, and the proposed model can reproduce the experimental observations. The excellent agreement between the simulated results and the experimental data validates the accuracy and reliability of the equivalent circuit model. Furthermore, a more complicated model is built in ANSYS to verify the feasibility of the equivalent circuit model for a complex eddy current damper, and the higher-order fractional model and viscous model are adopted for comparison.
Keywords: equivalent circuit model, eddy current damping, finite element model, shake table test
Procedia PDF Downloads 191
18181 Determination Power and Sample Size Zero-Inflated Negative Binomial Dependent Death Rate of Age Model (ZINBD): Regression Analysis Mortality Acquired Immune Deficiency Syndrome (AIDS)
Authors: Mohd Asrul Affendi Bin Abdullah
Abstract:
Sample size calculation is especially important for zero-inflated models because a large sample size is required to detect a significant effect with this model. This paper shows how to present the percentage of power approximation for categorical models, which is then extended to zero-inflated models. The Wald test was chosen to determine the power and sample size for the AIDS death rate because it is frequently used owing to its approachability, and several major recent contributions to sample size calculation are based on this test. Power calculation can be conducted when covariates, including categorical covariates, are used in modeling ‘excess zero’ data. An analysis of an AIDS death rate study is used for this paper. The aim of this study is to determine the power for the sample size (N = 945) of the categorical death rate, based on parameter estimates in the simulation study.
Keywords: power sample size, Wald test, standardize rate, ZINBDR
Procedia PDF Downloads 436
18180 Determinants of Diarrhoea Prevalence Variations in Mountainous Informal Settlements of Kigali City, Rwanda
Authors: Dieudonne Uwizeye
Abstract:
Introduction: Diarrhoea is one of the major causes of morbidity and mortality among communities living in urban informal settlements of developing countries. It is assumed that mountainous environment introduces variations of the burden among residents of the same settlements. Design and Objective: A cross-sectional study was done in Kigali to explore the effect of mountainous informal settlements on diarrhoea risk variations. Data were collected among 1,152 households through household survey and transect walk to observe the status of sanitation. The outcome variable was the incidence of diarrhoea among household members of any age. The study used the most knowledgeable person in the household as the main respondent. Mostly this was the woman of the house as she was more likely to know the health status of every household member as she plays various roles: mother, wife, and head of the household among others. The analysis used cross tabulation and logistic regression analysis. Results: Results suggest that risks for diarrhoea vary depending on home location in the settlements. Diarrhoea risk increased as the distance from the road increased. The results of the logistic regression analysis indicate the adjusted odds ratio of 2.97 with 95% confidence interval being 1.35-6.55 and 3.50 adjusted odds ratio with 95% confidence interval being 1.61-7.60 in level two and three respectively compared with level one. The status of sanitation within and around homes was also significantly associated with the increase of diarrhoea. Equally, it is indicated that stable households were less likely to have diarrhoea. The logistic regression analysis indicated the adjusted odds ratio of 0.45 with 95% confidence interval being 0.25-0.81. However, the study did not find evidence for a significant association between diarrhoea risks and household socioeconomic status in the multivariable model. It is assumed that environmental factors in mountainous settings prevailed. 
Households using the available public water sources were more likely to have diarrhoea. Recommendation: The study recommends the provision and extension of infrastructure for improved water, drainage, sanitation and waste management facilities. Equally, studies should be done to identify the level of contamination and the potential origin of contaminants for water sources in the valleys to adequately control the risks of diarrhoea in mountainous urban settings.
Keywords: urbanisation, diarrhoea risk, mountainous environment, urban informal settlements in Rwanda
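The adjusted odds ratios and confidence intervals quoted in this abstract can be mapped back to the logistic regression scale. The sketch below assumes a Wald-type 95% interval that is symmetric on the log-odds scale (z = 1.96), which the abstract does not state; the function name is illustrative.

```python
import math

def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the logistic coefficient and its standard error from an OR and its CI.

    Assumes the interval was built as exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se

# Level two versus level one, as reported: OR 2.97, 95% CI 1.35-6.55.
beta, se = log_odds_summary(2.97, 1.35, 6.55)

# Rebuild the interval from beta and se as a consistency check.
rebuilt = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
print(round(beta, 3), round(se, 3), [round(x, 2) for x in rebuilt])
```

Under this assumption the coefficient is roughly 1.09 with a standard error of roughly 0.40, and the rebuilt interval stays close to the reported one, which suggests the reported numbers are at least consistent with a Wald interval.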
Procedia PDF Downloads 170
18179 Indoor Air Pollution of the Flexographic Printing Environment
Authors: Jelena S. Kiurski, Vesna S. Kecić, Snežana M. Aksentijević
Abstract:
The identification and evaluation of organic and inorganic pollutants were performed in a flexographic facility in Novi Sad, Serbia. Air samples were collected and analyzed in situ, during 4 hours of working time, at five sampling points by a mobile gas chromatograph and ozonometer during the printing of collagen casing. Experimental results showed that the concentrations of isopropyl alcohol, acetone, total volatile organic compounds and ozone varied during the sampling times. The highest average concentrations of 94.80 ppm and 102.57 ppm were reached 200 minutes after the start of production for isopropyl alcohol and total volatile organic compounds, respectively. The mutual dependences between the target hazardous and microclimate parameters were confirmed using a multiple linear regression model with the software package STATISTICA 10. The multiple coefficients of determination obtained for ozone and acetone (0.507 and 0.589) with microclimate parameters indicated a moderate correlation between the observed variables. However, a strong positive correlation was obtained for isopropyl alcohol and total volatile organic compounds (0.760 and 0.852) with microclimate parameters. Values of F higher than Fcritical for all examined dependences indicated a statistically significant relationship between the concentration levels of the target pollutants and the microclimate parameters. Given that the microclimate parameters significantly affect the emission of the investigated gases, the application of eco-friendly materials in the production process is a necessity.
Keywords: flexographic printing, indoor air, multiple regression analysis, pollution emission
Procedia PDF Downloads 197
18178 Selection of Pichia kudriavzevii Strain for the Production of Single-Cell Protein from Cassava Processing Waste
Authors: Phakamas Rachamontree, Theerawut Phusantisampan, Natthakorn Woravutthikul, Peerapong Pornwongthong, Malinee Sriariyanun
Abstract:
A total of 115 yeast strains isolated from local cassava processing wastes were measured for crude protein content. Among these strains, the strain MSY-2 possessed the highest protein concentration (>3.5 mg protein/mL). By using molecular identification tools, it was identified as a strain of Pichia kudriavzevii based on the similarity of the D1/D2 domain of the 26S rDNA region. In this study, Response Surface Methodology (RSM) was applied to optimize protein production by the MSY-2 strain. The tested parameters were the carbon content, nitrogen content, and incubation time. Here, the coefficient of determination (R² = 0.7194) indicates that about 72% of the variability could be explained by the model, which supports the significance of the model. Under the optimal condition, protein was produced at up to 3.77 g per L of culture, and the MSY-2 strain contained 66.8 g protein per 100 g of cell dry weight. These results revealed the plausibility of applying the novel yeast strain in single-cell protein production.
Keywords: single cell protein, response surface methodology, yeast, cassava processing waste
Procedia PDF Downloads 403
18177 The Effect of Peer Pressure and Leisure Boredom on Substance Use Among Adolescents in Low-Income Communities in Capetown
Authors: Gaironeesa Hendricks, Shazly Savahl, Maria Florence
Abstract:
The aim of the study is to determine whether peer pressure and leisure boredom influence substance use among adolescents in low-income communities in Cape Town. Non-probability sampling was used to select 296 adolescents between the ages of 16–18 from schools located in two low-income communities. The measurement tools included the Drug Use Disorders Identification Test, the Resistance to Peer Influence Scale and the Leisure Boredom Scale. Multiple regression revealed that the combined influence of peer pressure and leisure boredom predicted substance use, while peer pressure emerged as a stronger predictor of substance use among adolescents than leisure boredom.
Keywords: substance use, peer pressure, leisure boredom, adolescents, multiple regression
Procedia PDF Downloads 599
18176 Camera Model Identification for Mi Pad 4, Oppo A37f, Samsung M20, and Oppo f9
Authors: Ulrich Wake, Eniman Syamsuddin
Abstract:
The model for camera model identification is trained using the pretrained models ResNet34 and ResNet50. The dataset consists of 500 photos from each phone. The dataset is divided into 1280 photos for training, 320 photos for validation and 400 photos for testing. The model is trained using the One Cycle Policy method and tested using Test-Time Augmentation. Furthermore, the model is trained for 50 epochs using regularization such as dropout and early stopping. The result is 90% accuracy on the validation set and above 85% with Test-Time Augmentation using ResNet50. Every model is also trained by slightly updating the pretrained model’s weights.
Keywords: One Cycle Policy, ResNet34, ResNet50, Test-Time Augmentation
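The split sizes quoted in this abstract can be checked with a few lines of arithmetic. The sketch below only restates the counts given above; the variable names are illustrative, and nothing here reflects how the authors actually assigned photos to each subset.

```python
# Four phones with 500 photos each, as stated in the title and abstract.
phones = ["Mi Pad 4", "Oppo A37f", "Samsung M20", "Oppo f9"]
photos_per_phone = 500
total = len(phones) * photos_per_phone

# Subset sizes reported in the abstract.
split = {"train": 1280, "validation": 320, "test": 400}
assert sum(split.values()) == total  # the three subsets cover all 2000 photos

fractions = {name: count / total for name, count in split.items()}
print(total, fractions)
```

The counts correspond to holding out 20% of the photos for testing and then dividing the remaining 1600 photos 80/20 between training and validation.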
Procedia PDF Downloads 208
18175 Financial Fraud Prediction for Russian Non-Public Firms Using Relational Data
Authors: Natalia Feruleva
Abstract:
The goal of this paper is to develop a fraud risk assessment model based on both relational and financial data and to test the impact of the relationships between Russian non-public companies on the likelihood of committing financial fraud. Relationships mean various linkages between companies, such as parent-subsidiary relationships and person-related relationships. These linkages may provide additional opportunities for committing fraud. Person-related relationships appear when firms share a director, or the director owns another firm. The number of companies owned or managed by the CEO and the number of subsidiaries were calculated to measure the relationships. Moreover, a dummy variable describing the existence of a parent company was also included in the model. Control variables such as financial leverage and return on assets were also included because they describe the motivating factors of fraud. To check the hypotheses about the influence of the chosen parameters on the likelihood of financial fraud, information about person-related relationships between companies, the existence of a parent company and subsidiaries, profitability and the level of debt was collected. The resulting sample consists of 160 Russian non-public firms. The sample includes 80 fraudsters and 80 non-fraudsters operating in 2006-2017. The dependent variable is dichotomous, and it takes the value 1 if the firm is engaged in financial crime, otherwise 0. Employing a probit model, it was revealed that the number of companies owned or managed by the CEO of the firm has a significant impact on the likelihood of financial fraud. The results obtained indicate that the more companies are affiliated with the CEO, the higher the likelihood that the company will be involved in financial crime. The forecast accuracy of the model is about 80%.
Thus, the model based on both relational and financial data gives a high level of forecast accuracy.
Keywords: financial fraud, fraud prediction, non-public companies, regression analysis, relational data
Procedia PDF Downloads 119
18174 Improved Computational Efficiency of Machine Learning Algorithm Based on Evaluation Metrics to Control the Spread of Coronavirus in the UK
Authors: Swathi Ganesan, Nalinda Somasiri, Rebecca Jeyavadhanam, Gayathri Karthick
Abstract:
The COVID-19 crisis presents a substantial and critical hazard to worldwide health. Since the occurrence of the disease in late January 2020 in the UK, the number of people confirmed to be infected has increased tremendously across the country. The purpose of this research is to identify a predictive machine learning model that could forecast COVID-19 cases within the UK. This study concentrates on the statistical data collected from 31st January 2020 to 31st March 2021 in the United Kingdom. Information on total COVID cases registered, new cases encountered on a daily basis, total deaths registered, and patients’ deaths per day due to Coronavirus is collected from the World Health Organisation (WHO). Data preprocessing is carried out to identify any missing values, outliers, or anomalies in the dataset. The data is split in an 8:2 ratio for training and testing purposes to forecast future new COVID cases. Support Vector Machines (SVM), Random Forests, and linear regression algorithms are chosen to study the model performance in the prediction of new COVID-19 cases. The statistical performance of the models in predicting new COVID cases is evaluated using metrics such as the r-squared value and mean squared error. Random Forest outperformed the other two machine learning algorithms with a training accuracy of 99.47% and testing accuracy of 98.26% when n=30. The mean squared error obtained for Random Forest is 4.05e11, which is lower than that of the other predictive models used for this study. The experimental analysis indicates that the Random Forest algorithm can perform more effectively and efficiently in predicting new COVID cases, which could help the health sector to take relevant control measures against the spread of the virus.
Keywords: COVID-19, machine learning, supervised learning, unsupervised learning, linear regression, support vector machine, random forest
Procedia PDF Downloads 121
18173 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition
Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie
Abstract:
In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. The paper provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is presented as a case study of multimodal data fusion approaches, and the open issues arising from the limitations of the current studies are discussed. This paper can serve as a guide for researchers interested in the field of multimodal data fusion, and in audiovisual speech recognition in particular.
Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks
Procedia PDF Downloads 112
18172 Towards Automatic Calibration of In-Line Machine Processes
Authors: David F. Nettleton, Elodie Bugnicourt, Christian Wasiak, Alejandro Rosales
Abstract:
In this presentation, preliminary results are given for the modeling and calibration of two different industrial winding MIMO (Multiple Input Multiple Output) processes using machine learning techniques. In contrast to previous approaches which have typically used ‘black-box’ linear statistical methods together with a definition of the mechanical behavior of the process, we use non-linear machine learning algorithms together with a ‘white-box’ rule induction technique to create a supervised model of the fitting error between the expected and real force measures. The final objective is to build a precise model of the winding process in order to control de-tension of the material being wound in the first case, and the friction of the material passing through the die, in the second case. Case 1, Tension Control of a Winding Process. A plastic web is unwound from a first reel, goes over a traction reel and is rewound on a third reel. The objectives are: (i) to train a model to predict the web tension and (ii) calibration to find the input values which result in a given tension. Case 2, Friction Force Control of a Micro-Pullwinding Process. A core+resin passes through a first die, then two winding units wind an outer layer around the core, and a final pass through a second die. The objectives are: (i) to train a model to predict the friction on die2; (ii) calibration to find the input values which result in a given friction on die2. Different machine learning approaches are tested to build models, Kernel Ridge Regression, Support Vector Regression (with a Radial Basis Function Kernel) and MPART (Rule Induction with continuous value as output). As a previous step, the MPART rule induction algorithm was used to build an explicative model of the error (the difference between expected and real friction on die2). The modeling of the error behavior using explicative rules is used to help improve the overall process model. 
Once the models are built, the inputs are calibrated by generating Gaussian random numbers for each input (taking into account its mean and standard deviation) and comparing the output to a target (desired) output until a closest fit is found. The results of empirical testing show that a high precision is obtained for the trained models and for the calibration process. The learning step is the slowest part of the process (max. 5 minutes for this data), but this can be done offline just once. The calibration step is much faster and in under one minute obtained a precision error of less than 1×10⁻³ for both outputs. To summarize, in the present work two processes have been modeled and calibrated. A fast processing time and high precision have been achieved, which can be further improved by using heuristics to guide the Gaussian calibration. Error behavior has been modeled to help improve the overall process understanding. This has relevance for the quick optimal set-up of many different industrial processes which use a pull-winding type process to manufacture fibre reinforced plastic parts. Acknowledgements to the Openmind project, which is funded by Horizon 2020 European Union funding for Research & Innovation, Grant Agreement number 680820.
Keywords: data model, machine learning, industrial winding, calibration
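The calibration step described here (draw each input from a Gaussian with that input's mean and standard deviation, evaluate the trained model, and keep the candidate whose output is closest to the target) can be sketched as a simple random search. Everything below is a hypothetical illustration: `toy_model` stands in for a trained regressor, and the input names, means, standard deviations and target are invented, not taken from the study.

```python
import random

def calibrate(model, means, stds, target, n_trials=10000, seed=0):
    """Random-search calibration: sample inputs from per-input Gaussians and
    keep the sample whose model output is closest to the target."""
    rng = random.Random(seed)
    best_inputs, best_error = None, float("inf")
    for _ in range(n_trials):
        candidate = [rng.gauss(m, s) for m, s in zip(means, stds)]
        error = abs(model(candidate) - target)
        if error < best_error:
            best_inputs, best_error = candidate, error
    return best_inputs, best_error

def toy_model(inputs):
    """Hypothetical stand-in for a trained tension model (not the authors' model)."""
    speed, brake = inputs
    return 2.0 * speed + 0.5 * brake

best_inputs, best_error = calibrate(
    toy_model, means=[10.0, 4.0], stds=[2.0, 1.0], target=22.0
)
print(best_inputs, best_error)
```

A plain random search like this is easy to run against any fitted model; the heuristics mentioned in the abstract would replace the independent draws with a guided proposal to reach the same error in fewer trials.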
Procedia PDF Downloads 241
18171 Robust Inference with a Skew T Distribution
Authors: M. Qamarul Islam, Ergun Dogan, Mehmet Yazici
Abstract:
There is a growing body of evidence that non-normal data is more prevalent in nature than the normal one. Examples can be quoted from, but not restricted to, the areas of Economics, Finance and Actuarial Science. The non-normality considered here is expressed in terms of fat-tailedness and asymmetry of the relevant distribution. In this study a skew t distribution that can be used to model a data that exhibit inherent non-normal behavior is considered. This distribution has tails fatter than a normal distribution and it also exhibits skewness. Although maximum likelihood estimates can be obtained by solving iteratively the likelihood equations that are non-linear in form, this can be problematic in terms of convergence and in many other respects as well. Therefore, it is preferred to use the method of modified maximum likelihood in which the likelihood estimates are derived by expressing the intractable non-linear likelihood equations in terms of standardized ordered variates and replacing the intractable terms by their linear approximations obtained from the first two terms of a Taylor series expansion about the quantiles of the distribution. These estimates, called modified maximum likelihood estimates, are obtained in closed form. Hence, they are easy to compute and to manipulate analytically. In fact the modified maximum likelihood estimates are equivalent to maximum likelihood estimates, asymptotically. Even in small samples the modified maximum likelihood estimates are found to be approximately the same as maximum likelihood estimates that are obtained iteratively. It is shown in this study that the modified maximum likelihood estimates are not only unbiased but substantially more efficient than the commonly used moment estimates or the least square estimates that are known to be biased and inefficient in such cases. 
Furthermore, in conventional regression analysis, it is assumed that the error terms are normally distributed and, hence, the well-known least square method is considered a suitable and preferred method for making the relevant statistical inferences. However, a number of empirical studies have shown that non-normal errors are more prevalent. Even transforming and/or filtering techniques may not produce normally distributed residuals. Here, multiple linear regression models with random errors having a non-normal pattern are studied. Through an extensive simulation, it is shown that the modified maximum likelihood estimates of regression parameters are plausibly robust to the distributional assumptions and to various data anomalies as compared to the widely used least square estimates. Relevant tests of hypothesis are developed and explored for desirable properties in terms of their size and power. The tests based upon modified maximum likelihood estimates are found to be substantially more powerful than the tests based upon least square estimates. Several examples are provided from the areas of Economics and Finance where such distributions are interpretable in terms of the efficient market hypothesis with respect to asset pricing, portfolio selection, risk measurement and capital allocation, etc.
Keywords: least square estimates, linear regression, maximum likelihood estimates, modified maximum likelihood method, non-normality, robustness
Procedia PDF Downloads 397
18170 Impact of Grade Sensitivity on Learning Motivation and Academic Performance
Authors: Salwa Aftab, Sehrish Riaz
Abstract:
The objective of this study was to examine the impact of grade sensitivity on the learning motivation and academic performance of students, to resolve the differences that exist among students regarding the cause of their learning motivation, and to gain knowledge about this matter, since it has not been adequately researched. Data were collected primarily from the academic sector of Pakistan and depended solely on the responses given by students. A sample of 208 university students was selected. Both paper and online surveys were used to collect data from respondents. The results of the study revealed that grade sensitivity has a positive relationship with the learning motivation of students and their academic performance. These findings were obtained through systematic correlation and regression analysis.
Keywords: academic performance, correlation, grade sensitivity, learning motivation, regression
Procedia PDF Downloads 400
18169 Probability Sampling in Matched Case-Control Study in Drug Abuse
Authors: Surya R. Niraula, Devendra B Chhetry, Girish K. Singh, S. Nagesh, Frederick A. Connell
Abstract:
Background: Although random sampling is generally considered to be the gold standard for population-based research, the majority of drug abuse research is based on non-random sampling despite the well-known limitations of this kind of sampling. Method: We compared the statistical properties of two surveys of drug abuse in the same community: one using snowball sampling of drug users who then identified “friend controls” and the other using a random sample of non-drug users (controls) who then identified “friend cases.” Models to predict drug abuse based on risk factors were developed for each data set using conditional logistic regression. We compared the precision of each model using the bootstrap method and the predictive properties of each model using receiver operating characteristic (ROC) curves. Results: Analysis of 100 random bootstrap samples drawn from the snowball-sample data set showed a wide variation in the standard errors of the beta coefficients of the predictive model, none of which achieved statistical significance. On the other hand, bootstrap analysis of the random-sample data set showed less variation and did not change the significance of the predictors at the 5% level when compared to the non-bootstrap analysis. The area under the ROC curve for the model derived from the random-sample data set was similar when fitted to either data set (0.93 for random-sample data vs. 0.91 for snowball-sample data, p=0.35); however, when the model derived from the snowball-sample data set was fitted to each of the data sets, the areas under the curve were significantly different (0.98 vs. 0.83, p < .001). Conclusion: The proposed method of random sampling of controls appears to be superior from a statistical perspective to snowball sampling and may represent a viable alternative to snowball sampling.
Keywords: drug abuse, matched case-control study, non-probability sampling, probability sampling
Procedia PDF Downloads 493
18168 The Effects of a Mathematics Remedial Program on Mathematics Success and Achievement among Beginning Mathematics Major Students: A Regression Discontinuity Analysis
Authors: Kuixi Du, Thomas J. Lipscomb
Abstract:
Proficiency in Mathematics skills is fundamental to success in the STEM disciplines. In the US, beginning college students who are placed in remedial/developmental Mathematics courses frequently struggle to achieve academic success. Therefore, Mathematics remediation in college has become an important concern, and providing Mathematics remediation is a prevalent way to help students who may not be fully prepared for college-level courses. Programs vary, however, and the effectiveness of a particular remedial Mathematics program must be empirically demonstrated. The purpose of this study was to apply the sharp regression discontinuity (RD) technique to determine the effectiveness of the Jack Leaps Summer (JLS) Mathematics remediation program in supporting improved Mathematics learning outcomes among newly admitted Mathematics students at South Dakota State University. The researchers studied the newly admitted Fall 2019 cohort of Mathematics majors (n=423). The results indicated that students whose pretest score was lower than the cut-off point and who were assigned to the JLS program experienced significantly higher scores on the post-test (Math 101 final score). Based on these results, there is evidence that the JLS program is effective in meeting its primary objective.
Keywords: causal inference, mathematics remedial program evaluation, quasi-experimental research design, regression discontinuity design, cohort studies
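A sharp regression discontinuity estimate of the kind this abstract describes can be sketched by fitting a separate line on each side of the cut-off and comparing the two fitted values at the cut-off. The snippet below is a minimal illustration, not the authors' analysis: the cut-off of 50, the score range and the 5-point jump are synthetic assumptions, and no bandwidth selection or significance testing is attempted.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rd_effect(pretest, posttest, cutoff):
    """Fitted outcome at the cutoff for treated (below) minus untreated (at or above)."""
    below = [(x, y) for x, y in zip(pretest, posttest) if x < cutoff]
    above = [(x, y) for x, y in zip(pretest, posttest) if x >= cutoff]
    a_treated, b_treated = fit_line(*zip(*below))
    a_control, b_control = fit_line(*zip(*above))
    return (a_treated + b_treated * cutoff) - (a_control + b_control * cutoff)


# Synthetic, noise-free data: students below the cutoff receive a 5-point boost.
cutoff = 50
pretest = list(range(30, 71))
posttest = [20 + 0.8 * x + (5.0 if x < cutoff else 0.0) for x in pretest]
effect = sharp_rd_effect(pretest, posttest, cutoff)
print(effect)
```

On this synthetic data the estimated jump recovers the 5-point boost built into the treated group, up to floating-point error.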
Procedia PDF Downloads 97
18167 Assessment of Personal Level Exposures to Particulate Matter among Children in Rural Preliminary Schools as an Indoor Air Pollution Monitoring
Authors: Seyedtaghi Mirmohammadi, J. Yazdani, S. M. Asadi, M. Rokni, A. Toosi
Abstract:
There are many indoor air quality studies with an emphasis on indoor particulate matter (PM2.5) monitoring. However, there is a lack of data about indoor PM2.5 concentrations in rural area schools (especially in classrooms), even though preliminary school children are assumed to be more vulnerable to health hazards and spend a large part of their time in classrooms. The objective of this study was to assess indoor PM2.5 concentrations. Fifteen preliminary schools were selected by time-series sampling to evaluate the indoor air quality in the rural district of Sari city, Iran. Indoor air climate parameters (temperature, relative humidity and wind speed) were measured by a hygrometer and thermometer. Particulate matter (PM2.5) was collected and assessed by a Real Time Dust Monitor (MicroDust Pro, Casella, UK). The mean indoor PM2.5 concentration in the studied classrooms was 135 µg/m³. The multiple linear regression revealed a correlation between PM2.5 concentration and relative humidity, distance from the city center and classroom size. Classroom size showed a reasonable negative relationship; the PM2.5 concentration ranged from 65 to 540 µg/m³ and was statistically significant at the 0.05 level, while relative humidity (70 to 85%) and dry bulb temperature (28 to 29°C) were statistically significant at the 0.035 and 0.05 levels, respectively. A statistical predictive model was obtained from multiple regression modeling for PM2.5 and indoor psychrometric parameters.
Keywords: particulate matters, classrooms, regression, concentration, humidity
Procedia PDF Downloads 311
18166 Predict Suspended Sediment Concentration Using Artificial Neural Networks Technique: Case Study Oued El Abiod Watershed, Algeria
Authors: Adel Bougamouza, Boualam Remini, Abd El Hadi Ammari, Feteh Sakhraoui
Abstract:
The assessment of sediments being carried by a river is important for the planning and design of various water resources projects. In this study, Artificial Neural Network techniques are used to estimate the daily suspended sediment concentration for the corresponding daily discharge flow upstream of the Foum El Gherza dam, Biskra, Algeria. The FFNN, GRNN, and RBNN models are established for estimating current suspended sediment values. Statistics including RMSE and R² were used to evaluate the performance of the applied models. The comparison of the three AI models showed that the RBNN model performed better than the FFNN and GRNN models, with R² = 0.967 and RMSE = 5.313 mg/l. Therefore, the ANN model was able to capture the nonlinear relationship between discharge flow and suspended sediment with reasonable precision.
Keywords: artificial neural network, Oued Abiod watershed, feedforward network, generalized regression network, radial basis network, sediment concentration
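The two evaluation statistics named in this abstract can be computed as in the short sketch below. It assumes R² is the coefficient of determination, 1 - SS_res / SS_tot (the abstract does not say which definition was used), and the observed and predicted values are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not sediment data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

RMSE keeps the units of the measured quantity (mg/l in the abstract), while R² is unitless, which is why the two are usually reported together.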
Procedia PDF Downloads 418
18165 Real Estate Trend Prediction with Artificial Intelligence Techniques
Authors: Sophia Liang Zhou
Abstract:
For investors, businesses, consumers, and governments, an accurate assessment of future housing prices is crucial to critical decisions in resource allocation, policy formation, and investment strategies. Previous studies are contradictory about the macroeconomic determinants of housing prices and have largely focused on one or two areas using point prediction. This study aims to develop data-driven models to accurately predict future housing market trends in different markets. This work studied five metropolitan areas representing different market trends and compared three time-lag situations: no lag, 6-month lag, and 12-month lag. Linear regression (LR), random forest (RF), and artificial neural network (ANN) were employed to model the real estate price using datasets with the S&P/Case-Shiller home price index and 12 demographic and macroeconomic features, such as gross domestic product (GDP), resident population, personal income, etc. in five metropolitan areas: Boston, Dallas, New York, Chicago, and San Francisco. The data from March 2005 to December 2018 were collected from the Federal Reserve Bank, FBI, and Freddie Mac. In the original data, some factors are monthly, some quarterly, and some yearly. Thus, two methods to compensate for missing values, backfill and interpolation, were compared. The models were evaluated by accuracy, mean absolute error, and root mean square error. The LR and ANN models outperformed the RF model due to RF’s inherent limitations. Both ANN and LR methods generated predictive models with high accuracy (> 95%). It was found that personal income, GDP, population, and measures of debt consistently appeared as the most important factors. It was also shown that the technique used to compensate for missing values in the dataset and the implementation of a time lag can have a significant influence on model performance and require further investigation.
The best performing models varied for each area, but the backfilled 12-month lag LR models and the interpolated no-lag ANN models showed the most stable performance overall, with accuracies > 95% for each city. This study reveals the influence of input variables in different markets. It also provides evidence to support future studies to identify the optimal time lag and data imputation methods for establishing accurate predictive models.
Keywords: linear regression, random forest, artificial neural network, real estate price prediction
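The two ways of filling mixed-frequency data that this abstract compares, backfill and interpolation, differ as in the sketch below. The quarterly series placed on a monthly grid is invented for illustration, and the helper functions are plain Python written for this example, not the tooling used in the study.

```python
def backfill(series):
    """Replace each missing value (None) with the next observed value."""
    filled = list(series)
    next_value = None  # gaps after the last observation stay None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_value
        else:
            next_value = filled[i]
    return filled


def interpolate(series):
    """Fill gaps linearly between the nearest observed neighbours."""
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


# Hypothetical quarterly indicator laid out on a monthly grid.
quarterly = [100.0, None, None, 103.0, None, None, 109.0]
backfilled = backfill(quarterly)
interpolated = interpolate(quarterly)
print(backfilled)
print(interpolated)
```

Backfill produces a step function that jumps at each reporting date, whereas interpolation spreads the change evenly over the gap. This is one plausible reason the two choices could affect model performance differently.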
Procedia PDF Downloads 103
18164 Big Data Analysis with RHadoop
Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim
Abstract:
It is almost impossible to store or analyze exponentially growing big data with traditional technologies. Hadoop is a new technology that makes this possible. The R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis with different sizes of actual data. Experimental results showed that our RHadoop system became much faster as the number of data nodes increased. We also compared the performance of our RHadoop with the lm function and the biglm package available on bigmemory. The results showed that our RHadoop was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of the data increases.
Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop
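One common way to parallelize multiple regression in a map/reduce setting is to have each map task compute the partial sums X'X and X'y for its chunk, add them in the reduce step, and solve the normal equations once. The sketch below illustrates that idea in plain Python. It is an assumption about the general approach, not the authors' RHadoop code, and the data, chunk size and function names are hypothetical.

```python
from functools import reduce


def map_chunk(rows):
    """Map step: partial sums X'X and X'y for one chunk of (features, y) rows."""
    p = len(rows[0][0]) + 1  # one extra column for the intercept
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, y in rows:
        x = [1.0, *features]
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_pair(a, b):
    """Reduce step: add two partial (X'X, X'y) results element-wise."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(matrix, rhs):
    """Solve matrix * beta = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (aug[r][n] - tail) / aug[r][r]
    return beta


# Synthetic data generated from y = 1 + 2*x1 + 3*x2, split into three chunks.
data = [((x1, x2), 1.0 + 2.0 * x1 + 3.0 * x2) for x1 in range(6) for x2 in range(5)]
chunks = [data[i:i + 10] for i in range(0, len(data), 10)]
xtx, xty = reduce(reduce_pair, map(map_chunk, chunks))
beta = solve(xtx, xty)
print(beta)
```

Only the small matrices X'X and X'y travel between tasks, so the cost of the reduce step does not grow with the number of rows. That is what makes this formulation attractive for distributed data.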
Procedia PDF Downloads 437
18163 Quantitative Structure-Activity Relationship Modeling of Detoxication Properties of Some 1,2-Dithiole-3-Thione Derivatives
Authors: Nadjib Melkemi, Salah Belaidi
Abstract:
Quantitative Structure-Activity Relationship (QSAR) studies have been performed on nineteen molecules of 1,2-dithiole-3-thione analogues. The compounds used are potent inducers of enzymes involved in the maintenance of reduced glutathione pools as well as phase-2 enzymes important to electrophile detoxication. A multiple linear regression (MLR) procedure was used to model the relationships between molecular descriptors and detoxication properties of the 1,2-dithiole-3-thione derivatives. The predictivity of the model was estimated by cross-validation with the leave-one-out method. Our results suggest a QSAR model based on the following descriptors: qS2, qC3, qC5, qS6, DM, Pol, log P, MV, SAG, HE and EHOMO for the specific activity of quinone reductase; qS1, qS2, qC3, qC4, qC5, qS6, DM, Pol, log P, MV, SAG, HE and EHOMO for the production of growth hormone. To confirm the predictive power of the models, an external set of molecules was used. High correlation between experimental and predicted activity values was observed, indicating the validity and good quality of the derived QSAR models.
Keywords: QSAR, quinone reductase activity, production of growth hormone, MLR
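Leave-one-out cross-validation, as mentioned in this abstract, refits the model once per molecule with that molecule held out and then scores the held-out predictions. The sketch below uses a single descriptor and invented values to keep it short; the study itself uses multiple descriptors, and the Q² definition used here (1 - PRESS / SS_tot) is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_q2(xs, ys):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / SS_tot (assumed definition)."""
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # hold out observation i
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor and activity values, not data from the study.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
q2 = loo_q2(descriptor, activity)
print(q2)
```

Q² is typically somewhat lower than the fitted R², because every prediction is made for a molecule the model has not seen.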
Procedia PDF Downloads 350
18162 How Do Crisis Affect Economic Policy?
Authors: Eva Kotlánová
Abstract:
After the recession that began in 2007 in the United States and subsequently spilled over to Europe, a recovery of economic growth could be expected. According to the latest estimates of the economic progress of European countries, this recovery is not strong enough. Where and in which way the economic indicators proceed will depend, among other things, on economic policy. Economic theories postulate that economic subjects prefer a stable, continuous economic policy without repeated and strong fluctuations. Such a policy is perceived as supporting economic growth. Especially in crisis periods, when the government must cope with the consequences of recession, economic policy becomes unpredictable for many subjects and economic policy uncertainty grows, which has a negative influence on economic growth. The aim of this paper is to use panel regression to prove or disprove this hypothesis on the example of the five largest European economies in the period 2008–2012.
Keywords: economic crises in Europe, economic policy, uncertainty, panel analysis regression
Procedia PDF Downloads 386
18161 Investigating the Interaction of Individuals' Knowledge Sharing Constructs
Authors: Eugene Okyere-Kwakye
Abstract:
Knowledge sharing is a practice where individuals commonly exchange both tacit and explicit knowledge to jointly create new knowledge. The knowledge management literature clearly states that knowledge sharing is the keystone, and perhaps the most important aspect, of knowledge management. To enhance the understanding of the knowledge sharing domain, this study aims to investigate some factors that could influence employees’ attitudes and behaviour towards sharing their knowledge. The researchers employed social exchange theory as the theoretical foundation for this study. Three essential factors that could influence knowledge sharing behaviour, namely trust, mutual reciprocity and perceived enjoyment, have been incorporated into a research model. To empirically validate this model, data were collected from one hundred and twenty respondents. Multiple regression analysis was employed to analyse the data. The results indicate that perceived enjoyment and trust have a significant influence on knowledge sharing. Surprisingly, mutual reciprocity did not influence knowledge sharing. The paper concludes by highlighting the practical implications of the findings and areas for future research to consider.
Keywords: perceived enjoyment, trust, knowledge sharing, knowledge management
Procedia PDF Downloads 447
18160 Age Estimation from Upper Anterior Teeth by Pulp/Tooth Ratio Using Peri-Apical X-Rays among Egyptians
Authors: Fatma Mohamed Magdy Badr El Dine, Amr Mohamed Abd Allah
Abstract:
Introduction: Age estimation of individuals is one of the crucial steps in forensic practice. Different traditional methods rely on the length of the diaphysis of long bones of limbs, epiphyseal-diaphyseal union, fusion of the primary ossification centers as well as dental eruption. However, there is a growing need for the development of precise and reliable methods to estimate age, especially in cases where dismembered corpses, burnt bodies, putrefied or fragmented parts are recovered. Teeth are the hardest and most indestructible structures in the human body. In recent years, assessment of the pulp/tooth area ratio, as an indirect quantification of secondary dentine deposition, has received considerable attention. However, scanty work has been done in Egypt in terms of the applicability of the pulp/tooth ratio for age estimation. Aim of the Work: The present work was designed to assess Cameriere’s method for age estimation from the pulp/tooth ratio of maxillary canines, central and lateral incisors among a sample from the Egyptian population. In addition, it aimed to formulate regression equations to be used as population-based standards for age determination. Material and Methods: The present study was conducted on 270 peri-apical X-rays of maxillary canines, central and lateral incisors (collected from 131 males and 139 females aged between 19 and 52 years). The pulp and tooth areas were measured using the Adobe Photoshop software program and the pulp/tooth area ratio was computed. Linear regression equations were determined separately for canines, central and lateral incisors. Results: A significant correlation was recorded between the pulp/tooth area ratio and the chronological age. The linear regression analysis revealed a coefficient of determination (R² = 0.824 for canine, 0.588 for central incisor and 0.737 for lateral incisor teeth). Three regression equations were derived. Conclusion: In conclusion, the pulp/tooth ratio is a useful technique for estimating age among Egyptians.
Additionally, the regression equation derived from canines gave better results than those derived from the incisors.
Keywords: age determination, canines, central incisors, Egypt, lateral incisors, pulp/tooth ratio
Procedia PDF Downloads 184
18159 A Theoretical Hypothesis on Ferris Wheel Model of University Social Responsibility
Authors: Le Kang
Abstract:
Given the nature of the university as a free and responsible academic community, USR is based on a different foundation, academic responsibility, so the Pyramid and the IC Model of CSR cannot fully explain the most distinguishing feature of USR. This paper seeks to put forward a new model, the Ferris Wheel Model, to illustrate the nature of USR and the process of its achievement. The Ferris Wheel Model of USR shows that the university creates a balanced, fair and neutral systemic structure to bear social responsibilities, which enables the organization to obtain a synergistic effect and to serve the broader interests of stakeholders and wider social responsibilities.
Keywords: USR, achievement model, ferris wheel model, social responsibilities
Procedia PDF Downloads 725