Search results for: stepwise regression
2971 Big Data Analysis with RHadoop
Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim
Abstract:
Storing or analyzing exponentially growing big data with traditional technologies is almost impossible. Hadoop is a new technology that makes this possible. The R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop. With RHadoop, which integrates R and the Hadoop environment, we implemented parallel multiple regression analysis on actual data of different sizes. Experimental results showed that our RHadoop system became much faster as the number of data nodes increased. We also compared the performance of our RHadoop system with the lm function and the biglm package available for bigmemory. The results showed that our RHadoop system was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of the data increases.
Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop
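The reason regression parallelizes well over data nodes can be sketched in a few lines. This is a minimal illustration, not the authors' RHadoop code: it assumes a single predictor for brevity (multiple regression would sum the matrices X'X and X'y in the same way), and the function names and the sample data are hypothetical.

```python
from functools import reduce

def map_chunk(rows):
    """Map step: reduce one chunk of (x, y) pairs to additive sums."""
    n = len(rows)
    sx = sum(x for x, _ in rows)
    sy = sum(y for _, y in rows)
    sxx = sum(x * x for x, _ in rows)
    sxy = sum(x * y for x, y in rows)
    return (n, sx, sy, sxx, sxy)

def combine(a, b):
    """Reduce step: the sums from two chunks add element-wise."""
    return tuple(u + v for u, v in zip(a, b))

def solve(stats):
    """Closed-form least-squares slope and intercept from the totals."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Hypothetical data lying on y = 3x + 1, split across three "nodes".
data = [(float(x), 3.0 * x + 1.0) for x in range(12)]
chunks = [data[0:4], data[4:8], data[8:12]]
slope, intercept = solve(reduce(combine, map(map_chunk, chunks)))
print(slope, intercept)
```

Because each chunk is summarized independently, adding nodes only adds map tasks; the final solve step works on five numbers regardless of the data size.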
Procedia PDF Downloads 437
2970 How Do Crisis Affect Economic Policy?
Authors: Eva Kotlánová
Abstract:
After the recession that began in 2007 in the United States and subsequently spilled over into Europe, a recovery of economic growth could be expected. According to the latest estimates of the economic progress of European countries, this recovery is not strong enough. How the economic indicators develop will depend, among other things, on economic policy. Economic theories postulate that economic subjects prefer a stable, continuous economic policy without repeated and strong fluctuations. Such a policy is perceived as supporting economic growth. Especially in crisis periods, when the government must cope with the consequences of recession, economic policy becomes unpredictable for many subjects and economic policy uncertainty grows, which has a negative influence on economic growth. The aim of this paper is to use panel regression to test this hypothesis on the example of the five largest European economies in the period 2008–2012.
Keywords: economic crises in Europe, economic policy, uncertainty, panel analysis regression
Procedia PDF Downloads 386
2969 Age Estimation from Upper Anterior Teeth by Pulp/Tooth Ratio Using Peri-Apical X-Rays among Egyptians
Authors: Fatma Mohamed Magdy Badr El Dine, Amr Mohamed Abd Allah
Abstract:
Introduction: Age estimation of individuals is one of the crucial steps in forensic practice. Traditional methods rely on the length of the diaphysis of the long bones of the limbs, epiphyseal-diaphyseal union, fusion of the primary ossification centers, and dental eruption. However, there is a growing need for precise and reliable methods to estimate age, especially in cases where dismembered corpses, burnt bodies, or putrefied or fragmented parts are recovered. Teeth are the hardest and most durable structures in the human body. In recent years, assessment of the pulp/tooth area ratio, as an indirect quantification of secondary dentine deposition, has received considerable attention. However, little work has been done in Egypt on the applicability of the pulp/tooth ratio for age estimation. Aim of the Work: The present work was designed to assess Cameriere’s method for age estimation from the pulp/tooth ratio of maxillary canines and central and lateral incisors in a sample from the Egyptian population, and to formulate regression equations to be used as population-based standards for age determination. Material and Methods: The present study was conducted on 270 peri-apical X-rays of maxillary canines and central and lateral incisors (collected from 131 males and 139 females aged between 19 and 52 years). The pulp and tooth areas were measured using the Adobe Photoshop software program, and the pulp/tooth area ratio was computed. Linear regression equations were determined separately for canines, central incisors, and lateral incisors. Results: A significant correlation was recorded between the pulp/tooth area ratio and chronological age. The linear regression analysis yielded coefficients of determination of R² = 0.824 for canines, 0.588 for central incisors, and 0.737 for lateral incisors. Three regression equations were derived. Conclusion: The pulp/tooth ratio is a useful technique for estimating age among Egyptians. Additionally, the regression equation derived from canines gave better results than those derived from the incisors.
Keywords: age determination, canines, central incisors, Egypt, lateral incisors, pulp/tooth ratio
Procedia PDF Downloads 184
2968 Dietary Patterns and Hearing Loss in Older People
Authors: N. E. Gallagher, C. E. Neville, N. Lyner, J. Yarnell, C. C. Patterson, J. E. Gallacher, Y. Ben-Shlomo, A. Fehily, J. V. Woodside
Abstract:
Hearing loss is highly prevalent in older people and can reduce quality of life substantially. Emerging research suggests that potentially modifiable risk factors, including risk factors previously related to cardiovascular disease risk, may be associated with a decreased or increased incidence of hearing loss. This has prompted investigation into the possibility that certain nutrients, foods or dietary patterns may also be associated with incidence of hearing loss. The aim of this study was to determine any associations between dietary patterns and hearing loss in men enrolled in the Caerphilly study. The Caerphilly prospective cohort study began in 1979-1983 with recruitment of 2512 men aged 45-59 years. Dietary data was collected using a self-administered, semi-quantitative, 56-item food frequency questionnaire (FFQ) at baseline (1979-1983), and 7-day weighed food intake (WI) in a 30% sub-sample, while pure-tone unaided audiometric threshold was assessed at 0.5, 1, 2 and 4 kHz, between 1984 and 1988. Principal components analysis (PCA) was carried out to determine a posteriori dietary patterns and multivariate linear and logistic regression models were used to examine associations with hearing level (pure tone average (PTA) of frequencies 0.5, 1, 2 and 4 kHz in decibels (dB)) for linear regression and with hearing loss (PTA>25dB) for logistic regression. Three dietary patterns were determined using PCA on the FFQ data- Traditional, Healthy, High sugar/Alcohol avoider. After adjustment for potential confounding factors, both linear and logistic regression analyses showed a significant and inverse association between the Healthy pattern and hearing loss (P<0.001) and linear regression analysis showed a significant association between the High sugar/Alcohol avoider pattern and hearing loss (P=0.04). Three similar dietary patterns were determined using PCA on the WI data- Traditional, Healthy, High sugar/Alcohol avoider. 
After adjustment for potential confounding factors, logistic regression analyses showed a significant and inverse association between the Healthy pattern and hearing loss (P=0.02) and a significant association between the Traditional pattern and hearing loss (P=0.04). A Healthy dietary pattern was found to be significantly inversely associated with hearing loss in middle-aged men in the Caerphilly study. Furthermore, a High sugar/Alcohol avoider pattern (FFQ) and a Traditional pattern (WI) were associated with poorer hearing levels. Consequently, the role of dietary factors in hearing loss remains to be fully established and warrants further investigation.
Keywords: ageing, diet, dietary patterns, hearing loss
Procedia PDF Downloads 230
2967 6D Posture Estimation of Road Vehicles from Color Images
Authors: Yoshimoto Kurihara, Tad Gonsalves
Abstract:
Currently, in the field of object posture estimation, there is research on estimating the position and angle of an object by storing a 3D model of the object in a computer in advance and matching the observed object against that model. In this research, however, we have succeeded in creating a module that is much simpler, smaller in scale, and faster in operation. Our 6D pose estimation model consists of two different networks: a classification network and a regression network. From a single RGB image, the trained model estimates the class of the object in the image, the coordinates of the object, and its rotation angle in 3D space. In addition, we compared the estimation accuracy for each camera position, i.e., the angle from which the object was captured. The highest accuracy was recorded at a camera position of 75°, where the classification accuracy was about 87.3% and the regression accuracy was about 98.9%.
Keywords: 6D posture estimation, image recognition, deep learning, AlexNet
Procedia PDF Downloads 155
2966 Effective Parameter Selection for Audio-Based Music Mood Classification for Christian Kokborok Song: A Regression-Based Approach
Authors: Sanchali Das, Swapan Debbarma
Abstract:
Music mood classification is developing in both music information retrieval (MIR) and natural language processing (NLP). Some languages used in India, such as Hindi and English, have received considerable attention in MIR, but research on mood classification in regional languages is scarce. In this paper, powerful audio-based features for Kokborok Christian songs are identified and a mood classification task is performed. Kokborok is an Indo-Burman language spoken mainly in the northeastern part of India and in some other countries such as Bangladesh and Myanmar. For the audio-based classification task, useful audio features are extracted with the jMIR software. There are standard audio parameters for audio-based tasks, but every language has its own unique characteristics. Therefore, the most significant features that best fit the Kokborok song database are analysed here. A regression-based model is used to find the independent parameters that act as predictors, to predict the dependencies among parameters, and to show how they affect the overall classification result. WEKA 3.5 is used for classification, and the selected parameters are used to create a classification model. A second model is developed using all the standard audio features used by most researchers. In this experiment, the essential parameters responsible for effective audio-based mood classification, and the parameters that do not change significantly across the Christian Kokborok songs, are analysed, and the two models are compared.
Keywords: Christian Kokborok song, mood classification, music information retrieval, regression
Procedia PDF Downloads 221
2965 A Stochastic Model to Predict Earthquake Ground Motion Duration Recorded in Soft Soils Based on Nonlinear Regression
Authors: Issam Aouari, Abdelmalek Abdelhamid
Abstract:
For seismologists, the characterization of seismic demand should include the amplitude and duration of strong shaking in the system. The duration of ground shaking is one of the key parameters in the earthquake-resistant design of structures. This paper proposes a nonlinear statistical model to estimate earthquake ground motion duration in soft soils using multiple seismicity indicators. Three definitions of ground motion duration proposed in the literature have been applied. Through a comparative study, we select the most significant definition to use for predicting the duration. A stochastic model is presented for the McCann and Shah method using nonlinear regression analysis based on a data set of moment magnitude, source-to-site distance, and site conditions. The data set is taken from the PEER strong motion databank and contains shallow earthquakes from different regions of the world: America, Turkey, London, China, Italy, Chile, Mexico, etc. The main emphasis is placed on the soft site condition. The predictive relationship has been developed based on 600 records and three input indicators. Results have been compared with other published models. It has been found that the proposed model can predict earthquake ground motion duration in soft soils for different regions and site conditions.
Keywords: duration, earthquake, prediction, regression, soft soil
Procedia PDF Downloads 153
2964 Quantile Smoothing Splines: Application on Productivity of Enterprises
Authors: Semra Turkan
Abstract:
In this paper, we have examined the factors that affect the productivity of Turkey’s Top 500 Industrial Enterprises in 2014. The labor productivity of enterprises is taken as an indicator of the productivity of industrial enterprises. When the relationships between some financial ratios and labor productivity are examined, it is seen that there is a nonparametric relationship between labor productivity and return on sales. In addition, the distribution of the labor productivity of enterprises is right-skewed. If the distribution of the dependent variable is skewed, quantile regression is more suitable for the data. Hence, the nonparametric relationship between labor productivity and return on sales is estimated by quantile smoothing splines.
Keywords: quantile regression, smoothing spline, labor productivity, financial ratios
Procedia PDF Downloads 302
2963 Factors for Entry Timing Choices Using Principal Axis Factorial Analysis and Logistic Regression Model
Authors: C. M. Mat Isa, H. Mohd Saman, S. R. Mohd Nasir, A. Jaapar
Abstract:
International market expansion involves a strategic process of market entry decisions through which a firm expands its operations from the domestic to the international domain. Hence, entry timing choices require balancing the risks of early entry against the problem of losing opportunities as a result of late entry into a new market. Questionnaire surveys administered to 115 Malaysian construction firms operating in 51 countries worldwide resulted in a 39.1 percent response rate. Factor analysis was used to determine the most significant factors affecting the firms' entry timing choices when penetrating the international market. A logistic regression analysis, used to examine the firms’ entry timing choices, indicates that the model correctly classified 89.5 percent of cases as late movers. The findings reveal that the most significant factor influencing the construction firms’ choices as late movers was the firm factor, related to the firm’s international experience, resources, competencies and financing capacity. The study also offers valuable information to construction firms that intend to internationalize their businesses.
Keywords: factors, early movers, entry timing choices, late movers, logistic regression model, principal axis factorial analysis, Malaysian construction firms
Procedia PDF Downloads 376
2962 Choosing between the Regression Correlation, the Rank Correlation, and the Correlation Curve
Authors: Roger L. Goodwin
Abstract:
This paper presents a rank correlation curve. The traditional correlation coefficient is valid for both continuous variables and for integer variables using rank statistics. Since the correlation coefficient has already been established in rank statistics by Spearman, such a calculation can be extended to the correlation curve. This paper presents two survey questions. The survey collected non-continuous variables. We will show weak to moderate correlation. Obviously, one question has a negative effect on the other. A review of the qualitative literature can answer which question and why. The rank correlation curve shows which collection of responses has a positive slope and which collection of responses has a negative slope. Such information is unavailable from the flat, "first-glance" correlation statistics.
Keywords: Bayesian estimation, regression model, rank statistics, correlation, correlation curve
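The building block named in this abstract, Spearman's rank correlation, can be sketched as the ordinary correlation coefficient applied to ranks. The snippet below is a minimal illustration only: it does not reproduce the paper's correlation curve, the function names are hypothetical, and the two response lists are made-up ordinal survey answers.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Product-moment correlation; undefined if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def spearman(a, b):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(a), average_ranks(b))

# Hypothetical responses to two survey questions on a 1-5 scale.
q1 = [1, 2, 2, 3, 4, 5, 5]
q2 = [5, 4, 4, 3, 2, 1, 1]
rho = spearman(q1, q2)
print(rho)
```

Computing the same statistic separately on subsets of the responses is one simple way to see where the association changes sign, which is the kind of information a single overall coefficient hides.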
Procedia PDF Downloads 473
2961 Predictors of School Drop out among High School Students
Authors: Osman Zorbaz, Selen Demirtas-Zorbaz, Ozlem Ulas
Abstract:
Several factors cause adolescents to drop out of school. One framework for school dropout focuses on the contextual factors around adolescents, whereas another focuses on individual factors. Both kinds of factors can be said to be equally important. In this study, both the adolescents’ individual factors (anti-social behaviors, academic success) and contextual factors (parent academic involvement, parent academic support, number of siblings, living with parents) were examined in terms of school dropout. The study sample consisted of 346 high school students in public schools in Ankara who continued their education in the 2015-2016 academic year. One hundred eighty-five of the students (53.5%) were girls and 161 (46.5%) were boys. In addition, 118 of them were in ninth grade, 122 in tenth grade, and 106 in eleventh grade. Multiple regression and one-way ANOVA were used. First, it was examined whether the data met the assumptions and conditions required for regression analysis. After the assumptions were checked, regression analysis was conducted. Parent academic involvement, parent academic support, number of siblings, anti-social behaviors, and academic success were entered into the regression model, and it was seen that parent academic involvement (t=-3.023, p < .01), anti-social behaviors (t=7.038, p < .001), and academic success (t=-3.718, p < .001) predicted school dropout, whereas parent academic support (t=-1.403, p > .05) and number of siblings (t=-1.908, p > .05) did not. The model explained 30% of the variance (R=.557, R2=.300, F5,345=30.626, p < .001). In addition, the analysis of variance showed no significant difference in high school students' school dropout levels according to whether or not they lived with their parents (F2;345=1.183, p > .05). The results were discussed in light of the literature and suggestions were made. As a result, academic involvement, academic success and anti-social behaviors should be considered important factors in preventing school dropout.
Keywords: adolescents, anti-social behavior, parent academic involvement, parent academic support, school dropout
Procedia PDF Downloads 284
2960 Separating Landform from Noise in High-Resolution Digital Elevation Models through Scale-Adaptive Window-Based Regression
Authors: Anne M. Denton, Rahul Gomes, David W. Franzen
Abstract:
High-resolution elevation data are becoming increasingly available, but typical approaches for computing topographic features, like slope and curvature, still assume small sliding windows, for example, of size 3x3. That means that the digital elevation model (DEM) has to be resampled to the scale of the landform features that are of interest. Any higher resolution is lost in this resampling. When the topographic features are computed through regression that is performed at the resolution of the original data, the accuracy can be much higher, and the reported result can be adjusted to the length scale that is relevant locally. Slope and variance are calculated for overlapping windows, meaning that one regression result is computed per raster point. The number of window centers per area is the same for the output as for the original DEM. Slope and variance are computed by performing regression on the points in the surrounding window. Such an approach is computationally feasible because of the additive nature of regression parameters and variance. Any doubling of window size in each direction only takes a single pass over the data, corresponding to a logarithmic scaling of the resulting algorithm as a function of the window size. Slope and variance are stored for each aggregation step, allowing the reported slope to be selected to minimize variance. The approach thereby adjusts the effective window size to the landform features that are characteristic to the area within the DEM. Starting with a window size of 2x2, each iteration aggregates 2x2 non-overlapping windows from the previous iteration. Regression results are stored for each iteration, and the slope at minimal variance is reported in the final result. As such, the reported slope is adjusted to the length scale that is characteristic of the landform locally. The length scale itself and the variance at that length scale are also visualized to aid in interpreting the results for slope. 
The relevant length scale is taken to be half of the window size of the window over which the minimum variance was achieved. The resulting process was evaluated for 1-meter DEM data and for artificial data that was constructed to have defined length scales and added noise. A comparison with ESRI ArcMap was performed and showed the potential of the proposed algorithm. The resolution of the resulting output is much higher and the slope and aspect much less affected by noise. Additionally, the algorithm adjusts to the scale of interest within the region of the image. These benefits are gained without additional computational cost in comparison with resampling the DEM and computing the slope over 3x3 images in ESRI ArcMap for each resolution. In summary, the proposed approach extracts slope and aspect of DEMs at the length scales that are characteristic locally. The result is of higher resolution and less affected by noise than existing techniques.
Keywords: high resolution digital elevation models, multi-scale analysis, slope calculation, window-based regression
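The additive aggregation described in this abstract can be sketched on a one-dimensional elevation profile. This is a simplified illustration under stated assumptions, not the authors' algorithm: it uses non-overlapping windows along a single axis, starts at a window size of 4 so that the residual variance is not trivially zero, and assumes the profile length is the starting size times a power of two. All function names are hypothetical.

```python
def window_sums(start, zs):
    """Additive sums for one window: n, sum x, sum z, sum x^2, sum xz, sum z^2."""
    xs = range(start, start + len(zs))
    return (len(zs), sum(xs), sum(zs),
            sum(x * x for x in xs),
            sum(x * z for x, z in zip(xs, zs)),
            sum(z * z for z in zs))

def merge(a, b):
    """Sums of two adjacent windows add element-wise."""
    return tuple(u + v for u, v in zip(a, b))

def slope_and_variance(s):
    """Least-squares slope and residual variance from the sums alone."""
    n, sx, sz, sxx, sxz, szz = s
    cxx = sxx - sx * sx / n
    cxz = sxz - sx * sz / n
    czz = szz - sz * sz / n
    slope = cxz / cxx
    variance = max(czz - slope * cxz, 0.0) / n
    return slope, variance

def multiscale(zs, first=4):
    """Map each window size to its list of (slope, variance) results."""
    level = [window_sums(i, zs[i:i + first]) for i in range(0, len(zs), first)]
    size, results = first, {}
    while True:
        results[size] = [slope_and_variance(s) for s in level]
        if len(level) < 2:
            return results
        # One pass over the previous level doubles the window size.
        level = [merge(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        size *= 2

def best_slope(results, position):
    """Pick the window size with minimal variance at a given position."""
    size = min(results, key=lambda w: results[w][position // w][1])
    return size, results[size][position // size][0]

# Hypothetical noise-free profile with a constant slope of 2.
profile = [2.0 * x for x in range(16)]
results = multiscale(profile)
size, slope = best_slope(results, 5)
print(size, slope)
```

Each doubling touches only the stored sums of the previous level, never the raw elevations again, which is why the number of passes grows with the logarithm of the window size.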
Procedia PDF Downloads 129
2959 Effect of Transit-Oriented Development on Air Quality in Neighborhoods of Delhi
Authors: Smriti Bhatnagar
Abstract:
This study aims to find whether the Transit-oriented planning and development approach benefits the quality of air in neighborhoods of New Delhi. Two methodologies, namely land use regression analysis and Transit-oriented development index analysis, are used to explore this relationship. The land use regression analysis makes use of urban form characteristics obtained for 33 neighborhoods in Delhi. These comprise road lengths, land use areas, population and household densities, number of amenities and distance between amenities. Regressions are run to establish the relationship between urban form variables and air quality parameters (dependent variables). For the Transit-oriented development index analysis, the Transit-oriented development index is developed as a composite index comprising 29 urban form indicators. This index is developed by assigning weights to each of the 29 urban form data points. Regressions are run to establish the relationship between the Transit-oriented development index and air quality parameters. The thesis finds that elements of Transit-oriented development, if incorporated in the planning approach, have a positive effect on air quality. Roads suited for non-motorized transport and well-connected civic amenities in neighbourhoods, for instance, have a directly proportional relationship with air quality. The Transit-oriented development index, however, is not found to have a consistent relationship with air quality parameters. The reason for this, however, could lie in the way that the index has been constructed.
Keywords: air quality, land use regression, mixed-use planning, transit-oriented development index, New Delhi
Procedia PDF Downloads 270
2958 Effect of Liquid Additive on Dry Grinding for Desired Surface Structure of CaO Catalyst
Authors: Wiyanti Fransisca Simanullang, Shinya Yamanaka
Abstract:
A grinding method was used to control the active site and to improve the specific surface area (SSA) of calcium oxide (CaO) derived from scallop shell as a sustainable resource. The dry grinding of CaO with acetone and tertiary butanol as liquid additives was carried out using a laboratory-scale planetary ball mill. The experiments were operated by stepwise addition with time variations to determine the grinding limit. The active site of CaO was measured by X-ray diffraction and FT-IR. The variation of the SSA of the products with grinding time was measured by the BET method. The morphology of CaO was observed by SEM. The use of a liquid additive was effective for increasing the SSA and controlling the active site of CaO. The SSA of CaO increased in proportion to the amount of liquid additive and the grinding time. The performance of CaO as a solid base catalyst for biodiesel production was tested in the transesterification reaction of used cooking oil to produce fatty acid methyl ester (FAME).
Keywords: active site, calcium oxide, grinding, specific surface area
Procedia PDF Downloads 287
2957 Analysis of Spatial Heterogeneity of Residential Prices in Guangzhou: An Actual Study Based on Point of Interest Geographically Weighted Regression Model
Authors: Zichun Guo
Abstract:
Guangzhou's house prices have long been lower than those of the other three major cities. As house prices in Guangzhou have gradually increased, the factors influencing them have attracted growing attention. This paper uses house price data and POI (Point of Interest) data to explore the distribution of house prices and their influencing factors by applying the Kriging spatial interpolation method and a geographically weighted regression model in ArcGIS. The results show that the interpolated house prices have a significant relationship with the economic development and development potential of the region, and that different POI types have different impacts on the growth of house prices in different regions.
Keywords: POI, house price, spatial heterogeneity, Guangzhou
Procedia PDF Downloads 55
2956 The Impact of Public Open Space System on Housing Price in Chicago
Authors: Si Chen, Le Zhang, Xian He
Abstract:
The research explored the influence of the public open space system on housing prices through hedonic models, in order to support better open space plans and economic policies. We have three initial hypotheses: 1) The public open space system has an overall positive influence on surrounding housing prices. 2) Different public open space types have different levels of influence on surrounding housing prices. 3) Walking and driving accessibility from a property to public open spaces have different statistical relations with housing prices. Cook County, Illinois, was chosen as the study area because of its data availability, sufficient open space types, and long-term open space preservation strategies. We considered housing attributes, driving and walking accessibility scores from houses to nearby public open spaces, and driving accessibility scores to hospitals as influential features, and used real housing sale prices in 2010 as the dependent variable in the hedonic model. Through ordinary least squares (OLS) regression analysis, General Moran’s I analysis and geographically weighted regression analysis, we observed the statistical relations between public open spaces and housing sale prices in the three hedonic models and confirmed all three hypotheses.
Keywords: hedonic model, public open space, housing sale price, regression analysis, accessibility score
Procedia PDF Downloads 133
2955 Applicability of Cameriere’s Age Estimation Method in a Sample of Turkish Adults
Authors: Hatice Boyacioglu, Nursel Akkaya, Humeyra Ozge Yilanci, Hilmi Kansu, Nihal Avcu
Abstract:
The strong relationship between the reduction in the size of the pulp cavity and increasing age has been reported in the literature. This relationship can be utilized to estimate the age of an individual by measuring the pulp cavity size on dental radiographs as a non-destructive method. The purpose of this study is to develop a population-specific regression model for age estimation in a sample of Turkish adults by applying Cameriere’s method to panoramic radiographs. The sample consisted of 100 panoramic radiographs of Turkish patients (40 men, 60 women) aged between 20 and 70 years. Pulp and tooth area ratios (AR) of the maxillary canines were measured by two maxillofacial radiologists, and the results were then subjected to regression analysis. There were no statistically significant intra-observer or inter-observer differences. The correlation coefficient between age and the AR of the maxillary canines was -0.71, and the following regression equation was derived: Estimated Age = 77.365 − (351.193 × AR). The mean prediction error was 4 years, which is within acceptable error limits for age estimation. This shows that the pulp/tooth area ratio is a useful variable for assessing age with reasonable accuracy. Based on the results of this research, it was concluded that Cameriere’s method is suitable for dental age estimation and that it can be used for forensic procedures in Turkish adults.
Keywords: age estimation by teeth, forensic dentistry, panoramic radiograph, Cameriere's method
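To make the reported regression equation concrete, the following minimal sketch evaluates it for a few area ratios. It assumes that the commas in the published coefficients are decimal separators (77.365 and 351.193); the function name and the three AR values are hypothetical and chosen only for illustration.

```python
def estimate_age(area_ratio):
    """Estimated age in years from the pulp/tooth area ratio (AR).

    Coefficients are taken from the abstract, reading the commas
    as decimal separators (an assumption).
    """
    return 77.365 - 351.193 * area_ratio

# Hypothetical AR values; a larger pulp area relative to the tooth
# corresponds to a younger estimated age.
for ar in (0.05, 0.10, 0.15):
    print(f"AR = {ar:.2f} -> estimated age = {estimate_age(ar):.1f} years")
```

The negative coefficient matches the reported correlation of -0.71: as secondary dentine is deposited with age, the pulp cavity shrinks and AR falls.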
Procedia PDF Downloads 449
2954 Relations between Psychological Adjustment and Perceived Parental, Teacher and Best Friend Acceptance among Bangladeshi Adolescents
Authors: Tariqul Islam, Shaheen Mollah
Abstract:
The study's main objective is to assess the relationship between psychological adjustment and parental, teacher, and best friend acceptance-rejection among secondary school students. This study was conducted on a sample of 300 students (6th through 10th grade) recruited from over ten schools in Dhaka. While the schools were selected purposively, the respondents within each school were selected by convenience sampling. The collected data were analyzed using Pearson product-moment correlation, hierarchical regression, and simultaneous regression analysis. The results showed that psychological adjustment is positively correlated with paternal, maternal, teacher, and best friend acceptance. Paternal acceptance was significantly correlated with maternal acceptance. Teacher and best friend acceptance were substantially correlated with paternal and maternal acceptance. The hierarchical multiple regressions indicated that maternal, paternal, teacher, and best friend acceptance-rejection contributed significantly to students' psychological adjustment. The results revealed substantial independent contributions of maternal, paternal, teacher, and best friend acceptance to the students' psychological adjustment. The simultaneous regression analysis indicated that maternal and best friend acceptance (but not paternal acceptance) were significant predictors of psychological adjustment. It showed that 41.7% of the variability in psychological adjustment could be explained by paternal, maternal, and best friend acceptance. The findings of the present study are exciting. They may help parents and best friends develop insight into behaving appropriately toward their offspring and friends, respectively, for better psychological adjustment.
Keywords: adjustment, parenting, rejection, acceptance
Procedia PDF Downloads 145
2953 Effect of a Stepwise Discontinuity on a 65 Degree Delta Wing
Authors: Nishit L. Sanil, Raza M. Khan
Abstract:
Increasing lift effectively at higher angles of attack has always been a daunting challenge in aviation, especially on a delta wing. Delta wings are used on military jet fighter planes and have some undesirable characteristics, notably flow separation at high angles of attack and high drag at low speeds. To address this problem, a design modification is modeled on a delta wing to increase the lift and thereby improve maneuverability. To increase the lift of a 65 degree delta wing at higher angles of attack, a step-wise discontinuity is created on the upper surface of the wing. An unmodified delta wing is validated for comparison, which gives a measure of how flow separation and the coefficient of lift are affected by the modification. The results show a significant increase in lift at higher angles of attack, thereby delaying stall. Hence, the benefits of the modification could aid future aircraft designs.
Keywords: coefficient of lift, delta wing, flow separation, step-wise discontinuity
Procedia PDF Downloads 310
2952 Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison
Authors: Xiangtuo Chen, Paul-Henry Cournéde
Abstract:
Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find an efficient way to predict corn yield from meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to their modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with the environment as a dynamical system. Calibrating such a system is difficult, however, because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of dataset (climatic data and final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. The data-driven approach to yield prediction, on the other hand, is free of the complex biophysical process, but it places strict requirements on the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods: methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression, and Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network, and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-fold cross-validation process and two accuracy metrics, root mean square error of prediction (RMSEP) and mean absolute error of prediction (MAEP), were used to evaluate the prediction capacity.
The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method for calibrating the mechanistic model from easily accessible datasets offers several side perspectives. The mechanistic model can potentially help to highlight the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.
Keywords: crop yield prediction, crop model, sensitivity analysis, parameter estimation, particle swarm optimization, random forest
Procedia PDF Downloads 231
2951 Free Fatty Acid Assessment of Crude Palm Oil Using a Non-Destructive Approach
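The evaluation protocol described above (k-fold cross-validation scored with RMSEP and MAEP) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the folds are contiguous rather than shuffled, the predictor is a placeholder that returns the training-set mean, and the errors are in absolute units (the abstract reports MAEP as a percentage, whose normalization it does not specify).

```python
import math

def k_fold_splits(n, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def maep(y_true, y_pred):
    """Mean absolute error of prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_validate(y, k=5):
    """Pool out-of-fold predictions, then score them.

    Placeholder model (an assumption for illustration): predict the
    mean of the training fold for every held-out record.
    """
    y_true_all, y_pred_all = [], []
    for train, test in k_fold_splits(len(y), k):
        train_mean = sum(y[i] for i in train) / len(train)
        y_true_all += [y[i] for i in test]
        y_pred_all += [train_mean] * len(test)
    return rmsep(y_true_all, y_pred_all), maep(y_true_all, y_pred_all)
```

With 720 records and k = 5, each held-out fold contains 144 records. Any of the regression methods listed in the abstract could replace the placeholder predictor inside `cross_validate`.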
Authors: Siti Nurhidayah Naqiah Abdull Rani, Herlina Abdul Rahim, Rashidah Ghazali, Noramli Abdul Razak
Abstract:
Near infrared (NIR) spectroscopy has always been of great interest in the food and agriculture industries. The development of prediction models has facilitated the estimation process in recent years. In this study, 110 crude palm oil (CPO) samples were used to build a free fatty acid (FFA) prediction model. 60% of the collected data were used for training purposes and the remaining 40% used for testing. The visible peaks on the NIR spectrum were at 1725 nm and 1760 nm, indicating the existence of the first overtone of C-H bands. Principal component regression (PCR) was applied to the data in order to build this mathematical prediction model. The optimal number of principal components was 10. The results showed R² = 0.7147 for the training set and R² = 0.6404 for the testing set.
Keywords: palm oil, fatty acid, NIRS, regression
Procedia PDF Downloads 506
2950 Estimation of Foliar Nitrogen in Selected Vegetation Communities of Uttrakhand Himalayas Using Hyperspectral Satellite Remote Sensing
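The 60/40 split and the R² figures reported above can be illustrated with a short sketch. It is not the authors' pipeline: the helper names are hypothetical, the seeded random shuffle is an assumption (the abstract does not say how samples were assigned), and the PCR model itself is omitted.

```python
import random

def train_test_split(n_samples, train_fraction=0.6, seed=0):
    """Return (train_indices, test_indices) after a seeded shuffle.

    The shuffle and the seed are assumptions made for illustration.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

For the 110 samples in the abstract, this split yields 66 training and 44 testing samples. `r_squared` would then be evaluated separately on each subset, using the predictions of whichever model is fitted on the training subset.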
Authors: Yogita Mishra, Arijit Roy, Dhruval Bhavsar
Abstract:
The study estimates the nitrogen concentration in a selected vegetation community, chir pine (Pinus roxburghii), using hyperspectral satellite data, and identifies the appropriate spectral bands and nitrogen indices. The Short Wave InfraRed reflectance spectrum at 1790 nm and 1680 nm shows the maximum absorption by nitrogen in the selected species. Among the nitrogen indices, the log normalized nitrogen index showed both positive and negative relationships. Using NDNI, a strong positive correlation with leaf nitrogen concentration and leaf nitrogen mass was obtained for Pinus roxburghii from the 1510 nm and 760 nm bands. The R² of the linear regression reached a maximum of 0.7525 for the satellite image data and 0.547 for the ground truth data for Pinus roxburghii.
Keywords: hyperspectral, NDNI, nitrogen concentration, regression value
Procedia PDF Downloads 295
2949 A Multinomial Logistic Regression Analysis of Factors Influencing Couples' Fertility Preferences in Kenya
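For reference, the normalized difference nitrogen index (NDNI) is commonly defined from reflectance at 1510 nm and 1680 nm. The form below is the widely cited definition and is given here as an assumption: the abstract does not state the exact formula or band pair the authors used. Here $R_{\lambda}$ denotes reflectance at wavelength $\lambda$ in nm.

```latex
\[
\mathrm{NDNI} =
\frac{\log\!\left(1/R_{1510}\right) - \log\!\left(1/R_{1680}\right)}
     {\log\!\left(1/R_{1510}\right) + \log\!\left(1/R_{1680}\right)}
\]
```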
Authors: Naomi W. Maina
Abstract:
Fertility preference is a subject of great significance in developing countries. Studies show that fertility preferences are significant determinants of a society's fertility levels, because future fertility behavior is likely to be shaped by currently observed fertility inclinations. The objective of this study was to establish the factors associated with fertility preference among couples in Kenya by fitting a multinomial logistic regression model to data on 5,265 couples obtained from the Kenya Demographic and Health Survey 2014. The results revealed that type of place of residence, region of residence, age, and spousal age gap significantly influence the desire for additional children among couples in Kenya. Couples living in rural settlements were notably more likely to share a similar fertility preference than those living in urban settlements. Moreover, geographical disparities were evident: couples in regions such as northern Kenya differed significantly from those in Nairobi in their desire to have additional children. The odds of a couple's desire for additional children were further observed to vary with the wife's or husband's age and, to a large extent, with the spousal age gap. The study showed that as the spousal age gap increases, the desire for more children among couples decreases. Insights derived from this study would be of interest to demographers, health practitioners, policymakers, and non-governmental organizations implementing fertility-related interventions in Kenya, among other stakeholders. Moreover, with the adoption of devolution, there is a clear need for population policies that are county-specific, as opposed to the national population policy that is the current practice in Kenya.
Additionally, researchers or students who have little understanding of multinomial logistic regression, in terms of theory, practical analysis in SPSS, and application to real datasets, will find this article useful.
Keywords: couples' desire, fertility, fertility preference, multinomial regression analysis
Procedia PDF Downloads 181
2948 Estimation of a Finite Population Mean under Random Non Response Using Improved Nadaraya and Watson Kernel Weights
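For readers new to the technique, the way a fitted multinomial logistic model turns covariates into category probabilities can be sketched as follows. This is a generic illustration, not the study's model: the function name, the coefficient layout (intercept first), and any numbers passed in are hypothetical, and the abstract does not list the outcome categories or the fitted coefficients.

```python
import math

def category_probabilities(x, coefs):
    """Probabilities for a reference category plus len(coefs) other categories.

    x     : list of covariate values for one couple (hypothetical inputs)
    coefs : one list per non-reference category, [intercept, b1, b2, ...]
    The reference category has a linear score fixed at 0.
    """
    scores = [0.0] + [
        b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)) for b in coefs
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In this parameterization, the exponential of a coefficient is the multiplicative change in the odds of that category relative to the reference category for a one-unit increase in the covariate, which is how effects such as the spousal age gap are usually reported.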
Authors: Nelson Bii, Christopher Ouma, John Odhiambo
Abstract:
Non-response is a potential source of errors in sample surveys. It introduces bias and large variance in the estimation of finite population parameters. Regression models have been recognized as one technique for reducing the bias and variance due to random non-response by using auxiliary data. In this study, it is assumed that random non-response occurs in the survey variable in the second stage of cluster sampling and that full auxiliary information is available throughout. Auxiliary information is used at the estimation stage via a regression model to address the problem of random non-response. In particular, the auxiliary information is used via an improved Nadaraya-Watson kernel regression technique to compensate for random non-response. The asymptotic bias and mean squared error of the proposed estimator are derived. In addition, a simulation study indicates that the proposed estimator has smaller bias and smaller mean squared error than existing estimators of the finite population mean. The proposed estimator is also shown to have tighter confidence interval lengths at a 95% coverage rate. The results obtained in this study are useful, for instance, in choosing efficient estimators of the finite population mean in demographic sample surveys.
Keywords: mean squared error, random non-response, two-stage cluster sampling, confidence interval lengths
Procedia PDF Downloads 137
2947 Logistic Regression Based Model for Predicting Students’ Academic Performance in Higher Institutions
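The basic idea of compensating for non-response with kernel regression on auxiliary data can be sketched as follows. This uses the standard Nadaraya-Watson estimator with a Gaussian kernel, not the improved weights proposed in the paper; the function names, the kernel choice, and the use of `None` to mark non-respondents are assumptions made for illustration.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys at the point x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

def impute_nonresponse(xs, ys, bandwidth):
    """Fill missing survey values (None) from respondents' auxiliary data.

    xs : auxiliary variable, known for every sampled unit
    ys : survey variable, None where the unit did not respond
    """
    resp_x = [x for x, y in zip(xs, ys) if y is not None]
    resp_y = [y for y in ys if y is not None]
    return [
        y if y is not None else nadaraya_watson(x, resp_x, resp_y, bandwidth)
        for x, y in zip(xs, ys)
    ]

def estimated_mean(xs, ys, bandwidth):
    """Mean of the completed sample (observed plus imputed values)."""
    completed = impute_nonresponse(xs, ys, bandwidth)
    return sum(completed) / len(completed)
```

The bandwidth controls how far the estimator borrows information from neighboring respondents. This sketch ignores the two-stage cluster design and the sampling weights, both of which a survey estimator would need to account for.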
Authors: Emmanuel Osaze Oshoiribhor, Adetokunbo MacGregor John-Otumu
Abstract:
In recent years, there has been a desire to forecast student academic achievement prior to graduation, in order to help students improve their grades, particularly those with poor performance. The goal of this study is to employ supervised learning techniques to construct a predictive model for student academic achievement. Many academics have already constructed models that predict student academic achievement based on factors such as smoking, demography, culture, social media, parent educational background, parent finances, and family background, to name a few. These features and the models employed may not have correctly classified the students in terms of their academic performance. The present model is built using a logistic regression classifier with basic features such as the previous semester's course score, class attendance, class participation, and the total number of course materials or resources the student is able to cover per semester, in order to predict whether the student will perform well in related future courses. The model outperformed other classifiers such as Naive Bayes, Support Vector Machine (SVM), Decision Tree, Random Forest, and AdaBoost, returning 96.7% accuracy. The model is available as a desktop application, allowing both instructors and students to benefit from user-friendly interfaces for predicting student academic achievement. As a result, it is recommended that both students and professors use this tool to better forecast outcomes.
Keywords: artificial intelligence, ML, logistic regression, performance, prediction
Procedia PDF Downloads 97
2946 Electrical Load Estimation Using Estimated Fuzzy Linear Parameters
Authors: Bader Alkandari, Jamal Y. Madouh, Ahmad M. Alkandari, Anwar A. Alnaqi
Abstract:
A new formulation of the fuzzy linear estimation problem is presented. It is formulated as a linear programming problem. The objective is to minimize the spread of the data points, taking into consideration the type of membership function of the fuzzy parameters, so as to satisfy the constraints on each measurement point and to ensure that the original membership is included in the estimated membership. Different models are developed for a fuzzy triangular membership. The proposed models are applied to different examples from the area of fuzzy linear regression and finally to different examples of estimating the electrical load on a busbar. It was found that the proposed technique is well suited to electrical load estimation, since the nature of the load is characterized by uncertainty and vagueness.
Keywords: fuzzy regression, load estimation, fuzzy linear parameters, electrical load estimation
Procedia PDF Downloads 540
2945 Stature and Gender Estimation Using Foot Measurements in South Indian Population
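As background, the classical linear-programming form of fuzzy linear regression with symmetric triangular coefficients (usually attributed to Tanaka) is shown below. It is given only as a reference point and is not the new formulation proposed in the abstract; all symbols are assumptions introduced here. Each coefficient has a center $p_j$ and a spread $c_j \ge 0$, $x_{ij}$ and $y_i$ are the crisp inputs and output of measurement $i$, and $h \in [0, 1)$ is the chosen membership level.

```latex
\[
\begin{aligned}
\min_{p,\,c}\quad & \sum_{i=1}^{m} \sum_{j=0}^{n} c_j \,\lvert x_{ij} \rvert \\
\text{subject to}\quad
& \sum_{j=0}^{n} p_j x_{ij} + (1-h) \sum_{j=0}^{n} c_j \lvert x_{ij} \rvert \;\ge\; y_i,
  \qquad i = 1, \dots, m, \\
& \sum_{j=0}^{n} p_j x_{ij} - (1-h) \sum_{j=0}^{n} c_j \lvert x_{ij} \rvert \;\le\; y_i,
  \qquad i = 1, \dots, m, \\
& c_j \ge 0, \qquad j = 0, \dots, n.
\end{aligned}
\]
```

The objective minimizes the total spread of the estimated outputs, and the two constraint families require every observed $y_i$ to lie inside the estimated fuzzy output at level $h$, which corresponds to the inclusion requirement described in the abstract.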
Authors: Jagadish Rao Padubidri, Mehak Bhandary, Sowmya J. Rao
Abstract:
Introduction: The significance of the human foot and its measurements in identifying an individual has been demonstrated repeatedly by studies in different geographical areas, and its association with the stature and gender of the individual has been supported by many studies. In our study, we used different foot measurements, including length, width, malleol height, and navicular height, to establish their association with stature and gender and to assess their accuracy. The purpose of this study is to show the relation of foot measurements with stature and gender, and to derive multiple and logistic regression equations for stature and gender estimation in a South Indian population. Materials and Methods: The subjects were 200 South Indian students, 100 females and 100 males, aged 18 to 24 years. The data included stature, foot length, foot breadth, malleol height, and navicular height of both the right and left foot. Descriptive statistics, t-tests, and Pearson correlation coefficients were computed for stature, gender, and foot measurements. Stature was estimated from right and left foot measurements for both the male and female South Indian population using multiple regression analysis, and gender was estimated using logistic regression analysis. Results: The means and standard deviations of stature and of the right and left foot measurements, and the t-test values, were higher in males than in females. LFL (left foot length) is greater than RFL (right foot length) in the male group, but in the female group the lengths of both feet are almost equal [RFL=226.6, LFL=227.1]. There is little difference in the means of RFW (right foot width) and LFW (left foot width) in either gender. Significant differences were seen in the mean values of malleol and navicular height between the right and left feet in males. No such difference was seen in female subjects.
Conclusions: The study has demonstrated the correlation of foot length with stature in all three study groups for both the right and left foot. The next most useful parameters for estimating stature among the male and female groups are foot width and malleol height. Navicular height of both the right and left foot showed a poor relationship with stature in both male and female groups. Multiple regression equations for estimating stature from right and left foot measurements were derived, with standard errors ranging from 11-12 cm in males and 10-11 cm in females. The SEE was 5.8 when the male and female groups were pooled together. The logistic regression model derived to determine gender showed 85% accuracy using right foot measurements and 92.5% accuracy using left foot measurements. We believe that stature and gender can be estimated from foot measurements in the South Indian population.
Keywords: foot length, gender, stature, South Indian
Procedia PDF Downloads 335
2944 Uncovering the Relationship between EFL Students' Self-Concept and Their Willingness to Communicate in Language Classes
Authors: Seyedeh Khadijeh Amirian, Seyed Mohammad Reza Amirian, Narges Hekmati
Abstract:
The current study aims to examine the relationship between English as a foreign language (EFL) students' self-concept and their willingness to communicate (WTC) in EFL classes. To this end, two questionnaires, namely 'Willingness to Communicate' (MacIntyre et al., 2001) and 'Self-Concept Scale' (Liu and Wang, 2005), were distributed among 174 (45 males and 129 females) Iranian EFL university students. Correlation and regression analyses were conducted to examine the relationship between the two variables. The results indicated that there was a significantly positive correlation between EFL students' self-concept and their WTC in EFL classes (p < 0.05). Moreover, regression analyses indicated that self-concept has a significantly positive influence on students’ WTC in language classes (B = .302, p < 0.05) and explains .302 percent of the variance in the dependent variable (WTC). The results are discussed with regard to individual differences in educational contexts, and implications are offered.
Keywords: EFL students, language classes, willingness to communicate, self-concept
Procedia PDF Downloads 126
2943 The Influence of Interest, Beliefs, and Identity with Mathematics on Achievement
Authors: Asma Alzahrani, Elizabeth Stojanovski
Abstract:
This study investigated factors that influence mathematics achievement based on a sample of ninth-grade students (N = 21,444) from the High School Longitudinal Study of 2009 (HSLS09). Key aspects studied included efficacy in mathematics, interest in and enjoyment of mathematics, identity with mathematics, and future utility beliefs, and how these influence mathematics achievement. The predictability of mathematics achievement based on these factors was assessed using correlation coefficients and multiple linear regression. Spearman rank correlations and multiple regression analyses indicated positive and statistically significant relationships between the explanatory variables (mathematics efficacy, identity with mathematics, interest in mathematics, and future utility beliefs) and the response variable, achievement in mathematics.
Keywords: Mathematics achievement, math efficacy, mathematics interest, factors influence
Procedia PDF Downloads 150
2942 Determinants of Free Independent Traveler Tourist Expenditures in Israel: Quantile Regression Model
Authors: Shlomit Hon-Snir, Sharon Teitler-Regev, Anabel Lifszyc Friedlander
Abstract:
Tourism, one of the world's largest and fastest growing industries, exerts a major economic influence. The number of international tourists is growing every year, and the relative portion of free independent traveler (FIT) tourists is growing as well. The characteristics of independent tourists differ from those of tourists who travel in organized trips. The purpose of the research is to identify the factors that affect the individual tourist's expenses in Israel: total expenses, expenses per day, expenses per tourist, expenses per day per tourist, accommodation expenses, dining expenses, and transportation expenses. Most previous research analyzed total expenses using OLS regression. The determinants influencing expenses were divided into four groups: budget constraints, socio-demographic data, psychological characteristics, and travel-related characteristics. Since the effect of each variable may change over different levels of total expenses, quantile regression (QR) will be applied. The current research will use data collected by the Israeli Ministry of Tourism in 2015 from individual independent tourists at the end of their visit to Israel. Preliminary results show that:
- At lower levels of expense, only income has a (positive) effect on total expenses, while at higher levels of expense, both income and length of stay have (positive) effects.
- The effect of income on total expenses is higher at higher levels of expenses than at lower levels of expenses.
- The number of sites visited during the trip has a (negative) effect on tourist accommodation expenses only for tourists with a high level of total expenses.
Given the increasing share of independent tourism in Israel and around the world, and the importance of tourism to Israel, it is very important to understand the factors that influence the expenses and behavior of independent tourists.
Understanding the factors that affect independent tourists' expenses in Israel can help Israeli policymakers in their promotional efforts to attract tourism to Israel.
Keywords: independent tourist, quantile regression theory, tourism expenses, tourism
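The difference between OLS and quantile regression comes down to the loss being minimized: quantile regression replaces squared error with the asymmetric "pinball" (check) loss. The sketch below illustrates that loss on a constant-only model; the function names and the expense figures are hypothetical and are not taken from the study.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average check loss for quantile level tau in (0, 1).

    Under-predictions are weighted by tau, over-predictions by (1 - tau).
    """
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)

def best_constant(y, tau):
    """Observed value that minimizes the pinball loss as a constant prediction."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), tau))

# Hypothetical total expenses for five tourists, with one very high spender.
expenses = [1, 2, 3, 4, 100]
median_fit = best_constant(expenses, 0.5)  # 3: unaffected by the outlier
upper_fit = best_constant(expenses, 0.9)   # 100: tracks the high spenders
```

Fitting a full quantile regression means minimizing the same loss over linear predictors of the covariates rather than over a constant. Doing so separately for several values of tau is what allows the effect of a variable such as income to differ between low and high spenders.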
Procedia PDF Downloads 274