Search results for: geospatial weighted regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3717

Search results for: geospatial weighted regression

3567 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.

Keywords: classifier ensemble, breast cancer survivability, data mining, SEER

Procedia PDF Downloads 295
3566 Generalized Additive Model for Estimating Propensity Score

Authors: Tahmidul Islam

Abstract:

Propensity Score Matching (PSM) technique has been widely used for estimating causal effect of treatment in observational studies. One major step of implementing PSM is estimating the propensity score (PS). Logistic regression model with additive linear terms of covariates is most used technique in many studies. Logistics regression model is also used with cubic splines for retaining flexibility in the model. However, choosing the functional form of the logistic regression model has been a question since the effectiveness of PSM depends on how accurately the PS been estimated. In many situations, the linearity assumption of linear logistic regression may not hold and non-linear relation between the logit and the covariates may be appropriate. One can estimate PS using machine learning techniques such as random forest, neural network etc for more accuracy in non-linear situation. In this study, an attempt has been made to compare the efficacy of Generalized Additive Model (GAM) in various linear and non-linear settings and compare its performance with usual logistic regression. GAM is a non-parametric technique where functional form of the covariates can be unspecified and a flexible regression model can be fitted. In this study various simple and complex models have been considered for treatment under several situations (small/large sample, low/high number of treatment units) and examined which method leads to more covariate balance in the matched dataset. It is found that logistic regression model is impressively robust against inclusion quadratic and interaction terms and reduces mean difference in treatment and control set equally efficiently as GAM does. GAM provided no significantly better covariate balance than logistic regression in both simple and complex models. The analysis also suggests that larger proportion of controls than treatment units leads to better balance for both of the methods.

Keywords: accuracy, covariate balances, generalized additive model, logistic regression, non-linearity, propensity score matching

Procedia PDF Downloads 335
3565 Using Scale Invariant Feature Transform Features to Recognize Characters in Natural Scene Images

Authors: Belaynesh Chekol, Numan Çelebi

Abstract:

The main purpose of this work is to recognize individual characters extracted from natural scene images using scale invariant feature transform (SIFT) features as an input to K-nearest neighbor (KNN); a classification learner algorithm. For this task, 1,068 and 78 images of English alphabet characters taken from Chars74k data set is used to train and test the classifier respectively. For each character image, We have generated describing features by using SIFT algorithm. This set of features is fed to the learner so that it can recognize and label new images of English characters. Two types of KNN (fine KNN and weighted KNN) were trained and the resulted classification accuracy is 56.9% and 56.5% respectively. The training time taken was the same for both fine and weighted KNN.

Keywords: character recognition, KNN, natural scene image, SIFT

Procedia PDF Downloads 254
3564 A Comparison of Neural Network and DOE-Regression Analysis for Predicting Resource Consumption of Manufacturing Processes

Authors: Frank Kuebler, Rolf Steinhilper

Abstract:

Artificial neural networks (ANN) as well as Design of Experiments (DOE) based regression analysis (RA) are mainly used for modeling of complex systems. Both methodologies are commonly applied in process and quality control of manufacturing processes. Due to the fact that resource efficiency has become a critical concern for manufacturing companies, these models needs to be extended to predict resource-consumption of manufacturing processes. This paper describes an approach to use neural networks as well as DOE based regression analysis for predicting resource consumption of manufacturing processes and gives a comparison of the achievable results based on an industrial case study of a turning process.

Keywords: artificial neural network, design of experiments, regression analysis, resource efficiency, manufacturing process

Procedia PDF Downloads 492
3563 Logistic Regression Model versus Additive Model for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model using the same covariates that were used in the analysis with the additive model. The model gives reasonably similar results to that using additive regression model. In addition, the problem with the estimated conditional probabilities not being constrained between zero and one in additive model is solved here. Also martingale residuals that have been used to judge the goodness of fit for the additive model are shown to be useful for judging the goodness of fit of the logistic model.

Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event

Procedia PDF Downloads 607
3562 Cognitive Weighted Polymorphism Factor: A New Cognitive Complexity Metric

Authors: T. Francis Thamburaj, A. Aloysius

Abstract:

Polymorphism is one of the main pillars of the object-oriented paradigm. It induces hidden forms of class dependencies which may impact software quality, resulting in higher cost factor for comprehending, debugging, testing, and maintaining the software. In this paper, a new cognitive complexity metric called Cognitive Weighted Polymorphism Factor (CWPF) is proposed. Apart from the software structural complexity, it includes the cognitive complexity on the basis of type. The cognitive weights are calibrated based on 27 empirical studies with 120 persons. A case study and experimentation of the new software metric shows positive results. Further, a comparative study is made and the correlation test has proved that CWPF complexity metric is a better, more comprehensive, and more realistic indicator of the software complexity than Abreu’s Polymorphism Factor (PF) complexity metric.

Keywords: cognitive complexity metric, object-oriented metrics, polymorphism factor, software metrics

Procedia PDF Downloads 413
3561 Utility of Geospatial Techniques in Delineating Groundwater-Dependent Ecosystems in Arid Environments

Authors: Mangana B. Rampheri, Timothy Dube, Farai Dondofema, Tatenda Dalu

Abstract:

Identifying and delineating groundwater-dependent ecosystems (GDEs) is critical to the well understanding of the GDEs spatial distribution as well as groundwater allocation. However, this information is inadequately understood due to limited available data for the most area of concerns. Thus, this study aims to address this gap using remotely sensed, analytical hierarchy process (AHP) and in-situ data to identify and delineate GDEs in Khakea-Bray Transboundary Aquifer. Our study developed GDEs index, which integrates seven explanatory variables, namely, Normalized Difference Vegetation Index (NDVI), Modified Normalized Difference Water Index (MNDWI), Land-use and landcover (LULC), slope, Topographic Wetness Index (TWI), flow accumulation and curvature. The GDEs map was delineated using the weighted overlay tool in ArcGIS environments. The map was spatially classified into two classes, namely, GDEs and Non-GDEs. The results showed that only 1,34 % (721,91 km2) of the area is characterised by GDEs. Finally, groundwater level (GWL) data was used for validation through correlation analysis. Our results indicated that: 1) GDEs are concentrated at the northern, central, and south-western part of our study area, and 2) the validation results showed that GDEs classes do not overlap with GWL located in the 22 boreholes found in the given area. However, the results show a possible delineation of GDEs in the study area using remote sensing and GIS techniques along with AHP. The results of this study further contribute to identifying and delineating priority areas where appropriate water conservation programs, as well as strategies for sustainable groundwater development, can be implemented.

Keywords: analytical hierarchy process (AHP), explanatory variables, groundwater-dependent ecosystems (GDEs), khakea-bray transboundary aquifer, sentinel-2

Procedia PDF Downloads 78
3560 Extended Constraint Mask Based One-Bit Transform for Low-Complexity Fast Motion Estimation

Authors: Oğuzhan Urhan

Abstract:

In this paper, an improved motion estimation (ME) approach based on weighted constrained one-bit transform is proposed for block-based ME employed in video encoders. Binary ME approaches utilize low bit-depth representation of the original image frames with a Boolean exclusive-OR based hardware efficient matching criterion to decrease computational burden of the ME stage. Weighted constrained one-bit transform (WC‑1BT) based approach improves the performance of conventional C-1BT based ME employing 2-bit depth constraint mask instead of a 1-bit depth mask. In this work, the range of constraint mask is further extended to increase ME performance of WC-1BT approach. Experiments reveal that the proposed method provides better ME accuracy compared existing similar ME methods in the literature.

Keywords: fast motion estimation; low-complexity motion estimation, video coding

Procedia PDF Downloads 292
3559 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data

Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone

Abstract:

This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify environmental and economic elements that are contributing to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear poisson, and regularization regression methods such as a ridge, lasso, and elastic net penalty were employed. Cross-validation was performed to acquire tuning parameters. The methods proposed can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease data set, the study successfully identified key factors, and the results were consistent with previous studies.

Keywords: lyme disease, Poisson generalized linear model, ridge regression, lasso regression, elastic net regression

Procedia PDF Downloads 96
3558 An Analysis of the Effect of Sharia Financing and Work Relation Founding towards Non-Performing Financing in Islamic Banks in Indonesia

Authors: Muhammad Bahrul Ilmi

Abstract:

The purpose of this research is to analyze the influence of Islamic financing and work relation founding simultaneously and partially towards non-performing financing in Islamic banks. This research was regression quantitative field research, and had been done in Muammalat Indonesia Bank and Islamic Danamon Bank in 3 months. The populations of this research were 15 account officers of Muammalat Indonesia Bank and Islamic Danamon Bank in Surakarta, Indonesia. The techniques of collecting data used in this research were documentation, questionnaire, literary study and interview. Regression analysis result shows that Islamic financing and work relation founding simultaneously has positive and significant effect towards non performing financing of two Islamic Banks. It is obtained with probability value 0.003 which is less than 0.05 and F value 9.584. The analysis result of Islamic financing regression towards non performing financing shows the significant effect. It is supported by double linear regression analysis with probability value 0.001 which is less than 0.05. The regression analysis of work relation founding effect towards non-performing financing shows insignificant effect. This is shown in the double linear regression analysis with probability value 0.161 which is bigger than 0.05.

Keywords: Syariah financing, work relation founding, non-performing financing (NPF), Islamic Bank

Procedia PDF Downloads 404
3557 A Kolmogorov-Smirnov Type Goodness-Of-Fit Test of Multinomial Logistic Regression Model in Case-Control Studies

Authors: Chen Li-Ching

Abstract:

The multinomial logistic regression model is used popularly for inferring the relationship of risk factors and disease with multiple categories. This study based on the discrepancy between the nonparametric maximum likelihood estimator and semiparametric maximum likelihood estimator of the cumulative distribution function to propose a Kolmogorov-Smirnov type test statistic to assess adequacy of the multinomial logistic regression model for case-control data. A bootstrap procedure is presented to calculate the critical value of the proposed test statistic. Empirical type I error rates and powers of the test are performed by simulation studies. Some examples will be illustrated the implementation of the test.

Keywords: case-control studies, goodness-of-fit test, Kolmogorov-Smirnov test, multinomial logistic regression

Procedia PDF Downloads 427
3556 Towards an Effective Approach for Modelling near Surface Air Temperature Combining Weather and Satellite Data

Authors: Nicola Colaninno, Eugenio Morello

Abstract:

The urban environment affects local-to-global climate and, in turn, suffers global warming phenomena, with worrying impacts on human well-being, health, social and economic activities. Physic-morphological features of the built-up space affect urban air temperature, locally, causing the urban environment to be warmer compared to surrounding rural. This occurrence, typically known as the Urban Heat Island (UHI), is normally assessed by means of air temperature from fixed weather stations and/or traverse observations or based on remotely sensed Land Surface Temperatures (LST). The information provided by ground weather stations is key for assessing local air temperature. However, the spatial coverage is normally limited due to low density and uneven distribution of the stations. Although different interpolation techniques such as Inverse Distance Weighting (IDW), Ordinary Kriging (OK), or Multiple Linear Regression (MLR) are used to estimate air temperature from observed points, such an approach may not effectively reflect the real climatic conditions of an interpolated point. Quantifying local UHI for extensive areas based on weather stations’ observations only is not practicable. Alternatively, the use of thermal remote sensing has been widely investigated based on LST. Data from Landsat, ASTER, or MODIS have been extensively used. Indeed, LST has an indirect but significant influence on air temperatures. However, high-resolution near-surface air temperature (NSAT) is currently difficult to retrieve. Here we have experimented Geographically Weighted Regression (GWR) as an effective approach to enable NSAT estimation by accounting for spatial non-stationarity of the phenomenon. The model combines on-site measurements of air temperature, from fixed weather stations and satellite-derived LST. The approach is structured upon two main steps. First, a GWR model has been set to estimate NSAT at low resolution, by combining air temperature from discrete observations retrieved by weather stations (dependent variable) and the LST from satellite observations (predictor). At this step, MODIS data, from Terra satellite, at 1 kilometer of spatial resolution have been employed. Two time periods are considered according to satellite revisit period, i.e. 10:30 am and 9:30 pm. Afterward, the results have been downscaled at 30 meters of spatial resolution by setting a GWR model between the previously retrieved near-surface air temperature (dependent variable), the multispectral information as provided by the Landsat mission, in particular the albedo, and Digital Elevation Model (DEM) from the Shuttle Radar Topography Mission (SRTM), both at 30 meters. Albedo and DEM are now the predictors. The area under investigation is the Metropolitan City of Milan, which covers an area of approximately 1,575 km2 and encompasses a population of over 3 million inhabitants. Both models, low- (1 km) and high-resolution (30 meters), have been validated according to a cross-validation that relies on indicators such as R2, Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). All the employed indicators give evidence of highly efficient models. In addition, an alternative network of weather stations, available for the City of Milano only, has been employed for testing the accuracy of the predicted temperatures, giving and RMSE of 0.6 and 0.7 for daytime and night-time, respectively.

Keywords: urban climate, urban heat island, geographically weighted regression, remote sensing

Procedia PDF Downloads 168
3555 A Study on Inference from Distance Variables in Hedonic Regression

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban area, several landmarks may affect housing price and rents, hedonic analysis should employ distance variables corresponding to each landmarks. Unfortunately, the effects of distances to landmarks on housing prices are generally not consistent with the true price. These distance variables may cause magnitude error in regression, pointing a problem of spatial multicollinearity. In this paper, we provided some approaches for getting the samples with less bias and method on locating the specific sampling area to avoid the multicollinerity problem in two specific landmarks case.

Keywords: landmarks, hedonic regression, distance variables, collinearity, multicollinerity

Procedia PDF Downloads 426
3554 Forecasting of Grape Juice Flavor by Using Support Vector Regression

Authors: Ren-Jieh Kuo, Chun-Shou Huang

Abstract:

The research of juice flavor forecasting has become more important in China. Due to the fast economic growth in China, many different kinds of juices have been introduced to the market. If a beverage company can understand their customers’ preference well, the juice can be served more attractively. Thus, this study intends to introduce the basic theory and computing process of grapes juice flavor forecasting based on support vector regression (SVR). Applying SVR, BPN and LR to forecast the flavor of grapes juice in real data, the result shows that SVR is more suitable and effective at predicting performance.

Keywords: flavor forecasting, artificial neural networks, Support Vector Regression, China

Procedia PDF Downloads 453
3553 The Geographic Distribution of Complementary, Alternative, and Traditional Medicine in the United States in 2018

Authors: Janis E. Campbell

Abstract:

Most of what is known about complementary, alternative or traditional medicine (CATM) in the United States today is known from either the National Health Interview Survey a cross-sectional survey with a few questions or from small cross-sectional or cohort studies with specific populations. The broad geographical distribution in CATM use or providers is not known. For this project, we used geospatial cluster analysis to determine if there were clusters of CATM provider by county in the US. In this analysis, we used the National Provider Index to determine the geographic distribution of providers in the US. Of the 215,769 CAMT providers 211,603 resided in the contiguous US: Acupuncturist (26,563); Art, Poetry, Music and Dance Therapist (2,752); Chiropractor (89,514); Doula/Midwife (3,535); Exercise (507); Homeopath (380); Massage Therapist (36,540); Mechanotherapist (1,888); Naprapath (146); Naturopath (4,782); Nutrition (42,036); Reflexologist (522); Religious (2,438). ESRI® spatial autocorrelation was used to determine if the geographic location of CATM providers were random or clustered. For global analysis, we used Getis-Ord General G and for Local Indicators of Spatial Associations with an Optimized Hot Spot Analysis using an alpha of 0.05. Overall, CATM providers were clustered with both low and high. With Chiropractors, focusing in the Midwest, religious providers having very small clusters in the central US, and other types of CAMT focused in the northwest and west coast, Colorado and New Mexico, the great lakes areas and Florida. We will discuss some of the implications of this study, including associations with health, economic, social, and political systems.

Keywords: complementary medicine, alternative medicine, geospatial, United States

Procedia PDF Downloads 123
3552 Cloud-Based Multiresolution Geodata Cube for Efficient Raster Data Visualization and Analysis

Authors: Lassi Lehto, Jaakko Kahkonen, Juha Oksanen, Tapani Sarjakoski

Abstract:

The use of raster-formatted data sets in geospatial analysis is increasing rapidly. At the same time, geographic data are being introduced into disciplines outside the traditional domain of geoinformatics, like climate change, intelligent transport, and immigration studies. These developments call for better methods to deliver raster geodata in an efficient and easy-to-use manner. Data cube technologies have traditionally been used in the geospatial domain for managing Earth Observation data sets that have strict requirements for effective handling of time series. The same approach and methodologies can also be applied in managing other types of geospatial data sets. A cloud service-based geodata cube, called GeoCubes Finland, has been developed to support online delivery and analysis of most important geospatial data sets with national coverage. The main target group of the service is the academic research institutes in the country. The most significant aspects of the GeoCubes data repository include the use of multiple resolution levels, cloud-optimized file structure, and a customized, flexible content access API. Input data sets are pre-processed while being ingested into the repository to bring them into a harmonized form in aspects like georeferencing, sampling resolutions, spatial subdivision, and value encoding. All the resolution levels are created using an appropriate generalization method, selected depending on the nature of the source data set. Multiple pre-processed resolutions enable new kinds of online analysis approaches to be introduced. Analysis processes based on interactive visual exploration can be effectively carried out, as the level of resolution most close to the visual scale can always be used. In the same way, statistical analysis can be carried out on resolution levels that best reflect the scale of the phenomenon being studied. Access times remain close to constant, independent of the scale applied in the application. The cloud service-based approach, applied in the GeoCubes Finland repository, enables analysis operations to be performed on the server platform, thus making high-performance computing facilities easily accessible. The developed GeoCubes API supports this kind of approach for online analysis. The use of cloud-optimized file structures in data storage enables the fast extraction of subareas. The access API allows for the use of vector-formatted administrative areas and user-defined polygons as definitions of subareas for data retrieval. Administrative areas of the country in four levels are available readily from the GeoCubes platform. In addition to direct delivery of raster data, the service also supports the so-called virtual file format, in which only a small text file is first downloaded. The text file contains links to the raster content on the service platform. The actual raster data is downloaded on demand, from the spatial area and resolution level required in each stage of the application. By the geodata cube approach, pre-harmonized geospatial data sets are made accessible to new categories of inexperienced users in an easy-to-use manner. At the same time, the multiresolution nature of the GeoCubes repository facilitates expert users to introduce new kinds of interactive online analysis operations.

Keywords: cloud service, geodata cube, multiresolution, raster geodata

Procedia PDF Downloads 104
3551 Estimation of Coefficients of Ridge and Principal Components Regressions with Multicollinear Data

Authors: Rajeshwar Singh

Abstract:

The presence of multicollinearity is common in handling with several explanatory variables simultaneously due to exhibiting a linear relationship among them. A great problem arises in understanding the impact of explanatory variables on the dependent variable. Thus, the method of least squares estimation gives inexact estimates. In this case, it is advised to detect its presence first before proceeding further. Using the ridge regression degree of its occurrence is reduced but principal components regression gives good estimates in this situation. This paper discusses well-known techniques of the ridge and principal components regressions and applies to get the estimates of coefficients by both techniques. In addition to it, this paper also discusses the conflicting claim on the discovery of the method of ridge regression based on available documents.

Keywords: conflicting claim on credit of discovery of ridge regression, multicollinearity, principal components and ridge regressions, variance inflation factor

Procedia PDF Downloads 375
3550 Estimate of Maximum Expected Intensity of One-Half-Wave Lines Dancing

Authors: A. Bekbaev, M. Dzhamanbaev, R. Abitaeva, A. Karbozova, G. Nabyeva

Abstract:

In this paper, the regression dependence of dancing intensity from wind speed and length of span was established due to the statistic data obtained from multi-year observations on line wires dancing accumulated by power systems of Kazakhstan and the Russian Federation. The lower and upper limitations of the equations parameters were estimated, as well as the adequacy of the regression model. The constructed model will be used in research of dancing phenomena for the development of methods and means of protection against dancing and for zoning plan of the territories of line wire dancing.

Keywords: power lines, line wire dancing, dancing intensity, regression equation, dancing area intensity

Procedia PDF Downloads 287
3549 Estimation and Forecasting with a Quantile AR Model for Financial Returns

Authors: Yuzhi Cai

Abstract:

This talk presents a Bayesian approach to quantile autoregressive (QAR) time series model estimation and forecasting. We establish that the joint posterior distribution of the model parameters and future values is well defined. The associated MCMC algorithm for parameter estimation and forecasting converges to the posterior distribution quickly. We also present a combining forecasts technique to produce more accurate out-of-sample forecasts by using a weighted sequence of fitted QAR models. A moving window method to check the quality of the estimated conditional quantiles is developed. We verify our methodology using simulation studies and then apply it to currency exchange rate data. An application of the method to the USD to GBP daily currency exchange rates will also be discussed. The results obtained show that an unequally weighted combining method performs better than other forecasting methodology.

Keywords: combining forecasts, MCMC, quantile modelling, quantile forecasting, predictive density functions

Procedia PDF Downloads 315
3548 Incorporating Anomaly Detection in a Digital Twin Scenario Using Symbolic Regression

Authors: Manuel Alves, Angelica Reis, Armindo Lobo, Valdemar Leiras

Abstract:

In industry 4.0, it is common to have a lot of sensor data. In this deluge of data, hints of possible problems are difficult to spot. The digital twin concept aims to help answer this problem, but it is mainly used as a monitoring tool to handle the visualisation of data. Failure detection is of paramount importance in any industry, and it consumes a lot of resources. Any improvement in this regard is of tangible value to the organisation. The aim of this paper is to add the ability to forecast test failures, curtailing detection times. To achieve this, several anomaly detection algorithms were compared with a symbolic regression approach. To this end, Isolation Forest, One-Class SVM and an auto-encoder have been explored. For the symbolic regression PySR library was used. The first results show that this approach is valid and can be added to the tools available in this context as a low resource anomaly detection method since, after training, the only requirement is the calculation of a polynomial, a useful feature in the digital twin context.

Keywords: anomaly detection, digital twin, industry 4.0, symbolic regression

Procedia PDF Downloads 89
3547 Application of RS and GIS Technique for Identifying Groundwater Potential Zone in Gomukhi Nadhi Sub Basin, South India

Authors: Punitha Periyasamy, Mahalingam Sudalaimuthu, Sachikanta Nanda, Arasu Sundaram

Abstract:

India holds 17.5% of the world’s population but has only 2% of the total geographical area of the world where 27.35% of the area is categorized as wasteland due to lack of or less groundwater. So there is a demand for excessive groundwater for agricultural and non agricultural activities to balance its growth rate. With this in mind, an attempt is made to find the groundwater potential zone in Gomukhi river sub basin of Vellar River basin, TamilNadu, India covering an area of 1146.6 Sq.Km consists of 9 blocks from Peddanaickanpalayam to Villupuram fall in the sub basin. The thematic maps such as Geology, Geomorphology, Lineament, Landuse, and Landcover and Drainage are prepared for the study area using IRS P6 data. The collateral data includes rainfall, water level, soil map are collected for analysis and inference. The digital elevation model (DEM) is generated using Shuttle Radar Topographic Mission (SRTM) and the slope of the study area is obtained. ArcGIS 10.1 acts as a powerful spatial analysis tool to find out the ground water potential zones in the study area by means of weighted overlay analysis. Each individual parameter of the thematic maps are ranked and weighted in accordance with their influence to increase the water level in the ground. The potential zones in the study area are classified viz., Very Good, Good, Moderate, Poor with its aerial extent of 15.67, 381.06, 575.38, 174.49 Sq.Km respectively.

Keywords: ArcGIS, DEM, groundwater, recharge, weighted overlay

Procedia PDF Downloads 416
3546 Impact of Infrastructural Development on Socio-Economic Growth: An Empirical Investigation in India

Authors: Jonardan Koner

Abstract:

The study attempts to find out the impact of infrastructural investment on state economic growth in India. It further tries to determine the magnitude of the impact of infrastructural investment on economic indicator, i.e., per-capita income (PCI) in Indian States. The study uses panel regression technique to measure the impact of infrastructural investment on per-capita income (PCI) in Indian States. Panel regression technique helps incorporate both the cross-section and time-series aspects of the dataset. In order to analyze the difference in impact of the explanatory variables on the explained variables across states, the study uses Fixed Effect Panel Regression Model. The conclusions of the study are that infrastructural investment has a desirable impact on economic development and that the impact is different for different states in India. We analyze time series data (annual frequency) ranging from 1991 to 2010. The study reveals that the infrastructural investment significantly explains the variation of economic indicators.

Keywords: infrastructural investment, multiple regression, panel regression techniques, economic development, fixed effect dummy variable model

Procedia PDF Downloads 346
3545 A Quadratic Model to Early Predict the Blastocyst Stage with a Time Lapse Incubator

Authors: Cecile Edel, Sandrine Giscard D'Estaing, Elsa Labrune, Jacqueline Lornage, Mehdi Benchaib

Abstract:

Introduction: The use of incubator equipped with time-lapse technology in Artificial Reproductive Technology (ART) allows a continuous surveillance. With morphocinetic parameters, algorithms are available to predict the potential outcome of an embryo. However, the different proposed time-lapse algorithms do not take account the missing data, and then some embryos could not be classified. The aim of this work is to construct a predictive model even in the case of missing data. Materials and methods: Patients: A retrospective study was performed, in biology laboratory of reproduction at the hospital ‘Femme Mère Enfant’ (Lyon, France) between 1 May 2013 and 30 April 2015. Embryos (n= 557) obtained from couples (n=108) were cultured in a time-lapse incubator (Embryoscope®, Vitrolife, Goteborg, Sweden). Time-lapse incubator: The morphocinetic parameters obtained during the three first days of embryo life were used to build the predictive model. Predictive model: A quadratic regression was performed between the number of cells and time. N = a. T² + b. T + c. N: number of cells at T time (T in hours). The regression coefficients were calculated with Excel software (Microsoft, Redmond, WA, USA), a program with Visual Basic for Application (VBA) (Microsoft) was written for this purpose. The quadratic equation was used to find a value that allows to predict the blastocyst formation: the synthetize value. The area under the curve (AUC) obtained from the ROC curve was used to appreciate the performance of the regression coefficients and the synthetize value. A cut-off value has been calculated for each regression coefficient and for the synthetize value to obtain two groups where the difference of blastocyst formation rate according to the cut-off values was maximal. The data were analyzed with SPSS (IBM, Il, Chicago, USA). Results: Among the 557 embryos, 79.7% had reached the blastocyst stage. The synthetize value corresponds to the value calculated with time value equal to 99, the highest AUC was then obtained. The AUC for regression coefficient ‘a’ was 0.648 (p < 0.001), 0.363 (p < 0.001) for the regression coefficient ‘b’, 0.633 (p < 0.001) for the regression coefficient ‘c’, and 0.659 (p < 0.001) for the synthetize value. The results are presented as follow: blastocyst formation rate under cut-off value versus blastocyst rate formation above cut-off value. For the regression coefficient ‘a’ the optimum cut-off value was -1.14.10-3 (61.3% versus 84.3%, p < 0.001), 0.26 for the regression coefficient ‘b’ (83.9% versus 63.1%, p < 0.001), -4.4 for the regression coefficient ‘c’ (62.2% versus 83.1%, p < 0.001) and 8.89 for the synthetize value (58.6% versus 85.0%, p < 0.001). Conclusion: This quadratic regression allows to predict the outcome of an embryo even in case of missing data. Three regression coefficients and a synthetize value could represent the identity card of an embryo. ‘a’ regression coefficient represents the acceleration of cells division, ‘b’ regression coefficient represents the speed of cell division. We could hypothesize that ‘c’ regression coefficient could represent the intrinsic potential of an embryo. This intrinsic potential could be dependent from oocyte originating the embryo. These hypotheses should be confirmed by studies analyzing relationship between regression coefficients and ART parameters.

Keywords: ART procedure, blastocyst formation, time-lapse incubator, quadratic model

Procedia PDF Downloads 285
3544 Geodesign Application for Bio-Swale Design: A Data-Driven Design Approach for a Case Site in Ottawa Street North in Hamilton, Ontario, Canada

Authors: Adele Pierre, Nadia Amoroso

Abstract:

Changing climate patterns are resulting in increased in storm severity, challenging traditional methods of managing stormwater runoff. This research compares a system of bioswales to existing curb and gutter infrastructure in a post-industrial streetscape of Hamilton, Ontario. Using the geodesign process, including rule-based set parameters and an integrated approach combining geospatial information with stakeholder input, a section of Ottawa St. North was modelled to show how green infrastructure can ease the burden on aging, combined sewer systems. Qualitative data was gathered from residents of the neighbourhood through field notes, and quantitative geospatial data through GIS and site analysis. Parametric modelling was used to generate multiple design scenarios, each visualizing resulting impacts on stormwater runoff along with their calculations. The selected design scenarios offered both an aesthetically pleasing urban bioswale street-scape system while minimizing and controlling stormwater runoff. Interactive maps, videos and the 3D model were presented for stakeholder comment via ESRI’s (Environmental System Research Institute) web-scene. The results of the study demonstrate powerful tools that can assist landscape architects in designing, collaborating and communicating stormwater strategies.

Keywords: bioswale, geodesign, data-driven and rule-based design, geodesign, GIS, stormwater management

Procedia PDF Downloads 153
3543 Two-Phase Sampling for Estimating a Finite Population Total in Presence of Missing Values

Authors: Daniel Fundi Murithi

Abstract:

Missing data is a real bane in many surveys. To overcome the problems caused by missing data, partial deletion, and single imputation methods, among others, have been proposed. However, problems such as discarding usable data and inaccuracy in reproducing known population parameters and standard errors are associated with them. For regression and stochastic imputation, it is assumed that there is a variable with complete cases to be used as a predictor in estimating missing values in the other variable, and the relationship between the two variables is linear, which might not be realistic in practice. In this project, we estimate population total in presence of missing values in two-phase sampling. Instead of regression or stochastic models, non-parametric model based regression model is used in imputing missing values. Empirical study showed that nonparametric model-based regression imputation is better in reproducing variance of population total estimate obtained when there were no missing values compared to mean, median, regression, and stochastic imputation methods. Although regression and stochastic imputation were better than nonparametric model-based imputation in reproducing population total estimates obtained when there were no missing values in one of the sample sizes considered, nonparametric model-based imputation may be used when the relationship between outcome and predictor variables is not linear.

Keywords: finite population total, missing data, model-based imputation, two-phase sampling

Procedia PDF Downloads 104
3542 A Novel Approach towards Test Case Prioritization Technique

Authors: Kamna Solanki, Yudhvir Singh, Sandeep Dalal

Abstract:

Software testing is a time and cost intensive process. A scrutiny of the code and rigorous testing is required to identify and rectify the putative bugs. The process of bug identification and its consequent correction is continuous in nature and often some of the bugs are removed after the software has been launched in the market. This process of code validation of the altered software during the maintenance phase is termed as Regression testing. Regression testing ubiquitously considers resource constraints; therefore, the deduction of an appropriate set of test cases, from the ensemble of the entire gamut of test cases, is a critical issue for regression test planning. This paper presents a novel method for designing a suitable prioritization process to optimize fault detection rate and performance of regression test on predefined constraints. The proposed method for test case prioritization m-ACO alters the food source selection criteria of natural ants and is basically a modified version of Ant Colony Optimization (ACO). The proposed m-ACO approach has been coded in 'Perl' language and results are validated using three examples by computation of Average Percentage of Faults Detected (APFD) metric.

Keywords: regression testing, software testing, test case prioritization, test suite optimization

Procedia PDF Downloads 305
3541 Revolutionizing Oil Palm Replanting: Geospatial Terrace Design for High-precision Ground Implementation Compared to Conventional Methods

Authors: Nursuhaili Najwa Masrol, Nur Hafizah Mohammed, Nur Nadhirah Rusyda Rosnan, Vijaya Subramaniam, Sim Choon Cheak

Abstract:

Replanting in oil palm cultivation is vital to enable the introduction of planting materials and provides an opportunity to improve the road, drainage, terrace design, and planting density. Oil palm replanting is fundamentally necessary every 25 years. The adoption of the digital replanting blueprint is imperative as it can assist the Malaysia Oil Palm industry in addressing challenges such as labour shortages and limited expertise related to replanting tasks. Effective replanting planning should commence at least 6 months prior to the actual replanting process. Therefore, this study will help to plan and design the replanting blueprint with high-precision translation on the ground. With the advancement of geospatial technology, it is now feasible to engage in thoroughly researched planning, which can help maximize the potential yield. A blueprint designed before replanting is to enhance management’s ability to optimize the planting program, address manpower issues, or even increase productivity. In terrace planting blueprints, geographic tools have been utilized to design the roads, drainages, terraces, and planting points based on the ARM standards. These designs are mapped with location information and undergo statistical analysis. The geospatial approach is essential in precision agriculture and ensuring an accurate translation of design to the ground by implementing high-accuracy technologies. In this study, geospatial and remote sensing technologies played a vital role. LiDAR data was employed to determine the Digital Elevation Model (DEM), enabling the precise selection of terraces, while ortho imagery was used for validation purposes. Throughout the designing process, Geographical Information System (GIS) tools were extensively utilized. To assess the design’s reliability on the ground compared with the current conventional method, high-precision GPS instruments like EOS Arrow Gold and HIPER VR GNSS were used, with both offering accuracy levels between 0.3 cm and 0.5cm. Nearest Distance Analysis was generated to compare the design with actual planting on the ground. The analysis revealed that it could not be applied to the roads due to discrepancies between actual roads and the blueprint design, which resulted in minimal variance. In contrast, the terraces closely adhered to the GPS markings, with the most variance distance being less than 0.5 meters compared to actual terraces constructed. Considering the required slope degrees for terrace planting, which must be greater than 6 degrees, the study found that approximately 65% of the terracing was constructed at a 12-degree slope, while over 50% of the terracing was constructed at slopes exceeding the minimum degrees. Utilizing blueprint replanting promising strategies for optimizing land utilization in agriculture. This approach harnesses technology and meticulous planning to yield advantages, including increased efficiency, enhanced sustainability, and cost reduction. From this study, practical implementation of this technique can lead to tangible and significant improvements in agricultural sectors. In boosting further efficiencies, future initiatives will require more sophisticated techniques and the incorporation of precision GPS devices for upcoming blueprint replanting projects besides strategic progression aims to guarantee the precision of both blueprint design stages and its subsequent implementation on the field. Looking ahead, automating digital blueprints are necessary to reduce time, workforce, and costs in commercial production.

Keywords: replanting, geospatial, precision agriculture, blueprint

Procedia PDF Downloads 48
3540 Proposals of Exposure Limits for Infrasound From Wind Turbines

Authors: M. Pawlaczyk-Łuszczyńska, T. Wszołek, A. Dudarewicz, P. Małecki, M. Kłaczyński, A. Bortkiewicz

Abstract:

Human tolerance to infrasound is defined by the hearing threshold. Infrasound that cannot be heard (or felt) is not annoying and is not thought to have any other adverse or health effects. Recent research has largely confirmed earlier findings. ISO 7196:1995 recommends the use of G-weighted characteristics for the assessment of infrasound. There is a strong correlation between G-weighted SPL and annoyance perception. The aim of this study was to propose exposure limits for infrasound from wind turbines. However, only a few countries have set limits for infrasound. These limits are usually no higher than 85-92 dBG, and none of them are specific to wind turbines. Over the years, a number of studies have been carried out to determine hearing thresholds below 20 Hz. It has been recognized that 10% of young people would be able to perceive 10 Hz at around 90 dB, and it has also been found that the difference in median hearing thresholds between young adults aged around 20 years and older adults aged over 60 years is around 10 dB, irrespective of frequency. This shows that older people (up to about 60 years of age) retain good hearing in the low frequency range, while their sensitivity to higher frequencies is often significantly reduced. In terms of exposure limits for infrasound, the average hearing threshold corresponds to a tone with a G-weighted SPL of about 96 dBG. In contrast, infrasound at Lp,G levels below 85-90 dBG is usually inaudible. The individual hearing threshold can, therefore be 10-15 dB lower than the average threshold, so the recommended limits for environmental infrasound could be 75 dBG or 80 dBG. It is worth noting that the G86 curve has been taken as the threshold of auditory perception of infrasound reached by 90-95% of the population, so the G75 and G80 curves can be taken as the criterion curve for wind turbine infrasound. Finally, two assessment methods and corresponding exposure limit values have been proposed for wind turbine infrasound, i.e. method I - based on G-weighted sound pressure level measurements and method II - based on frequency analysis in 1/3-octave bands in the frequency range 4-20 Hz. Separate limit values have been set for outdoor living areas in the open countryside (Area A) and for noise sensitive areas (Area B). In the case of Method I, infrasound limit values of 80 dBG (for areas A) and 75 dBG (for areas B) have been proposed, while in the case of Method II - criterion curves G80 and G75 have been chosen (for areas A and B, respectively).

Keywords: infrasound, exposure limit, hearing thresholds, wind turbines

Procedia PDF Downloads 49
3539 Automated Natural Hazard Zonation System with Internet-SMS Warning: Distributed GIS for Sustainable Societies Creating Schema and Interface for Mapping and Communication

Authors: Devanjan Bhattacharya, Jitka Komarkova

Abstract:

The research describes the implementation of a novel and stand-alone system for dynamic hazard warning. The system uses all existing infrastructure already in place like mobile networks, a laptop/PC and the small installation software. The geospatial dataset are the maps of a region which are again frugal. Hence there is no need to invest and it reaches everyone with a mobile. A novel architecture of hazard assessment and warning introduced where major technologies in ICT interfaced to give a unique WebGIS based dynamic real time geohazard warning communication system. A never before architecture introduced for integrating WebGIS with telecommunication technology. Existing technologies interfaced in a novel architectural design to address a neglected domain in a way never done before–through dynamically updatable WebGIS based warning communication. The work publishes new architecture and novelty in addressing hazard warning techniques in sustainable way and user friendly manner. Coupling of hazard zonation and hazard warning procedures into a single system has been shown. Generalized architecture for deciphering a range of geo-hazards has been developed. Hence the developmental work presented here can be summarized as the development of internet-SMS based automated geo-hazard warning communication system; integrating a warning communication system with a hazard evaluation system; interfacing different open-source technologies towards design and development of a warning system; modularization of different technologies towards development of a warning communication system; automated data creation, transformation and dissemination over different interfaces. The architecture of the developed warning system has been functionally automated as well as generalized enough that can be used for any hazard and setup requirement has been kept to a minimum.

Keywords: geospatial, web-based GIS, geohazard, warning system

Procedia PDF Downloads 370
3538 E-Hailing Taxi Industry Management Mode Innovation Based on the Credit Evaluation

Authors: Yuan-lin Liu, Ye Li, Tian Xia

Abstract:

There are some shortcomings in Chinese existing taxi management modes. This paper suggests to establish the third-party comprehensive information management platform and put forward an evaluation model based on credit. Four indicators are used to evaluate the drivers’ credit, they are passengers’ evaluation score, driving behavior evaluation, drivers’ average bad record number, and personal credit score. A weighted clustering method is used to achieve credit level evaluation for taxi drivers. The management of taxi industry is based on the credit level, while the grade of the drivers is accorded to their credit rating. Credit rating determines the cost, income levels, the market access, useful period of license and the level of wage and bonus, as well as violation fine. These methods can make the credit evaluation effective. In conclusion, more credit data will help to set up a more accurate and detailed classification standard library.

Keywords: credit, mobile internet, e-hailing taxi, management mode, weighted cluster

Procedia PDF Downloads 288