Search results for: nonparametric geographically weighted regression
3817 A Ratio-Weighted Decision Tree Algorithm for Imbalance Dataset Classification
Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike
Abstract:
Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased on imbalanced datasets because of the skewed class distribution of such datasets. To overcome this problem, this study proposes a weighted decision tree algorithm that aims to remove the bias toward the majority class and avoids reducing the number of majority observations when classifying imbalanced datasets. The proposed weighted decision tree algorithm was tested on three imbalanced datasets: a cancer dataset, the German credit dataset, and a banknote dataset. Specificity, sensitivity, and accuracy were used to evaluate the performance of the proposed algorithm on these datasets. The evaluation results show that, for some of the weights of the proposed decision tree, specificity, sensitivity, and accuracy were better than those of the ID3 decision tree and of the decision tree induced with minority entropy for all three datasets.
Keywords: data mining, decision tree, classification, imbalance dataset
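As a brief illustration of the three evaluation metrics named in this abstract, the sketch below computes sensitivity, specificity, and accuracy from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the paper; the example only shows why accuracy alone can look good on an imbalanced test set.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives that were detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical imbalanced test set: 10 minority (positive) and 90 majority cases.
sens, spec, acc = classification_metrics(tp=6, fn=4, tn=85, fp=5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

With these made-up counts, accuracy is high even though four of the ten minority cases are missed, which is why the abstract reports sensitivity and specificity alongside accuracy.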
Procedia PDF Downloads 137
3816 The Theory behind Logistic Regression
Authors: Jan Henrik Wosnitza
Abstract:
Logistic regression has developed into a standard approach for estimating conditional probabilities in a wide range of applications, including credit risk prediction. This article contributes to the current literature on logistic regression in four ways. First, it is demonstrated that binary logistic regression automatically meets its model assumptions under very general conditions. This result explains, at least in part, the popularity of logistic regression. Second, the requirement of homoscedasticity in the context of binary logistic regression is theoretically substantiated. The variances among the groups of defaulted and non-defaulted obligors have to be the same across the level of the aggregated default indicators in order to achieve linear logits. Third, this article sheds some light on the question of why nonlinear logits might be superior to linear logits when only a small amount of data is available. Fourth, an innovative methodology for estimating correlations between obligor-specific log-odds is proposed. In order to crystallize the key ideas, this paper focuses on the example of credit risk prediction. However, the results presented in this paper can easily be transferred to any other field of application.
Keywords: correlation, credit risk estimation, default correlation, homoscedasticity, logistic regression, nonlinear logistic regression
Procedia PDF Downloads 426
3815 Econophysics: The Use of Entropy Measures in Finance
Authors: Muhammad Sheraz, Vasile Preda, Silvia Dedu
Abstract:
Concepts of econophysics are usually used to solve problems related to uncertainty and nonlinear dynamics. In the theory of option pricing, risk-neutral probabilities play a very important role. The application of entropy in finance can be regarded as an extension of both information entropy and probability entropy. It can be an important tool in various financial methods such as risk measurement, portfolio selection, option pricing and asset pricing. Gulko applied Entropy Pricing Theory (EPT) to the pricing of stock options and introduced an alternative framework to the Black-Scholes model for pricing European stock options. In this article, we present solutions to maximum entropy problems based on Tsallis, Weighted-Tsallis, Kaniadakis, and Weighted-Kaniadakis entropies to obtain risk-neutral densities. We have also obtained the value of European call and put options in this framework.
Keywords: option pricing, Black-Scholes model, Tsallis entropy, Kaniadakis entropy, weighted entropy, risk-neutral density
Procedia PDF Downloads 303
3814 A Study on the Measurement of Spatial Mismatch and the Influencing Factors of “Job-Housing” in Affordable Housing from the Perspective of Commuting
Authors: Daijun Chen
Abstract:
Affordable housing is subsidized by the government to meet the housing demand of low and middle-income urban residents in the process of urbanization and to alleviate the housing inequality caused by market-based housing reforms. It is a recognized fact that the living conditions of the insured have been improved while constructing the subsidized housing. However, the choice of affordable housing is mostly in the suburbs, where the surrounding urban functions and infrastructure are incomplete, resulting in the spatial mismatch of "jobs-housing" in affordable housing. The main reason for this problem is that the residents of affordable housing are more sensitive to the spatial location of their residence, but their selectivity and controllability to the housing location are relatively weak, which leads to higher commuting costs. Their real cost of living has not been effectively reduced. In this regard, 92 subsidized housing communities in Nanjing, China, are selected as the research sample in this paper. The residents of the affordable housing and their commuting Spatio-temporal behavior characteristics are identified based on the LBS (location-based service) data. Based on the spatial mismatch theory, spatial mismatch indicators such as commuting distance and commuting time are established to measure the spatial mismatch degree of subsidized housing in different districts of Nanjing. Furthermore, the geographically weighted regression model is used to analyze the influencing factors of the spatial mismatch of affordable housing in terms of the provision of employment opportunities, traffic accessibility and supporting service facilities by using spatial, functional and other multi-source Spatio-temporal big data. The results show that the spatial mismatch of affordable housing in Nanjing generally presents a "concentric circle" pattern of decreasing from the central urban area to the periphery. 
The factors affecting the spatial mismatch of affordable housing differ between spatial zones. The main factors are the number of enterprises within 1 km of the affordable housing district and the shortest distance to the subway station. Low spatial mismatch is due to the diversity of services and facilities. Based on this, a spatial optimization strategy for different levels of spatial mismatch in subsidized housing is proposed, and feasible suggestions for the later site selection of subsidized housing are provided. The aim is to avoid or mitigate the impact of "spatial mismatch," promote the "spatial adaptation" of "jobs-housing," and truly improve the overall welfare level of affordable housing residents.
Keywords: affordable housing, spatial mismatch, commuting characteristics, spatial adaptation, welfare benefits
Procedia PDF Downloads 108
3813 Multidimensional Poverty and Child Cognitive Development
Authors: Bidyadhar Dehury, Sanjay Kumar Mohanty
Abstract:
According to the Right to Education Act of India, education is the fundamental right of all children aged 6-14 years, irrespective of their status. Using unit-level data from the India Human Development Survey (IHDS), we tried to understand the inter-relationship between the level of poverty and the academic performance of children aged 8-11 years. The level of multidimensional poverty is measured using five dimensions and 10 indicators with the Alkire-Foster approach. The weighted deprivation score was obtained by giving equal weight to each dimension and to each indicator within a dimension. The weighted deprivation score varies from 0 to 1 and is grouped into four categories: non-poor, vulnerable, multidimensionally poor and severely multidimensionally poor. The academic performance index was constructed from three variables (reading skills, math skills and writing skills) using PCA. Bivariate and multivariate analyses were used. Because the outcome variable was ordinal, the predicted probabilities were calculated using ordinal logistic regression. The predicted probability of a good academic performance index was 0.202 if the child was severely multidimensionally poor, 0.235 if the child was multidimensionally poor, 0.264 if the child was vulnerable, and 0.316 if the child was non-poor. Hence, as the level of poverty among children decreases from severely multidimensionally poor to non-poor, the probability of good academic performance increases.
Keywords: multidimensional poverty, academic performance index, reading skills, math skills, writing skills, India
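The nested equal weighting described in this abstract can be sketched as follows. This is a minimal illustration, assuming each dimension receives weight 1/(number of dimensions) and that weight is split equally among the dimension's indicators. The dimension names and the 0/1 deprivation flags are hypothetical, and the cutoffs for the four poverty categories are not given in the abstract, so none are applied here.

```python
def deprivation_score(deprived):
    """Weighted deprivation score in [0, 1].

    `deprived` maps each dimension to a list of 0/1 flags, one per indicator
    (1 = the child is deprived in that indicator).
    """
    dim_weight = 1.0 / len(deprived)          # equal weight per dimension
    score = 0.0
    for flags in deprived.values():
        ind_weight = dim_weight / len(flags)  # equal weight within the dimension
        score += ind_weight * sum(flags)
    return score


# Hypothetical child: five dimensions, ten indicators in total.
child = {
    "dim_a": [1, 0],
    "dim_b": [0, 0],
    "dim_c": [1, 1, 1],
    "dim_d": [0, 1],
    "dim_e": [0],
}
score = deprivation_score(child)
print(round(score, 3))
```

A score of 0 means the child is deprived in no indicator and a score of 1 means deprivation in all of them, matching the 0 to 1 range stated above.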
Procedia PDF Downloads 593
3812 Empirical Modeling and Spatial Analysis of Heat-Related Morbidity in Maricopa County, Arizona
Authors: Chuyuan Wang, Nayan Khare, Lily Villa, Patricia Solis, Elizabeth A. Wentz
Abstract:
Maricopa County, Arizona, has a semi-arid, hot desert climate and is one of the hottest regions in the United States. The exacerbated urban heat island (UHI) effect caused by rapid urbanization has made the urban area even hotter than the rural surroundings. The Phoenix metropolitan area experiences extremely high temperatures in the summer, from June to September, with daily highs that can reach 120 °F (48.9 °C). Morbidity and mortality due to environmental heat are, therefore, a significant public health issue in Maricopa County, especially because they are largely preventable. Public records from the Maricopa County Department of Public Health (MCDPH) revealed that between 2012 and 2016 there were 10,825 heat-related morbidity incidents, 267 outdoor environmental heat deaths, and 173 indoor heat-related deaths. A lot of research has examined heat-related death and its contributing factors around the world, but little has been done regarding heat-related morbidity, especially for regions that are naturally hot in the summer. The objective of this study is to examine the demographic, socio-economic, housing, and environmental factors that contribute to heat-related morbidity in Maricopa County. We obtained heat-related morbidity data between 2012 and 2016 at the census tract level from MCDPH. Demographic, socio-economic, and housing variables were derived from the 2012-2016 American Community Survey 5-year estimates from the U.S. Census. Remotely sensed Landsat 7 ETM+ and Landsat 8 OLI satellite images and Level-1 products were acquired for all the summer months (June to September) from 2012 to 2016. The National Land Cover Database (NLCD) 2016 percent tree canopy and percent developed imperviousness data were obtained from the U.S. Geological Survey (USGS). We used ordinary least squares (OLS) regression analysis to examine the empirical relationship between all the independent variables and the heat-related morbidity rate.
Results showed that higher morbidity rates are found in census tracts with higher values of population aged 65 and older, population under poverty, disability, no vehicle ownership, white non-Hispanic population, population with less than a high school degree, land surface temperature, and surface reflectance, but lower values of normalized difference vegetation index (NDVI) and housing occupancy. The regression model explains up to 59.4% of the total variation of heat-related morbidity in Maricopa County. The multiscale geographically weighted regression (MGWR) technique was then used to examine the spatially varying relationships between the heat-related morbidity rate and all the significant independent variables. The R-squared value of the MGWR model increased to 0.691, which shows a significant improvement in goodness-of-fit over the global OLS model. This means that the spatial heterogeneity of some independent variables is another important factor that influences the relationship with heat-related morbidity in Maricopa County. Among these variables, population aged 65 and older, the Hispanic population, disability, vehicle ownership, and housing occupancy have much stronger local effects than other variables.
Keywords: census, empirical modeling, heat-related morbidity, spatial analysis
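To make the contrast between a global OLS fit and a geographically weighted fit concrete, here is a minimal sketch of basic geographically weighted regression with one predictor. It is not the MGWR procedure used in the study: it assumes a single fixed bandwidth and a Gaussian distance kernel, and the function name, coordinates, and data are all synthetic.

```python
import math


def gwr_local_fit(coords, x, y, target, bandwidth):
    """Fit y = a + b*x by weighted least squares centred on `target`.

    Observations are weighted by a Gaussian kernel of their distance to
    `target`, so nearby locations dominate the local estimate.
    Returns (intercept, slope).
    """
    w = [math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return my - slope * mx, slope


# Synthetic data: the x-y relationship is y = x in the west and y = 3x in the east.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]

_, west_slope = gwr_local_fit(coords, x, y, target=(1, 0), bandwidth=1.0)
_, east_slope = gwr_local_fit(coords, x, y, target=(11, 0), bandwidth=1.0)
# A very large bandwidth weights every observation almost equally (global fit).
_, global_slope = gwr_local_fit(coords, x, y, target=(6, 0), bandwidth=1e9)
print(west_slope, east_slope, global_slope)
```

The local fits recover the two regional slopes, whereas the near-global fit returns a single compromise slope. This is the kind of spatial heterogeneity that a local model can capture and a global OLS model cannot.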
Procedia PDF Downloads 126
3811 Real-Time Lane Marking Detection Using Weighted Filter
Authors: Ayhan Kucukmanisa, Orhan Akbulut, Oguzhan Urhan
Abstract:
Nowadays, advanced driver assistance systems (ADAS) have become popular, since they enable safe driving. Lane detection is a vital step for ADAS. The performance of the lane detection process is critical to obtaining a high-accuracy lane departure warning system (LDWS). Challenging factors such as road cracks, erosion of lane markings, and weather conditions might affect the performance of a lane detection system. In this paper, a 1-D weighted filter based on row filtering is proposed to detect lane markings. The 2-D input image is filtered by the 1-D weighted filter, which considers four pixel values located symmetrically around the center of the candidate pixel. Performance is evaluated with two metrics: true positive rate (TPR) and false positive rate (FPR). Experimental results demonstrate that the proposed approach provides better lane marking detection accuracy than previous methods while providing real-time processing performance.
Keywords: lane marking filter, lane detection, ADAS, LDWS
Procedia PDF Downloads 194
3810 Recursive Doubly Complementary Filter Design Using Particle Swarm Optimization
Authors: Ju-Hong Lee, Ding-Chen Chung
Abstract:
This paper deals with the optimal design of recursive doubly complementary (DC) digital filters using a metaheuristic-based optimization technique. Based on the theory of DC digital filters using two recursive digital all-pass filters (DAFs), the design problem is appropriately formulated to result in an objective function which is a weighted sum of the phase response errors of the designed DAFs. To deal with the stability of the recursive DC filters during the design process, we impose some necessary constraints on the phases of the recursive DAFs. Through frequency sampling and a weighted least squares approach, the optimization problem of the objective function can be solved by utilizing a population-based stochastic optimization approach. The resulting DC digital filters can possess satisfactory frequency response. Simulation results are presented for illustration and comparison.
Keywords: doubly complementary, digital all-pass filter, weighted least squares algorithm, particle swarm optimization
Procedia PDF Downloads 688
3809 The Customization of 3D Last Form Design Based on Weighted Blending
Authors: Shih-Wen Hsiao, Chu-Hsuan Lee, Rong-Qi Chen
Abstract:
The last is regarded as the critical foundation of shoe design and development. It not only determines the comfort of wearing shoes but also supports shoe styling and manufacturing. In order to enhance the efficiency and application of last development, a computer-aided methodology for customized last form designs is proposed in this study. Reverse engineering is mainly applied to the process of scanning the last form. Minimum energy is then used to revise surface continuity, and the surface of the last is reconstructed with the feature curves of the scanned last. Once the surface of a last is reconstructed, based on the proposed last form reconstruction module, the weighted arithmetic mean method is applied to the calculation of the shape morphing, which differs from the grading for the control mesh of the last, and a subdivision algorithm is used to create the surface of the last mesh. In this way, feet-fitting 3D last forms of different sizes are generated from the original form, with its features and functions retained. Finally, the practicability of the proposed methodology is verified through case studies.
Keywords: 3D last design, customization, reverse engineering, weighted morphing, shape blending
Procedia PDF Downloads 339
3808 Model Averaging for Poisson Regression
Authors: Zhou Jianhong
Abstract:
Model averaging is a desirable approach to deal with model uncertainty, which, however, has rarely been explored for Poisson regression. In this paper, we propose a model averaging procedure based on an unbiased estimator of the expected Kullback-Leibler distance for Poisson regression. A simulation study shows that the proposed model average estimator outperforms some other commonly used model selection and model average estimators in some situations. Our proposed method is further applied to a real data example, and its advantage is demonstrated again.
Keywords: model averaging, Poisson regression, Kullback-Leibler distance, statistics
Procedia PDF Downloads 520
3807 Establishment of the Regression Uncertainty of the Critical Heat Flux Power Correlation for an Advanced Fuel Bundle
Authors: L. Q. Yuan, J. Yang, A. Siddiqui
Abstract:
A new regression uncertainty analysis methodology was applied to determine the uncertainties of the critical heat flux (CHF) power correlation for an advanced 43-element bundle design, which was developed by Canadian Nuclear Laboratories (CNL) to achieve improved economics, resource utilization and energy sustainability. The new methodology is considered more appropriate than the traditional methodology in the assessment of the experimental uncertainty associated with regressions. The methodology was first assessed using both the Monte Carlo Method (MCM) and the Taylor Series Method (TSM) for a simple linear regression model, and then extended successfully to a non-linear CHF power regression model (CHF power as a function of inlet temperature, outlet pressure and mass flow rate). The regression uncertainty assessed by MCM agrees well with that by TSM. An equation to evaluate the CHF power regression uncertainty was developed and expressed as a function of independent variables that determine the CHF power.
Keywords: CHF experiment, CHF correlation, regression uncertainty, Monte Carlo Method, Taylor Series Method
Procedia PDF Downloads 416
3806 Quadrature Mirror Filter Bank Design Using Population Based Stochastic Optimization
Authors: Ju-Hong Lee, Ding-Chen Chung
Abstract:
The paper deals with the optimal design of two-channel linear-phase (LP) quadrature mirror filter (QMF) banks using a metaheuristic based optimization technique. Based on the theory of two-channel QMF banks using two recursive digital all-pass filters (DAFs), the design problem is appropriately formulated to result in an objective function which is a weighted sum of the group delay error of the designed QMF bank and the magnitude response error of the designed low-pass analysis filter. Through a frequency sampling and a weighted least squares approach, the optimization problem of the objective function can be solved by utilizing a particle swarm optimization algorithm. The resulting two-channel QMF banks can possess approximately LP response without magnitude distortion. Simulation results are presented for illustration and comparison.
Keywords: quadrature mirror filter bank, digital all-pass filter, weighted least squares algorithm, particle swarm optimization
Procedia PDF Downloads 521
3805 Use of Multistage Transition Regression Models for Credit Card Income Prediction
Authors: Denys Osipenko, Jonathan Crook
Abstract:
Because of the variety of card holders’ behaviour types and income sources, each consumer account can move between a variety of states. An account can be inactive, transactor, revolver, delinquent, or defaulted, and each state requires an individual model for income prediction. Estimating transition probabilities between states at the account level helps to avoid the memorylessness of the Markov chain approach. This paper investigates approaches to estimating transition probabilities for credit card income prediction at the account level. The key question of the empirical research is which approach gives more accurate results: multinomial logistic regression or multistage conditional logistic regression with a binary target. Both models have shown moderate predictive power. Prediction accuracy for conditional logistic regression depends on the order of the stages for the conditional binary logistic regression. On the other hand, multinomial logistic regression is easier to use and gives integrated estimates for all states without priorities. Further investigations can therefore concentrate on alternative modeling approaches such as discrete choice models.
Keywords: multinomial regression, conditional logistic regression, credit account state, transition probability
Procedia PDF Downloads 487
3804 Semiparametric Regression Of Truncated Spline Biresponse On Farmer Loyalty And Attachment Modeling
Authors: Adji Achmad Rinaldo Fernandes
Abstract:
Regression analysis is a statistical method that is able to describe and predict causal relationships between variables. Not all relationships have a known curve shape; often, the shape of the relationship curve cannot be known in advance. In addition, one cause can affect more than one effect, and those effects can themselves be closely related. Such relationships can be modeled with semiparametric truncated spline biresponse regression. The purpose of this study is to examine the function estimator and determine the best truncated spline biresponse semiparametric regression model. The results of the study on secondary data showed that the best model is of at most quadratic order with a maximum of two knots, with a goodness of fit, measured by adjusted R2, of 88.5%.
Keywords: biresponse, farmer attachment, farmer loyalty, truncated spline
Procedia PDF Downloads 36
3803 Internet Purchases in European Union Countries: Multiple Linear Regression Approach
Authors: Ksenija Dumičić, Anita Čeh Časni, Irena Palić
Abstract:
This paper examines the influence of economic and Information and Communication Technology (ICT) development on the recent increase in Internet purchases by individuals in European Union member states. After a growing trend in Internet purchases in the EU27 was noticed, all possible regressions analysis was applied using nine independent variables for 2011. Finally, two linear regression models were studied in detail. The simple linear regression analysis confirmed the research hypothesis that Internet purchases in the analysed EU countries are positively correlated with the statistically significant variable Gross Domestic Product per capita (GDPpc). In addition, the analysed multiple linear regression model with four regressors reflecting the level of ICT development indicates that ICT development is crucial for explaining Internet purchases by individuals, confirming the research hypothesis.
Keywords: European union, Internet purchases, multiple linear regression model, outlier
Procedia PDF Downloads 302
3802 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Models
Authors: Alam Ali, Ashok Kumar Pathak
Abstract:
Path analysis is a statistical technique used to evaluate the direct and indirect effects of variables in path models. One or more structural regression equations are used to estimate a series of parameters in path models to find the better fit of data. However, sometimes the assumptions of classical regression models, such as ordinary least squares (OLS), are violated by the nature of the data, resulting in insignificant direct and indirect effects of exogenous variables. This article aims to explore the effectiveness of a copula-based regression approach as an alternative to classical regression, specifically when variables are linked through an elliptical copula.
Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique
Procedia PDF Downloads 41
3801 Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression
Authors: Galal Elkobrosy, Amr M. Abdelrazek, Bassuny M. Elsouhily, Mohamed E. Khidr
Abstract:
Crank shaft length, connecting rod length, crank angle, engine rpm, cylinder bore, mass of piston and compression ratio are the inputs that control the performance of the slider crank mechanism and hence its efficiency. Several combinations of these seven inputs are used and compared. The throughput engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve the system of algebraic equations. These models are validated. A regression model in seven inputs including their interaction terms lowered the polynomial degree from 3rd degree to 1st degree and suggested valid predictions and stable explanations.
Keywords: design of experiments, regression analysis, SI engine, statistical modeling
Procedia PDF Downloads 186
3800 Optimal Control of DC Motor Using Linear Quadratic Regulator
Authors: Meetty Tomy, Arxhana G Thosar
Abstract:
This paper presents the implementation of optimal control for an armature-controlled DC motor. The selection of the error weighting matrix and the control weighting matrix used to implement optimal control theory and improve the dynamic behavior of the DC motor is presented. The closed-loop performance of the armature-controlled DC motor with the derived linear optimal controller is then evaluated for the transient operating condition (starting). The result obtained from MATLAB is compared with that of a PID controller and with the simple closed-loop response of the motor.
Keywords: optimal control, DC motor, performance index, MATLAB
Procedia PDF Downloads 410
3799 Seismic Perimeter Surveillance System (Virtual Fence) for Threat Detection and Characterization Using Multiple ML Based Trained Models in Weighted Ensemble Voting
Authors: Vivek Mahadev, Manoj Kumar, Neelu Mathur, Brahm Dutt Pandey
Abstract:
Perimeter guarding and protection of critical installations require prompt intrusion detection and assessment to take effective countermeasures. Currently, visual and electronic surveillance are the primary methods used for perimeter guarding. These methods can be costly and complicated, requiring careful planning according to the location and terrain. Moreover, these methods often struggle to detect stealthy and camouflaged insurgents. The object of the present work is to devise a surveillance technique using seismic sensors that overcomes the limitations of existing systems. The aim is to improve intrusion detection, assessment, and characterization by utilizing seismic sensors. Most of the similar systems have only two types of intrusion detection capability viz., human or vehicle. In our work we could even categorize further to identify types of intrusion activity such as walking, running, group walking, fence jumping, tunnel digging and vehicular movements. A virtual fence of 60 meters at GCNEP, Bahadurgarh, Haryana, India, was created by installing four underground geophones at a distance of 15 meters each. The signals received from these geophones are then processed to find unique seismic signatures called features. Various feature optimization and selection methodologies, such as LightGBM, Boruta, Random Forest, Logistics, Recursive Feature Elimination, Chi-2 and Pearson Ratio were used to identify the best features for training the machine learning models. The trained models were developed using algorithms such as supervised support vector machine (SVM) classifier, kNN, Decision Tree, Logistic Regression, Naïve Bayes, and Artificial Neural Networks. These models were then used to predict the category of events, employing weighted ensemble voting to analyze and combine their results. The models were trained with 1940 training events and results were evaluated with 831 test events. 
It was observed that using the weighted ensemble voting increased the efficiency of predictions. In this study we successfully developed and deployed the virtual fence using geophones. Since these sensors are passive, do not radiate any energy and are installed underground, it is impossible for intruders to locate and nullify them. Their flexibility, quick and easy installation, low costs, hidden deployment and unattended surveillance make such systems especially suitable for critical installations and remote facilities with difficult terrain. This work demonstrates the potential of utilizing seismic sensors for creating better perimeter guarding and protection systems using multiple machine learning models in weighted ensemble voting. In this study the virtual fence achieved an intruder detection efficiency of over 97%.
Keywords: geophone, seismic perimeter surveillance, machine learning, weighted ensemble method
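The weighted ensemble voting step mentioned in this abstract can be sketched as below. The abstract does not say how the weights were chosen, so the weights here are hypothetical (for example, they could be each model's validation accuracy). The model names and predicted labels are likewise made up for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label whose supporting models have the largest total weight.

    `predictions` maps model name -> predicted label.
    `weights` maps model name -> non-negative weight (hypothetical values).
    """
    totals = defaultdict(float)
    for model, label in predictions.items():
        totals[label] += weights[model]
    return max(totals, key=totals.get)


# Hypothetical outputs of three trained classifiers for one seismic event.
predictions = {"svm": "tunnel digging", "knn": "walking", "naive_bayes": "walking"}
weights = {"svm": 0.9, "knn": 0.4, "naive_bayes": 0.4}

decision = weighted_vote(predictions, weights)
print(decision)
```

In this made-up case a simple majority vote would pick "walking", while the weighted vote follows the single model with the highest weight. This shows how weighting can change the ensemble's decision.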
Procedia PDF Downloads 78
3798 An Epsilon Hierarchical Fuzzy Twin Support Vector Regression
Authors: Arindam Chaudhuri
Abstract:
The research presents epsilon-hierarchical fuzzy twin support vector regression (epsilon-HFTSVR) based on epsilon-fuzzy twin support vector regression (epsilon-FTSVR) and epsilon-twin support vector regression (epsilon-TSVR). Epsilon-FTSVR is achieved by incorporating trapezoidal fuzzy numbers into epsilon-TSVR, which takes care of the uncertainty existing in forecasting problems. Epsilon-FTSVR determines a pair of epsilon-insensitive proximal functions by solving two related quadratic programming problems. The structural risk minimization principle is implemented by introducing a regularization term in the primal problems of epsilon-FTSVR. This yields stable positive definite dual problems, which improves regression performance. Epsilon-FTSVR is then reformulated as epsilon-HFTSVR, consisting of a set of hierarchical layers each containing epsilon-FTSVR. Experimental results on both synthetic and real datasets reveal that epsilon-HFTSVR has remarkable generalization performance with minimum training time.
Keywords: regression, epsilon-TSVR, epsilon-FTSVR, epsilon-HFTSVR
Procedia PDF Downloads 375
3797 Weighted Data Replication Strategy for Data Grid Considering Economic Approach
Authors: N. Mansouri, A. Asadi
Abstract:
Data Grid is a geographically distributed environment that deals with data-intensive applications in scientific and enterprise computing. Data replication is a common method used to achieve efficient and fault-tolerant data access in Grids. In this paper, a dynamic data replication strategy called Enhanced Latest Access Largest Weight (ELALW) is proposed. This strategy is an enhanced version of the Latest Access Largest Weight strategy. However, replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replica replacement task. ELALW replaces replicas based on the number of future requests, the size of the replica, and the number of copies of the file. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection chooses the best replica location from among the many replicas based on response time, which is determined by considering the data transfer time, the storage access latency, the replica requests waiting in the storage queue, and the distance between nodes. Simulation results using OptorSim show that our replication strategy achieves better overall performance than other strategies in terms of job execution time, effective network usage and storage resource usage.
Keywords: data grid, data replication, simulation, replica selection, replica placement
Procedia PDF Downloads 260
3796 Regression Model Evaluation on Depth Camera Data for Gaze Estimation
Authors: James Purnama, Riri Fitri Sari
Abstract:
We investigate the machine learning algorithm selection problem in the context of depth-image-based eye gaze estimation, with respect to its essential difficulty in reducing the number of required training samples and the duration of training. Statistics-based prediction accuracy measures are increasingly used to assess and evaluate prediction or estimation in gaze estimation. This article uses Root Mean Squared Error (RMSE) and R-Squared to assess machine learning methods on depth camera data for gaze estimation. Four machine learning methods were evaluated: Random Forest Regression, Regression Tree, Support Vector Machine (SVM), and Linear Regression. The experimental results show that Random Forest Regression has the lowest RMSE and the highest R-Squared, which means that it is the best among the evaluated methods.
Keywords: gaze estimation, gaze tracking, eye tracking, kinect, regression model, orange python
Procedia PDF Downloads 538
3795 Multi-Criteria Decision Approach to Performance Measurement Techniques Data Envelopment Analysis: Case Study of Kerman City’s Parks
Authors: Ali A. Abdollahi
Abstract:
During the last several decades, scientists have consistently applied Multiple Criteria Decision-Making methods in making decisions about multi-faceted, complicated subjects. In making such decisions, and in order to achieve more accurate evaluations, they have regularly used a variety of criteria instead of applying just one Optimum Evaluation Criterion. The method presented here utilizes both ‘quantity’ and ‘quality’ to assess the function of the Multiple-Criteria method. Applying the Data Envelopment Analysis (DEA), Weighted Aggregated Sum Product Assessment (WASPAS), Weighted Sum Approach (WSA), Analytic Network Process (ANP), and Charnes, Cooper, Rhodes (CCR) methods, we have analyzed thirteen parks in Kerman city. The analysis indicates that the functions of WASPAS and WSA are compatible with each other, but that their deviation from DEA is extensive. Furthermore, the results for the CCR technique do not match the results of the DEA technique. Finally, our study indicates that the ANP method, with an average rate of 1/51, ranks closest to the DEA method, which has an average rate of 1/49.
Keywords: multiple criteria decision making, Data envelopment analysis (DEA), Charnes Cooper Rhodes (CCR), Weighted Sum Approach (WSA)
Procedia PDF Downloads 218
3794 Generalized Extreme Value Regression with Binary Dependent Variable: An Application for Predicting Meteorological Drought Probabilities
Authors: Retius Chifurira
Abstract:
The logistic regression model is the most widely used regression model for predicting meteorological drought probabilities. When the dependent variable is extreme, the logistic model fails to adequately capture drought probabilities. In order to adequately predict drought probabilities, we use the generalized linear model (GLM) with the quantile function of the generalized extreme value distribution (GEVD) as the link function. Maximum likelihood estimation is used to estimate the parameters of the generalized extreme value (GEV) regression model. We compare the performance of the logistic and the GEV regression models in predicting drought probabilities for Zimbabwe. The performance of the regression models is assessed using goodness-of-fit measures, namely relative root mean square error (RRMSE) and relative mean absolute error (RMAE). Results show that the GEV regression model performs better than the logistic model, thereby providing a good alternative candidate for predicting drought probabilities. This paper provides the first application of a GLM derived from extreme value theory to predict drought probabilities for a drought-prone country such as Zimbabwe.
Keywords: generalized extreme value distribution, general linear model, mean annual rainfall, meteorological drought probabilities
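The idea of using the GEV quantile function as a link can be sketched as below. This is an illustration only: it assumes the standard GEV with location 0, scale 1 and a nonzero shape parameter `xi`, and the function names are hypothetical. The abstract does not state the exact parameterization used in the paper, so it may differ.

```python
import math


def gev_link(p, xi):
    """Link: quantile function of the standard GEV, mapping a probability
    p in (0, 1) to the linear predictor scale. Assumes xi != 0."""
    return ((-math.log(p)) ** (-xi) - 1.0) / xi


def gev_inverse_link(eta, xi):
    """Inverse link: standard GEV distribution function, mapping a linear
    predictor eta back to a probability. Requires 1 + xi * eta > 0."""
    base = 1.0 + xi * eta
    if base <= 0.0:
        raise ValueError("linear predictor is outside the GEV support")
    return math.exp(-(base ** (-1.0 / xi)))


# Round trip with an illustrative shape parameter (not a fitted value).
xi = -0.25
eta = gev_link(0.3, xi)
recovered = gev_inverse_link(eta, xi)
```

Unlike the logit, this link is asymmetric, which is what lets the fitted curve approach 0 and 1 at different rates when the event of interest is rare.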
Procedia PDF Downloads 200
3793 Cooperative Cross Layer Topology for Concurrent Transmission Scheduling Scheme in Broadband Wireless Networks
Authors: Gunasekaran Raja, Ramkumar Jayaraman
Abstract:
In this paper, we consider the CCL-N (Cooperative Cross Layer Network) topology, which is based on the cross layer (both centralized and distributed) environment, to form network communities. Various performance metrics related to IEEE 802.16 networks are discussed in order to design the CCL-N topology. In the CCL-N topology, nodes are classified as master nodes (Master Base Station [MBS]) and serving nodes (Relay Station [RS]). Node communities are organized based on the networking terminologies. Based on the CCL-N topology, various simulation analyses for both transparent and non-transparent relays are tabulated and throughput efficiency is calculated. The weighted load balancing problem is a challenge in IEEE 802.16 networks. The CoTS (Concurrent Transmission Scheduling) scheme is formulated in terms of three aspects: transmission mechanisms based on identical communities, different communities, and identical node communities. The CoTS scheme helps in identifying the weighted load balancing problem. Based on the analytical results, the modularity value is inversely proportional to the error value. The modularity value plays a key role in solving the CoTS problem based on hop count. The transmission mechanism for the identical node community has no impact, since the modularity value is the same for all the network groups. This paper discusses three aspects of communities based on the modularity value, which help in solving the weighted load balancing and CoTS problems.
Keywords: cross layer network topology, concurrent scheduling, modularity value, network communities and weighted load balancing
Procedia PDF Downloads 265
3792 The Spatial Analysis of Wetland Ecosystem Services Valuation on Flood Protection in Tone River Basin
Authors: Tingting Song
Abstract:
Wetlands are significant ecosystems that provide a variety of ecosystem services for humans, such as providing water and food resources, purifying water, regulating climate, protecting biodiversity, and providing cultural, recreational, and educational resources. Wetlands also provide benefits such as the reduction of floods, storm damage, and soil erosion. The flood protection ecosystem services of wetlands are often ignored. Due to climate change, floods caused by extreme weather have occurred frequently in recent years. Floods have a great impact on people's production and daily life, with growing economic losses. The study area is the Tone River Basin in the Kanto area of Japan. The Tone is the second-longest river in Japan and has the largest basin area, and the basin still suffers heavy economic losses from floods. The Tone River is one of the rivers that supply water to Tokyo and has an important impact on economic activities in Japan. The purpose of this study was to investigate land-use changes of wetlands in the Tone River Basin and whether there are spatial differences in the value of wetland functions in mitigating economic losses caused by floods. The study analyzed the land-use change of wetlands in the Tone River Basin based on Landsat data from 1980 to 2020. Combining flood economic loss, wetland area, GDP, population density, and other socio-economic data, a geospatial weighted regression model was constructed to analyze the spatial differences in wetland ecosystem service value. At present, flood protection relies mainly on hard engineering projects such as dams and reservoirs, but excessive dependence on hard engineering places huge financial pressure on the government and has a large impact on the ecological environment. Natural wetlands, however, can also play a role in flood management while providing diverse ecosystem services. Moreover, the construction and maintenance cost of natural wetlands is lower than that of hard engineering, although it is not easy to say which is more effective in terms of flood management. When the marginal value of a wetland is greater than the economic loss caused by flood per unit area, relying on the flood storage capacity of the wetland to reduce the impact of floods may be considered. This can promote the sustainable development of wetland ecosystems. In addition, spatial analysis of wetland values can provide a more effective strategy for flood management in the Tone River Basin.
Keywords: wetland, geospatial weighted regression, ecosystem services, environment valuation
Procedia PDF Downloads 101
3791 Glushkov's Construction for Functional Subsequential Transducers
Authors: Aleksander Mendoza
Abstract:
Glushkov's construction has many interesting properties, and they become even more evident when applied to transducers. This article strives to show the vast range of possible extensions and optimisations for this algorithm. A special flavour of regular expressions is introduced, which can be efficiently converted to ε-free functional subsequential weighted finite state transducers. The produced automata are very compact, as they contain only one state for each symbol (from the input alphabet) of the original expression and only one transition for each range of symbols, no matter how large. Such compacted ranges of transitions allow for efficient binary search lookup during automaton evaluation. All the methods and algorithms presented here were used to implement an open-source compiler of regular expressions for multitape transducers.
Keywords: weighted automata, transducers, Glushkov, follow automata, regular expressions
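The binary search lookup over ranged transitions mentioned in this abstract can be illustrated with a small sketch. The class name, the tuple layout, and the integer state labels are hypothetical, and the compiler described in the paper may store its transitions differently.

```python
from bisect import bisect_right


class RangeTransitions:
    """Outgoing transitions of one state, keyed by inclusive symbol ranges."""

    def __init__(self, ranges):
        # ranges: (low_code_point, high_code_point, target_state) tuples,
        # assumed to be non-overlapping.
        self._ranges = sorted(ranges)
        self._starts = [low for low, _, _ in self._ranges]

    def lookup(self, symbol):
        """Return the target state for `symbol`, or None if no range matches."""
        code = ord(symbol)
        # Index of the last range whose lower bound is <= code.
        i = bisect_right(self._starts, code) - 1
        if i >= 0:
            _, high, target = self._ranges[i]
            if code <= high:
                return target
        return None


# One transition covers a whole range, however many symbols it spans.
state = RangeTransitions([(ord("a"), ord("z"), 1), (ord("0"), ord("9"), 2)])
```

Lookup cost grows with the logarithm of the number of ranges rather than with the number of symbols they cover.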
Procedia PDF Downloads 162
3790 The Extended Skew Gaussian Process for Regression
Authors: M. T. Alodat
Abstract:
In this paper, we propose a generalization of the Gaussian process regression (GPR) model called the extended skew Gaussian process for regression (ESGPR) model. The ESGPR model works better than the GPR model when the errors are skewed. We derive the predictive distribution for the ESGPR model at a new input. We also apply the ESGPR model to FOREX data and find that it fits the data better than the GPR model.
Keywords: extended skew normal distribution, Gaussian process for regression, predictive distribution, ESGPr model
Procedia PDF Downloads 553
3789 Integrated Nested Laplace Approximations For Quantile Regression
Authors: Kajingulu Malandala, Ranganai Edmore
Abstract:
The asymmetric Laplace distribution (ALD) is commonly used as the likelihood function in Bayesian quantile regression, and it offers different families of likelihood methods for quantile regression. Notwithstanding its popularity and practicality, the ALD is not smooth, which makes its likelihood difficult to maximize. Furthermore, Bayesian inference is time consuming, and the choice of likelihood may mislead the inference, as Bayes' theorem does not automatically establish the posterior inference. In addition, the ALD does not account for greater skewness and kurtosis. This paper develops a new quantile regression approach for count data based on the inverse of the cumulative distribution function of the Poisson, binomial and Delaporte distributions using integrated nested Laplace approximations. Our results validate the benefit of using integrated nested Laplace approximations and support the approach for count data.
Keywords: quantile regression, Delaporte distribution, count data, integrated nested Laplace approximation
Procedia PDF Downloads 163
3788 Bimodal Biometrics System Using Fusion of Iris and Fingerprint
Authors: Attallah Bilal, Hendel Fatiha
Abstract:
This paper proposes a bimodal biometric system for identity verification that fuses iris and fingerprint at the matching score level using a weighted sum of scores technique. Features are extracted from the preprocessed images of the iris and the fingerprint. The features of a query image are compared with those of a database image to obtain matching scores. The individual scores generated by matching are passed to the fusion module. This module consists of three major steps: normalization, generation of similarity scores, and fusion of weighted scores. The final score is then used to declare the person genuine or an impostor. The system is tested on the CASIA database and gives an overall accuracy of 91.04%, with a FAR of 2.58% and an FRR of 8.34%.
Keywords: iris, fingerprint, sum rule, fusion
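The three fusion steps listed in this abstract can be sketched roughly as follows. Everything specific here is an assumption made for illustration: min-max normalization, the conversion of a distance into a similarity as one minus the normalized distance, equal weights, and a decision threshold of 0.5. The abstract does not state which normalization, weights, or threshold the authors used.

```python
def min_max_normalize(score, lowest, highest):
    """Step 1: map a raw matcher score onto [0, 1] (assumed min-max scheme)."""
    return (score - lowest) / (highest - lowest)


def to_similarity(normalized_distance):
    """Step 2: turn a normalized distance into a similarity (assumption)."""
    return 1.0 - normalized_distance


def fuse_scores(iris_similarity, fingerprint_similarity, iris_weight=0.5):
    """Step 3: weighted sum of the two similarity scores; weights sum to 1."""
    return (iris_weight * iris_similarity
            + (1.0 - iris_weight) * fingerprint_similarity)


def decide(fused_score, threshold=0.5):
    """Accept as genuine when the fused score reaches the threshold."""
    return "genuine" if fused_score >= threshold else "impostor"


# Hypothetical raw scores: an iris distance and a fingerprint match score.
iris = to_similarity(min_max_normalize(0.2, 0.0, 1.0))
fingerprint = min_max_normalize(75.0, 50.0, 100.0)
fused = fuse_scores(iris, fingerprint)
```

Raising the threshold trades a lower false acceptance rate (FAR) for a higher false rejection rate (FRR), and lowering it does the opposite.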
Procedia PDF Downloads 368