Search results for: nonparametric geographically weighted regression


3689 Integrating Process Planning, WMS Dispatching, and WPPW Weighted Due Date Assignment Using a Genetic Algorithm

Authors: Halil Ibrahim Demir, Tarık Cakar, Ibrahim Cil, Muharrem Dugenci, Caner Erden

Abstract:

Conventionally, process planning, scheduling, and due-date assignment functions are performed separately and sequentially. The interdependence of these functions requires integration. Although integrated process planning and scheduling, and scheduling with due-date assignment, are popular research topics, only a few works address the integration of these three functions. This work focuses on the integration of process planning, WMS scheduling, and WPPW due-date assignment. Another novelty of this work is the use of weighted due-date assignment: in the literature, due dates are generally assigned without considering the importance of customers, whereas in this study more important customers get closer due dates. Typically, only tardiness is punished, but the JIT philosophy punishes both earliness and tardiness; here, all weighted earliness, tardiness, and due-date-related costs are penalized. Since no customer desires a distant due date, distant due dates are also penalized. Various levels of integration of the three functions are tested, and genetic search and random search are compared both with each other and with ordinary solutions. Higher integration levels proved superior, search always helped, and genetic search outperformed random search.
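
The cost structure described here, weighted earliness, tardiness, and due-date penalties, can be written compactly. The sketch below is a minimal illustration; all field names and penalty rates are hypothetical rather than taken from the paper.

```python
def wppw_cost(jobs):
    """Total weighted earliness/tardiness/due-date penalty over all jobs.

    Each job dict holds completion time C, assigned due date d, customer
    weight w, and per-unit penalty rates pe, pt, pd (names hypothetical).
    """
    total = 0.0
    for j in jobs:
        earliness = max(j["d"] - j["C"], 0)
        tardiness = max(j["C"] - j["d"], 0)
        # important customers (large w) pay more for any deviation, and
        # distant due dates are penalized through the pd * d term
        total += j["w"] * (j["pe"] * earliness + j["pt"] * tardiness
                           + j["pd"] * j["d"])
    return total

jobs = [{"C": 12, "d": 10, "w": 3.0, "pe": 1.0, "pt": 2.0, "pd": 0.1},
        {"C": 7, "d": 9, "w": 1.0, "pe": 1.0, "pt": 2.0, "pd": 0.1}]
print(wppw_cost(jobs))   # 3*(2*2 + 0.1*10) + 1*(1*2 + 0.1*9) = 17.9
```

A genetic or random search would then minimize this objective over the joint space of process plans, schedules, and due dates.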

Keywords: process planning, weighted scheduling, weighted due-date assignment, genetic algorithm, random search

Procedia PDF Downloads 352
3688 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

Introduction: Unbalanced data sets commonly arise in real-world applications. Because of unequal class distributions, many studies have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors nonparametric discriminant analysis is one method proposed for classifying unbalanced classes with good performance. Hence, the methods of discriminant analysis are of interest in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance of parametric and nonparametric discriminant analysis in a three-class application with class-imbalanced data of diabetes risk groups. Methods: Data on 599 staff members from a health project at a government hospital in Bangkok were obtained for the classification problem. The staff were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, with the variables diabetes risk group, age, gender, cholesterol, and BMI, were analyzed and bootstrapped into 50 and 100 samples of 599 observations each for additional estimation of the misclassification error rate. Each data set was examined for departure from multivariate normality and for equality of the covariance matrices of the three risk groups; both the original data and the bootstrap samples showed non-normality and unequal covariance matrices. The parametric linear discriminant function, the quadratic discriminant function, and the nonparametric k-nearest neighbors discriminant function were fitted on the 50 and 100 bootstrap samples and applied to the original data. In finding the optimal classification rule, prior probabilities were set both to equal proportions (0.33:0.33:0.33) and to unequal proportions of (0.90:0.05:0.05), (0.80:0.10:0.10), or (0.70:0.15:0.15). Results: The results from the 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k = 3 or k = 4 and prior probabilities {non-risk:risk:diabetic} of {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest misclassification error rate. Conclusion: The k-nearest neighbors approach is suggested for classifying three-class-imbalanced data of diabetes risk groups.
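
The three classifiers compared here are available off the shelf; a minimal sketch with scikit-learn, using placeholder data in place of the hospital records (only LDA/QDA accept explicit priors, kNN does not):

```python
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.neighbors import KNeighborsClassifier

# X: age, gender, cholesterol, BMI; y: 0=non-risk, 1=risk, 2=diabetic
rng = np.random.default_rng(0)
X = rng.normal(size=(599, 4))                      # placeholder for real data
y = rng.choice([0, 1, 2], size=599, p=[0.90, 0.05, 0.05])

priors = [0.80, 0.10, 0.10]                        # one of the studied choices
for clf in (LinearDiscriminantAnalysis(priors=priors),
            QuadraticDiscriminantAnalysis(priors=priors),
            KNeighborsClassifier(n_neighbors=3)):  # kNN has no priors argument
    print(type(clf).__name__, clf.fit(X, y).score(X, y))
```

In practice the error rates would be estimated on bootstrap resamples, as the study does, rather than on the training data.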

Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors

Procedia PDF Downloads 409
3687 A Learning-Based EM Mixture Regression Algorithm

Authors: Yi-Cheng Tian, Miin-Shen Yang

Abstract:

The mixture likelihood approach to clustering is a popular clustering method, and the expectation-maximization (EM) algorithm is the most widely used mixture likelihood method. In the literature, the EM algorithm has been used for mixture regression models. However, these EM mixture regression algorithms are sensitive to initial values and require the number of clusters a priori. In this paper, to resolve these drawbacks, we construct a learning-based scheme for the EM mixture regression algorithm such that it is free of initialization and can automatically obtain an approximately optimal number of clusters. Numerical examples and comparisons demonstrate the superiority and usefulness of the proposed learning-based EM mixture regression algorithm.
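
As background, a classical EM for a mixture of linear regressions, the baseline the proposed learning-based scheme extends, might look like the sketch below. Note that it still requires a fixed k and a random initialization, which is exactly the drawback the paper addresses.

```python
import numpy as np
from scipy.stats import norm

def em_mixture_regression(X, y, k=2, n_iter=200, seed=0):
    """Classical EM for a k-component mixture of linear regressions."""
    rng = np.random.default_rng(seed)
    n = len(y)
    Xb = np.hstack([np.ones((n, 1)), X])              # add intercept column
    beta = rng.normal(size=(k, Xb.shape[1]))          # random initialization
    sigma, pi = np.ones(k), np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        dens = np.stack([pi[j] * norm.pdf(y, Xb @ beta[j], sigma[j])
                         for j in range(k)], axis=1)
        dens = np.clip(dens, 1e-300, None)            # avoid 0/0
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted least squares per component
        for j in range(k):
            w = r[:, j]
            beta[j] = np.linalg.solve(Xb.T @ (Xb * w[:, None]), Xb.T @ (w * y))
            resid = y - Xb @ beta[j]
            sigma[j] = np.sqrt((w * resid ** 2).sum() / w.sum())
        pi = r.mean(axis=0)
    return beta, sigma, pi, r

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (300, 1))
comp = np.repeat([0, 1], 150)
y = np.where(comp == 0, 1 + 2 * X[:, 0], 4 - 3 * X[:, 0]) + rng.normal(0, 0.1, 300)
beta, *_ = em_mixture_regression(X, y, k=2)
print(beta)   # near (1, 2) and (4, -3); success depends on the random init
```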

Keywords: clustering, EM algorithm, Gaussian mixture model, mixture regression model

Procedia PDF Downloads 478
3686 Solving Weighted Number of Operation Plus Processing Time Due-Date Assignment, Weighted Scheduling and Process Planning Integration Problem Using Genetic and Simulated Annealing Search Methods

Authors: Halil Ibrahim Demir, Caner Erden, Mumtaz Ipek, Ozer Uygun

Abstract:

Traditionally, the three important manufacturing functions of process planning, scheduling, and due-date assignment are performed separately and sequentially. Over the past couple of decades, hundreds of studies have addressed integrated process planning and scheduling, and numerous works have examined scheduling with due-date assignment, but the integration of all three functions has not been adequately addressed. Here, this integration is studied using genetic, random-genetic hybrid, simulated annealing, random-simulated annealing hybrid, and random search techniques. The importance of integrating the three functions and the power of metaheuristics and hybrid heuristics are also examined.
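
The simulated annealing component rests on the standard Metropolis accept/reject rule. A generic skeleton, with problem-specific neighbor and cost functions assumed as inputs:

```python
import math
import random

def simulated_annealing(init, neighbor, cost, t0=100.0, alpha=0.95, iters=5000):
    """Generic SA: accept a worse solution with probability exp(-delta/T)."""
    cur, cur_cost = init, cost(init)
    best, best_cost = cur, cur_cost
    t = t0
    for _ in range(iters):
        cand = neighbor(cur)
        delta = cost(cand) - cur_cost
        if delta <= 0 or random.random() < math.exp(-delta / t):
            cur, cur_cost = cand, cur_cost + delta
        if cur_cost < best_cost:
            best, best_cost = cur, cur_cost
        t *= alpha                                 # geometric cooling schedule
    return best, best_cost

# toy usage: minimize a 1-D quadratic
best, c = simulated_annealing(init=0.0,
                              neighbor=lambda s: s + random.uniform(-1, 1),
                              cost=lambda s: (s - 3.0) ** 2)
print(best, c)                                     # near s = 3
```

For the integrated problem, a solution would encode a process plan choice, a dispatching rule, and due-date parameters per job, with the weighted cost as the objective.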

Keywords: process planning, weighted scheduling, weighted due-date assignment, genetic search, simulated annealing, hybrid meta-heuristics

Procedia PDF Downloads 449
3685 Bag of Words Representation Based on Weighting Useful Visual Words

Authors: Fatma Abdedayem

Abstract:

The most effective and efficient methods in image categorization are largely based on the bag-of-words (BOW) model, which represents an image as a histogram of visual-word occurrences. In this paper, we propose a novel extension to this method. First, we extract features at multiple scales using the OpponentSIFT color local descriptor. Second, we represent images with a Spatial Pyramid Representation (SPR) and an extension of BOW based on weighting visual words: during histogram assignment, each visual word is weighted by the ratio of its occurrences in the image to its occurrences in the background. Finally, since in the classical BOW retrieval framework only a few words of the vocabulary are useful for image representation, we select the useful weighted visual words whose weights exceed a threshold value. Experimentally, the algorithm is tested on different image classes of PASCAL VOC 2007 and compared against the classical bag-of-visual-words algorithm.
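
The weighting-and-selection step can be sketched in a few lines; the array names and the threshold value below are illustrative, not taken from the paper.

```python
import numpy as np

def useful_weighted_words(img_counts, bg_counts, threshold=1.5, eps=1e-9):
    """Weight each visual word by its image/background occurrence ratio
    and keep only words whose weight exceeds the threshold."""
    weights = img_counts / (bg_counts + eps)
    useful = weights > threshold
    hist = img_counts * weights * useful          # weighted, pruned histogram
    return hist / (hist.sum() + eps)              # normalized representation

img = np.array([5.0, 0.0, 2.0, 8.0])             # word counts in the image
bg = np.array([10.0, 1.0, 4.0, 1.0])             # word counts in background
print(useful_weighted_words(img, bg))            # only the last word survives
```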

Keywords: BOW, useful visual words, weighted visual words, bag of visual words

Procedia PDF Downloads 410
3684 Prediction of Energy Storage Areas for Static Photovoltaic System Using Irradiation and Regression Modelling

Authors: Kisan Sarda, Bhavika Shingote

Abstract:

This paper evaluates semiparametric regression for predicting the energy storage of a solar photovoltaic (PV) system: some parameters are known, while others, such as humidity and dust, are unknown. Solar irradiation differs from place to place with latitude, so identifying areas that yield more storage indicates where PV systems should be implemented to meet energy needs. The regression modelling is performed for daily, monthly, and seasonal prediction of solar energy storage, using R modules to implement the algorithm. The algorithm gives better comparative results than other regression models for solar PV cell energy storage.
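
A minimal semiparametric (partially linear) formulation of the kind described, sketched in Python with statsmodels rather than the authors' R modules; the data and column names are synthetic stand-ins:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({"irradiation": rng.uniform(2, 8, 365),   # kWh/m^2/day
                   "humidity": rng.uniform(20, 90, 365),
                   "dust": rng.uniform(0, 1, 365)})
df["storage"] = (1.2 * df.irradiation
                 - 0.001 * (df.humidity - 55) ** 2
                 - 0.5 * df.dust + rng.normal(0, 0.3, 365))

# linear in the known driver, spline-smooth in the poorly understood ones
fit = smf.ols("storage ~ irradiation + bs(humidity, df=4) + bs(dust, df=4)",
              data=df).fit()                      # bs() = patsy B-spline basis
print(fit.params.head())
```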

Keywords: semi parametric regression, photovoltaic (PV) system, regression modelling, irradiation

Procedia PDF Downloads 350
3683 Forecasting Unusual Infections of Patients Using Infrequent Weighted Item Sets

Authors: Seema Vaidya

Abstract:

Association rule mining is a key issue in data mining. However, standard models ignore differences among transactions, and weighted association rule mining does not operate on databases with only binary attributes. This paper employs a frequent-pattern tree (FP-tree) structure, an extended prefix-tree structure for storing compressed, critical information about patterns, and builds an FP-tree-based mining system that uses an FP-growth algorithm to mine the complete set of patterns by pattern fragment growth. The paper addresses the problem of mining rare and weighted item sets, i.e., the infrequent weighted item set (IWI) mining problem. Two novel quality measures are proposed for this problem, and algorithms are presented for IWI and minimal IWI mining. The rare item sets are then used in a decision-based framework. The general problem of inducing reliable diagnostic rules is difficult because, in theory, no induction procedure can by itself guarantee the correctness of the induced hypotheses; the framework therefore associates disorders with rare signs. A usage study demonstrates that the proposed algorithm is effective and scalable for mining both long and short diagnostic rules, improving the prediction of rare diseases of patients.
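
For the FP-growth base step, a library sketch using mlxtend's implementation; the transaction data are toy values, and mlxtend provides no IWI-specific miner, so this only illustrates the frequent-pattern foundation the IWI extension builds on:

```python
import pandas as pd
from mlxtend.frequent_patterns import fpgrowth

# one-hot transaction matrix: rows = patients, columns = signs/symptoms
df = pd.DataFrame({"fever": [1, 0, 1, 1],
                   "rash": [0, 1, 1, 0],
                   "fatigue": [1, 1, 0, 1]}).astype(bool)

frequent = fpgrowth(df, min_support=0.4, use_colnames=True)
print(frequent)
# itemsets *below* the support cutoff are the infrequent candidates that an
# IWI miner would rank by its weighted quality measures
```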

Keywords: association rule, data mining, IWI mining, infrequent item set, frequent pattern growth

Procedia PDF Downloads 377
3682 Enhanced Weighted Centroid Localization Algorithm for Indoor Environments

Authors: I. Nižetić Kosović, T. Jagušt

Abstract:

Lately, with the increasing number of location-based applications, demand for highly accurate and reliable indoor localization has become urgent. This is a challenging problem due to measurement variance caused by factors such as obstacles, equipment properties, and environmental changes in the complex nature of indoor environments. In this paper, we propose a low-cost, custom-setup infrastructure solution and a localization algorithm based on the Weighted Centroid Localization (WCL) method. Localization accuracy is increased by several enhancements: calibration of the RSSI values obtained from wireless nodes, repeated RSSI measurements to exclude deviating values from the position estimate, and consideration of the orientation of the device relative to the wireless nodes. Several experiments were conducted to evaluate the proposed algorithm, and a high accuracy of about 1 m was achieved.
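
The core WCL estimate is a weighted mean of anchor positions. A minimal sketch, assuming a log-distance path-loss model with illustrative constants (the paper's calibration and orientation enhancements are not reproduced):

```python
import numpy as np

def weighted_centroid(anchors, rssi, g=1.0, n=2.0, rssi_1m=-40.0):
    """WCL: estimate position as the weighted mean of anchor positions.

    Distances come from a log-distance path-loss model (assumed constants:
    rssi_1m = RSSI at 1 m, n = path-loss exponent); weights are 1/d^g.
    """
    d = 10 ** ((rssi_1m - np.asarray(rssi)) / (10 * n))   # RSSI -> distance
    w = 1.0 / d ** g
    return (w[:, None] * np.asarray(anchors, float)).sum(axis=0) / w.sum()

anchors = [(0, 0), (10, 0), (0, 10), (10, 10)]            # node positions (m)
print(weighted_centroid(anchors, rssi=[-55, -60, -63, -70]))
```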

Keywords: indoor environment, received signal strength indicator, weighted centroid localization, wireless localization

Procedia PDF Downloads 208
3681 Assessing Spatial Associations of Mortality Patterns in Municipalities of the Czech Republic

Authors: Jitka Rychtarikova

Abstract:

Regional differences in mortality in the Czech Republic (CR) may be moderate from a broader European perspective, but important discrepancies in life expectancy can be found between smaller territorial units. In this study, territorial units are based on Administrative Districts of Municipalities with Extended Powers (MEP), a definition that came into force on January 1, 2003; there are 205 such units plus the city of Prague. The MEP is the smallest unit for which mortality patterns based on life tables can be investigated, and the Czech Statistical Office has been calculating such life tables (every five years) since 2004. MEP life tables from 2009-2013 for males and females allowed the investigation of three main life cycles, using temporary life expectancies between the exact ages of 0 and 35 and between 35 and 65, and the life expectancy at exact age 65. The results showed regional survival inequalities primarily at adult and older ages. Consequently, only mortality indicators for the adult and elderly population were related to unlinked data from the 2011 census for the same age groups. The most relevant socio-economic factors taken from the census are having a partner, educational level, and the unemployment rate; the unemployment rate was measured for adults aged 35-64 completed years. Exploratory spatial data analysis methods were used to detect regional patterns in spatially contiguous MEP units. The presence of spatial non-stationarity (spatial autocorrelation) of mortality levels for male and female adults (35-64) and for elderly males and females (65+) was tested using global Moran's I. Spatial autocorrelation of mortality patterns was mapped using local Moran's I to depict clusters of low or high mortality and spatial outliers for the two age groups (35-64 and 65+). The highest Moran's I was observed for male temporary life expectancy between exact ages 35 and 65 (0.52), and the lowest among women for life expectancy at 65 (0.26); generally, men showed stronger spatial autocorrelation than women. The relationship between mortality indicators such as life expectancies and socio-economic factors, namely the percentage of males/females having a partner, the percentage of males/females with at least higher secondary education, and the percentage of unemployed males/females in the economically active population aged 35-64 years, was evaluated using multiple regression (OLS). The results were then compared with outputs from geographically weighted regression (GWR). In the Czech Republic there are two broader territories, North-West Bohemia (NWB) and North Moravia (NM), in which excess mortality is well established. Results of the t-test of the spatial regression showed that for males aged 35-64 the association between mortality and unemployment (adjusted for education and partnership) was stronger in NM than in NWB, while educational level affected the length of survival more in NWB. Geographic variation and relationships in mortality of the CR MEP will also be tested using the spatial Durbin approach. The calculations were conducted by means of ArcGIS 10.6 and SAS 9.4.
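
Global Moran's I, the key statistic here, is straightforward to compute from its definition. A pure-NumPy sketch with a toy contiguity matrix and illustrative values (the study itself used ArcGIS/SAS):

```python
import numpy as np

def morans_i(y, W):
    """Global Moran's I for values y under a spatial weight matrix W."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()
    n, s0 = len(y), W.sum()
    return (n / s0) * (z @ W @ z) / (z @ z)

# toy example: 4 units on a line, rook contiguity
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)
print(morans_i([70.1, 70.4, 74.0, 74.2], W))      # > 0: positive autocorrelation
```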

Keywords: Czech Republic, mortality, municipality, socio-economic factors, spatial analysis

Procedia PDF Downloads 92
3680 Statistical and Analytical Comparison of GIS Overlay Modelings: An Appraisal on Groundwater Prospecting in Precambrian Metamorphics

Authors: Tapas Acharya, Monalisa Mitra

Abstract:

Overlay modeling is the most widely used conventional analysis for spatial decision support systems. It requires a set of themes with different weights, computed in various ways, whose combination serves as input for further integrated analysis. Despite its popularity, overlay modeling gives inconsistent and erroneous results for the same inputs when processed with different GIS overlay techniques. This study compares and analyses the differences in the outputs of different overlay methods on a GIS platform, using the same set of themes for Precambrian metamorphics, to delineate groundwater prospects; the objective is to identify the most suitable overlay method for groundwater prospecting in older Precambrian metamorphics. Seven input thematic layers, namely slope, Digital Elevation Model (DEM), soil thickness, lineament intersection density, average groundwater table fluctuation, stream density, and lithology, were used in fuzzy overlay, weighted overlay, and weighted sum overlay models to delineate suitable groundwater prospective zones. Spatial concurrence analysis with high-yielding wells of the study area and statistical comparison of the outputs of the overlay models in RStudio reveal that the weighted overlay model is the most efficient GIS overlay model for delineating groundwater prospective zones in Precambrian metamorphic rocks.
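
The difference between weighted overlay and weighted sum overlay, one likely source of the inconsistent outputs, can be seen in a toy sketch (illustrative reclassified rasters, not the study's themes):

```python
import numpy as np

def weighted_overlay(layers, weights):
    """Weighted overlay: weights normalized to sum to 1 over reclassified rasters."""
    w = np.asarray(weights, float)
    w = w / w.sum()
    return sum(wi * li for wi, li in zip(w, layers))

def weighted_sum_overlay(layers, weights):
    """Weighted sum: raw weights applied directly, output not rescaled."""
    return sum(wi * li for wi, li in zip(weights, layers))

slope = np.array([[3, 1], [2, 5]], float)         # toy reclassified themes
lineament = np.array([[4, 2], [1, 3]], float)
print(weighted_overlay([slope, lineament], [0.6, 0.4]))
print(weighted_sum_overlay([slope, lineament], [3.0, 2.0]))
```

Fuzzy overlay differs again, combining membership layers with operators such as min (AND), max (OR), or a gamma product rather than a weighted sum.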

Keywords: fuzzy overlay, GIS overlay model, groundwater prospecting, Precambrian metamorphics, weighted overlay, weighted sum overlay

Procedia PDF Downloads 99
3679 Phase II Monitoring of First-Order Autocorrelated General Linear Profiles

Authors: Yihua Wang, Yunru Lai

Abstract:

Statistical process control has been successfully applied in a variety of industries. In some applications, the quality of a process or product is better characterized and summarized by a functional relationship between a response variable and one or more explanatory variables; a collection of this type of data is called a profile. Profile monitoring is used to understand and check the stability of this relationship, or curve, over time. The assumption of independent error terms is common in existing profile monitoring studies; in many applications, however, profile data show correlations over time. We therefore focus on a general linear regression model with first-order autocorrelation between profiles and propose an exponentially weighted moving average (EWMA) charting scheme to monitor this type of profile. A simulation study shows that the proposed method outperforms existing schemes on the average run length criterion.
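
The EWMA statistic and its time-varying limits follow the standard recursion; a sketch below, with the caveat that for profile monitoring the chart would be applied to estimated regression coefficients or residuals per profile rather than to raw observations:

```python
import numpy as np

def ewma_chart(x, lam=0.2, mu0=0.0, sigma=1.0, L=3.0):
    """EWMA statistic z_t = lam*x_t + (1-lam)*z_{t-1} with control limits."""
    z = np.empty(len(x))
    prev = mu0
    for t, xt in enumerate(x):
        prev = lam * xt + (1 - lam) * prev
        z[t] = prev
    t = np.arange(1, len(x) + 1)
    half = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
    return z, mu0 + half, mu0 - half              # statistic, UCL, LCL

z, ucl, lcl = ewma_chart(np.random.default_rng(0).normal(size=50))
print(((z > ucl) | (z < lcl)).any())              # any out-of-control signal?
```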

Keywords: autocorrelation, EWMA control chart, general linear regression model, profile monitoring

Procedia PDF Downloads 434
3678 Humeral Head and Scapula Detection in Proton Density Weighted Magnetic Resonance Images Using YOLOv8

Authors: Aysun Sezer

Abstract:

Magnetic Resonance Imaging (MRI) is one of the advanced diagnostic tools for evaluating shoulder pathologies. Proton Density (PD)-weighted MRI sequences are highly effective in detecting edema but are deficient in the anatomical identification of bones, owing to a trauma-induced decrease in signal-to-noise ratio and blurring of the traumatized cortices. Computer-based diagnostic systems require precise segmentation, identification, and localization of anatomical regions in medical imagery. Deep learning-based object detection algorithms exhibit remarkable proficiency in real-time object identification and localization. In this study, the YOLOv8 model was employed to detect the humeral head and scapular regions in 665 axial PD-weighted MR images. The YOLOv8 configuration achieved overall success rates of 99.60% and 89.90% for detecting the humeral head and scapula, respectively, at an intersection-over-union (IoU) threshold of 0.5. Our findings indicate significant promise for YOLOv8-based detection of the humerus and scapula regions, particularly in PD-weighted images affected by both noise and intensity inhomogeneity.
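
A minimal fine-tune/predict sketch with the ultralytics YOLOv8 API; the dataset YAML and image path are hypothetical placeholders, not the study's files:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                        # pretrained checkpoint
model.train(data="shoulder_pd.yaml",              # hypothetical dataset YAML
            epochs=100, imgsz=640)                # classes: humeral head, scapula

results = model.predict("axial_pd_slice.png", conf=0.25)  # hypothetical image
for box in results[0].boxes:
    print(box.cls, box.conf, box.xyxy)            # class, confidence, bbox
```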

Keywords: YOLOv8, object detection, humerus, scapula, MRI

Procedia PDF Downloads 34
3677 Exercise Training for the Management of Hypertensive Patients: A Systematic Review and Meta-Analysis

Authors: Noor F. Ilias, Mazlifah Omar, Hashbullah Ismail

Abstract:

Exercise training has been shown to improve functional capacity and is recommended as a therapy for the management of blood pressure. Our purpose was to establish whether different exercise volumes produce different effect sizes for cardiorespiratory fitness (CRF) and for systolic (SBP) and diastolic (DBP) blood pressure in patients with hypertension. Exercise must be properly characterized to obtain optimal benefit from training, but the optimal exercise volume remains unsettled. A MEDLINE search (1985 to 2015) was conducted for exercise-based rehabilitation trials in hypertensive patients. Thirty-seven studies met the selection criteria: 31 (83.7%) used aerobic exercise and 6 (16.3%) aerobic plus resistance exercise, providing 1318 exercise subjects and 819 controls, 2137 subjects in total. Exercise volume and energy expenditure were calculated from the reported exercise characteristics: 4 studies (18.2%) were 451-900 kcal per week, 12 (54.5%) were 900-1350 kcal, and 6 (27.3%) were above 1351 kcal. Peak oxygen consumption (peak VO2) increased by a mean difference of 1.44 ml/kg/min (95% confidence interval [CI]: 1.08 to 1.79 ml/kg/min; p = 0.00001), weighted mean 21.2%, for aerobic exercise, compared with 4.50 ml/kg/min (95% CI: 3.57 to 5.42 ml/kg/min; p = 0.00001), weighted mean 14.5%, for aerobic plus resistance exercise. SBP was clinically reduced by both aerobic and aerobic-plus-resistance training, by mean differences of -4.66 mmHg (95% CI: -5.68 to -3.63 mmHg; p = 0.00001), weighted mean 6% reduction, and -5.06 mmHg (95% CI: -7.32 to -2.8 mmHg; p = 0.0001), weighted mean 5% reduction, respectively. DBP was clinically reduced by aerobic training by a mean difference of -1.62 mmHg (95% CI: -2.09 to -1.15 mmHg; p = 0.00001), weighted mean 4% reduction, and by aerobic-plus-resistance training by -3.26 mmHg (95% CI: -4.87 to -1.65 mmHg; p = 0.0001), weighted mean 6% reduction. The 451-900 kcal volume showed the greatest improvement in peak VO2 and SBP: 2.76 ml/kg/min (95% CI: 1.47 to 4.05 ml/kg/min; p = 0.0001), weighted mean 40.6%, and -16.66 mmHg (95% CI: -21.72 to -11.60 mmHg; p = 0.00001), weighted mean 9.8%, respectively. Our data demonstrate that aerobic exercise with a total volume of 451-900 kcal per week of energy expenditure may elicit greater changes in cardiorespiratory fitness and blood pressure in hypertensive patients; a higher weekly exercise volume does not appear to yield better results in the management of hypertensive patients.
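
Pooled mean differences of this kind are typically obtained by inverse-variance weighting. A generic fixed-effect sketch with illustrative numbers (not the review's data):

```python
import numpy as np

def fixed_effect_md(mean_diffs, ses):
    """Fixed-effect pooled mean difference via inverse-variance weighting."""
    w = 1.0 / np.asarray(ses) ** 2                # weight = 1 / variance
    md = np.sum(w * mean_diffs) / w.sum()
    se = np.sqrt(1.0 / w.sum())
    return md, (md - 1.96 * se, md + 1.96 * se)   # pooled MD and 95% CI

# three hypothetical trials reporting SBP change (mmHg) and its SE
print(fixed_effect_md([-4.2, -5.1, -4.8], [0.9, 1.4, 1.1]))
```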

Keywords: blood Pressure, exercise, hypertension, peak VO2

Procedia PDF Downloads 257
3676 New Segmentation of Piecewise Linear Regression Models Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

Piecewise linear regression models are very flexible models for fitting data. When a piecewise linear regression model is fitted to data, its parameters are generally unknown. This paper studies the problem of parameter estimation for piecewise linear regression models using the Bayesian method. The Bayes estimator, however, cannot be found analytically. To overcome this problem, a reversible jump MCMC algorithm is proposed: it generates a Markov chain that converges to the posterior distribution of the parameters of the piecewise linear regression model. The resulting Markov chain is used to compute the Bayes estimator of the model parameters.
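
Full reversible jump MCMC also moves between models with different numbers of segments. The sketch below is deliberately much simpler, a fixed-dimension Metropolis walk over a single breakpoint with a profile Gaussian likelihood, meant only to convey the flavor of the sampling step, not the paper's sampler:

```python
import numpy as np

def profile_loglik(x, y, k):
    """Gaussian log-likelihood (up to constants) of a two-piece linear fit
    split at index k, with slopes profiled out by least squares."""
    ll = 0.0
    for seg in (slice(0, k), slice(k, len(x))):
        X = np.column_stack([np.ones(len(x[seg])), x[seg]])
        beta, *_ = np.linalg.lstsq(X, y[seg], rcond=None)
        r = y[seg] - X @ beta
        s2 = max(r @ r / len(r), 1e-12)
        ll += -0.5 * len(r) * np.log(s2)
    return ll

def metropolis_breakpoint(x, y, iters=5000, seed=0):
    rng = np.random.default_rng(seed)
    k = len(x) // 2
    ll = profile_loglik(x, y, k)
    samples = []
    for _ in range(iters):
        k_new = int(np.clip(k + rng.integers(-3, 4), 3, len(x) - 3))
        ll_new = profile_loglik(x, y, k_new)
        if np.log(rng.random()) < ll_new - ll:    # Metropolis acceptance
            k, ll = k_new, ll_new
        samples.append(k)
    return np.array(samples)                      # posterior draws of the split
```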

Keywords: regression, piecewise, Bayesian, reversible jump MCMC

Procedia PDF Downloads 489
3675 Application Difference between Cox and Logistic Regression Models

Authors: Idrissa Kayijuka

Abstract:

Logistic regression and Cox regression (proportional hazards) models are currently employed in the analysis of prospective epidemiologic research into risk factors for chronic diseases, and the theoretical relationship between the two models has been studied. By definition, the Cox regression model, also called the Cox proportional hazards model, is a procedure for modeling time-to-event data in which censored cases exist, whereas the logistic regression model is mostly applicable when the independent variables consist of numerical as well as nominal values and the outcome variable is binary (dichotomous). Many researchers have focused on overviews of the Cox and logistic regression models and their applications in different areas. In this work, the analysis is performed on secondary data, the SPSS exercise data on breast cancer, with a sample size of 1121 women; the main objective is to show the difference in application between the Cox and logistic regression models based on factors that cause women to die of breast cancer. Some analysis (e.g., on lymph node status) was done manually, and SPSS software was used to analyze the data. The study found an application difference between the two models: the Cox regression model is used when one wishes to analyze data that include follow-up time, whereas the logistic regression model analyzes data without follow-up time. They also have different measures of association: the hazard ratio for Cox regression and the odds ratio for logistic regression. A similarity is that both predict the outcome of a categorical variable, i.e., a variable that can take only a restricted number of categories. In conclusion, Cox regression differs from logistic regression by assessing a rate instead of a proportion. Both models are suitable for analyzing such data in many other studies, but the Cox regression model is the more recommended.
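
The contrast can be made concrete in a few lines (toy data standing in for the breast cancer records; lifelines for Cox, scikit-learn for logistic):

```python
import pandas as pd
from lifelines import CoxPHFitter
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({"time": [5, 12, 30, 45, 60, 18, 24, 50],  # follow-up (months)
                   "event": [1, 1, 0, 1, 0, 1, 0, 1],        # death indicator
                   "ln_pos": [3, 1, 0, 4, 0, 2, 1, 3]})      # positive lymph nodes

# Cox: models the hazard *rate*, uses follow-up time, handles censoring
cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
print(cph.summary["exp(coef)"])                   # hazard ratio per covariate

# Logistic: models the *proportion* of events, ignores follow-up time
logit = LogisticRegression().fit(df[["ln_pos"]], df["event"])
print(logit.coef_)                                # log-odds ratio
```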

Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio

Procedia PDF Downloads 423
3674 Automatic Seizure Detection Using Weighted Permutation Entropy and Support Vector Machine

Authors: Noha Seddik, Sherine Youssef, Mohamed Kholeif

Abstract:

The field of automated epileptic seizure detection has emerged in recent years; it involves analyzing electroencephalogram (EEG) signals instead of the traditional visual inspection performed by expert neurologists. In this study, a Support Vector Machine (SVM) that uses Weighted Permutation Entropy (WPE) as the input feature is proposed for classifying normal and seizure EEG records. WPE is a modification of permutation entropy (PE), a statistical measure of the complexity and irregularity of a time series; it incorporates both the mapped ordinal patterns of the time series and the information contained in the amplitudes of its sample points. The proposed system exploits the fact that entropy-based measures of EEG segments during an epileptic seizure are lower than those of normal EEG.
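
WPE can be computed directly from its definition (variance-weighted ordinal patterns, in the spirit of Fadlallah et al.); a sketch:

```python
import numpy as np
from itertools import permutations
from math import factorial

def weighted_permutation_entropy(x, m=3, tau=1, normalize=True):
    """WPE: ordinal-pattern entropy with each window weighted by its variance."""
    x = np.asarray(x, float)
    n = len(x) - (m - 1) * tau
    patterns = {p: i for i, p in enumerate(permutations(range(m)))}
    w_sum = np.zeros(factorial(m))
    for t in range(n):
        win = x[t:t + m * tau:tau]
        idx = patterns[tuple(np.argsort(win))]    # ordinal pattern of window
        w_sum[idx] += win.var()                   # weight = window variance
    p = w_sum[w_sum > 0] / w_sum.sum()
    h = -(p * np.log2(p)).sum()
    return h / np.log2(factorial(m)) if normalize else h

sig = np.random.default_rng(0).normal(size=1000)  # toy EEG segment
print(weighted_permutation_entropy(sig, m=3))     # near 1 for white noise
```

A vector of WPE values over EEG segments would then feed the SVM classifier.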

Keywords: electroencephalogram (EEG), epileptic seizure detection, weighted permutation entropy (WPE), support vector machine (SVM)

Procedia PDF Downloads 341
3673 Stock Market Prediction by Regression Model with Social Moods

Authors: Masahiro Ohmura, Koh Kakusho, Takeshi Okadome

Abstract:

This paper presents a regression model with autocorrelated errors whose inputs are social moods, obtained by analyzing the adjectives in Twitter posts with a document topic model. The regression model predicts the Dow Jones Industrial Average (DJIA) more precisely than autoregressive moving-average models.
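
A regression with AR(1) errors of the kind described can be fitted with statsmodels' GLSAR; the series below are synthetic stand-ins for the mood scores and the index:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
mood = rng.normal(size=200)                       # toy daily social-mood score
djia = 1.5 * mood + np.cumsum(rng.normal(scale=0.3, size=200))  # toy index

X = sm.add_constant(mood)
model = sm.GLSAR(djia, X, rho=1)                  # regression with AR(1) errors
result = model.iterative_fit(maxiter=10)          # alternate rho / beta updates
print(result.params, model.rho)                   # coefficients and AR parameter
```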

Keywords: stock market prediction, social moods, regression model, DJIA

Procedia PDF Downloads 519
3672 Model-Based Software Regression Test Suite Reduction

Authors: Shiwei Deng, Yang Bao

Abstract:

In this paper, we present a model-based regression test suite reduction approach that uses EFSM model dependence analysis and a probability-driven greedy algorithm to reduce software regression test suites. The approach automatically identifies the difference between the original model and the modified model as a set of elementary model modifications. EFSM dependence analysis is performed for each elementary modification to reduce the regression test suite, and the probability-driven greedy algorithm then selects the minimum set of test cases from the reduced suite that covers all interaction patterns. Our initial experience shows that the approach may significantly reduce the size of regression test suites.
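
The greedy selection step is essentially weighted set cover. A deterministic toy variant (the paper's version is probability-driven, which this sketch does not reproduce):

```python
def reduce_suite(coverage):
    """Greedy selection: repeatedly pick the test that covers the most
    still-uncovered interaction patterns."""
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        best = max(coverage, key=lambda t: len(coverage[t] & uncovered))
        if not coverage[best] & uncovered:
            break                                  # remaining patterns uncoverable
        selected.append(best)
        uncovered -= coverage[best]
    return selected

# hypothetical test -> covered-pattern mapping
suite = {"t1": {"p1", "p2"}, "t2": {"p2", "p3", "p4"}, "t3": {"p4"}}
print(reduce_suite(suite))                         # ['t2', 't1']
```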

Keywords: dependence analysis, EFSM model, greedy algorithm, regression test

Procedia PDF Downloads 398
3671 Segmentation of Piecewise Polynomial Regression Model by Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

The piecewise polynomial regression model is a very flexible model for fitting data. When a piecewise polynomial regression model is fitted to data, its parameters are generally unknown. This paper studies the parameter estimation problem for piecewise polynomial regression models using the Bayesian method. Unfortunately, the Bayes estimator cannot be found analytically, so a reversible jump MCMC algorithm is proposed. The algorithm generates a Markov chain that converges to the posterior distribution of the piecewise polynomial regression model parameters, and the resulting chain is used to compute the Bayes estimator of those parameters.

Keywords: piecewise regression, Bayesian, reversible jump MCMC, segmentation

Procedia PDF Downloads 340
3670 Reliability and Probability Weighted Moment Estimation for Three Parameter Mukherjee-Islam Failure Model

Authors: Ariful Islam, Showkat Ahmad Lone

Abstract:

The Mukherjee-Islam model is commonly used as a simple lifetime distribution for assessing system reliability. The model exhibits a better fit to failure data and provides more appropriate information about the hazard rate and other reliability measures, as shown by various authors. It is possible to introduce a location parameter (i.e., a time before which failure cannot occur), which makes it a more useful failure distribution than existing ones. Even after shifting the location of the distribution, it can represent decreasing, constant, and increasing failure rates, and it has been shown to represent the appropriate lower tail of the distribution of random variables having a fixed lower bound. This study presents reliability computations and probability weighted moment estimation for the three-parameter model. A comparative analysis is carried out between the three-parameter finite-range model and some existing bathtub-shaped curve-fitting models. Since the probability weighted moment method is used, the results can also be applied to small samples. Maximum likelihood estimation is applied in this study as well.
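
Sample probability weighted moments have a standard unbiased estimator; a sketch below (the model-specific estimating equations, which equate these to the theoretical PWMs of the three-parameter Mukherjee-Islam distribution, are not reproduced):

```python
import numpy as np
from scipy.special import comb

def sample_pwm(x, r):
    """Unbiased sample probability weighted moment b_r = E[X F(X)^r]."""
    x = np.sort(np.asarray(x, float))
    n = len(x)
    i = np.arange(1, n + 1)
    return np.sum(comb(i - 1, r) / comb(n - 1, r) * x) / n

data = np.random.default_rng(1).weibull(1.5, size=500)   # toy failure times
print([sample_pwm(data, r) for r in range(3)])            # b0, b1, b2
```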

Keywords: comparative analysis, maximum likelihood estimation, Mukherjee-Islam failure model, probability weighted moment estimation, reliability

Procedia PDF Downloads 244
3669 A Fuzzy Linear Regression Model Based on Dissemblance Index

Authors: Shih-Pin Chen, Shih-Syuan You

Abstract:

Fuzzy regression models are useful for investigating the relationship between explanatory variables and responses in fuzzy environments. To overcome the deficiencies of previous models and increase the explanatory power of fuzzy data, the graded mean integration (GMI) representation is applied to determine representative crisp regression coefficients. A fuzzy regression model is then constructed based on the modified dissemblance index (MDI), which can precisely measure the actual total error. Based on the proposed MDI and a distance criterion, results from commonly used test examples show that the proposed fuzzy linear regression model has higher explanatory power and forecasting accuracy than previous models.
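
For a triangular fuzzy number (a, b, c), the GMI representation reduces to a simple closed form; a minimal sketch, assuming the triangular case:

```python
def graded_mean_integration(a, b, c):
    """GMI (defuzzified) value of a triangular fuzzy number (a, b, c)."""
    return (a + 4 * b + c) / 6

print(graded_mean_integration(2.0, 3.0, 5.0))     # crisp value ~3.167
```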

Keywords: dissemblance index, fuzzy linear regression, graded mean integration, mathematical programming

Procedia PDF Downloads 407
3668 The Theory behind Logistic Regression

Authors: Jan Henrik Wosnitza

Abstract:

Logistic regression has developed into a standard approach for estimating conditional probabilities in a wide range of applications, including credit risk prediction. The article at hand contributes to the current literature on logistic regression in four ways. First, it is demonstrated that binary logistic regression automatically meets its model assumptions under very general conditions; this result explains, at least in part, the popularity of logistic regression. Second, the requirement of homoscedasticity in the context of binary logistic regression is theoretically substantiated: the variances among the groups of defaulted and non-defaulted obligors have to be the same across the levels of the aggregated default indicators in order to achieve linear logits. Third, the article sheds some light on why nonlinear logits might be superior to linear logits when only a small amount of data is available. Fourth, an innovative methodology for estimating correlations between obligor-specific log-odds is proposed. To crystallize the key ideas, the paper focuses on the example of credit risk prediction; however, the results can easily be transferred to any other field of application.
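
The "linear logits" idea can be checked empirically by comparing binned log-odds against a fitted logit; a synthetic sketch (all data simulated):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=5000)                         # aggregated default indicator
y = rng.random(5000) < 1 / (1 + np.exp(-(0.5 + 1.2 * x)))   # default events

# empirical log-odds per decile of x should rise roughly linearly in x
edges = np.quantile(x, np.linspace(0, 1, 11))
idx = np.digitize(x, edges[1:-1])
for b in range(10):
    p = y[idx == b].mean()
    print(b, round(np.log(p / (1 - p)), 3))

fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
print(fit.params)                                 # close to (0.5, 1.2)
```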

Keywords: correlation, credit risk estimation, default correlation, homoscedasticity, logistic regression, nonlinear logistic regression

Procedia PDF Downloads 394
3667 A Weighted Approach to Unconstrained Iris Recognition

Authors: Yao-Hong Tsai

Abstract:

This paper presents a weighted approach to unconstrained iris recognition. Commercial systems are usually characterized by strong acquisition constraints that rely on the subject's cooperation, which is not always achievable in real-life scenarios. Researchers have therefore focused on reducing these constraints while maintaining system performance through new techniques. To cope with the large environmental variation, the proposed iris recognition system contains two main improvements. First, to handle extremely uneven lighting conditions, statistics-based illumination normalization is applied to the eye region to increase the accuracy of the iris features; detection of the iris image is based on the AdaBoost algorithm. Second, a weighting scheme is designed using Gaussian functions of the distance to the center of the iris, and local binary pattern (LBP) histograms are then applied to texture classification with these weights. Experiments showed that the proposed system provides users a more flexible and feasible way to interact with a verification system through iris recognition.
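
The Gaussian distance weighting of LBP votes can be sketched as follows; the center coordinates and sigma are assumed inputs, not values from the paper:

```python
import numpy as np
from skimage.feature import local_binary_pattern

def weighted_lbp_histogram(iris, center, sigma=30.0, P=8, R=1):
    """LBP histogram where each pixel's vote is Gaussian-weighted by its
    distance to the iris center."""
    lbp = local_binary_pattern(iris, P, R, method="uniform")
    yy, xx = np.mgrid[0:iris.shape[0], 0:iris.shape[1]]
    r2 = (yy - center[0]) ** 2 + (xx - center[1]) ** 2
    w = np.exp(-r2 / (2 * sigma ** 2))            # Gaussian weight map
    n_bins = P + 2                                # uniform-LBP bin count
    hist = np.bincount(lbp.astype(int).ravel(), weights=w.ravel(),
                       minlength=n_bins)
    return hist / hist.sum()

iris = np.random.default_rng(0).integers(0, 255, (64, 64)).astype(np.uint8)
print(weighted_lbp_histogram(iris, center=(32, 32)).round(3))
```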

Keywords: authentication, iris recognition, adaboost, local binary pattern

Procedia PDF Downloads 192
3666 X̄ and S Control Charts based on Weighted Standard Deviation Method

Authors: Derya Karagöz

Abstract:

A Shewhart chart based on the normality assumption is not appropriate for skewed distributions, since its Type-I error rate is inflated. This study presents X̄ and S control charts for monitoring process variability under skewed distributions: the Weighted Standard Deviation (WSD) X̄ and S control charts. The WSD estimator of the process standard deviation is used because it is simple and easy to compute. Unlike the Shewhart control chart, the proposed charts provide asymmetric upper and lower limits in accordance with the direction and degree of skewness. The performance of the proposed charts is compared with other heuristic charts for skewed distributions in a simulation study, which shows that the proposed control charts have good properties for skewed distributions and large sample sizes.
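
One common WSD formulation (in the spirit of Chang and Bai) splits the standard deviation by the probability mass below the mean, yielding the asymmetric limits. A hedged sketch of the X̄ limits only; the paper's exact constants, and the S-chart version, are not reproduced:

```python
import numpy as np

def wsd_xbar_limits(data, n):
    """X-bar limits via the weighted-standard-deviation idea: sigma is split
    into upper/lower parts by P = P(X <= mu), giving asymmetric limits."""
    x = np.asarray(data, float)
    mu, sigma = x.mean(), x.std(ddof=1)
    P = (x <= mu).mean()                          # estimate of P(X <= mu)
    ucl = mu + 3 * (sigma / np.sqrt(n)) * 2 * P
    lcl = mu - 3 * (sigma / np.sqrt(n)) * 2 * (1 - P)
    return lcl, mu, ucl

skewed = np.random.default_rng(0).lognormal(0, 0.8, 2000)
print(wsd_xbar_limits(skewed, n=5))               # UCL pushed out more than LCL
```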

Keywords: weighted standard deviation, MAD, skewed distributions, S control charts

Procedia PDF Downloads 368
3665 Hybrid Fuzzy Weighted K-Nearest Neighbor to Predict Hospital Readmission for Diabetic Patients

Authors: Soha A. Bahanshal, Byung G. Kim

Abstract:

Identification of patients at high risk of hospital readmission is of crucial importance for quality health care and cost reduction, and predicting hospital readmissions among diabetic patients has been of great interest to many researchers and health decision makers. We build a model to predict hospital readmission of diabetic patients within 30 days of discharge. The core of the model is a modified k-nearest neighbor classifier, the Hybrid Fuzzy Weighted k-Nearest Neighbor algorithm. Prediction is performed on a dataset of more than 70,000 patients with 50 attributes. We applied several data preprocessing techniques to handle class imbalance and to fuzzify the data to suit the prediction algorithm. The model so far achieves a classification accuracy of 80%, compared with models that use plain k-nearest neighbor.
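
A Keller-style fuzzy weighted kNN core, a much-simplified relative of the paper's hybrid algorithm (which additionally handles imbalance and fuzzifies the inputs):

```python
import numpy as np

def fuzzy_weighted_knn(X_train, U_train, X_test, k=5, m=2.0):
    """Fuzzy kNN sketch: class memberships of the k nearest neighbors are
    combined with inverse-distance weights (fuzzifier exponent m)."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)
        nn = np.argsort(d)[:k]
        w = 1.0 / np.maximum(d[nn], 1e-12) ** (2.0 / (m - 1.0))
        u = (w[:, None] * U_train[nn]).sum(axis=0) / w.sum()
        preds.append(np.argmax(u))                # crisp readmission label
    return np.array(preds)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))               # toy patient features
labels = (X_train[:, 0] > 0).astype(int)          # toy readmission labels
U_train = np.eye(2)[labels]                       # crisp memberships as input
print(fuzzy_weighted_knn(X_train, U_train, X_train[:5], k=5))
```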

Keywords: machine learning, prediction, classification, hybrid fuzzy weighted k-nearest neighbor, diabetic hospital readmission

Procedia PDF Downloads 156
3664 A Ratio-Weighted Decision Tree Algorithm for Imbalanced Dataset Classification

Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike

Abstract:

Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased toward the majority class on imbalanced datasets because of the skewness of their distribution. To overcome this problem, this study proposes a weighted decision tree algorithm that removes the bias toward the majority class without discarding majority observations in imbalanced dataset classification. The proposed weighted decision tree algorithm was tested on three imbalanced datasets: a cancer dataset, the German credit dataset, and a banknote dataset. Specificity, sensitivity, and accuracy were used to evaluate performance. The evaluation shows that, for some of the weights in our proposed decision tree, these metrics gave better results than the ID3 decision tree and a decision tree induced with minority entropy on all three datasets.
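A related, simpler mechanism is class re-weighting of the split criterion, available off the shelf. A scikit-learn sketch, not the paper's ratio-weighted tree:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

# class_weight re-weights the impurity criterion instead of resampling,
# which is the same spirit as weighting the tree against majority bias
X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
tree = DecisionTreeClassifier(class_weight="balanced", random_state=0)
tree.fit(Xtr, ytr)
print(classification_report(yte, tree.predict(Xte)))  # per-class recall etc.
```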

Keywords: data mining, decision tree, classification, imbalance dataset

Procedia PDF Downloads 93
3663 Econophysics: The Use of Entropy Measures in Finance

Authors: Muhammad Sheraz, Vasile Preda, Silvia Dedu

Abstract:

Concepts of econophysics are usually used to solve problems related to uncertainty and nonlinear dynamics. In the theory of option pricing, risk-neutral probabilities play a very important role. The application of entropy in finance can be regarded as an extension of both information entropy and probability entropy, and it can be an important tool in various financial methods such as risk measurement, portfolio selection, option pricing, and asset pricing. Gulko applied Entropy Pricing Theory (EPT) to stock options and introduced an alternative to the Black-Scholes framework for pricing European stock options. In this article, we present solutions to maximum entropy problems based on the Tsallis, weighted-Tsallis, Kaniadakis, and weighted-Kaniadakis entropies to obtain risk-neutral densities, and we obtain the values of European call and put options in this framework.
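
Both entropy families have simple discrete forms; a sketch below (the constrained maximum-entropy optimization that yields the risk-neutral density is not shown):

```python
import numpy as np

def tsallis_entropy(p, q=1.5):
    """S_q = (1 - sum p^q) / (q - 1); recovers Shannon entropy as q -> 1."""
    p = np.asarray(p, float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def kaniadakis_entropy(p, kappa=0.3):
    """S_kappa = -sum p * ln_kappa(p), ln_kappa(x) = (x^k - x^-k) / (2k)."""
    p = np.asarray(p, float)
    ln_k = (p ** kappa - p ** -kappa) / (2 * kappa)
    return -np.sum(p * ln_k)

p = np.array([0.2, 0.3, 0.5])                     # toy risk-neutral density
print(tsallis_entropy(p), kaniadakis_entropy(p))
```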

Keywords: option pricing, Black-Scholes model, Tsallis entropy, Kaniadakis entropy, weighted entropy, risk-neutral density

Procedia PDF Downloads 270
3662 A Study on the Measurement of Spatial Mismatch and the Influencing Factors of “Job-Housing” in Affordable Housing from the Perspective of Commuting

Authors: Daijun Chen

Abstract:

Affordable housing is subsidized by the government to meet the housing demand of low- and middle-income urban residents in the process of urbanization and to alleviate the housing inequality caused by market-based housing reforms. It is a recognized fact that the living conditions of beneficiaries have improved as subsidized housing has been built. However, affordable housing is mostly located in the suburbs, where the surrounding urban functions and infrastructure are incomplete, resulting in a "jobs-housing" spatial mismatch. The main reason is that residents of affordable housing are highly sensitive to the spatial location of their residence yet have relatively little choice or control over it, which leads to higher commuting costs; their real cost of living has not been effectively reduced. In this regard, 92 subsidized housing communities in Nanjing, China, are selected as the research sample. Residents of the affordable housing and the spatio-temporal characteristics of their commuting behavior are identified from location-based service (LBS) data. Based on spatial mismatch theory, indicators such as commuting distance and commuting time are established to measure the degree of spatial mismatch of subsidized housing in different districts of Nanjing. Furthermore, a geographically weighted regression model is used to analyze the factors influencing the spatial mismatch, in terms of the provision of employment opportunities, traffic accessibility, and supporting service facilities, using spatial, functional, and other multi-source spatio-temporal big data. The results show that the spatial mismatch of affordable housing in Nanjing generally presents a "concentric circle" pattern, decreasing from the central urban area to the periphery. The factors affecting the spatial mismatch differ across spatial zones: the main drivers are the number of enterprises within 1 km of the affordable housing district and the shortest distance to a subway station, while low spatial mismatch is associated with a diversity of services and facilities. On this basis, a spatial optimization strategy for different levels of spatial mismatch in subsidized housing is proposed, along with feasible suggestions for future site selection. The aim is to avoid or mitigate the impact of spatial mismatch, promote the "spatial adaptation" of jobs and housing, and truly improve the overall welfare of affordable housing residents.
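
A geographically weighted regression fit of the kind described can be sketched with the mgwr package; the coordinates and covariates below are random placeholders standing in for the 92 communities:

```python
import numpy as np
from mgwr.gwr import GWR
from mgwr.sel_bw import Sel_BW

rng = np.random.default_rng(0)
coords = list(zip(rng.uniform(118, 119, 92),      # lon/lat placeholders
                  rng.uniform(31, 32, 92)))
X = rng.normal(size=(92, 3))   # enterprises within 1 km, subway distance, diversity
y = (X @ np.array([0.5, -0.8, -0.3])
     + rng.normal(size=92)).reshape(-1, 1)        # spatial-mismatch index

bw = Sel_BW(coords, y, X).search()                # optimal bandwidth selection
results = GWR(coords, y, X, bw).fit()
print(bw, results.params.shape)                   # one local coefficient set per site
```

The local coefficient surfaces are what reveal that different factors dominate in different zones.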

Keywords: affordable housing, spatial mismatch, commuting characteristics, spatial adaptation, welfare benefits

Procedia PDF Downloads 78
3661 Model Averaging for Poisson Regression

Authors: Zhou Jianhong

Abstract:

Model averaging is a desirable approach for dealing with model uncertainty but has rarely been explored for Poisson regression. In this paper, we propose a model averaging procedure based on an unbiased estimator of the expected Kullback-Leibler distance for Poisson regression. A simulation study shows that the proposed model average estimator outperforms some commonly used model selection and model average estimators in certain situations. The proposed method is further applied to a real data example, where its advantage is demonstrated again.
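
A generic smoothed-AIC averaging sketch for Poisson GLMs (the paper derives KL-distance-based weights instead, which this does not reproduce):

```python
import numpy as np
import statsmodels.api as sm

def aic_averaged_poisson(y, candidates):
    """Average fitted means of candidate Poisson GLMs with Akaike weights."""
    fits = [sm.GLM(y, sm.add_constant(X), family=sm.families.Poisson()).fit()
            for X in candidates]
    aic = np.array([f.aic for f in fits])
    w = np.exp(-0.5 * (aic - aic.min()))          # Akaike weights
    w /= w.sum()
    preds = np.column_stack([f.fittedvalues for f in fits])
    return preds @ w, w                           # averaged mean, model weights

rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=(2, 300))
y = rng.poisson(np.exp(0.3 + 0.5 * x1))           # true model uses x1 only
avg_pred, weights = aic_averaged_poisson(
    y, [x1[:, None], np.column_stack([x1, x2])])
print(weights)                                    # weight over the two candidates
```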

Keywords: model averaging, Poisson regression, Kullback-Leibler distance, statistics

Procedia PDF Downloads 485
3660 Multidimensional Poverty and Child Cognitive Development

Authors: Bidyadhar Dehury, Sanjay Kumar Mohanty

Abstract:

According to the Right to Education Act of India, education is a fundamental right of all children aged 6-14 years, irrespective of their status. Using unit-level data from the India Human Development Survey (IHDS), we examine the inter-relationship between the level of poverty and the academic performance of children aged 8-11 years. The level of multidimensional poverty is measured across five dimensions and 10 indicators using the Alkire-Foster approach. The weighted deprivation score is obtained by giving equal weight to each dimension and to the indicators within each dimension; it varies from 0 to 1 and is grouped into four categories: non-poor, vulnerable, multidimensionally poor, and severely multidimensionally poor. The academic performance index is measured from three variables (reading, math, and writing skills) using PCA. Bivariate and multivariate analyses are used. Because the outcome variable is ordinal, predicted probabilities are calculated using ordinal logistic regression. The predicted probability of a good academic performance index is 0.202 for severely multidimensionally poor children, 0.235 for the multidimensionally poor, 0.264 for the vulnerable, and 0.316 for the non-poor. Hence, as the level of poverty decreases from severely multidimensionally poor to non-poor, the probability of good academic performance increases.
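
Ordinal logistic regression with predicted category probabilities can be sketched with statsmodels' OrderedModel; the data below are synthetic, mimicking the four poverty categories and an ordered performance outcome:

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n = 500
poverty = rng.integers(0, 4, n)                   # 0=non-poor .. 3=severe poor
latent = -0.6 * poverty + rng.logistic(size=n)    # poverty lowers performance
perf = pd.Series(pd.cut(latent, [-np.inf, -1, 1, np.inf],
                        labels=["poor", "average", "good"], ordered=True))

res = OrderedModel(perf, pd.DataFrame({"poverty": poverty}),
                   distr="logit").fit(method="bfgs", disp=0)
# probability of each performance category for severe-poor vs non-poor children
print(res.predict(pd.DataFrame({"poverty": [3, 0]})))
```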

Keywords: multidimensional poverty, academic performance index, reading skills, math skills, writing skills, India

Procedia PDF Downloads 563