Search results for: nonparametric geographically weighted regression
3788 Comparative Study between Herzberg’s and Maslow’s Theories in Maritime Transport Education
Authors: Nermin Mahmoud Gohar, Aisha Tarek Noour
Abstract:
Learner satisfaction has been a vital field of interest in the literature. Accordingly, this paper explores the reasons behind individual differences in motivation and satisfaction. The study examines the effect of both Herzberg’s and Maslow’s theories on learners’ satisfaction. A self-administered questionnaire was used to collect data from learners who were geographically widely spread around the College of Maritime Transport and Technology (CMTT) at the Arab Academy for Science, Technology and Maritime Transport (AAST&MT) in Egypt. One hundred and fifty undergraduates responded to the questionnaire survey. Respondents were drawn from two branches in Alexandria and Port Said. The data were analyzed using SPSS 22 and AMOS 18. Factor analysis was used to identify the dimensions under study, verified by Herzberg’s and Maslow’s theories. In addition, regression analysis and structural equation modeling were applied to find the effect of the above-mentioned theories on maritime transport learners’ satisfaction. As a limitation, the study used the available number of learners in the CMTT because of the relatively small population in this field.
Keywords: motivation, satisfaction, needs, education, Herzberg’s and Maslow’s theories
Procedia PDF Downloads 432
3787 Spatial Pattern and Predictors of Malaria in Ethiopia: Application of Auto Logistics Spatial Regression
Authors: Melkamu A. Zeru, Yamral M. Warkaw, Aweke A. Mitku, Muluwerk Ayele
Abstract:
Introduction: Malaria is a severe health threat worldwide, mainly in Africa. It is a major cause of health problems, and the risk of malaria-associated morbidity and mortality varies spatially across the country. This study aimed to investigate the spatial patterns and predictors of malaria distribution in Ethiopia. Methods: A weighted sample of 15,239 individuals with rapid diagnostic tests was obtained from the Central Statistical Agency and the Ethiopia malaria indicator survey of 2015. Global Moran's I and Moran scatter plots were used to determine the distribution of malaria cases, whereas the local Moran's I statistic was used to identify exposed areas. Machine learning was used for variable reduction, and the statistical software R, Stata, and Python were used for data management and analysis. The auto logistics spatial binary regression model was used to investigate the predictors of malaria. Results: The final auto logistics regression model reported that male clients had a significant positive effect on malaria cases as compared to female clients [AOR=2.401, 95% CI: (2.125-2.713)]. The distribution of malaria differed across the regions. The highest incidence of malaria was found in Gambela [AOR=52.55, 95% CI: (40.54-68.12)], followed by Beneshangul [AOR=34.95, 95% CI: (27.159-44.963)]. Individuals in Amhara [AOR=0.243, 95% CI: (0.195-0.303)], Oromiya [AOR=0.197, 95% CI: (0.158-0.244)], Dire Dawa [AOR=0.064, 95% CI: (0.049-0.082)], Addis Ababa [AOR=0.057, 95% CI: (0.044-0.075)], Somali [AOR=0.077, 95% CI: (0.059-0.097)], SNNPR [OR=0.329, 95% CI: (0.261-0.413)] and Harari [AOR=0.256, 95% CI: (0.201-0.325)] had a lower incidence of malaria as compared with Tigray. Furthermore, for a one-meter increase in altitude, the odds of a positive rapid diagnostic test (RDT) decrease by 1.6% [AOR=0.984, 95% CI: (0.984-0.984)]. The use of a shared toilet facility was found to be a protective factor for malaria in Ethiopia [AOR=1.671, 95% CI: (1.504-1.854)]. The spatial autocorrelation variable changes the constant from AOR=0.471 for logistic regression to AOR=0.164 for auto logistics regression. Conclusions: This study found that the incidence of malaria in Ethiopia had a spatial pattern that is associated with socio-economic, demographic, and geographic risk factors. Spatial clustering of malaria cases occurred in all regions, and the risk of clustering differed across the regions. The risk of malaria was found to be higher for those who live in houses with soil floors as compared to those with cement or ceramic floors. Similarly, houses with thatched, thin metal, and other roof types have a higher risk of malaria than houses with ceramic tile roofs. Moreover, using a protected anti-mosquito net reduced the risk of malaria incidence.
Keywords: malaria, Ethiopia, auto logistics, spatial model, spatial clustering
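The global Moran's I statistic mentioned in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function name `morans_i`, the toy values, and the chain-shaped adjacency matrix are assumptions chosen only to show how the statistic is computed from area values and a spatial weights matrix.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a spatial weights matrix.

    values  : list of observations, one per area
    weights : square list-of-lists, weights[i][j] > 0 when areas i and j are neighbours
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical example: four areas in a row, each adjacent to its neighbours
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1, 2, 3, 4], chain)       # similar neighbours, positive I
alternating = morans_i([1, 0, 1, 0], chain)  # dissimilar neighbours, negative I
```

Values near +1 suggest clustering of similar values, values near -1 suggest a checkerboard pattern, and values near 0 suggest no spatial autocorrelation.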
Procedia PDF Downloads 33
3786 Currency Exchange Rate Forecasts Using Quantile Regression
Authors: Yuzhi Cai
Abstract:
In this paper, we discuss a Bayesian approach to quantile autoregressive (QAR) time series model estimation and forecasting. Together with a forecast combination technique, we then predict USD to GBP currency exchange rates. Combined forecasts contain all the information captured by the fitted QAR models at different quantile levels and are therefore better than those obtained from individual models. Our results show that an unequally weighted combining method performs better than other forecasting methods. We found that a median AR model can perform well in point forecasting when the predictive density functions are symmetric. However, in practice, using the median AR model alone may involve the loss of information about the data captured by other QAR models. We recommend that combined forecasts be used whenever possible.
Keywords: combining forecasts, MCMC, predictive density functions, quantile forecasting, quantile modelling
Procedia PDF Downloads 255
3785 MapReduce Logistic Regression Algorithms with RHadoop
Authors: Byung Ho Jung, Dong Hoon Lim
Abstract:
Logistic regression is a statistical method for analyzing a dataset in which one or more independent variables determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating parameters in logistic regression based on the MapReduce framework with RHadoop, which integrates the R and Hadoop environments and is applicable to large-scale data. There exist three learning algorithms for logistic regression, namely the gradient descent method, the cost minimization method and the Newton-Raphson method. The Newton-Raphson method does not require a learning rate, while the gradient descent and cost minimization methods require a learning rate to be picked manually. The experimental results demonstrated that our learning algorithms using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Raphson method with the gradient descent and cost minimization methods. The results showed that our Newton-Raphson method appeared to be the most robust across all data tested.
Keywords: big data, logistic regression, MapReduce, RHadoop
Procedia PDF Downloads 280
3784 Interference among Lambsquarters and Oil Rapeseed Cultivars
Authors: Reza Siyami, Bahram Mirshekari
Abstract:
Seed and oil yield of rapeseed is considerably affected by weed interference, including mustard (Sinapis arvensis L.), lambsquarters (Chenopodium album L.) and redroot pigweed (Amaranthus retroflexus L.), throughout the East Azerbaijan province in Iran. To formulate the relationship between the four independent growth variables measured in our experiment and a dependent variable, multiple regression analysis was carried out with the weed leaf number per plant (X1), green cover percentage (X2), LAI (X3) and leaf area per plant (X4) as independent variables and rapeseed oil yield as the dependent variable. The multiple regression equation is as follows: Seed essential oil yield (kg/ha) = 0.156 + 0.0325 (X1) + 0.0489 (X2) + 0.0415 (X3) + 0.133 (X4). Furthermore, stepwise regression analysis was carried out on the data to test the significance of the independent variables affecting the oil yield as the dependent variable. The resulting stepwise regression equation is as follows: Oil yield = 4.42 + 0.0841 (X2) + 0.0801 (X3); R2 = 81.5. The stepwise regression analysis verified that the green cover percentage and LAI of the weed had a marked increasing effect on the oil yield of rapeseed.
Keywords: green cover percentage, independent variable, interference, regression
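As a worked illustration, the two regression equations quoted in the abstract can be evaluated directly. The coefficients below are copied from the abstract; the function names and the input values for X1 to X4 are hypothetical and chosen only to show the arithmetic, not taken from the study's data.

```python
def full_model(x1, x2, x3, x4):
    """Multiple regression equation from the abstract (all four predictors)."""
    return 0.156 + 0.0325 * x1 + 0.0489 * x2 + 0.0415 * x3 + 0.133 * x4


def stepwise_model(x2, x3):
    """Stepwise regression equation from the abstract (green cover % and LAI only)."""
    return 4.42 + 0.0841 * x2 + 0.0801 * x3


# Hypothetical inputs: 10 leaves per plant, 20% green cover, LAI of 2, leaf area of 5
full_prediction = full_model(10, 20, 2, 5)
stepwise_prediction = stepwise_model(20, 2)
print(round(full_prediction, 4), round(stepwise_prediction, 4))
```

Because every coefficient is positive, each equation predicts a higher yield as any of its predictors increases, which matches the "increasing effect" stated in the abstract.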
Procedia PDF Downloads 420
3783 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model
Authors: Alam Ali, Ashok Kumar Pathak
Abstract:
Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find the best fit to the data. Sometimes, exogenous variables do not show significant direct and indirect effects when the assumptions of classical regression (ordinary least squares (OLS)) are violated by the nature of the data. The main aim of this article is to investigate the efficacy of the copula-based regression approach over the classical regression approach and to calculate the direct and indirect effects of variables when the data violate the OLS assumptions and the variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is presented to demonstrate the superiority of the copula approach.
Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique
Procedia PDF Downloads 69
3782 A Nonlocal Means Algorithm for Poisson Denoising Based on Information Geometry
Authors: Dongxu Chen, Yipeng Li
Abstract:
This paper presents an information geometry Nonlocal Means (NLM) algorithm for Poisson denoising. NLM estimates a noise-free pixel as a weighted average of image pixels, where each pixel is weighted according to the similarity between image patches in Euclidean space. In this work, every pixel is modeled as a Poisson distribution locally estimated by maximum likelihood (ML), and all of these distributions together form a statistical manifold. The NLM denoising algorithm is carried out on the statistical manifold, where the Fisher information matrix can be used to compute geodesics between distributions, which serve as the similarity between patches. This approach was demonstrated to be competitive with related state-of-the-art methods.
Keywords: image denoising, Poisson noise, information geometry, nonlocal-means
Procedia PDF Downloads 284
3781 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm
Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian
Abstract:
The present paper addresses research in the area of regression testing, with emphasis on automated tools as well as prioritization of test cases. The uniqueness of regression testing and its cyclic nature are pointed out. The difference in approach between industry, with the business model as its basis, and academia, with its focus on data mining, is highlighted. Test metrics are discussed as a prelude to our formula for prioritization; a case study is then discussed to illustrate this methodology. An industrial case study is also described, in which the number of test cases is so large that they have to be grouped into Test Suites. In such situations, the genetic algorithm we propose can be used to reconfigure these Test Suites in each cycle of regression testing. A proprietary tool and an open source tool are compared using the above-mentioned metrics. Our approach is clarified through several tables.
Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool
Procedia PDF Downloads 434
3780 A Hybrid Model Tree and Logistic Regression Model for Prediction of Soil Shear Strength in Clay
Authors: Ehsan Mehryaar, Seyed Armin Motahari Tabari
Abstract:
Soil shear strength is arguably the most important property of soil. The majority of fatal and catastrophic geological accidents are related to shear strength failure of the soil. Therefore, its prediction is a matter of high importance. However, measuring the shear strength is usually a cumbersome task that may require complicated laboratory testing. Predicting it from common and easily obtained soil properties can therefore simplify projects substantially. In this paper, a hybrid model based on the classification and regression tree algorithm and logistic regression is proposed, in which each leaf of the tree is an independent regression model. A database of 189 points for clay soil, including moisture content, liquid limit, plastic limit, clay content, and shear strength, is collected. The performance of the developed model is compared to existing models and equations using root mean squared error and the coefficient of correlation.
Keywords: model tree, CART, logistic regression, soil shear strength
Procedia PDF Downloads 194
3779 A Stochastic Analytic Hierarchy Process Based Weighting Model for Sustainability Measurement in an Organization
Authors: Faramarz Khosravi, Gokhan Izbirak
Abstract:
A weighted, statistical, stochastic Analytical Hierarchy Process (AHP) model is proposed for modeling the potential barriers and enablers of sustainability and for measuring and assessing the sustainability level. For context-dependent potential barriers and enablers, the proposed model is based on the properties of the variables describing the sustainability functions and was developed into a realistic analytical model for the sustainable behavior of an organization. It thus serves as a means of measuring the sustainability of the organization. The main focus of this paper was the application of the AHP tool in a statistically based model for measuring sustainability. Hence, a strong weighted stochastic AHP-based procedure was achieved. A case study of a widely reported major Canadian electric utility was adopted to demonstrate the applicability of the developed model, and its results were compared with those of an equal-weighted model. Variations in the company's sustainability over time were identified as fluctuations. In the results obtained, the sustainability index for successive years changed from 73.12%, 79.02%, 74.31%, 76.65%, 80.49%, 79.81%, and 79.83% to the more exact values 73.32%, 77.72%, 76.76%, 79.41%, 81.93%, 79.72%, and 80.45%, respectively, according to the priorities of factors determined from expert views. By obtaining the relevant informative measurement indicators, the model can practically and effectively evaluate the sustainability extent of any organization and also determine fluctuations in the organization over time.
Keywords: AHP, sustainability fluctuation, environmental indicators, performance measurement
Procedia PDF Downloads 119
3778 A Regression Model for Residual-State Creep Failure
Authors: Deepak Raj Bhat, Ryuichi Yatabe
Abstract:
In this study, a residual-state creep failure model was developed based on the residual-state creep test results of clayey soils. To develop the proposed model, regression analyses were carried out using R. The model results for the failure time (tf) and critical displacement (δc) were compared with experimental results and found to be in close agreement. It is expected that the proposed regression model for residual-state creep failure will be useful for predicting the displacement of different clayey soils in the future.
Keywords: regression model, residual-state creep failure, displacement prediction, clayey soils
Procedia PDF Downloads 405
3777 Uterine Cervical Cancer; Early Treatment Assessment with T2- And Diffusion-Weighted MRI
Authors: Susanne Fridsten, Kristina Hellman, Anders Sundin, Lennart Blomqvist
Abstract:
Background: Patients diagnosed with locally advanced cervical carcinoma are treated with definitive concomitant chemo-radiotherapy. Treatment failure occurs in 30-50% of patients with very poor prognoses. The treatment is standardized, with a risk of both over- and undertreatment. Consequently, there is a great need for biomarkers able to predict therapy outcomes to allow for individualized treatment. Aim: To explore the role of T2- and diffusion-weighted magnetic resonance imaging (MRI) for early prediction of therapy outcome and the optimal time point for assessment. Methods: A pilot study including 15 patients with cervical carcinoma stage IIB-IIIB (FIGO 2009) undergoing definitive chemoradiotherapy. All patients underwent MRI four times: at baseline and at 3 weeks, 5 weeks, and 12 weeks after treatment started. Tumour size, size change (∆size), visibility on diffusion-weighted imaging (DWI), apparent diffusion coefficient (ADC) and change of ADC (∆ADC) at the different time points were recorded. Results: 7/15 patients relapsed during the study period and are referred to as "poor prognosis" (PP); the remaining eight patients are referred to as "good prognosis" (GP). The tumor size was larger at all time points for PP than for GP. The ∆size between any of the four time points was the same for PP and GP patients. The sensitivity and specificity for predicting the prognostic group based on a remaining tumor on DWI were highest at 5 weeks, at 83% (5/6) and 63% (5/8), respectively. The combination of tumor size at baseline and remaining tumor on DWI at 5 weeks in ROC analysis reached an area under the curve (AUC) of 0.83. After 12 weeks, no remaining tumor was seen on DWI among patients with GP, as opposed to 2/7 PP patients. Adding ADC to the tumor size measurements did not improve the predictive value at any time point. Conclusion: A large tumor at baseline MRI combined with a remaining tumor on DWI at 5 weeks predicted a poor prognosis.
Keywords: chemoradiotherapy, diffusion-weighted imaging, magnetic resonance imaging, uterine cervical carcinoma
Procedia PDF Downloads 139
3776 Formulating a Flexible-Spread Fuzzy Regression Model Based on Dissemblance Index
Authors: Shih-Pin Chen, Shih-Syuan You
Abstract:
This study proposes a regression model with flexible spreads for fuzzy input-output data to cope with situations in which existing measures cannot reflect the actual estimation error. The main idea is that a dissemblance index (DI) is carefully identified and defined for precisely measuring the actual estimation error. Moreover, the graded mean integration (GMI) representation is adopted for determining more representative numeric regression coefficients. To comprehensively compare the performance of the proposed model with that of other models, three different criteria are adopted. The results from commonly used numerical test examples and an application to Taiwan's business monitoring indicator illustrate that the proposed dissemblance index method not only produces valid fuzzy regression models for fuzzy input-output data, but also has satisfactory and stable performance in terms of the total estimation error based on these three criteria.
Keywords: dissemblance index, forecasting, fuzzy sets, linear regression
Procedia PDF Downloads 360
3775 CAG Repeat Polymorphism of Androgen Receptor and Female Sexual Functions in Egyptian Female Population
Authors: Azza Gaber Farag, Yasser Atta Shehata, Sara Elsayed Elghazouly, Mustafa Elsayed Elshaib, Nesreen Gamal Elden Elhelbawy
Abstract:
Background: Androgen receptor (AR) polymorphism in the cytosine-adenine-guanine (CAG) repeat has an effect on the functional capacity of AR in males. However, little research in this field is available regarding female sexual function. Aim: To investigate the possible link between polymorphism in the CAG repeat of the AR gene and female sexual function in a sample of the Egyptian population. Materials and methods: 500 married Egyptian females completed a questionnaire regarding sociodemographic, reproductive, and sexual data. AR CAG repeat length was analyzed for those having female sexual dysfunctions (FSD) using real-time PCR. Results: The domain most sensitive to AR CAG repeat length was the orgasm domain, which showed significant positive correlations with the short allele (p=.001), long allele (p=.015), biallelic mean (p=.000), and X weighted biallelic mean (p=.000). The satisfaction domain had significant positive correlations with the biallelic mean (p=.035) and the X weighted biallelic mean (p=.032). However, the pain domain had significant negative correlations with AR polymorphism of the short allele (p=.002), biallelic mean (p=.013), and X weighted biallelic mean (p=.011). Conclusions: AR polymorphism could represent a non-negligible aspect of female sexual function. The lower AR CAG repeat polymorphism had a significant impact on FSD, affecting mainly female orgasm followed by pain disorders, which was finally reflected in sexual satisfaction.
Keywords: female sexual dysfunction, androgen receptor, CAG repeat polymorphism, androgen
Procedia PDF Downloads 180
3774 Image Compression Based on Regression SVM and Biorthogonal Wavelets
Authors: Zikiou Nadia, Lahdir Mourad, Ameur Soltane
Abstract:
In this paper, we propose an effective method for image compression based on SVM regression (SVR), with three different kernels, and the biorthogonal 2D Discrete Wavelet Transform. SVM regression can learn dependencies from training data and compress them by using fewer training points (support vectors) to represent the original data and eliminate redundancy. A biorthogonal wavelet is used to transform the image, and the resulting coefficients are then used to train SVMs with different kernels (Gaussian, polynomial, and linear). Run-length and arithmetic coders are used to encode the support vectors and their corresponding weights obtained from the SVM regression. The peak signal-to-noise ratio (PSNR) and compression ratios of several test images, compressed with our algorithm using different kernels, are presented. Compared with the other kernels, the Gaussian kernel achieves better image quality. Experimental results show that our method considerably improves compression performance.
Keywords: image compression, 2D discrete wavelet transform (DWT-2D), support vector regression (SVR), SVM Kernels, run-length, arithmetic coding
Procedia PDF Downloads 380
3773 Application and Verification of Regression Model to Landslide Susceptibility Mapping
Authors: Masood Beheshtirad
Abstract:
Identification of regions with potential for landslide occurrence is one of the basic measures in natural resources management. Different landslide hazard mapping models have been proposed based on environmental conditions and goals. In this research, a landslide hazard map was produced using a multiple regression model, and the applicability of this model was investigated in the Baghdasht watershed. The dependent variable is the landslide inventory map, and the independent variables consist of information layers such as geology, slope, aspect, distance from river, distance from road, fault, and land use. To this end, existing landslides were identified and an inventory map was made. The landslide hazard map was then derived from the fitted multiple regression. The similarity between the potential hazard classes and figures of this model and the landslide inventory map was assessed in the SPSS environment. The results showed a significant correlation between the potential hazard classes and figures and the area of the landslides. The multiple regression model is suitable for application in the Baghdasht watershed.
Keywords: landslide, mapping, multiple model, regression
Procedia PDF Downloads 322
3772 An EWMA P-Chart Based on Improved Square Root Transformation
Authors: Saowanit Sukparungsee
Abstract:
The traditional Shewhart p chart was developed for charting binomial data. This chart relies on the normal approximation, under conditions of a low defect level and a small to moderate sample size. Real applications, however, often depart from these assumptions because of skewness in the exact distribution. In this paper, a modified Exponentially Weighted Moving Average (EWMA) control chart for detecting a change in binomial data is proposed by improving square root transformations, namely the ISRT p EWMA control chart. The numerical results show that the ISRT p EWMA chart is superior to the ISRT p chart for small to moderate shifts, whereas the latter is better for large shifts.
Keywords: number of defects, exponentially weighted moving average, average run length, square root transformations
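For orientation, a plain EWMA chart on sample proportions can be sketched as follows. This is the textbook EWMA recursion with time-varying limits, not the ISRT p EWMA chart of the paper: the improved square root transformation is not specified in the abstract and is therefore omitted. The function name `ewma_p_chart`, the smoothing constant `lam`, the limit width `L`, and the sample data are all assumptions.

```python
import math


def ewma_p_chart(proportions, p0, n, lam=0.2, L=3.0):
    """Plain EWMA chart for sample proportions (no ISRT transformation).

    proportions : observed fraction defective in each sample
    p0          : in-control fraction defective
    n           : sample size
    lam, L      : smoothing constant and control-limit width (assumed values)
    Returns a list of (ewma, lcl, ucl, signal) tuples, one per sample.
    """
    z = p0  # the EWMA statistic starts at the in-control level
    results = []
    for t, p in enumerate(proportions, start=1):
        z = lam * p + (1 - lam) * z
        # Standard deviation of the EWMA statistic after t samples
        sigma = math.sqrt(p0 * (1 - p0) / n * lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        lcl = max(0.0, p0 - L * sigma)
        ucl = p0 + L * sigma
        results.append((z, lcl, ucl, not (lcl <= z <= ucl)))
    return results


# Hypothetical data: an in-control process and a process shifted to p = 0.5
in_control = ewma_p_chart([0.1, 0.1, 0.1], p0=0.1, n=50)
shifted = ewma_p_chart([0.5, 0.5, 0.5], p0=0.1, n=50)
```

A small `lam` gives more weight to past samples, which is why EWMA charts tend to react faster than Shewhart charts to small, sustained shifts.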
Procedia PDF Downloads 437
3771 Predicting Bridge Pier Scour Depth with SVM
Authors: Arun Goel
Abstract:
Prediction of maximum local scour is necessary for the safe and economical design of bridges. A number of equations have been developed over the years to predict local scour depth using laboratory data, and a few pier equations have also been proposed using field data. Most of these equations are empirical in nature, as indicated by past publications. In this paper, attempts have been made to compute the local depth of scour around a bridge pier in dimensional and non-dimensional form by using linear regression, simple regression and SVM (Poly and Rbf) techniques, along with a few conventional empirical equations. The outcome of this study suggests that SVM (Poly and Rbf) based modeling can be employed as an alternative to linear regression, simple regression and the conventional empirical equations in predicting the scour depth of bridge piers. The results of the present study indicate that SVM (Poly and Rbf) performs better on the non-dimensional form of bridge pier scour than on the dimensional form.
Keywords: modeling, pier scour, regression, prediction, SVM (Poly and Rbf kernels)
Procedia PDF Downloads 450
3770 Arabic Character Recognition Using Regression Curves with the Expectation Maximization Algorithm
Authors: Abdullah A. AlShaher
Abstract:
In this paper, we demonstrate how regression curves can be used to recognize 2D non-rigid handwritten shapes. Each shape is represented by a set of non-overlapping, uniformly distributed landmarks. The underlying models use second-order polynomials to model shapes within a training set. To estimate the regression models, we need to extract the coefficients that describe the variations within a shape class. Hence, a least squares method is used to estimate such modes. We then proceed by training these coefficients using the apparatus of the Expectation Maximization algorithm. Recognition is carried out by finding the least-error landmark displacement with respect to the model curves. Handwritten isolated Arabic characters are used to evaluate our approach.
Keywords: character recognition, regression curves, handwritten Arabic letters, expectation maximization algorithm
Procedia PDF Downloads 143
3769 Reminiscence Therapy for Alzheimer’s Disease Restrained on Logistic Regression Based Linear Bootstrap Aggregating
Authors: P. S. Jagadeesh Kumar, Mingmin Pan, Xianpei Li, Yanmin Yuan, Tracy Lin Huan
Abstract:
Researchers are actively investigating the inherited features of Alzheimer’s disease and possible corresponding therapies. In Alzheimer’s, memories are lost in reverse order: recently formed memories are more transitory than older ones. Reminiscence therapy involves discussing past activities, events and experiences with another individual or group of people, frequently with the help of tangible prompts such as photos, household and other familiar items from the past, music and tape recordings. In this manuscript, the effectiveness of reminiscence therapy for Alzheimer’s disease is measured using logistic regression based linear bootstrap aggregating. Logistic regression is used to predict the experiential features of the patient’s memory through various therapies. Linear bootstrap aggregating shows better stability and accuracy of reminiscence therapy used in statistical classification and regression of memories related to validation therapy, supportive psychotherapy, sensory integration and simulated presence therapy.
Keywords: Alzheimer’s disease, linear bootstrap aggregating, logistic regression, reminiscence therapy
Procedia PDF Downloads 307
3768 Statistical Convergence of the Szasz-Mirakjan-Kantorovich-Type Operators
Authors: Rishikesh Yadav, Ramakanta Meher, Vishnu Narayan Mishra
Abstract:
The main aim of this article is to investigate the statistical convergence of summation-integral type operators and to obtain their weighted statistical convergence. The rate of statistical convergence is also studied by means of the modulus of continuity and for functions belonging to the Lipschitz class. We illustrate the convergence of the defined operators graphically and show a better rate of convergence than that of the Szasz-Mirakjan-Kantorovich operators. In the last section, we extend these operators to bivariate operators and study their rate of convergence in terms of the modulus of continuity and the Lipschitz class for functions of two variables.
Keywords: The Szasz-Mirakjan-Kantorovich operators, statistical convergence, modulus of continuity, Peetre's K-functional, weighted modulus of continuity
Procedia PDF Downloads 209
3767 Predicting Survival in Cancer: How Cox Regression Model Compares to Artifial Neural Networks?
Authors: Dalia Rimawi, Walid Salameh, Amal Al-Omari, Hadeel AbdelKhaleq
Abstract:
Prediction of the survival time of patients with cancer is a core factor that influences oncologists’ decisions in different respects, such as the treatment plans offered, patients’ quality of life and the development of medications. For a long time, proportional hazards Cox regression (ph Cox) was, and still is, the best-known statistical method for predicting survival outcome. However, owing to the revolution in data science, new prediction models have been employed and have proved to be more flexible and to provide higher accuracy in this type of study. The artificial neural network is one such model that is suitable for time-to-event prediction. In this study, we aim to compare ph Cox regression with the artificial neural network method in terms of the data handling and accuracy of each model.
Keywords: Cox regression, neural networks, survival, cancer.
Procedia PDF Downloads 198
3766 Survival and Hazard Maximum Likelihood Estimator with Covariate Based on Right Censored Data of Weibull Distribution
Authors: Al Omari Mohammed Ahmed
Abstract:
This paper focuses on the maximum likelihood estimator with covariates. Covariates are incorporated into the Weibull model. Under this regression model, the covariate parameters, the shape parameter, the survival function and the hazard rate of the Weibull regression distribution with right-censored data are estimated by maximum likelihood. The mean square error (MSE) and absolute bias are used to compare the performance of the Weibull regression distribution. For the simulation comparison, the study used various sample sizes and several specific values of the Weibull shape parameter.
Keywords: weibull regression distribution, maximum likelihood estimator, survival function, hazard rate, right censoring
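The quantities named in the abstract can be sketched with the standard two-parameter Weibull formulas. The abstract does not state how the covariates enter the model, so the log-linear link on the scale parameter in `scale_from_covariates` is an assumption (one common parameterization), as are all function names and the example numbers.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def scale_from_covariates(beta, x):
    """Assumed link: scale = exp(beta . x); not specified in the abstract."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))


def censored_loglik(times, events, shape, scale):
    """Log-likelihood for right-censored data.

    events[i] is 1 for an observed failure and 0 for a censored time.
    Failures contribute log h(t) + log S(t); censored times contribute log S(t).
    """
    loglik = 0.0
    for t, d in zip(times, events):
        loglik += -((t / scale) ** shape)  # log S(t)
        if d:
            loglik += math.log(weibull_hazard(t, shape, scale))
    return loglik


# With shape = 1 the Weibull reduces to the exponential distribution
s = weibull_survival(2.0, shape=1.0, scale=2.0)
h = weibull_hazard(5.0, shape=1.0, scale=2.0)
ll_event = censored_loglik([2.0], [1], shape=1.0, scale=2.0)
ll_censored = censored_loglik([2.0], [0], shape=1.0, scale=2.0)
```

Maximizing `censored_loglik` over the shape and the covariate coefficients, for example with a numerical optimizer, would give maximum likelihood estimates under these assumptions.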
Procedia PDF Downloads 439
3765 Hybrid Artificial Bee Colony and Least Squares Method for Rule-Based Systems Learning
Authors: Ahcene Habbi, Yassine Boudouaoui
Abstract:
This paper deals with the problem of automatic rule generation for fuzzy systems design. The proposed approach is based on hybrid artificial bee colony (ABC) optimization and the weighted least squares (LS) method and aims to find the structure and parameters of fuzzy systems simultaneously. More precisely, two ABC-based fuzzy modeling strategies are presented and compared. The first strategy uses global optimization to learn fuzzy models; the second hybridizes ABC with the weighted least squares estimation method. The performance of the proposed ABC and ABC-LS fuzzy modeling strategies is evaluated on complex modeling problems and compared to other advanced modeling methods.
Keywords: automatic design, learning, fuzzy rules, hybrid, swarm optimization
Procedia PDF Downloads 436
3764 Machine Vision System for Measuring the Quality of Bulk Sun-dried Organic Raisins
Authors: Navab Karimi, Tohid Alizadeh
Abstract:
An intelligent vision-based system was designed to measure the quality and purity of raisins. A machine vision setup was used to capture images of bulk raisins in ranges of 5-50% mixed pure-impure berries. The textural features of bulk raisins were extracted using Grey-Level Histograms, the Grey-Level Co-occurrence Matrix (GLCM), and Local Binary Patterns (a total of 108 features). A genetic algorithm and neural network regression were used to select and rank the best features (21 features). As a result, the GLCM feature set was found to have the highest accuracy (92.4%) of all the sets. Subsequently, multiple feature combinations from the previous stage were fed into a second regression (linear regression) to increase accuracy, and a combination of 16 features was found to be the optimum. Finally, a Support Vector Machine (SVM) classifier was used to differentiate the mixtures, producing the best efficiency and accuracy of 96.2% and 97.35%, respectively.
Keywords: sun-dried organic raisin, genetic algorithm, feature extraction, ANN regression, linear regression, support vector machine, South Azerbaijan
Procedia PDF Downloads 72
3763 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text
Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni
Abstract:
The problem of entity relation discovery in unstructured data, a well-covered topic in the literature, consists of searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary or a specific collection of named items. In many cases, machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists of collecting the cooccurrences of any two words (entities) in order to create a graph of relations, a cooccurrence graph. Indeed, each cooccurrence highlights some degree of semantic correlation between the words, because related words are more commonly found close to each other than on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper, we generalise this technique, arriving at a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment to support this intuition by applying this technique to a data set consisting of the text of the Bible, split into verses.
Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance
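The weighted-distance idea can be sketched as follows. This is an illustration only: the abstract does not specify the weight function or the window size, so the inverse-distance weight `1 / d`, the default window of five tokens, and the function name `weighted_cooccurrences` are all assumptions.

```python
from collections import defaultdict

def weighted_cooccurrences(tokens, entities, window=5, weight=lambda d: 1.0 / d):
    """Build a weighted cooccurrence graph as {(entity_a, entity_b): weight}.

    Two entity mentions are paired when they are fewer than `window` tokens
    apart; each pair adds weight(d), where d is their token distance.
    The 1/d default is an assumed weight, not the one used in the paper.
    """
    graph = defaultdict(float)
    # Positions of entity mentions, in text order.
    mentions = [(i, tok) for i, tok in enumerate(tokens) if tok in entities]
    for a, (i, u) in enumerate(mentions):
        for j, v in mentions[a + 1:]:
            d = j - i
            if d >= window:
                break  # later mentions are even further away
            if u != v:
                graph[tuple(sorted((u, v)))] += weight(d)
    return dict(graph)
```

Passing `weight=lambda d: 1.0` recovers a plain count of pairs that fit in the window, which makes it easy to compare the weighted and unweighted graphs on the same text. Each pair is counted once here, not once per window position.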
Procedia PDF Downloads 149
3762 Breast Cancer Survivability Prediction via Classifier Ensemble
Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia
Abstract:
This paper presents a classifier ensemble approach for predicting the survivability of breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components: feature selection and the classifier ensemble. The feature selection component divides the features in the SEER database into four groups. It then tries to find the most important features among the four groups that maximize the weighted average F-score of a given classification algorithm. The ensemble component uses three different classifiers, each of which models a different set of features from SEER obtained through the feature selection module. On top of them, another classifier gives the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms were examined; the best setup found uses the decision tree, Bayesian network, and Naïve Bayes algorithms for the underlying classifiers and Naïve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same SEER data (period of 1973-2002). It gives an 87.39% weighted average F-score compared to 85.82% and 81.34% for the other published systems. When the data size is increased to cover the whole database (period of 1973-2014), the overall weighted average F-score rises to 92.4% on the held-out, unseen test set.
Keywords: classifier ensemble, breast cancer survivability, data mining, SEER
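For readers unfamiliar with the metric, a weighted average F-score is commonly computed as the per-class F1 score averaged with weights proportional to each class's number of true instances. The sketch below follows that common definition; whether the authors computed it in exactly this way is an assumption, and the function name `weighted_f_score` is hypothetical.

```python
from collections import Counter

def weighted_f_score(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)  # number of true instances per class
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1  # weight each class by its support
    return total / len(y_true)
```

Weighting by support keeps a rare class from dominating the average, which matters when the classes, such as survived and not survived, are imbalanced.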
Procedia PDF Downloads 323
3761 Using Scale Invariant Feature Transform Features to Recognize Characters in Natural Scene Images
Authors: Belaynesh Chekol, Numan Çelebi
Abstract:
The main purpose of this work is to recognize individual characters extracted from natural scene images using scale invariant feature transform (SIFT) features as input to K-nearest neighbor (KNN), a classification learning algorithm. For this task, 1,068 and 78 images of English alphabet characters taken from the Chars74k data set were used to train and test the classifier, respectively. For each character image, we generated descriptive features using the SIFT algorithm. This set of features is fed to the learner so that it can recognize and label new images of English characters. Two types of KNN (fine KNN and weighted KNN) were trained, and the resulting classification accuracies were 56.9% and 56.5%, respectively. The training time was the same for both fine and weighted KNN.
Keywords: character recognition, KNN, natural scene image, SIFT
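The difference between a plain and a distance-weighted KNN vote can be sketched as below. The abstract does not give the neighbour counts or the weighting scheme, so the squared inverse-distance weight and the toy two-dimensional feature vectors are assumptions for illustration; real SIFT descriptors are much higher-dimensional. The function name `knn_predict` is hypothetical.

```python
import math
from collections import defaultdict

def knn_predict(train, query, k=1, weighted=False):
    """Classify `query` from `train`, a list of (feature_vector, label) pairs.

    With weighted=True each neighbour votes with 1 / distance**2
    (an assumed scheme); otherwise every neighbour has one vote.
    """
    neighbours = sorted((math.dist(x, query), label) for x, label in train)[:k]
    votes = defaultdict(float)
    for dist, label in neighbours:
        # The small constant avoids division by zero for exact matches.
        votes[label] += 1.0 / (dist ** 2 + 1e-12) if weighted else 1.0
    return max(votes, key=votes.get)

# Toy data: one "A" sample close to the query, two "B" samples further away.
train = [((0.0, 0.0), "A"), ((1.0, 0.0), "B"), ((1.2, 0.0), "B")]
query = (0.4, 0.0)
print(knn_predict(train, query, k=3))                 # majority vote
print(knn_predict(train, query, k=3, weighted=True))  # distance-weighted vote
```

In the toy data the two votes disagree: the majority of the three neighbours is "B", while the single close "A" sample outweighs them once distance is taken into account.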
Procedia PDF Downloads 279
3760 Deep Vision: A Robust Dominant Colour Extraction Framework for T-Shirts Based on Semantic Segmentation
Authors: Kishore Kumar R., Kaustav Sengupta, Shalini Sood Sehgal, Poornima Santhanam
Abstract:
Fashion is a human expression that is constantly changing. One of the prime factors that consistently influences fashion is the change in colour preferences. The role of colour in our everyday lives is very significant. It subconsciously explains a lot about one’s mindset and mood. Analyzing colours by extracting them from outfit images is a critical step in examining individual and consumer behaviour. Several research works have been carried out on extracting colours from images, but to the best of our knowledge, no studies have extracted colours from specific apparel and identified colour patterns geographically. This paper proposes a framework for accurately extracting colours from T-shirt images and predicting dominant colours geographically. The proposed method consists of two stages: first, a U-Net deep learning model is adopted to segment the T-shirts from the images. Second, the colours are extracted only from the T-shirt segments. The proposed method employs the iMaterialist (Fashion) 2019 dataset for the semantic segmentation task. The proposed framework also includes a mechanism for gathering data and analyzing India’s general colour preferences. From this research, it was observed that black and grey are the dominant colours in different regions of India. The proposed method can be adapted to study fashion’s evolving colour preferences.
Keywords: colour analysis in t-shirts, convolutional neural network, encoder-decoder, k-means clustering, semantic segmentation, U-Net model
Procedia PDF Downloads 111
3759 Cognitive Weighted Polymorphism Factor: A New Cognitive Complexity Metric
Authors: T. Francis Thamburaj, A. Aloysius
Abstract:
Polymorphism is one of the main pillars of the object-oriented paradigm. It induces hidden forms of class dependencies that may impact software quality, resulting in a higher cost for comprehending, debugging, testing, and maintaining the software. In this paper, a new cognitive complexity metric called the Cognitive Weighted Polymorphism Factor (CWPF) is proposed. Apart from the structural complexity of the software, it includes the cognitive complexity on the basis of type. The cognitive weights are calibrated based on 27 empirical studies with 120 persons. A case study and experimentation with the new software metric show positive results. Further, a comparative study was made, and the correlation test shows that the CWPF complexity metric is a better, more comprehensive, and more realistic indicator of software complexity than Abreu’s Polymorphism Factor (PF) complexity metric.
Keywords: cognitive complexity metric, object-oriented metrics, polymorphism factor, software metrics
Procedia PDF Downloads 457