Search results for: stepwise regression analysis.
9000 Fuzzy Logic Approach to Robust Regression Models of Uncertain Medical Categories
Authors: Arkady Bolotin
Abstract:
Dichotomization of the outcome by a single cut-off point is an important part of various medical studies. Usually the relationship between the resulted dichotomized dependent variable and explanatory variables is analyzed with linear regression, probit regression or logistic regression. However, in many real-life situations, a certain cut-off point dividing the outcome into two groups is unknown and can be specified only approximately, i.e. surrounded by some (small) uncertainty. It means that in order to have any practical meaning the regression model must be robust to this uncertainty. In this paper, we show that neither the beta in the linear regression model, nor its significance level is robust to the small variations in the dichotomization cut-off point. As an alternative robust approach to the problem of uncertain medical categories, we propose to use the linear regression model with the fuzzy membership function as a dependent variable. This fuzzy membership function denotes to what degree the value of the underlying (continuous) outcome falls below or above the dichotomization cut-off point. In the paper, we demonstrate that the linear regression model of the fuzzy dependent variable can be insensitive against the uncertainty in the cut-off point location. In the paper we present the modeling results from the real study of low hemoglobin levels in infants. We systematically test the robustness of the binomial regression model and the linear regression model with the fuzzy dependent variable by changing the boundary for the category Anemia and show that the behavior of the latter model persists over a quite wide interval.
Keywords: Categorization, Uncertain medical categories, Binomial regression model, Fuzzy dependent variable, Robustness.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15628999 Dichotomous Logistic Regression with Leave-One-Out Validation
Authors: Sin Yin Teh, Abdul Rahman Othman, Michael Boon Chong Khoo
Abstract:
In this paper, the concepts of dichotomous logistic regression (DLR) with leave-one-out (L-O-O) were discussed. To illustrate this, the L-O-O was run to determine the importance of the simulation conditions for robust test of spread procedures with good Type I error rates. The resultant model was then evaluated. The discussions included 1) assessment of the accuracy of the model, and 2) parameter estimates. These were presented and illustrated by modeling the relationship between the dichotomous dependent variable (Type I error rates) with a set of independent variables (the simulation conditions). The base SAS software containing PROC LOGISTIC and DATA step functions can be making used to do the DLR analysis.Keywords: Dichotomous logistic regression, leave-one-out, testof spread.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20738998 A General Regression Test Selection Technique
Authors: Walid S. Abd El-hamid, Sherif S. El-etriby, Mohiy M. Hadhoud
Abstract:
This paper presents a new methodology to select test cases from regression test suites. The selection strategy is based on analyzing the dynamic behavior of the applications that written in any programming language. Methods based on dynamic analysis are more safe and efficient. We design a technique that combine the code based technique and model based technique, to allow comparing the object oriented of an application that written in any programming language. We have developed a prototype tool that detect changes and select test cases from test suite.Keywords: Regression testing, Model based testing, Dynamicbehavior.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19798997 The Relative Efficiency of Parameter Estimation in Linear Weighted Regression
Authors: Baoguang Tian, Nan Chen
Abstract:
A new relative efficiency in linear model in reference is instructed into the linear weighted regression, and its upper and lower bound are proposed. In the linear weighted regression model, for the best linear unbiased estimation of mean matrix respect to the least-squares estimation, two new relative efficiencies are given, and their upper and lower bounds are also studied.
Keywords: Linear weighted regression, Relative efficiency, Mean matrix, Trace.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24778996 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data
Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone
Abstract:
This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify environmental and economic elements that are contributing to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear poisson, and regularization regression methods such as ridge, lasso, and elastic net penalty were employed. Cross-validation was performed to acquire tuning parameters. The methods proposed can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease dataset, the study successfully identified key factors, and the results were consistent with previous studies.
Keywords: Lyme disease, Poisson generalized linear model, Ridge regression, Lasso Regression, elastic net regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1268995 A Study on a Research and Development Cost-Estimation Model in Korea
Authors: Babakina Alexandra, Yong Soo Kim
Abstract:
In this study, we analyzed the factors that affect research funds using linear regression analysis to increase the effectiveness of investments in national research projects. We collected 7,916 items of data on research projects that were in the process of being finished or were completed between 2010 and 2011. Data pre-processing and visualization were performed to derive statistically significant results. We identified factors that affected funding using analysis of fit distributions and estimated increasing or decreasing tendencies based on these factors.
Keywords: R&D funding, Cost estimation, Linear regression, Preliminary feasibility study.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22508994 Predictors of Academic Achievement of Student ICT Teachers with Different Learning Styles
Authors: Deniz Deryakulu, Şener Büyüköztürk Hüseyin Özçınar
Abstract:
The main purpose of this study was to determine the predictors of academic achievement of student Information and Communications Technologies (ICT) teachers with different learning styles. Participants were 148 student ICT teachers from Ankara University. Participants were asked to fill out a personal information sheet, the Turkish version of Kolb-s Learning Style Inventory, Weinstein-s Learning and Study Strategies Inventory, Schommer's Epistemological Beliefs Questionnaire, and Eysenck-s Personality Questionnaire. Stepwise regression analyses showed that the statistically significant predictors of the academic achievement of the accommodators were attitudes and high school GPAs; of the divergers was anxiety; of the convergers were gender, epistemological beliefs, and motivation; and of the assimilators were gender, personality, and test strategies. Implications for ICT teaching-learning processes and teacher education are discussed.
Keywords: Academic achievement, student ICT teachers, Kolb learning styles, experiential learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26108993 The Relationship between Iranian EFL Learners' Multiple Intelligences and Their Performance on Grammar Tests
Authors: Rose Shayeghi, Pejman Hosseinioun
Abstract:
The Multiple Intelligences theory characterizes human intelligence as a multifaceted entity that exists in all human beings with varying degrees. The most important contribution of this theory to the field of English Language Teaching (ELT) is its role in identifying individual differences and designing more learnercentered programs. The present study aims at investigating the relationship between different elements of multiple intelligence and grammar scores. To this end, 63 female Iranian EFL learner selected from among intermediate students participated in the study. The instruments employed were a Nelson English language test, Michigan Grammar Test, and Teele Inventory for Multiple Intelligences (TIMI). The results of Pearson Product-Moment Correlation revealed a significant positive correlation between grammatical accuracy and linguistic as well as interpersonal intelligence. The results of Stepwise Multiple Regression indicated that linguistic intelligence contributed to the prediction of grammatical accuracy.Keywords: Multiple intelligence, grammar, ELT, EFL, TIMI.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24218992 A Study of Panel Logit Model and Adaptive Neuro-Fuzzy Inference System in the Prediction of Financial Distress Periods
Authors: Ε. Giovanis
Abstract:
The purpose of this paper is to present two different approaches of financial distress pre-warning models appropriate for risk supervisors, investors and policy makers. We examine a sample of the financial institutions and electronic companies of Taiwan Security Exchange (TSE) market from 2002 through 2008. We present a binary logistic regression with paned data analysis. With the pooled binary logistic regression we build a model including more variables in the regression than with random effects, while the in-sample and out-sample forecasting performance is higher in random effects estimation than in pooled regression. On the other hand we estimate an Adaptive Neuro-Fuzzy Inference System (ANFIS) with Gaussian and Generalized Bell (Gbell) functions and we find that ANFIS outperforms significant Logit regressions in both in-sample and out-of-sample periods, indicating that ANFIS is a more appropriate tool for financial risk managers and for the economic policy makers in central banks and national statistical services.Keywords: ANFIS, Binary logistic regression, Financialdistress, Panel data
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23448991 Interrelationships between Physicochemical Water Pollution Indicators: A Case Study of River Pandu
Authors: Sunita Verma , Divya Tiwari, Ajay Verma
Abstract:
Water samples were collected from river Pandu at six stations where human and animal activities were high. Composite samples were analyzed for dissolved oxygen (DO), biochemical oxygen demand (BOD), chemical oxygen demand (COD) , pH values during dry and wet seasons as well as the harmattan period. The total data points were used to establish relationships between the parameters and data were also subjected to statistical analysis and expressed as mean ± standard error of mean (SEM) at a level of significance of p<0.05. Regression analysis was carried out to establish relationships if any between studied parameters and relationships in form of scatter plots were obtained between DO/BOD, COD/DO, BOD/COD, COD/pH, BOD/pH and DO/pH. The high to moderate correlation coefficient observed, R2 ranged from 0.68 to 0.15 between these parameters.Keywords: BOD, DO, COD, pH, Regression analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21328990 Extended Least Squares LS–SVM
Authors: József Valyon, Gábor Horváth
Abstract:
Among neural models the Support Vector Machine (SVM) solutions are attracting increasing attention, mostly because they eliminate certain crucial questions involved by neural network construction. The main drawback of standard SVM is its high computational complexity, therefore recently a new technique, the Least Squares SVM (LS–SVM) has been introduced. In this paper we present an extended view of the Least Squares Support Vector Regression (LS–SVR), which enables us to develop new formulations and algorithms to this regression technique. Based on manipulating the linear equation set -which embodies all information about the regression in the learning process- some new methods are introduced to simplify the formulations, speed up the calculations and/or provide better results.Keywords: Function estimation, Least–Squares Support VectorMachines, Regression, System Modeling
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20138989 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm
Authors: Ameur Abdelkader, Abed Bouarfa Hafida
Abstract:
Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.
Keywords: Predictive analysis, big data, predictive analysis algorithms. CART algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10768988 Numerical Simulation of a Pressure Regulated Valve to Find Out the Characteristics of Passive Control Circuit
Authors: Binod Kumar Saha
Abstract:
The objective of the present paper is a numerical analysis of the flow forces acting on spool surfaces of a pressure regulated valve. The transient, compressible and turbulent flow structures inside the valve are simulated using ANSYS FLUENT coupled with a special UDF. Here, valve inlet pressure is varied in a stepwise manner. For every value of inlet pressure, transient analysis leads to a quasi-static flow through the valve. Spool forces are calculated based on different pressures at inlet. From this information of spool forces, pressure characteristic of the passive control circuit has been derived.Keywords: Pressure Regulating Valve, Spool Opening, Spool Movement, Force Balance, CFD.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38688987 Enhancing Predictive Accuracy in Pharmaceutical Sales Through an Ensemble Kernel Gaussian Process Regression Approach
Authors: Shahin Mirshekari, Mohammadreza Moradi, Hossein Jafari, Mehdi Jafari, Mohammad Ensaf
Abstract:
This research employs Gaussian Process Regression (GPR) with an ensemble kernel, integrating Exponential Squared, Revised Matérn, and Rational Quadratic kernels to analyze pharmaceutical sales data. Bayesian optimization was used to identify optimal kernel weights: 0.76 for Exponential Squared, 0.21 for Revised Matérn, and 0.13 for Rational Quadratic. The ensemble kernel demonstrated superior performance in predictive accuracy, achieving an R² score near 1.0, and significantly lower values in MSE, MAE, and RMSE. These findings highlight the efficacy of ensemble kernels in GPR for predictive analytics in complex pharmaceutical sales datasets.
Keywords: Gaussian Process Regression, Ensemble Kernels, Bayesian Optimization, Pharmaceutical Sales Analysis, Time Series Forecasting, Data Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1138986 Time Series Regression with Meta-Clusters
Authors: Monika Chuchro
Abstract:
This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain subgroups of time series data with normal distribution from the inflow into wastewater treatment plant data, composed of several groups differing by mean value. Two simple algorithms, K-mean and EM, were chosen as a clustering method. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was performed for each subgroups. The final model was a sum of the subgroups models. The quality of the obtained model was compared with the regression model made using the same explanatory variables, but with no clustering of data. Results were compared using determination coefficient (R2), measure of prediction accuracy- mean absolute percentage error (MAPE) and comparison on a linear chart. Preliminary results allow us to foresee the potential of the presented technique.
Keywords: Clustering, Data analysis, Data mining, Predictive models.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19518985 Estimating Regression Parameters in Linear Regression Model with a Censored Response Variable
Authors: Jesus Orbe, Vicente Nunez-Anton
Abstract:
In this work we study the effect of several covariates X on a censored response variable T with unknown probability distribution. In this context, most of the studies in the literature can be located in two possible general classes of regression models: models that study the effect the covariates have on the hazard function; and models that study the effect the covariates have on the censored response variable. Proposals in this paper are in the second class of models and, more specifically, on least squares based model approach. Thus, using the bootstrap estimate of the bias, we try to improve the estimation of the regression parameters by reducing their bias, for small sample sizes. Simulation results presented in the paper show that, for reasonable sample sizes and censoring levels, the bias is always smaller for the new proposals.
Keywords: Censored response variable, regression, bias.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14778984 Effects of Hidden Unit Sizes and Autoregressive Features in Mental Task Classification
Authors: Ramaswamy Palaniappan, Nai-Jen Huan
Abstract:
Classification of electroencephalogram (EEG) signals extracted during mental tasks is a technique that is actively pursued for Brain Computer Interfaces (BCI) designs. In this paper, we compared the classification performances of univariateautoregressive (AR) and multivariate autoregressive (MAR) models for representing EEG signals that were extracted during different mental tasks. Multilayer Perceptron (MLP) neural network (NN) trained by the backpropagation (BP) algorithm was used to classify these features into the different categories representing the mental tasks. Classification performances were also compared across different mental task combinations and 2 sets of hidden units (HU): 2 to 10 HU in steps of 2 and 20 to 100 HU in steps of 20. Five different mental tasks from 4 subjects were used in the experimental study and combinations of 2 different mental tasks were studied for each subject. Three different feature extraction methods with 6th order were used to extract features from these EEG signals: AR coefficients computed with Burg-s algorithm (ARBG), AR coefficients computed with stepwise least square algorithm (ARLS) and MAR coefficients computed with stepwise least square algorithm. The best results were obtained with 20 to 100 HU using ARBG. It is concluded that i) it is important to choose the suitable mental tasks for different individuals for a successful BCI design, ii) higher HU are more suitable and iii) ARBG is the most suitable feature extraction method.Keywords: Autoregressive, Brain-Computer Interface, Electroencephalogram, Neural Network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18048983 Adjusted Ratio and Regression Type Estimators for Estimation of Population Mean when some Observations are missing
Authors: Nuanpan Nangsue
Abstract:
Ratio and regression type estimators have been used by previous authors to estimate a population mean for the principal variable from samples in which both auxiliary x and principal y variable data are available. However, missing data are a common problem in statistical analyses with real data. Ratio and regression type estimators have also been used for imputing values of missing y data. In this paper, six new ratio and regression type estimators are proposed for imputing values for any missing y data and estimating a population mean for y from samples with missing x and/or y data. A simulation study has been conducted to compare the six ratio and regression type estimators with a previous estimator of Rueda. Two population sizes N = 1,000 and 5,000 have been considered with sample sizes of 10% and 30% and with correlation coefficients between population variables X and Y of 0.5 and 0.8. In the simulations, 10 and 40 percent of sample y values and 10 and 40 percent of sample x values were randomly designated as missing. The new ratio and regression type estimators give similar mean absolute percentage errors that are smaller than the Rueda estimator for all cases. The new estimators give a large reduction in errors for the case of 40% missing y values and sampling fraction of 30%.
Keywords: Auxiliary variable, missing data, ratio and regression type estimators.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17338982 Methods for Data Selection in Medical Databases: The Binary Logistic Regression -Relations with the Calculated Risks
Authors: Cristina G. Dascalu, Elena Mihaela Carausu, Daniela Manuc
Abstract:
The medical studies often require different methods for parameters selection, as a second step of processing, after the database-s designing and filling with information. One common task is the selection of fields that act as risk factors using wellknown methods, in order to find the most relevant risk factors and to establish a possible hierarchy between them. Different methods are available in this purpose, one of the most known being the binary logistic regression. We will present the mathematical principles of this method and a practical example of using it in the analysis of the influence of 10 different psychiatric diagnostics over 4 different types of offences (in a database made from 289 psychiatric patients involved in different types of offences). Finally, we will make some observations about the relation between the risk factors hierarchy established through binary logistic regression and the individual risks, as well as the results of Chi-squared test. We will show that the hierarchy built using the binary logistic regression doesn-t agree with the direct order of risk factors, even if it was naturally to assume this hypothesis as being always true.Keywords: Databases, risk factors, binary logisticregression, hierarchy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13298981 The Discriminate Analysis and Relevant Model for Mapping Export Potential
Authors: Jana Gutierrez Chvalkovská, Michal Mejstřík, Matěj Urban
Abstract:
There are pending discussions over the mapping of country export potential in order to refocus export strategy of firms and its evidence-based promotion by the Export Credit Agencies (ECAs) and other permitted vehicles of governments. In this paper we develop our version of an applied model that offers “stepwise” elimination of unattractive markets. We modify and calibrate the model for the particular features of the Czech Republic and specific pilot cases where we apply an individual approach to each sector.
Keywords: Export strategy, Modeling export, Calibration, Export promotion.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24208980 Reductions of Control Flow Graphs
Authors: Robert Gold
Abstract:
Control flow graphs are a well-known representation of the sequential control flow structure of programs with a multitude of applications. Not only single functions but also sets of functions or complete programs can be modeled by control flow graphs. In this case the size of the graphs can grow considerably and thus makes it difficult for software engineers to analyze the control flow. Graph reductions are helpful in this situation. In this paper we define reductions to subsets of nodes. Since executions of programs are represented by paths through the control flow graphs, paths should be preserved. Furthermore, the composition of reductions makes a stepwise analysis approach possible.
Keywords: Control flow graph, graph reduction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 34968979 Factors Determining Intention to Pursue Genetic Testing for People in Taiwan
Authors: Ju-Chun Chien
Abstract:
The Ottawa Charter for Health Promotion proposed that the role of health services should shift the focus from cure to prevention. Nowadays, besides having physical examinations, people could also conduct genetic tests to provide important information for diagnosing, treating, and/or preventing illnesses. However, because of the incompletion of the Chinese Genetic Database, people in Taiwan were still unfamiliar with genetic testing. The purposes of the present study were to: (1) Figure out people’s attitudes towards genetic testing. (2) Examine factors that influence people’s intention to pursue genetic testing by means of the Health Belief Model (HBM). A pilot study was conducted on 249 Taiwanese in 2017 to test the feasibility of the self-developed instrument. The reliability and construct validity of scores on the self-developed questionnaire revealed that this HBM-based questionnaire with 40 items was a well-developed instrument. A total of 542 participants were recruited and the valid participants were 535 (99%) between the ages of 20 and 86. Descriptive statistics, one-way ANOVA, two-way contingency table analysis, Pearson’s correlation, and stepwise multiple regression analysis were used in this study. The main results were that only 32 participants (6%) had already undergone genetic testing; moreover, their attitude towards genetic testing was more positive than those who did not have the experience. Compared with people who never underwent genetic tests, those who had gone for genetic testing had higher self-efficacy, greater intention to pursue genetic testing, had academic majors in health-related fields, had chronic and genetic diseases, possessed Catastrophic Illness Cards, and all of them had heard about genetic testing. The variables that best predicted people’s intention to pursue genetic testing were cues to action, self-efficacy, and perceived benefits (the three variables all correlated with one another positively at high magnitudes). To sum up, the HBM could be effective in designing and identifying the needs and priorities of the target population to pursue genetic testing.
Keywords: Genetic testing, intention to pursue genetic testing, Taiwan, health belief model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7028978 Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application
Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil
Abstract:
In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis – with dichotomous values showing presence or absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have total sixteen variables out of which one is assumed dependent and other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principle component analysis is applied. Based on results of data reduction, we have considered only 14 out of sixteen factors.
Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principle component analysis, unstable angina (U.A.).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21158977 Support Vector Regression for Retrieval of Soil Moisture Using Bistatic Scatterometer Data at X-Band
Authors: Dileep Kumar Gupta, Rajendra Prasad, Pradeep Kumar, Varun Narayan Mishra, Ajeet Kumar Vishwakarma, Prashant Kumar Srivastava
Abstract:
An approach was evaluated for the retrieval of soil moisture of bare soil surface using bistatic scatterometer data in the angular range of 200 to 700 at VV- and HH- polarization. The microwave data was acquired by specially designed X-band (10 GHz) bistatic scatterometer. The linear regression analysis was done between scattering coefficients and soil moisture content to select the suitable incidence angle for retrieval of soil moisture content. The 250 incidence angle was found more suitable. The support vector regression analysis was used to approximate the function described by the input output relationship between the scattering coefficient and corresponding measured values of the soil moisture content. The performance of support vector regression algorithm was evaluated by comparing the observed and the estimated soil moisture content by statistical performance indices %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE). The values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 2.9451, 1.0986 and 0.9214 respectively at HHpolarization. At VV- polarization, the values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 3.6186, 0.9373 and 0.9428 respectively.Keywords: Bistatic scatterometer, soil moisture, support vector regression, RMSE, %Bias, NSE.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32298976 Quality of Service Evaluation using a Combination of Fuzzy C-Means and Regression Model
Authors: Aboagela Dogman, Reza Saatchi, Samir Al-Khayatt
Abstract:
In this study, a network quality of service (QoS) evaluation system was proposed. The system used a combination of fuzzy C-means (FCM) and regression model to analyse and assess the QoS in a simulated network. Network QoS parameters of multimedia applications were intelligently analysed by FCM clustering algorithm. The QoS parameters for each FCM cluster centre were then inputted to a regression model in order to quantify the overall QoS. The proposed QoS evaluation system provided valuable information about the network-s QoS patterns and based on this information, the overall network-s QoS was effectively quantified.Keywords: Fuzzy C-means; regression model, network quality of service
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17218975 Estimating Bridge Deterioration for Small Data Sets Using Regression and Markov Models
Authors: Yina F. Muñoz, Alexander Paz, Hanns De La Fuente-Mella, Joaquin V. Fariña, Guilherme M. Sales
Abstract:
The primary approach for estimating bridge deterioration uses Markov-chain models and regression analysis. Traditional Markov models have problems in estimating the required transition probabilities when a small sample size is used. Often, reliable bridge data have not been taken over large periods, thus large data sets may not be available. This study presents an important change to the traditional approach by using the Small Data Method to estimate transition probabilities. The results illustrate that the Small Data Method and traditional approach both provide similar estimates; however, the former method provides results that are more conservative. That is, Small Data Method provided slightly lower than expected bridge condition ratings compared with the traditional approach. Considering that bridges are critical infrastructures, the Small Data Method, which uses more information and provides more conservative estimates, may be more appropriate when the available sample size is small. In addition, regression analysis was used to calculate bridge deterioration. Condition ratings were determined for bridge groups, and the best regression model was selected for each group. The results obtained were very similar to those obtained when using Markov chains; however, it is desirable to use more data for better results.
Keywords: Concrete bridges, deterioration, Markov chains, probability matrix.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14428974 A Research on Inference from Multiple Distance Variables in Hedonic Regression – Focus on Three Variables
Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro
Abstract:
In urban context, urban nodes such as amenity or hazard will certainly affect house price, while classic hedonic analysis will employ distance variables measured from each urban nodes. However, effects from distances to facilities on house prices generally do not represent the true price of the property. Distance variables measured on the same surface are suffering a problem called multicollinearity, which is usually presented as magnitude variance and mean value in regression, errors caused by instability. In this paper, we provided a theoretical framework to identify and gather the data with less bias, and also provided specific sampling method on locating the sample region to avoid the spatial multicollinerity problem in three distance variable’s case.
Keywords: Hedonic regression, urban node, distance variables, multicollinerity, collinearity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19948973 Data Mining Classification Methods Applied in Drug Design
Authors: Mária Stachová, Lukáš Sobíšek
Abstract:
Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with the great potential. They can help people to understand the patterns in certain chunk of information so it is obvious that the data mining tools have a wide area of applications. For example in the theoretical chemistry data mining tools can be used to predict moleculeproperties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of thecontribution is to create a classification model, which would be able to deal with a huge data set with high accuracy. For this purpose logistic regression, Bayesian logistic regression and random forest models were built using R software. TheBayesian logistic regression in Latent GOLD software was created as well. These classification methods belong to supervised learning methods. It was necessary to reduce data matrix dimension before construct models and thus the factor analysis (FA) was used. Those models were applied to predict the biological activity of molecules, potential new drug candidates.Keywords: data mining, classification, drug design, QSAR
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28528972 Level of Concentration in Banking Markets and Length of EU Membership
Authors: Ivan Pavic, Fran Galetic, Tomislava Pavic Kramaric
Abstract:
The purpose of this article is to analyze the degree of concentration in the banking market in EU member states as well as to determine the impact of the length of EU membership on the degree of concentration. In that sense several analysis were conducted, specifically, panel analysis, calculation of correlation coefficient and regression analysis of the impact of the length of EU membership on the degree of concentration. Panel analysis was conducted to determine whether there is a similar trend of concentration in three groups of countries - countries with a low, moderate and high level of concentration. The conducted panel analysis showed that in EU countries with a moderate level of concentration, the level of concentration decreases. The calculation of correlation showed that, to some extent, with other influential factors, the length of EU membership negatively affects the market concentration of the banking market. Using the regression analysis for investigation of the influence of the length of EU membership on the level of concentration in the banking sector in a particular country, the results reveal that there is a negative effect of the length in EU membership on market concentration, although it is not significantly influential variable.Keywords: Banking sector, concentration, EU
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18658971 Density Estimation using Generalized Linear Model and a Linear Combination of Gaussians
Authors: Aly Farag, Ayman El-Baz, Refaat Mohamed
Abstract:
In this paper we present a novel approach for density estimation. The proposed approach is based on using the logistic regression model to get initial density estimation for the given empirical density. The empirical data does not exactly follow the logistic regression model, so, there will be a deviation between the empirical density and the density estimated using logistic regression model. This deviation may be positive and/or negative. In this paper we use a linear combination of Gaussian (LCG) with positive and negative components as a model for this deviation. Also, we will use the expectation maximization (EM) algorithm to estimate the parameters of LCG. Experiments on real images demonstrate the accuracy of our approach.
Keywords: Logistic regression model, Expectationmaximization, Segmentation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1737