Search results for: Panel Analysis Regression.
9081 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data
Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone
Abstract:
This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify the environmental and economic elements that contribute to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear and Poisson regression were employed together with regularized regression methods using ridge, lasso, and elastic net penalties. Cross-validation was performed to select the tuning parameters. The proposed methods can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease dataset, the study successfully identified key factors, and the results were consistent with previous studies.
Keywords: Lyme disease, Poisson generalized linear model, Ridge regression, Lasso Regression, elastic net regression.
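As a rough illustration of the kind of penalized count model the abstract describes, the sketch below fits a Poisson regression with an elastic net penalty by proximal gradient descent. It is not the authors' implementation: the function names, the toy data, the step size and the penalty values are all assumptions made for this example, and the cross-validation step used to choose the tuning parameters is omitted.

```python
import math

def soft_threshold(z, t):
    """Proximal operator of t*|z|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_poisson_enet(X, y, lam, alpha=0.5, step=0.05, iters=2000):
    """Poisson GLM (log link) with elastic net penalty, hypothetical helper.

    Minimizes mean(exp(eta) - y*eta)
              + lam * (alpha*||b||_1 + (1 - alpha)/2 * ||b||_2^2),
    where eta = b0 + x.b and the intercept b0 is not penalized.
    alpha=1 gives the lasso penalty, alpha=0 the ridge penalty.
    """
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    for _ in range(iters):
        # Residuals mu_i - y_i of the current fit.
        resid = [math.exp(b0 + sum(bj * xj for bj, xj in zip(b, row))) - yi
                 for row, yi in zip(X, y)]
        g0 = sum(resid) / n
        g = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b0 -= step * g0
        # Gradient step on the smooth part (likelihood + ridge term),
        # then soft-thresholding for the L1 term.
        b = [soft_threshold(bj - step * (gj + lam * (1 - alpha) * bj),
                            step * lam * alpha)
             for bj, gj in zip(b, g)]
    return b0, b

# Toy data (made up): counts rise with the first covariate only.
X = [[-1.0, 0.5], [-0.5, -0.5], [0.0, 0.5], [0.5, -0.5],
     [1.0, 0.5], [1.0, -0.5], [-1.0, -0.5], [0.0, -0.5]]
y = [1, 1, 2, 3, 4, 5, 0, 1]

b0_weak, b_weak = fit_poisson_enet(X, y, lam=0.01)     # light penalty
b0_strong, b_strong = fit_poisson_enet(X, y, lam=100.0)  # heavy penalty
print(b_weak, b_strong)
```

With a heavy penalty every slope is shrunk exactly to zero, which is how the L1 part of the penalty performs variable selection; in practice the penalty strength would be chosen by cross-validation, as the abstract notes.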
Downloads: 122
9080 A Study on a Research and Development Cost-Estimation Model in Korea
Authors: Babakina Alexandra, Yong Soo Kim
Abstract:
In this study, we analyzed the factors that affect research funds using linear regression analysis to increase the effectiveness of investments in national research projects. We collected 7,916 items of data on research projects that were in the process of being finished or were completed between 2010 and 2011. Data pre-processing and visualization were performed to derive statistically significant results. We identified factors that affected funding using analysis of fit distributions and estimated increasing or decreasing tendencies based on these factors.
Keywords: R&D funding, Cost estimation, Linear regression, Preliminary feasibility study.
Downloads: 2246
9079 Interrelationships between Physicochemical Water Pollution Indicators: A Case Study of River Pandu
Authors: Sunita Verma, Divya Tiwari, Ajay Verma
Abstract:
Water samples were collected from the river Pandu at six stations where human and animal activities were high. Composite samples were analyzed for dissolved oxygen (DO), biochemical oxygen demand (BOD), chemical oxygen demand (COD), and pH during dry and wet seasons as well as the harmattan period. The total data points were used to establish relationships between the parameters, and the data were also subjected to statistical analysis and expressed as mean ± standard error of mean (SEM) at a significance level of p<0.05. Regression analysis was carried out to establish relationships, if any, between the studied parameters, and relationships in the form of scatter plots were obtained between DO/BOD, COD/DO, BOD/COD, COD/pH, BOD/pH and DO/pH. The correlation observed between these parameters was high to moderate, with R² ranging from 0.68 to 0.15.
Keywords: BOD, DO, COD, pH, Regression analysis.
Downloads: 2131
9078 Factors Influencing B2c eCommerce Diffusion
Authors: R. Mangiaracina, A. Perego, F. Campari
Abstract:
Although B2c eCommerce has become important in numerous economies, its adoption varies from country to country. This paper aims to identify the factors affecting (enabling or inhibiting) B2c eCommerce and to determine their quantitative impact on the diffusion of online sales across countries. A dynamic panel model has been developed to analyze the relationship between 13 factors (Macroeconomic, Demographic, Socio-Cultural, Infrastructural and Offer related), stemming from a complete literature analysis, and the B2c eCommerce value in 45 countries over 9 years. GDP, mobile penetration, Internet user penetration and credit card penetration have positive coefficients and emerged as enabling drivers of the B2c eCommerce value across countries, whereas equal distribution of income and the development of the traditional retailing network have negative coefficients and act as inhibiting factors.
Keywords: B2c eCommerce diffusion, influencing factors, dynamic panel model
Downloads: 3563
9077 Extended Least Squares LS–SVM
Authors: József Valyon, Gábor Horváth
Abstract:
Among neural models, Support Vector Machine (SVM) solutions are attracting increasing attention, mostly because they eliminate certain crucial questions involved in neural network construction. The main drawback of the standard SVM is its high computational complexity; therefore, a new technique, the Least Squares SVM (LS–SVM), has recently been introduced. In this paper we present an extended view of Least Squares Support Vector Regression (LS–SVR), which enables us to develop new formulations and algorithms for this regression technique. Based on manipulating the linear equation set, which embodies all information about the regression in the learning process, some new methods are introduced to simplify the formulations, speed up the calculations and/or provide better results.
Keywords: Function estimation, Least–Squares Support Vector Machines, Regression, System Modeling
Downloads: 2009
9076 Firm Ownership and Performance: Evidence for Croatian Listed Firms
Authors: M. Pervan, I. Pervan, M. Todoric
Abstract:
Using data on listed Croatian firms from the Zagreb Stock Exchange, we analyze the relationship between firm ownership (ownership concentration and type) and performance (ROA). Empirical research was conducted for the period 2003-2010, yielding a total of 1,430 observations. Empirical findings based on dynamic panel analysis indicate that the ownership concentration variable, CR4, is negatively related to performance, i.e. listed firms with dispersed ownership perform better than firms with concentrated ownership. The research also indicated that foreign controlled listed firms perform better than domestically controlled firms. Majority state owned firms perform worse than privately held firms, but the dummy variable for privately controlled firms was not statistically significant in the estimated panel model.
Keywords: Croatia, firm, ownership, performance
Downloads: 2290
9075 Does Corporate Governance or Transparency Affect Foreign Direct Investment?
Authors: Haksoon Kim
Abstract:
The paper investigates the relationship between foreign direct investment (FDI) and corporate governance or transparency by examining country-level FDI flows, FDI inward performance, and corporate governance and transparency variables. From a regression analysis with the Newey-West estimator of panel data for 28 countries from 1990 to 2002, we find strong positive relationships between the corporate governance or transparency level of hosting countries and FDI inward performance within hosting countries. A strong positive relationship is found between the anti-director rights level or number of analysts of hosting countries and FDI inward performance within hosting countries. We also find a positive relationship between the number of analysts of hosting countries and FDI inflows. The empirical results are consistent with stock market liberalization and corporate governance explanations of the reasons for FDI.
Keywords: corporate governance, corporate transparency, FDI flows, FDI inward performance
Downloads: 2761
9074 Categorical Data Modeling: Logistic Regression Software
Authors: Abdellatif Tchantchane
Abstract:
Matlab-based software for logistic regression is developed to enhance the teaching of quantitative topics and to assist researchers in analyzing a wide range of applications where categorical data are involved. The software offers an option of performing stepwise logistic regression to select the most significant predictors. It includes a feature to detect influential observations in the data, and investigates the effect of dropping or misclassifying an observation on a predictor variable. The input data may be supplied either as a set of individual responses (yes/no) with the predictor variables or as grouped records summarizing various categories for each unique set of predictor variables' values. Graphical displays are used to output various statistical results and to assess the goodness of fit of the logistic regression model. The software recognizes possible convergence constraints when present in the data, and the user is notified accordingly.
Keywords: Logistic regression, Matlab, Categorical data, Influential observation.
Downloads: 1882
9073 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm
Authors: Ameur Abdelkader, Abed Bouarfa Hafida
Abstract:
Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.
Keywords: Predictive analysis, big data, predictive analysis algorithms, CART algorithm.
Downloads: 1075
9072 Enhancing Predictive Accuracy in Pharmaceutical Sales Through an Ensemble Kernel Gaussian Process Regression Approach
Authors: Shahin Mirshekari, Mohammadreza Moradi, Hossein Jafari, Mehdi Jafari, Mohammad Ensaf
Abstract:
This research employs Gaussian Process Regression (GPR) with an ensemble kernel, integrating Exponential Squared, Revised Matérn, and Rational Quadratic kernels to analyze pharmaceutical sales data. Bayesian optimization was used to identify optimal kernel weights: 0.76 for Exponential Squared, 0.21 for Revised Matérn, and 0.13 for Rational Quadratic. The ensemble kernel demonstrated superior performance in predictive accuracy, achieving an R² score near 1.0, and significantly lower values in MSE, MAE, and RMSE. These findings highlight the efficacy of ensemble kernels in GPR for predictive analytics in complex pharmaceutical sales datasets.
Keywords: Gaussian Process Regression, Ensemble Kernels, Bayesian Optimization, Pharmaceutical Sales Analysis, Time Series Forecasting, Data Analysis.
Downloads: 111
9071 Experimental Study on Quasi-Static Response of Multi-layer Sandwich Composite Structures
Authors: S. Jedari Salami
Abstract:
In this paper, the effects on quasi-static loading of adding an extra layer within a sandwich panel and of the core types in the top and bottom cores are studied experimentally. The panel includes polymer composite laminated sheets for the faces and an internal laminated sheet, called the extra layer sheet, and two types of crushable foam are selected as the core material. Quasi-static tests were carried out with a ZWICK testing machine on fully backed specimens with two foam cores, Poly Urethane Rigid (PUR) and Poly Vinyl Chloride (PVC). It was found that the core material type plays a significant role in improving the sandwich panel’s behavior compared with the effect of the extra layer location.
Keywords: Multi-layer sandwich structures, Internal sheet, Crushable foam, Top core, Bottom core.
Downloads: 2192
9070 Time Series Regression with Meta-Clusters
Authors: Monika Chuchro
Abstract:
This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain normally distributed subgroups of time series data from the inflow data of a wastewater treatment plant, which is composed of several groups differing by mean value. Two simple algorithms, K-means and EM, were chosen as clustering methods. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was fitted for each subgroup. The final model was the sum of the subgroup models. The quality of the obtained model was compared with a regression model built using the same explanatory variables but with no clustering of the data. Results were compared using the coefficient of determination (R²), a measure of prediction accuracy (mean absolute percentage error, MAPE), and a comparison on a line chart. Preliminary results allow us to foresee the potential of the presented technique.
Keywords: Clustering, Data analysis, Data mining, Predictive models.
Downloads: 1951
9069 Estimating Regression Parameters in Linear Regression Model with a Censored Response Variable
Authors: Jesus Orbe, Vicente Nunez-Anton
Abstract:
In this work we study the effect of several covariates X on a censored response variable T with unknown probability distribution. In this context, most of the studies in the literature can be located in two possible general classes of regression models: models that study the effect the covariates have on the hazard function; and models that study the effect the covariates have on the censored response variable. Proposals in this paper are in the second class of models and, more specifically, on least squares based model approach. Thus, using the bootstrap estimate of the bias, we try to improve the estimation of the regression parameters by reducing their bias, for small sample sizes. Simulation results presented in the paper show that, for reasonable sample sizes and censoring levels, the bias is always smaller for the new proposals.
Keywords: Censored response variable, regression, bias.
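The bias-reduction idea mentioned in the abstract can be illustrated with the generic bootstrap bias correction, sketched below. This is an assumption-laden example and not the authors' estimator: it ignores censoring and regression entirely and instead corrects the plug-in variance of a small made-up sample, with `bootstrap_bias_correct` being a hypothetical helper name.

```python
import random
import statistics

def bootstrap_bias_correct(sample, estimator, n_boot=2000, seed=0):
    """Generic bootstrap bias correction (hypothetical helper).

    bias_hat  = mean of the estimator over bootstrap resamples - theta_hat
    corrected = theta_hat - bias_hat
    """
    rng = random.Random(seed)
    n = len(sample)
    theta_hat = estimator(sample)
    # Resample with replacement and re-apply the estimator each time.
    boot = [estimator([rng.choice(sample) for _ in range(n)])
            for _ in range(n_boot)]
    bias_hat = statistics.fmean(boot) - theta_hat
    return theta_hat, bias_hat, theta_hat - bias_hat

# Made-up small sample; the plug-in variance (divide by n) is biased downward.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
theta_hat, bias_hat, corrected = bootstrap_bias_correct(sample, statistics.pvariance)
print(theta_hat, bias_hat, corrected)
```

For this estimator the estimated bias is negative, so the correction moves the estimate upward, toward the unbiased sample variance. The same recipe applies to any statistic passed as `estimator`.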
Downloads: 1475
9068 Gluability of Bambusa balcooa and Bambusa vulgaris for Development of Laminated Panels
Authors: Daisy Biswas, Samar Kanti Bose, M. Mozaffar Hossain
Abstract:
The development of value added composite products from bamboo with the application of gluing technology can play a vital role in economic development and also in forest resource conservation of any country. In this study, the gluability of Bambusa balcooa and Bambusa vulgaris, two locally grown bamboo species of Bangladesh, was assessed. As the culm wall thickness of bamboos decreases from bottom to top, culm portions of up to 5.4 m and 3.6 m were used from the base of B. balcooa and B. vulgaris, respectively, to get rectangular strips of uniform thickness. The color of the B. vulgaris strips was yellowish brown and that of B. balcooa was reddish brown. The strips were treated by borax-boric acid treatment, bleaching and carbonization to extend the service life of the laminates. The preservative treatments changed the color of the strips. Borax–boric acid treated strips were reddish brown. When bleached with hydrogen peroxide, the color of the strips turned whitish yellow. Carbonization produced dark brownish strips having a coffee flavor. Chemical constituents of untreated and treated strips were determined. B. vulgaris was more acidic than B. balcooa. The treated strips were then used to develop a three-layered bamboo laminated panel. Urea formaldehyde (UF) and polyvinyl acetate (PVA) were used as binders. The shear strength and abrasive resistance of the panel were evaluated. It was found that the shear strength of the UF-panel was higher than that of the PVA-panel for all treatments. Between the species, the gluability of B. vulgaris was better, and in some cases better than that of hardwood species. The abrasive resistance of B. balcooa was slightly higher than that of B. vulgaris; however, the latter was preferred as it showed good gluability. The panels could be used as structural panels, floor tiles, flat pack furniture components, wall panels, etc. However, further research on durability and creep behavior of the product in service condition is warranted.
Keywords: Bambusa balcooa, Bambusa vulgaris, polyvinyl acetate, urea formaldehyde.
Downloads: 1127
9067 Adjusted Ratio and Regression Type Estimators for Estimation of Population Mean when some Observations are missing
Authors: Nuanpan Nangsue
Abstract:
Ratio and regression type estimators have been used by previous authors to estimate a population mean for the principal variable from samples in which both auxiliary x and principal y variable data are available. However, missing data are a common problem in statistical analyses with real data. Ratio and regression type estimators have also been used for imputing values of missing y data. In this paper, six new ratio and regression type estimators are proposed for imputing values for any missing y data and estimating a population mean for y from samples with missing x and/or y data. A simulation study has been conducted to compare the six ratio and regression type estimators with a previous estimator of Rueda. Two population sizes N = 1,000 and 5,000 have been considered with sample sizes of 10% and 30% and with correlation coefficients between population variables X and Y of 0.5 and 0.8. In the simulations, 10 and 40 percent of sample y values and 10 and 40 percent of sample x values were randomly designated as missing. The new ratio and regression type estimators give similar mean absolute percentage errors that are smaller than the Rueda estimator for all cases. The new estimators give a large reduction in errors for the case of 40% missing y values and sampling fraction of 30%.
Keywords: Auxiliary variable, missing data, ratio and regression type estimators.
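For readers unfamiliar with the estimators the abstract builds on, the sketch below shows the classical textbook ratio and regression estimators of a population mean, plus simple ratio imputation of missing y values. It does not reproduce the six new estimators or the Rueda estimator discussed in the paper; the function names and the toy numbers are assumptions for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def ratio_estimate(xs, ys, pop_mean_x):
    """Classical ratio estimator: (y_bar / x_bar) * X_bar."""
    return mean(ys) / mean(xs) * pop_mean_x

def regression_estimate(xs, ys, pop_mean_x):
    """Classical regression estimator: y_bar + b * (X_bar - x_bar)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar + slope * (pop_mean_x - x_bar)

def ratio_impute(xs, ys):
    """Fill missing y (None) with R_hat * x, R_hat taken from complete pairs."""
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    r_hat = sum(y for _, y in pairs) / sum(x for x, _ in pairs)
    return [r_hat * x if y is None else y for x, y in zip(xs, ys)]

# Toy sample in which y is exactly 2x; the population mean of x is assumed known.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
print(ratio_estimate(xs, ys, 3.0))       # 6.0
print(regression_estimate(xs, ys, 3.0))  # 6.0
print(ratio_impute(xs, [2, None, 6, 8])) # [2, 4.0, 6, 8]
```

Both estimators use the known population mean of the auxiliary variable x to adjust the sample mean of y, which is why they gain precision when x and y are strongly correlated.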
Downloads: 1732
9066 Studying the Effect of Shading by Rooftop PV Panels on Dwellings’ Thermal Performance
Authors: Saad Odeh
Abstract:
Thermal performance is considered to be a key measure in building sustainability. One of the technologies used in current sustainable building design is the rooftop solar PV power generator. The application of this type of technology has expanded vastly during the last five years in many countries. This paper studies the effect of roof shading by solar PV panels on dwellings’ thermal performance. The analysis in this work is performed using two packages: “AccuRate Sustainability” for rating the energy efficiency of residential building design, and “PVSYST” for the solar PV power system design. The former package is used to calculate the annual heating and cooling load, and the latter package is used to evaluate the power production from the rooftop PV system. The analysis correlates the electrical energy generated by the PV panels with the change in the heating and cooling load due to roof shading. Different roof orientations, roof inclinations, roof insulation levels, and PV panel areas are considered in this study. The analysis shows that the drop in energy efficiency due to the area of the roof shaded by PV panels is negligible compared to the energy generated by these panels.
Keywords: Energy efficiency, roof shading, thermal performance, PV panel.
Downloads: 1269
9065 Methods for Data Selection in Medical Databases: The Binary Logistic Regression - Relations with the Calculated Risks
Authors: Cristina G. Dascalu, Elena Mihaela Carausu, Daniela Manuc
Abstract:
Medical studies often require different methods for parameter selection as a second step of processing, after the database has been designed and filled with information. One common task is the selection of fields that act as risk factors using well-known methods, in order to find the most relevant risk factors and to establish a possible hierarchy between them. Different methods are available for this purpose, one of the best known being binary logistic regression. We present the mathematical principles of this method and a practical example of using it to analyze the influence of 10 different psychiatric diagnoses on 4 different types of offences (in a database of 289 psychiatric patients involved in different types of offences). Finally, we make some observations about the relation between the risk factor hierarchy established through binary logistic regression and the individual risks, as well as the results of the Chi-squared test. We show that the hierarchy built using binary logistic regression does not agree with the direct order of risk factors, even though it would be natural to assume that it always does.
Keywords: Databases, risk factors, binary logistic regression, hierarchy.
Downloads: 1327
9064 Maximum Power Point Tracking Based on Estimated Power for PV Energy Conversion System
Authors: Zainab Almukhtar, Adel Merabet
Abstract:
In this paper, a method for maximum power point tracking of a photovoltaic energy conversion system is presented. This method is based on using the difference between the power from the solar panel and an estimated power value to control the DC-DC converter of the photovoltaic system. The difference is continuously compared with a preset permitted error value. If the power difference is more than the error, the estimated power is multiplied by a factor and the operation is repeated until the difference is less than or equal to the threshold error. The difference in power is used to trigger a DC-DC boost converter in order to raise the voltage to where the maximum power point is achieved. The proposed method was experimentally verified through a PV energy conversion system driven by the OPAL-RT real time controller. The method was tested under varying radiation conditions and load requirements, and the photovoltaic panel was operated at its maximum power under different irradiation conditions.
Keywords: Control system, power error, solar panel, MPPT.
Downloads: 1322
9063 An Effective Genetic Algorithm for a Complex Real-World Scheduling Problem
Authors: Anis Gharbi, Mohamed Haouari, Talel Ladhari, Mohamed Ali Rakrouki
Abstract:
We address a complex scheduling problem arising in the wood panel industry with the objective of minimizing a quadratic function of job tardiness. The proposed solution strategy, which is based on an effective genetic algorithm, has been coded and implemented within a major Tunisian company that is a leader in wood panel manufacturing. Preliminary experimental results indicate a significant decrease in delivery times.
Keywords: Genetic algorithm, heuristic, hybrid flowshop, total weighted squared tardiness.
Downloads: 1941
9062 Acute Coronary Syndrome Prediction Using Data Mining Techniques - An Application
Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil
Abstract:
In this paper we use data mining techniques to investigate factors that contribute significantly to the risk of acute coronary syndrome. We assume that the dependent variable is the diagnosis, with dichotomous values indicating presence or absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals in Karachi, Pakistan. There are sixteen variables in total, of which one is assumed to be dependent and the other 15 are independent variables. To improve the performance of the regression model in predicting acute coronary syndrome, a data reduction technique, principal component analysis, is applied. Based on the results of data reduction, we have considered only 14 out of the sixteen factors.
Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principal component analysis, unstable angina (U.A.).
Downloads: 2114
9061 Support Vector Regression for Retrieval of Soil Moisture Using Bistatic Scatterometer Data at X-Band
Authors: Dileep Kumar Gupta, Rajendra Prasad, Pradeep Kumar, Varun Narayan Mishra, Ajeet Kumar Vishwakarma, Prashant Kumar Srivastava
Abstract:
An approach was evaluated for the retrieval of soil moisture of a bare soil surface using bistatic scatterometer data in the angular range of 20° to 70° at VV- and HH-polarization. The microwave data were acquired by a specially designed X-band (10 GHz) bistatic scatterometer. Linear regression analysis was carried out between the scattering coefficients and soil moisture content to select a suitable incidence angle for retrieval of soil moisture content. The 25° incidence angle was found to be the most suitable. Support vector regression analysis was used to approximate the function described by the input-output relationship between the scattering coefficient and the corresponding measured values of soil moisture content. The performance of the support vector regression algorithm was evaluated by comparing the observed and estimated soil moisture content using the statistical performance indices %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE). The values of %Bias, RMSE and NSE were found to be 2.9451, 1.0986 and 0.9214, respectively, at HH-polarization. At VV-polarization, the values of %Bias, RMSE and NSE were found to be 3.6186, 0.9373 and 0.9428, respectively.
Keywords: Bistatic scatterometer, soil moisture, support vector regression, RMSE, %Bias, NSE.
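The three performance indices named in the abstract can be computed as in the minimal sketch below. The formulas are the commonly used definitions; the sign convention for %Bias (positive when the model overestimates) is an assumption, since the abstract does not state which convention the authors used, and the sample numbers are made up.

```python
import math

def percent_bias(observed, estimated):
    """%Bias = 100 * sum(est - obs) / sum(obs); positive means overestimation
    under this (assumed) sign convention."""
    return 100.0 * sum(e - o for o, e in zip(observed, estimated)) / sum(observed)

def rmse(observed, estimated):
    """Root mean squared error between observed and estimated values."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / n)

def nse(observed, estimated):
    """Nash-Sutcliffe Efficiency: 1 - SSE / (sum of squares about the observed mean).
    1 is a perfect fit; 0 means no better than predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

# Made-up soil moisture values (observed vs. estimated), for illustration only.
obs = [10.0, 20.0, 30.0]
est = [11.0, 22.0, 33.0]
print(percent_bias(obs, est), rmse(obs, est), nse(obs, est))
```

An NSE close to 1, as reported for both polarizations, indicates that the estimates explain most of the variance in the observations.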
Downloads: 3227
9060 Quality of Service Evaluation using a Combination of Fuzzy C-Means and Regression Model
Authors: Aboagela Dogman, Reza Saatchi, Samir Al-Khayatt
Abstract:
In this study, a network quality of service (QoS) evaluation system was proposed. The system used a combination of fuzzy C-means (FCM) and a regression model to analyse and assess the QoS in a simulated network. Network QoS parameters of multimedia applications were intelligently analysed by the FCM clustering algorithm. The QoS parameters for each FCM cluster centre were then input to a regression model in order to quantify the overall QoS. The proposed QoS evaluation system provided valuable information about the network's QoS patterns and, based on this information, the overall network's QoS was effectively quantified.
Keywords: Fuzzy C-means, regression model, network quality of service
Downloads: 1720
9059 Estimating Bridge Deterioration for Small Data Sets Using Regression and Markov Models
Authors: Yina F. Muñoz, Alexander Paz, Hanns De La Fuente-Mella, Joaquin V. Fariña, Guilherme M. Sales
Abstract:
The primary approach for estimating bridge deterioration uses Markov-chain models and regression analysis. Traditional Markov models have problems in estimating the required transition probabilities when a small sample size is used. Often, reliable bridge data have not been taken over large periods, thus large data sets may not be available. This study presents an important change to the traditional approach by using the Small Data Method to estimate transition probabilities. The results illustrate that the Small Data Method and traditional approach both provide similar estimates; however, the former method provides results that are more conservative. That is, Small Data Method provided slightly lower than expected bridge condition ratings compared with the traditional approach. Considering that bridges are critical infrastructures, the Small Data Method, which uses more information and provides more conservative estimates, may be more appropriate when the available sample size is small. In addition, regression analysis was used to calculate bridge deterioration. Condition ratings were determined for bridge groups, and the best regression model was selected for each group. The results obtained were very similar to those obtained when using Markov chains; however, it is desirable to use more data for better results.
Keywords: Concrete bridges, deterioration, Markov chains, probability matrix.
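To make the Markov-chain side of this abstract concrete, the sketch below propagates a bridge condition distribution through a transition probability matrix and reports the expected condition rating per year. The four-state rating scale and all transition probabilities are invented for illustration; they are not taken from the paper, and the Small Data Method for estimating the matrix is not shown.

```python
def step(dist, P):
    """One year of deterioration: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def expected_ratings(P, ratings, start, years):
    """Expected condition rating for year 0..years under transition matrix P."""
    dist = list(start)
    out = [sum(d * r for d, r in zip(dist, ratings))]
    for _ in range(years):
        dist = step(dist, P)
        out.append(sum(d * r for d, r in zip(dist, ratings)))
    return out

# Hypothetical 4-state scale (4 = best, 1 = worst). Without maintenance a bridge
# either stays in its state or drops one state per year; state 1 is absorbing.
ratings = [4, 3, 2, 1]
P = [
    [0.90, 0.10, 0.00, 0.00],
    [0.00, 0.85, 0.15, 0.00],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],
]
start = [1.0, 0.0, 0.0, 0.0]  # a new bridge starts in the best state

curve = expected_ratings(P, ratings, start, years=20)
print([round(v, 2) for v in curve])
```

The resulting deterioration curve can be compared against a regression fit of rating versus age, which is the comparison the abstract reports as giving very similar results.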
Downloads: 1440
9058 A Research on Inference from Multiple Distance Variables in Hedonic Regression – Focus on Three Variables
Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro
Abstract:
In an urban context, urban nodes such as amenities or hazards will certainly affect house prices, and classic hedonic analysis employs distance variables measured from each urban node. However, effects from distances to facilities on house prices generally do not represent the true price of the property. Distance variables measured on the same surface suffer from a problem called multicollinearity, which usually appears in regression as errors in magnitude, variance and mean value caused by instability. In this paper, we provide a theoretical framework to identify and gather data with less bias, and also provide a specific sampling method for locating the sample region to avoid the spatial multicollinearity problem in the case of three distance variables.
Keywords: Hedonic regression, urban node, distance variables, multicollinearity, collinearity.
Downloads: 1993
9057 Data Mining Classification Methods Applied in Drug Design
Authors: Mária Stachová, Lukáš Sobíšek
Abstract:
Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with great potential. They can help people understand the patterns in a given body of information, so data mining tools have a wide area of applications. For example, in theoretical chemistry data mining tools can be used to predict molecule properties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of the contribution is to create a classification model that can deal with a huge data set with high accuracy. For this purpose, logistic regression, Bayesian logistic regression and random forest models were built using R software. A Bayesian logistic regression model was also created in Latent GOLD software. These classification methods belong to supervised learning methods. It was necessary to reduce the dimension of the data matrix before constructing the models, and thus factor analysis (FA) was used. The models were applied to predict the biological activity of molecules that are potential new drug candidates.
Keywords: data mining, classification, drug design, QSAR
Downloads: 2849
9056 Density Estimation using Generalized Linear Model and a Linear Combination of Gaussians
Authors: Aly Farag, Ayman El-Baz, Refaat Mohamed
Abstract:
In this paper we present a novel approach for density estimation. The proposed approach uses the logistic regression model to obtain an initial density estimate for the given empirical density. The empirical data do not exactly follow the logistic regression model, so there is a deviation between the empirical density and the density estimated using the logistic regression model. This deviation may be positive or negative. In this paper we use a linear combination of Gaussians (LCG) with positive and negative components as a model for this deviation. We also use the expectation maximization (EM) algorithm to estimate the parameters of the LCG. Experiments on real images demonstrate the accuracy of our approach.
Keywords: Logistic regression model, Expectation maximization, Segmentation.
Downloads 1733
9055 Multiple Regression based Graphical Modeling for Images
Authors: Pavan S., Sridhar G., Sridhar V.
Abstract:
Super resolution is one of the commonly studied inference problems in computer vision. In the case of images, this problem is generally addressed using a graphical model framework wherein each node represents a portion of the image and the edges between the nodes represent the statistical dependencies. However, the large dimensionality of images, along with the large number of possible states for a node, makes the inference problem computationally intractable. In this paper, we propose a representation wherein each node can be represented as a combination of multiple regression functions. The proposed approach achieves a tradeoff between computational complexity and inference accuracy by varying the number of regression functions for a node.
Keywords: Belief propagation, Graphical model, Regression, Super resolution.
Downloads 1547
9054 Empirical Statistical Modeling of Rainfall Prediction over Myanmar
Authors: Wint Thida Zaw, Thinn Thu Naing
Abstract:
One of the essential sectors of Myanmar's economy is agriculture, which is sensitive to climate variation. The most important climatic element affecting the agriculture sector is rainfall. Thus, rainfall prediction is an important issue in an agricultural country. Multi-variable polynomial regression (MPR) provides an effective way to describe complex nonlinear input-output relationships so that an outcome variable can be predicted from one or more other variables. In this paper, the modeling of monthly rainfall prediction over Myanmar is described in detail by applying the polynomial regression equation. The results of the proposed model are compared to the results produced by a multiple linear regression (MLR) model. Experiments indicate that the prediction model based on MPR has higher accuracy than the one based on MLR.
Keywords: Polynomial Regression, Rainfall Forecasting, Statistical forecasting.
Downloads 2634
9053 Effects of Human Capital and Openness on Economic Growth of Developed and Developing Countries: A Panel Data Analysis
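The MPR versus MLR comparison in this abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' rainfall data or model: the two predictors, the quadratic generating function and the helper names are assumptions, and the data are nonlinear by construction, so the outcome only shows the mechanics of the comparison.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic predictors (hypothetical stand-ins for climate inputs)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
# Assumed nonlinear response with additive noise
y = (1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] ** 2
     + X[:, 0] * X[:, 1] + rng.normal(0.0, 0.1, 200))

def design_linear(X):
    """Design matrix for multiple linear regression: intercept, x1, x2."""
    return np.column_stack([np.ones(len(X)), X])

def design_poly2(X):
    """Design matrix for a degree-2 polynomial in two variables."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x1 * x2, x2**2])

def fit_rmse(A_train, y_train, A_test, y_test):
    """Fit by least squares on the training rows, return RMSE on the test rows."""
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return np.sqrt(np.mean((A_test @ coef - y_test) ** 2))

train, test = slice(0, 150), slice(150, 200)
rmse_mlr = fit_rmse(design_linear(X[train]), y[train],
                    design_linear(X[test]), y[test])
rmse_mpr = fit_rmse(design_poly2(X[train]), y[train],
                    design_poly2(X[test]), y[test])
print("held-out RMSE, MLR:", rmse_mlr)
print("held-out RMSE, MPR:", rmse_mpr)
```

Both models are fitted with ordinary least squares; the only difference is the design matrix, which is why polynomial regression is still linear in its coefficients.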
Authors: Fatma Didin Sonmez, Pinar Sener
Abstract:
Technology transfer through international trade and foreign direct investment is the most important positive outcome of an open economy. It is widely accepted that new technology and knowledge play an important role in enhancing economic growth. Human capital is the other important factor supporting economic growth. In this study, the role of human capital in the growth process is examined in view of new endogenous growth theory, which emphasizes the technology transfer resulting from international trade. Using panel data for 10 developed and 10 developing countries, the impact of human capital and openness on the rate of economic growth of different countries is analysed. The evidence supports the view that human capital and openness contribute to economic growth in both developing and developed countries, but at different rates.
Keywords: economic growth, human capital, openness, technology
Downloads 2080
9052 Optimization of Solar Tracking Systems
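The abstract does not say which panel estimator was used. As a minimal sketch of one common choice, the fixed-effects (within) estimator, the example below uses synthetic data for 20 countries; the coefficients in `true_beta`, the number of years and the helper `demean_by_group` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n_countries, n_years = 20, 15          # 15 years is an assumed panel length
true_beta = np.array([0.5, 0.3])       # hypothetical effects of the two regressors

country = np.repeat(np.arange(n_countries), n_years)
alpha = rng.normal(0.0, 1.0, n_countries)   # unobserved country effects
# Regressors correlated with the country effect (stand-ins for
# human capital and openness)
X = rng.normal(0.0, 1.0, (n_countries * n_years, 2)) + alpha[country][:, None]
y = alpha[country] + X @ true_beta + rng.normal(0.0, 0.1, n_countries * n_years)

def demean_by_group(values, groups, n_groups):
    """Subtract each group's mean (the within transformation)."""
    values = values.reshape(len(values), -1)
    sums = np.zeros((n_groups, values.shape[1]))
    np.add.at(sums, groups, values)
    counts = np.bincount(groups, minlength=n_groups)[:, None]
    return values - (sums / counts)[groups]

# Fixed-effects estimate: OLS on country-demeaned data
X_within = demean_by_group(X, country, n_countries)
y_within = demean_by_group(y, country, n_countries).ravel()
beta_fe, *_ = np.linalg.lstsq(X_within, y_within, rcond=None)

# Pooled OLS for comparison; it ignores the country effects
pooled = np.column_stack([np.ones(len(y)), X])
beta_pooled = np.linalg.lstsq(pooled, y, rcond=None)[0][1:]

print("fixed effects:", beta_fe)
print("pooled OLS:   ", beta_pooled)
```

Demeaning within each country removes any time-invariant country effect, which is the usual reason for preferring this estimator when such effects are correlated with the regressors.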
Authors: A. Zaher, A. Traore, F. Thiéry, T. Talbert, B. Shaer
Abstract:
In this paper, an intelligent approach is proposed to optimize the orientation of continuous solar tracking systems on cloudy days. Under a clear sky, direct sunlight is more important than diffuse radiation, so the panel is always pointed towards the sun. Under an overcast sky, the direct solar beam is close to zero, and the panel is placed horizontally to receive the maximum diffuse radiation. Under partly cloudy conditions, the panel must be pointed towards the source that emits the maximum solar energy, which may be anywhere in the sky dome. Thus, the idea of our approach is to analyze the images captured by a ground-based sky camera system in order to detect the zone in the sky dome that is the optimal source of energy under cloudy conditions. The proposed approach is implemented using an experimental setup developed at the PROMES-CNRS laboratory in Perpignan (France). Under overcast conditions, the results were very satisfactory, and the intelligent approach provided efficiency gains of up to 9% relative to conventional continuous sun tracking systems.
Keywords: Clouds detection, fuzzy inference systems, images processing, sun trackers.
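The zone-detection idea in this abstract can be illustrated with a deliberately simplified sketch. The keywords indicate that the authors use fuzzy inference, which is not reproduced here. The example instead picks the grid zone with the highest mean brightness in a grayscale sky image. The function name `brightest_zone`, the grid size and the synthetic image are assumptions.

```python
import numpy as np

def brightest_zone(image, zone_rows, zone_cols):
    """Return the (row, col) index of the grid zone with the highest mean brightness."""
    h, w = image.shape
    zh, zw = h // zone_rows, w // zone_cols
    # Crop so the image divides evenly, then average within each zone
    zones = image[: zh * zone_rows, : zw * zone_cols].reshape(
        zone_rows, zh, zone_cols, zw
    )
    means = zones.mean(axis=(1, 3))
    return np.unravel_index(np.argmax(means), means.shape)

# Synthetic sky image: dim background with one bright patch (assumed values)
sky = np.full((60, 80), 0.2)
sky[5:15, 45:58] = 0.9

zone = brightest_zone(sky, zone_rows=3, zone_cols=4)
print("brightest zone (row, col):", zone)
```

A real tracker would still need to map the selected zone to panel azimuth and elevation, which depends on the camera geometry and is outside this sketch.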
Downloads 1212