Search results for: Regression method
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8595

Search results for: Regression method

8445 Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison

Authors: Xiangtuo Chen, Paul-Henry Cournéde

Abstract:

Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find out efficient way to predict the yield of corn based meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to the different modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with their environment as dynamical systems. But the calibration process of the dynamic system comes up with much difficulty, because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is to propose a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of datasets (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction is free of the complex biophysical process. But it has some strict requirements about the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods, methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-folds cross-validation process and two accuracy metrics: root mean square error of prediction(RMSEP), mean absolute error of prediction(MAEP) were used to evaluate the crop prediction capacity. The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method to calibrate the mechanistic model from dataset easy to access offers several side-perspectives. The mechanistic model can potentially help to underline the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.

Keywords: Crop yield prediction, crop model, sensitivity analysis, paramater estimation, particle swarm optimization, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1121
8444 Support Vector Regression for Retrieval of Soil Moisture Using Bistatic Scatterometer Data at X-Band

Authors: Dileep Kumar Gupta, Rajendra Prasad, Pradeep Kumar, Varun Narayan Mishra, Ajeet Kumar Vishwakarma, Prashant Kumar Srivastava

Abstract:

An approach was evaluated for the retrieval of soil moisture of bare soil surface using bistatic scatterometer data in the angular range of 200 to 700 at VV- and HH- polarization. The microwave data was acquired by specially designed X-band (10 GHz) bistatic scatterometer. The linear regression analysis was done between scattering coefficients and soil moisture content to select the suitable incidence angle for retrieval of soil moisture content. The 250 incidence angle was found more suitable. The support vector regression analysis was used to approximate the function described by the input output relationship between the scattering coefficient and corresponding measured values of the soil moisture content. The performance of support vector regression algorithm was evaluated by comparing the observed and the estimated soil moisture content by statistical performance indices %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE). The values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 2.9451, 1.0986 and 0.9214 respectively at HHpolarization. At VV- polarization, the values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 3.6186, 0.9373 and 0.9428 respectively.

Keywords: Bistatic scatterometer, soil moisture, support vector regression, RMSE, %Bias, NSE.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3168
8443 Regional Analysis of Streamflow Drought: A Case Study for Southwestern Iran

Authors: M. Byzedi, B. Saghafian

Abstract:

Droughts are complex, natural hazards that, to a varying degree, affect some parts of the world every year. The range of drought impacts is related to drought occurring in different stages of the hydrological cycle and usually different types of droughts, such as meteorological, agricultural, hydrological, and socioeconomical are distinguished. Streamflow drought was analyzed by the method of truncation level (at 70% level) on daily discharges measured in 54 hydrometric stations in southwestern Iran. Frequency analysis was carried out for annual maximum series (AMS) of drought deficit volume and duration series. Some factors including physiographic, climatic, geologic, and vegetation cover were studied as influential factors in the regional analysis. According to the results of factor analysis, six most effective factors were identified as area, rainfall from December to February, the percent of area with Normalized Difference Vegetation Index (NDVI) <0.1, the percent of convex area, drainage density and the minimum of watershed elevation that explained 90.9% of variance. The homogenous regions were determined by cluster analysis and discriminate function analysis. Suitable multivariate regression models were evaluated for streamflow drought deficit volume with 2 years return period. The significance level of regression models was 0.01. The results showed that the watershed area is the most effective factor with high correlation with deficit volume. Also, drought duration was not a suitable drought index for regional analysis.

Keywords: Iran, Streamflow drought, truncation level method, regional analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1696
8442 A Three Elements Vector Valued Structure’s Ultimate Strength-Strong Motion-Intensity Measure

Authors: A. Nicknam, N. Eftekhari, A. Mazarei, M. Ganjvar

Abstract:

This article presents an alternative collapse capacity intensity measure in the three elements form which is influenced by the spectral ordinates at periods longer than that of the first mode period at near and far source sites. A parameter, denoted by β, is defined by which the spectral ordinate effects, up to the effective period (2T1), on the intensity measure are taken into account. The methodology permits to meet the hazard-levelled target extreme event in the probabilistic and deterministic forms. A MATLAB code is developed involving OpenSees to calculate the collapse capacities of the 8 archetype RC structures having 2 to 20 stories for regression process. The incremental dynamic analysis (IDA) method is used to calculate the structure’s collapse values accounting for the element stiffness and strength deterioration. The general near field set presented by FEMA is used in a series of performing nonlinear analyses. 8 linear relationships are developed for the 8structutres leading to the correlation coefficient up to 0.93. A collapse capacity near field prediction equation is developed taking into account the results of regression processes obtained from the 8 structures. The proposed prediction equation is validated against a set of actual near field records leading to a good agreement. Implementation of the proposed equation to the four archetype RC structures demonstrated different collapse capacities at near field site compared to those of FEMA. The reasons of differences are believed to be due to accounting for the spectral shape effects.

Keywords: Collapse capacity, fragility analysis, spectral shape effects, IDA method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1746
8441 A Simple and Empirical Refraction Correction Method for UAV-Based Shallow-Water Photogrammetry

Authors: I GD Yudha Partama, A. Kanno, Y. Akamatsu, R. Inui, M. Goto, M. Sekine

Abstract:

The aerial photogrammetry of shallow water bottoms has the potential to be an efficient high-resolution survey technique for shallow water topography, thanks to the advent of convenient UAV and automatic image processing techniques Structure-from-Motion (SfM) and Multi-View Stereo (MVS)). However, it suffers from the systematic overestimation of the bottom elevation, due to the light refraction at the air-water interface. In this study, we present an empirical method to correct for the effect of refraction after the usual SfM-MVS processing, using common software. The presented method utilizes the empirical relation between the measured true depth and the estimated apparent depth to generate an empirical correction factor. Furthermore, this correction factor was utilized to convert the apparent water depth into a refraction-corrected (real-scale) water depth. To examine its effectiveness, we applied the method to two river sites, and compared the RMS errors in the corrected bottom elevations with those obtained by three existing methods. The result shows that the presented method is more effective than the two existing methods: The method without applying correction factor and the method utilizes the refractive index of water (1.34) as correction factor. In comparison with the remaining existing method, which used the additive terms (offset) after calculating correction factor, the presented method performs well in Site 2 and worse in Site 1. However, we found this linear regression method to be unstable when the training data used for calibration are limited. It also suffers from a large negative bias in the correction factor when the apparent water depth estimated is affected by noise, according to our numerical experiment. Overall, the good accuracy of refraction correction method depends on various factors such as the locations, image acquisition, and GPS measurement conditions. The most effective method can be selected by using statistical selection (e.g. leave-one-out cross validation).

Keywords: Bottom elevation, multi-view stereo, river, structure-from-motion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1504
8440 Gas Detection via Machine Learning

Authors: Walaa Khalaf, Calogero Pace, Manlio Gaudioso

Abstract:

We present an Electronic Nose (ENose), which is aimed at identifying the presence of one out of two gases, possibly detecting the presence of a mixture of the two. Estimation of the concentrations of the components is also performed for a volatile organic compound (VOC) constituted by methanol and acetone, for the ranges 40-400 and 22-220 ppm (parts-per-million), respectively. Our system contains 8 sensors, 5 of them being gas sensors (of the class TGS from FIGARO USA, INC., whose sensing element is a tin dioxide (SnO2) semiconductor), the remaining being a temperature sensor (LM35 from National Semiconductor Corporation), a humidity sensor (HIH–3610 from Honeywell), and a pressure sensor (XFAM from Fujikura Ltd.). Our integrated hardware–software system uses some machine learning principles and least square regression principle to identify at first a new gas sample, or a mixture, and then to estimate the concentrations. In particular we adopt a training model using the Support Vector Machine (SVM) approach with linear kernel to teach the system how discriminate among different gases. Then we apply another training model using the least square regression, to predict the concentrations. The experimental results demonstrate that the proposed multiclassification and regression scheme is effective in the identification of the tested VOCs of methanol and acetone with 96.61% correctness. The concentration prediction is obtained with 0.979 and 0.964 correlation coefficient for the predicted versus real concentrations of methanol and acetone, respectively.

Keywords: Electronic nose, Least square regression, Mixture ofgases, Support Vector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2494
8439 Application of Feed-Forward Neural Networks Autoregressive Models in Gross Domestic Product Prediction

Authors: Ε. Giovanis

Abstract:

In this paper we present an autoregressive model with neural networks modeling and standard error backpropagation algorithm training optimization in order to predict the gross domestic product (GDP) growth rate of four countries. Specifically we propose a kind of weighted regression, which can be used for econometric purposes, where the initial inputs are multiplied by the neural networks final optimum weights from input-hidden layer after the training process. The forecasts are compared with those of the ordinary autoregressive model and we conclude that the proposed regression-s forecasting results outperform significant those of autoregressive model in the out-of-sample period. The idea behind this approach is to propose a parametric regression with weighted variables in order to test for the statistical significance and the magnitude of the estimated autoregressive coefficients and simultaneously to estimate the forecasts.

Keywords: Autoregressive model, Error back-propagation Feed-Forward neural networks, , Gross Domestic Product

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1379
8438 The Leaves of a Tree

Authors: Zhu Jiaming, Yu Mengna

Abstract:

In this article, models based on quantitative analysis, physical geometry and regression analysis are established, by using analytic hierarchy process analysis, fuzzy cluster analysis, fuzzy photographic and data fitting. The reasons of various leaf shapes among different species and the differences between the leaf shapes on same tree have been solved by using software, such as Eviews, VB and Matlab. We also successfully estimate the leaf mass of a tree and the correlation with the tree profile.

Keywords: Leaf shape; Mass; Fuzzy cluster; Regression analysis; Eviews; Matlab

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1555
8437 Applying the Regression Technique for Prediction of the Acute Heart Attack

Authors: Paria Soleimani, Arezoo Neshati

Abstract:

Myocardial infarction is one of the leading causes of death in the world. Some of these deaths occur even before the patient reaches the hospital. Myocardial infarction occurs as a result of impaired blood supply. Because the most of these deaths are due to coronary artery disease, hence the awareness of the warning signs of a heart attack is essential. Some heart attacks are sudden and intense, but most of them start slowly, with mild pain or discomfort, then early detection and successful treatment of these symptoms is vital to save them. Therefore, importance and usefulness of a system designing to assist physicians in early diagnosis of the acute heart attacks is obvious. The main purpose of this study would be to enable patients to become better informed about their condition and to encourage them to seek professional care at an earlier stage in the appropriate situations. For this purpose, the data were collected on 711 heart patients in Iran hospitals. 28 attributes of clinical factors can be reported by patients; were studied. Three logistic regression models were made on the basis of the 28 features to predict the risk of heart attacks. The best logistic regression model in terms of performance had a C-index of 0.955 and with an accuracy of 94.9%. The variables, severe chest pain, back pain, cold sweats, shortness of breath, nausea and vomiting, were selected as the main features.

Keywords: Coronary heart disease, acute heart attacks, prediction, logistic regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2373
8436 Combining Bagging and Additive Regression

Authors: Sotiris B. Kotsiantis

Abstract:

Bagging and boosting are among the most popular re-sampling ensemble methods that generate and combine a diversity of regression models using the same learning algorithm as base-learner. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using an averaging methodology of bagging and boosting ensembles with 10 sub-learners in each one. We performed a comparison with simple bagging and boosting ensembles with 25 sub-learners on standard benchmark datasets and the proposed ensemble gave better accuracy.

Keywords: Regressors, statistical learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1593
8435 Optimum Design of Alkali Activated Slag Concretes for Low Chloride Ion Permeability and Water Absorption Capacity

Authors: Müzeyyen Balçikanli, Erdoğan Özbay, Hakan Tacettin Türker, Okan Karahan, Cengiz Duran Atiş

Abstract:

In this research, effect of curing time (TC), curing temperature (CT), sodium concentration (SC) and silicate modules (SM) on the compressive strength, chloride ion permeability, and water absorption capacity of alkali activated slag (AAS) concretes were investigated. For maximization of compressive strength while for minimization of chloride ion permeability and water absorption capacity of AAS concretes, best possible combination of CT, CTime, SC and SM were determined. An experimental program was conducted by using the central composite design method. Alkali solution-slag ratio was kept constant at 0.53 in all mixture. The effects of the independent parameters were characterized and analyzed by using statistically significant quadratic regression models on the measured properties (dependent parameters). The proposed regression models are valid for AAS concretes with the SC from 0.1% to 7.5%, SM from 0.4 to 3.2, CT from 20 °C to 94 °C and TC from 1.2 hours to 25 hours. The results of test and analysis indicate that the most effective parameter for the compressive strength, chloride ion permeability and water absorption capacity is the sodium concentration.

Keywords: Alkali activation, slag, rapid chloride permeability, water absorption capacity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1092
8434 Institutional Efficiency of Commonhold Industrial Parks Using a Polynomial Regression Model

Authors: Jeng-Wen Lin, Simon Chien-Yuan Chen

Abstract:

Based on assumptions of neo-classical economics and rational choice / public choice theory, this paper investigates the regulation of industrial land use in Taiwan by homeowners associations (HOAs) as opposed to traditional government administration. The comparison, which applies the transaction cost theory and a polynomial regression analysis, manifested that HOAs are superior to conventional government administration in terms of transaction costs and overall efficiency. A case study that compares Taiwan-s commonhold industrial park, NangKang Software Park, to traditional government counterparts using limited data on the costs and returns was analyzed. This empirical study on the relative efficiency of governmental and private institutions justified the important theoretical proposition. Numerical results prove the efficiency of the established model.

Keywords: Homeowners Associations, Institutional Efficiency, Polynomial Regression, Transaction Cost.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1536
8433 A Novel Approach to Handle Uncertainty in Health System Variables for Hospital Admissions

Authors: Manisha Rathi, Thierry Chaussalet

Abstract:

Hospital staff and managers are under pressure and concerned for effective use and management of scarce resources. The hospital admissions require many decisions that have complex and uncertain consequences for hospital resource utilization and patient flow. It is challenging to predict risk of admissions and length of stay of a patient due to their vague nature. There is no method to capture the vague definition of admission of a patient. Also, current methods and tools used to predict patients at risk of admission fail to deal with uncertainty in unplanned admission, LOS, patients- characteristics. The main objective of this paper is to deal with uncertainty in health system variables, and handles uncertain relationship among variables. An introduction of machine learning techniques along with statistical methods like Regression methods can be a proposed solution approach to handle uncertainty in health system variables. A model that adapts fuzzy methods to handle uncertain data and uncertain relationships can be an efficient solution to capture the vague definition of admission of a patient.

Keywords: Admission, Fuzzy, Regression, Uncertainty

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1363
8432 Analyzing Preservice Teachers’ Attitudes towards Technology

Authors: Ahmet Oguz Akturk, Kemal Izci, Gurbuz Caliskan, Ismail Sahin

Abstract:

Rapid developments in technology in the present age have made it necessary for communities to follow technological developments and adapt themselves to these developments. One of the fields that are most rapidly affected by these developments is undoubtedly education. Determination of the attitudes of preservice teachers, who live in an age of technology and get ready to raise future individuals, is of paramount importance both educationally and professionally. The purpose of this study was to analyze attitudes of preservice teachers towards technology and some variables that predict these attitudes (gender, daily duration of internet use, and the number of technical devices owned). 329 preservice teachers attending the education faculty of a large university in central Turkey participated, on a volunteer basis, in this study, where relational survey model was used as the research method. Research findings reveal that preservice teachers’ attitudes towards technology are positive and at the same time, the attitudes of male preservice teachers towards technology are more positive than their female counterparts. As a result of the stepwise multiple regression analysis where factors predicting preservice teachers’ attitudes towards technology, it was found that duration of daily internet use was the strongest predictor of attitudes towards technology.

Keywords: Attitudes towards technology, preservice teachers, gender, stepwise multiple regression analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1702
8431 Coverage Probability Analysis of WiMAX Network under Additive White Gaussian Noise and Predicted Empirical Path Loss Model

Authors: Chaudhuri Manoj Kumar Swain, Susmita Das

Abstract:

This paper explores a detailed procedure of predicting a path loss (PL) model and its application in estimating the coverage probability in a WiMAX network. For this a hybrid approach is followed in predicting an empirical PL model of a 2.65 GHz WiMAX network deployed in a suburban environment. Data collection, statistical analysis, and regression analysis are the phases of operations incorporated in this approach and the importance of each of these phases has been discussed properly. The procedure of collecting data such as received signal strength indicator (RSSI) through experimental set up is demonstrated. From the collected data set, empirical PL and RSSI models are predicted with regression technique. Furthermore, with the aid of the predicted PL model, essential parameters such as PL exponent as well as the coverage probability of the network are evaluated. This research work may assist in the process of deployment and optimisation of any cellular network significantly.

Keywords: WiMAX, RSSI, path loss, coverage probability, regression analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 640
8430 The Impact of Governance on Happiness: Evidence from Quantile Regressions

Authors: Chiung-Ju Huang

Abstract:

This study utilizes the quantile regression analysis to examine the impact of governance (including democratic quality and technical quality) on happiness in 101 countries worldwide, classified as “developed countries” and “developing countries”. The empirical results show that the impact of democratic quality and technical quality on happiness is significantly positive for “developed countries”, while is insignificant for “developing countries”. The results suggest that the authorities in developed countries can enhance the level of individual happiness by means of improving the democracy quality and technical quality. However, for developing countries, promoting the quality of governance in order to enhance the level of happiness may not be effective. Policy makers in developed countries may pay more attention on increasing real GDP per capita instead of promoting the quality of governance to enhance individual happiness.

Keywords: Governance, happiness, multiple regression, quantile regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1643
8429 Non-Methane Hydrocarbons Emission during the Photocopying Process

Authors: Kiurski S. Jelena, Aksentijević M. Snežana, Kecić S. Vesna, Oros B. Ivana

Abstract:

Prosperity of electronic equipment in photocopying environment not only has improved work efficiency, but also has changed indoor air quality. Considering the number of photocopying employed, indoor air quality might be worse than in general office environments. Determining the contribution from any type of equipment to indoor air pollution is a complex matter. Non-methane hydrocarbons are known to have an important role on air quality due to their high reactivity. The presence of hazardous pollutants in indoor air has been detected in one photocopying shop in Novi Sad, Serbia. Air samples were collected and analyzed for five days, during 8-hr working time in three time intervals, whereas three different sampling points were determined. Using multiple linear regression model and software package STATISTICA 10 the concentrations of occupational hazards and microclimates parameters were mutually correlated. Based on the obtained multiple coefficients of determination (0.3751, 0.2389 and 0.1975), a weak positive correlation between the observed variables was determined. Small values of parameter F indicated that there was no statistically significant difference between the concentration levels of nonmethane hydrocarbons and microclimates parameters. The results showed that variable could be presented by the general regression model: y = b0 + b1xi1+ b2xi2. Obtained regression equations allow to measure the quantitative agreement between the variables and thus obtain more accurate knowledge of their mutual relations.

Keywords: Indoor air quality, multiple regression analysis, nonmethane hydrocarbons, photocopying process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1927
8428 A Robust LS-SVM Regression

Authors: József Valyon, Gábor Horváth

Abstract:

In comparison to the original SVM, which involves a quadratic programming task; LS–SVM simplifies the required computation, but unfortunately the sparseness of standard SVM is lost. Another problem is that LS-SVM is only optimal if the training samples are corrupted by Gaussian noise. In Least Squares SVM (LS–SVM), the nonlinear solution is obtained, by first mapping the input vector to a high dimensional kernel space in a nonlinear fashion, where the solution is calculated from a linear equation set. In this paper a geometric view of the kernel space is introduced, which enables us to develop a new formulation to achieve a sparse and robust estimate.

Keywords: Support Vector Machines, Least Squares SupportVector Machines, Regression, Sparse approximation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2020
8427 EEG-Based Fractal Analysis of Different Motor Imagery Tasks using Critical Exponent Method

Authors: Montri Phothisonothai, Masahiro Nakagawa

Abstract:

The objective of this paper is to characterize the spontaneous Electroencephalogram (EEG) signals of four different motor imagery tasks and to show hereby a possible solution for the present binary communication between the brain and a machine ora Brain-Computer Interface (BCI). The processing technique used in this paper was the fractal analysis evaluated by the Critical Exponent Method (CEM). The EEG signal was registered in 5 healthy subjects,sampling 15 measuring channels at 1024 Hz.Each channel was preprocessed by the Laplacian space ltering so as to reduce the space blur and therefore increase the spaceresolution. The EEG of each channel was segmented and its Fractaldimension (FD) calculated. The FD was evaluated in the time interval corresponding to the motor imagery and averaged out for all the subjects (each channel). In order to characterize the FD distribution,the linear regression curves of FD over the electrodes position were applied. The differences FD between the proposed mental tasks are quantied and evaluated for each experimental subject. The obtained results of the proposed method are a substantial fractal dimension in the EEG signal of motor imagery tasks and can be considerably utilized as the multiple-states BCI applications.

Keywords: electroencephalogram (EEG), motor imagery tasks, mental tasks, biomedical signals processing, human-machine interface, fractal analysis, critical exponent method (CEM).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2209
8426 Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application

Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil

Abstract:

In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis – with dichotomous values showing presence or  absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have total sixteen variables out of which one is assumed dependent and other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principle component analysis is applied. Based on results of data reduction, we have considered only 14 out of sixteen factors.

Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principle component analysis, unstable angina (U.A.).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2068
8425 Weighted Harmonic Arnoldi Method for Large Interior Eigenproblems

Authors: Zhengsheng Wang, Jing Qi, Chuntao Liu, Yuanjun Li

Abstract:

The harmonic Arnoldi method can be used to find interior eigenpairs of large matrices. However, it has been shown that this method may converge erratically and even may fail to do so. In this paper, we present a new method for computing interior eigenpairs of large nonsymmetric matrices, which is called weighted harmonic Arnoldi method. The implementation of the method has been tested by numerical examples, the results show that the method converges fast and works with high accuracy.

Keywords: Harmonic Arnoldi method, weighted harmonic Arnoldi method, eigenpair, interior eigenproblem, non symmetric matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1500
8424 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.

Keywords: Artificial neural network, competitive dynamics, logistic regression, text classification, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 452
8423 A New Approach for Predicting and Optimizing Weld Bead Geometry in GMAW

Authors: Farhad Kolahan, Mehdi Heidari

Abstract:

Gas Metal Arc Welding (GMAW) processes is an important joining process widely used in metal fabrication industries. This paper addresses modeling and optimization of this technique using a set of experimental data and regression analysis. The set of experimental data has been used to assess the influence of GMAW process parameters in weld bead geometry. The process variables considered here include voltage (V); wire feed rate (F); torch Angle (A); welding speed (S) and nozzle-to-plate distance (D). The process output characteristics include weld bead height, width and penetration. The Taguchi method and regression modeling are used in order to establish the relationships between input and output parameters. The adequacy of the model is evaluated using analysis of variance (ANOVA) technique. In the next stage, the proposed model is embedded into a Simulated Annealing (SA) algorithm to optimize the GMAW process parameters. The objective is to determine a suitable set of process parameters that can produce desired bead geometry, considering the ranges of the process parameters. Computational results prove the effectiveness of the proposed model and optimization procedure.

Keywords: Weld Bead Geometry, GMAW welding, Processparameters Optimization, Modeling, SA algorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2145
8422 Dissipation of Higher Mode using Numerical Integration Algorithm in Dynamic Analysis

Authors: Jin Sup Kim, Woo Young Jung, Minho Kwon

Abstract:

In general dynamic analyses, lower mode response is of interest, however the higher modes of spatially discretized equations generally do not represent the real behavior and not affects to global response much. Some implicit algorithms, therefore, are introduced to filter out the high-frequency modes using intended numerical error. The objective of this study is to introduce the P-method and PC α-method to compare that with dissipation method and Newmark method through the stability analysis and numerical example. PC α-method gives more accuracy than other methods because it based on the α-method inherits the superior properties of the implicit α-method. In finite element analysis, the PC α-method is more useful than other methods because it is the explicit scheme and it achieves the second order accuracy and numerical damping simultaneously.

Keywords: Dynamic, α-Method, P-Method, PC α-Method, Newmark method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3019
8421 Study on Optimal Control Strategy of PM2.5 in Wuhan, China

Authors: Qiuling Xie, Shanliang Zhu, Zongdi Sun

Abstract:

In this paper, we analyzed the correlation relationship among PM2.5 from other five Air Quality Indices (AQIs) based on the grey relational degree, and built a multivariate nonlinear regression equation model of PM2.5 and the five monitoring indexes. For the optimal control problem of PM2.5, we took the partial large Cauchy distribution of membership equation as satisfaction function. We established a nonlinear programming model with the goal of maximum performance to price ratio. And the optimal control scheme is given.

Keywords: Grey relational degree, multiple linear regression, membership function, nonlinear programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1359
8420 A Study on a Research and Development Cost-Estimation Model in Korea

Authors: Babakina Alexandra, Yong Soo Kim

Abstract:

In this study, we analyzed the factors that affect research funds using linear regression analysis to increase the effectiveness of investments in national research projects. We collected 7,916 items of data on research projects that were in the process of being finished or were completed between 2010 and 2011. Data pre-processing and visualization were performed to derive statistically significant results. We identified factors that affected funding using analysis of fit distributions and estimated increasing or decreasing tendencies based on these factors.

Keywords: R&D funding, Cost estimation, Linear regression, Preliminary feasibility study.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2198
8419 Comparison of Bayesian and Regression Schemes to Model Public Health Services

Authors: Sotirios Raptis

Abstract:

Bayesian reasoning (BR) or Linear (Auto) Regression (AR/LR) can predict different sources of data using priors or other data, and can link social service demands in cohorts, while their consideration in isolation (self-prediction) may lead to service misuse ignoring the context. The paper advocates that BR with Binomial (BD), or Normal (ND) models or raw data (.D) as probabilistic updates can be compared to AR/LR to link services in Scotland and reduce cost by sharing healthcare (HC) resources. Clustering, cross-correlation, along with BR, LR, AR can better predict demand. Insurance companies and policymakers can link such services, and examples include those offered to the elderly, and low-income people, smoking-related services linked to mental health services, or epidemiological weight in children. 22 service packs are used that are published by Public Health Services (PHS) Scotland and Scottish Government (SG) from 1981 to 2019, broken into 110 year series (factors), joined using LR, AR, BR. The Primary component analysis found 11 significant factors, while C-Means (CM) clustering gave five major clusters.

Keywords: Bayesian probability, cohorts, data frames, regression, services, prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 134
8418 Data Mining Classification Methods Applied in Drug Design

Authors: Mária Stachová, Lukáš Sobíšek

Abstract:

Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with the great potential. They can help people to understand the patterns in certain chunk of information so it is obvious that the data mining tools have a wide area of applications. For example in the theoretical chemistry data mining tools can be used to predict moleculeproperties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of thecontribution is to create a classification model, which would be able to deal with a huge data set with high accuracy. For this purpose logistic regression, Bayesian logistic regression and random forest models were built using R software. TheBayesian logistic regression in Latent GOLD software was created as well. These classification methods belong to supervised learning methods. It was necessary to reduce data matrix dimension before construct models and thus the factor analysis (FA) was used. Those models were applied to predict the biological activity of molecules, potential new drug candidates.

Keywords: data mining, classification, drug design, QSAR

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2786
8417 Monitoring Blood Pressure Using Regression Techniques

Authors: Qasem Qananwah, Ahmad Dagamseh, Hiam AlQuran, Khalid Shaker Ibrahim

Abstract:

Blood pressure helps the physicians greatly to have a deep insight into the cardiovascular system. The determination of individual blood pressure is a standard clinical procedure considered for cardiovascular system problems. The conventional techniques to measure blood pressure (e.g. cuff method) allows a limited number of readings for a certain period (e.g. every 5-10 minutes). Additionally, these systems cause turbulence to blood flow; impeding continuous blood pressure monitoring, especially in emergency cases or critically ill persons. In this paper, the most important statistical features in the photoplethysmogram (PPG) signals were extracted to estimate the blood pressure noninvasively. PPG signals from more than 40 subjects were measured and analyzed and 12 features were extracted. The features were fed to principal component analysis (PCA) to find the most important independent features that have the highest correlation with blood pressure. The results show that the stiffness index means and standard deviation for the beat-to-beat heart rate were the most important features. A model representing both features for Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP) was obtained using a statistical regression technique. Surface fitting is used to best fit the series of data and the results show that the error value in estimating the SBP is 4.95% and in estimating the DBP is 3.99%.

Keywords: Blood pressure, noninvasive optical system, PCA, continuous monitoring.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 621
8416 Using Data Mining Methodology to Build the Predictive Model of Gold Passbook Price

Authors: Chien-Hui Yang, Che-Yang Lin, Ya-Chen Hsu

Abstract:

Gold passbook is an investing tool that is especially suitable for investors to do small investment in the solid gold. The gold passbook has the lower risk than other ways investing in gold, but its price is still affected by gold price. However, there are many factors can cause influences on gold price. Therefore, building a model to predict the price of gold passbook can both reduce the risk of investment and increase the benefits. This study investigates the important factors that influence the gold passbook price, and utilize the Group Method of Data Handling (GMDH) to build the predictive model. This method can not only obtain the significant variables but also perform well in prediction. Finally, the significant variables of gold passbook price, which can be predicted by GMDH, are US dollar exchange rate, international petroleum price, unemployment rate, whole sale price index, rediscount rate, foreign exchange reserves, misery index, prosperity coincident index and industrial index.

Keywords: Gold price, Gold passbook price, Group Method ofData Handling (GMDH), Regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2237