Search results for: multivariate analysis technique
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 10967

Search results for: multivariate analysis technique

10967 Multivariate Analysis of Spectroscopic Data for Agriculture Applications

Authors: Asmaa M. Hussein, Amr Wassal, Ahmed Farouk Al-Sadek, A. F. Abd El-Rahman

Abstract:

In this study, a multivariate analysis of potato spectroscopic data was presented to detect the presence of brown rot disease or not. Near-Infrared (NIR) spectroscopy (1,350-2,500 nm) combined with multivariate analysis was used as a rapid, non-destructive technique for the detection of brown rot disease in potatoes. Spectral measurements were performed in 565 samples, which were chosen randomly at the infection place in the potato slice. In this study, 254 infected and 311 uninfected (brown rot-free) samples were analyzed using different advanced statistical analysis techniques. The discrimination performance of different multivariate analysis techniques, including classification, pre-processing, and dimension reduction, were compared. Applying a random forest algorithm classifier with different pre-processing techniques to raw spectra had the best performance as the total classification accuracy of 98.7% was achieved in discriminating infected potatoes from control.

Keywords: Brown rot disease, NIR spectroscopy, potato, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 885
10966 Effects of Video Games and Online Chat on Mathematics Performance in High School: An Approach of Multivariate Data Analysis

Authors: Lina Wu, Wenyi Lu, Ye Li

Abstract:

Regarding heavy video game players for boys and super online chat lovers for girls as a symbolic phrase in the current adolescent culture, this project of data analysis verifies the displacement effect on deteriorating mathematics performance. To evaluate correlation or regression coefficients between a factor of playing video games or chatting online and mathematics performance compared with other factors, we use multivariate analysis technique and take gender difference into account. We find the most important reason for the negative sign of the displacement effect on mathematics performance due to students’ poor academic background. Statistical analysis methods in this project could be applied to study internet users’ academic performance from the high school education to the college education.

Keywords: Correlation coefficients, displacement effect, gender difference, multivariate analysis technique, regression coefficients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2170
10965 An AK-Chart for the Non-Normal Data

Authors: Chia-Hau Liu, Tai-Yue Wang

Abstract:

Traditional multivariate control charts assume that measurement from manufacturing processes follows a multivariate normal distribution. However, this assumption may not hold or may be difficult to verify because not all the measurement from manufacturing processes are normal distributed in practice. This study develops a new multivariate control chart for monitoring the processes with non-normal data. We propose a mechanism based on integrating the one-class classification method and the adaptive technique. The adaptive technique is used to improve the sensitivity to small shift on one-class classification in statistical process control. In addition, this design provides an easy way to allocate the value of type I error so it is easier to be implemented. Finally, the simulation study and the real data from industry are used to demonstrate the effectiveness of the propose control charts.

Keywords: Multivariate control chart, statistical process control, one-class classification method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2269
10964 Mathematical Programming on Multivariate Calibration Estimation in Stratified Sampling

Authors: Dinesh Rao, M.G.M. Khan, Sabiha Khan

Abstract:

Calibration estimation is a method of adjusting the original design weights to improve the survey estimates by using auxiliary information such as the known population total (or mean) of the auxiliary variables. A calibration estimator uses calibrated weights that are determined to minimize a given distance measure to the original design weights while satisfying a set of constraints related to the auxiliary information. In this paper, we propose a new multivariate calibration estimator for the population mean in the stratified sampling design, which incorporates information available for more than one auxiliary variable. The problem of determining the optimum calibrated weights is formulated as a Mathematical Programming Problem (MPP) that is solved using the Lagrange multiplier technique.

Keywords: Calibration estimation, Stratified sampling, Multivariate auxiliary information, Mathematical programming problem, Lagrange multiplier technique.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1952
10963 Applying Gibbs Sampler for Multivariate Hierarchical Linear Model

Authors: Satoshi Usami

Abstract:

Among various HLM techniques, the Multivariate Hierarchical Linear Model (MHLM) is desirable to use, particularly when multivariate criterion variables are collected and the covariance structure has information valuable for data analysis. In order to reflect prior information or to obtain stable results when the sample size and the number of groups are not sufficiently large, the Bayes method has often been employed in hierarchical data analysis. In these cases, although the Markov Chain Monte Carlo (MCMC) method is a rather powerful tool for parameter estimation, Procedures regarding MCMC have not been formulated for MHLM. For this reason, this research presents concrete procedures for parameter estimation through the use of the Gibbs samplers. Lastly, several future topics for the use of MCMC approach for HLM is discussed.

Keywords: Gibbs sampler, Hierarchical Linear Model, Markov Chain Monte Carlo, Multivariate Hierarchical Linear Model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1866
10962 Simulation of Sample Paths of Non Gaussian Stationary Random Fields

Authors: Fabrice Poirion, Benedicte Puig

Abstract:

Mathematical justifications are given for a simulation technique of multivariate nonGaussian random processes and fields based on Rosenblatt-s transformation of Gaussian processes. Different types of convergences are given for the approaching sequence. Moreover an original numerical method is proposed in order to solve the functional equation yielding the underlying Gaussian process autocorrelation function.

Keywords: Simulation, nonGaussian, random field, multivariate, stochastic process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1837
10961 A Multivariate Statistical Approach for Water Quality Assessment of River Hindon, India

Authors: Nida Rizvi, Deeksha Katyal, Varun Joshi

Abstract:

River Hindon is an important river catering the demand of highly populated rural and industrial cluster of western Uttar Pradesh, India. Water quality of river Hindon is deteriorating at an alarming rate due to various industrial, municipal and agricultural activities. The present study aimed at identifying the pollution sources and quantifying the degree to which these sources are responsible for the deteriorating water quality of the river. Various water quality parameters, like pH, temperature, electrical conductivity, total dissolved solids, total hardness, calcium, chloride, nitrate, sulphate, biological oxygen demand, chemical oxygen demand, and total alkalinity were assessed. Water quality data obtained from eight study sites for one year has been subjected to the two multivariate techniques, namely, principal component analysis and cluster analysis. Principal component analysis was applied with the aim to find out spatial variability and to identify the sources responsible for the water quality of the river. Three Varifactors were obtained after varimax rotation of initial principal components using principal component analysis. Cluster analysis was carried out to classify sampling stations of certain similarity, which grouped eight different sites into two clusters. The study reveals that the anthropogenic influence (municipal, industrial, waste water and agricultural runoff) was the major source of river water pollution. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and elucidation of multifaceted data sets, recognition of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

Keywords: Cluster analysis, multivariate statistical technique, river Hindon, water Quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3813
10960 Diagnosis of Multivariate Process via Nonlinear Kernel Method Combined with Qualitative Representation of Fault Patterns

Authors: Hyun-Woo Cho

Abstract:

The fault detection and diagnosis of complicated production processes is one of essential tasks needed to run the process safely with good final product quality. Unexpected events occurred in the process may have a serious impact on the process. In this work, triangular representation of process measurement data obtained in an on-line basis is evaluated using simulation process. The effect of using linear and nonlinear reduced spaces is also tested. Their diagnosis performance was demonstrated using multivariate fault data. It has shown that the nonlinear technique based diagnosis method produced more reliable results and outperforms linear method. The use of appropriate reduced space yielded better diagnosis performance. The presented diagnosis framework is different from existing ones in that it attempts to extract the fault pattern in the reduced space, not in the original process variable space. The use of reduced model space helps to mitigate the sensitivity of the fault pattern to noise.

Keywords: Real-time Fault diagnosis, triangular representation of patterns in reduced spaces, Nonlinear kernel technique, multivariate statistical modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1604
10959 Multi Task Scheme to Monitor Multivariate Environments Using Artificial Neural Network

Authors: K. Atashgar

Abstract:

When an assignable cause(s) manifests itself to a multivariate process and the process shifts to an out-of-control condition, a root-cause analysis should be initiated by quality engineers to identify and eliminate the assignable cause(s) affected the process. A root-cause analysis in a multivariate process is more complex compared to a univariate process. In the case of a process involved several correlated variables an effective root-cause analysis can be only experienced when it is possible to identify the required knowledge including the out-of-control condition, the change point, and the variable(s) responsible to the out-of-control condition, all simultaneously. Although literature addresses different schemes to monitor multivariate processes, one can find few scientific reports focused on all the required knowledge. To the best of the author’s knowledge this is the first time that a multi task model based on artificial neural network (ANN) is reported to monitor all the required knowledge at the same time for a multivariate process with more than two correlated quality characteristics. The performance of the proposed scheme is evaluated numerically when different step shifts affect the mean vector. Average run length is used to investigate the performance of the proposed multi task model. The simulated results indicate the multi task scheme performs all the required knowledge effectively.

Keywords: Artificial neural network, Multivariate process, Statistical process control, Change point.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1680
10958 A Multivariate Moving Average Control Chart for Photovoltaic Processes

Authors: Chunchom Pongchavalit

Abstract:

For the electrical metrics that describe photovoltaic cell performance are inherently multivariate in nature, use of a univariate, or one variable, statistical process control chart can have important limitations. Development of a comprehensive process control strategy is known to be significantly beneficial to reducing process variability that ultimately drives up the manufacturing cost photovoltaic cells. The multivariate moving average or MMA chart, is applied to the electrical metrics of photovoltaic cells to illustrate the improved sensitivity on process variability this method of control charting offers. The result show the ability of the MMA chart to expand to as any variables as needed, suggests an application with multiple photovoltaic electrical metrics being used in concert to determine the processes state of control.

Keywords: The multivariate moving average control chart, Photovoltaic processes control, Multivariate system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1281
10957 Application of GIS and Statistical Multivariate Techniques for Estimation of Soil Erosion and Sediment Yield

Authors: Masoud Nasri, Ali Gholami, Ali Najafi

Abstract:

In recent years, most of the regions in the world are exposed to degradation and erosion caused by increasing population and over use of land resources. The understanding of the most important factors on soil erosion and sediment yield are the main keys for decision making and planning. In this study, the sediment yield and soil erosion were estimated and the priority of different soil erosion factors used in the MPSIAC method of soil erosion estimation is evaluated in AliAbad watershed in southwest of Isfahan Province, Iran. Different information layers of the parameters were created using a GIS technique. Then, a multivariate procedure was applied to estimate sediment yield and to find the most important factors of soil erosion in the model. The results showed that land use, geology, land and soil cover are the most important factors describing the soil erosion estimated by MPSIAC model.

Keywords: land degradation, Soil erosion, Sediment yield, Aliabad, GIS technique, Land use.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1690
10956 Model of Optimal Centroids Approach for Multivariate Data Classification

Authors: Pham Van Nha, Le Cam Binh

Abstract:

Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered as a multidisciplinary optimization model that can be applied in various optimization problems. PSO’s ideas are simple and easy to understand but PSO is only applied in simple model problems. We think that in order to expand the applicability of PSO in complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO in a mathematical model and apply in the multivariate data classification. First, PSOs general mathematical model (MPSO) is analyzed as a universal optimization model. Then, Model of Optimal Centroids (MOC) is proposed for the multivariate data classification. Experiments were conducted on some benchmark data sets to prove the effectiveness of MOC compared with several proposed schemes.

Keywords: Analysis of optimization, artificial intelligence-based optimization, optimization for learning and data analysis, global optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 911
10955 Qualitative Data Analysis for Health Care Services

Authors: Taner Ersoz, Filiz Ersoz

Abstract:

This study was designed enable application of multivariate technique in the interpretation of categorical data for measuring health care services satisfaction in Turkey. The data was collected from a total of 17726 respondents. The establishment of the sample group and collection of the data were carried out by a joint team from The Ministry of Health and Turkish Statistical Institute (Turk Stat) of Turkey. The multiple correspondence analysis (MCA) was used on the data of 2882 respondents who answered the questionnaire in full. The multiple correspondence analysis indicated that, in the evaluation of health services females, public employees, younger and more highly educated individuals were more concerned and complainant than males, private sector employees, older and less educated individuals. Overall 53 % of the respondents were pleased with the improvements in health care services in the past three years. This study demonstrates the public consciousness in health services and health care satisfaction in Turkey. It was found that most the respondents were pleased with the improvements in health care services over the past three years. Awareness of health service quality increases with education levels. Older individuals and males would appear to have lower expectancies in health services.

Keywords: Multiple correspondence analysis, optimal scaling, multivariate categorical data, health care services, health satisfaction survey, statistical visualizing, Turkey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 876
10954 Evaluating Spectral Relationships between Signals by Removing the Contribution of a Common, Periodic Source A Partial Coherence-based Approach

Authors: Antonio Mauricio F. L. Miranda de Sá

Abstract:

Partial coherence between two signals removing the contribution of a periodic, deterministic signal is proposed for evaluating the interrelationship in multivariate systems. The estimator expression was derived and shown to be independent of such periodic signal. Simulations were used for obtaining its critical value, which were found to be the same as those for Gaussian signals, as well as for evaluating the technique. An Illustration with eletroencephalografic (EEG) signals during photic stimulation is also provided. The application of the proposed technique in both simulation and real EEG data indicate that it seems to be very specific in removing the contribution of periodic sources. The estimate independence of the periodic signal may widen partial coherence application to signal analysis, since it could be used together with simple coherence to test for contamination in signals by a common, periodic noise source.

Keywords: Partial coherence, periodic input, spectral analysis, statistical signal processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1463
10953 Time Series Forecasting Using Independent Component Analysis

Authors: Theodor D. Popescu

Abstract:

The paper presents a method for multivariate time series forecasting using Independent Component Analysis (ICA), as a preprocessing tool. The idea of this approach is to do the forecasting in the space of independent components (sources), and then to transform back the results to the original time series space. The forecasting can be done separately and with a different method for each component, depending on its time structure. The paper gives also a review of the main algorithms for independent component analysis in the case of instantaneous mixture models, using second and high-order statistics. The method has been applied in simulation to an artificial multivariate time series with five components, generated from three sources and a mixing matrix, randomly generated.

Keywords: Independent Component Analysis, second order statistics, simulation, time series forecasting

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1778
10952 Interpreting the Out-of-Control Signals of Multivariate Control Charts Employing Neural Networks

Authors: Francisco Aparisi, José Sanz

Abstract:

Multivariate quality control charts show some advantages to monitor several variables in comparison with the simultaneous use of univariate charts, nevertheless, there are some disadvantages. The main problem is how to interpret the out-ofcontrol signal of a multivariate chart. For example, in the case of control charts designed to monitor the mean vector, the chart signals showing that it must be accepted that there is a shift in the vector, but no indication is given about the variables that have produced this shift. The MEWMA quality control chart is a very powerful scheme to detect small shifts in the mean vector. There are no previous specific works about the interpretation of the out-of-control signal of this chart. In this paper neural networks are designed to interpret the out-of-control signal of the MEWMA chart, and the percentage of correct classifications is studied for different cases.

Keywords: Multivariate quality control, Artificial Intelligence, Neural Networks, Computer Applications

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2504
10951 Irrigation Water Quality Evaluation Based on Multivariate Statistical Analysis: A Case Study of Jiaokou Irrigation District

Authors: Panpan Xu, Qiying Zhang, Hui Qian

Abstract:

Groundwater is main source of water supply in the Guanzhong Basin, China. To investigate the quality of groundwater for agricultural purposes in Jiaokou Irrigation District located in the east of the Guanzhong Basin, 141 groundwater samples were collected for analysis of major ions (K+, Na+, Mg2+, Ca2+, SO42-, Cl-, HCO3-, and CO32-), pH, and total dissolved solids (TDS). Sodium percentage (Na%), residual sodium carbonate (RSC), magnesium hazard (MH), and potential salinity (PS) were applied for irrigation water quality assessment. In addition, multivariate statistical techniques were used to identify the underlying hydrogeochemical processes. Results show that the content of TDS mainly depends on Cl-, Na+, Mg2+, and SO42-, and the HCO3- content is generally high except for the eastern sand area. These are responsible for complex hydrogeochemical processes, such as dissolution of carbonate minerals (dolomite and calcite), gypsum, halite, and silicate minerals, the cation exchange, as well as evaporation and concentration. The average evaluation levels of Na%, RSC, MH, and PS for irrigation water quality are doubtful, good, unsuitable, and injurious to unsatisfactory, respectively. Therefore, it is necessary for decision makers to comprehensively consider the indicators and thus reasonably evaluate the irrigation water quality.

Keywords: Irrigation water quality, multivariate statistical analysis, groundwater, hydrogeochemical process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 566
10950 Spatial Distribution and Risk Assessment of As, Hg, Co and Cr in Kaveh Industrial City, using Geostatistic and GIS

Authors: Abbas Hani

Abstract:

The concentrations of As, Hg, Co, Cr and Cd were tested for each soil sample, and their spatial patterns were analyzed by the semivariogram approach of geostatistics and geographical information system technology. Multivariate statistic approaches (principal component analysis and cluster analysis) were used to identify heavy metal sources and their spatial pattern. Principal component analysis coupled with correlation between heavy metals showed that primary inputs of As, Hg and Cd were due to anthropogenic while, Co, and Cr were associated with pedogenic factors. Ordinary kriging was carried out to map the spatial patters of heavy metals. The high pollution sources evaluated was related with usage of urban and industrial wastewater. The results of this study helpful for risk assessment of environmental pollution for decision making for industrial adjustment and remedy soil pollution.

Keywords: Geographic Information system, Geostatistics, Kaveh, Multivariate Statistical Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1979
10949 Application of Multi-Dimensional Principal Component Analysis to Medical Data

Authors: Naoki Yamamoto, Jun Murakami, Chiharu Okuma, Yutaro Shigeto, Satoko Saito, Takashi Izumi, Nozomi Hayashida

Abstract:

Multi-dimensional principal component analysis (PCA) is the extension of the PCA, which is used widely as the dimensionality reduction technique in multivariate data analysis, to handle multi-dimensional data. To calculate the PCA the singular value decomposition (SVD) is commonly employed by the reason of its numerical stability. The multi-dimensional PCA can be calculated by using the higher-order SVD (HOSVD), which is proposed by Lathauwer et al., similarly with the case of ordinary PCA. In this paper, we apply the multi-dimensional PCA to the multi-dimensional medical data including the functional independence measure (FIM) score, and describe the results of experimental analysis.

Keywords: multi-dimensional principal component analysis, higher-order SVD (HOSVD), functional independence measure (FIM), medical data, tensor decomposition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2502
10948 Characterization of Atmospheric Particulate Matter using PIXE Technique

Authors: P.Kothai, P. Prathibha, I.V.Saradhi, G.G. Pandit, V.D. Puranik

Abstract:

Coarse and fine particulate matter were collected at a residential area at Vashi, Navi Mumbai and the filter samples were analysed for trace elements using PIXE technique. The trend of particulate matter showed higher concentrations during winter than the summer and monsoon concentration levels. High concentrations of elements related to soil and sea salt were found in PM10 and PM2.5. Also high levels of zinc and sulphur found in the particulates of both the size fractions. EF analysis showed enrichment of Cu, Cr and Mn only in the fine fraction suggesting their origin from anthropogenic sources. The EF value was observed to be maximum for As, Pb and Zn in the fine particulates. However, crustal derived elements showed very low EF values indicating their origin from soil. The PCA based multivariate studies identified soil, sea salt, combustion and Se sources as common sources for coarse and additionally an industrial source has also been identified for fine particles.

Keywords: EF analysis, PM10, PM2.5, PIXE, PCA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1856
10947 Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques

Authors: Filiz Ersoz, Taner Ersoz, Tugrul Bayraktar

Abstract:

Renewable energy is referred to as "clean energy" and common popular support for the use of renewable energy (RE) is to provide electricity with zero carbon dioxide emissions. This study provides useful insight into the European Union (EU) RE, especially, into electricity generation obtained from renewables, and their targets. The objective of this study is to identify groups of European countries, using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The conducted statistical hierarchical cluster analysis is based on the Ward’s clustering method and squared Euclidean distances. Hierarchical cluster analysis identified eight distinct clusters of European countries. Then, non-hierarchical clustering (k-means) method was applied. Discriminant analysis was used to determine the validity of the results with data normalized by Z score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.

Keywords: Share of electricity generation, CO2 emission, targets, multivariate methods, hierarchical clustering, K-means clustering, discriminant analyzed, correlation, EU member countries.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1247
10946 Assessment of EU Competitiveness Factors by Multivariate Methods

Authors: L. Melecký

Abstract:

Measurement of competitiveness between countries or regions is an important topic of many economic analysis and scientific papers. In European Union (EU), there is no mainstream approach of competitiveness evaluation and measuring. There are many opinions and methods of measurement and evaluation of competitiveness between states or regions at national and European level. The methods differ in structure of using the indicators of competitiveness and ways of their processing. The aim of the paper is to analyze main sources of competitive potential of the EU Member States with the help of Factor analysis (FA) and to classify the EU Member States to homogeneous units (clusters) according to the similarity of selected indicators of competitiveness factors by Cluster analysis (CA) in reference years 2000 and 2011. The theoretical part of the paper is devoted to the fundamental bases of competitiveness and the methodology of FA and CA methods. The empirical part of the paper deals with the evaluation of competitiveness factors in the EU Member States and cluster comparison of evaluated countries by cluster analysis. 

Keywords: Competitiveness, cluster analysis, EU, factor analysis, multivariate methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2045
10945 A Simplified Higher-Order Markov Chain Model

Authors: Chao Wang, Ting-Zhu Huang, Chen Jia

Abstract:

In this paper, we present a simplified higher-order Markov chain model for multiple categorical data sequences also called as simplified higher-order multivariate Markov chain model.

Keywords: Higher-order multivariate Markov chain model, Categorical data sequences, Multivariate Markov chain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3287
10944 Clustering Multivariate Empiric Characteristic Functions for Multi-Class SVM Classification

Authors: María-Dolores Cubiles-de-la-Vega, Rafael Pino-Mejías, Esther-Lydia Silva-Ramírez

Abstract:

A dissimilarity measure between the empiric characteristic functions of the subsamples associated to the different classes in a multivariate data set is proposed. This measure can be efficiently computed, and it depends on all the cases of each class. It may be used to find groups of similar classes, which could be joined for further analysis, or it could be employed to perform an agglomerative hierarchical cluster analysis of the set of classes. The final tree can serve to build a family of binary classification models, offering an alternative approach to the multi-class SVM problem. We have tested this dendrogram based SVM approach with the oneagainst- one SVM approach over four publicly available data sets, three of them being microarray data. Both performances have been found equivalent, but the first solution requires a smaller number of binary SVM models.

Keywords: Cluster Analysis, Empiric Characteristic Function, Multi-class SVM, R.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1877
10943 Developing Pedotransfer Functions for Estimating Some Soil Properties using Artificial Neural Network and Multivariate Regression Approaches

Authors: Fereydoon Sarmadian, Ali Keshavarzi

Abstract:

Study of soil properties like field capacity (F.C.) and permanent wilting point (P.W.P.) play important roles in study of soil moisture retention curve. Although these parameters can be measured directly, their measurement is difficult and expensive. Pedotransfer functions (PTFs) provide an alternative by estimating soil parameters from more readily available soil data. In this investigation, 70 soil samples were collected from different horizons of 15 soil profiles located in the Ziaran region, Qazvin province, Iran. The data set was divided into two subsets for calibration (80%) and testing (20%) of the models and their normality were tested by Kolmogorov-Smirnov method. Both multivariate regression and artificial neural network (ANN) techniques were employed to develop the appropriate PTFs for predicting soil parameters using easily measurable characteristics of clay, silt, O.C, S.P, B.D and CaCO3. The performance of the multivariate regression and ANN models was evaluated using an independent test data set. In order to evaluate the models, root mean square error (RMSE) and R2 were used. The comparison of RSME for two mentioned models showed that the ANN model gives better estimates of F.C and P.W.P than the multivariate regression model. The value of RMSE and R2 derived by ANN model for F.C and P.W.P were (2.35, 0.77) and (2.83, 0.72), respectively. The corresponding values for multivariate regression model were (4.46, 0.68) and (5.21, 0.64), respectively. Results showed that ANN with five neurons in hidden layer had better performance in predicting soil properties than multivariate regression.

Keywords: Artificial neural network, Field capacity, Permanentwilting point, Pedotransfer functions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1819
10942 On the Bootstrap P-Value Method in Identifying out of Control Signals in Multivariate Control Chart

Authors: O. Ikpotokin

Abstract:

In any production process, every product is aimed to attain a certain standard, but the presence of assignable cause of variability affects our process, thereby leading to low quality of product. The ability to identify and remove this type of variability reduces its overall effect, thereby improving the quality of the product. In case of a univariate control chart signal, it is easy to detect the problem and give a solution since it is related to a single quality characteristic. However, the problems involved in the use of multivariate control chart are the violation of multivariate normal assumption and the difficulty in identifying the quality characteristic(s) that resulted in the out of control signals. The purpose of this paper is to examine the use of non-parametric control chart (the bootstrap approach) for obtaining control limit to overcome the problem of multivariate distributional assumption and the p-value method for detecting out of control signals. Results from a performance study show that the proposed bootstrap method enables the setting of control limit that can enhance the detection of out of control signals when compared, while the p-value method also enhanced in identifying out of control variables.

Keywords: Bootstrap control limit, p-value method, out-of-control signals, p-value, quality characteristics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1012
10941 Comparison of Artificial Neural Network and Multivariate Regression Methods in Prediction of Soil Cation Exchange Capacity

Authors: Ali Keshavarzi, Fereydoon Sarmadian

Abstract:

Investigation of soil properties like Cation Exchange Capacity (CEC) plays important roles in study of environmental reaserches as the spatial and temporal variability of this property have been led to development of indirect methods in estimation of this soil characteristic. Pedotransfer functions (PTFs) provide an alternative by estimating soil parameters from more readily available soil data. 70 soil samples were collected from different horizons of 15 soil profiles located in the Ziaran region, Qazvin province, Iran. Then, multivariate regression and neural network model (feedforward back propagation network) were employed to develop a pedotransfer function for predicting soil parameter using easily measurable characteristics of clay and organic carbon. The performance of the multivariate regression and neural network model was evaluated using a test data set. In order to evaluate the models, root mean square error (RMSE) was used. The value of RMSE and R2 derived by ANN model for CEC were 0.47 and 0.94 respectively, while these parameters for multivariate regression model were 0.65 and 0.88 respectively. Results showed that artificial neural network with seven neurons in hidden layer had better performance in predicting soil cation exchange capacity than multivariate regression.

Keywords: Easily measurable characteristics, Feed-forwardback propagation, Pedotransfer functions, CEC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2210
10940 Holistic Face Recognition using Multivariate Approximation, Genetic Algorithms and AdaBoost Classifier: Preliminary Results

Authors: C. Villegas-Quezada, J. Climent

Abstract:

Several works regarding facial recognition have dealt with methods which identify isolated characteristics of the face or with templates which encompass several regions of it. In this paper a new technique which approaches the problem holistically dispensing with the need to identify geometrical characteristics or regions of the face is introduced. The characterization of a face is achieved by randomly sampling selected attributes of the pixels of its image. From this information we construct a set of data, which correspond to the values of low frequencies, gradient, entropy and another several characteristics of pixel of the image. Generating a set of “p" variables. The multivariate data set with different polynomials minimizing the data fitness error in the minimax sense (L∞ - Norm) is approximated. With the use of a Genetic Algorithm (GA) it is able to circumvent the problem of dimensionality inherent to higher degree polynomial approximations. The GA yields the degree and values of a set of coefficients of the polynomials approximating of the image of a face. By finding a family of characteristic polynomials from several variables (pixel characteristics) for each face (say Fi ) in the data base through a resampling process the system in use, is trained. A face (say F ) is recognized by finding its characteristic polynomials and using an AdaBoost Classifier from F -s polynomials to each of the Fi -s polynomials. The winner is the polynomial family closer to F -s corresponding to target face in data base.

Keywords: AdaBoost Classifier, Holistic Face Recognition, Minimax Multivariate Approximation, Genetic Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1496
10939 Landslide Susceptibility Mapping: A Comparison between Logistic Regression and Multivariate Adaptive Regression Spline Models in the Municipality of Oudka, Northern of Morocco

Authors: S. Benchelha, H. C. Aoudjehane, M. Hakdaoui, R. El Hamdouni, H. Mansouri, T. Benchelha, M. Layelmam, M. Alaoui

Abstract:

The logistic regression (LR) and multivariate adaptive regression spline (MarSpline) are applied and verified for analysis of landslide susceptibility map in Oudka, Morocco, using geographical information system. From spatial database containing data such as landslide mapping, topography, soil, hydrology and lithology, the eight factors related to landslides such as elevation, slope, aspect, distance to streams, distance to road, distance to faults, lithology map and Normalized Difference Vegetation Index (NDVI) were calculated or extracted. Using these factors, landslide susceptibility indexes were calculated by the two mentioned methods. Before the calculation, this database was divided into two parts, the first for the formation of the model and the second for the validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. The result of this verification was that the MarSpline model is the best model with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than the LR model (success rate AUC = 0.918, rate prediction AUC = 0.901).

Keywords: Landslide susceptibility mapping, regression logistic, multivariate adaptive regression spline, Oudka, Taounate, Morocco.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 989
10938 Multivariate Statistical Analysis of Decathlon Performance Results in Olympic Athletes (1988-2008)

Authors: Jaebum Park, Vladimir M. Zatsiorsky

Abstract:

The performance results of the athletes competed in the 1988-2008 Olympic Games were analyzed (n = 166). The data were obtained from the IAAF official protocols. In the principal component analysis, the first three principal components explained 70% of the total variance. In the 1st principal component (with 43.1% of total variance explained) the largest factor loadings were for 100m (0.89), 400m (0.81), 110m hurdle run (0.76), and long jump (–0.72). This factor can be interpreted as the 'sprinting performance'. The loadings on the 2nd factor (15.3% of the total variance) presented a counter-intuitive throwing-jumping combination: the highest loadings were for throwing events (javelin throwing 0.76; shot put 0.74; and discus throwing 0.73) and also for jumping events (high jump 0.62; pole vaulting 0.58). On the 3rd factor (11.6% of total variance), the largest loading was for 1500 m running (0.88); all other loadings were below 0.4.

Keywords: Decathlon, principal component analysis, Olympic Games, multivariate statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2810