Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8215

Search results for: multivariate analysis

8215 Multivariate Analysis of Spectroscopic Data for Agriculture Applications

Authors: Asmaa M. Hussein, Amr Wassal, Ahmed Farouk Al-Sadek, A. F. Abd El-Rahman

Abstract:

In this study, a multivariate analysis of potato spectroscopic data was presented to detect the presence of brown rot disease or not. Near-Infrared (NIR) spectroscopy (1,350-2,500 nm) combined with multivariate analysis was used as a rapid, non-destructive technique for the detection of brown rot disease in potatoes. Spectral measurements were performed in 565 samples, which were chosen randomly at the infection place in the potato slice. In this study, 254 infected and 311 uninfected (brown rot-free) samples were analyzed using different advanced statistical analysis techniques. The discrimination performance of different multivariate analysis techniques, including classification, pre-processing, and dimension reduction, were compared. Applying a random forest algorithm classifier with different pre-processing techniques to raw spectra had the best performance as the total classification accuracy of 98.7% was achieved in discriminating infected potatoes from control.

Keywords: Brown rot disease, NIR spectroscopy, potato, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 257
8214 Applying Gibbs Sampler for Multivariate Hierarchical Linear Model

Authors: Satoshi Usami

Abstract:

Among various HLM techniques, the Multivariate Hierarchical Linear Model (MHLM) is desirable to use, particularly when multivariate criterion variables are collected and the covariance structure has information valuable for data analysis. In order to reflect prior information or to obtain stable results when the sample size and the number of groups are not sufficiently large, the Bayes method has often been employed in hierarchical data analysis. In these cases, although the Markov Chain Monte Carlo (MCMC) method is a rather powerful tool for parameter estimation, Procedures regarding MCMC have not been formulated for MHLM. For this reason, this research presents concrete procedures for parameter estimation through the use of the Gibbs samplers. Lastly, several future topics for the use of MCMC approach for HLM is discussed.

Keywords: Gibbs sampler, Hierarchical Linear Model, Markov Chain Monte Carlo, Multivariate Hierarchical Linear Model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1584
8213 Multi Task Scheme to Monitor Multivariate Environments Using Artificial Neural Network

Authors: K. Atashgar

Abstract:

When an assignable cause(s) manifests itself to a multivariate process and the process shifts to an out-of-control condition, a root-cause analysis should be initiated by quality engineers to identify and eliminate the assignable cause(s) affected the process. A root-cause analysis in a multivariate process is more complex compared to a univariate process. In the case of a process involved several correlated variables an effective root-cause analysis can be only experienced when it is possible to identify the required knowledge including the out-of-control condition, the change point, and the variable(s) responsible to the out-of-control condition, all simultaneously. Although literature addresses different schemes to monitor multivariate processes, one can find few scientific reports focused on all the required knowledge. To the best of the author’s knowledge this is the first time that a multi task model based on artificial neural network (ANN) is reported to monitor all the required knowledge at the same time for a multivariate process with more than two correlated quality characteristics. The performance of the proposed scheme is evaluated numerically when different step shifts affect the mean vector. Average run length is used to investigate the performance of the proposed multi task model. The simulated results indicate the multi task scheme performs all the required knowledge effectively.

Keywords: Artificial neural network, Multivariate process, Statistical process control, Change point.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1378
8212 Effects of Video Games and Online Chat on Mathematics Performance in High School: An Approach of Multivariate Data Analysis

Authors: Lina Wu, Wenyi Lu, Ye Li

Abstract:

Regarding heavy video game players for boys and super online chat lovers for girls as a symbolic phrase in the current adolescent culture, this project of data analysis verifies the displacement effect on deteriorating mathematics performance. To evaluate correlation or regression coefficients between a factor of playing video games or chatting online and mathematics performance compared with other factors, we use multivariate analysis technique and take gender difference into account. We find the most important reason for the negative sign of the displacement effect on mathematics performance due to students’ poor academic background. Statistical analysis methods in this project could be applied to study internet users’ academic performance from the high school education to the college education.

Keywords: Correlation coefficients, displacement effect, gender difference, multivariate analysis technique, regression coefficients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1792
8211 Model of Optimal Centroids Approach for Multivariate Data Classification

Authors: Pham Van Nha, Le Cam Binh

Abstract:

Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered as a multidisciplinary optimization model that can be applied in various optimization problems. PSO’s ideas are simple and easy to understand but PSO is only applied in simple model problems. We think that in order to expand the applicability of PSO in complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO in a mathematical model and apply in the multivariate data classification. First, PSOs general mathematical model (MPSO) is analyzed as a universal optimization model. Then, Model of Optimal Centroids (MOC) is proposed for the multivariate data classification. Experiments were conducted on some benchmark data sets to prove the effectiveness of MOC compared with several proposed schemes.

Keywords: Analysis of optimization, artificial intelligence-based optimization, optimization for learning and data analysis, global optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 191
8210 A Multivariate Moving Average Control Chart for Photovoltaic Processes

Authors: Chunchom Pongchavalit

Abstract:

For the electrical metrics that describe photovoltaic cell performance are inherently multivariate in nature, use of a univariate, or one variable, statistical process control chart can have important limitations. Development of a comprehensive process control strategy is known to be significantly beneficial to reducing process variability that ultimately drives up the manufacturing cost photovoltaic cells. The multivariate moving average or MMA chart, is applied to the electrical metrics of photovoltaic cells to illustrate the improved sensitivity on process variability this method of control charting offers. The result show the ability of the MMA chart to expand to as any variables as needed, suggests an application with multiple photovoltaic electrical metrics being used in concert to determine the processes state of control.

Keywords: The multivariate moving average control chart, Photovoltaic processes control, Multivariate system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1015
8209 Time Series Forecasting Using Independent Component Analysis

Authors: Theodor D. Popescu

Abstract:

The paper presents a method for multivariate time series forecasting using Independent Component Analysis (ICA), as a preprocessing tool. The idea of this approach is to do the forecasting in the space of independent components (sources), and then to transform back the results to the original time series space. The forecasting can be done separately and with a different method for each component, depending on its time structure. The paper gives also a review of the main algorithms for independent component analysis in the case of instantaneous mixture models, using second and high-order statistics. The method has been applied in simulation to an artificial multivariate time series with five components, generated from three sources and a mixing matrix, randomly generated.

Keywords: Independent Component Analysis, second order statistics, simulation, time series forecasting

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1479
8208 A Multivariate Statistical Approach for Water Quality Assessment of River Hindon, India

Authors: Nida Rizvi, Deeksha Katyal, Varun Joshi

Abstract:

River Hindon is an important river catering the demand of highly populated rural and industrial cluster of western Uttar Pradesh, India. Water quality of river Hindon is deteriorating at an alarming rate due to various industrial, municipal and agricultural activities. The present study aimed at identifying the pollution sources and quantifying the degree to which these sources are responsible for the deteriorating water quality of the river. Various water quality parameters, like pH, temperature, electrical conductivity, total dissolved solids, total hardness, calcium, chloride, nitrate, sulphate, biological oxygen demand, chemical oxygen demand, and total alkalinity were assessed. Water quality data obtained from eight study sites for one year has been subjected to the two multivariate techniques, namely, principal component analysis and cluster analysis. Principal component analysis was applied with the aim to find out spatial variability and to identify the sources responsible for the water quality of the river. Three Varifactors were obtained after varimax rotation of initial principal components using principal component analysis. Cluster analysis was carried out to classify sampling stations of certain similarity, which grouped eight different sites into two clusters. The study reveals that the anthropogenic influence (municipal, industrial, waste water and agricultural runoff) was the major source of river water pollution. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and elucidation of multifaceted data sets, recognition of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

Keywords: Cluster analysis, multivariate statistical technique, river Hindon, water Quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2934
8207 Spatial Distribution and Risk Assessment of As, Hg, Co and Cr in Kaveh Industrial City, using Geostatistic and GIS

Authors: Abbas Hani

Abstract:

The concentrations of As, Hg, Co, Cr and Cd were tested for each soil sample, and their spatial patterns were analyzed by the semivariogram approach of geostatistics and geographical information system technology. Multivariate statistic approaches (principal component analysis and cluster analysis) were used to identify heavy metal sources and their spatial pattern. Principal component analysis coupled with correlation between heavy metals showed that primary inputs of As, Hg and Cd were due to anthropogenic while, Co, and Cr were associated with pedogenic factors. Ordinary kriging was carried out to map the spatial patters of heavy metals. The high pollution sources evaluated was related with usage of urban and industrial wastewater. The results of this study helpful for risk assessment of environmental pollution for decision making for industrial adjustment and remedy soil pollution.

Keywords: Geographic Information system, Geostatistics, Kaveh, Multivariate Statistical Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1668
8206 Assessment of EU Competitiveness Factors by Multivariate Methods

Authors: L. Melecký

Abstract:

Measurement of competitiveness between countries or regions is an important topic of many economic analysis and scientific papers. In European Union (EU), there is no mainstream approach of competitiveness evaluation and measuring. There are many opinions and methods of measurement and evaluation of competitiveness between states or regions at national and European level. The methods differ in structure of using the indicators of competitiveness and ways of their processing. The aim of the paper is to analyze main sources of competitive potential of the EU Member States with the help of Factor analysis (FA) and to classify the EU Member States to homogeneous units (clusters) according to the similarity of selected indicators of competitiveness factors by Cluster analysis (CA) in reference years 2000 and 2011. The theoretical part of the paper is devoted to the fundamental bases of competitiveness and the methodology of FA and CA methods. The empirical part of the paper deals with the evaluation of competitiveness factors in the EU Member States and cluster comparison of evaluated countries by cluster analysis. 

Keywords: Competitiveness, cluster analysis, EU, factor analysis, multivariate methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1717
8205 Irrigation Water Quality Evaluation Based on Multivariate Statistical Analysis: A Case Study of Jiaokou Irrigation District

Authors: Panpan Xu, Qiying Zhang, Hui Qian

Abstract:

Groundwater is main source of water supply in the Guanzhong Basin, China. To investigate the quality of groundwater for agricultural purposes in Jiaokou Irrigation District located in the east of the Guanzhong Basin, 141 groundwater samples were collected for analysis of major ions (K+, Na+, Mg2+, Ca2+, SO42-, Cl-, HCO3-, and CO32-), pH, and total dissolved solids (TDS). Sodium percentage (Na%), residual sodium carbonate (RSC), magnesium hazard (MH), and potential salinity (PS) were applied for irrigation water quality assessment. In addition, multivariate statistical techniques were used to identify the underlying hydrogeochemical processes. Results show that the content of TDS mainly depends on Cl-, Na+, Mg2+, and SO42-, and the HCO3- content is generally high except for the eastern sand area. These are responsible for complex hydrogeochemical processes, such as dissolution of carbonate minerals (dolomite and calcite), gypsum, halite, and silicate minerals, the cation exchange, as well as evaporation and concentration. The average evaluation levels of Na%, RSC, MH, and PS for irrigation water quality are doubtful, good, unsuitable, and injurious to unsatisfactory, respectively. Therefore, it is necessary for decision makers to comprehensively consider the indicators and thus reasonably evaluate the irrigation water quality.

Keywords: Irrigation water quality, multivariate statistical analysis, groundwater, hydrogeochemical process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 183
8204 Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques

Authors: Filiz Ersoz, Taner Ersoz, Tugrul Bayraktar

Abstract:

Renewable energy is referred to as "clean energy" and common popular support for the use of renewable energy (RE) is to provide electricity with zero carbon dioxide emissions. This study provides useful insight into the European Union (EU) RE, especially, into electricity generation obtained from renewables, and their targets. The objective of this study is to identify groups of European countries, using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The conducted statistical hierarchical cluster analysis is based on the Ward’s clustering method and squared Euclidean distances. Hierarchical cluster analysis identified eight distinct clusters of European countries. Then, non-hierarchical clustering (k-means) method was applied. Discriminant analysis was used to determine the validity of the results with data normalized by Z score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.

Keywords: Share of electricity generation, CO2 emission, targets, multivariate methods, hierarchical clustering, K-means clustering, discriminant analyzed, correlation, EU member countries.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 811
8203 Clustering Multivariate Empiric Characteristic Functions for Multi-Class SVM Classification

Authors: María-Dolores Cubiles-de-la-Vega, Rafael Pino-Mejías, Esther-Lydia Silva-Ramírez

Abstract:

A dissimilarity measure between the empiric characteristic functions of the subsamples associated to the different classes in a multivariate data set is proposed. This measure can be efficiently computed, and it depends on all the cases of each class. It may be used to find groups of similar classes, which could be joined for further analysis, or it could be employed to perform an agglomerative hierarchical cluster analysis of the set of classes. The final tree can serve to build a family of binary classification models, offering an alternative approach to the multi-class SVM problem. We have tested this dendrogram based SVM approach with the oneagainst- one SVM approach over four publicly available data sets, three of them being microarray data. Both performances have been found equivalent, but the first solution requires a smaller number of binary SVM models.

Keywords: Cluster Analysis, Empiric Characteristic Function, Multi-class SVM, R.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1602
8202 Multivariate Statistical Analysis of Decathlon Performance Results in Olympic Athletes (1988-2008)

Authors: Jaebum Park, Vladimir M. Zatsiorsky

Abstract:

The performance results of the athletes competed in the 1988-2008 Olympic Games were analyzed (n = 166). The data were obtained from the IAAF official protocols. In the principal component analysis, the first three principal components explained 70% of the total variance. In the 1st principal component (with 43.1% of total variance explained) the largest factor loadings were for 100m (0.89), 400m (0.81), 110m hurdle run (0.76), and long jump (–0.72). This factor can be interpreted as the 'sprinting performance'. The loadings on the 2nd factor (15.3% of the total variance) presented a counter-intuitive throwing-jumping combination: the highest loadings were for throwing events (javelin throwing 0.76; shot put 0.74; and discus throwing 0.73) and also for jumping events (high jump 0.62; pole vaulting 0.58). On the 3rd factor (11.6% of total variance), the largest loading was for 1500 m running (0.88); all other loadings were below 0.4.

Keywords: Decathlon, principal component analysis, Olympic Games, multivariate statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2478
8201 An AK-Chart for the Non-Normal Data

Authors: Chia-Hau Liu, Tai-Yue Wang

Abstract:

Traditional multivariate control charts assume that measurement from manufacturing processes follows a multivariate normal distribution. However, this assumption may not hold or may be difficult to verify because not all the measurement from manufacturing processes are normal distributed in practice. This study develops a new multivariate control chart for monitoring the processes with non-normal data. We propose a mechanism based on integrating the one-class classification method and the adaptive technique. The adaptive technique is used to improve the sensitivity to small shift on one-class classification in statistical process control. In addition, this design provides an easy way to allocate the value of type I error so it is easier to be implemented. Finally, the simulation study and the real data from industry are used to demonstrate the effectiveness of the propose control charts.

Keywords: Multivariate control chart, statistical process control, one-class classification method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1917
8200 Interpreting the Out-of-Control Signals of Multivariate Control Charts Employing Neural Networks

Authors: Francisco Aparisi, José Sanz

Abstract:

Multivariate quality control charts show some advantages to monitor several variables in comparison with the simultaneous use of univariate charts, nevertheless, there are some disadvantages. The main problem is how to interpret the out-ofcontrol signal of a multivariate chart. For example, in the case of control charts designed to monitor the mean vector, the chart signals showing that it must be accepted that there is a shift in the vector, but no indication is given about the variables that have produced this shift. The MEWMA quality control chart is a very powerful scheme to detect small shifts in the mean vector. There are no previous specific works about the interpretation of the out-of-control signal of this chart. In this paper neural networks are designed to interpret the out-of-control signal of the MEWMA chart, and the percentage of correct classifications is studied for different cases.

Keywords: Multivariate quality control, Artificial Intelligence, Neural Networks, Computer Applications

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2127
8199 Landslide Susceptibility Mapping: A Comparison between Logistic Regression and Multivariate Adaptive Regression Spline Models in the Municipality of Oudka, Northern of Morocco

Authors: S. Benchelha, H. C. Aoudjehane, M. Hakdaoui, R. El Hamdouni, H. Mansouri, T. Benchelha, M. Layelmam, M. Alaoui

Abstract:

The logistic regression (LR) and multivariate adaptive regression spline (MarSpline) are applied and verified for analysis of landslide susceptibility map in Oudka, Morocco, using geographical information system. From spatial database containing data such as landslide mapping, topography, soil, hydrology and lithology, the eight factors related to landslides such as elevation, slope, aspect, distance to streams, distance to road, distance to faults, lithology map and Normalized Difference Vegetation Index (NDVI) were calculated or extracted. Using these factors, landslide susceptibility indexes were calculated by the two mentioned methods. Before the calculation, this database was divided into two parts, the first for the formation of the model and the second for the validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. The result of this verification was that the MarSpline model is the best model with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than the LR model (success rate AUC = 0.918, rate prediction AUC = 0.901).

Keywords: Landslide susceptibility mapping, regression logistic, multivariate adaptive regression spline, Oudka, Taounate, Morocco.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 461
8198 Fault Detection of Drinking Water Treatment Process Using PCA and Hotelling's T2 Chart

Authors: Joval P George, Dr. Zheng Chen, Philip Shaw

Abstract:

This paper deals with the application of Principal Component Analysis (PCA) and the Hotelling-s T2 Chart, using data collected from a drinking water treatment process. PCA is applied primarily for the dimensional reduction of the collected data. The Hotelling-s T2 control chart was used for the fault detection of the process. The data was taken from a United Utilities Multistage Water Treatment Works downloaded from an Integrated Program Management (IPM) dashboard system. The analysis of the results show that Multivariate Statistical Process Control (MSPC) techniques such as PCA, and control charts such as Hotelling-s T2, can be effectively applied for the early fault detection of continuous multivariable processes such as Drinking Water Treatment. The software package SIMCA-P was used to develop the MSPC models and Hotelling-s T2 Chart from the collected data.

Keywords: Principal component analysis, hotelling's t2 chart, multivariate statistical process control, drinking water treatment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2397
8197 A Lean Manufacturing Profile of Practices in the Metallurgical Industry: A Methodology for Multivariate Analysis

Authors: Jonathan D. Morales M., Ramón Silva R.

Abstract:

The purpose of this project is to carry out an analysis and determine the profile of actual lean manufacturing processes in the Metropolitan Area of Bucaramanga. Through the analysis of qualitative and quantitative variables it was possible to establish how these manufacturers develop production practices that ensure their competitiveness and productivity in the market. In this study, a random sample of metallurgic and wrought iron companies was applied, following which a quantitative focus and analysis was used to formulate a qualitative methodology for measuring the level of lean manufacturing procedures in the industry. A qualitative evaluation was also carried out through a multivariate analysis using the Numerical Taxonomy System (NTSYS) program which should allow for the determination of Lean Manufacturing profiles. Through the results it was possible to observe how the companies in the sector are doing with respect to Lean Manufacturing Practices, as well as identify the level of management that these companies practice with respect to this topic. In addition, it was possible to ascertain that there is no one dominant profile in the sector when it comes to Lean Manufacturing. It was established that the companies in the metallurgic and wrought iron industry show low levels of Lean Manufacturing implementation. Each one carries out diverse actions that are insufficient to consolidate a sectoral strategy for developing a competitive advantage which enables them to tie together a production strategy.

Keywords: Lean manufacturing, metallurgic industry, production line management, productivity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1634
8196 A Simplified Higher-Order Markov Chain Model

Authors: Chao Wang, Ting-Zhu Huang, Chen Jia

Abstract:

In this paper, we present a simplified higher-order Markov chain model for multiple categorical data sequences also called as simplified higher-order multivariate Markov chain model.

Keywords: Higher-order multivariate Markov chain model, Categorical data sequences, Multivariate Markov chain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2856
8195 Developing Pedotransfer Functions for Estimating Some Soil Properties using Artificial Neural Network and Multivariate Regression Approaches

Authors: Fereydoon Sarmadian, Ali Keshavarzi

Abstract:

Study of soil properties like field capacity (F.C.) and permanent wilting point (P.W.P.) play important roles in study of soil moisture retention curve. Although these parameters can be measured directly, their measurement is difficult and expensive. Pedotransfer functions (PTFs) provide an alternative by estimating soil parameters from more readily available soil data. In this investigation, 70 soil samples were collected from different horizons of 15 soil profiles located in the Ziaran region, Qazvin province, Iran. The data set was divided into two subsets for calibration (80%) and testing (20%) of the models and their normality were tested by Kolmogorov-Smirnov method. Both multivariate regression and artificial neural network (ANN) techniques were employed to develop the appropriate PTFs for predicting soil parameters using easily measurable characteristics of clay, silt, O.C, S.P, B.D and CaCO3. The performance of the multivariate regression and ANN models was evaluated using an independent test data set. In order to evaluate the models, root mean square error (RMSE) and R2 were used. The comparison of RSME for two mentioned models showed that the ANN model gives better estimates of F.C and P.W.P than the multivariate regression model. The value of RMSE and R2 derived by ANN model for F.C and P.W.P were (2.35, 0.77) and (2.83, 0.72), respectively. The corresponding values for multivariate regression model were (4.46, 0.68) and (5.21, 0.64), respectively. Results showed that ANN with five neurons in hidden layer had better performance in predicting soil properties than multivariate regression.

Keywords: Artificial neural network, Field capacity, Permanentwilting point, Pedotransfer functions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1505
8194 Sensor Monitoring of the Concentrations of Different Gases Present in Synthesis of Ammonia Based On Multi-Scale Entropy and Multivariate Statistics

Authors: S. Aouabdi, M. Taibi

Abstract:

This paper presents powerful techniques for the development of a new monitoring method based on multi-scale entropy (MSE) in order to characterize the behaviour of the concentrations of different gases present in the synthesis of Ammonia and soft-sensor based on Principal Component Analysis (PCA).

Keywords: Ammonia synthesis, concentrations of different gases, soft sensor, multi-scale entropy, multivariate statistics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1592
8193 Multivariate High Order Fuzzy Time Series Forecasting for Car Road Accidents

Authors: Tahseen A. Jilani, S. M. Aqil Burney, C. Ardil

Abstract:

In this paper, we have presented a new multivariate fuzzy time series forecasting method. This method assumes mfactors with one main factor of interest. History of past three years is used for making new forecasts. This new method is applied in forecasting total number of car accidents in Belgium using four secondary factors. We also make comparison of our proposed method with existing methods of fuzzy time series forecasting. Experimentally, it is shown that our proposed method perform better than existing fuzzy time series forecasting methods. Practically, actuaries are interested in analysis of the patterns of causalities in road accidents. Thus using fuzzy time series, actuaries can define fuzzy premium and fuzzy underwriting of car insurance and life insurance for car insurance. National Institute of Statistics, Belgium provides region of risk classification for each road. Thus using this risk classification, we can predict premium rate and underwriting of insurance policy holders.

Keywords: Average forecasting error rate (AFER), Fuzziness offuzzy sets Fuzzy, If-Then rules, Multivariate fuzzy time series.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2134
8192 On the Bootstrap P-Value Method in Identifying out of Control Signals in Multivariate Control Chart

Authors: O. Ikpotokin

Abstract:

In any production process, every product is aimed to attain a certain standard, but the presence of assignable cause of variability affects our process, thereby leading to low quality of product. The ability to identify and remove this type of variability reduces its overall effect, thereby improving the quality of the product. In case of a univariate control chart signal, it is easy to detect the problem and give a solution since it is related to a single quality characteristic. However, the problems involved in the use of multivariate control chart are the violation of multivariate normal assumption and the difficulty in identifying the quality characteristic(s) that resulted in the out of control signals. The purpose of this paper is to examine the use of non-parametric control chart (the bootstrap approach) for obtaining control limit to overcome the problem of multivariate distributional assumption and the p-value method for detecting out of control signals. Results from a performance study show that the proposed bootstrap method enables the setting of control limit that can enhance the detection of out of control signals when compared, while the p-value method also enhanced in identifying out of control variables.

Keywords: Bootstrap control limit, p-value method, out-of-control signals, p-value, quality characteristics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 582
8191 Comparison of Artificial Neural Network and Multivariate Regression Methods in Prediction of Soil Cation Exchange Capacity

Authors: Ali Keshavarzi, Fereydoon Sarmadian

Abstract:

Investigation of soil properties like Cation Exchange Capacity (CEC) plays important roles in study of environmental reaserches as the spatial and temporal variability of this property have been led to development of indirect methods in estimation of this soil characteristic. Pedotransfer functions (PTFs) provide an alternative by estimating soil parameters from more readily available soil data. 70 soil samples were collected from different horizons of 15 soil profiles located in the Ziaran region, Qazvin province, Iran. Then, multivariate regression and neural network model (feedforward back propagation network) were employed to develop a pedotransfer function for predicting soil parameter using easily measurable characteristics of clay and organic carbon. The performance of the multivariate regression and neural network model was evaluated using a test data set. In order to evaluate the models, root mean square error (RMSE) was used. The value of RMSE and R2 derived by ANN model for CEC were 0.47 and 0.94 respectively, while these parameters for multivariate regression model were 0.65 and 0.88 respectively. Results showed that artificial neural network with seven neurons in hidden layer had better performance in predicting soil cation exchange capacity than multivariate regression.

Keywords: Easily measurable characteristics, Feed-forwardback propagation, Pedotransfer functions, CEC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1769
8190 Generating Normally Distributed Clusters by Means of a Self-organizing Growing Neural Network– An Application to Market Segmentation –

Authors: Reinhold Decker, Christian Holsing, Sascha Lerke

Abstract:

This paper presents a new growing neural network for cluster analysis and market segmentation, which optimizes the size and structure of clusters by iteratively checking them for multivariate normality. We combine the recently published SGNN approach [8] with the basic principle underlying the Gaussian-means algorithm [13] and the Mardia test for multivariate normality [18, 19]. The new approach distinguishes from existing ones by its holistic design and its great autonomy regarding the clustering process as a whole. Its performance is demonstrated by means of synthetic 2D data and by real lifestyle survey data usable for market segmentation.

Keywords: Artificial neural network, clustering, multivariatenormality, market segmentation, self-organization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 974
8189 Qualitative Data Analysis for Health Care Services

Authors: Taner Ersoz, Filiz Ersoz

Abstract:

This study was designed enable application of multivariate technique in the interpretation of categorical data for measuring health care services satisfaction in Turkey. The data was collected from a total of 17726 respondents. The establishment of the sample group and collection of the data were carried out by a joint team from The Ministry of Health and Turkish Statistical Institute (Turk Stat) of Turkey. The multiple correspondence analysis (MCA) was used on the data of 2882 respondents who answered the questionnaire in full. The multiple correspondence analysis indicated that, in the evaluation of health services females, public employees, younger and more highly educated individuals were more concerned and complainant than males, private sector employees, older and less educated individuals. Overall 53 % of the respondents were pleased with the improvements in health care services in the past three years. This study demonstrates the public consciousness in health services and health care satisfaction in Turkey. It was found that most the respondents were pleased with the improvements in health care services over the past three years. Awareness of health service quality increases with education levels. Older individuals and males would appear to have lower expectancies in health services.

Keywords: Multiple correspondence analysis, optimal scaling, multivariate categorical data, health care services, health satisfaction survey, statistical visualizing, Turkey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 484
8188 Alternative Computational Arrangements on g-Group (g > 2) Profile Analysis

Authors: Emmanuel U. Ohaegbulem, Felix N. Nwobi

Abstract:

Alternative and simple computational arrangements in carrying out multivariate profile analysis when more than two groups (populations) are involved are presented. These arrangements have been demonstrated to not only yield equivalent results for the test statistics (the Wilks lambdas), but they have less computational efforts relative to other arrangements so far presented in the literature; in addition to being quite simple and easy to apply.

Keywords: Coincident profiles, g-group profile analysis, level profiles, parallel profiles, repeated measures MANOVA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 927
8187 Principal Component Analysis using Singular Value Decomposition of Microarray Data

Authors: Dong Hoon Lim

Abstract:

A series of microarray experiments produces observations of differential expression for thousands of genes across multiple conditions. Principal component analysis(PCA) has been widely used in multivariate data analysis to reduce the dimensionality of the data in order to simplify subsequent analysis and allow for summarization of the data in a parsimonious manner. PCA, which can be implemented via a singular value decomposition(SVD), is useful for analysis of microarray data. For application of PCA using SVD we use the DNA microarray data for the small round blue cell tumors(SRBCT) of childhood by Khan et al.(2001). To decide the number of components which account for sufficient amount of information we draw scree plot. Biplot, a graphic display associated with PCA, reveals important features that exhibit relationship between variables and also the relationship of variables with observations.

Keywords: Principal component analysis, singular value decomposition, microarray data, SRBCT

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2963
8186 Simulation of Sample Paths of Non Gaussian Stationary Random Fields

Authors: Fabrice Poirion, Benedicte Puig

Abstract:

Mathematical justifications are given for a simulation technique of multivariate nonGaussian random processes and fields based on Rosenblatt-s transformation of Gaussian processes. Different types of convergences are given for the approaching sequence. Moreover an original numerical method is proposed in order to solve the functional equation yielding the underlying Gaussian process autocorrelation function.

Keywords: Simulation, nonGaussian, random field, multivariate, stochastic process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1558