Search results for: multivariate chemometric
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 671

Search results for: multivariate chemometric

641 Multivariate Statistical Process Monitoring of Base Metal Flotation Plant Using Dissimilarity Scale-Based Singular Spectrum Analysis

Authors: Syamala Krishnannair

Abstract:

A multivariate statistical process monitoring methodology using dissimilarity scale-based singular spectrum analysis (SSA) is proposed for the detection and diagnosis of process faults in the base metal flotation plant. Process faults are detected based on the multi-level decomposition of process signals by SSA using the dissimilarity structure of the process data and the subsequent monitoring of the multiscale signals using the unified monitoring index which combines T² with SPE. Contribution plots are used to identify the root causes of the process faults. The overall results indicated that the proposed technique outperformed the conventional multivariate techniques in the detection and diagnosis of the process faults in the flotation plant.

Keywords: fault detection, fault diagnosis, process monitoring, dissimilarity scale

Procedia PDF Downloads 178
640 Discrimination Between Bacillus and Alicyclobacillus Isolates in Apple Juice by Fourier Transform Infrared Spectroscopy and Multivariate Analysis

Authors: Murada Alholy, Mengshi Lin, Omar Alhaj, Mahmoud Abugoush

Abstract:

Alicyclobacillus is a causative agent of spoilage in pasteurized and heat-treated apple juice products. Differentiating between this genus and the closely related Bacillus is crucially important. In this study, Fourier transform infrared spectroscopy (FT-IR) was used to identify and discriminate between four Alicyclobacillus strains and four Bacillus isolates inoculated individually into apple juice. Loading plots over the range of 1350 and 1700 cm-1 reflected the most distinctive biochemical features of Bacillus and Alicyclobacillus. Multivariate statistical methods (e.g. principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA)) were used to analyze the spectral data. Distinctive separation of spectral samples was observed. This study demonstrates that FT-IR spectroscopy in combination with multivariate analysis could serve as a rapid and effective tool for fruit juice industry to differentiate between Bacillus and Alicyclobacillus and to distinguish between species belonging to these two genera.

Keywords: alicyclobacillus, bacillus, FT-IR, spectroscopy, PCA

Procedia PDF Downloads 455
639 Polycyclic Aromatic Hydrocarbons: Pollution and Ecological Risk Assessment in Surface Soil of the Tezpur Town, on the North Bank of the Brahmaputra River, Assam, India

Authors: Kali Prasad Sarma, Nibedita Baul, Jinu Deka

Abstract:

In the present study, pollution level of polycyclic aromatic hydrocarbon (PAH) in surface soil of historic Tezpur town located in the north bank of the River Brahmaputra were evaluated. In order to determine the seasonal distribution and concentration level of 16 USEPA priority PAHs surface soil samples were collected from 12 different sampling sites with various land use type. The total concentrations of 16 PAHs (∑16 PAHs) varied from 242.68µgkg-1to 7901.89µgkg-1. Concentration of total probable carcinogenic PAH ranged between 7.285µgkg-1 and 479.184 µgkg-1 in different seasons. However, the concentration of BaP, the most carcinogenic PAH, was found in the range of BDL to 50.01 µgkg-1. The composition profiles of PAHs in 3 different seasons were characterized by following two different types of ring: (1) 4-ring PAHs, contributed to highest percentage of total PAHs (43.75%) (2) while in pre- and post- monsoon season 3- ring compounds dominated the PAH profile, contributing 65.58% and 74.41% respectively. A high PAHs concentration with significant seasonality and high abundance of LMWPAHs was observed in Tezpur town. Soil PAHs toxicity was evaluated taking toxic equivalency factors (TEFs), which quantify the carcinogenic potential of other PAHs relative to BaP and estimate benzo[a]pyrene-equivalent concentration (BaPeq). The calculated BaPeq value signifies considerable risk to contact with soil PAHs. We applied cluster analysis and principal component analysis (PCA) with multivariate linear regression (MLR) to apportion sources of polycyclic aromatic hydrocarbons (PAHs) in surface soil of Tezpur town, based on the measured PAH concentrations. The results indicate that petrogenic and pyrogenic sources are the important sources of PAHs. A combination of chemometric and molecular indices were used to identify the sources of PAHs, which could be attributed to vehicle emissions, a mixed source input, natural gas combustion, wood or biomass burning and coal combustion. Source apportionment using absolute principle component scores–multiple linear regression showed that the main sources of PAHs are 22.3% mix sources comprising of diesel and biomass combustion and petroleum spill,13.55% from vehicle emission, 9.15% from diesel and natural gas burning, 38.05% from wood and biomass burning and 16.95% contribute coal combustion. Pyrogenic input was found to dominate source of PAHs origin with more contribution from vehicular exhaust. PAHs have often been found to co-emit with other environmental pollutants like heavy metals due to similar source of origin. A positive correlation was observed between PAH with Cr and Pb (r2 = 0.54 and 0.55 respectively) in monsoon season and PAH with Cd and Pb (r2 = 0.54 and 0.61 respectively) indicating their common source. Strong correlation was observed between PAH and OC during pre- and post- monsoon (r2=0.46 and r2=0.65 respectively) whereas during monsoon season no significant correlation was observed (r2=0.24).

Keywords: polycyclic aromatic hydrocarbon, Tezpur town, chemometric analysis, ecological risk assessment, pollution

Procedia PDF Downloads 189
638 Fast Bayesian Inference of Multivariate Block-Nearest Neighbor Gaussian Process (NNGP) Models for Large Data

Authors: Carlos Gonzales, Zaida Quiroz, Marcos Prates

Abstract:

Several spatial variables collected at the same location that share a common spatial distribution can be modeled simultaneously through a multivariate geostatistical model that takes into account the correlation between these variables and the spatial autocorrelation. The main goal of this model is to perform spatial prediction of these variables in the region of study. Here we focus on a geostatistical multivariate formulation that relies on sharing common spatial random effect terms. In particular, the first response variable can be modeled by a mean that incorporates a shared random spatial effect, while the other response variables depend on this shared spatial term, in addition to specific random spatial effects. Each spatial random effect is defined through a Gaussian process with a valid covariance function, but in order to improve the computational efficiency when the data are large, each Gaussian process is approximated to a Gaussian random Markov field (GRMF), specifically to the block nearest neighbor Gaussian process (Block-NNGP). This approach involves dividing the spatial domain into several dependent blocks under certain constraints, where the cross blocks allow capturing the spatial dependence on a large scale, while each individual block captures the spatial dependence on a smaller scale. The multivariate geostatistical model belongs to the class of Latent Gaussian Models; thus, to achieve fast Bayesian inference, it is used the integrated nested Laplace approximation (INLA) method. The good performance of the proposed model is shown through simulations and applications for massive data.

Keywords: Block-NNGP, geostatistics, gaussian process, GRMF, INLA, multivariate models.

Procedia PDF Downloads 67
637 Effects of Video Games and Online Chat on Mathematics Performance in High School: An Approach of Multivariate Data Analysis

Authors: Lina Wu, Wenyi Lu, Ye Li

Abstract:

Regarding heavy video game players for boys and super online chat lovers for girls as a symbolic phrase in the current adolescent culture, this project of data analysis verifies the displacement effect on deteriorating mathematics performance. To evaluate correlation or regression coefficients between a factor of playing video games or chatting online and mathematics performance compared with other factors, we use multivariate analysis technique and take gender difference into account. We find the most important reason for the negative sign of the displacement effect on mathematics performance due to students’ poor academic background. Statistical analysis methods in this project could be applied to study internet users’ academic performance from the high school education to the college education.

Keywords: correlation coefficients, displacement effect, multivariate analysis technique, regression coefficients

Procedia PDF Downloads 331
636 Development, Optimization, and Validation of a Synchronous Fluorescence Spectroscopic Method with Multivariate Calibration for the Determination of Amlodipine and Olmesartan Implementing: Experimental Design

Authors: Noha Ibrahim, Eman S. Elzanfaly, Said A. Hassan, Ahmed E. El Gendy

Abstract:

Objectives: The purpose of the study is to develop a sensitive synchronous spectrofluorimetric method with multivariate calibration after studying and optimizing the different variables affecting the native fluorescence intensity of amlodipine and olmesartan implementing an experimental design approach. Method: In the first step, the fractional factorial design used to screen independent factors affecting the intensity of both drugs. The objective of the second step was to optimize the method performance using a Central Composite Face-centred (CCF) design. The optimal experimental conditions obtained from this study were; a temperature of (15°C ± 0.5), the solvent of 0.05N HCl and methanol with a ratio of (90:10, v/v respectively), Δλ of 42 and the addition of 1.48 % surfactant providing a sensitive measurement of amlodipine and olmesartan. The resolution of the binary mixture with a multivariate calibration method has been accomplished mainly by using partial least squares (PLS) model. Results: The recovery percentage for amlodipine besylate and atorvastatin calcium in tablets dosage form were found to be (102 ± 0.24, 99.56 ± 0.10, for amlodipine and Olmesartan, respectively). Conclusion: Method is valid according to some International Conference on Harmonization (ICH) guidelines, providing to be linear over a range of 200-300, 500-1500 ng mL⁻¹ for amlodipine and Olmesartan. The methods were successful to estimate amlodipine besylate and olmesartan in bulk powder and pharmaceutical preparation.

Keywords: amlodipine, central composite face-centred design, experimental design, fractional factorial design, multivariate calibration, olmesartan

Procedia PDF Downloads 122
635 New Gas Geothermometers for the Prediction of Subsurface Geothermal Temperatures: An Optimized Application of Artificial Neural Networks and Geochemometric Analysis

Authors: Edgar Santoyo, Daniel Perez-Zarate, Agustin Acevedo, Lorena Diaz-Gonzalez, Mirna Guevara

Abstract:

Four new gas geothermometers have been derived from a multivariate geo chemometric analysis of a geothermal fluid chemistry database, two of which use the natural logarithm of CO₂ and H2S concentrations (mmol/mol), respectively, and the other two use the natural logarithm of the H₂S/H₂ and CO₂/H₂ ratios. As a strict compilation criterion, the database was created with gas-phase composition of fluids and bottomhole temperatures (BHTM) measured in producing wells. The calibration of the geothermometers was based on the geochemical relationship existing between the gas-phase composition of well discharges and the equilibrium temperatures measured at bottomhole conditions. Multivariate statistical analysis together with the use of artificial neural networks (ANN) was successfully applied for correlating the gas-phase compositions and the BHTM. The predicted or simulated bottomhole temperatures (BHTANN), defined as output neurons or simulation targets, were statistically compared with measured temperatures (BHTM). The coefficients of the new geothermometers were obtained from an optimized self-adjusting training algorithm applied to approximately 2,080 ANN architectures with 15,000 simulation iterations each one. The self-adjusting training algorithm used the well-known Levenberg-Marquardt model, which was used to calculate: (i) the number of neurons of the hidden layer; (ii) the training factor and the training patterns of the ANN; (iii) the linear correlation coefficient, R; (iv) the synaptic weighting coefficients; and (v) the statistical parameter, Root Mean Squared Error (RMSE) to evaluate the prediction performance between the BHTM and the simulated BHTANN. The prediction performance of the new gas geothermometers together with those predictions inferred from sixteen well-known gas geothermometers (previously developed) was statistically evaluated by using an external database for avoiding a bias problem. Statistical evaluation was performed through the analysis of the lowest RMSE values computed among the predictions of all the gas geothermometers. The new gas geothermometers developed in this work have been successfully used for predicting subsurface temperatures in high-temperature geothermal systems of Mexico (e.g., Los Azufres, Mich., Los Humeros, Pue., and Cerro Prieto, B.C.) as well as in a blind geothermal system (known as Acoculco, Puebla). The last results of the gas geothermometers (inferred from gas-phase compositions of soil-gas bubble emissions) compare well with the temperature measured in two wells of the blind geothermal system of Acoculco, Puebla (México). Details of this new development are outlined in the present research work. Acknowledgements: The authors acknowledge the funding received from CeMIE-Geo P09 project (SENER-CONACyT).

Keywords: artificial intelligence, gas geochemistry, geochemometrics, geothermal energy

Procedia PDF Downloads 312
634 Simultaneous Determination of Six Characterizing/Quality Parameters of Biodiesels via 1H NMR and Multivariate Calibration

Authors: Gustavo G. Shimamoto, Matthieu Tubino

Abstract:

The characterization and the quality of biodiesel samples are checked by determining several parameters. Considering a large number of analysis to be performed, as well as the disadvantages of the use of toxic solvents and waste generation, multivariate calibration is suggested to reduce the number of tests. In this work, hydrogen nuclear magnetic resonance (1H NMR) spectra were used to build multivariate models, from partial least squares (PLS) regression, in order to determine simultaneously six important characterizing and/or quality parameters of biodiesels: density at 20 ºC, kinematic viscosity at 40 ºC, iodine value, acid number, oxidative stability, and water content. Biodiesels from twelve different oils sources were used in this study: babassu, brown flaxseed, canola, corn, cottonseed, macauba almond, microalgae, palm kernel, residual frying, sesame, soybean, and sunflower. 1H NMR reflects the structures of the compounds present in biodiesel samples and showed suitable correlations with the six parameters. The PLS models were constructed with latent variables between 5 and 7, the obtained values of r(cal) and r(val) were greater than 0.994 and 0.989, respectively. In addition, the models were considered suitable to predict all the six parameters for external samples, taking into account the analytical speed to perform it. Thus, the alliance between 1H NMR and PLS showed to be appropriate to characterize and evaluate the quality of biodiesels, reducing significantly analysis time, the consumption of reagents/solvents, and waste generation. Therefore, the proposed methods can be considered to adhere to the principles of green chemistry.

Keywords: biodiesel, multivariate calibration, nuclear magnetic resonance, quality parameters

Procedia PDF Downloads 501
633 Integrating the Athena Vortex Lattice Code into a Multivariate Design Synthesis Optimisation Platform in JAVA

Authors: Paul Okonkwo, Howard Smith

Abstract:

This paper describes a methodology to integrate the Athena Vortex Lattice Aerodynamic Software for automated operation in a multivariate optimisation of the Blended Wing Body Aircraft. The Athena Vortex Lattice code developed at the Massachusetts Institute of Technology by Mark Drela allows for the aerodynamic analysis of aircraft using the vortex lattice method. Ordinarily, the Athena Vortex Lattice operation requires a text file containing the aircraft geometry to be loaded into the AVL solver in order to determine the aerodynamic forces and moments. However, automated operation will be required to enable integration into a multidisciplinary optimisation framework. Automated AVL operation within the JAVA design environment will nonetheless require a modification and recompilation of AVL source code into an executable file capable of running on windows and other platforms without the –X11 libraries. This paper describes the procedure for the integrating the FORTRAN written AVL software for automated operation within the multivariate design synthesis optimisation framework for the conceptual design of the BWB aircraft.

Keywords: aerodynamics, automation, optimisation, AVL, JNI

Procedia PDF Downloads 560
632 White Wine Discrimination Based on Deconvoluted Surface Enhanced Raman Spectroscopy Signals

Authors: Dana Alina Magdas, Nicoleta Simona Vedeanu, Ioana Feher, Rares Stiufiuc

Abstract:

Food and beverages authentication using rapid and non-expensive analytical tools represents nowadays an important challenge. In this regard, the potential of vibrational techniques in food authentication has gained an increased attention during the last years. For wines discrimination, Raman spectroscopy appears more feasible to be used as compared with IR (infrared) spectroscopy, because of the relatively weak water bending mode in the vibrational spectroscopy fingerprint range. Despite this, the use of Raman technique in wine discrimination is in an early stage. Taking this into consideration, the wine discrimination potential of surface-enhanced Raman scattering (SERS) technique is reported in the present work. The novelty of this study, compared with the previously reported studies, concerning the application of vibrational techniques in wine discrimination consists in the fact that the present work presents the wines differentiation based on the individual signals obtained from deconvoluted spectra. In order to achieve wines classification with respect to variety, geographical origin and vintage, the peaks intensities obtained after spectra deconvolution were compared using supervised chemometric methods like Linear Discriminant Analysis (LDA). For this purpose, a set of 20 white Romanian wines from different viticultural Romanian regions four varieties, was considered. Chemometric methods applied directly to row SERS experimental spectra proved their efficiency, but discrimination markers identification found to be very difficult due to the overlapped signals as well as for the band shifts. By using this approach, a better general view related to the differences that appear among the wines in terms of compositional differentiation could be reached.

Keywords: chemometry, SERS, variety, wines discrimination

Procedia PDF Downloads 134
631 Modification of the Athena Vortex Lattice Code for the Multivariate Design Synthesis Optimisation of the Blended Wing Body Aircraft

Authors: Paul Okonkwo, Howard Smith

Abstract:

This paper describes a methodology to integrate the Athena Vortex Lattice Aerodynamic Software for automated operation in a multivariate optimisation of the Blended Wing Body Aircraft. The Athena Vortex Lattice code developed at the Massachusetts Institute of Technology by Mark Drela allows for the aerodynamic analysis of aircraft using the vortex lattice method. Ordinarily, the Athena Vortex Lattice operation requires a text file containing the aircraft geometry to be loaded into the AVL solver in order to determine the aerodynamic forces and moments. However, automated operation will be required to enable integration into a multidisciplinary optimisation framework. Automated AVL operation within the JAVA design environment will nonetheless require a modification and recompilation of AVL source code into an executable file capable of running on windows and other platforms without the –X11 libraries. This paper describes the procedure for the integrating the FORTRAN written AVL software for automated operation within the multivariate design synthesis optimisation framework for the conceptual design of the BWB aircraft.

Keywords: aerodynamics, automation, optimisation, AVL

Procedia PDF Downloads 629
630 Three-Stage Multivariate Stratified Sample Surveys with Probabilistic Cost Constraint and Random Variance

Authors: Sanam Haseen, Abdul Bari

Abstract:

In this paper a three stage multivariate programming problem with random survey cost and variances as random variables has been formulated as a non-linear stochastic programming problem. The problem has been converted into an equivalent deterministic form using chance constraint programming and modified E-modeling. An empirical study of the problem has been done at the end of the paper using R-simulation.

Keywords: chance constraint programming, modified E-model, stochastic programming, stratified sample surveys, three stage sample surveys

Procedia PDF Downloads 432
629 Some Generalized Multivariate Estimators for Population Mean under Multi Phase Stratified Systematic Sampling

Authors: Muqaddas Javed, Muhammad Hanif

Abstract:

The generalized multivariate ratio and regression type estimators for population mean are suggested under multi-phase stratified systematic sampling (MPSSS) using multi auxiliary information. Estimators are developed under the two different situations of availability of auxiliary information. The expressions of bias and mean square error (MSE) are developed. Special cases of suggested estimators are also discussed and simulation study is conducted to observe the performance of estimators.

Keywords: generalized estimators, multi-phase sampling, stratified random sampling, systematic sampling

Procedia PDF Downloads 705
628 STTS-EAD: Improving Spatio-Temporal Learning Based Time Series Prediction via Embedded Anomaly Detection

Authors: Tianhao Zhang, Cen Chen, Dawei Cheng, Yuqi Liang, Yuanyuan Liang

Abstract:

Dealing with anomalies is a crucial preprocessing step for multivariate time series prediction. However, existing methods that separate anomaly preprocessing and model training into two stages have certain limitations. Specifically, these methods fail to leverage auxiliary information necessary to distinguish latent anomalies related to spatiotemporal factors during the preprocessing stage. Instead, they solely rely on data distribution for detection which may lead to incorrect processing of many samples that are beneficial for training. To address this, we propose STTS-EAD, an end-to-end method that seamlessly integrates anomaly detection into the training process of multivariate time series forecasting and aims to improve Spatio-Temporal learning based Time Series prediction via Embedded Anomaly Detection. Our proposed STTS-EAD leverages spatio-temporal information for forecasting and anomaly detection, with the two parts alternately executed and optimized for each other. To the best of our knowledge, STTS-EAD is the first to integrate anomaly detection and forecasting tasks in the training phase for improving the accuracy of multivariate time series forecasting. Extensive experiments on a public stock dataset and two real-world sales datasets from a renowned coffee chain enterprise show that our proposed method can effectively process detected anomalies in the training stage to improve forecasting performance in the inference stage and significantly outperform baselines.

Keywords: multivariate time series, anomaly detection, time series forecasting, spatiotemporal feature learning

Procedia PDF Downloads 14
627 Irrigation Water Quality Evaluation Based on Multivariate Statistical Analysis: A Case Study of Jiaokou Irrigation District

Authors: Panpan Xu, Qiying Zhang, Hui Qian

Abstract:

Groundwater is main source of water supply in the Guanzhong Basin, China. To investigate the quality of groundwater for agricultural purposes in Jiaokou Irrigation District located in the east of the Guanzhong Basin, 141 groundwater samples were collected for analysis of major ions (K+, Na+, Mg2+, Ca2+, SO42-, Cl-, HCO3-, and CO32-), pH, and total dissolved solids (TDS). Sodium percentage (Na%), residual sodium carbonate (RSC), magnesium hazard (MH), and potential salinity (PS) were applied for irrigation water quality assessment. In addition, multivariate statistical techniques were used to identify the underlying hydrogeochemical processes. Results show that the content of TDS mainly depends on Cl-, Na+, Mg2+, and SO42-, and the HCO3- content is generally high except for the eastern sand area. These are responsible for complex hydrogeochemical processes, such as dissolution of carbonate minerals (dolomite and calcite), gypsum, halite, and silicate minerals, the cation exchange, as well as evaporation and concentration. The average evaluation levels of Na%, RSC, MH, and PS for irrigation water quality are doubtful, good, unsuitable, and injurious to unsatisfactory, respectively. Therefore, it is necessary for decision makers to comprehensively consider the indicators and thus reasonably evaluate the irrigation water quality.

Keywords: irrigation water quality, multivariate statistical analysis, groundwater, hydrogeochemical process

Procedia PDF Downloads 120
626 Ranking Effective Factors on Strategic Planning to Achieve Organization Objectives in Fuzzy Multivariate Decision-Making Technique

Authors: Elahe Memari, Ahmad Aslizadeh, Ahmad Memari

Abstract:

Today strategic planning is counted as the most important duties of senior directors in each organization. Strategic planning allows the organizations to implement compiled strategies and reach higher competitive benefits than their competitors. The present research work tries to prepare and rank the strategies form effective factors on strategic planning in fulfillment of the State Road Management and Transportation Organization in order to indicate the role of organizational factors in efficiency of the process to organization managers. Connection between six main factors in fulfillment of State Road Management and Transportation Organization were studied here, including Improvement of Strategic Thinking in senior managers, improvement of the organization business process, rationalization of resources allocation in different parts of the organization, coordination and conformity of strategic plan with organization needs, adjustment of organization activities with environmental changes, reinforcement of organizational culture. All said factors approved by implemented tests and then ranked using fuzzy multivariate decision-making technique.

Keywords: Fuzzy TOPSIS, improvement of organization business process, multivariate decision-making, strategic planning

Procedia PDF Downloads 382
625 Applying Multivariate and Univariate Analysis of Variance on Socioeconomic, Health, and Security Variables in Jordan

Authors: Faisal G. Khamis, Ghaleb A. El-Refae

Abstract:

Many researchers have studied socioeconomic, health, and security variables in the developed countries; however, very few studies used multivariate analysis in developing countries. The current study contributes to the scarce literature about the determinants of the variance in socioeconomic, health, and security factors. Questions raised were whether the independent variables (IVs) of governorate and year impact the socioeconomic, health, and security dependent variables (DVs) in Jordan, whether the marginal mean of each DV in each governorate and in each year is significant, which governorates are similar in difference means of each DV, and whether these DVs vary. The main objectives were to determine the source of variances in DVs, collectively and separately, testing which governorates are similar and which diverge for each DV. The research design was time series and cross-sectional analysis. The main hypotheses are that IVs affect DVs collectively and separately. Multivariate and univariate analyses of variance were carried out to test these hypotheses. The population of 12 governorates in Jordan and the available data of 15 years (2000–2015) accrued from several Jordanian statistical yearbooks. We investigated the effect of two factors of governorate and year on the four DVs of divorce rate, mortality rate, unemployment percentage, and crime rate. All DVs were transformed to multivariate normal distribution. We calculated descriptive statistics for each DV. Based on the multivariate analysis of variance, we found a significant effect in IVs on DVs with p < .001. Based on the univariate analysis, we found a significant effect of IVs on each DV with p < .001, except the effect of the year factor on unemployment was not significant with p = .642. The grand and marginal means of each DV in each governorate and each year were significant based on a 95% confidence interval. Most governorates are not similar in DVs with p < .001. We concluded that the two factors produce significant effects on DVs, collectively and separately. Based on these findings, the government can distribute its financial and physical resources to governorates more efficiently. By identifying the sources of variance that contribute to the variation in DVs, insights can help inform focused variation prevention efforts.

Keywords: ANOVA, crime, divorce, governorate, hypothesis test, Jordan, MANOVA, means, mortality, unemployment, year

Procedia PDF Downloads 246
624 Landslide Susceptibility Mapping: A Comparison between Logistic Regression and Multivariate Adaptive Regression Spline Models in the Municipality of Oudka, Northern of Morocco

Authors: S. Benchelha, H. C. Aoudjehane, M. Hakdaoui, R. El Hamdouni, H. Mansouri, T. Benchelha, M. Layelmam, M. Alaoui

Abstract:

The logistic regression (LR) and multivariate adaptive regression spline (MarSpline) are applied and verified for analysis of landslide susceptibility map in Oudka, Morocco, using geographical information system. From spatial database containing data such as landslide mapping, topography, soil, hydrology and lithology, the eight factors related to landslides such as elevation, slope, aspect, distance to streams, distance to road, distance to faults, lithology map and Normalized Difference Vegetation Index (NDVI) were calculated or extracted. Using these factors, landslide susceptibility indexes were calculated by the two mentioned methods. Before the calculation, this database was divided into two parts, the first for the formation of the model and the second for the validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. The result of this verification was that the MarSpline model is the best model with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than the LR model (success rate AUC = 0.918, rate prediction AUC = 0.901).

Keywords: landslide susceptibility mapping, regression logistic, multivariate adaptive regression spline, Oudka, Taounate

Procedia PDF Downloads 163
623 Use of Multivariate Statistical Techniques for Water Quality Monitoring Network Assessment, Case of Study: Jequetepeque River Basin

Authors: Jose Flores, Nadia Gamboa

Abstract:

A proper water quality management requires the establishment of a monitoring network. Therefore, evaluation of the efficiency of water quality monitoring networks is needed to ensure high-quality data collection of critical quality chemical parameters. Unfortunately, in some Latin American countries water quality monitoring programs are not sustainable in terms of recording historical data or environmentally representative sites wasting time, money and valuable information. In this study, multivariate statistical techniques, such as principal components analysis (PCA) and hierarchical cluster analysis (HCA), are applied for identifying the most significant monitoring sites as well as critical water quality parameters in the monitoring network of the Jequetepeque River basin, in northern Peru. The Jequetepeque River basin, like others in Peru, shows socio-environmental conflicts due to economical activities developed in this area. Water pollution by trace elements in the upper part of the basin is mainly related with mining activity, and agricultural land lost due to salinization is caused by the extensive use of groundwater in the lower part of the basin. Since the 1980s, the water quality in the basin has been non-continuously assessed by public and private organizations, and recently the National Water Authority had established permanent water quality networks in 45 basins in Peru. Despite many countries use multivariate statistical techniques for assessing water quality monitoring networks, those instruments have never been applied for that purpose in Peru. For this reason, the main contribution of this study is to demonstrate that application of the multivariate statistical techniques could serve as an instrument that allows the optimization of monitoring networks using least number of monitoring sites as well as the most significant water quality parameters, which would reduce costs concerns and improve the water quality management in Peru. Main socio-economical activities developed and the principal stakeholders related to the water management in the basin are also identified. Finally, water quality management programs will also be discussed in terms of their efficiency and sustainability.

Keywords: PCA, HCA, Jequetepeque, multivariate statistical

Procedia PDF Downloads 332
622 Multivariate Output-Associative RVM for Multi-Dimensional Affect Predictions

Authors: Achut Manandhar, Kenneth D. Morton, Peter A. Torrione, Leslie M. Collins

Abstract:

The current trends in affect recognition research are to consider continuous observations from spontaneous natural interactions in people using multiple feature modalities, and to represent affect in terms of continuous dimensions, incorporate spatio-temporal correlation among affect dimensions, and provide fast affect predictions. These research efforts have been propelled by a growing effort to develop affect recognition system that can be implemented to enable seamless real-time human-computer interaction in a wide variety of applications. Motivated by these desired attributes of an affect recognition system, in this work a multi-dimensional affect prediction approach is proposed by integrating multivariate Relevance Vector Machine (MVRVM) with a recently developed Output-associative Relevance Vector Machine (OARVM) approach. The resulting approach can provide fast continuous affect predictions by jointly modeling the multiple affect dimensions and their correlations. Experiments on the RECOLA database show that the proposed approach performs competitively with the OARVM while providing faster predictions during testing.

Keywords: dimensional affect prediction, output-associative RVM, multivariate regression, fast testing

Procedia PDF Downloads 262
621 Analogy in Microclimatic Parameters, Chemometric and Phytonutrient Profiles of Cultivated and Wild Ecotypes of Origanum vulgare L., across Kashmir Himalaya

Authors: Sumira Jan, Javid Iqbal Mir, Desh Beer Singh, Anil Sharma, Shafia Zaffar Faktoo

Abstract:

Background and Aims: Climatic and edaphic factors immensely influence crop quality and proper development. Regardless of economic potential, Himalayan Oregano has not subjected to phytonutrient and chemometric evaluation and its relationship with environmental conditions are scarce. The central objective of this research was to investigate microclimatic variation among wild and cultivated populations located in a microclimatic gradient in north-western Himalaya, Kashmir and analyse if such disparity was related with diverse climatic and edaphic conditions. Methods: Micrometeorological, Atomic absorption spectroscopy for micro elemental analysis was carried for soil. HPLC was carried out to estimate variation in phytonutrients and phytochemicals. Results: Geographic variation in phytonutrient was observed among cultivated and wild populations and among populations diverse within regions. Cultivated populations exhibited comparatively lesser phytonutrient value than wild populations. Moreover, our results observed higher vegetative growth of O. vulgare L. with higher pH (6-7), elevated organic carbon (2.42%), high nitrogen (97.41Kg/ha) and manganese (10-12ppm) and zinc contents (0.39-0.50) produce higher phytonutrients. HPLC data of phytonutrients like quercetin, betacarotene, ascorbic acid, arbutin and catechin revealed direct relationship with UV-B flux (r2=0.82), potassium (r2=0.97) displaying parallel relationship with phytonutrient value. Conclusions: Catechin was found as predominant phytonutrient among all populations with maximum accumulation of 163.8 ppm while as quercetin exhibited lesser value. Maximum arbutin (53.42ppm) and quercetin (2.87ppm) accumulated in plants thriving under intense and high UV-B flux. Minimum variation was demonstrated by beta carotene and ascorbic acid.

Keywords: phytonutrient, ascorbic acid, beta carotene, quercetin, catechin

Procedia PDF Downloads 243
620 Selection of Designs in Ordinal Regression Models under Linear Predictor Misspecification

Authors: Ishapathik Das

Abstract:

The purpose of this article is to find a method of comparing designs for ordinal regression models using quantile dispersion graphs in the presence of linear predictor misspecification. The true relationship between response variable and the corresponding control variables are usually unknown. Experimenter assumes certain form of the linear predictor of the ordinal regression models. The assumed form of the linear predictor may not be correct always. Thus, the maximum likelihood estimates (MLE) of the unknown parameters of the model may be biased due to misspecification of the linear predictor. In this article, the uncertainty in the linear predictor is represented by an unknown function. An algorithm is provided to estimate the unknown function at the design points where observations are available. The unknown function is estimated at all points in the design region using multivariate parametric kriging. The comparison of the designs are based on a scalar valued function of the mean squared error of prediction (MSEP) matrix, which incorporates both variance and bias of the prediction caused by the misspecification in the linear predictor. The designs are compared using quantile dispersion graphs approach. The graphs also visually depict the robustness of the designs on the changes in the parameter values. Numerical examples are presented to illustrate the proposed methodology.

Keywords: model misspecification, multivariate kriging, multivariate logistic link, ordinal response models, quantile dispersion graphs

Procedia PDF Downloads 360
619 Prediction of Marine Ecosystem Changes Based on the Integrated Analysis of Multivariate Data Sets

Authors: Prozorkevitch D., Mishurov A., Sokolov K., Karsakov L., Pestrikova L.

Abstract:

The current body of knowledge about the marine environment and the dynamics of marine ecosystems includes a huge amount of heterogeneous data collected over decades. It generally includes a wide range of hydrological, biological and fishery data. Marine researchers collect these data and analyze how and why the ecosystem changes from past to present. Based on these historical records and linkages between the processes it is possible to predict future changes. Multivariate analysis of trends and their interconnection in the marine ecosystem may be used as an instrument for predicting further ecosystem evolution. A wide range of information about the components of the marine ecosystem for more than 50 years needs to be used to investigate how these arrays can help to predict the future.

Keywords: barents sea ecosystem, abiotic, biotic, data sets, trends, prediction

Procedia PDF Downloads 87
618 Multivariate Analytical Insights into Spatial and Temporal Variation in Water Quality of a Major Drinking Water Reservoir

Authors: Azadeh Golshan, Craig Evans, Phillip Geary, Abigail Morrow, Zoe Rogers, Marcel Maeder

Abstract:

22 physicochemical variables have been determined in water samples collected weekly from January to December in 2013 from three sampling stations located within a major drinking water reservoir. Classical Multivariate Curve Resolution Alternating Least Squares (MCR-ALS) analysis was used to investigate the environmental factors associated with the physico-chemical variability of the water samples at each of the sampling stations. Matrix augmentation MCR-ALS (MA-MCR-ALS) was also applied, and the two sets of results were compared for interpretative clarity. Links between these factors, reservoir inflows and catchment land-uses were investigated and interpreted in relation to chemical composition of the water and their resolved geographical distribution profiles. The results suggested that the major factors affecting reservoir water quality were those associated with agricultural runoff, with evidence of influence on algal photosynthesis within the water column. Water quality variability within the reservoir was also found to be strongly linked to physical parameters such as water temperature and the occurrence of thermal stratification. The two methods applied (MCR-ALS and MA-MCR-ALS) led to similar conclusions; however, MA-MCR-ALS appeared to provide results more amenable to interpretation of temporal and geological variation than those obtained through classical MCR-ALS.

Keywords: drinking water reservoir, multivariate analysis, physico-chemical parameters, water quality

Procedia PDF Downloads 260
617 A Multivariate Statistical Approach for Water Quality Assessment of River Hindon, India

Authors: Nida Rizvi, Deeksha Katyal, Varun Joshi

Abstract:

River Hindon is an important river catering the demand of highly populated rural and industrial cluster of western Uttar Pradesh, India. Water quality of river Hindon is deteriorating at an alarming rate due to various industrial, municipal and agricultural activities. The present study aimed at identifying the pollution sources and quantifying the degree to which these sources are responsible for the deteriorating water quality of the river. Various water quality parameters, like pH, temperature, electrical conductivity, total dissolved solids, total hardness, calcium, chloride, nitrate, sulphate, biological oxygen demand, chemical oxygen demand and total alkalinity were assessed. Water quality data obtained from eight study sites for one year has been subjected to the two multivariate techniques, namely, principal component analysis and cluster analysis. Principal component analysis was applied with the aim to find out spatial variability and to identify the sources responsible for the water quality of the river. Three Varifactors were obtained after varimax rotation of initial principal components using principal component analysis. Cluster analysis was carried out to classify sampling stations of certain similarity, which grouped eight different sites into two clusters. The study reveals that the anthropogenic influence (municipal, industrial, waste water and agricultural runoff) was the major source of river water pollution. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and elucidation of multifaceted data sets, recognition of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

Keywords: cluster analysis, multivariate statistical techniques, river Hindon, water quality

Procedia PDF Downloads 433
616 EWMA and MEWMA Control Charts for Monitoring Mean and Variance in Industrial Processes

Authors: L. A. Toro, N. Prieto, J. J. Vargas

Abstract:

There are many control charts for monitoring mean and variance. Among these, the X y R, X y S, S2 Hotteling and Shewhart control charts, for mentioning some, are widely used for monitoring mean a variance in industrial processes. In particular, the Shewhart charts are based on the information about the process contained in the current observation only and ignore any information given by the entire sequence of points. Moreover, that the Shewhart chart is a control chart without memory. Consequently, Shewhart control charts are found to be less sensitive in detecting smaller shifts, particularly smaller than 1.5 times of the standard deviation. These kind of small shifts are important in many industrial applications. In this study and effective alternative to Shewhart control chart was implemented. In case of univariate process an Exponentially Moving Average (EWMA) control chart was developed and Multivariate Exponentially Moving Average (MEWMA) control chart in case of multivariate process. Both of these charts were based on memory and perform better that Shewhart chart while detecting smaller shifts. In these charts, information the past sample is cumulated up the current sample and then the decision about the process control is taken. The mentioned characteristic of EWMA and MEWMA charts, are of the paramount importance when it is necessary to control industrial process, because it is possible to correct or predict problems in the processes before they come to a dangerous limit.

Keywords: control charts, multivariate exponentially moving average (MEWMA), exponentially moving average (EWMA), industrial control process

Procedia PDF Downloads 329
615 A Posteriori Trading-Inspired Model-Free Time Series Segmentation

Authors: Plessen Mogens Graf

Abstract:

Within the context of multivariate time series segmentation, this paper proposes a method inspired by a posteriori optimal trading. After a normalization step, time series are treated channelwise as surrogate stock prices that can be traded optimally a posteriori in a virtual portfolio holding either stock or cash. Linear transaction costs are interpreted as hyperparameters for noise filtering. Trading signals, as well as trading signals obtained on the reversed time series, are used for unsupervised channelwise labeling before a consensus over all channels is reached that determines the final segmentation time instants. The method is model-free such that no model prescriptions for segments are made. Benefits of proposed approach include simplicity, computational efficiency, and adaptability to a wide range of different shapes of time series. Performance is demonstrated on synthetic and real-world data, including a large-scale dataset comprising a multivariate time series of dimension 1000 and length 2709. Proposed method is compared to a popular model-based bottom-up approach fitting piecewise affine models and to a recent model-based top-down approach fitting Gaussian models and found to be consistently faster while producing more intuitive results in the sense of segmenting time series at peaks and valleys.

Keywords: time series segmentation, model-free, trading-inspired, multivariate data

Procedia PDF Downloads 109
614 Multivariate Rainfall Disaggregation Using MuDRain Model: Malaysia Experience

Authors: Ibrahim Suliman Hanaish

Abstract:

Disaggregation daily rainfall using stochastic models formulated based on multivariate approach (MuDRain) is discussed in this paper. Seven rain gauge stations are considered in this study for different distances from the referred station starting from 4 km to 160 km in Peninsular Malaysia. The hourly rainfall data used are covered the period from 1973 to 2008 and July and November months are considered as an example of dry and wet periods. The cross-correlation among the rain gauges is considered for the available hourly rainfall information at the neighboring stations or not. This paper discussed the applicability of the MuDRain model for disaggregation daily rainfall to hourly rainfall for both sources of cross-correlation. The goodness of fit of the model was based on the reproduction of fitting statistics like the means, variances, coefficients of skewness, lag zero cross-correlation of coefficients and the lag one auto correlation of coefficients. It is found the correlation coefficients based on extracted correlations that was based on daily are slightly higher than correlations based on available hourly rainfall especially for neighboring stations not more than 28 km. The results showed also the MuDRain model did not reproduce statistics very well. In addition, a bad reproduction of the actual hyetographs comparing to the synthetic hourly rainfall data. Mean while, it is showed a good fit between the distribution function of the historical and synthetic hourly rainfall. These discrepancies are unavoidable because of the lowest cross correlation of hourly rainfall. The overall performance indicated that the MuDRain model would not be appropriate choice for disaggregation daily rainfall.

Keywords: rainfall disaggregation, multivariate disaggregation rainfall model, correlation, stochastic model

Procedia PDF Downloads 479
613 Dissimilarity-Based Coloring for Symbolic and Multivariate Data Visualization

Authors: K. Umbleja, M. Ichino, H. Yaguchi

Abstract:

In this paper, we propose a coloring method for multivariate data visualization by using parallel coordinates based on dissimilarity and tree structure information gathered during hierarchical clustering. The proposed method is an extension for proximity-based coloring that suffers from a few undesired side effects if hierarchical tree structure is not balanced tree. We describe the algorithm by assigning colors based on dissimilarity information, show the application of proposed method on three commonly used datasets, and compare the results with proximity-based coloring. We found our proposed method to be especially beneficial for symbolic data visualization where many individual objects have already been aggregated into a single symbolic object.

Keywords: data visualization, dissimilarity-based coloring, proximity-based coloring, symbolic data

Procedia PDF Downloads 140
612 Predicting Returns Volatilities and Correlations of Stock Indices Using Multivariate Conditional Autoregressive Range and Return Models

Authors: Shay Kee Tan, Kok Haur Ng, Jennifer So-Kuen Chan

Abstract:

This paper extends the conditional autoregressive range (CARR) model to multivariate CARR (MCARR) model and further to the two-stage MCARR-return model to model and forecast volatilities, correlations and returns of multiple financial assets. The first stage model fits the scaled realised Parkinson volatility measures using individual series and their pairwise sums of indices to the MCARR model to obtain in-sample estimates and forecasts of volatilities for these individual and pairwise sum series. Then covariances are calculated to construct the fitted variance-covariance matrix of returns which are imputed into the stage-two return model to capture the heteroskedasticity of assets’ returns. We investigate different choices of mean functions to describe the volatility dynamics. Empirical applications are based on the Standard and Poor 500, Dow Jones Industrial Average and Dow Jones United States Financial Service Indices. Results show that the stage-one MCARR models using asymmetric mean functions give better in-sample model fits than those based on symmetric mean functions. They also provide better out-of-sample volatility forecasts than those using CARR models based on two robust loss functions with the scaled realised open-to-close volatility measure as the proxy for the unobserved true volatility. We also find that the stage-two return models with constant means and multivariate Student-t errors give better in-sample fits than the Baba, Engle, Kraft, and Kroner type of generalized autoregressive conditional heteroskedasticity (BEKK-GARCH) models. The estimates and forecasts of value-at-risk (VaR) and conditional VaR based on the best MCARR-return models for each asset are provided and tested using Kupiec test to confirm the accuracy of the VaR forecasts.

Keywords: range-based volatility, correlation, multivariate CARR-return model, value-at-risk, conditional value-at-risk

Procedia PDF Downloads 75