Search results for: multivariate statistical techniques
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 10159

Search results for: multivariate statistical techniques

10159 Multivariate Analysis of Spectroscopic Data for Agriculture Applications

Authors: Asmaa M. Hussein, Amr Wassal, Ahmed Farouk Al-Sadek, A. F. Abd El-Rahman

Abstract:

In this study, a multivariate analysis of potato spectroscopic data was presented to detect the presence of brown rot disease or not. Near-Infrared (NIR) spectroscopy (1,350-2,500 nm) combined with multivariate analysis was used as a rapid, non-destructive technique for the detection of brown rot disease in potatoes. Spectral measurements were performed in 565 samples, which were chosen randomly at the infection place in the potato slice. In this study, 254 infected and 311 uninfected (brown rot-free) samples were analyzed using different advanced statistical analysis techniques. The discrimination performance of different multivariate analysis techniques, including classification, pre-processing, and dimension reduction, were compared. Applying a random forest algorithm classifier with different pre-processing techniques to raw spectra had the best performance as the total classification accuracy of 98.7% was achieved in discriminating infected potatoes from control.

Keywords: Brown rot disease, NIR spectroscopy, potato, random forest

Procedia PDF Downloads 147
10158 Use of Multivariate Statistical Techniques for Water Quality Monitoring Network Assessment, Case of Study: Jequetepeque River Basin

Authors: Jose Flores, Nadia Gamboa

Abstract:

A proper water quality management requires the establishment of a monitoring network. Therefore, evaluation of the efficiency of water quality monitoring networks is needed to ensure high-quality data collection of critical quality chemical parameters. Unfortunately, in some Latin American countries water quality monitoring programs are not sustainable in terms of recording historical data or environmentally representative sites wasting time, money and valuable information. In this study, multivariate statistical techniques, such as principal components analysis (PCA) and hierarchical cluster analysis (HCA), are applied for identifying the most significant monitoring sites as well as critical water quality parameters in the monitoring network of the Jequetepeque River basin, in northern Peru. The Jequetepeque River basin, like others in Peru, shows socio-environmental conflicts due to economical activities developed in this area. Water pollution by trace elements in the upper part of the basin is mainly related with mining activity, and agricultural land lost due to salinization is caused by the extensive use of groundwater in the lower part of the basin. Since the 1980s, the water quality in the basin has been non-continuously assessed by public and private organizations, and recently the National Water Authority had established permanent water quality networks in 45 basins in Peru. Despite many countries use multivariate statistical techniques for assessing water quality monitoring networks, those instruments have never been applied for that purpose in Peru. For this reason, the main contribution of this study is to demonstrate that application of the multivariate statistical techniques could serve as an instrument that allows the optimization of monitoring networks using least number of monitoring sites as well as the most significant water quality parameters, which would reduce costs concerns and improve the water quality management in Peru. Main socio-economical activities developed and the principal stakeholders related to the water management in the basin are also identified. Finally, water quality management programs will also be discussed in terms of their efficiency and sustainability.

Keywords: PCA, HCA, Jequetepeque, multivariate statistical

Procedia PDF Downloads 316
10157 A Multivariate Statistical Approach for Water Quality Assessment of River Hindon, India

Authors: Nida Rizvi, Deeksha Katyal, Varun Joshi

Abstract:

River Hindon is an important river catering the demand of highly populated rural and industrial cluster of western Uttar Pradesh, India. Water quality of river Hindon is deteriorating at an alarming rate due to various industrial, municipal and agricultural activities. The present study aimed at identifying the pollution sources and quantifying the degree to which these sources are responsible for the deteriorating water quality of the river. Various water quality parameters, like pH, temperature, electrical conductivity, total dissolved solids, total hardness, calcium, chloride, nitrate, sulphate, biological oxygen demand, chemical oxygen demand and total alkalinity were assessed. Water quality data obtained from eight study sites for one year has been subjected to the two multivariate techniques, namely, principal component analysis and cluster analysis. Principal component analysis was applied with the aim to find out spatial variability and to identify the sources responsible for the water quality of the river. Three Varifactors were obtained after varimax rotation of initial principal components using principal component analysis. Cluster analysis was carried out to classify sampling stations of certain similarity, which grouped eight different sites into two clusters. The study reveals that the anthropogenic influence (municipal, industrial, waste water and agricultural runoff) was the major source of river water pollution. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and elucidation of multifaceted data sets, recognition of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

Keywords: cluster analysis, multivariate statistical techniques, river Hindon, water quality

Procedia PDF Downloads 420
10156 Multivariate Statistical Process Monitoring of Base Metal Flotation Plant Using Dissimilarity Scale-Based Singular Spectrum Analysis

Authors: Syamala Krishnannair

Abstract:

A multivariate statistical process monitoring methodology using dissimilarity scale-based singular spectrum analysis (SSA) is proposed for the detection and diagnosis of process faults in the base metal flotation plant. Process faults are detected based on the multi-level decomposition of process signals by SSA using the dissimilarity structure of the process data and the subsequent monitoring of the multiscale signals using the unified monitoring index which combines T² with SPE. Contribution plots are used to identify the root causes of the process faults. The overall results indicated that the proposed technique outperformed the conventional multivariate techniques in the detection and diagnosis of the process faults in the flotation plant.

Keywords: fault detection, fault diagnosis, process monitoring, dissimilarity scale

Procedia PDF Downloads 165
10155 Regression for Doubly Inflated Multivariate Poisson Distributions

Authors: Ishapathik Das, Sumen Sen, N. Rao Chaganty, Pooja Sengupta

Abstract:

Dependent multivariate count data occur in several research studies. These data can be modeled by a multivariate Poisson or Negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the number of observations in some cells are much larger than other cells, then the copula based multivariate Poisson (or Negative binomial) distribution may not fit well and it is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher compared to the other cells, and develop a doubly inflated multivariate Poisson distribution function using multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. For illustrating the proposed methodologies, we present a real data containing bivariate count observations with inflations in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation to estimate unknown parameters of the models.

Keywords: copula, Gaussian copula, multivariate distributions, inflated distributios

Procedia PDF Downloads 119
10154 Irrigation Water Quality Evaluation Based on Multivariate Statistical Analysis: A Case Study of Jiaokou Irrigation District

Authors: Panpan Xu, Qiying Zhang, Hui Qian

Abstract:

Groundwater is main source of water supply in the Guanzhong Basin, China. To investigate the quality of groundwater for agricultural purposes in Jiaokou Irrigation District located in the east of the Guanzhong Basin, 141 groundwater samples were collected for analysis of major ions (K+, Na+, Mg2+, Ca2+, SO42-, Cl-, HCO3-, and CO32-), pH, and total dissolved solids (TDS). Sodium percentage (Na%), residual sodium carbonate (RSC), magnesium hazard (MH), and potential salinity (PS) were applied for irrigation water quality assessment. In addition, multivariate statistical techniques were used to identify the underlying hydrogeochemical processes. Results show that the content of TDS mainly depends on Cl-, Na+, Mg2+, and SO42-, and the HCO3- content is generally high except for the eastern sand area. These are responsible for complex hydrogeochemical processes, such as dissolution of carbonate minerals (dolomite and calcite), gypsum, halite, and silicate minerals, the cation exchange, as well as evaporation and concentration. The average evaluation levels of Na%, RSC, MH, and PS for irrigation water quality are doubtful, good, unsuitable, and injurious to unsatisfactory, respectively. Therefore, it is necessary for decision makers to comprehensively consider the indicators and thus reasonably evaluate the irrigation water quality.

Keywords: irrigation water quality, multivariate statistical analysis, groundwater, hydrogeochemical process

Procedia PDF Downloads 105
10153 Introduction of Robust Multivariate Process Capability Indices

Authors: Behrooz Khalilloo, Hamid Shahriari, Emad Roghanian

Abstract:

Process capability indices (PCIs) are important concepts of statistical quality control and measure the capability of processes and how much processes are meeting certain specifications. An important issue in statistical quality control is parameter estimation. Under the assumption of multivariate normality, the distribution parameters, mean vector and variance-covariance matrix must be estimated, when they are unknown. Classic estimation methods like method of moment estimation (MME) or maximum likelihood estimation (MLE) makes good estimation of the population parameters when data are not contaminated. But when outliers exist in the data, MME and MLE make weak estimators of the population parameters. So we need some estimators which have good estimation in the presence of outliers. In this work robust M-estimators for estimating these parameters are used and based on robust parameter estimators, robust process capability indices are introduced. The performances of these robust estimators in the presence of outliers and their effects on process capability indices are evaluated by real and simulated multivariate data. The results indicate that the proposed robust capability indices perform much better than the existing process capability indices.

Keywords: multivariate process capability indices, robust M-estimator, outlier, multivariate quality control, statistical quality control

Procedia PDF Downloads 240
10152 An AK-Chart for the Non-Normal Data

Authors: Chia-Hau Liu, Tai-Yue Wang

Abstract:

Traditional multivariate control charts assume that measurement from manufacturing processes follows a multivariate normal distribution. However, this assumption may not hold or may be difficult to verify because not all the measurement from manufacturing processes are normal distributed in practice. This study develops a new multivariate control chart for monitoring the processes with non-normal data. We propose a mechanism based on integrating the one-class classification method and the adaptive technique. The adaptive technique is used to improve the sensitivity to small shift on one-class classification in statistical process control. In addition, this design provides an easy way to allocate the value of type I error so it is easier to be implemented. Finally, the simulation study and the real data from industry are used to demonstrate the effectiveness of the propose control charts.

Keywords: multivariate control chart, statistical process control, one-class classification method, non-normal data

Procedia PDF Downloads 384
10151 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important to improve safety and reliability of manufacturing operations and reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is quite essential in making decisions on when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods are evaluated using real data. The results shows that the calibration model based on supervised probabilistic model yielded best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from their model building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 320
10150 Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques

Authors: Filiz Ersoz, Taner Ersoz, Tugrul Bayraktar

Abstract:

Renewable energy is referred to as "clean energy" and common popular support for the use of renewable energy (RE) is to provide electricity with zero carbon dioxide emissions. This study provides useful insight into the European Union (EU) RE, especially, into electricity generation obtained from renewables, and their targets. The objective of this study is to identify groups of European countries, using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The conducted statistical hierarchical cluster analysis is based on the Ward’s clustering method and squared Euclidean distances. Hierarchical cluster analysis identified eight distinct clusters of European countries. Then, non-hierarchical clustering (k-means) method was applied. Discriminant analysis was used to determine the validity of the results with data normalized by Z score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.

Keywords: share of electricity generation, k-means clustering, discriminant, CO2 emission

Procedia PDF Downloads 383
10149 Neutral Heavy Scalar Searches via Standard Model Gauge Boson Decays at the Large Hadron Electron Collider with Multivariate Techniques

Authors: Luigi Delle Rose, Oliver Fischer, Ahmed Hammad

Abstract:

In this article, we study the prospects of the proposed Large Hadron electron Collider (LHeC) in the search for heavy neutral scalar particles. We consider a minimal model with one additional complex scalar singlet that interacts with the Standard Model (SM) via mixing with the Higgs doublet, giving rise to an SM-like Higgs boson and a heavy scalar particle. Both scalar particles are produced via vector boson fusion and can be tested via their decays into pairs of SM particles, analogously to the SM Higgs boson. Using multivariate techniques, we show that the LHeC is sensitive to heavy scalars with masses between 200 and 800 GeV down to scalar mixing of order 0.01.

Keywords: beyond the standard model, large hadron electron collider, multivariate analysis, scalar singlet

Procedia PDF Downloads 94
10148 Quantum Statistical Machine Learning and Quantum Time Series

Authors: Omar Alzeley, Sergey Utev

Abstract:

Minimizing a constrained multivariate function is the fundamental of Machine learning, and these algorithms are at the core of data mining and data visualization techniques. The decision function that maps input points to output points is based on the result of optimization. This optimization is the central of learning theory. One approach to complex systems where the dynamics of the system is inferred by a statistical analysis of the fluctuations in time of some associated observable is time series analysis. The purpose of this paper is a mathematical transition from the autoregressive model of classical time series to the matrix formalization of quantum theory. Firstly, we have proposed a quantum time series model (QTS). Although Hamiltonian technique becomes an established tool to detect a deterministic chaos, other approaches emerge. The quantum probabilistic technique is used to motivate the construction of our QTS model. The QTS model resembles the quantum dynamic model which was applied to financial data. Secondly, various statistical methods, including machine learning algorithms such as the Kalman filter algorithm, are applied to estimate and analyses the unknown parameters of the model. Finally, simulation techniques such as Markov chain Monte Carlo have been used to support our investigations. The proposed model has been examined by using real and simulated data. We establish the relation between quantum statistical machine and quantum time series via random matrix theory. It is interesting to note that the primary focus of the application of QTS in the field of quantum chaos was to find a model that explain chaotic behaviour. Maybe this model will reveal another insight into quantum chaos.

Keywords: machine learning, simulation techniques, quantum probability, tensor product, time series

Procedia PDF Downloads 421
10147 Adaptive Process Monitoring for Time-Varying Situations Using Statistical Learning Algorithms

Authors: Seulki Lee, Seoung Bum Kim

Abstract:

Statistical process control (SPC) is a practical and effective method for quality control. The most important and widely used technique in SPC is a control chart. The main goal of a control chart is to detect any assignable changes that affect the quality output. Most conventional control charts, such as Hotelling’s T2 charts, are commonly based on the assumption that the quality characteristics follow a multivariate normal distribution. However, in modern complicated manufacturing systems, appropriate control chart techniques that can efficiently handle the nonnormal processes are required. To overcome the shortcomings of conventional control charts for nonnormal processes, several methods have been proposed to combine statistical learning algorithms and multivariate control charts. Statistical learning-based control charts, such as support vector data description (SVDD)-based charts, k-nearest neighbors-based charts, have proven their improved performance in nonnormal situations compared to that of the T2 chart. Beside the nonnormal property, time-varying operations are also quite common in real manufacturing fields because of various factors such as product and set-point changes, seasonal variations, catalyst degradation, and sensor drifting. However, traditional control charts cannot accommodate future condition changes of the process because they are formulated based on the data information recorded in the early stage of the process. In the present paper, we propose a SVDD algorithm-based control chart, which is capable of adaptively monitoring time-varying and nonnormal processes. We reformulated the SVDD algorithm into a time-adaptive SVDD algorithm by adding a weighting factor that reflects time-varying situations. Moreover, we defined the updating region for the efficient model-updating structure of the control chart. The proposed control chart simultaneously allows efficient model updates and timely detection of out-of-control signals. The effectiveness and applicability of the proposed chart were demonstrated through experiments with the simulated data and the real data from the metal frame process in mobile device manufacturing.

Keywords: multivariate control chart, nonparametric method, support vector data description, time-varying process

Procedia PDF Downloads 266
10146 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

As there can be cases where the use of real data is somehow limited, such as when it is hard to get access to a large volume of real data, we need to go for synthetic data generation. This produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data since the MTS is now being gathered more often in various real-world systems. Furthermore, the GAN-generated MTS data is fed into a transformer-based deep learning architecture that carries out the data categorization into predefined classes. Further, the model is evaluated across various distinct domains by generating corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 75
10145 Statistical Discrimination of Blue Ballpoint Pen Inks by Diamond Attenuated Total Reflectance (ATR) FTIR

Authors: Mohamed Izzharif Abdul Halim, Niamh Nic Daeid

Abstract:

Determining the source of pen inks used on a variety of documents is impartial for forensic document examiners. The examination of inks is often performed to differentiate between inks in order to evaluate the authenticity of a document. A ballpoint pen ink consists of synthetic dyes in (acidic and/or basic), pigments (organic and/or inorganic) and a range of additives. Inks of similar color may consist of different composition and are frequently the subjects of forensic examinations. This study emphasizes on blue ballpoint pen inks available in the market because it is reported that approximately 80% of questioned documents analysis involving ballpoint pen ink. Analytical techniques such as thin layer chromatography, high-performance liquid chromatography, UV-vis spectroscopy, luminescence spectroscopy and infrared spectroscopy have been used in the analysis of ink samples. In this study, application of Diamond Attenuated Total Reflectance (ATR) FTIR is straightforward but preferable in forensic science as it offers no sample preparation and minimal analysis time. The data obtained from these techniques were further analyzed using multivariate chemometric methods which enable extraction of more information based on the similarities and differences among samples in a dataset. It was indicated that some pens from the same manufactures can be similar in composition, however, discrete types can be significantly different.

Keywords: ATR FTIR, ballpoint, multivariate chemometric, PCA

Procedia PDF Downloads 420
10144 The Moment of the Optimal Average Length of the Multivariate Exponentially Weighted Moving Average Control Chart for Equally Correlated Variables

Authors: Edokpa Idemudia Waziri, Salisu S. Umar

Abstract:

The Hotellng’s T^2 is a well-known statistic for detecting a shift in the mean vector of a multivariate normal distribution. Control charts based on T have been widely used in statistical process control for monitoring a multivariate process. Although it is a powerful tool, the T statistic is deficient when the shift to be detected in the mean vector of a multivariate process is small and consistent. The Multivariate Exponentially Weighted Moving Average (MEWMA) control chart is one of the control statistics used to overcome the drawback of the Hotellng’s T statistic. In this paper, the probability distribution of the Average Run Length (ARL) of the MEWMA control chart when the quality characteristics exhibit substantial cross correlation and when the process is in-control and out-of-control was derived using the Markov Chain algorithm. The derivation of the probability functions and the moments of the run length distribution were also obtained and they were consistent with some existing results for the in-control and out-of-control situation. By simulation process, the procedure identified a class of ARL for the MEWMA control when the process is in-control and out-of-control. From our study, it was observed that the MEWMA scheme is quite adequate for detecting a small shift and a good way to improve the quality of goods and services in a multivariate situation. It was also observed that as the in-control average run length ARL0¬ or the number of variables (p) increases, the optimum value of the ARL0pt increases asymptotically and as the magnitude of the shift σ increases, the optimal ARLopt decreases. Finally, we use the example from the literature to illustrate our method and demonstrate its efficiency.

Keywords: average run length, markov chain, multivariate exponentially weighted moving average, optimal smoothing parameter

Procedia PDF Downloads 383
10143 The Use of Multivariate Statistical and GIS for Characterization Groundwater Quality in Laghouat Region, Algeria

Authors: Rouighi Mustapha, Bouzid Laghaa Souad, Rouighi Tahar

Abstract:

Due to rain Shortage and the increase of population in the last years, wells excavation and groundwater use for different purposes had been increased without any planning. This is a great challenge for our country. Moreover, this scarcity of water resources in this region is unfortunately combined with rapid fresh water resources quality deterioration, due to salinity and contamination processes. Therefore, it is necessary to conduct the studies about groundwater quality in Algeria. In this work consists in the identification of the factors which influence the water quality parameters in Laghouat region by using statistical analysis Principal Component Analysis (PCA), Hierarchical Cluster Analysis (HCA) and geographic information system (GIS) in an attempt to discriminate the sources of the variation of water quality variations. The results of PCA technique indicate that variables responsible for water quality composition are mainly related to soluble salts variables; natural processes and the nature of the rock which modifies significantly the water chemistry. Inferred from the positive correlation between K+ and NO3-, NO3- is believed to be human induced rather than naturally originated. In this study, the multivariate statistical analysis and GIS allows the hydrogeologist to have supplementary tools in the characterization and evaluating of aquifers.

Keywords: cluster, analysis, GIS, groundwater, laghouat, quality

Procedia PDF Downloads 281
10142 The Modality of Multivariate Skew Normal Mixture

Authors: Bader Alruwaili, Surajit Ray

Abstract:

Finite mixtures are a flexible and powerful tool that can be used for univariate and multivariate distributions, and a wide range of research analysis has been conducted based on the multivariate normal mixture and multivariate of a t-mixture. Determining the number of modes is an important activity that, in turn, allows one to determine the number of homogeneous groups in a population. Our work currently being carried out relates to the study of the modality of the skew normal distribution in the univariate and multivariate cases. For the skew normal distribution, the aims are associated with studying the modality of the skew normal distribution and providing the ridgeline, the ridgeline elevation function, the $\Pi$ function, and the curvature function, and this will be conducive to an exploration of the number and location of mode when mixing the two components of skew normal distribution. The subsequent objective is to apply these results to the application of real world data sets, such as flow cytometry data.

Keywords: mode, modality, multivariate skew normal, finite mixture, number of mode

Procedia PDF Downloads 452
10141 Space Telemetry Anomaly Detection Based On Statistical PCA Algorithm

Authors: Bassem Nassar, Wessam Hussein, Medhat Mokhtar

Abstract:

The crucial concern of satellite operations is to ensure the health and safety of satellites. The worst case in this perspective is probably the loss of a mission but the more common interruption of satellite functionality can result in compromised mission objectives. All the data acquiring from the spacecraft are known as Telemetry (TM), which contains the wealth information related to the health of all its subsystems. Each single item of information is contained in a telemetry parameter, which represents a time-variant property (i.e. a status or a measurement) to be checked. As a consequence, there is a continuous improvement of TM monitoring systems in order to reduce the time required to respond to changes in a satellite's state of health. A fast conception of the current state of the satellite is thus very important in order to respond to occurring failures. Statistical multivariate latent techniques are one of the vital learning tools that are used to tackle the aforementioned problem coherently. Information extraction from such rich data sources using advanced statistical methodologies is a challenging task due to the massive volume of data. To solve this problem, in this paper, we present a proposed unsupervised learning algorithm based on Principle Component Analysis (PCA) technique. The algorithm is particularly applied on an actual remote sensing spacecraft. Data from the Attitude Determination and Control System (ADCS) was acquired under two operation conditions: normal and faulty states. The models were built and tested under these conditions and the results shows that the algorithm could successfully differentiate between these operations conditions. Furthermore, the algorithm provides competent information in prediction as well as adding more insight and physical interpretation to the ADCS operation.

Keywords: space telemetry monitoring, multivariate analysis, PCA algorithm, space operations

Procedia PDF Downloads 379
10140 Discrimination Between Bacillus and Alicyclobacillus Isolates in Apple Juice by Fourier Transform Infrared Spectroscopy and Multivariate Analysis

Authors: Murada Alholy, Mengshi Lin, Omar Alhaj, Mahmoud Abugoush

Abstract:

Alicyclobacillus is a causative agent of spoilage in pasteurized and heat-treated apple juice products. Differentiating between this genus and the closely related Bacillus is crucially important. In this study, Fourier transform infrared spectroscopy (FT-IR) was used to identify and discriminate between four Alicyclobacillus strains and four Bacillus isolates inoculated individually into apple juice. Loading plots over the range of 1350 and 1700 cm-1 reflected the most distinctive biochemical features of Bacillus and Alicyclobacillus. Multivariate statistical methods (e.g. principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA)) were used to analyze the spectral data. Distinctive separation of spectral samples was observed. This study demonstrates that FT-IR spectroscopy in combination with multivariate analysis could serve as a rapid and effective tool for fruit juice industry to differentiate between Bacillus and Alicyclobacillus and to distinguish between species belonging to these two genera.

Keywords: alicyclobacillus, bacillus, FT-IR, spectroscopy, PCA

Procedia PDF Downloads 440
10139 Effects of Video Games and Online Chat on Mathematics Performance in High School: An Approach of Multivariate Data Analysis

Authors: Lina Wu, Wenyi Lu, Ye Li

Abstract:

Regarding heavy video game players for boys and super online chat lovers for girls as a symbolic phrase in the current adolescent culture, this project of data analysis verifies the displacement effect on deteriorating mathematics performance. To evaluate correlation or regression coefficients between a factor of playing video games or chatting online and mathematics performance compared with other factors, we use multivariate analysis technique and take gender difference into account. We find the most important reason for the negative sign of the displacement effect on mathematics performance due to students’ poor academic background. Statistical analysis methods in this project could be applied to study internet users’ academic performance from the high school education to the college education.

Keywords: correlation coefficients, displacement effect, multivariate analysis technique, regression coefficients

Procedia PDF Downloads 316
10138 Multivariate Genome-Wide Association Studies for Identifying Additional Loci for Myopia

Authors: Qiao Fan, Xiaobo Guo, Junxian Zhu, Xiaohu Ding, Ching-Yu Cheng, Tien-Yin Wong, Mingguang He, Heping Zhang, Xueqin Wang

Abstract:

A systematic, simultaneous analysis of multiple phenotypes in genome-wide association studies (GWASs) draws a great attention to integrate the signals from single phenotypes with increased power. However, lacking an interpretable and efficient multivariate GWAS analysis impede the application of such approach. In this study, we propose to decompose the multivariate model into a series of simple univariate models. This transformation illuminates what exactly the individual trait contributes to the significant signals from the multivariate analyses. By employing our approach in the analysis of three myopia-related endophenotypes from the Singapore Malay Eye Study (SIMES), we identify novel candidate loci which were successfully validated in an independent Guangzhou Twin Eye Study (GTES).

Keywords: GWAS multivariate, multiple traits, myopia, association

Procedia PDF Downloads 185
10137 Use of In-line Data Analytics and Empirical Model for Early Fault Detection

Authors: Hyun-Woo Cho

Abstract:

Automatic process monitoring schemes are designed to give early warnings for unusual process events or abnormalities as soon as possible. For this end, various techniques have been developed and utilized in various industrial processes. It includes multivariate statistical methods, representation skills in reduced spaces, kernel-based nonlinear techniques, etc. This work presents a nonlinear empirical monitoring scheme for batch type production processes with incomplete process measurement data. While normal operation data are easy to get, unusual fault data occurs infrequently and thus are difficult to collect. In this work, noise filtering steps are added in order to enhance monitoring performance by eliminating irrelevant information of the data. The performance of the monitoring scheme was demonstrated using batch process data. The results showed that the monitoring performance was improved significantly in terms of detection success rate of process fault.

Keywords: batch process, monitoring, measurement, kernel method

Procedia PDF Downloads 282
10136 A Comparative Study on Automatic Feature Classification Methods of Remote Sensing Images

Authors: Lee Jeong Min, Lee Mi Hee, Eo Yang Dam

Abstract:

Geospatial feature extraction is a very important issue in the remote sensing research. In the meantime, the image classification based on statistical techniques, but, in recent years, data mining and machine learning techniques for automated image processing technology is being applied to remote sensing it has focused on improved results generated possibility. In this study, artificial neural network and decision tree technique is applied to classify the high-resolution satellite images, as compared to the MLC processing result is a statistical technique and an analysis of the pros and cons between each of the techniques.

Keywords: remote sensing, artificial neural network, decision tree, maximum likelihood classification

Procedia PDF Downloads 313
10135 Transforming Data into Knowledge: Mathematical and Statistical Innovations in Data Analytics

Authors: Zahid Ullah, Atlas Khan

Abstract:

The rapid growth of data in various domains has created a pressing need for effective methods to transform this data into meaningful knowledge. In this era of big data, mathematical and statistical innovations play a crucial role in unlocking insights and facilitating informed decision-making in data analytics. This abstract aims to explore the transformative potential of these innovations and their impact on converting raw data into actionable knowledge. Drawing upon a comprehensive review of existing literature, this research investigates the cutting-edge mathematical and statistical techniques that enable the conversion of data into knowledge. By evaluating their underlying principles, strengths, and limitations, we aim to identify the most promising innovations in data analytics. To demonstrate the practical applications of these innovations, real-world datasets will be utilized through case studies or simulations. This empirical approach will showcase how mathematical and statistical innovations can extract patterns, trends, and insights from complex data, enabling evidence-based decision-making across diverse domains. Furthermore, a comparative analysis will be conducted to assess the performance, scalability, interpretability, and adaptability of different innovations. By benchmarking against established techniques, we aim to validate the effectiveness and superiority of the proposed mathematical and statistical innovations in data analytics. Ethical considerations surrounding data analytics, such as privacy, security, bias, and fairness, will be addressed throughout the research. Guidelines and best practices will be developed to ensure the responsible and ethical use of mathematical and statistical innovations in data analytics. The expected contributions of this research include advancements in mathematical and statistical sciences, improved data analysis techniques, enhanced decision-making processes, and practical implications for industries and policymakers. The outcomes will guide the adoption and implementation of mathematical and statistical innovations, empowering stakeholders to transform data into actionable knowledge and drive meaningful outcomes.

Keywords: data analytics, mathematical innovations, knowledge extraction, decision-making

Procedia PDF Downloads 35
10134 Sex Work Practice and Health Seeking Behavior among Hiv Positive Female Sex Workers in Rural Karnataka, India

Authors: Rajeshwari Biradar

Abstract:

Background: The anecdotal evidences indicate that utilization of HIV services especially in Government facilities is affected by stigma and discrimination among HIV positive female sex workers (FSWs) in Karnataka. To our knowledge, there is no quantitative study on this issue. In this study an attempt is made to examine these aspects among positive FSWs exposed to prevention programs. Methods: This is a cross‐ sectional quantitative survey of HIV positive FSWs in the 3 districts of northern Karnataka using a structured questionnaire. The list of HIV Positive FSWs was organized by stratification, and 607 positive FSWs were selected using a systematic random selection. The data were analyzed using both bivariate and multivariate statistical techniques. Results: Half of the sex workers (52%) are traditional (devadasi, dedicated to the temple), 22% are widowed and the mean age is 33 years. The FSWs practice sex work on an average 13 days a month with 2.3 clients per day and was in sex work for about 13 years. Almost all of them (97%) used condom with the clients they had on the last day of sex work. About 74% were ever registered in the ART center and 47% of them reported being ever on ART, of which 6% dropped out. Multivariate results support the hypothesis that the interventions addressing stigma and discrimination enabled accessing health services in the government facilities (AOR=1.37; p=0.17). Conclusions: Based on the results of the study, programs addressing stigma, discrimination and positive prevention can be implemented in places where government health services are not utilized by HIV positive FSWs. However, the study may be limited by the fact that majority of the FSWs entered into sex work through the traditional devadasi system, which may not be the case in other parts of India.

Keywords: sex work, HIV/AIDS, female sex workers, health

Procedia PDF Downloads 146
10133 A Cohort and Empirical Based Multivariate Mortality Model

Authors: Jeffrey Tzu-Hao Tsai, Yi-Shan Wong

Abstract:

This article proposes a cohort-age-period (CAP) model to characterize multi-population mortality processes using cohort, age, and period variables. Distinct from the factor-based Lee-Carter-type decomposition mortality model, this approach is empirically based and includes the age, period, and cohort variables into the equation system. The model not only provides a fruitful intuition for explaining multivariate mortality change rates but also has a better performance in forecasting future patterns. Using the US and the UK mortality data and performing ten-year out-of-sample tests, our approach shows smaller mean square errors in both countries compared to the models in the literature.

Keywords: longevity risk, stochastic mortality model, multivariate mortality rate, risk management

Procedia PDF Downloads 9
10132 Forecasting the Influences of Information and Communication Technology on the Structural Changes of Japanese Industrial Sectors: A Study Using Statistical Analysis

Authors: Ubaidillah Zuhdi, Shunsuke Mori, Kazuhisa Kamegai

Abstract:

The purpose of this study is to forecast the influences of Information and Communication Technology (ICT) on the structural changes of Japanese economies based on Leontief Input-Output (IO) coefficients. This study establishes a statistical analysis to predict the future interrelationships among industries. We employ the Constrained Multivariate Regression (CMR) model to analyze the historical changes of input-output coefficients. Statistical significance of the model is then tested by Likelihood Ratio Test (LRT). In our model, ICT is represented by two explanatory variables, i.e. computers (including main parts and accessories) and telecommunications equipment. A previous study, which analyzed the influences of these variables on the structural changes of Japanese industrial sectors from 1985-2005, concluded that these variables had significant influences on the changes in the business circumstances of Japanese commerce, business services and office supplies, and personal services sectors. The projected future Japanese economic structure based on the above forecast generates the differentiated direct and indirect outcomes of ICT penetration.

Keywords: forecast, ICT, industrial structural changes, statistical analysis

Procedia PDF Downloads 333
10131 Applying Multivariate and Univariate Analysis of Variance on Socioeconomic, Health, and Security Variables in Jordan

Authors: Faisal G. Khamis, Ghaleb A. El-Refae

Abstract:

Many researchers have studied socioeconomic, health, and security variables in the developed countries; however, very few studies used multivariate analysis in developing countries. The current study contributes to the scarce literature about the determinants of the variance in socioeconomic, health, and security factors. Questions raised were whether the independent variables (IVs) of governorate and year impact the socioeconomic, health, and security dependent variables (DVs) in Jordan, whether the marginal mean of each DV in each governorate and in each year is significant, which governorates are similar in difference means of each DV, and whether these DVs vary. The main objectives were to determine the source of variances in DVs, collectively and separately, testing which governorates are similar and which diverge for each DV. The research design was time series and cross-sectional analysis. The main hypotheses are that IVs affect DVs collectively and separately. Multivariate and univariate analyses of variance were carried out to test these hypotheses. The population of 12 governorates in Jordan and the available data of 15 years (2000–2015) accrued from several Jordanian statistical yearbooks. We investigated the effect of two factors of governorate and year on the four DVs of divorce rate, mortality rate, unemployment percentage, and crime rate. All DVs were transformed to multivariate normal distribution. We calculated descriptive statistics for each DV. Based on the multivariate analysis of variance, we found a significant effect in IVs on DVs with p < .001. Based on the univariate analysis, we found a significant effect of IVs on each DV with p < .001, except the effect of the year factor on unemployment was not significant with p = .642. The grand and marginal means of each DV in each governorate and each year were significant based on a 95% confidence interval. Most governorates are not similar in DVs with p < .001. We concluded that the two factors produce significant effects on DVs, collectively and separately. Based on these findings, the government can distribute its financial and physical resources to governorates more efficiently. By identifying the sources of variance that contribute to the variation in DVs, insights can help inform focused variation prevention efforts.

Keywords: ANOVA, crime, divorce, governorate, hypothesis test, Jordan, MANOVA, means, mortality, unemployment, year

Procedia PDF Downloads 228
10130 Evaluating the Factors Controlling the Hydrochemistry of Gaza Coastal Aquifer Using Hydrochemical and Multivariate Statistical Analysis

Authors: Madhat Abu Al-Naeem, Ismail Yusoff, Ng Tham Fatt, Yatimah Alias

Abstract:

Groundwater in Gaza strip is increasingly being exposed to anthropic and natural factors that seriously impacted the groundwater quality. Physiochemical data of groundwater can offer important information on changes in groundwater quality that can be useful in improving water management tactics. An integrative hydrochemical and statistical techniques (Hierarchical cluster analysis (HCA) and factor analysis (FA)) have been applied on the existence ten physiochemical data of 84 samples collected in (2000/2001) using STATA, AquaChem, and Surfer softwares to: 1) Provide valuable insight into the salinization sources and the hydrochemical processes controlling the chemistry of groundwater. 2) Differentiate the influence of natural processes and man-made activities. The recorded large diversity in water facies with dominance Na-Cl type that reveals a highly saline aquifer impacted by multiple complex hydrochemical processes. Based on WHO standards, only (15.5%) of the wells were suitable for drinking. HCA yielded three clusters. Cluster 1 is the highest in salinity, mainly due to the impact of Eocene saline water invasion mixed with human inputs. Cluster 2 is the lowest in salinity also due to Eocene saline water invasion but mixed with recent rainfall recharge and limited carbonate dissolution and nitrate pollution. Cluster 3 is similar in salinity to Cluster 2, but with a high diversity of facies due to the impact of many sources of salinity as sea water invasion, carbonate dissolution and human inputs. Factor analysis yielded two factors accounting for 88% of the total variance. Factor 1 (59%) is a salinization factor demonstrating the mixing contribution of natural saline water with human inputs. Factor 2 measure the hardness and pollution which explained 29% of the total variance. The negative relationship between the NO3- and pH may reveal a denitrification process in a heavy polluted aquifer recharged by a limited oxygenated rainfall. Multivariate statistical analysis combined with hydrochemical analysis indicate that the main factors controlling groundwater chemistry were Eocene saline invasion, seawater invasion, sewage invasion and rainfall recharge and the main hydrochemical processes were base ion and reverse ion exchange processes with clay minerals (water rock interactions), nitrification, carbonate dissolution and a limited denitrification process.

Keywords: dendrogram and cluster analysis, water facies, Eocene saline invasion and sea water invasion, nitrification and denitrification

Procedia PDF Downloads 322