Search results for: multivariate time series data
38814 Performance Evaluation and Comparison between the Empirical Mode Decomposition, Wavelet Analysis, and Singular Spectrum Analysis Applied to the Time Series Analysis in Atmospheric Science
Authors: Olivier Delage, Hassan Bencherif, Alain Bourdier
Abstract:
Signal decomposition approaches represent an important step in time series analysis, providing useful knowledge and insight into the data and the characteristics of the underlying dynamics while also facilitating tasks such as noise removal and feature extraction. Because most observational time series are nonlinear and nonstationary, resulting from the interaction of several physical processes at different time scales, experimental time series fluctuate at all time scales and require the development of specific signal decomposition techniques. The most commonly used techniques are data driven, making it possible to obtain well-behaved signal components without making any prior assumptions about the input data. Among the most popular time series decomposition techniques, and the most cited in the literature, are the empirical mode decomposition and its variants, the empirical wavelet transform, and singular spectrum analysis. With the increasing popularity and utility of these methods in wide-ranging applications, it is imperative to gain a good understanding of the operation of these algorithms. In this work, we describe all of the techniques mentioned above as well as their ability to denoise signals, to capture trends, to identify components corresponding to the physical processes involved in the evolution of the observed system, and to deduce the dimensionality of the underlying dynamics. Results obtained with all of these methods on experimental total ozone column and rainfall time series will be discussed and compared.
Keywords: denoising, empirical mode decomposition, singular spectrum analysis, time series, underlying dynamics, wavelet analysis
Procedia PDF Downloads 116
38813 Time Series Analysis on the Production of Fruit Juice: A Case Study of National Horticultural Research Institute (Nihort) Ibadan, Oyo State
Authors: Abiodun Ayodele Sanyaolu
Abstract:
This research applied time series analysis to the quarterly production of fruit juice at the National Horticultural Research Institute, Ibadan, from 2010 to 2018. A documentary method of data collection was used, and the least squares and moving average methods were used in the analysis. From the calculations and the graph, it was clear that there were increasing, decreasing, and uniform movements in both the graph of the original data and the tabulated quarterly values of the original data. Time series analysis was used to detect the trend in the highest volume of fruit juice produced, which appears to be good over a period of time, and the methods used to forecast are additive and multiplicative models. Since it was observed that the production of fruit juice is usually high in January of every year, it is strongly advised that the National Horticultural Research Institute make more provision for fruit juice storage outside this period of the year.
Keywords: fruit juice, least square, multiplicative models, time series
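The two techniques named in this abstract, the moving average and the least squares trend line, can be sketched as follows. This is a minimal illustration only, not the authors' calculation: the function names are hypothetical, the trend is fitted against a time index starting at 0, and a window of 4 is assumed because the data are quarterly.

```python
def moving_average(values, window=4):
    """Simple moving average; returns len(values) - window + 1 points.

    A window of 4 is an assumption chosen to span one year of quarterly data.
    """
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def least_squares_trend(values):
    """Fit y = intercept + slope * t by ordinary least squares, t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope
```

The moving average smooths out within-year variation, while the fitted slope summarizes the long-run direction of the series.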
Procedia PDF Downloads 142
38812 Nonstationarity Modeling of Economic and Financial Time Series
Authors: C. Slim
Abstract:
Traditional techniques for analyzing time series are based on the notion of stationarity of the phenomena under study, but in reality most economic and financial series do not satisfy this hypothesis, which calls for specific tools to detect such behavior. In this paper, we study tests for nonstationary, non-seasonal time series in a non-exhaustive manner. We formalize the problem of nonstationary processes with numerical simulations and take stock of their statistical characteristics. The theoretical aspects of some of the most common unit root tests will be discussed. We detail the specification of the tests, showing the advantages and disadvantages of each. The empirical study focuses on the application of these tests to the exchange rate (USD/TND) and the Consumer Price Index (CPI) in Tunisia, in order to compare the power of these tests with the characteristics of the series.
Keywords: stationarity, unit root tests, economic time series, ADF tests
Procedia PDF Downloads 420
38811 Generating Swarm Satellite Data Using Long Short-Term Memory and Generative Adversarial Networks for the Detection of Seismic Precursors
Authors: Yaxin Bi
Abstract:
Accurate prediction and understanding of the evolution mechanisms of earthquakes remain challenging in the fields of geology, geophysics, and seismology. This study leverages Long Short-Term Memory (LSTM) networks and Generative Adversarial Networks (GANs), a generative model tailored to time-series data, for generating synthetic time series data based on Swarm satellite data, which will be used for detecting seismic anomalies. LSTMs demonstrated commendable predictive performance in generating synthetic data across multiple countries. In contrast, the GAN models struggled to generate synthetic data, often producing non-informative values, although they were able to capture the data distribution of the time series. These findings highlight both the promise and challenges associated with applying deep learning techniques to generate synthetic data, underscoring the potential of deep learning in generating synthetic electromagnetic satellite data.
Keywords: LSTM, GAN, earthquake, synthetic data, generative AI, seismic precursors
Procedia PDF Downloads 32
38810 A Non-parametric Clustering Approach for Multivariate Geostatistical Data
Authors: Francky Fouedjio
Abstract:
Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are more similar while clusters are different from each other, in some sense. Spatially contiguous clusters can significantly improve the interpretation that turns the resulting clusters into meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of the data. It integrates existing methods to find the optimal number of clusters and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected, and meaningful clusters is assessed using a bivariate synthetic dataset and a multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.
Keywords: clustering, geostatistics, multivariate data, non-parametric
Procedia PDF Downloads 477
38809 Wind Speed Data Analysis in Colombia in 2013 and 2015
Authors: Harold P. Villota, Alejandro Osorio B.
Abstract:
Energy meteorology is a field that studies energy complementarity and the use of renewable sources in interconnected systems. To diversify the energy matrix in Colombia with wind sources, it is necessary to know the available wind databases. However, the time series provided by 260 automatic weather stations contain empty and not-applicable values, so the purpose is to fill the time series by selecting two years to characterize, impute, and use as a base to complete the data between 2005 and 2020.
Keywords: complementarity, wind speed, renewable, Colombia, characterization, imputation
Procedia PDF Downloads 164
38808 pscmsForecasting: A Python Web Service for Time Series Forecasting
Authors: Ioannis Andrianakis, Vasileios Gkatas, Nikos Eleftheriadis, Alexios Ellinidis, Ermioni Avramidou
Abstract:
pscmsForecasting is an open-source web service that implements a variety of time series forecasting algorithms and exposes them to the user via the ubiquitous HTTP protocol. It allows developers to enhance their applications by adding time series forecasting functionalities through an intuitive and easy-to-use interface. This paper provides some background on time series forecasting and gives details about the implemented algorithms, aiming to enhance the end user’s understanding of the underlying methods before incorporating them into their applications. A detailed description of the web service’s interface and its various parameterizations is also provided. Being an open-source project, pscmsForecasting can also be easily modified and tailored to the specific needs of each application.
Keywords: time series, forecasting, web service, open source
Procedia PDF Downloads 83
38807 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method
Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri
Abstract:
Recent research in neural networks and neuroscience on modeling complex time series data and statistical learning has focused mostly on learning from high-dimensional input spaces and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high-dimensional signal spaces. In this paper, different learning scenarios for one- and two-dimensional data series with different distributions are investigated by simulation, and noise is added to the data to produce differently disordered distributions in the time series and to evaluate the algorithm's local prediction of nonlinearity. The performance of the algorithm is then simulated, and its sensitivity to the data distribution when the spread of the data is high or the number of data points is small, together with the influence of the important local validity parameter under different data distributions, is explained.
Keywords: local nonlinear estimation, LWPR algorithm, online training method, locally weighted projection regression method
Procedia PDF Downloads 502
38806 Approximation of the Time Series by Fractal Brownian Motion
Authors: Valeria Bondarenko
Abstract:
In this paper, we consider two problems related to fractional Brownian motion (fBm). The first problem is the simultaneous estimation of two parameters, the Hurst exponent and the volatility, that describe this random process. Numerical tests on simulated fBm showed the method to be efficient. The second problem is the approximation of the increments of the observed time series, transformed by a power function, by increments of the fractional Brownian motion. Approximation and estimation are demonstrated on real data, daily deposit interest rates.
Keywords: fractional Brownian motion, Gaussian processes, approximation, time series, estimation of properties of the model
Procedia PDF Downloads 376
38805 Modified CUSUM Algorithm for Gradual Change Detection in a Time Series Data
Authors: Victoria Siriaki Jorry, I. S. Mbalawata, Hayong Shin
Abstract:
The main objective in a change detection problem is to develop algorithms for efficient detection of gradual and/or abrupt changes in the parameter distribution of a process or time series data. In this paper, we present a modified cumulative sum (MCUSUM) algorithm to detect the start and end of a time-varying linear drift in the mean value of time series data, based on a likelihood ratio test procedure. The design, implementation, and performance of the proposed algorithm for linear drift detection are evaluated and compared to the existing CUSUM algorithm using different performance measures. An approach to accurately approximate the threshold of the MCUSUM is also provided. The performance of the MCUSUM for gradual change-point detection is compared to that of the standard cumulative sum (CUSUM) control chart designed for abrupt shift detection using Monte Carlo simulations. In terms of the expected time to detection, the MCUSUM procedure is found to perform better than a standard CUSUM chart for detecting a gradual change in mean. The algorithm is then applied to randomly generated time series data with a gradual linear trend in mean to demonstrate its usefulness.
Keywords: average run length, CUSUM control chart, gradual change detection, likelihood ratio test
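For readers unfamiliar with the baseline this abstract compares against, the standard one-sided CUSUM chart for an upward shift in mean can be sketched as below. This is the textbook procedure, not the authors' MCUSUM; the function name and the parameter names `target`, `k` (reference value), and `h` (decision threshold) are illustrative assumptions.

```python
def cusum_upper(xs, target, k, h):
    """Standard one-sided CUSUM for an upward mean shift (not the MCUSUM).

    target: in-control mean; k: reference (slack) value; h: decision threshold.
    Returns the index of the first alarm, or None if no alarm is raised.
    """
    s = 0.0
    for i, x in enumerate(xs):
        # Accumulate deviations above target + k, resetting at zero.
        s = max(0.0, s + (x - target - k))
        if s > h:
            return i
    return None
```

Because the statistic only accumulates deviations larger than `k`, a slow linear drift takes longer to trigger an alarm than an abrupt shift, which is the situation the abstract's modified procedure is designed to address.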
Procedia PDF Downloads 298
38804 Stock Price Prediction Using Time Series Algorithms
Authors: Sumit Sen, Sohan Khedekar, Umang Shinde, Shivam Bhargava
Abstract:
This study investigates whether deep learning models can predict future stock prices when trained on historical stock price data. Various models are available today for time series analysis, such as the recurrent neural network LSTM, ARIMA, and Facebook Prophet. These models are applied to predict the movement of stock prices and to provide a future prediction of the price of a stock. The final product will be a stock price prediction web application that makes stock analysis easier for the user and also provides the predicted stock price for the next seven days.
Keywords: Autoregressive Integrated Moving Average, Deep Learning, Long Short Term Memory, Time-series
Procedia PDF Downloads 141
38803 Multivariate Data Analysis for Automatic Atrial Fibrillation Detection
Authors: Zouhair Haddi, Stephane Delliaux, Jean-Francois Pons, Ismail Kechaf, Jean-Claude De Haro, Mustapha Ouladsine
Abstract:
Atrial fibrillation (AF) is considered the most common cardiac arrhythmia and a major public health burden associated with significant morbidity and mortality. Nowadays, telemedical approaches targeting cardiac outpatients place AF among the most challenging medical issues. Automatic, early, and fast AF detection is still a major concern for healthcare professionals. Several algorithms based on univariate analysis have been developed to detect atrial fibrillation. However, the published results do not show satisfactory classification accuracy. This work aimed to resolve this shortcoming by proposing multivariate data analysis methods for automatic AF detection. Four publicly accessible sets of clinical data (AF Termination Challenge Database, MIT-BIH AF, Normal Sinus Rhythm RR Interval Database, and MIT-BIH Normal Sinus Rhythm Databases) were used for assessment. All time series were segmented into 1-min RR-interval windows, and four specific features were then calculated. Two pattern recognition methods, Principal Component Analysis (PCA) and the Learning Vector Quantization (LVQ) neural network, were used to develop classification models. PCA, as a feature reduction method, was employed to find the important features for discriminating between AF and Normal Sinus Rhythm. Despite its very simple structure, the LVQ model performs better on the analyzed databases than existing algorithms, with high sensitivity and specificity (99.19% and 99.39%, respectively). The proposed AF detection has several interesting properties and can be implemented with just a few arithmetical operations, which makes it a suitable choice for telecare applications.
Keywords: atrial fibrillation, multivariate data analysis, automatic detection, telemedicine
Procedia PDF Downloads 267
38802 Forecasting the Volatility of Geophysical Time Series with Stochastic Volatility Models
Authors: Maria C. Mariani, Md Al Masum Bhuiyan, Osei K. Tweneboah, Hector G. Huizar
Abstract:
This work is devoted to the study of modeling geophysical time series. A stochastic technique with time-varying parameters is used to forecast the volatility of data arising in geophysics. In this study, the volatility is defined as a logarithmic first-order autoregressive process. We observe that the inclusion of log-volatility into the time-varying parameter estimation significantly improves forecasting, which is facilitated via maximum likelihood estimation. This allows us to conclude that the estimation algorithm for the corresponding one-step-ahead suggested volatility (with ±2 standard prediction errors) is very feasible since it possesses good convergence properties.
Keywords: Augmented Dickey Fuller Test, geophysical time series, maximum likelihood estimation, stochastic volatility model
Procedia PDF Downloads 315
38801 Proactive Pure Handoff Model with SAW-TOPSIS Selection and Time Series Predict
Authors: Harold Vásquez, Cesar Hernández, Ingrid Páez
Abstract:
This paper addresses cognitive radio techniques and applies a pure proactive handoff model to decrease interference between primary users (PU) and secondary users (SU), comparing it with a reactive handoff model. The study analyzes the multivariate models SAW and TOPSIS combined with three dynamic prediction techniques: AR, MA, and ARMA. Four metrics are used to evaluate the best model: number of failed handoffs, number of handoffs, number of predictions, and number of interferences. The results show the advantages of using this type of pure proactive model to predict changes in the PU according to the selected channel and to reduce interference. The model that showed the best performance was TOPSIS-MA; although TOPSIS-AR had a higher predictive ability, this was not reflected in the interference reduction.
Keywords: cognitive radio, spectrum handoff, decision making, time series, wireless networks
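The two multi-criteria ranking methods named in this abstract, SAW (simple additive weighting) and TOPSIS, can be sketched in their textbook form as below. This is an illustration under stated assumptions, not the authors' channel-selection model: the function names, the decision matrix layout (one row per alternative, one column per criterion), and the `benefit` flags are hypothetical, all values are assumed strictly positive, and the alternatives are assumed not to be all identical.

```python
import math


def saw(matrix, weights, benefit):
    """Simple additive weighting with linear max/min normalization."""
    cols = list(zip(*matrix))
    scores = []
    for row in matrix:
        total = 0.0
        for j, x in enumerate(row):
            # Benefit criteria: larger is better; cost criteria: smaller is better.
            r = x / max(cols[j]) if benefit[j] else min(cols[j]) / x
            total += weights[j] * r
        scores.append(total)
    return scores


def topsis(matrix, weights, benefit):
    """TOPSIS closeness scores in [0, 1]; higher means closer to the ideal."""
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_plus = math.dist(row, ideal)   # distance to the ideal solution
        d_minus = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_minus / (d_plus + d_minus))
    return scores
```

In a handoff setting, each row might describe a candidate channel and each column a criterion; the alternative with the highest score would be selected.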
Procedia PDF Downloads 487
38800 Copula Autoregressive Methodology for Simulation of Solar Irradiance and Air Temperature Time Series for Solar Energy Forecasting
Authors: Andres F. Ramirez, Carlos F. Valencia
Abstract:
The increasing interest in applying renewable energy strategies and the drive to diminish the use of carbon-related energy sources have encouraged the development of novel strategies for integrating solar energy into the electricity network. Correctly including the fluctuating energy output of a photovoltaic (PV) energy system in an electric grid requires improvements in the forecasting and simulation methodologies for solar energy potential, and an understanding not only of the mean value of the series but also of the associated underlying stochastic process. We present a methodology for synthetic generation of solar irradiance (shortwave flux) and air temperature bivariate time series based on copula functions to represent the cross-dependence and temporal structure of the data. We explore the advantages of using this nonlinear time series method over traditional approaches that use a transformation of the data to normal distributions as an intermediate step. The use of copulas gives flexibility to represent the serial variability of the real data in the simulation and allows more control over the desired properties of the data. We use discrete zero mass density distributions to assess the nature of solar irradiance, alongside vector generalized linear models for the time-dependent distributions of the bivariate time series. We found that the copula autoregressive methodology used, including the zero mass characteristics of the solar irradiance time series, generates a significant improvement over state-of-the-art strategies. These results will help to better understand the fluctuating nature of solar energy forecasting and the underlying stochastic process, and to quantify the potential of integrating a photovoltaic (PV) energy generating system into a country's electricity network. Experimental analysis and real data application substantiate the usage and convenience of the proposed methodology to forecast solar irradiance time series and solar energy across northern hemisphere, southern hemisphere, and equatorial zones.
Keywords: copula autoregressive, solar irradiance forecasting, solar energy forecasting, time series generation
Procedia PDF Downloads 323
38799 Time Series Regression with Meta-Clusters
Authors: Monika Chuchro
Abstract:
This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain subgroups of normally distributed time series data from wastewater treatment plant inflow data, which are composed of several groups differing by mean value. Two simple algorithms, K-means and EM, were chosen as clustering methods. The Rand index was used to measure similarity. After simple meta-clustering, a regression model was built for each subgroup. The final model was the sum of the subgroup models. The quality of the obtained model was compared with that of a regression model built using the same explanatory variables but without clustering the data. Results were compared using the coefficient of determination (R2), the mean absolute percentage error (MAPE) as a measure of prediction accuracy, and a comparison on a line chart. Preliminary results make it possible to foresee the potential of the presented technique.
Keywords: clustering, data analysis, data mining, predictive models
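The two comparison metrics mentioned in this abstract have standard definitions, sketched below for reference. The function names are hypothetical, and the sketch assumes that no actual value is zero (for MAPE) and that the actual values are not all equal (for R2).

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n
```

A higher R2 and a lower MAPE for the clustered model than for the unclustered one would support the kind of improvement the abstract describes.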
Procedia PDF Downloads 466
38798 Elucidation of the Sequential Transcriptional Activity in Escherichia coli Using Time-Series RNA-Seq Data
Authors: Pui Shan Wong, Kosuke Tashiro, Satoru Kuhara, Sachiyo Aburatani
Abstract:
Functional genomics and gene regulation inference have readily expanded our knowledge and understanding of gene interactions with regard to expression regulation. With the advancement of transcriptome sequencing in time-series comes the ability to study the sequential changes of the transcriptome. The method presented here augments existing regulation networks accumulated in the literature with transcriptome data gathered from time-series experiments to construct a sequential representation of transcription factor activity. The method is applied to a time-series RNA-Seq data set from Escherichia coli as it transitions from growth to stationary phase over five hours. Investigations are conducted on the various metabolic activities in gene regulation processes by taking advantage of the correlation between regulatory gene pairs to examine their activity on a dynamic network. In particular, the changes in metabolic activity during phase transition are analyzed with a focus on the pagP gene as well as other associated transcription factors. The visualization of the sequential transcriptional activity is used to describe the change in metabolic pathway activity originating from the pagP transcription factor, phoP. The results show a shift from amino acid and nucleic acid metabolism to energy metabolism during the transition to stationary phase in E. coli.
Keywords: Escherichia coli, gene regulation, network, time-series
Procedia PDF Downloads 372
38797 Time Series Analysis the Case of China and USA Trade Examining during Covid-19 Trade Enormity of Abnormal Pricing with the Exchange rate
Authors: Md. Mahadi Hasan Sany, Mumenunnessa Keya, Sharun Khushbu, Sheikh Abujar
Abstract:
Since the beginning of China's economic reform, trade between the U.S. and China has grown rapidly, and it has increased further since China's accession to the World Trade Organization in 2001. The U.S. imports more from China than it exports to China; the trade war between China and the U.S. reduced the trade deficit in 2019, but in 2020 the opposite happened. Washington launched a full-scale trade war against China in March 2016, and a catastrophic epidemic followed. The main goal of our study is to measure and predict trade relations between China and the U.S. before and after the arrival of the COVID epidemic. The ML model uses different data as input but lacks the time dimension that is present in time series models, and it can only predict the future from previously observed data. The LSTM (a well-known recurrent neural network) model is applied as the best time series model for trade forecasting. We have been able to create a sustainable forecasting system for trade between China and the U.S. by closely monitoring a dataset published by the State Website NZ Tatauranga Aotearoa from January 1, 2015, to April 30, 2021. In the study, we provide a 180-day forecast that outlines what would happen to trade between China and the U.S. during COVID-19. In addition, we illustrate that the LSTM model provides better results in time series data analysis than RFR and SVR (both ML models). The study looks at how the current Covid outbreak affects China-U.S. trade. As a comparative study, the RMSE transmission rate is calculated for LSTM, RFR, and SVR. From our time series analysis, it can be said that the LSTM model gives a very favorable outlook for the future export situation in China-U.S. trade.
Keywords: RFR, China-U.S. trade war, SVR, LSTM, deep learning, Covid-19, export value, forecasting, time series analysis
Procedia PDF Downloads 198
38796 Gender Based Variability Time Series Complexity Analysis
Authors: Ramesh K. Sunkaria, Puneeta Marwaha
Abstract:
Nonlinear methods of heart rate variability (HRV) analysis are becoming more popular. It has been observed that complexity measures quantify the regularity and uncertainty of cardiovascular RR-interval time series. In the present work, sample entropy (SampEn) has been evaluated in healthy Normal Sinus Rhythm (NSR) male and female subjects for different data lengths and tolerance levels r. It is demonstrated that SampEn is small for higher values of the tolerance r. The SampEn value of the healthy female group is also higher than that of the healthy male group for short data lengths; as the data length increases, the two groups overlap and become difficult to distinguish. SampEn gives inaccurate results by assigning a higher value to the female group, because male subjects have a more complex HRV pattern than female subjects. This traditional algorithm therefore exhibits higher complexity for healthy female subjects than for healthy male subjects, which is a misleading observation. This may be because SampEn does not account for the multiple time scales inherent in physiologic time series, so the hidden spatial and temporal fluctuations remain unexplored.
Keywords: heart rate variability, normal sinus rhythm group, RR interval time series, sample entropy
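As background for the measure discussed in this abstract, a plain single-scale sample entropy can be sketched as follows. This is a minimal illustration of the usual definition, not the authors' implementation: the function name is hypothetical, `r` is taken as an absolute tolerance (in practice it is often set to a fraction of the series' standard deviation), and matches are counted with the Chebyshev distance at most `r`.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy: -ln(A / B), where B and A count template pairs that
    match within tolerance r at lengths m and m + 1 (self-matches excluded).
    """
    n = len(series)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)
```

A perfectly regular series yields a value of zero, and less predictable series yield larger values. Because the statistic is computed at a single time scale, it does not capture the multi-scale structure that the abstract identifies as a limitation.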
Procedia PDF Downloads 282
38795 Forecasting Cancers Cases in Algeria Using Double Exponential Smoothing Method
Authors: Messis A., Adjebli A., Ayeche R., Talbi M., Tighilet K., Louardiane M.
Abstract:
Cancers are the second leading cause of death worldwide. The prevalence and incidence of cancers are increasing with aging and population growth. This study aims to model and predict the evolution of breast, colorectal, lung, bladder, and prostate cancers over the period 2014-2019. Data were analyzed using time series analysis with the double exponential smoothing method to forecast the future pattern. Minitab statistical software version 17 was used to describe and fit the appropriate models. Between 2014 and 2019, the overall trend in the raw number of new cancer cases registered increased over time. Our forecast model is considered validated because it predicts 2020 well; data were not available for 2021 and 2022. Time series analysis showed that double exponential smoothing is an efficient tool for modeling future data on the raw number of new cancer cases.
Keywords: cancer, time series, prediction, double exponential smoothing
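Double exponential smoothing in its common Holt (level plus trend) form can be sketched as below. This is a generic illustration, not a reproduction of the Minitab procedure used in the study: the function name is hypothetical, and the initialization (first value as level, first difference as trend) is an assumption that differs between software packages.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear (double) exponential smoothing.

    alpha smooths the level, beta smooths the trend; both lie in (0, 1].
    Requires at least two observations. Returns `horizon` forecasts.
    """
    level = series[0]
    trend = series[1] - series[0]  # assumed initialization
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # The h-step-ahead forecast extrapolates the last level along the last trend.
    return [level + h * trend for h in range(1, horizon + 1)]
```

For example, `holt_forecast(yearly_counts, alpha=0.5, beta=0.5, horizon=3)` would extrapolate three further periods, where `yearly_counts` is a placeholder for an observed annual series.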
Procedia PDF Downloads 88
38794 Confidence Envelopes for Parametric Model Selection Inference and Post-Model Selection Inference
Authors: I. M. L. Nadeesha Jayaweera, Adao Alex Trindade
Abstract:
In choosing a candidate model in likelihood-based modeling via an information criterion, the practitioner is often faced with the difficult task of deciding just how far up the ranked list to look. Motivated by this pragmatic necessity, we construct an uncertainty band for a generalized (model selection) information criterion (GIC), defined as a criterion for which the limit in probability is identical to that of the normalized log-likelihood. This includes common special cases such as AIC and BIC. The method starts from the asymptotic normality of the GIC for the joint distribution of the candidate models in an independent and identically distributed (IID) data framework and proceeds by deriving the (asymptotically) exact distribution of the minimum. The calculation of an upper quantile for its distribution then involves the computation of multivariate Gaussian integrals, which is amenable to efficient implementation via the R package "mvtnorm". The performance of the methodology is tested on simulated data by checking the coverage probability of nominal upper quantiles and compared to the bootstrap. Both methods give coverages close to nominal for large samples, but the bootstrap is two orders of magnitude slower. The methodology is subsequently extended to two other commonly used model structures: regression and time series. In the regression case, we derive the corresponding asymptotically exact distribution of the minimum GIC invoking Lindeberg-Feller type conditions for triangular arrays and are thus able to similarly calculate upper quantiles for its distribution via multivariate Gaussian integration. The bootstrap once again provides a default competing procedure, and we find that similar comparison performance metrics hold as for the IID case. The time series case is complicated by a far more intricate asymptotic regime for the joint distribution of the model GIC statistics. Under a Gaussian likelihood, the default in most packages, one needs to derive the limiting distribution of a normalized quadratic form for a realization from a stationary series. Under conditions on the process satisfied by ARMA models, a multivariate normal limit is once again achieved. The bootstrap can, however, be employed for its computation, whence we are once again in the multivariate Gaussian integration paradigm for upper quantile evaluation. Comparisons of this bootstrap-aided semi-exact method with the full-blown bootstrap once again reveal a similar performance but faster computation speeds. One of the most difficult problems in contemporary statistical methodological research is to be able to account for the extra variability introduced by model selection uncertainty, the so-called post-model selection inference (PMSI). We explore ways in which the GIC uncertainty band can be inverted to make inferences on the parameters. This is being attempted in the IID case by pivoting the CDF of the asymptotically exact distribution of the minimum GIC. For inference one parameter at a time and a small number of candidate models, this works well, whence the attained PMSI confidence intervals are wider than the MLE-based Wald, as expected.
Keywords: model selection inference, generalized information criteria, post model selection, Asymptotic Theory
Procedia PDF Downloads 89
38793 Model of Optimal Centroids Approach for Multivariate Data Classification
Authors: Pham Van Nha, Le Cam Binh
Abstract:
Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered a multidisciplinary optimization model that can be applied to various optimization problems. PSO’s ideas are simple and easy to understand, but PSO is only applied to simple model problems. We think that in order to expand the applicability of PSO to complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO as a mathematical model and apply it to multivariate data classification. First, PSO's general mathematical model (MPSO) is analyzed as a universal optimization model. Then, the Model of Optimal Centroids (MOC) is proposed for multivariate data classification. Experiments were conducted on some benchmark data sets to prove the effectiveness of MOC compared with several proposed schemes.
Keywords: analysis of optimization, artificial intelligence based optimization, optimization for learning and data analysis, global optimization
Procedia PDF Downloads 208
38792 Time Series Simulation by Conditional Generative Adversarial Net
Authors: Rao Fu, Jie Chen, Shutian Zeng, Yiping Zhuang, Agus Sudjianto
Abstract:
Generative Adversarial Net (GAN) has proved to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions include both categorical and continuous variables with different auxiliary information. Our simulation studies show that CGAN has the capability to learn different types of normal and heavy-tailed distributions, as well as dependent structures of different time series. It also has the capability to generate conditional predictive distributions consistent with training data distributions. We also provide an in-depth discussion on the rationale behind GAN and the neural networks as hierarchical splines to establish a clear connection with existing statistical methods of distribution generation. In practice, CGAN has a wide range of applications in market risk and counterparty risk analysis: it can be applied to learn historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES), and it can also predict the movement of the market risk factors. We present a real data analysis, including backtesting, to demonstrate that CGAN can outperform Historical Simulation (HS), a popular method in market risk analysis to calculate VaR. CGAN can also be applied in economic time series modeling and forecasting. In this regard, we have included an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN at the end of the paper.
Keywords: conditional generative adversarial net, market and credit risk management, neural network, time series
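For readers unfamiliar with the Historical Simulation baseline mentioned above, the sketch below shows one common way to compute VaR and ES from a sample of returns by reading off empirical loss quantiles. It is an assumption-laden illustration, not the authors' procedure: the function name `historical_var_es`, the quantile convention, and the sample returns are made up for this example, and other quantile conventions are also in use.

```python
import math


def historical_var_es(returns, confidence=0.95):
    """Empirical VaR and ES from historical returns (illustrative convention).

    Losses are the negated returns. VaR is the smallest loss such that at
    least `confidence` of the observed losses are less than or equal to it;
    ES is the mean of the losses at or beyond VaR.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    losses = sorted(-r for r in returns)  # ascending: largest losses last
    n = len(losses)
    # Small tolerance guards against floating-point noise in confidence * n.
    k = max(0, math.ceil(confidence * n - 1e-9) - 1)
    var = losses[k]
    tail = losses[k:]
    es = sum(tail) / len(tail)
    return var, es


# Hypothetical daily returns, for illustration only.
sample_returns = [-0.05, -0.02, 0.01, 0.03, -0.01, 0.02, 0.00, -0.03, 0.015, 0.005]
var_80, es_80 = historical_var_es(sample_returns, confidence=0.80)
```

A scenario generator such as CGAN would feed simulated returns, rather than the raw historical sample, into the same kind of quantile calculation.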
Procedia PDF Downloads 143
38791 Introduction of Robust Multivariate Process Capability Indices
Authors: Behrooz Khalilloo, Hamid Shahriari, Emad Roghanian
Abstract:
Process capability indices (PCIs) are important concepts in statistical quality control; they measure the capability of processes and how well processes meet given specifications. An important issue in statistical quality control is parameter estimation. Under the assumption of multivariate normality, the distribution parameters, namely the mean vector and the variance-covariance matrix, must be estimated when they are unknown. Classic estimation methods such as method of moments estimation (MME) or maximum likelihood estimation (MLE) give good estimates of the population parameters when the data are not contaminated. But when outliers exist in the data, MME and MLE give weak estimators of the population parameters, so we need estimators that perform well in the presence of outliers. In this work, robust M-estimators are used to estimate these parameters, and robust process capability indices are introduced based on the robust parameter estimators. The performance of these robust estimators in the presence of outliers and their effects on process capability indices are evaluated using real and simulated multivariate data. The results indicate that the proposed robust capability indices perform much better than the existing process capability indices.
Keywords: multivariate process capability indices, robust M-estimator, outlier, multivariate quality control, statistical quality control
Procedia PDF Downloads 283
38790 Content Analysis and Attitude of Thai Students towards Thai Series “Hormones: Season 2”
Authors: Siriporn Meenanan
Abstract:
The objective of this study is to investigate the attitude of Thai students towards the Thai series "Hormones the Series Season 2". The study used a quantitative research design, and questionnaires were used to collect data from a sample of 400 people. Descriptive statistics were used in the data analysis. The findings reveal that most participants have positive comments regarding the series. They strongly agreed that the series reflects the way of life and problems of teenagers in Thailand. Hence, the participants believe that if adults have a chance to watch the series, they will have a better understanding of teenagers. In addition, the participants also agreed that the contents of the series are appropriate and satisfactory, as the contents of “Hormones the Series Season 2” will raise awareness among teens and can be used as a guide to prevent problems that might happen during their teenage life.
Keywords: content analysis, attitude, Thai series, hormones the Series
Procedia PDF Downloads 229
38789 Multivariate Analysis of Spectroscopic Data for Agriculture Applications
Authors: Asmaa M. Hussein, Amr Wassal, Ahmed Farouk Al-Sadek, A. F. Abd El-Rahman
Abstract:
In this study, a multivariate analysis of potato spectroscopic data is presented to detect the presence or absence of brown rot disease. Near-Infrared (NIR) spectroscopy (1,350-2,500 nm) combined with multivariate analysis was used as a rapid, non-destructive technique for the detection of brown rot disease in potatoes. Spectral measurements were performed on 565 samples, which were chosen randomly at the infection place in the potato slice. In this study, 254 infected and 311 uninfected (brown rot-free) samples were analyzed using different advanced statistical analysis techniques. The discrimination performance of different multivariate analysis techniques, including classification, pre-processing, and dimension reduction, was compared. Applying a random forest classifier with different pre-processing techniques to raw spectra gave the best performance, achieving a total classification accuracy of 98.7% in discriminating infected potatoes from controls.
Keywords: Brown rot disease, NIR spectroscopy, potato, random forest
Procedia PDF Downloads 190
38788 Comparison of Different Machine Learning Models for Time-Series Based Load Forecasting of Electric Vehicle Charging Stations
Authors: H. J. Joshi, Satyajeet Patil, Parth Dandavate, Mihir Kulkarni, Harshita Agrawal
Abstract:
As the world looks towards a sustainable future, electric vehicles have become increasingly popular. Millions worldwide are looking to switch to electric cars over the previously favored combustion engine-powered cars. This demand has led to an increase in electric vehicle charging stations. The big challenge is that the randomness of electrical energy makes it tough for these charging stations to provide an adequate amount of energy over a specific amount of time. Thus, it has become increasingly crucial to model these patterns and forecast the energy needs of power stations. This paper aims to analyze how different machine learning models perform on electric vehicle charging time-series data. The data set consists of authentic electric vehicle data from the Netherlands. It has an overview of ten thousand transactions from public stations operated by EVnetNL.
Keywords: forecasting, smart grid, electric vehicle load forecasting, machine learning, time series forecasting
Procedia PDF Downloads 106
38787 Time-Series Load Data Analysis for User Power Profiling
Authors: Mahdi Daghmhehci Firoozjaei, Minchang Kim, Dima Alhadidi
Abstract:
In this paper, we present a power profiling model for smart grid consumers based on real-time load data acquired from smart meters. It profiles consumers’ power consumption behaviour using the dynamic time warping (DTW) clustering algorithm. Because this algorithm is invariant to signal warping, time-disordered load data can be profiled and consumption features can be extracted. Two load types are defined, and the related load patterns are extracted for classifying consumption behaviour by DTW. The classification methodology is discussed in detail. To evaluate the performance of the method, we analyze the time-series load data measured by a smart meter in a real case. The results verify the effectiveness of the proposed profiling method, with a 90.91% true positive rate for load type clustering in the best case.
Keywords: power profiling, user privacy, dynamic time warping, smart grid
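The warping invariance that makes DTW suitable for time-shifted load data can be seen in a small sketch of the classic dynamic-programming DTW distance. This is the textbook recurrence with an absolute-difference local cost, shown under the assumption that a distance of this kind underlies the clustering; the abstract does not give the authors' exact variant, and the load values below are invented for illustration.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance (sketch)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical load profiles (kW): the same consumption peak, shifted in time.
profile = [0.2, 0.2, 1.5, 2.0, 0.4]
shifted = [0.2, 1.5, 2.0, 0.4, 0.4]

shifted_distance = dtw_distance(profile, shifted)
pointwise_distance = sum(abs(x - y) for x, y in zip(profile, shifted))
```

Here the two profiles differ point by point, yet DTW can stretch the flat segments so that the peaks line up, which is why time-disordered load curves with the same shape end up close to each other.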
Procedia PDF Downloads 148
38786 Applying Multivariate and Univariate Analysis of Variance on Socioeconomic, Health, and Security Variables in Jordan
Authors: Faisal G. Khamis, Ghaleb A. El-Refae
Abstract:
Many researchers have studied socioeconomic, health, and security variables in developed countries; however, very few studies have used multivariate analysis in developing countries. The current study contributes to the scarce literature on the determinants of the variance in socioeconomic, health, and security factors. The questions raised were whether the independent variables (IVs) of governorate and year impact the socioeconomic, health, and security dependent variables (DVs) in Jordan, whether the marginal mean of each DV in each governorate and in each year is significant, which governorates are similar in the mean differences of each DV, and whether these DVs vary. The main objectives were to determine the sources of variance in the DVs, collectively and separately, and to test which governorates are similar and which diverge for each DV. The research design was a time series and cross-sectional analysis. The main hypotheses are that the IVs affect the DVs collectively and separately. Multivariate and univariate analyses of variance were carried out to test these hypotheses. The population comprised the 12 governorates in Jordan, and the available data of 15 years (2000–2015) were drawn from several Jordanian statistical yearbooks. We investigated the effect of the two factors of governorate and year on the four DVs of divorce rate, mortality rate, unemployment percentage, and crime rate. All DVs were transformed to a multivariate normal distribution. We calculated descriptive statistics for each DV. Based on the multivariate analysis of variance, we found a significant effect of the IVs on the DVs with p < .001. Based on the univariate analysis, we found a significant effect of the IVs on each DV with p < .001, except that the effect of the year factor on unemployment was not significant, with p = .642. The grand and marginal means of each DV in each governorate and each year were significant based on a 95% confidence interval. Most governorates are not similar in DVs with p < .001.
We concluded that the two factors produce significant effects on DVs, collectively and separately. Based on these findings, the government can distribute its financial and physical resources to governorates more efficiently. By identifying the sources of variance that contribute to the variation in DVs, insights can help inform focused variation prevention efforts.
Keywords: ANOVA, crime, divorce, governorate, hypothesis test, Jordan, MANOVA, means, mortality, unemployment, year
Procedia PDF Downloads 275
38785 Exploring Time-Series Phosphoproteomic Datasets in the Context of Network Models
Authors: Sandeep Kaur, Jenny Vuong, Marcel Julliard, Sean O'Donoghue
Abstract:
Time-series data are useful for modelling as they can enable model evaluation. However, when reconstructing models from phosphoproteomic data, non-exact methods are often utilised, as knowledge of the network structure, such as which kinases and phosphatases lead to the observed phosphorylation state, is incomplete. Thus, such reactions are often hypothesised, which gives rise to uncertainty. Here, we propose a framework, implemented via a web-based tool (as an extension to Minardo), which, given time-series phosphoproteomic datasets, can generate κ models. The incompleteness and uncertainty in the generated model and reactions are clearly presented to the user via the visual method. Furthermore, we demonstrate, via a toy EGF signalling model, the use of algorithmic verification to verify κ models. Manually formulated requirements were evaluated with regard to the model, leading to the highlighting of the nodes causing unsatisfiability (i.e. error-causing nodes). We aim to integrate such methods into our web-based tool and demonstrate how the identified erroneous nodes can be presented to the user via the visual method. Thus, in this research we present a framework to enable a user to explore phosphorylation proteomic time-series data in the context of models. The observer can visualise which reactions in the model are highly uncertain, and which nodes cause incorrect simulation outputs. A tool such as this enables an end-user to determine the empirical analysis to perform, to reduce uncertainty in the presented model - thus enabling a better understanding of the underlying system.
Keywords: κ-models, model verification, time-series phosphoproteomic datasets, uncertainty and error visualisation
Procedia PDF Downloads 255