Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 19721

Search results for: time series clustering

19631 Elucidation of the Sequential Transcriptional Activity in Escherichia coli Using Time-Series RNA-Seq Data

Authors: Pui Shan Wong, Kosuke Tashiro, Satoru Kuhara, Sachiyo Aburatani

Abstract:

Functional genomics and gene regulation inference has readily expanded our knowledge and understanding of gene interactions with regards to expression regulation. With the advancement of transcriptome sequencing in time-series comes the ability to study the sequential changes of the transcriptome. This method presented here works to augment existing regulation networks accumulated in literature with transcriptome data gathered from time-series experiments to construct a sequential representation of transcription factor activity. This method is applied on a time-series RNA-Seq data set from Escherichia coli as it transitions from growth to stationary phase over five hours. Investigations are conducted on the various metabolic activities in gene regulation processes by taking advantage of the correlation between regulatory gene pairs to examine their activity on a dynamic network. Especially, the changes in metabolic activity during phase transition are analyzed with focus on the pagP gene as well as other associated transcription factors. The visualization of the sequential transcriptional activity is used to describe the change in metabolic pathway activity originating from the pagP transcription factor, phoP. The results show a shift from amino acid and nucleic acid metabolism, to energy metabolism during the transition to stationary phase in E. coli.

Keywords: Escherichia coli, gene regulation, network, time-series

Procedia PDF Downloads 345

19630 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

As there can be cases where the use of real data is somehow limited, such as when it is hard to get access to a large volume of real data, we need to go for synthetic data generation. This produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data since the MTS is now being gathered more often in various real-world systems. Furthermore, the GAN-generated MTS data is fed into a transformer-based deep learning architecture that carries out the data categorization into predefined classes. Further, the model is evaluated across various distinct domains by generating corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 92

19629 Time Series Forecasting (TSF) Using Various Deep Learning Models

Authors: Jimeng Shi, Mahek Jain, Giri Narasimhan

Abstract:

Time Series Forecasting (TSF) is used to predict the target variables at a future time point based on the learning from previous time points. To keep the problem tractable, learning methods use data from a fixed-length window in the past as an explicit input. In this paper, we study how the performance of predictive models changes as a function of different look-back window sizes and different amounts of time to predict the future. We also consider the performance of the recent attention-based Transformer models, which have had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (RNN, LSTM, GRU, and Transformer) along with a baseline method. The dataset (hourly) we used is the Beijing Air Quality Dataset from the UCI website, which includes a multivariate time series of many factors measured on an hourly basis for a period of 5 years (2010-14). For each model, we also report on the relationship between the performance and the look-back window sizes and the number of predicted time points into the future. Our experiments suggest that Transformer models have the best performance with the lowest Mean Average Errors (MAE = 14.599, 23.273) and Root Mean Square Errors (RSME = 23.573, 38.131) for most of our single-step and multi-steps predictions. The best size for the look-back window to predict 1 hour into the future appears to be one day, while 2 or 4 days perform the best to predict 3 hours into the future.

Keywords: air quality prediction, deep learning algorithms, time series forecasting, look-back window

Procedia PDF Downloads 133

19628 Copula Autoregressive Methodology for Simulation of Solar Irradiance and Air Temperature Time Series for Solar Energy Forecasting

Authors: Andres F. Ramirez, Carlos F. Valencia

Abstract:

The increasing interest in renewable energies strategies application and the path for diminishing the use of carbon related energy sources have encouraged the development of novel strategies for integration of solar energy into the electricity network. A correct inclusion of the fluctuating energy output of a photovoltaic (PV) energy system into an electric grid requires improvements in the forecasting and simulation methodologies for solar energy potential, and the understanding not only of the mean value of the series but the associated underlying stochastic process. We present a methodology for synthetic generation of solar irradiance (shortwave flux) and air temperature bivariate time series based on copula functions to represent the cross-dependence and temporal structure of the data. We explore the advantages of using this nonlinear time series method over traditional approaches that use a transformation of the data to normal distributions as an intermediate step. The use of copulas gives flexibility to represent the serial variability of the real data on the simulation and allows having more control on the desired properties of the data. We use discrete zero mass density distributions to assess the nature of solar irradiance, alongside vector generalized linear models for the bivariate time series time dependent distributions. We found that the copula autoregressive methodology used, including the zero mass characteristics of the solar irradiance time series, generates a significant improvement over state of the art strategies. These results will help to better understand the fluctuating nature of solar energy forecasting, the underlying stochastic process, and quantify the potential of a photovoltaic (PV) energy generating system integration into a country electricity network. Experimental analysis and real data application substantiate the usage and convenience of the proposed methodology to forecast solar irradiance time series and solar energy across northern hemisphere, southern hemisphere, and equatorial zones.

Keywords: copula autoregressive, solar irradiance forecasting, solar energy forecasting, time series generation

Procedia PDF Downloads 295

19627 Time Series Analysis of Radon Concentration at Different Depths in an Underground Goldmine

Authors: Theophilus Adjirackor, Frederic Sam, Irene Opoku-Ntim, David Okoh Kpeglo, Prince K. Gyekye, Frank K. Quashie, Kofi Ofori

Abstract:

Indoor radon concentrations were collected monthly over a period of one year in 10 different levels in an underground goldmine, and the data was analyzed using a four-moving average time series to determine the relationship between the depths of the underground mine and the indoor radon concentration. The detectors were installed in batches within four quarters. The measurements were carried out using LR115 solid-state nuclear track detectors. Statistical models are applied in the prediction and analysis of the radon concentration at various depths. The time series model predicted a positive relationship between the depth of the underground mine and the indoor radon concentration. Thus, elevated radon concentrations are expected at deeper levels of the underground mine, but the relationship was insignificant at the 5% level of significance with a negative adjusted R2 (R2 = – 0.021) due to an appropriate engineering and adequate ventilation rate in the underground mine.

Keywords: LR115, radon concentration, rime series, underground goldmine

Procedia PDF Downloads 14

19626 Design of Electromagnetic Field of PMSG for VTOL Series-Hybrid UAV

Authors: Sooyoung Cho, In-Gun Kim, Hyun-Seok Hong, Dong-Woo Kang, Ju Lee

Abstract:

Series hybrid UAV(Unmanned aerial vehicle) that is proposed in this paper performs VTOL(Vertical take-off and landing) using the battery and generator, and it applies the series hybrid system with combination of the small engine and generator when cruising flight. This system can be described as the next-generation system that can dramatically increase the UAV flight times. Also, UAV systems require a large energy at the time of VTOL to be conducted for a short time. Therefore, this paper designs PMSG(Permanent Magnet Synchronous Generator) having a high specific power considering VTOL through the FEA.

Keywords: PMSG, VTOL, UAV, high specific power density

Procedia PDF Downloads 491

19625 Max-Entropy Feed-Forward Clustering Neural Network

Authors: Xiaohan Bookman, Xiaoyan Zhu

Abstract:

The outputs of non-linear feed-forward neural network are positive, which could be treated as probability when they are normalized to one. If we take Entropy-Based Principle into consideration, the outputs for each sample could be represented as the distribution of this sample for different clusters. Entropy-Based Principle is the principle with which we could estimate the unknown distribution under some limited conditions. As this paper defines two processes in Feed-Forward Neural Network, our limited condition is the abstracted features of samples which are worked out in the abstraction process. And the final outputs are the probability distribution for different clusters in the clustering process. As Entropy-Based Principle is considered into the feed-forward neural network, a clustering method is born. We have conducted some experiments on six open UCI data sets, comparing with a few baselines and applied purity as the measurement. The results illustrate that our method outperforms all the other baselines that are most popular clustering methods.

Keywords: feed-forward neural network, clustering, max-entropy principle, probabilistic models

Procedia PDF Downloads 411

19624 Gender Based Variability Time Series Complexity Analysis

Authors: Ramesh K. Sunkaria, Puneeta Marwaha

Abstract:

Nonlinear methods of heart rate variability (HRV) analysis are becoming more popular. It has been observed that complexity measures quantify the regularity and uncertainty of cardiovascular RR-interval time series. In the present work, SampEn has been evaluated in healthy Normal Sinus Rhythm (NSR) male and female subjects for different data lengths and tolerance level r. It is demonstrated that SampEn is small for higher values of tolerance r. Also SampEn value of healthy female group is higher than that of healthy male group for short data length and with increase in data length both groups overlap each other and it is difficult to distinguish them. The SampEn gives inaccurate results by assigning higher value to female group, because male subject have more complex HRV pattern than that of female subjects. Therefore, this traditional algorithm exhibits higher complexity for healthy female subjects than for healthy male subjects, which is misleading observation. This may be due to the fact that SampEn do not account for multiple time scales inherent in the physiologic time series and the hidden spatial and temporal fluctuations remains unexplored.

Keywords: heart rate variability, normal sinus rhythm group, RR interval time series, sample entropy

Procedia PDF Downloads 256

19623 Modified CUSUM Algorithm for Gradual Change Detection in a Time Series Data

Authors: Victoria Siriaki Jorry, I. S. Mbalawata, Hayong Shin

Abstract:

The main objective in a change detection problem is to develop algorithms for efficient detection of gradual and/or abrupt changes in the parameter distribution of a process or time series data. In this paper, we present a modified cumulative (MCUSUM) algorithm to detect the start and end of a time-varying linear drift in mean value of a time series data based on likelihood ratio test procedure. The design, implementation and performance of the proposed algorithm for a linear drift detection is evaluated and compared to the existing CUSUM algorithm using different performance measures. An approach to accurately approximate the threshold of the MCUSUM is also provided. Performance of the MCUSUM for gradual change-point detection is compared to that of standard cumulative sum (CUSUM) control chart designed for abrupt shift detection using Monte Carlo Simulations. In terms of the expected time for detection, the MCUSUM procedure is found to have a better performance than a standard CUSUM chart for detection of the gradual change in mean. The algorithm is then applied and tested to a randomly generated time series data with a gradual linear trend in mean to demonstrate its usefulness.

Keywords: average run length, CUSUM control chart, gradual change detection, likelihood ratio test

Procedia PDF Downloads 268

19622 Clustering of Extremes in Financial Returns: A Comparison between Developed and Emerging Markets

Authors: Sara Ali Alokley, Mansour Saleh Albarrak

Abstract:

This paper investigates the dependency or clustering of extremes in the financial returns data by estimating the extremal index value θ∈[0,1]. The smaller the value of θ the more clustering we have. Here we apply the method of Ferro and Segers (2003) to estimate the extremal index for a range of threshold values. We compare the dependency structure of extremes in the developed and emerging markets. We use the financial returns of the stock market index in the developed markets of US, UK, France, Germany and Japan and the emerging markets of Brazil, Russia, India, China and Saudi Arabia. We expect that more clustering occurs in the emerging markets. This study will help to understand the dependency structure of the financial returns data.

Keywords: clustring, extremes, returns, dependency, extermal index

Procedia PDF Downloads 375

19621 An Energy Efficient Clustering Approach for Underwater ‎Wireless Sensor Networks

Authors: Mohammad Reza Taherkhani‎

Abstract:

Wireless sensor networks that are used to monitor a special environment, are formed from a large number of sensor nodes. The role of these sensors is to sense special parameters from ambient and to make a connection. In these networks, the most important challenge is the management of energy usage. Clustering is one of the methods that are broadly used to face this challenge. In this paper, a distributed clustering protocol based on learning automata is proposed for underwater wireless sensor networks. The proposed algorithm that is called LA-Clustering forms clusters in the same energy level, based on the energy level of nodes and the connection radius regardless of size and the structure of sensor network. The proposed approach is simulated and is compared with some other protocols with considering some metrics such as network lifetime, number of alive nodes, and number of transmitted data. The simulation results demonstrate the efficiency of the proposed approach.

Keywords: underwater sensor networks, clustering, learning automata, energy consumption

Procedia PDF Downloads 336

19620 Factors Influencing the Acceptance of Y Series among the Residents in Three Southern Border Provinces of Thailand

Authors: Chetsada Noknoi

Abstract:

The acceptance of Y series refers to the willingness and enjoyment of watching Y series without feeling different from general series. This occurs when people watch Y series and derive happiness and entertainment from it. The viewing experience has the most significant impact on Y series acceptance. This research aims to 1) investigate the levels of acceptance of sexual diversity, image of Y series Actors, media exposure, and Y series acceptance among the residents in three southern border provinces of Thailand, and 2) examine how acceptance of sexual diversity, actor perceptions in Y series, and media exposure influence Y series acceptance in these provinces. The sample consisted of 322 participants from the three southern border provinces of Thailand. The research instrument used was a questionnaire, and data were analyzed using frequency, percentage, mean, standard deviation, and multiple regression analysis. The findings revealed that overall, acceptance of sexual diversity, Image of Y series Actors, and Y series acceptance among the residents in three southern border provinces of Thailand were at a high level, while media exposure was moderate overall. However, the two factors that had the most significant impact on Y series acceptance in these provinces, ranked from highest to lowest influence, were media exposure and acceptance of sexual diversity. Both of these factors had a positive effect on Y series acceptance among the residents in three southern border provinces of Thailand. Collectively, these factors accounted for 40.7% of the variance in Y series acceptance among the residents in three southern border provinces of Thailand.

Keywords: acceptance, acceptance of sexual diversity, image of Y series actors, media exposure, Y series

Procedia PDF Downloads 51

19619 Deep Graph Embeddings for the Analysis of Short Heartbeat Interval Time Series

Authors: Tamas Madl

Abstract:

Sudden cardiac death (SCD) constitutes a large proportion of cardiovascular mortalities, provides little advance warning, and the risk is difficult to recognize based on ubiquitous, low cost medical equipment such as the standard, 12-lead, ten second ECG. Autonomic abnormalities have been shown to be strongly predictive of SCD risk; yet current methods are not trivially applicable to the brevity and low temporal and electrical resolution of standard ECGs. Here, we build horizontal visibility graph representations of very short inter-beat interval time series, and perform unsuper- vised representation learning in order to convert these variable size objects into fixed-length vectors preserving similarity rela- tions. We show that such representations facilitate classification into healthy vs. at-risk patients on two different datasets, the Mul- tiparameter Intelligent Monitoring in Intensive Care II and the PhysioNet Sudden Cardiac Death Holter Database. Our results suggest that graph representation learning of heartbeat interval time series facilitates robust classification even in sequences as short as ten seconds.

Keywords: sudden cardiac death, heart rate variability, ECG analysis, time series classification

Procedia PDF Downloads 210

19618 Forecasting Cancers Cases in Algeria Using Double Exponential Smoothing Method

Authors: Messis A., Adjebli A., Ayeche R., Talbi M., Tighilet K., Louardiane M.

Abstract:

Cancers are the second cause of death worldwide. Prevalence and incidence of cancers is getting increased by aging and population growth. This study aims to predict and modeling the evolution of breast, Colorectal, Lung, Bladder and Prostate cancers over the period of 2014-2019. In this study, data were analyzed using time series analysis with double exponential smoothing method to forecast the future pattern. To describe and fit the appropriate models, Minitab statistical software version 17 was used. Between 2014 and 2019, the overall trend in the raw number of new cancer cases registered has been increasing over time; the change in observations over time has been increasing. Our forecast model is validated since we have good prediction for the period 2020 and data not available for 2021 and 2022. Time series analysis showed that the double exponential smoothing is an efficient tool to model the future data on the raw number of new cancer cases.

Keywords: cancer, time series, prediction, double exponential smoothing

Procedia PDF Downloads 58

19617 A Learning Automata Based Clustering Approach for Underwater ‎Sensor Networks to Reduce Energy Consumption

Authors: Motahareh Fadaei

Abstract:

Wireless sensor networks that are used to monitor a special environment, are formed from a large number of sensor nodes. The role of these sensors is to sense special parameters from ambient and to make connection. In these networks, the most important challenge is the management of energy usage. Clustering is one of the methods that are broadly used to face this challenge. In this paper, a distributed clustering protocol based on learning automata is proposed for underwater wireless sensor networks. The proposed algorithm that is called LA-Clustering forms clusters in the same energy level, based on the energy level of nodes and the connection radius regardless of size and the structure of sensor network. The proposed approach is simulated and is compared with some other protocols with considering some metrics such as network lifetime, number of alive nodes, and number of transmitted data. The simulation results demonstrate the efficiency of the proposed approach.

Keywords: clustering, energy consumption‎, learning automata, underwater sensor networks

Procedia PDF Downloads 294

19616 Content Analysis and Attitude of Thai Students towards Thai Series “Hormones: Season 2”

Authors: Siriporn Meenanan

Abstract:

The objective of this study is to investigate the attitude of Thai students towards the Thai series "Hormones the Series Season 2". This study was conducted in the quantitative research, and the questionnaires were used to collect data from 400 people of the sample group. Descriptive statistics were used in data analysis. The findings reveal that most participants have positive comments regarding the series. They strongly agreed that the series reflects on the way of life and problems of teenagers in Thailand. Hence, the participants believe that if adults have a chance to watch the series, they will have the better understanding of the teenagers. In addition, the participants also agreed that the contents of the play are appropriate and satisfiable as the contents of “Hormones the Series Season 2” will raise awareness among the teens and use it as a guide to prevent problems that might happen during their teenage life.

Keywords: content analysis, attitude, Thai series, hormones the Series

Procedia PDF Downloads 207

19615 Knowledge Representation Based on Interval Type-2 CFCM Clustering

Authors: Lee Myung-Won, Kwak Keun-Chang

Abstract:

This paper is concerned with knowledge representation and extraction of fuzzy if-then rules using Interval Type-2 Context-based Fuzzy C-Means clustering (IT2-CFCM) with the aid of fuzzy granulation. This proposed clustering algorithm is based on information granulation in the form of IT2 based Fuzzy C-Means (IT2-FCM) clustering and estimates the cluster centers by preserving the homogeneity between the clustered patterns from the IT2 contexts produced in the output space. Furthermore, we can obtain the automatic knowledge representation in the design of Radial Basis Function Networks (RBFN), Linguistic Model (LM), and Adaptive Neuro-Fuzzy Networks (ANFN) from the numerical input-output data pairs. We shall focus on a design of ANFN in this paper. The experimental results on an estimation problem of energy performance reveal that the proposed method showed a good knowledge representation and performance in comparison with the previous works.

Keywords: IT2-FCM, IT2-CFCM, context-based fuzzy clustering, adaptive neuro-fuzzy network, knowledge representation

Procedia PDF Downloads 295

19614 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) for classification of Diabetes disease dataset has been used. The aims of this study are to reduce the variance within attributes of diabetes dataset and to improve the classification accuracy of classifier algorithm transforming from non-linear separable datasets to linearly separable datasets. Pima Indians Diabetes dataset has two classes including normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of K-means clustering method and is one of most used clustering methods in data mining and machine learning applications. In this study, as the first stage, fuzzy C-means clustering process has been used for finding the centers of attributes in Pima Indians diabetes dataset and then weighted the dataset according to the ratios of the means of attributes to centers of theirs. Secondly, after weighting process, the classifier algorithms including support vector machine (SVM) and k-NN (k- nearest neighbor) classifiers have been used for classifying weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of Pima Indians diabetes dataset.

Keywords: fuzzy C-means clustering, fuzzy C-means clustering based attribute weighting, Pima Indians diabetes, SVM

Procedia PDF Downloads 387

19613 Exploring Time-Series Phosphoproteomic Datasets in the Context of Network Models

Authors: Sandeep Kaur, Jenny Vuong, Marcel Julliard, Sean O'Donoghue

Abstract:

Time-series data are useful for modelling as they can enable model-evaluation. However, when reconstructing models from phosphoproteomic data, often non-exact methods are utilised, as the knowledge regarding the network structure, such as, which kinases and phosphatases lead to the observed phosphorylation state, is incomplete. Thus, such reactions are often hypothesised, which gives rise to uncertainty. Here, we propose a framework, implemented via a web-based tool (as an extension to Minardo), which given time-series phosphoproteomic datasets, can generate κ models. The incompleteness and uncertainty in the generated model and reactions are clearly presented to the user via the visual method. Furthermore, we demonstrate, via a toy EGF signalling model, the use of algorithmic verification to verify κ models. Manually formulated requirements were evaluated with regards to the model, leading to the highlighting of the nodes causing unsatisfiability (i.e. error causing nodes). We aim to integrate such methods into our web-based tool and demonstrate how the identified erroneous nodes can be presented to the user via the visual method. Thus, in this research we present a framework, to enable a user to explore phosphorylation proteomic time-series data in the context of models. The observer can visualise which reactions in the model are highly uncertain, and which nodes cause incorrect simulation outputs. A tool such as this enables an end-user to determine the empirical analysis to perform, to reduce uncertainty in the presented model - thus enabling a better understanding of the underlying system.

Keywords: κ-models, model verification, time-series phosphoproteomic datasets, uncertainty and error visualisation

Procedia PDF Downloads 231

19612 Method of Cluster Based Cross-Domain Knowledge Acquisition for Biologically Inspired Design

Authors: Shen Jian, Hu Jie, Ma Jin, Peng Ying Hong, Fang Yi, Liu Wen Hai

Abstract:

Biologically inspired design inspires inventions and new technologies in the field of engineering by mimicking functions, principles, and structures in the biological domain. To deal with the obstacles of cross-domain knowledge acquisition in the existing biologically inspired design process, functional semantic clustering based on functional feature semantic correlation and environmental constraint clustering composition based on environmental characteristic constraining adaptability are proposed. A knowledge cell clustering algorithm and the corresponding prototype system is developed. Finally, the effectiveness of the method is verified by the visual prosthetic device design.

Keywords: knowledge clustering, knowledge acquisition, knowledge based engineering, knowledge cell, biologically inspired design

Procedia PDF Downloads 406

19611 Jacobson Semisimple Skew Inverse Laurent Series Rings

Authors: Ahmad Moussavi

Abstract:

In this paper, we are concerned with the Jacobson semisimple skew inverse Laurent series rings R((x−1; α, δ)) and the skew Laurent power series rings R[[x, x−1; α]], where R is an associative ring equipped with an automorphism α and an α-derivation δ. Examples to illustrate and delimit the theory are provided.

Keywords: skew polynomial rings, Laurent series, skew inverse Laurent series rings

Procedia PDF Downloads 142

19610 New Machine Learning Optimization Approach Based on Input Variables Disposition Applied for Time Series Prediction

Authors: Hervice Roméo Fogno Fotsoa, Germaine Djuidje Kenmoe, Claude Vidal Aloyem Kazé

Abstract:

One of the main applications of machine learning is the prediction of time series. But a more accurate prediction requires a more optimal model of machine learning. Several optimization techniques have been developed, but without considering the input variables disposition of the system. Thus, this work aims to present a new machine learning architecture optimization technique based on their optimal input variables disposition. The validations are done on the prediction of wind time series, using data collected in Cameroon. The number of possible dispositions with four input variables is determined, i.e., twenty-four. Each of the dispositions is used to perform the prediction, with the main criteria being the training and prediction performances. The results obtained from a static architecture and a dynamic architecture of neural networks have shown that these performances are a function of the input variable's disposition, and this is in a different way from the architectures. This analysis revealed that it is necessary to take into account the input variable's disposition for the development of a more optimal neural network model. Thus, a new neural network training algorithm is proposed by introducing the search for the optimal input variables disposition in the traditional back-propagation algorithm. The results of the application of this new optimization approach on the two single neural network architectures are compared with the previously obtained results step by step. Moreover, this proposed approach is validated in a collaborative optimization method with a single objective optimization technique, i.e., genetic algorithm back-propagation neural networks. From these comparisons, it is concluded that each proposed model outperforms its traditional model in terms of training and prediction performance of time series. Thus the proposed optimization approach can be useful in improving the accuracy of time series forecasts. This proves that the proposed optimization approach can be useful in improving the accuracy of time series prediction based on machine learning.

Keywords: input variable disposition, machine learning, optimization, performance, time series prediction

Procedia PDF Downloads 70

19609 Pattern Recognition Using Feature Based Die-Map Clustering in the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.

Keywords: die-map clustering, feature extraction, pattern recognition, semiconductor manufacturing process

Procedia PDF Downloads 378

19608 Times2D: A Time-Frequency Method for Time Series Forecasting

Authors: Reza Nematirad, Anil Pahwa, Balasubramaniam Natarajan

Abstract:

Time series data consist of successive data points collected over a period of time. Accurate prediction of future values is essential for informed decision-making in several real-world applications, including electricity load demand forecasting, lifetime estimation of industrial machinery, traffic planning, weather prediction, and the stock market. Due to their critical relevance and wide application, there has been considerable interest in time series forecasting in recent years. However, the proliferation of sensors and IoT devices, real-time monitoring systems, and high-frequency trading data introduce significant intricate temporal variations, rapid changes, noise, and non-linearities, making time series forecasting more challenging. Classical methods such as Autoregressive integrated moving average (ARIMA) and Exponential Smoothing aim to extract pre-defined temporal variations, such as trends and seasonality. While these methods are effective for capturing well-defined seasonal patterns and trends, they often struggle with more complex, non-linear patterns present in real-world time series data. In recent years, deep learning has made significant contributions to time series forecasting. Recurrent Neural Networks (RNNs) and their variants, such as Long short-term memory (LSTMs) and Gated Recurrent Units (GRUs), have been widely adopted for modeling sequential data. However, they often suffer from the locality, making it difficult to capture local trends and rapid fluctuations. Convolutional Neural Networks (CNNs), particularly Temporal Convolutional Networks (TCNs), leverage convolutional layers to capture temporal dependencies by applying convolutional filters along the temporal dimension. Despite their advantages, TCNs struggle with capturing relationships between distant time points due to the locality of one-dimensional convolution kernels. Transformers have revolutionized time series forecasting with their powerful attention mechanisms, effectively capturing long-term dependencies and relationships between distant time points. However, the attention mechanism may struggle to discern dependencies directly from scattered time points due to intricate temporal patterns. Lastly, Multi-Layer Perceptrons (MLPs) have also been employed, with models like N-BEATS and LightTS demonstrating success. Despite this, MLPs often face high volatility and computational complexity challenges in long-horizon forecasting. To address intricate temporal variations in time series data, this study introduces Times2D, a novel framework that parallelly integrates 2D spectrogram and derivative heatmap techniques. The spectrogram focuses on the frequency domain, capturing periodicity, while the derivative patterns emphasize the time domain, highlighting sharp fluctuations and turning points. This 2D transformation enables the utilization of powerful computer vision techniques to capture various intricate temporal variations. To evaluate the performance of Times2D, extensive experiments were conducted on standard time series datasets and compared with various state-of-the-art algorithms, including DLinear (2023), TimesNet (2023), Non-stationary Transformer (2022), PatchTST (2023), N-HiTS (2023), Crossformer (2023), MICN (2023), LightTS (2022), FEDformer (2022), FiLM (2022), SCINet (2022a), Autoformer (2021), and Informer (2021) under the same modeling conditions. The initial results demonstrated that Times2D achieves consistent state-of-the-art performance in both short-term and long-term forecasting tasks. Furthermore, the generality of the Times2D framework allows it to be applied to various tasks such as time series imputation, clustering, classification, and anomaly detection, offering potential benefits in any domain that involves sequential data analysis.

Keywords: derivative patterns, spectrogram, time series forecasting, times2D, 2D representation

Procedia PDF Downloads 4

19607 Forecasting 24-Hour Ahead Electricity Load Using Time Series Models

Authors: Ramin Vafadary, Maryam Khanbaghi

Abstract:

Forecasting electricity load is important for various purposes like planning, operation, and control. Forecasts can save operating and maintenance costs, increase the reliability of power supply and delivery systems, and correct decisions for future development. This paper compares various time series methods to forecast 24 hours ahead of electricity load. The methods considered are the Holt-Winters smoothing, SARIMA Modeling, LSTM Network, Fbprophet, and Tensorflow probability. The performance of each method is evaluated by using the forecasting accuracy criteria, namely, the mean absolute error and root mean square error. The National Renewable Energy Laboratory (NREL) residential energy consumption data is used to train the models. The results of this study show that the SARIMA model is superior to the others for 24 hours ahead forecasts. Furthermore, a Bagging technique is used to make the predictions more robust. The obtained results show that by Bagging multiple time-series forecasts, we can improve the robustness of the models for 24 hours ahead of electricity load forecasting.

Keywords: bagging, Fbprophet, Holt-Winters, LSTM, load forecast, SARIMA, TensorFlow probability, time series

Procedia PDF Downloads 67

19606 Forecasting of COVID-19 Cases, Hospitalization Admissions, and Death Cases Based on Wastewater Sars-COV-2 Surveillance Using Copula Time Series Model

Authors: Hueiwang Anna Jeng, Norou Diawara, Nancy Welch, Cynthia Jackson, Rekha Singh, Kyle Curtis, Raul Gonzalez, David Jurgens, Sasanka Adikari

Abstract:

Modeling effort is needed to predict the COVID-19 trends for developing management strategies and adaptation measures. The objective of this study was to assess whether SARS-CoV-2 viral load in wastewater could serve as a predictor for forecasting COVID-19 cases, hospitalization cases, and death cases using copula-based time series modeling. SARS-CoV-2 RNA load in raw wastewater in Chesapeake VA was measured using the RT-qPCR method. Gaussian copula time series marginal regression model, incorporating an autoregressive moving average model and the copula function, served as a forecasting model. COVID-19 cases were correlated with wastewater viral load, hospitalization cases, and death cases. The forecasted trend of COVID-19 cases closely paralleled one of the reported cases, with over 90% of the forecasted COVID-19 cases falling within the 99% confidence interval of the reported cases. Wastewater SARS-CoV-2 viral load could serve as a predictor for COVID-19 cases and hospitalization cases.

Keywords: COVID-19, modeling, time series, copula function

Procedia PDF Downloads 43

19605 Hierarchical Cluster Analysis of Raw Milk Samples Obtained from Organic and Conventional Dairy Farming in Autonomous Province of Vojvodina, Serbia

Authors: Lidija Jevrić, Denis Kučević, Sanja Podunavac-Kuzmanović, Strahinja Kovačević, Milica Karadžić

Abstract:

In the present study, the Hierarchical Cluster Analysis (HCA) was applied in order to determine the differences between the milk samples originating from a conventional dairy farm (CF) and an organic dairy farm (OF) in AP Vojvodina, Republic of Serbia. The clustering was based on the basis of the average values of saturated fatty acids (SFA) content and unsaturated fatty acids (UFA) content obtained for every season. Therefore, the HCA included the annual SFA and UFA content values. The clustering procedure was carried out on the basis of Euclidean distances and Single linkage algorithm. The obtained dendrograms indicated that the clustering of UFA in OF was much more uniform compared to clustering of UFA in CF. In OF, spring stands out from the other months of the year. The same case can be noticed for CF, where winter is separated from the other months. The results could be expected because the composition of fatty acids content is greatly influenced by the season and nutrition of dairy cows during the year.

Keywords: chemometrics, clustering, food engineering, milk quality

Procedia PDF Downloads 254

19604 Comparison of Different Machine Learning Models for Time-Series Based Load Forecasting of Electric Vehicle Charging Stations

Authors: H. J. Joshi, Satyajeet Patil, Parth Dandavate, Mihir Kulkarni, Harshita Agrawal

Abstract:

As the world looks towards a sustainable future, electric vehicles have become increasingly popular. Millions worldwide are looking to switch to Electric cars over the previously favored combustion engine-powered cars. This demand has seen an increase in Electric Vehicle Charging Stations. The big challenge is that the randomness of electrical energy makes it tough for these charging stations to provide an adequate amount of energy over a specific amount of time. Thus, it has become increasingly crucial to model these patterns and forecast the energy needs of power stations. This paper aims to analyze how different machine learning models perform on Electric Vehicle charging time-series data. The data set consists of authentic Electric Vehicle Data from the Netherlands. It has an overview of ten thousand transactions from public stations operated by EVnetNL.

Keywords: forecasting, smart grid, electric vehicle load forecasting, machine learning, time series forecasting

Procedia PDF Downloads 82

19603 Application of Seasonal Autoregressive Integrated Moving Average Model for Forecasting Monthly Flows in Waterval River, South Africa

Authors: Kassahun Birhanu Tadesse, Megersa Olumana Dinka

Abstract:

Reliable future river flow information is basic for planning and management of any river systems. For data scarce river system having only a river flow records like the Waterval River, a univariate time series models are appropriate for river flow forecasting. In this study, a univariate Seasonal Autoregressive Integrated Moving Average (SARIMA) model was applied for forecasting Waterval River flow using GRETL statistical software. Mean monthly river flows from 1960 to 2016 were used for modeling. Different unit root tests and Mann-Kendall trend analysis were performed to test the stationarity of the observed flow time series. The time series was differenced to remove the seasonality. Using the correlogram of seasonally differenced time series, different SARIMA models were identified, their parameters were estimated, and diagnostic check-up of model forecasts was performed using white noise and heteroscedasticity tests. Finally, based on minimum Akaike Information (AIc) and Hannan-Quinn (HQc) criteria, SARIMA (3, 0, 2) x (3, 1, 3)12 was selected as the best model for Waterval River flow forecasting. Therefore, this model can be used to generate future river information for water resources development and management in Waterval River system. SARIMA model can also be used for forecasting other similar univariate time series with seasonality characteristics.

Keywords: heteroscedasticity, stationarity test, trend analysis, validation, white noise

Procedia PDF Downloads 181

19602 Time Series Simulation by Conditional Generative Adversarial Net

Authors: Rao Fu, Jie Chen, Shutian Zeng, Yiping Zhuang, Agus Sudjianto

Abstract:

Generative Adversarial Net (GAN) has proved to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions include both categorical and continuous variables with different auxiliary information. Our simulation studies show that CGAN has the capability to learn different types of normal and heavy-tailed distributions, as well as dependent structures of different time series. It also has the capability to generate conditional predictive distributions consistent with training data distributions. We also provide an in-depth discussion on the rationale behind GAN and the neural networks as hierarchical splines to establish a clear connection with existing statistical methods of distribution generation. In practice, CGAN has a wide range of applications in market risk and counterparty risk analysis: it can be applied to learn historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES), and it can also predict the movement of the market risk factors. We present a real data analysis including a backtesting to demonstrate that CGAN can outperform Historical Simulation (HS), a popular method in market risk analysis to calculate VaR. CGAN can also be applied in economic time series modeling and forecasting. In this regard, we have included an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN at the end of the paper.

Keywords: conditional generative adversarial net, market and credit risk management, neural network, time series

Procedia PDF Downloads 114