Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 22

Search results for: LSTM

22 Crude Oil Price Prediction Using LSTM Networks

Authors: Varun Gupta, Ankit Pandey


Crude oil market is an immensely complex and dynamic environment and thus the task of predicting changes in such an environment becomes challenging with regards to its accuracy. A number of approaches have been adopted to take on that challenge and machine learning has been at the core in many of them. There are plenty of examples of algorithms based on machine learning yielding satisfactory results for such type of prediction. In this paper, we have tried to predict crude oil prices using Long Short-Term Memory (LSTM) based recurrent neural networks. We have tried to experiment with different types of models using different epochs, lookbacks and other tuning methods. The results obtained are promising and presented a reasonably accurate prediction for the price of crude oil in near future.

Keywords: Crude oil price prediction, deep learning, LSTM, recurrent neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3600
21 An Auxiliary Technique for Coronary Heart Disease Prediction by Analyzing ECG Based on ResNet and Bi-LSTM

Authors: Yang Zhang, Jian He


Heart disease is one of the leading causes of death in the world, and coronary heart disease (CHD) is one of the major heart diseases. Electrocardiogram (ECG) is widely used in the detection of heart diseases, but the traditional manual method for CHD prediction by analyzing ECG requires lots of professional knowledge for doctors. This paper presents sliding window and continuous wavelet transform (CWT) to transform ECG signals into images, and then ResNet and Bi-LSTM are introduced to build the ECG feature extraction network (namely ECGNet). At last, an auxiliary system for CHD prediction was developed based on modified ResNet18 and Bi-LSTM, and the public ECG dataset of CHD from MIMIC-3 was used to train and test the system. The experimental results show that the accuracy of the method is 83%, and the F1-score is 83%. Compared with the available methods for CHD prediction based on ECG, such as kNN, decision tree, VGGNet, etc., this method not only improves the prediction accuracy but also could avoid the degradation phenomenon of the deep learning network.

Keywords: Bi-LSTM, CHD, coronary heart disease, ECG, electrocardiogram, ResNet, sliding window.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 124
20 Prediction on Housing Price Based on Deep Learning

Authors: Li Yu, Chenlu Jiao, Hongrun Xin, Yan Wang, Kaiyang Wang


In order to study the impact of various factors on the housing price, we propose to build different prediction models based on deep learning to determine the existing data of the real estate in order to more accurately predict the housing price or its changing trend in the future. Considering that the factors which affect the housing price vary widely, the proposed prediction models include two categories. The first one is based on multiple characteristic factors of the real estate. We built Convolution Neural Network (CNN) prediction model and Long Short-Term Memory (LSTM) neural network prediction model based on deep learning, and logical regression model was implemented to make a comparison between these three models. Another prediction model is time series model. Based on deep learning, we proposed an LSTM-1 model purely regard to time series, then implementing and comparing the LSTM model and the Auto-Regressive and Moving Average (ARMA) model. In this paper, comprehensive study of the second-hand housing price in Beijing has been conducted from three aspects: crawling and analyzing, housing price predicting, and the result comparing. Ultimately the best model program was produced, which is of great significance to evaluation and prediction of the housing price in the real estate industry.

Keywords: Deep learning, convolutional neural network, LSTM, housing prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4852
19 Deep Learning for Renewable Power Forecasting: An Approach Using LSTM Neural Networks

Authors: Fazıl Gökgöz, Fahrettin Filiz


Load forecasting has become crucial in recent years and become popular in forecasting area. Many different power forecasting models have been tried out for this purpose. Electricity load forecasting is necessary for energy policies, healthy and reliable grid systems. Effective power forecasting of renewable energy load leads the decision makers to minimize the costs of electric utilities and power plants. Forecasting tools are required that can be used to predict how much renewable energy can be utilized. The purpose of this study is to explore the effectiveness of LSTM-based neural networks for estimating renewable energy loads. In this study, we present models for predicting renewable energy loads based on deep neural networks, especially the Long Term Memory (LSTM) algorithms. Deep learning allows multiple layers of models to learn representation of data. LSTM algorithms are able to store information for long periods of time. Deep learning models have recently been used to forecast the renewable energy sources such as predicting wind and solar energy power. Historical load and weather information represent the most important variables for the inputs within the power forecasting models. The dataset contained power consumption measurements are gathered between January 2016 and December 2017 with one-hour resolution. Models use publicly available data from the Turkish Renewable Energy Resources Support Mechanism. Forecasting studies have been carried out with these data via deep neural networks approach including LSTM technique for Turkish electricity markets. 432 different models are created by changing layers cell count and dropout. The adaptive moment estimation (ADAM) algorithm is used for training as a gradient-based optimizer instead of SGD (stochastic gradient). ADAM performed better than SGD in terms of faster convergence and lower error rates. Models performance is compared according to MAE (Mean Absolute Error) and MSE (Mean Squared Error). Best five MAE results out of 432 tested models are 0.66, 0.74, 0.85 and 1.09. The forecasting performance of the proposed LSTM models gives successful results compared to literature searches.

Keywords: Deep learning, long-short-term memory, energy, renewable energy load forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1516
18 Optimizing Forecasting for Indonesia's Coal and Palm Oil Exports: A Comparative Analysis of ARIMA, ANN, and LSTM Methods

Authors: Mochammad Dewo, Sumarsono Sudarto


The Exponential Triple Smoothing Algorithm approach nowadays, which is used to anticipate the export value of Indonesia's two major commodities, coal and palm oil, has a Mean Percentage Absolute Error (MAPE) value of 30-50%, which may be considered as a "reasonable" forecasting mistake. Forecasting errors of more than 30% shall have a domino effect on industrial output, as extra production adds to raw material, manufacturing and storage expenses. Whereas, reaching an "excellent" classification with an error value of less than 10% will provide new investors and exporters with confidence in the commercial development of related sectors. Industrial growth will bring out a positive impact on economic development. It can be applied for other commodities if the forecast error is less than 10%. The purpose of this project is to create a forecasting technique that can produce precise forecasting results with an error of less than 10%. This research analyzes forecasting methods such as ARIMA (Autoregressive Integrated Moving Average), ANN (Artificial Neural Network) and LSTM (Long-Short Term Memory). By providing a MAPE of 1%, this study reveals that ANN is the most successful strategy for forecasting coal and palm oil commodities in Indonesia.

Keywords: ANN, Artificial Neural Network, ARIMA, Autoregressive Integrated Moving Average, export value, forecast, LSTM, Long Short Term Memory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 125
17 Analysis of Multilayer Neural Network Modeling and Long Short-Term Memory

Authors: Danilo López, Nelson Vera, Luis Pedraza


This paper analyzes fundamental ideas and concepts related to neural networks, which provide the reader a theoretical explanation of Long Short-Term Memory (LSTM) networks operation classified as Deep Learning Systems, and to explicitly present the mathematical development of Backward Pass equations of the LSTM network model. This mathematical modeling associated with software development will provide the necessary tools to develop an intelligent system capable of predicting the behavior of licensed users in wireless cognitive radio networks.

Keywords: Neural networks, multilayer perceptron, long short-term memory, recurrent neuronal network, mathematical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1501
16 Identification of Vessel Class with LSTM using Kinematic Features in Maritime Traffic Control

Authors: Davide Fuscà, Kanan Rahimli, Roberto Leuzzi


Prevent abuse and illegal activities in a given area of the sea is a very difficult and expensive task. Artificial intelligence offers the possibility to implement new methods to identify the vessel class type from the kinematic features of the vessel itself. The task strictly depends on the quality of the data. This paper explores the application of a deep Long Short-Term Memory model by using AIS flow only with a relatively low quality. The proposed model reaches high accuracy on detecting nine vessel classes representing the most common vessel types in the Ionian-Adriatic Sea. The model has been applied during the Adriatic-Ionian trial period of the international EU ANDROMEDA H2020 project to identify vessels performing behaviours far from the expected one, depending on the declared type.

Keywords: maritime surveillance, artificial intelligence, behaviour analysis, LSTM

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1201
15 Breast Cancer Prediction Using Score-Level Fusion of Machine Learning and Deep Learning Models

Authors: [email protected]


Breast cancer is one of the most common types in women. Early prediction of breast cancer helps physicians detect cancer in its early stages. Big cancer data need a very powerful tool to analyze and extract predictions. Machine learning and deep learning are two of the most efficient tools for predicting cancer based on textual data. In this study, we developed a fusion model of two machine learning and deep learning models. To obtain the final prediction, Long-Short Term Memory (LSTM), ensemble learning with hyper parameters optimization, and score-level fusion is used. Experiments are done on the Breast Cancer Surveillance Consortium (BCSC) dataset after balancing and grouping the class categories. Five different training scenarios are used, and the tests show that the designed fusion model improved the performance by 3.3% compared to the individual models.

Keywords: Machine learning, Deep learning, cancer prediction, breast cancer, LSTM, Score-Level Fusion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 275
14 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra. I. Cristea


The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and often ignored by learners. Especially in the booming realm of MOOC Massive Online Learning platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics, by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only – which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structures. In this research, rather than considering sentences as plain sequences, we hypothesise that higher semantic - and syntactic level sentence processing based on linguistics will render a richer representation. We thus evaluate, the traditional LSTM versus other bleeding edge models, which take into account syntactic structure, such as tree-structured LSTM, Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on Our MOOC dataset, which is the most performant one comparing with a public dataset on sentiment analysis that is further used as a cross-examining for the models' results.

Keywords: Deep learning, data mining, gender predication, MOOCs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1275
13 Copper Price Prediction Model for Various Economic Situations

Authors: Haidy S. Ghali, Engy Serag, A. Samer Ezeldin


Copper is an essential raw material used in the construction industry. During 2021 and the first half of 2022, the global market suffered from a significant fluctuation in copper raw material prices due to the aftermath of both the COVID-19 pandemic and the Russia-Ukraine war which exposed its consumers to an unexpected financial risk. Thereto, this paper aims to develop two hybrid price prediction models using artificial neural network and long short-term memory (ANN-LSTM), by Python, that can forecast the average monthly copper prices, traded in the London Metal Exchange; the first model is a multivariate model that forecasts the copper price of the next 1-month and the second is a univariate model that predicts the copper prices of the upcoming three months. Historical data of average monthly London Metal Exchange copper prices are collected from January 2009 till July 2022 and potential external factors are identified and employed in the multivariate model. These factors lie under three main categories: energy prices, and economic indicators of the three major exporting countries of copper depending on the data availability. Before developing the LSTM models, the collected external parameters are analyzed with respect to the copper prices using correlation, and multicollinearity tests in R software; then, the parameters are further screened to select the parameters that influence the copper prices. Then, the two LSTM models are developed, and the dataset is divided into training, validation, and testing sets. The results show that the performance of the 3-month prediction model is better than the 1-month prediction model; but still, both models can act as predicting tools for diverse economic situations.

Keywords: Copper prices, prediction model, neural network, time series forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 74
12 Long Short-Term Memory Based Model for Modeling Nicotine Consumption Using an Electronic Cigarette and Internet of Things Devices

Authors: Hamdi Amroun, Yacine Benziani, Mehdi Ammi


In this paper, we want to determine whether the accurate prediction of nicotine concentration can be obtained by using a network of smart objects and an e-cigarette. The approach consists of, first, the recognition of factors influencing smoking cessation such as physical activity recognition and participant’s behaviors (using both smartphone and smartwatch), then the prediction of the configuration of the e-cigarette (in terms of nicotine concentration, power, and resistance of e-cigarette). The study uses a network of commonly connected objects; a smartwatch, a smartphone, and an e-cigarette transported by the participants during an uncontrolled experiment. The data obtained from sensors carried in the three devices were trained by a Long short-term memory algorithm (LSTM). Results show that our LSTM-based model allows predicting the configuration of the e-cigarette in terms of nicotine concentration, power, and resistance with a root mean square error percentage of 12.9%, 9.15%, and 11.84%, respectively. This study can help to better control consumption of nicotine and offer an intelligent configuration of the e-cigarette to users.

Keywords: Iot, activity recognition, automatic classification, unconstrained environment, deep neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1077
11 Forecasting 24-Hour Ahead Electricity Load Using Time Series Models

Authors: Ramin Vafadary, Maryam Khanbaghi


Forecasting electricity load is important for various purposes like planning, operation and control. Forecasts can save operating and maintenance costs, increase the reliability of power supply and delivery systems, and correct decisions for future development. This paper compares various time series methods to forecast 24 hours ahead of electricity load. The methods considered are the Holt-Winters smoothing, SARIMA Modeling, LSTM Network, Fbprophet and Tensorflow probability. The performance of each method is evaluated by using the forecasting accuracy criteria namely, the Mean Absolute Error and Root Mean Square Error. The National Renewable Energy Laboratory (NREL) residential energy consumption data are used to train the models. The results of this study show that SARIMA model is superior to the others for 24 hours ahead forecasts. Furthermore, a Bagging technique is used to make the predictions more robust. The obtained results show that by Bagging multiple time-series forecasts we can improve the robustness of the models for 24 hour ahead electricity load forecasting.

Keywords: Bagging, Fbprophet, Holt-Winters, LSTM, Load Forecast, SARIMA, tensorflow probability, time series.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 378
10 A Machine Learning Approach for Anomaly Detection in Environmental IoT-Driven Wastewater Purification Systems

Authors: Giovanni Cicceri, Roberta Maisano, Nathalie Morey, Salvatore Distefano


The main goal of this paper is to present a solution for a water purification system based on an Environmental Internet of Things (EIoT) platform to monitor and control water quality and machine learning (ML) models to support decision making and speed up the processes of purification of water. A real case study has been implemented by deploying an EIoT platform and a network of devices, called Gramb meters and belonging to the Gramb project, on wastewater purification systems located in Calabria, south of Italy. The data thus collected are used to control the wastewater quality, detect anomalies and predict the behaviour of the purification system. To this extent, three different statistical and machine learning models have been adopted and thus compared: Autoregressive Integrated Moving Average (ARIMA), Long Short Term Memory (LSTM) autoencoder, and Facebook Prophet (FP). The results demonstrated that the ML solution (LSTM) out-perform classical statistical approaches (ARIMA, FP), in terms of both accuracy, efficiency and effectiveness in monitoring and controlling the wastewater purification processes.

Keywords: EIoT, machine learning, anomaly detection, environment monitoring.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 913
9 ECG-Based Heartbeat Classification Using Convolutional Neural Networks

Authors: Jacqueline R. T. Alipo-on, Francesca I. F. Escobar, Myles J. T. Tan, Hezerul Abdul Karim, Nouar AlDahoul


Electrocardiogram (ECG) signal analysis and processing are crucial in the diagnosis of cardiovascular diseases which are considered as one of the leading causes of mortality worldwide. However, the traditional rule-based analysis of large volumes of ECG data is time-consuming, labor-intensive, and prone to human errors. With the advancement of the programming paradigm, algorithms such as machine learning have been increasingly used to perform an analysis on the ECG signals. In this paper, various deep learning algorithms were adapted to classify five classes of heart beat types. The dataset used in this work is the synthetic MIT-Beth Israel Hospital (MIT-BIH) Arrhythmia dataset produced from generative adversarial networks (GANs). Various deep learning models such as ResNet-50 convolutional neural network (CNN), 1-D CNN, and long short-term memory (LSTM) were evaluated and compared. ResNet-50 was found to outperform other models in terms of recall and F1 score using a five-fold average score of 98.88% and 98.87%, respectively. 1-D CNN, on the other hand, was found to have the highest average precision of 98.93%.

Keywords: Heartbeat classification, convolutional neural network, electrocardiogram signals, ECG signals, generative adversarial networks, long short-term memory, LSTM, ResNet-50.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 90
8 A Sentence-to-Sentence Relation Network for Recognizing Textual Entailment

Authors: Isaac K. E. Ampomah, Seong-Bae Park, Sang-Jo Lee


Over the past decade, there have been promising developments in Natural Language Processing (NLP) with several investigations of approaches focusing on Recognizing Textual Entailment (RTE). These models include models based on lexical similarities, models based on formal reasoning, and most recently deep neural models. In this paper, we present a sentence encoding model that exploits the sentence-to-sentence relation information for RTE. In terms of sentence modeling, Convolutional neural network (CNN) and recurrent neural networks (RNNs) adopt different approaches. RNNs are known to be well suited for sequence modeling, whilst CNN is suited for the extraction of n-gram features through the filters and can learn ranges of relations via the pooling mechanism. We combine the strength of RNN and CNN as stated above to present a unified model for the RTE task. Our model basically combines relation vectors computed from the phrasal representation of each sentence and final encoded sentence representations. Firstly, we pass each sentence through a convolutional layer to extract a sequence of higher-level phrase representation for each sentence from which the first relation vector is computed. Secondly, the phrasal representation of each sentence from the convolutional layer is fed into a Bidirectional Long Short Term Memory (Bi-LSTM) to obtain the final sentence representations from which a second relation vector is computed. The relations vectors are combined and then used in then used in the same fashion as attention mechanism over the Bi-LSTM outputs to yield the final sentence representations for the classification. Experiment on the Stanford Natural Language Inference (SNLI) corpus suggests that this is a promising technique for RTE.

Keywords: Deep neural models, natural language inference, recognizing textual entailment, sentence-to-sentence relation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1394
7 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan


Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of big data technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centres or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through VADER and RoBERTa model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and Term Frequency – Inverse Document Frequency (TFIDF) Vectorization and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide if the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: Counter vectorization, Convolutional Neural Network, Crawler, data technology, Long Short-Term Memory, LSTM, Web Scraping, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 76
6 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review

Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha


Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision making has not been far-fetched. Proper classification of these textual information in a given context has also been very difficult. As a result, a systematic review was conducted from previous literature on sentiment classification and AI-based techniques. The study was done in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that could correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy using the knowledge gain from the evaluation of different artificial intelligence techniques reviewed. The study evaluated over 250 articles from digital sources like ACM digital library, Google Scholar, and IEEE Xplore; and whittled down the number of research to 52 articles. Findings revealed that deep learning approaches such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Bidirectional Encoder Representations from Transformer (BERT), and Long Short-Term Memory (LSTM) outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also required to develop a robust sentiment classifier. Results also revealed that data can be obtained from places like Twitter, movie reviews, Kaggle, Stanford Sentiment Treebank (SST), and SemEval Task4 based on the required domain. The hybrid deep learning techniques like CNN+LSTM, CNN+ Gated Recurrent Unit (GRU), CNN+BERT outperformed single deep learning techniques and machine learning techniques. Python programming language outperformed Java programming language in terms of development simplicity and AI-based library functionalities. Finally, the study recommended the findings obtained for building robust sentiment classifier in the future.

Keywords: Artificial Intelligence, Natural Language Processing, Sentiment Analysis, Social Network, Text.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 467
5 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks

Authors: Jiajun Wang, Xiaoge Li


The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods mainly focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence of words in different ranges in the sentence, resulting in deviation when assigning relationship weight to other words other than aspect words. To solve these problems, we propose an aspect-level sentiment analysis model that combines a multi-channel convolutional network and graph convolutional network (GCN). Firstly, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and self-attention mechanism. Besides, a multi-channel convolutional network is used to extract the features of words in different ranges. Finally, a convolutional graph network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models, which shows that our model is better and more effective.

Keywords: Aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 377
4 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin


Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: Anomaly detection, autoencoder, data centers, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 642
3 Traffic Forecasting for Open Radio Access Networks Virtualized Network Functions in 5G Networks

Authors: Khalid Ali, Manar Jammal


In order to meet the stringent latency and reliability requirements of the upcoming 5G networks, Open Radio Access Networks (O-RAN) have been proposed. The virtualization of O-RAN has allowed it to be treated as a Network Function Virtualization (NFV) architecture, while its components are considered Virtualized Network Functions (VNFs). Hence, intelligent Machine Learning (ML) based solutions can be utilized to apply different resource management and allocation techniques on O-RAN. However, intelligently allocating resources for O-RAN VNFs can prove challenging due to the dynamicity of traffic in mobile networks. Network providers need to dynamically scale the allocated resources in response to the incoming traffic. Elastically allocating resources can provide a higher level of flexibility in the network in addition to reducing the OPerational EXpenditure (OPEX) and increasing the resources utilization. Most of the existing elastic solutions are reactive in nature, despite the fact that proactive approaches are more agile since they scale instances ahead of time by predicting the incoming traffic. In this work, we propose and evaluate traffic forecasting models based on the ML algorithm. The algorithms aim at predicting future O-RAN traffic by using previous traffic data. Detailed analysis of the traffic data was carried out to validate the quality and applicability of the traffic dataset. Hence, two ML models were proposed and evaluated based on their prediction capabilities.

Keywords: O-RAN, traffic forecasting, NFV, ARIMA, LSTM, elasticity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 422
2 Time Series Forecasting Using Various Deep Learning Models

Authors: Jimeng Shi, Mahek Jain, Giri Narasimhan


Time Series Forecasting (TSF) is used to predict the target variables at a future time point based on the learning from previous time points. To keep the problem tractable, learning methods use data from a fixed length window in the past as an explicit input. In this paper, we study how the performance of predictive models change as a function of different look-back window sizes and different amounts of time to predict into the future. We also consider the performance of the recent attention-based transformer models, which had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (Recurrent Neural Network (RNN), Long Short-term Memory (LSTM), Gated Recurrent Units (GRU), and Transformer) along with a baseline method. The dataset (hourly) we used is the Beijing Air Quality Dataset from the website of University of California, Irvine (UCI), which includes a multivariate time series of many factors measured on an hourly basis for a period of 5 years (2010-14). For each model, we also report on the relationship between the performance and the look-back window sizes and the number of predicted time points into the future. Our experiments suggest that Transformer models have the best performance with the lowest Mean   Absolute Errors (MAE = 14.599, 23.273) and Root Mean Square Errors (RSME = 23.573, 38.131) for most of our single-step and multi-steps predictions. The best size for the look-back window to predict 1 hour into the future appears to be one day, while 2 or 4 days perform the best to predict 3 hours into the future.

Keywords: Air quality prediction, deep learning algorithms, time series forecasting, look-back window.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 986
1 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock


Detecting subjectively biased statements is a vital task. This is because this kind of bias, when present in the text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is of importance. The goal of this work is to identify the subjectively biased language in text on a sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network incorporated with attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), which was compiled from Wikipedia edits that removed various biased instances from sentences as a benchmark dataset, with which we also compare our model to existing approaches. Experimental analysis indicates an improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: Subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 734