Search results for: LSTM
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 127

Search results for: LSTM

37 An MrPPG Method for Face Anti-Spoofing

Authors: Lan Zhang, Cailing Zhang

Abstract:

In recent years, many face anti-spoofing algorithms have high detection accuracy when detecting 2D face anti-spoofing or 3D mask face anti-spoofing alone in the field of face anti-spoofing, but their detection performance is greatly reduced in multidimensional and cross-datasets tests. The rPPG method used for face anti-spoofing uses the unique vital information of real face to judge real faces and face anti-spoofing, so rPPG method has strong stability compared with other methods, but its detection rate of 2D face anti-spoofing needs to be improved. Therefore, in this paper, we improve an rPPG(Remote Photoplethysmography) method(MrPPG) for face anti-spoofing which through color space fusion, using the correlation of pulse signals between real face regions and background regions, and introducing the cyclic neural network (LSTM) method to improve accuracy in 2D face anti-spoofing. Meanwhile, the MrPPG also has high accuracy and good stability in face anti-spoofing of multi-dimensional and cross-data datasets. The improved method was validated on Replay-Attack, CASIA-FASD, Siw and HKBU_MARs_V2 datasets, the experimental results show that the performance and stability of the improved algorithm proposed in this paper is superior to many advanced algorithms.

Keywords: face anti-spoofing, face presentation attack detection, remote photoplethysmography, MrPPG

Procedia PDF Downloads 174
36 Speed Breaker/Pothole Detection Using Hidden Markov Models: A Deep Learning Approach

Authors: Surajit Chakrabarty, Piyush Chauhan, Subhasis Panda, Sujoy Bhattacharya

Abstract:

A large proportion of roads in India are not well maintained as per the laid down public safety guidelines leading to loss of direction control and fatal accidents. We propose a technique to detect speed breakers and potholes using mobile sensor data captured from multiple vehicles and provide a profile of the road. This would, in turn, help in monitoring roads and revolutionize digital maps. Incorporating randomness in the model formulation for detection of speed breakers and potholes is crucial due to substantial heterogeneity observed in data obtained using a mobile application from multiple vehicles driven by different drivers. This is accomplished with Hidden Markov Models, whose hidden state sequence is found for each time step given the observables sequence, and are then fed as input to LSTM network with peephole connections. A precision score of 0.96 and 0.63 is obtained for classifying bumps and potholes, respectively, a significant improvement from the machine learning based models. Further visualization of bumps/potholes is done by converting time series to images using Markov Transition Fields where a significant demarcation among bump/potholes is observed.

Keywords: deep learning, hidden Markov model, pothole, speed breaker

Procedia PDF Downloads 140
35 Enhancing Patch Time Series Transformer with Wavelet Transform for Improved Stock Prediction

Authors: Cheng-yu Hsieh, Bo Zhang, Ahmed Hambaba

Abstract:

Stock market prediction has long been an area of interest for both expert analysts and investors, driven by its complexity and the noisy, volatile conditions it operates under. This research examines the efficacy of combining the Patch Time Series Transformer (PatchTST) with wavelet transforms, specifically focusing on Haar and Daubechies wavelets, in forecasting the adjusted closing price of the S&P 500 index for the following day. By comparing the performance of the augmented PatchTST models with traditional predictive models such as Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformers, this study highlights significant enhancements in prediction accuracy. The integration of the Daubechies wavelet with PatchTST notably excels, surpassing other configurations and conventional models in terms of Mean Absolute Error (MAE) and Mean Squared Error (MSE). The success of the PatchTST model paired with Daubechies wavelet is attributed to its superior capability in extracting detailed signal information and eliminating irrelevant noise, thus proving to be an effective approach for financial time series forecasting.

Keywords: deep learning, financial forecasting, stock market prediction, patch time series transformer, wavelet transform

Procedia PDF Downloads 43
34 Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Authors: Victor Breux, Jérôme Boutet, Alain Goret, Viviane Cattin

Abstract:

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Keywords: anomaly detection, autoencoder, data centers, deep learning

Procedia PDF Downloads 188
33 Graph Neural Networks and Rotary Position Embedding for Voice Activity Detection

Authors: YingWei Tan, XueFeng Ding

Abstract:

Attention-based voice activity detection models have gained significant attention in recent years due to their fast training speed and ability to capture a wide contextual range. The inclusion of multi-head style and position embedding in the attention architecture are crucial. Having multiple attention heads allows for differential focus on different parts of the sequence, while position embedding provides guidance for modeling dependencies between elements at various positions in the input sequence. In this work, we propose an approach by considering each head as a node, enabling the application of graph neural networks (GNN) to identify correlations among the different nodes. In addition, we adopt an implementation named rotary position embedding (RoPE), which encodes absolute positional information into the input sequence by a rotation matrix, and naturally incorporates explicit relative position information into a self-attention module. We evaluate the effectiveness of our method on a synthetic dataset, and the results demonstrate its superiority over the baseline CRNN in scenarios with low signal-to-noise ratio and noise, while also exhibiting robustness across different noise types. In summary, our proposed framework effectively combines the strengths of CNN and RNN (LSTM), and further enhances detection performance through the integration of graph neural networks and rotary position embedding.

Keywords: voice activity detection, CRNN, graph neural networks, rotary position embedding

Procedia PDF Downloads 65
32 Traffic Forecasting for Open Radio Access Networks Virtualized Network Functions in 5G Networks

Authors: Khalid Ali, Manar Jammal

Abstract:

In order to meet the stringent latency and reliability requirements of the upcoming 5G networks, Open Radio Access Networks (O-RAN) have been proposed. The virtualization of O-RAN has allowed it to be treated as a Network Function Virtualization (NFV) architecture, while its components are considered Virtualized Network Functions (VNFs). Hence, intelligent Machine Learning (ML) based solutions can be utilized to apply different resource management and allocation techniques on O-RAN. However, intelligently allocating resources for O-RAN VNFs can prove challenging due to the dynamicity of traffic in mobile networks. Network providers need to dynamically scale the allocated resources in response to the incoming traffic. Elastically allocating resources can provide a higher level of flexibility in the network in addition to reducing the OPerational EXpenditure (OPEX) and increasing the resources utilization. Most of the existing elastic solutions are reactive in nature, despite the fact that proactive approaches are more agile since they scale instances ahead of time by predicting the incoming traffic. In this work, we propose and evaluate traffic forecasting models based on the ML algorithm. The algorithms aim at predicting future O-RAN traffic by using previous traffic data. Detailed analysis of the traffic data was carried out to validate the quality and applicability of the traffic dataset. Hence, two ML models were proposed and evaluated based on their prediction capabilities.

Keywords: O-RAN, traffic forecasting, NFV, ARIMA, LSTM, elasticity

Procedia PDF Downloads 218
31 Using Machine Learning as an Alternative for Predicting Exchange Rates

Authors: Pedro Paulo Galindo Francisco, Eli Dhadad Junior

Abstract:

This study addresses the Meese-Rogoff Puzzle by introducing the latest machine learning techniques as alternatives for predicting the exchange rates. Using RMSE as a comparison metric, Meese and Rogoff discovered that economic models are unable to outperform the random walk model as short-term exchange rate predictors. Decades after this study, no statistical prediction technique has proven effective in overcoming this obstacle; although there were positive results, they did not apply to all currencies and defined periods. Recent advancements in artificial intelligence technologies have paved the way for a new approach to exchange rate prediction. Leveraging this technology, we applied five machine learning techniques to attempt to overcome the Meese-Rogoff puzzle. We considered daily data for the real, yen, British pound, euro, and Chinese yuan against the US dollar over a time horizon from 2010 to 2023. Our results showed that none of the presented techniques were able to produce an RMSE lower than the Random Walk model. However, the performance of some models, particularly LSTM and N-BEATS were able to outperform the ARIMA model. The results also suggest that machine learning models have untapped potential and could represent an effective long-term possibility for overcoming the Meese-Rogoff puzzle.

Keywords: exchage rate, prediction, machine learning, deep learning

Procedia PDF Downloads 22
30 Correlation between Speech Emotion Recognition Deep Learning Models and Noises

Authors: Leah Lee

Abstract:

This paper examines the correlation between deep learning models and emotions with noises to see whether or not noises mask emotions. The deep learning models used are plain convolutional neural networks (CNN), auto-encoder, long short-term memory (LSTM), and Visual Geometry Group-16 (VGG-16). Emotion datasets used are Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D), Toronto Emotional Speech Set (TESS), and Surrey Audio-Visual Expressed Emotion (SAVEE). To make it four times bigger, audio set files, stretch, and pitch augmentations are utilized. From the augmented datasets, five different features are extracted for inputs of the models. There are eight different emotions to be classified. Noise variations are white noise, dog barking, and cough sounds. The variation in the signal-to-noise ratio (SNR) is 0, 20, and 40. In summation, per a deep learning model, nine different sets with noise and SNR variations and just augmented audio files without any noises will be used in the experiment. To compare the results of the deep learning models, the accuracy and receiver operating characteristic (ROC) are checked.

Keywords: auto-encoder, convolutional neural networks, long short-term memory, speech emotion recognition, visual geometry group-16

Procedia PDF Downloads 72
29 Time Series Forecasting (TSF) Using Various Deep Learning Models

Authors: Jimeng Shi, Mahek Jain, Giri Narasimhan

Abstract:

Time Series Forecasting (TSF) is used to predict the target variables at a future time point based on the learning from previous time points. To keep the problem tractable, learning methods use data from a fixed-length window in the past as an explicit input. In this paper, we study how the performance of predictive models changes as a function of different look-back window sizes and different amounts of time to predict the future. We also consider the performance of the recent attention-based Transformer models, which have had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (RNN, LSTM, GRU, and Transformer) along with a baseline method. The dataset (hourly) we used is the Beijing Air Quality Dataset from the UCI website, which includes a multivariate time series of many factors measured on an hourly basis for a period of 5 years (2010-14). For each model, we also report on the relationship between the performance and the look-back window sizes and the number of predicted time points into the future. Our experiments suggest that Transformer models have the best performance with the lowest Mean Average Errors (MAE = 14.599, 23.273) and Root Mean Square Errors (RSME = 23.573, 38.131) for most of our single-step and multi-steps predictions. The best size for the look-back window to predict 1 hour into the future appears to be one day, while 2 or 4 days perform the best to predict 3 hours into the future.

Keywords: air quality prediction, deep learning algorithms, time series forecasting, look-back window

Procedia PDF Downloads 149
28 Deep Reinforcement Learning Approach for Optimal Control of Industrial Smart Grids

Authors: Niklas Panten, Eberhard Abele

Abstract:

This paper presents a novel approach for real-time and near-optimal control of industrial smart grids by deep reinforcement learning (DRL). To achieve highly energy-efficient factory systems, the energetic linkage of machines, technical building equipment and the building itself is desirable. However, the increased complexity of the interacting sub-systems, multiple time-variant target values and stochastic influences by the production environment, weather and energy markets make it difficult to efficiently control the energy production, storage and consumption in the hybrid industrial smart grids. The studied deep reinforcement learning approach allows to explore the solution space for proper control policies which minimize a cost function. The deep neural network of the DRL agent is based on a multilayer perceptron (MLP), Long Short-Term Memory (LSTM) and convolutional layers. The agent is trained within multiple Modelica-based factory simulation environments by the Advantage Actor Critic algorithm (A2C). The DRL controller is evaluated by means of the simulation and then compared to a conventional, rule-based approach. Finally, the results indicate that the DRL approach is able to improve the control performance and significantly reduce energy respectively operating costs of industrial smart grids.

Keywords: industrial smart grids, energy efficiency, deep reinforcement learning, optimal control

Procedia PDF Downloads 188
27 Infodemic Detection on Social Media with a Multi-Dimensional Deep Learning Framework

Authors: Raymond Xu, Cindy Jingru Wang

Abstract:

Social media has become a globally connected and influencing platform. Social media data, such as tweets, can help predict the spread of pandemics and provide individuals and healthcare providers early warnings. Public psychological reactions and opinions can be efficiently monitored by AI models on the progression of dominant topics on Twitter. However, statistics show that as the coronavirus spreads, so does an infodemic of misinformation due to pandemic-related factors such as unemployment and lockdowns. Social media algorithms are often biased toward outrage by promoting content that people have an emotional reaction to and are likely to engage with. This can influence users’ attitudes and cause confusion. Therefore, social media is a double-edged sword. Combating fake news and biased content has become one of the essential tasks. This research analyzes the variety of methods used for fake news detection covering random forest, logistic regression, support vector machines, decision tree, naive Bayes, BoW, TF-IDF, LDA, CNN, RNN, LSTM, DeepFake, and hierarchical attention network. The performance of each method is analyzed. Based on these models’ achievements and limitations, a multi-dimensional AI framework is proposed to achieve higher accuracy in infodemic detection, especially pandemic-related news. The model is trained on contextual content, images, and news metadata.

Keywords: artificial intelligence, fake news detection, infodemic detection, image recognition, sentiment analysis

Procedia PDF Downloads 245
26 Understanding Cognitive Fatigue From FMRI Scans With Self-supervised Learning

Authors: Ashish Jaiswal, Ashwin Ramesh Babu, Mohammad Zaki Zadeh, Fillia Makedon, Glenn Wylie

Abstract:

Functional magnetic resonance imaging (fMRI) is a neuroimaging technique that records neural activations in the brain by capturing the blood oxygen level in different regions based on the task performed by a subject. Given fMRI data, the problem of predicting the state of cognitive fatigue in a person has not been investigated to its full extent. This paper proposes tackling this issue as a multi-class classification problem by dividing the state of cognitive fatigue into six different levels, ranging from no-fatigue to extreme fatigue conditions. We built a spatio-temporal model that uses convolutional neural networks (CNN) for spatial feature extraction and a long short-term memory (LSTM) network for temporal modeling of 4D fMRI scans. We also applied a self-supervised method called MoCo (Momentum Contrast) to pre-train our model on a public dataset BOLD5000 and fine-tuned it on our labeled dataset to predict cognitive fatigue. Our novel dataset contains fMRI scans from Traumatic Brain Injury (TBI) patients and healthy controls (HCs) while performing a series of N-back cognitive tasks. This method establishes a state-of-the-art technique to analyze cognitive fatigue from fMRI data and beats previous approaches to solve this problem.

Keywords: fMRI, brain imaging, deep learning, self-supervised learning, contrastive learning, cognitive fatigue

Procedia PDF Downloads 186
25 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion from skeleton data obtained from videos and data generated by embedded sensors using deep neural networks for achieving HAR. We propose a deep multimodal fusion network based on a twostream architecture. These two streams use the Convolutional Neural Network combined with the Bidirectional LSTM (CNN BILSTM) to process skeleton data and data generated by embedded sensors and the fusion at the feature level is considered. The proposed model was evaluated on a public OPPORTUNITY++ dataset and produced a accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 97
24 Impact of Integrated Signals for Doing Human Activity Recognition Using Deep Learning Models

Authors: Milagros Jaén-Vargas, Javier García Martínez, Karla Miriam Reyes Leiva, María Fernanda Trujillo-Guerrero, Francisco Fernandes, Sérgio Barroso Gonçalves, Miguel Tavares Silva, Daniel Simões Lopes, José Javier Serrano Olmedo

Abstract:

Human Activity Recognition (HAR) is having a growing impact in creating new applications and is responsible for emerging new technologies. Also, the use of wearable sensors is an important key to exploring the human body's behavior when performing activities. Hence, the use of these dispositive is less invasive and the person is more comfortable. In this study, a database that includes three activities is used. The activities were acquired from inertial measurement unit sensors (IMU) and motion capture systems (MOCAP). The main objective is differentiating the performance from four Deep Learning (DL) models: Deep Neural Network (DNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) and hybrid model Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM), when considering acceleration, velocity and position and evaluate if integrating the IMU acceleration to obtain velocity and position represent an increment in performance when it works as input to the DL models. Moreover, compared with the same type of data provided by the MOCAP system. Despite the acceleration data is cleaned when integrating, results show a minimal increase in accuracy for the integrated signals.

Keywords: HAR, IMU, MOCAP, acceleration, velocity, position, feature maps

Procedia PDF Downloads 92
23 Detection of Atrial Fibrillation Using Wearables via Attentional Two-Stream Heterogeneous Networks

Authors: Huawei Bai, Jianguo Yao, Fellow, IEEE

Abstract:

Atrial fibrillation (AF) is the most common form of heart arrhythmia and is closely associated with mortality and morbidity in heart failure, stroke, and coronary artery disease. The development of single spot optical sensors enables widespread photoplethysmography (PPG) screening, especially for AF, since it represents a more convenient and noninvasive approach. To our knowledge, most existing studies based on public and unbalanced datasets can barely handle the multiple noises sources in the real world and, also, lack interpretability. In this paper, we construct a large- scale PPG dataset using measurements collected from PPG wrist- watch devices worn by volunteers and propose an attention-based two-stream heterogeneous neural network (TSHNN). The first stream is a hybrid neural network consisting of a three-layer one-dimensional convolutional neural network (1D-CNN) and two-layer attention- based bidirectional long short-term memory (Bi-LSTM) network to learn representations from temporally sampled signals. The second stream extracts latent representations from the PPG time-frequency spectrogram using a five-layer CNN. The outputs from both streams are fed into a fusion layer for the outcome. Visualization of the attention weights learned demonstrates the effectiveness of the attention mechanism against noise. The experimental results show that the TSHNN outperforms all the competitive baseline approaches and with 98.09% accuracy, achieves state-of-the-art performance.

Keywords: PPG wearables, atrial fibrillation, feature fusion, attention mechanism, hyber network

Procedia PDF Downloads 116
22 A Survey of Skin Cancer Detection and Classification from Skin Lesion Images Using Deep Learning

Authors: Joseph George, Anne Kotteswara Roa

Abstract:

Skin disease is one of the most common and popular kinds of health issues faced by people nowadays. Skin cancer (SC) is one among them, and its detection relies on the skin biopsy outputs and the expertise of the doctors, but it consumes more time and some inaccurate results. At the early stage, skin cancer detection is a challenging task, and it easily spreads to the whole body and leads to an increase in the mortality rate. Skin cancer is curable when it is detected at an early stage. In order to classify correct and accurate skin cancer, the critical task is skin cancer identification and classification, and it is more based on the cancer disease features such as shape, size, color, symmetry and etc. More similar characteristics are present in many skin diseases; hence it makes it a challenging issue to select important features from a skin cancer dataset images. Hence, the skin cancer diagnostic accuracy is improved by requiring an automated skin cancer detection and classification framework; thereby, the human expert’s scarcity is handled. Recently, the deep learning techniques like Convolutional neural network (CNN), Deep belief neural network (DBN), Artificial neural network (ANN), Recurrent neural network (RNN), and Long and short term memory (LSTM) have been widely used for the identification and classification of skin cancers. This survey reviews different DL techniques for skin cancer identification and classification. The performance metrics such as precision, recall, accuracy, sensitivity, specificity, and F-measures are used to evaluate the effectiveness of SC identification using DL techniques. By using these DL techniques, the classification accuracy increases along with the mitigation of computational complexities and time consumption.

Keywords: skin cancer, deep learning, performance measures, accuracy, datasets

Procedia PDF Downloads 125
21 Enhancing Project Performance Forecasting using Machine Learning Techniques

Authors: Soheila Sadeghi

Abstract:

Accurate forecasting of project performance metrics is crucial for successfully managing and delivering urban road reconstruction projects. Traditional methods often rely on static baseline plans and fail to consider the dynamic nature of project progress and external factors. This research proposes a machine learning-based approach to forecast project performance metrics, such as cost variance and earned value, for each Work Breakdown Structure (WBS) category in an urban road reconstruction project. The proposed model utilizes time series forecasting techniques, including Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) networks, to predict future performance based on historical data and project progress. The model also incorporates external factors, such as weather patterns and resource availability, as features to enhance the accuracy of forecasts. By applying the predictive power of machine learning, the performance forecasting model enables proactive identification of potential deviations from the baseline plan, which allows project managers to take timely corrective actions. The research aims to validate the effectiveness of the proposed approach using a case study of an urban road reconstruction project, comparing the model's forecasts with actual project performance data. The findings of this research contribute to the advancement of project management practices in the construction industry, offering a data-driven solution for improving project performance monitoring and control.

Keywords: project performance forecasting, machine learning, time series forecasting, cost variance, earned value management

Procedia PDF Downloads 43
20 Variable Refrigerant Flow (VRF) Zonal Load Prediction Using a Transfer Learning-Based Framework

Authors: Junyu Chen, Peng Xu

Abstract:

In the context of global efforts to enhance building energy efficiency, accurate thermal load forecasting is crucial for both device sizing and predictive control. Variable Refrigerant Flow (VRF) systems are widely used in buildings around the world, yet VRF zonal load prediction has received limited attention. Due to differences between VRF zones in building-level prediction methods, zone-level load forecasting could significantly enhance accuracy. Given that modern VRF systems generate high-quality data, this paper introduces transfer learning to leverage this data and further improve prediction performance. This framework also addresses the challenge of predicting load for building zones with no historical data, offering greater accuracy and usability compared to pure white-box models. The study first establishes an initial variable set of VRF zonal building loads and generates a foundational white-box database using EnergyPlus. Key variables for VRF zonal loads are identified using methods including SRRC, PRCC, and Random Forest. XGBoost and LSTM are employed to generate pre-trained black-box models based on the white-box database. Finally, real-world data is incorporated into the pre-trained model using transfer learning to enhance its performance in operational buildings. In this paper, zone-level load prediction was integrated with transfer learning, and a framework was proposed to improve the accuracy and applicability of VRF zonal load prediction.

Keywords: zonal load prediction, variable refrigerant flow (VRF) system, transfer learning, energyplus

Procedia PDF Downloads 19
19 Recurrent Neural Networks for Complex Survival Models

Authors: Pius Marthin, Nihal Ata Tutkun

Abstract:

Survival analysis has become one of the paramount procedures in the modeling of time-to-event data. When we encounter complex survival problems, the traditional approach remains limited in accounting for the complex correlational structure between the covariates and the outcome due to the strong assumptions that limit the inference and prediction ability of the resulting models. Several studies exist on the deep learning approach to survival modeling; moreover, the application for the case of complex survival problems still needs to be improved. In addition, the existing models need to address the data structure's complexity fully and are subject to noise and redundant information. In this study, we design a deep learning technique (CmpXRnnSurv_AE) that obliterates the limitations imposed by traditional approaches and addresses the above issues to jointly predict the risk-specific probabilities and survival function for recurrent events with competing risks. We introduce the component termed Risks Information Weights (RIW) as an attention mechanism to compute the weighted cumulative incidence function (WCIF) and an external auto-encoder (ExternalAE) as a feature selector to extract complex characteristics among the set of covariates responsible for the cause-specific events. We train our model using synthetic and real data sets and employ the appropriate metrics for complex survival models for evaluation. As benchmarks, we selected both traditional and machine learning models and our model demonstrates better performance across all datasets.

Keywords: cumulative incidence function (CIF), risk information weight (RIW), autoencoders (AE), survival analysis, recurrent events with competing risks, recurrent neural networks (RNN), long short-term memory (LSTM), self-attention, multilayers perceptrons (MLPs)

Procedia PDF Downloads 85
18 A Case Study on Machine Learning-Based Project Performance Forecasting for an Urban Road Reconstruction Project

Authors: Soheila Sadeghi

Abstract:

In construction projects, predicting project performance metrics accurately is essential for effective management and successful delivery. However, conventional methods often depend on fixed baseline plans, disregarding the evolving nature of project progress and external influences. To address this issue, we introduce a distinct approach based on machine learning to forecast key performance indicators, such as cost variance and earned value, for each Work Breakdown Structure (WBS) category within an urban road reconstruction project. Our proposed model leverages time series forecasting techniques, namely Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) networks, to predict future performance by analyzing historical data and project progress. Additionally, the model incorporates external factors, including weather patterns and resource availability, as features to improve forecast accuracy. By harnessing the predictive capabilities of machine learning, our performance forecasting model enables project managers to proactively identify potential deviations from the baseline plan and take timely corrective measures. To validate the effectiveness of the proposed approach, we conduct a case study on an urban road reconstruction project, comparing the model's predictions with actual project performance data. The outcomes of this research contribute to the advancement of project management practices in the construction industry by providing a data-driven solution for enhancing project performance monitoring and control.

Keywords: project performance forecasting, machine learning, time series forecasting, cost variance, schedule variance, earned value management

Procedia PDF Downloads 35
17 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data Technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through Vader and Roberta model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and TFIDF Vectorization, and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: counter vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis

Procedia PDF Downloads 83
16 Efficient Video Compression Technique Using Convolutional Neural Networks and Generative Adversarial Network

Authors: P. Karthick, K. Mahesh

Abstract:

Video has become an increasingly significant component of our digital everyday contact. With the advancement of greater contents and shows of the resolution, its significant volume poses serious obstacles to the objective of receiving, distributing, compressing, and revealing video content of high quality. In this paper, we propose the primary beginning to complete a deep video compression model that jointly upgrades all video compression components. The video compression method involves splitting the video into frames, comparing the images using convolutional neural networks (CNN) to remove duplicates, repeating the single image instead of the duplicate images by recognizing and detecting minute changes using generative adversarial network (GAN) and recorded with long short-term memory (LSTM). Instead of the complete image, the small changes generated using GAN are substituted, which helps in frame level compression. Pixel wise comparison is performed using K-nearest neighbours (KNN) over the frame, clustered with K-means, and singular value decomposition (SVD) is applied for each and every frame in the video for all three color channels [Red, Green, Blue] to decrease the dimension of the utility matrix [R, G, B] by extracting its latent factors. Video frames are packed with parameters with the aid of a codec and converted to video format, and the results are compared with the original video. Repeated experiments on several videos with different sizes, duration, frames per second (FPS), and quality results demonstrate a significant resampling rate. On average, the result produced had approximately a 10% deviation in quality and more than 50% in size when compared with the original video.

Keywords: video compression, K-means clustering, convolutional neural network, generative adversarial network, singular value decomposition, pixel visualization, stochastic gradient descent, frame per second extraction, RGB channel extraction, self-detection and deciding system

Procedia PDF Downloads 184
15 ARABEX: Automated Dotted Arabic Expiration Date Extraction using Optimized Convolutional Autoencoder and Custom Convolutional Recurrent Neural Network

Authors: Hozaifa Zaki, Ghada Soliman

Abstract:

In this paper, we introduced an approach for Automated Dotted Arabic Expiration Date Extraction using Optimized Convolutional Autoencoder (ARABEX) with bidirectional LSTM. This approach is used for translating the Arabic dot-matrix expiration dates into their corresponding filled-in dates. A custom lightweight Convolutional Recurrent Neural Network (CRNN) model is then employed to extract the expiration dates. Due to the lack of available dataset images for the Arabic dot-matrix expiration date, we generated synthetic images by creating an Arabic dot-matrix True Type Font (TTF) matrix to address this limitation. Our model was trained on a realistic synthetic dataset of 3287 images, covering the period from 2019 to 2027, represented in the format of yyyy/mm/dd. We then trained our custom CRNN model using the generated synthetic images to assess the performance of our model (ARABEX) by extracting expiration dates from the translated images. Our proposed approach achieved an accuracy of 99.4% on the test dataset of 658 images, while also achieving a Structural Similarity Index (SSIM) of 0.46 for image translation on our dataset. The ARABEX approach demonstrates its ability to be applied to various downstream learning tasks, including image translation and reconstruction. Moreover, this pipeline (ARABEX+CRNN) can be seamlessly integrated into automated sorting systems to extract expiry dates and sort products accordingly during the manufacturing stage. By eliminating the need for manual entry of expiration dates, which can be time-consuming and inefficient for merchants, our approach offers significant results in terms of efficiency and accuracy for Arabic dot-matrix expiration date recognition.

Keywords: computer vision, deep learning, image processing, character recognition

Procedia PDF Downloads 80
14 Arabic Light Word Analyser: Roles with Deep Learning Approach

Authors: Mohammed Abu Shquier

Abstract:

This paper introduces a word segmentation method using the novel BP-LSTM-CRF architecture for processing semantic output training. The objective of web morphological analysis tools is to link a formal morpho-syntactic description to a lemma, along with morpho-syntactic information, a vocalized form, a vocalized analysis with morpho-syntactic information, and a list of paradigms. A key objective is to continuously enhance the proposed system through an inductive learning approach that considers semantic influences. The system is currently under construction and development based on data-driven learning. To evaluate the tool, an experiment on homograph analysis was conducted. The tool also encompasses the assumption of deep binary segmentation hypotheses, the arbitrary choice of trigram or n-gram continuation probabilities, language limitations, and morphology for both Modern Standard Arabic (MSA) and Dialectal Arabic (DA), which provide justification for updating this system. Most Arabic word analysis systems are based on the phonotactic morpho-syntactic analysis of a word transmitted using lexical rules, which are mainly used in MENA language technology tools, without taking into account contextual or semantic morphological implications. Therefore, it is necessary to have an automatic analysis tool taking into account the word sense and not only the morpho-syntactic category. Moreover, they are also based on statistical/stochastic models. These stochastic models, such as HMMs, have shown their effectiveness in different NLP applications: part-of-speech tagging, machine translation, speech recognition, etc. As an extension, we focus on language modeling using Recurrent Neural Network (RNN); given that morphological analysis coverage was very low in dialectal Arabic, it is significantly important to investigate deeply how the dialect data influence the accuracy of these approaches by developing dialectal morphological processing tools to show that dialectal variability can support to improve analysis.

Keywords: NLP, DL, ML, analyser, MSA, RNN, CNN

Procedia PDF Downloads 36
13 Speech Emotion Recognition: A DNN and LSTM Comparison in Single and Multiple Feature Application

Authors: Thiago Spilborghs Bueno Meyer, Plinio Thomaz Aquino Junior

Abstract:

Through speech, which privileges the functional and interactive nature of the text, it is possible to ascertain the spatiotemporal circumstances, the conditions of production and reception of the discourse, the explicit purposes such as informing, explaining, convincing, etc. These conditions allow bringing the interaction between humans closer to the human-robot interaction, making it natural and sensitive to information. However, it is not enough to understand what is said; it is necessary to recognize emotions for the desired interaction. The validity of the use of neural networks for feature selection and emotion recognition was verified. For this purpose, it is proposed the use of neural networks and comparison of models, such as recurrent neural networks and deep neural networks, in order to carry out the classification of emotions through speech signals to verify the quality of recognition. It is expected to enable the implementation of robots in a domestic environment, such as the HERA robot from the RoboFEI@Home team, which focuses on autonomous service robots for the domestic environment. Tests were performed using only the Mel-Frequency Cepstral Coefficients, as well as tests with several characteristics of Delta-MFCC, spectral contrast, and the Mel spectrogram. To carry out the training, validation and testing of the neural networks, the eNTERFACE’05 database was used, which has 42 speakers from 14 different nationalities speaking the English language. The data from the chosen database are videos that, for use in neural networks, were converted into audios. It was found as a result, a classification of 51,969% of correct answers when using the deep neural network, when the use of the recurrent neural network was verified, with the classification with accuracy equal to 44.09%. The results are more accurate when only the Mel-Frequency Cepstral Coefficients are used for the classification, using the classifier with the deep neural network, and in only one case, it is possible to observe a greater accuracy by the recurrent neural network, which occurs in the use of various features and setting 73 for batch size and 100 training epochs.

Keywords: emotion recognition, speech, deep learning, human-robot interaction, neural networks

Procedia PDF Downloads 161
12 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This is because this kind of bias, when present in the text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is of importance. The goal of this work is to identify the subjectively biased language in text on a sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network incorporated with attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), which was compiled from Wikipedia edits that removed various biased instances from sentences as a benchmark dataset, with which we also compare our model to existing approaches. Experimental analysis indicates an improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing

Procedia PDF Downloads 124
11 Analysis of Biomarkers Intractable Epileptogenic Brain Networks with Independent Component Analysis and Deep Learning Algorithms: A Comprehensive Framework for Scalable Seizure Prediction with Unimodal Neuroimaging Data in Pediatric Patients

Authors: Bliss Singhal

Abstract:

Epilepsy is a prevalent neurological disorder affecting approximately 50 million individuals worldwide and 1.2 million Americans. There exist millions of pediatric patients with intractable epilepsy, a condition in which seizures fail to come under control. The occurrence of seizures can result in physical injury, disorientation, unconsciousness, and additional symptoms that could impede children's ability to participate in everyday tasks. Predicting seizures can help parents and healthcare providers take precautions, prevent risky situations, and mentally prepare children to minimize anxiety and nervousness associated with the uncertainty of a seizure. This research proposes a comprehensive framework to predict seizures in pediatric patients by evaluating machine learning algorithms on unimodal neuroimaging data consisting of electroencephalogram signals. The bandpass filtering and independent component analysis proved to be effective in reducing the noise and artifacts from the dataset. Various machine learning algorithms’ performance is evaluated on important metrics such as accuracy, precision, specificity, sensitivity, F1 score and MCC. The results show that the deep learning algorithms are more successful in predicting seizures than logistic Regression, and k nearest neighbors. The recurrent neural network (RNN) gave the highest precision and F1 Score, long short-term memory (LSTM) outperformed RNN in accuracy and convolutional neural network (CNN) resulted in the highest Specificity. This research has significant implications for healthcare providers in proactively managing seizure occurrence in pediatric patients, potentially transforming clinical practices, and improving pediatric care.

Keywords: intractable epilepsy, seizure, deep learning, prediction, electroencephalogram channels

Procedia PDF Downloads 78
10 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on $k$-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 164
9 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isoloates. Cluster analysis showed that k-mers maybe used to discriminate phenotypes and the discrimination becomes more concise as the size of k-mers increase. The best performing classification model had a k-mer size of 10 (longest k-mer) an accuracy, recall, precision, specificity, and Matthews Correlation coeffient of 72.0 %, 80.5 %, 80.5 %, 63.6 %, and 0.4 respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biological explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 155
8 Prediction of Sepsis Illness from Patients Vital Signs Using Long Short-Term Memory Network and Dynamic Analysis

Authors: Marcio Freire Cruz, Naoaki Ono, Shigehiko Kanaya, Carlos Arthur Mattos Teixeira Cavalcante

Abstract:

The systems that record patient care information, known as Electronic Medical Record (EMR) and those that monitor vital signs of patients, such as heart rate, body temperature, and blood pressure have been extremely valuable for the effectiveness of the patient’s treatment. Several kinds of research have been using data from EMRs and vital signs of patients to predict illnesses. Among them, we highlight those that intend to predict, classify, or, at least identify patterns, of sepsis illness in patients under vital signs monitoring. Sepsis is an organic dysfunction caused by a dysregulated patient's response to an infection that affects millions of people worldwide. Early detection of sepsis is expected to provide a significant improvement in its treatment. Preceding works usually combined medical, statistical, mathematical and computational models to develop detection methods for early prediction, getting higher accuracies, and using the smallest number of variables. Among other techniques, we could find researches using survival analysis, specialist systems, machine learning and deep learning that reached great results. In our research, patients are modeled as points moving each hour in an n-dimensional space where n is the number of vital signs (variables). These points can reach a sepsis target point after some time. For now, the sepsis target point was calculated using the median of all patients’ variables on the sepsis onset. From these points, we calculate for each hour the position vector, the first derivative (velocity vector) and the second derivative (acceleration vector) of the variables to evaluate their behavior. And we construct a prediction model based on a Long Short-Term Memory (LSTM) Network, including these derivatives as explanatory variables. The accuracy of the prediction 6 hours before the time of sepsis, considering only the vital signs reached 83.24% and by including the vectors position, speed, and acceleration, we obtained 94.96%. The data are being collected from Medical Information Mart for Intensive Care (MIMIC) Database, a public database that contains vital signs, laboratory test results, observations, notes, and so on, from more than 60.000 patients.

Keywords: dynamic analysis, long short-term memory, prediction, sepsis

Procedia PDF Downloads 122