Search results for: deep capsule network
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6240

Search results for: deep capsule network

6240 Detecting Manipulated Media Using Deep Capsule Network

Authors: Joseph Uzuazomaro Oju

Abstract:

The ease at which manipulated media can be created, and the increasing difficulty in identifying fake media makes it a great threat. Most of the applications used for the creation of these high-quality fake videos and images are built with deep learning. Hence, the use of deep learning in creating a detection mechanism cannot be overemphasized. Any successful fake media that is being detected before it reached the populace will save people from the self-doubt of either a content is genuine or fake and will ensure the credibility of videos and images. The methodology introduced in this paper approaches the manipulated media detection challenge using a combo of VGG-19 and a deep capsule network. In the case of videos, they are converted into frames, which, in turn, are resized and cropped to the face region. These preprocessed images/videos are fed to the VGG-19 network to extract the latent features. The extracted latent features are inputted into a deep capsule network enhanced with a 3D -convolution dynamic routing agreement. The 3D –convolution dynamic routing agreement algorithm helps to reduce the linkages between capsules networks. Thereby limiting the poor learning shortcoming of multiple capsule network layers. The resultant output from the deep capsule network will indicate a media to be either genuine or fake.

Keywords: deep capsule network, dynamic routing, fake media detection, manipulated media

Procedia PDF Downloads 111
6239 Detection of COVID-19 Cases From X-Ray Images Using Capsule-Based Network

Authors: Donya Ashtiani Haghighi, Amirali Baniasadi

Abstract:

Coronavirus (COVID-19) disease has spread abruptly all over the world since the end of 2019. Computed tomography (CT) scans and X-ray images are used to detect this disease. Different Deep Neural Network (DNN)-based diagnosis solutions have been developed, mainly based on Convolutional Neural Networks (CNNs), to accelerate the identification of COVID-19 cases. However, CNNs lose important information in intermediate layers and require large datasets. In this paper, Capsule Network (CapsNet) is used. Capsule Network performs better than CNNs for small datasets. Accuracy of 0.9885, f1-score of 0.9883, precision of 0.9859, recall of 0.9908, and Area Under the Curve (AUC) of 0.9948 are achieved on the Capsule-based framework with hyperparameter tuning. Moreover, different dropout rates are investigated to decrease overfitting. Accordingly, a dropout rate of 0.1 shows the best results. Finally, we remove one convolution layer and decrease the number of trainable parameters to 146,752, which is a promising result.

Keywords: capsule network, dropout, hyperparameter tuning, classification

Procedia PDF Downloads 52
6238 Optimizing the Capacity of a Convolutional Neural Network for Image Segmentation and Pattern Recognition

Authors: Yalong Jiang, Zheru Chi

Abstract:

In this paper, we study the factors which determine the capacity of a Convolutional Neural Network (CNN) model and propose the ways to evaluate and adjust the capacity of a CNN model for best matching to a specific pattern recognition task. Firstly, a scheme is proposed to adjust the number of independent functional units within a CNN model to make it be better fitted to a task. Secondly, the number of independent functional units in the capsule network is adjusted to fit it to the training dataset. Thirdly, a method based on Bayesian GAN is proposed to enrich the variances in the current dataset to increase its complexity. Experimental results on the PASCAL VOC 2010 Person Part dataset and the MNIST dataset show that, in both conventional CNN models and capsule networks, the number of independent functional units is an important factor that determines the capacity of a network model. By adjusting the number of functional units, the capacity of a model can better match the complexity of a dataset.

Keywords: CNN, convolutional neural network, capsule network, capacity optimization, character recognition, data augmentation, semantic segmentation

Procedia PDF Downloads 127
6237 An Architecture Based on Capsule Networks for the Identification of Handwritten Signature Forgery

Authors: Luisa Mesquita Oliveira Ribeiro, Alexei Manso Correa Machado

Abstract:

Handwritten signature is a unique form for recognizing an individual, used to discern documents, carry out investigations in the criminal, legal, banking areas and other applications. Signature verification is based on large amounts of biometric data, as they are simple and easy to acquire, among other characteristics. Given this scenario, signature forgery is a worldwide recurring problem and fast and precise techniques are needed to prevent crimes of this nature from occurring. This article carried out a study on the efficiency of the Capsule Network in analyzing and recognizing signatures. The chosen architecture achieved an accuracy of 98.11% and 80.15% for the CEDAR and GPDS databases, respectively.

Keywords: biometrics, deep learning, handwriting, signature forgery

Procedia PDF Downloads 55
6236 Mechanical Qualification Test Campaign on the Demise Observation Capsule

Authors: B. Tiseo, V. Quaranta, G. Bruno, R. Gardi, T. Watts, S. Dussy

Abstract:

This paper describes the qualification test campaign performed on the Demise Observation Capsule DOC-EQM as part of the Future Launch Preparatory Program FLPP3. The mechanical environment experienced during launch ascent and separation phase was first identified and then replicated in terms of sine, random and shock vibration. The loads identification is derived by selecting the worst possible case. Vibration and shock qualification test performed at CIRA Space Qualification laboratory is herein described. Mechanical fixtures’ design and validation, carried out by means of FEM, is also addressed due to its fundamental role in the vibrational test campaign. The Demise Observation Capsule (DOC) successfully passed the qualification test campaign. Functional test and resonance search have not been point any fault and damages of the capsule.

Keywords: capsule, demise, demise observation capsule, DOC, launch environment, re-ntry, qualification

Procedia PDF Downloads 120
6235 A Survey of Sentiment Analysis Based on Deep Learning

Authors: Pingping Lin, Xudong Luo, Yifan Fan

Abstract:

Sentiment analysis is a very active research topic. Every day, Facebook, Twitter, Weibo, and other social media, as well as significant e-commerce websites, generate a massive amount of comments, which can be used to analyse peoples opinions or emotions. The existing methods for sentiment analysis are based mainly on sentiment dictionaries, machine learning, and deep learning. The first two kinds of methods rely on heavily sentiment dictionaries or large amounts of labelled data. The third one overcomes these two problems. So, in this paper, we focus on the third one. Specifically, we survey various sentiment analysis methods based on convolutional neural network, recurrent neural network, long short-term memory, deep neural network, deep belief network, and memory network. We compare their futures, advantages, and disadvantages. Also, we point out the main problems of these methods, which may be worthy of careful studies in the future. Finally, we also examine the application of deep learning in multimodal sentiment analysis and aspect-level sentiment analysis.

Keywords: document analysis, deep learning, multimodal sentiment analysis, natural language processing

Procedia PDF Downloads 138
6234 Data Augmentation for Automatic Graphical User Interface Generation Based on Generative Adversarial Network

Authors: Xulu Yao, Moi Hoon Yap, Yanlong Zhang

Abstract:

As a branch of artificial neural network, deep learning is widely used in the field of image recognition, but the lack of its dataset leads to imperfect model learning. By analysing the data scale requirements of deep learning and aiming at the application in GUI generation, it is found that the collection of GUI dataset is a time-consuming and labor-consuming project, which is difficult to meet the needs of current deep learning network. To solve this problem, this paper proposes a semi-supervised deep learning model that relies on the original small-scale datasets to produce a large number of reliable data sets. By combining the cyclic neural network with the generated countermeasure network, the cyclic neural network can learn the sequence relationship and characteristics of data, make the generated countermeasure network generate reasonable data, and then expand the Rico dataset. Relying on the network structure, the characteristics of collected data can be well analysed, and a large number of reasonable data can be generated according to these characteristics. After data processing, a reliable dataset for model training can be formed, which alleviates the problem of dataset shortage in deep learning.

Keywords: GUI, deep learning, GAN, data augmentation

Procedia PDF Downloads 163
6233 Effect of Capsule Storage on Viability of Lactobacillus bulgaricus and Streptococcus thermophilus in Yogurt Powder

Authors: Kanchana Sitlaothaworn

Abstract:

Yogurt capsule was made by mixing 14% w/v of reconstitution of skim milk with 2% FOS. The mixture was fermented by commercial yogurt starter comprising Lactobacillus bulgaricus and Streptococcus thermophilus. These yogurts were made as yogurt powder by freeze-dried. Yogurt powder was put into capsule then stored for 28 days at 4oc. 8ml of commercial yogurt was found to be the most suitable inoculum size in yogurt production. After freeze-dried, the viability of L. bulgaricus and S. thermophilus reduced from 109 to 107 cfu/g. The precence of sucrose cannot help to protect cell from ice crystal formation in freeze-dried process, high (20%) sucrose reduced L. bulgaricus and S. thermophilus growth during fermentation of yogurt. The addition of FOS had reduced slowly the viability of both L. bulgaricus and S. thermophilus similar to control (without FOS) during 28 days of capsule storage. The viable cell exhibited satisfactory viability level in capsule storage (6.7x106cfu/g) during 21 days at 4oC.

Keywords: yogurt capsule, Lactobacillus bulgaricus, Streptococcus thermophilus, freeze-drying, sucrose

Procedia PDF Downloads 305
6232 Exploring Deep Neural Network Compression: An Overview

Authors: Ghorab Sara, Meziani Lila, Rubin Harvey Stuart

Abstract:

The rapid growth of deep learning has led to intricate and resource-intensive deep neural networks widely used in computer vision tasks. However, their complexity results in high computational demands and memory usage, hindering real-time application. To address this, research focuses on model compression techniques. The paper provides an overview of recent advancements in compressing neural networks and categorizes the various methods into four main approaches: network pruning, quantization, network decomposition, and knowledge distillation. This paper aims to provide a comprehensive outline of both the advantages and limitations of each method.

Keywords: model compression, deep neural network, pruning, knowledge distillation, quantization, low-rank decomposition

Procedia PDF Downloads 21
6231 Forecasting the Temperature at a Weather Station Using Deep Neural Networks

Authors: Debneil Saha Roy

Abstract:

Weather forecasting is a complex topic and is well suited for analysis by deep learning approaches. With the wide availability of weather observation data nowadays, these approaches can be utilized to identify immediate comparisons between historical weather forecasts and current observations. This work explores the application of deep learning techniques to weather forecasting in order to accurately predict the weather over a given forecast hori­zon. Three deep neural networks are used in this study, namely, Multi-Layer Perceptron (MLP), Long Short Tunn Memory Network (LSTM) and a combination of Convolutional Neural Network (CNN) and LSTM. The predictive performance of these models is compared using two evaluation metrics. The results show that forecasting accuracy increases with an increase in the complexity of deep neural networks.

Keywords: convolutional neural network, deep learning, long short term memory, multi-layer perceptron

Procedia PDF Downloads 151
6230 Integrating Knowledge Distillation of Multiple Strategies

Authors: Min Jindong, Wang Mingxia

Abstract:

With the widespread use of artificial intelligence in life, computer vision, especially deep convolutional neural network models, has developed rapidly. With the increase of the complexity of the real visual target detection task and the improvement of the recognition accuracy, the target detection network model is also very large. The huge deep neural network model is not conducive to deployment on edge devices with limited resources, and the timeliness of network model inference is poor. In this paper, knowledge distillation is used to compress the huge and complex deep neural network model, and the knowledge contained in the complex network model is comprehensively transferred to another lightweight network model. Different from traditional knowledge distillation methods, we propose a novel knowledge distillation that incorporates multi-faceted features, called M-KD. In this paper, when training and optimizing the deep neural network model for target detection, the knowledge of the soft target output of the teacher network in knowledge distillation, the relationship between the layers of the teacher network and the feature attention map of the hidden layer of the teacher network are transferred to the student network as all knowledge. in the model. At the same time, we also introduce an intermediate transition layer, that is, an intermediate guidance layer, between the teacher network and the student network to make up for the huge difference between the teacher network and the student network. Finally, this paper adds an exploration module to the traditional knowledge distillation teacher-student network model. The student network model not only inherits the knowledge of the teacher network but also explores some new knowledge and characteristics. Comprehensive experiments in this paper using different distillation parameter configurations across multiple datasets and convolutional neural network models demonstrate that our proposed new network model achieves substantial improvements in speed and accuracy performance.

Keywords: object detection, knowledge distillation, convolutional network, model compression

Procedia PDF Downloads 257
6229 Positive Bias and Length Bias in Deep Neural Networks for Premises Selection

Authors: Jiaqi Huang, Yuheng Wang

Abstract:

Premises selection, the task of selecting a set of axioms for proving a given conjecture, is a major bottleneck in automated theorem proving. An array of deep-learning-based methods has been established for premises selection, but a perfect performance remains challenging. Our study examines the inaccuracy of deep neural networks in premises selection. Through training network models using encoded conjecture and axiom pairs from the Mizar Mathematical Library, two potential biases are found: the network models classify more premises as necessary than unnecessary, referred to as the ‘positive bias’, and the network models perform better in proving conjectures that paired with more axioms, referred to as ‘length bias’. The ‘positive bias’ and ‘length bias’ discovered could inform the limitation of existing deep neural networks.

Keywords: automated theorem proving, premises selection, deep learning, interpreting deep learning

Procedia PDF Downloads 162
6228 A Dissipative Particle Dynamics Study of a Capsule in Microfluidic Intracellular Delivery System

Authors: Nishanthi N. S., Srikanth Vedantam

Abstract:

Intracellular delivery of materials has always proved to be a challenge in research and therapeutic applications. Usually, vector-based methods, such as liposomes and polymeric materials, and physical methods, such as electroporation and sonoporation have been used for introducing nucleic acids or proteins. Reliance on exogenous materials, toxicity, off-target effects was the short-comings of these methods. Microinjection was an alternative process which addressed the above drawbacks. However, its low throughput had hindered its adoption widely. Mechanical deformation of cells by squeezing them through constriction channel can cause the temporary development of pores that would facilitate non-targeted diffusion of materials. Advantages of this method include high efficiency in intracellular delivery, a wide choice of materials, improved viability and high throughput. This cell squeezing process can be studied deeper by employing simple models and efficient computational procedures. In our current work, we present a finite sized dissipative particle dynamics (FDPD) model to simulate the dynamics of the cell flowing through a constricted channel. The cell is modeled as a capsule with FDPD particles connected through a spring network to represent the membrane. The total energy of the capsule is associated with linear and radial springs in addition to constraint of the fixed area. By performing detailed simulations, we studied the strain on the membrane of the capsule for channels with varying constriction heights. The strain on the capsule membrane was found to be similar though the constriction heights vary. When strain on the membrane was correlated to the development of pores, we found higher porosity in capsule flowing in wider channel. This is due to localization of strain to a smaller region in the narrow constriction channel. But the residence time of the capsule increased as the channel constriction narrowed indicating that strain for an increased time will cause less cell viability.

Keywords: capsule, cell squeezing, dissipative particle dynamics, intracellular delivery, microfluidics, numerical simulations

Procedia PDF Downloads 119
6227 Hyperspectral Data Classification Algorithm Based on the Deep Belief and Self-Organizing Neural Network

Authors: Li Qingjian, Li Ke, He Chun, Huang Yong

Abstract:

In this paper, the method of combining the Pohl Seidman's deep belief network with the self-organizing neural network is proposed to classify the target. This method is mainly aimed at the high nonlinearity of the hyperspectral image, the high sample dimension and the difficulty in designing the classifier. The main feature of original data is extracted by deep belief network. In the process of extracting features, adding known labels samples to fine tune the network, enriching the main characteristics. Then, the extracted feature vectors are classified into the self-organizing neural network. This method can effectively reduce the dimensions of data in the spectrum dimension in the preservation of large amounts of raw data information, to solve the traditional clustering and the long training time when labeled samples less deep learning algorithm for training problems, improve the classification accuracy and robustness. Through the data simulation, the results show that the proposed network structure can get a higher classification precision in the case of a small number of known label samples.

Keywords: DBN, SOM, pattern classification, hyperspectral, data compression

Procedia PDF Downloads 319
6226 On Dialogue Systems Based on Deep Learning

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

Nowadays, dialogue systems increasingly become the way for humans to access many computer systems. So, humans can interact with computers in natural language. A dialogue system consists of three parts: understanding what humans say in natural language, managing dialogue, and generating responses in natural language. In this paper, we survey deep learning based methods for dialogue management, response generation and dialogue evaluation. Specifically, these methods are based on neural network, long short-term memory network, deep reinforcement learning, pre-training and generative adversarial network. We compare these methods and point out the further research directions.

Keywords: dialogue management, response generation, deep learning, evaluation

Procedia PDF Downloads 144
6225 Facial Emotion Recognition with Convolutional Neural Network Based Architecture

Authors: Koray U. Erbas

Abstract:

Neural networks are appealing for many applications since they are able to learn complex non-linear relationships between input and output data. As the number of neurons and layers in a neural network increase, it is possible to represent more complex relationships with automatically extracted features. Nowadays Deep Neural Networks (DNNs) are widely used in Computer Vision problems such as; classification, object detection, segmentation image editing etc. In this work, Facial Emotion Recognition task is performed by proposed Convolutional Neural Network (CNN)-based DNN architecture using FER2013 Dataset. Moreover, the effects of different hyperparameters (activation function, kernel size, initializer, batch size and network size) are investigated and ablation study results for Pooling Layer, Dropout and Batch Normalization are presented.

Keywords: convolutional neural network, deep learning, deep learning based FER, facial emotion recognition

Procedia PDF Downloads 235
6224 Aerodynamic Design Optimization Technique for a Tube Capsule That Uses an Axial Flow Air Compressor and an Aerostatic Bearing

Authors: Ahmed E. Hodaib, Muhammed A. Hashem

Abstract:

High-speed transportation has become a growing concern. To increase high-speed efficiencies and minimize power consumption of a vehicle, we need to eliminate the friction with the ground and minimize the aerodynamic drag acting on the vehicle. Due to the complexity and high power requirements of electromagnetic levitation, we make use of the air in front of the capsule, that produces the majority of the drag, to compress it in two phases and inject a proportion of it through small nozzles to make a high-pressure air cushion to levitate the capsule. The tube is partially-evacuated so that the air pressure is optimized for maximum compressor effectiveness, optimum tube size, and minimum vacuum pump power consumption. The total relative mass flow rate of the tube air is divided into two fractions. One is by-passed to flow over the capsule body, ensuring that no chocked flow takes place. The other fraction is sucked by the compressor where it is diffused to decrease the Mach number (around 0.8) to be suitable for the compressor inlet. The air is then compressed and intercooled, then split. One fraction is expanded through a tail nozzle to contribute to generating thrust. The other is compressed again. Bleed from the two compressors is used to maintain a constant air pressure in an air tank. The air tank is used to supply air for levitation. Dividing the total mass flow rate increases the achievable speed (Kantrowitz limit), and compressing it decreases the blockage of the capsule. As a result, the aerodynamic drag on the capsule decreases. As the tube pressure decreases, the drag decreases and the capsule power requirements decrease, however, the vacuum pump consumes more power. That’s why Design optimization techniques are to be used to get the optimum values for all the design variables given specific design inputs. Aerodynamic shape optimization, Capsule and tube sizing, compressor design, diffuser and nozzle expander design and the effect of the air bearing on the aerodynamics of the capsule are to be considered. The variations of the variables are to be studied for the change of the capsule velocity and air pressure.

Keywords: tube-capsule, hyperloop, aerodynamic design optimization, air compressor, air bearing

Procedia PDF Downloads 306
6223 Cells Detection and Recognition in Bone Marrow Examination with Deep Learning Method

Authors: Shiyin He, Zheng Huang

Abstract:

In this paper, deep learning methods are applied in bio-medical field to detect and count different types of cells in an automatic way instead of manual work in medical practice, specifically in bone marrow examination. The process is mainly composed of two steps, detection and recognition. Mask-Region-Convolutional Neural Networks (Mask-RCNN) was used for detection and image segmentation to extract cells and then Convolutional Neural Networks (CNN), as well as Deep Residual Network (ResNet) was used to classify. Result of cell detection network shows high efficiency to meet application requirements. For the cell recognition network, two networks are compared and the final system is fully applicable.

Keywords: cell detection, cell recognition, deep learning, Mask-RCNN, ResNet

Procedia PDF Downloads 161
6222 Thick Data Techniques for Identifying Abnormality in Video Frames for Wireless Capsule Endoscopy

Authors: Jinan Fiaidhi, Sabah Mohammed, Petros Zezos

Abstract:

Capsule endoscopy (CE) is an established noninvasive diagnostic modality in investigating small bowel disease. CE has a pivotal role in assessing patients with suspected bleeding or identifying evidence of active Crohn's disease in the small bowel. However, CE produces lengthy videos with at least eighty thousand frames, with a frequency rate of 2 frames per second. Gastroenterologists cannot dedicate 8 to 15 hours to reading the CE video frames to arrive at a diagnosis. This is why the issue of analyzing CE videos based on modern artificial intelligence techniques becomes a necessity. However, machine learning, including deep learning, has failed to report robust results because of the lack of large samples to train its neural nets. In this paper, we are describing a thick data approach that learns from a few anchor images. We are using sound datasets like KVASIR and CrohnIPI to filter candidate frames that include interesting anomalies in any CE video. We are identifying candidate frames based on feature extraction to provide representative measures of the anomaly, like the size of the anomaly and the color contrast compared to the image background, and later feed these features to a decision tree that can classify the candidate frames as having a condition like the Crohn's Disease. Our thick data approach reported accuracy of detecting Crohn's Disease based on the availability of ulcer areas at the candidate frames for KVASIR was 89.9% and for the CrohnIPI was 83.3%. We are continuing our research to fine-tune our approach by adding more thick data methods for enhancing diagnosis accuracy.

Keywords: thick data analytics, capsule endoscopy, Crohn’s disease, siamese neural network, decision tree

Procedia PDF Downloads 125
6221 Wireless Capsule Endoscope - Antenna and Channel Characterization

Authors: Mona Elhelbawy, Mac Gray

Abstract:

Traditional wired endoscopy is an intrusive process that requires a long flexible tube to be inserted through the patient’s mouth while intravenously sedated. Only images of the upper 4 feet of stomach, colon, and rectum can be captured, leaving the remaining 20 feet of small intestines. Wireless capsule endoscopy offers a painless, non-intrusive, efficient and effective alternative to traditional endoscopy. In wireless capsule endoscopy (WCE), ingestible vitamin-pill-shaped capsules with imaging capabilities, sensors, batteries, and antennas are designed to send images of the gastrointestinal (GI) tract in real time. In this paper, we investigate the radiation performance and specific absorption rate (SAR) of a miniature conformal capsule antenna operating at the Medical Implant Communication Service (MICS) frequency band in the human body. We perform numerical simulations using the finite element method based commercial software, high-frequency structure simulator (HFSS) and the ANSYS human body model (HBM). We also investigate the in-body channel characteristics between the implantable capsule and an external antenna placed on the surface of the human body.

Keywords: IEEE 802.15.6, MICS, SAR, WCE

Procedia PDF Downloads 108
6220 Cellular Traffic Prediction through Multi-Layer Hybrid Network

Authors: Supriya H. S., Chandrakala B. M.

Abstract:

Deep learning based models have been recently successful adoption for network traffic prediction. However, training a deep learning model for various prediction tasks is considered one of the critical tasks due to various reasons. This research work develops Multi-Layer Hybrid Network (MLHN) for network traffic prediction and analysis; MLHN comprises the three distinctive networks for handling the different inputs for custom feature extraction. Furthermore, an optimized and efficient parameter-tuning algorithm is introduced to enhance parameter learning. MLHN is evaluated considering the “Big Data Challenge” dataset considering the Mean Absolute Error, Root Mean Square Error and R^2as metrics; furthermore, MLHN efficiency is proved through comparison with a state-of-art approach.

Keywords: MLHN, network traffic prediction

Procedia PDF Downloads 61
6219 Deep Neural Network Approach for Navigation of Autonomous Vehicles

Authors: Mayank Raj, V. G. Narendra

Abstract:

Ever since the DARPA challenge on autonomous vehicles in 2005, there has been a lot of buzz about ‘Autonomous Vehicles’ amongst the major tech giants such as Google, Uber, and Tesla. Numerous approaches have been adopted to solve this problem, which can have a long-lasting impact on mankind. In this paper, we have used Deep Learning techniques and TensorFlow framework with the goal of building a neural network model to predict (speed, acceleration, steering angle, and brake) features needed for navigation of autonomous vehicles. The Deep Neural Network has been trained on images and sensor data obtained from the comma.ai dataset. A heatmap was used to check for correlation among the features, and finally, four important features were selected. This was a multivariate regression problem. The final model had five convolutional layers, followed by five dense layers. Finally, the calculated values were tested against the labeled data, where the mean squared error was used as a performance metric.

Keywords: autonomous vehicles, deep learning, computer vision, artificial intelligence

Procedia PDF Downloads 134
6218 Deep Learning for Recommender System: Principles, Methods and Evaluation

Authors: Basiliyos Tilahun Betru, Charles Awono Onana, Bernabe Batchakui

Abstract:

Recommender systems have become increasingly popular in recent years, and are utilized in numerous areas. Nowadays many web services provide several information for users and recommender systems have been developed as critical element of these web applications to predict choice of preference and provide significant recommendations. With the help of the advantage of deep learning in modeling different types of data and due to the dynamic change of user preference, building a deep model can better understand users demand and further improve quality of recommendation. In this paper, deep neural network models for recommender system are evaluated. Most of deep neural network models in recommender system focus on the classical collaborative filtering user-item setting. Deep learning models demonstrated high level features of complex data can be learned instead of using metadata which can significantly improve accuracy of recommendation. Even though deep learning poses a great impact in various areas, applying the model to a recommender system have not been fully exploited and still a lot of improvements can be done both in collaborative and content-based approach while considering different contextual factors.

Keywords: big data, decision making, deep learning, recommender system

Procedia PDF Downloads 451
6217 An Event Relationship Extraction Method Incorporating Deep Feedback Recurrent Neural Network and Bidirectional Long Short-Term Memory

Authors: Yin Yuanling

Abstract:

A Deep Feedback Recurrent Neural Network (DFRNN) and Bidirectional Long Short-Term Memory (BiLSTM) are designed to address the problem of low accuracy of traditional relationship extraction models. This method combines a deep feedback-based recurrent neural network (DFRNN) with a bi-directional long short-term memory (BiLSTM) approach. The method combines DFRNN, which extracts local features of text based on deep feedback recurrent mechanism, BiLSTM, which better extracts global features of text, and Self-Attention, which extracts semantic information. Experiments show that the method achieves an F1 value of 76.69% on the CEC dataset, which is 0.0652 better than the BiLSTM+Self-ATT model, thus optimizing the performance of the deep learning method in the event relationship extraction task.

Keywords: event relations, deep learning, DFRNN models, bi-directional long and short-term memory networks

Procedia PDF Downloads 109
6216 Health Trajectory Clustering Using Deep Belief Networks

Authors: Farshid Hajati, Federico Girosi, Shima Ghassempour

Abstract:

We present a Deep Belief Network (DBN) method for clustering health trajectories. Deep Belief Network (DBN) is a deep architecture that consists of a stack of Restricted Boltzmann Machines (RBM). In a deep architecture, each layer learns more complex features than the past layers. The proposed method depends on DBN in clustering without using back propagation learning algorithm. The proposed DBN has a better a performance compared to the deep neural network due the initialization of the connecting weights. We use Contrastive Divergence (CD) method for training the RBMs which increases the performance of the network. The performance of the proposed method is evaluated extensively on the Health and Retirement Study (HRS) database. The University of Michigan Health and Retirement Study (HRS) is a nationally representative longitudinal study that has surveyed more than 27,000 elderly and near-elderly Americans since its inception in 1992. Participants are interviewed every two years and they collect data on physical and mental health, insurance coverage, financial status, family support systems, labor market status, and retirement planning. The dataset is publicly available and we use the RAND HRS version L, which is easy to use and cleaned up version of the data. The size of sample data set is 268 and the length of the trajectories is equal to 10. The trajectories do not stop when the patient dies and represent 10 different interviews of live patients. Compared to the state-of-the-art benchmarks, the experimental results show the effectiveness and superiority of the proposed method in clustering health trajectories.

Keywords: health trajectory, clustering, deep learning, DBN

Procedia PDF Downloads 347
6215 Adaptive Few-Shot Deep Metric Learning

Authors: Wentian Shi, Daming Shi, Maysam Orouskhani, Feng Tian

Abstract:

Whereas currently the most prevalent deep learning methods require a large amount of data for training, few-shot learning tries to learn a model from limited data without extensive retraining. In this paper, we present a loss function based on triplet loss for solving few-shot problem using metric based learning. Instead of setting the margin distance in triplet loss as a constant number empirically, we propose an adaptive margin distance strategy to obtain the appropriate margin distance automatically. We implement the strategy in the deep siamese network for deep metric embedding, by utilizing an optimization approach by penalizing the worst case and rewarding the best. Our experiments on image recognition and co-segmentation model demonstrate that using our proposed triplet loss with adaptive margin distance can significantly improve the performance.

Keywords: few-shot learning, triplet network, adaptive margin, deep learning

Procedia PDF Downloads 147
6214 Performance Evaluation of Distributed Deep Learning Frameworks in Cloud Environment

Authors: Shuen-Tai Wang, Fang-An Kuo, Chau-Yi Chou, Yu-Bin Fang

Abstract:

2016 has become the year of the Artificial Intelligence explosion. AI technologies are getting more and more matured that most world well-known tech giants are making large investment to increase the capabilities in AI. Machine learning is the science of getting computers to act without being explicitly programmed, and deep learning is a subset of machine learning that uses deep neural network to train a machine to learn  features directly from data. Deep learning realizes many machine learning applications which expand the field of AI. At the present time, deep learning frameworks have been widely deployed on servers for deep learning applications in both academia and industry. In training deep neural networks, there are many standard processes or algorithms, but the performance of different frameworks might be different. In this paper we evaluate the running performance of two state-of-the-art distributed deep learning frameworks that are running training calculation in parallel over multi GPU and multi nodes in our cloud environment. We evaluate the training performance of the frameworks with ResNet-50 convolutional neural network, and we analyze what factors that result in the performance among both distributed frameworks as well. Through the experimental analysis, we identify the overheads which could be further optimized. The main contribution is that the evaluation results provide further optimization directions in both performance tuning and algorithmic design.

Keywords: artificial intelligence, machine learning, deep learning, convolutional neural networks

Procedia PDF Downloads 181
6213 Powder Assisted Sheet Forming to Fabricate Ti Capsule Magnetic Hyperthermia Implant

Authors: Keigo Nishitani, Kohei Mizuta Mizuta, Kazuyoshi Kurita, Yukinori Taniguchi

Abstract:

To establish mass production process of Ti capsule which has Fe powder inside as magnetic hyperthermia implant, we assumed that Ti thin sheet can be drawn into a φ1.0 mm die hole through the medium of Fe Powder and becomes outer shell of capsule. This study discusses mechanism of powder assisted deep drawing process by both of numerical simulation and experiment. Ti thin sheet blank was placed on die, and was covered by Fe powder layer without pressurizing. Then upper punch was indented on the Fe powder layer, and the blank can be drawn into die cavity as pressurized powder particles were extruded into die cavity from behind of the drawn blank. Distinct Element Method (DEM) has been used to demonstrate the process. To identify bonding parameters on Fe particles which are cohesion, tensile bond stress and inter particle friction angle, axial and diametrical compression failure test of Fe powder compact was conducted. Several density ratios of powder compacts in range of 0.70 - 0.85 were investigated and relationship between mean stress and equivalent stress was calculated with consideration of critical state line which rules failure criterion in consolidation of Fe powder. Since variation of bonding parameters with density ratio has been experimentally identified, and good agreement has been recognized between several failure tests and its simulation, demonstration of powder assisted sheet forming by using DEM becomes applicable. Results of simulation indicated that indent/drawing length of Ti thin sheet is promoted by smaller Fe particle size, larger indent punch diameter, lower friction coefficient between die surface and Ti sheet and certain degrees of die inlet taper angle. In the deep drawing test, we have made die-set with φ2.4 mm punch and φ1.0 mm die bore diameter. Pure Ti sheet with 100 μm thickness, annealed at 650 deg. C has been tested. After indentation, indented/drawn capsule has been observed by microscope, and its length was measured to discuss the feasibility of this capsulation process. Longer drawing length exists on progressive loading pass comparing with the case of single stroke loading. It is expected that progressive loading has an advantage of which extrusion of powder particle into die cavity with Ti sheet is promoted since powder particle layer can be rebuilt while the punch is withdrawn from the layer in each loading steps. This capsulation phenomenon is qualitatively demonstrated by DEM simulation. Finally, we have fabricated Ti capsule which has Fe powder inside for magnetic hyperthermia cancer care treatment. It is concluded that suggested method is possible to use the manufacturing of Ti capsule implant for magnetic hyperthermia cancer care.

Keywords: metal powder compaction, metal forming, distinct element method, cancer care, magnetic hyperthermia

Procedia PDF Downloads 265
6212 Automated Weight Painting: Using Deep Neural Networks to Adjust 3D Mesh Skeletal Weights

Authors: John Gibbs, Benjamin Flanders, Dylan Pozorski, Weixuan Liu

Abstract:

Weight Painting–adjusting the influence a skeletal joint has on a given vertex in a character mesh–is an arduous and time con- suming part of the 3D animation pipeline. This process generally requires a trained technical animator and many hours of work to complete. Our skiNNer plug-in, which works within Autodesk’s Maya 3D animation software, uses Machine Learning and data pro- cessing techniques to create a deep neural network model that can accomplish the weight painting task in seconds rather than hours for bipedal quasi-humanoid character meshes. In order to create a properly trained network, a number of challenges were overcome, including curating an appropriately large data library, managing an arbitrary 3D mesh size, handling arbitrary skeletal architectures, accounting for extreme numeric values (most data points are near 0 or 1 for weight maps), and constructing an appropriate neural network model that can properly capture the high frequency alter- ation between high weight values (near 1.0) and low weight values (near 0.0). The arrived at neural network model is a cross between a traditional CNN, deep residual network, and fully dense network. The resultant network captures the unusually hard-edged features of a weight map matrix, and produces excellent results on many bipedal models.

Keywords: 3d animation, animation, character, rigging, skinning, weight painting, machine learning, artificial intelligence, neural network, deep neural network

Procedia PDF Downloads 245
6211 Robot Movement Using the Trust Region Policy Optimization

Authors: Romisaa Ali

Abstract:

The Policy Gradient approach is one of the deep reinforcement learning families that combines deep neural networks (DNN) with reinforcement learning RL to discover the optimum of the control problem through experience gained from the interaction between the robot and its surroundings. In contrast to earlier policy gradient algorithms, which were unable to handle these two types of error because of over-or under-estimation introduced by the deep neural network model, this article will discuss the state-of-the-art SOTA policy gradient technique, trust region policy optimization (TRPO), by applying this method in various environments compared to another policy gradient method, the Proximal Policy Optimization (PPO), to explain their robust optimization, using this SOTA to gather experience data during various training phases after observing the impact of hyper-parameters on neural network performance.

Keywords: deep neural networks, deep reinforcement learning, proximal policy optimization, state-of-the-art, trust region policy optimization

Procedia PDF Downloads 144