Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4171

Search results for: convolution neural network

4171 Two Concurrent Convolution Neural Networks TC*CNN Model for Face Recognition Using Edge

Authors: T. Alghamdi, G. Alaghband


In this paper we develop a model that couples Two Concurrent Convolution Neural Network with different filters (TC*CNN) for face recognition and compare its performance to an existing sequential CNN (base model). We also test and compare the quality and performance of the models on three datasets with various levels of complexity (easy, moderate, and difficult) and show that for the most complex datasets, edges will produce the most accurate and efficient results. We further show that in such cases while Support Vector Machine (SVM) models are fast, they do not produce accurate results.

Keywords: Convolution Neural Network, Edges, Face Recognition , Support Vector Machine.

Procedia PDF Downloads 79
4170 Multimodal Convolutional Neural Network for Musical Instrument Recognition

Authors: Yagya Raj Pandeya, Joonwhoan Lee


The dynamic behavior of music and video makes it difficult to evaluate musical instrument playing in a video by computer system. Any television or film video clip with music information are rich sources for analyzing musical instruments using modern machine learning technologies. In this research, we integrate the audio and video information sources using convolutional neural network (CNN) and pass network learned features through recurrent neural network (RNN) to preserve the dynamic behaviors of audio and video. We use different pre-trained CNN for music and video feature extraction and then fine tune each model. The music network use 2D convolutional network and video network use 3D convolution (C3D). Finally, we concatenate each music and video feature by preserving the time varying features. The long short term memory (LSTM) network is used for long-term dynamic feature characterization and then use late fusion with generalized mean. The proposed network performs better performance to recognize the musical instrument using audio-video multimodal neural network.

Keywords: multimodal, 3D convolution, music-video feature extraction, generalized mean

Procedia PDF Downloads 133
4169 Keyframe Extraction Using Face Quality Assessment and Convolution Neural Network

Authors: Rahma Abed, Sahbi Bahroun, Ezzeddine Zagrouba


Due to the huge amount of data in videos, extracting the relevant frames became a necessity and an essential step prior to performing face recognition. In this context, we propose a method for extracting keyframes from videos based on face quality and deep learning for a face recognition task. This method has two steps. We start by generating face quality scores for each face image based on the use of three face feature extractors, including Gabor, LBP, and HOG. The second step consists in training a Deep Convolutional Neural Network in a supervised manner in order to select the frames that have the best face quality. The obtained results show the effectiveness of the proposed method compared to the methods of the state of the art.

Keywords: keyframe extraction, face quality assessment, face in video recognition, convolution neural network

Procedia PDF Downloads 94
4168 Unsupervised Images Generation Based on Sloan Digital Sky Survey with Deep Convolutional Generative Neural Networks

Authors: Guanghua Zhang, Fubao Wang, Weijun Duan


Convolution neural network (CNN) has attracted more and more attention on recent years. Especially in the field of computer vision and image classification. However, unsupervised learning with CNN has received less attention than supervised learning. In this work, we use a new powerful tool which is deep convolutional generative adversarial networks (DCGANs) to generate images from Sloan Digital Sky Survey. Training by various star and galaxy images, it shows that both the generator and the discriminator are good for unsupervised learning. In this paper, we also took several experiments to choose the best value for hyper-parameters and which could help to stabilize the training process and promise a good quality of the output.

Keywords: convolution neural network, discriminator, generator, unsupervised learning

Procedia PDF Downloads 178
4167 Advances of Image Processing in Precision Agriculture: Using Deep Learning Convolution Neural Network for Soil Nutrient Classification

Authors: Halimatu S. Abdullahi, Ray E. Sheriff, Fatima Mahieddine


Agriculture is essential to the continuous existence of human life as they directly depend on it for the production of food. The exponential rise in population calls for a rapid increase in food with the application of technology to reduce the laborious work and maximize production. Technology can aid/improve agriculture in several ways through pre-planning and post-harvest by the use of computer vision technology through image processing to determine the soil nutrient composition, right amount, right time, right place application of farm input resources like fertilizers, herbicides, water, weed detection, early detection of pest and diseases etc. This is precision agriculture which is thought to be solution required to achieve our goals. There has been significant improvement in the area of image processing and data processing which has being a major challenge. A database of images is collected through remote sensing, analyzed and a model is developed to determine the right treatment plans for different crop types and different regions. Features of images from vegetations need to be extracted, classified, segmented and finally fed into the model. Different techniques have been applied to the processes from the use of neural network, support vector machine, fuzzy logic approach and recently, the most effective approach generating excellent results using the deep learning approach of convolution neural network for image classifications. Deep Convolution neural network is used to determine soil nutrients required in a plantation for maximum production. The experimental results on the developed model yielded results with an average accuracy of 99.58%.

Keywords: convolution, feature extraction, image analysis, validation, precision agriculture

Procedia PDF Downloads 241
4166 A Case Study of Deep Learning for Disease Detection in Crops

Authors: Felipe A. Guth, Shane Ward, Kevin McDonnell


In the precision agriculture area, one of the main tasks is the automated detection of diseases in crops. Machine Learning algorithms have been studied in recent decades for such tasks in view of their potential for improving economic outcomes that automated disease detection may attain over crop fields. The latest generation of deep learning convolution neural networks has presented significant results in the area of image classification. In this way, this work has tested the implementation of an architecture of deep learning convolution neural network for the detection of diseases in different types of crops. A data augmentation strategy was used to meet the requirements of the algorithm implemented with a deep learning framework. Two test scenarios were deployed. The first scenario implemented a neural network under images extracted from a controlled environment while the second one took images both from the field and the controlled environment. The results evaluated the generalisation capacity of the neural networks in relation to the two types of images presented. Results yielded a general classification accuracy of 59% in scenario 1 and 96% in scenario 2.

Keywords: convolutional neural networks, deep learning, disease detection, precision agriculture

Procedia PDF Downloads 171
4164 Satellite Imagery Classification Based on Deep Convolution Network

Authors: Zhong Ma, Zhuping Wang, Congxin Liu, Xiangzeng Liu


Satellite imagery classification is a challenging problem with many practical applications. In this paper, we designed a deep convolution neural network (DCNN) to classify the satellite imagery. The contributions of this paper are twofold — First, to cope with the large-scale variance in the satellite image, we introduced the inception module, which has multiple filters with different size at the same level, as the building block to build our DCNN model. Second, we proposed a genetic algorithm based method to efficiently search the best hyper-parameters of the DCNN in a large search space. The proposed method is evaluated on the benchmark database. The results of the proposed hyper-parameters search method show it will guide the search towards better regions of the parameter space. Based on the found hyper-parameters, we built our DCNN models, and evaluated its performance on satellite imagery classification, the results show the classification accuracy of proposed models outperform the state of the art method.

Keywords: satellite imagery classification, deep convolution network, genetic algorithm, hyper-parameter optimization

Procedia PDF Downloads 235
4163 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks

Authors: Jiajun Wang, Xiaoge Li


The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods mainly focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence of words in different ranges in the sentence, resulting in deviation when assigning relationship weight to other words other than aspect words. To solve these problems, we propose a new aspect-level sentiment analysis model that combines a multi-channel convolutional network and graph convolutional network (GCN). Firstly, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and self-attention mechanism. Besides, a multi-channel convolutional network is used to extract the features of words in different ranges. Finally, a convolutional graph network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models, which shows that our model is better and more effective.

Keywords: aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree

Procedia PDF Downloads 61
4162 The Convolution Recurrent Network of Using Residual LSTM to Process the Output of the Downsampling for Monaural Speech Enhancement

Authors: Shibo Wei, Ting Jiang


Convolutional-recurrent neural networks (CRN) have achieved much success recently in the speech enhancement field. The common processing method is to use the convolution layer to compress the feature space by multiple upsampling and then model the compressed features with the LSTM layer. At last, the enhanced speech is obtained by deconvolution operation to integrate the global information of the speech sequence. However, the feature space compression process may cause the loss of information, so we propose to model the upsampling result of each step with the residual LSTM layer, then join it with the output of the deconvolution layer and input them to the next deconvolution layer, by this way, we want to integrate the global information of speech sequence better. The experimental results show the network model (RES-CRN) we introduce can achieve better performance than LSTM without residual and overlaying LSTM simply in the original CRN in terms of scale-invariant signal-to-distortion ratio (SI-SNR), speech quality (PESQ), and intelligibility (STOI).

Keywords: convolutional-recurrent neural networks, speech enhancement, residual LSTM, SI-SNR

Procedia PDF Downloads 118
4161 Selecting the Best RBF Neural Network Using PSO Algorithm for ECG Signal Prediction

Authors: Najmeh Mohsenifar, Narjes Mohsenifar, Abbas Kargar


In this paper, has been presented a stable method for predicting the ECG signals through the RBF neural networks, by the PSO algorithm. In spite of quasi-periodic ECG signal from a healthy person, there are distortions in electro cardiographic data for a patient. Therefore, there is no precise mathematical model for prediction. Here, we have exploited neural networks that are capable of complicated nonlinear mapping. Although the architecture and spread of RBF networks are usually selected through trial and error, the PSO algorithm has been used for choosing the best neural network. In this way, 2 second of a recorded ECG signal is employed to predict duration of 20 second in advance. Our simulations show that PSO algorithm can find the RBF neural network with minimum MSE and the accuracy of the predicted ECG signal is 97 %.

Keywords: electrocardiogram, RBF artificial neural network, PSO algorithm, predict, accuracy

Procedia PDF Downloads 509
4160 Assessing Artificial Neural Network Models on Forecasting the Return of Stock Market Index

Authors: Hamid Rostami Jaz, Kamran Ameri Siahooei


Up to now different methods have been used to forecast the index returns and the index rate. Artificial intelligence and artificial neural networks have been one of the methods of index returns forecasting. This study attempts to carry out a comparative study on the performance of different Radial Base Neural Network and Feed-Forward Perceptron Neural Network to forecast investment returns on the index. To achieve this goal, the return on investment in Tehran Stock Exchange index is evaluated and the performance of Radial Base Neural Network and Feed-Forward Perceptron Neural Network are compared. Neural networks performance test is applied based on the least square error in two approaches of in-sample and out-of-sample. The research results show the superiority of the radial base neural network in the in-sample approach and the superiority of perceptron neural network in the out-of-sample approach.

Keywords: exchange index, forecasting, perceptron neural network, Tehran stock exchange

Procedia PDF Downloads 321
4159 Convolutional Neural Network Based on Random Kernels for Analyzing Visual Imagery

Authors: Ja-Keoung Koo, Kensuke Nakamura, Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Byung-Woo Hong


The machine learning techniques based on a convolutional neural network (CNN) have been actively developed and successfully applied to a variety of image analysis tasks including reconstruction, noise reduction, resolution enhancement, segmentation, motion estimation, object recognition. The classical visual information processing that ranges from low level tasks to high level ones has been widely developed in the deep learning framework. It is generally considered as a challenging problem to derive visual interpretation from high dimensional imagery data. A CNN is a class of feed-forward artificial neural network that usually consists of deep layers the connections of which are established by a series of non-linear operations. The CNN architecture is known to be shift invariant due to its shared weights and translation invariance characteristics. However, it is often computationally intractable to optimize the network in particular with a large number of convolution layers due to a large number of unknowns to be optimized with respect to the training set that is generally required to be large enough to effectively generalize the model under consideration. It is also necessary to limit the size of convolution kernels due to the computational expense despite of the recent development of effective parallel processing machinery, which leads to the use of the constantly small size of the convolution kernels throughout the deep CNN architecture. However, it is often desired to consider different scales in the analysis of visual features at different layers in the network. Thus, we propose a CNN model where different sizes of the convolution kernels are applied at each layer based on the random projection. We apply random filters with varying sizes and associate the filter responses with scalar weights that correspond to the standard deviation of the random filters. We are allowed to use large number of random filters with the cost of one scalar unknown for each filter. The computational cost in the back-propagation procedure does not increase with the larger size of the filters even though the additional computational cost is required in the computation of convolution in the feed-forward procedure. The use of random kernels with varying sizes allows to effectively analyze image features at multiple scales leading to a better generalization. The robustness and effectiveness of the proposed CNN based on random kernels are demonstrated by numerical experiments where the quantitative comparison of the well-known CNN architectures and our models that simply replace the convolution kernels with the random filters is performed. The experimental results indicate that our model achieves better performance with less number of unknown weights. The proposed algorithm has a high potential in the application of a variety of visual tasks based on the CNN framework. Acknowledgement—This work was supported by the MISP (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by IITP, and NRF-2014R1A2A1A11051941, NRF2017R1A2B4006023.

Keywords: deep learning, convolutional neural network, random kernel, random projection, dimensionality reduction, object recognition

Procedia PDF Downloads 211
4158 The Application of a Hybrid Neural Network for Recognition of a Handwritten Kazakh Text

Authors: Almagul Assainova , Dariya Abykenova, Liudmila Goncharenko, Sergey Sybachin, Saule Rakhimova, Abay Aman


The recognition of a handwritten Kazakh text is a relevant objective today for the digitization of materials. The study presents a model of a hybrid neural network for handwriting recognition, which includes a convolutional neural network and a multi-layer perceptron. Each network includes 1024 input neurons and 42 output neurons. The model is implemented in the program, written in the Python programming language using the EMNIST database, NumPy, Keras, and Tensorflow modules. The neural network training of such specific letters of the Kazakh alphabet as ә, ғ, қ, ң, ө, ұ, ү, h, і was conducted. The neural network model and the program created on its basis can be used in electronic document management systems to digitize the Kazakh text.

Keywords: handwriting recognition system, image recognition, Kazakh font, machine learning, neural networks

Procedia PDF Downloads 55
4157 A Mechanical Diagnosis Method Based on Vibration Fault Signal down-Sampling and the Improved One-Dimensional Convolutional Neural Network

Authors: Bowei Yuan, Shi Li, Liuyang Song, Huaqing Wang, Lingli Cui


Convolutional neural networks (CNN) have received extensive attention in the field of fault diagnosis. Many fault diagnosis methods use CNN for fault type identification. However, when the amount of raw data collected by sensors is massive, the neural network needs to perform a time-consuming classification task. In this paper, a mechanical fault diagnosis method based on vibration signal down-sampling and the improved one-dimensional convolutional neural network is proposed. Through the robust principal component analysis, the low-rank feature matrix of a large amount of raw data can be separated, and then down-sampling is realized to reduce the subsequent calculation amount. In the improved one-dimensional CNN, a smaller convolution kernel is used to reduce the number of parameters and computational complexity, and regularization is introduced before the fully connected layer to prevent overfitting. In addition, the multi-connected layers can better generalize classification results without cumbersome parameter adjustments. The effectiveness of the method is verified by monitoring the signal of the centrifugal pump test bench, and the average test accuracy is above 98%. When compared with the traditional deep belief network (DBN) and support vector machine (SVM) methods, this method has better performance.

Keywords: fault diagnosis, vibration signal down-sampling, 1D-CNN

Procedia PDF Downloads 56
4156 Design of Neural Predictor for Vibration Analysis of Drilling Machine

Authors: İkbal Eski


This investigation is researched on design of robust neural network predictors for analyzing vibration effects on moving parts of a drilling machine. Moreover, the research is divided two parts; first part is experimental investigation, second part is simulation analysis with neural networks. Therefore, a real time the drilling machine is used to vibrations during working conditions. The measured real vibration parameters are analyzed with proposed neural network. As results: Simulation approaches show that Radial Basis Neural Network has good performance to adapt real time parameters of the drilling machine.

Keywords: artificial neural network, vibration analyses, drilling machine, robust

Procedia PDF Downloads 300
4155 Artificial Neural Network Speed Controller for Excited DC Motor

Authors: Elabed Saud


This paper introduces the new ability of Artificial Neural Networks (ANNs) in estimating speed and controlling the separately excited DC motor. The neural control scheme consists of two parts. One is the neural estimator which is used to estimate the motor speed. The other is the neural controller which is used to generate a control signal for a converter. These two neutrals are training by Levenberg-Marquardt back-propagation algorithm. ANNs are the standard three layers feed-forward neural network with sigmoid activation functions in the input and hidden layers and purelin in the output layer. Simulation results are presented to demonstrate the effectiveness of this neural and advantage of the control system DC motor with ANNs in comparison with the conventional scheme without ANNs.

Keywords: Artificial Neural Network (ANNs), excited DC motor, convenional controller, speed Controller

Procedia PDF Downloads 541
4154 Trusted Neural Network: Reversibility in Neural Networks for Network Integrity Verification

Authors: Malgorzata Schwab, Ashis Kumer Biswas


In this concept paper, we explore the topic of Reversibility in Neural Networks leveraged for Network Integrity Verification and crafted the term ''Trusted Neural Network'' (TNN), paired with the API abstraction around it, to embrace the idea formally. This newly proposed high-level generalizable TNN model builds upon the Invertible Neural Network architecture, trained simultaneously in both forward and reverse directions. This allows for the original system inputs to be compared with the ones reconstructed from the outputs in the reversed flow to assess the integrity of the end-to-end inference flow. The outcome of that assessment is captured as an Integrity Score. Concrete implementation reflecting the needs of specific problem domains can be derived from this general approach and is demonstrated in the experiments. The model aspires to become a useful practice in drafting high-level systems architectures which incorporate AI capabilities.

Keywords: trusted, neural, invertible, API

Procedia PDF Downloads 35
4153 Prediction of Oil Recovery Factor Using Artificial Neural Network

Authors: O. P. Oladipo, O. A. Falode


The determination of Recovery Factor is of great importance to the reservoir engineer since it relates reserves to the initial oil in place. Reserves are the producible portion of reservoirs and give an indication of the profitability of a field Development. The core objective of this project is to develop an artificial neural network model using selected reservoir data to predict Recovery Factors (RF) of hydrocarbon reservoirs and compare the model with a couple of the existing correlations. The type of Artificial Neural Network model developed was the Single Layer Feed Forward Network. MATLAB was used as the network simulator and the network was trained using the supervised learning method, Afterwards, the network was tested with input data never seen by the network. The results of the predicted values of the recovery factors of the Artificial Neural Network Model, API Correlation for water drive reservoirs (Sands and Sandstones) and Guthrie and Greenberger Correlation Equation were obtained and compared. It was noted that the coefficient of correlation of the Artificial Neural Network Model was higher than the coefficient of correlations of the other two correlation equations, thus making it a more accurate prediction tool. The Artificial Neural Network, because of its accurate prediction ability is helpful in the correct prediction of hydrocarbon reservoir factors. Artificial Neural Network could be applied in the prediction of other Petroleum Engineering parameters because it is able to recognise complex patterns of data set and establish a relationship between them.

Keywords: recovery factor, reservoir, reserves, artificial neural network, hydrocarbon, MATLAB, API, Guthrie, Greenberger

Procedia PDF Downloads 343
4152 Performance Comparison of Deep Convolutional Neural Networks for Binary Classification of Fine-Grained Leaf Images

Authors: Kamal KC, Zhendong Yin, Dasen Li, Zhilu Wu


Intra-plant disease classification based on leaf images is a challenging computer vision task due to similarities in texture, color, and shape of leaves with a slight variation of leaf spot; and external environmental changes such as lighting and background noises. Deep convolutional neural network (DCNN) has proven to be an effective tool for binary classification. In this paper, two methods for binary classification of diseased plant leaves using DCNN are presented; model created from scratch and transfer learning. Our main contribution is a thorough evaluation of 4 networks created from scratch and transfer learning of 5 pre-trained models. Training and testing of these models were performed on a plant leaf images dataset belonging to 16 distinct classes, containing a total of 22,265 images from 8 different plants, consisting of a pair of healthy and diseased leaves. We introduce a deep CNN model, Optimized MobileNet. This model with depthwise separable CNN as a building block attained an average test accuracy of 99.77%. We also present a fine-tuning method by introducing the concept of a convolutional block, which is a collection of different deep neural layers. Fine-tuned models proved to be efficient in terms of accuracy and computational cost. Fine-tuned MobileNet achieved an average test accuracy of 99.89% on 8 pairs of [healthy, diseased] leaf ImageSet.

Keywords: deep convolution neural network, depthwise separable convolution, fine-grained classification, MobileNet, plant disease, transfer learning

Procedia PDF Downloads 111
4151 Non-Targeted Adversarial Object Detection Attack: Fast Gradient Sign Method

Authors: Bandar Alahmadi, Manohar Mareboyana, Lethia Jackson


Today, there are many applications that are using computer vision models, such as face recognition, image classification, and object detection. The accuracy of these models is very important for the performance of these applications. One challenge that facing the computer vision models is the adversarial examples attack. In computer vision, the adversarial example is an image that is intentionally designed to cause the machine learning model to misclassify it. One of very well-known method that is used to attack the Convolution Neural Network (CNN) is Fast Gradient Sign Method (FGSM). The goal of this method is to find the perturbation that can fool the CNN using the gradient of the cost function of CNN. In this paper, we introduce a novel model that can attack Regional-Convolution Neural Network (R-CNN) that use FGSM. We first extract the regions that are detected by R-CNN, and then we resize these regions into the size of regular images. Then, we find the best perturbation of the regions that can fool CNN using FGSM. Next, we add the resulted perturbation to the attacked region to get a new region image that looks similar to the original image to human eyes. Finally, we placed the regions back to the original image and test the R-CNN with the attacked images. Our model could drop the accuracy of the R-CNN when we tested with Pascal VOC 2012 dataset.

Keywords: adversarial examples, attack, computer vision, image processing

Procedia PDF Downloads 91
4150 A Two-Step Framework for Unsupervised Speaker Segmentation Using BIC and Artificial Neural Network

Authors: Ahmad Alwosheel, Ahmed Alqaraawi


This work proposes a new speaker segmentation approach for two speakers. It is an online approach that does not require a prior information about speaker models. It has two phases, a conventional approach such as unsupervised BIC-based is utilized in the first phase to detect speaker changes and train a Neural Network, while in the second phase, the output trained parameters from the Neural Network are used to predict next incoming audio stream. Using this approach, a comparable accuracy to similar BIC-based approaches is achieved with a significant improvement in terms of computation time.

Keywords: artificial neural network, diarization, speaker indexing, speaker segmentation

Procedia PDF Downloads 372
4149 Optimizing the Probabilistic Neural Network Training Algorithm for Multi-Class Identification

Authors: Abdelhadi Lotfi, Abdelkader Benyettou


In this work, a training algorithm for probabilistic neural networks (PNN) is presented. The algorithm addresses one of the major drawbacks of PNN, which is the size of the hidden layer in the network. By using a cross-validation training algorithm, the number of hidden neurons is shrunk to a smaller number consisting of the most representative samples of the training set. This is done without affecting the overall architecture of the network. Performance of the network is compared against performance of standard PNN for different databases from the UCI database repository. Results show an important gain in network size and performance.

Keywords: classification, probabilistic neural networks, network optimization, pattern recognition

Procedia PDF Downloads 107
4148 Lean Comic GAN (LC-GAN): a Light-Weight GAN Architecture Leveraging Factorized Convolution and Teacher Forcing Distillation Style Loss Aimed to Capture Two Dimensional Animated Filtered Still Shots Using Mobile Phone Camera and Edge Devices

Authors: Kaustav Mukherjee


In this paper we propose a Neural Style Transfer solution whereby we have created a Lightweight Separable Convolution Kernel Based GAN Architecture (SC-GAN) which will very useful for designing filter for Mobile Phone Cameras and also Edge Devices which will convert any image to its 2D ANIMATED COMIC STYLE Movies like HEMAN, SUPERMAN, JUNGLE-BOOK. This will help the 2D animation artist by relieving to create new characters from real life person's images without having to go for endless hours of manual labour drawing each and every pose of a cartoon. It can even be used to create scenes from real life images.This will reduce a huge amount of turn around time to make 2D animated movies and decrease cost in terms of manpower and time. In addition to that being extreme light-weight it can be used as camera filters capable of taking Comic Style Shots using mobile phone camera or edge device cameras like Raspberry Pi 4,NVIDIA Jetson NANO etc. Existing Methods like CartoonGAN with the model size close to 170 MB is too heavy weight for mobile phones and edge devices due to their scarcity in resources. Compared to the current state of the art our proposed method which has a total model size of 31 MB which clearly makes it ideal and ultra-efficient for designing of camera filters on low resource devices like mobile phones, tablets and edge devices running OS or RTOS. .Owing to use of high resolution input and usage of bigger convolution kernel size it produces richer resolution Comic-Style Pictures implementation with 6 times lesser number of parameters and with just 25 extra epoch trained on a dataset of less than 1000 which breaks the myth that all GAN need mammoth amount of data. Our network reduces the density of the Gan architecture by using Depthwise Separable Convolution which does the convolution operation on each of the RGB channels separately then we use a Point-Wise Convolution to bring back the network into required channel number using 1 by 1 kernel.This reduces the number of parameters substantially and makes it extreme light-weight and suitable for mobile phones and edge devices. The architecture mentioned in the present paper make use of Parameterised Batch Normalization Goodfellow etc al. (Deep Learning OPTIMIZATION FOR TRAINING DEEP MODELS page 320) which makes the network to use the advantage of Batch Norm for easier training while maintaining the non-linear feature capture by inducing the learnable parameters

Keywords: comic stylisation from camera image using GAN, creating 2D animated movie style custom stickers from images, depth-wise separable convolutional neural network for light-weight GAN architecture for EDGE devices, GAN architecture for 2D animated cartoonizing neural style, neural style transfer for edge, model distilation, perceptual loss

Procedia PDF Downloads 50
4147 An Improved Convolution Deep Learning Model for Predicting Trip Mode Scheduling

Authors: Amin Nezarat, Naeime Seifadini


Trip mode selection is a behavioral characteristic of passengers with immense importance for travel demand analysis, transportation planning, and traffic management. Identification of trip mode distribution will allow transportation authorities to adopt appropriate strategies to reduce travel time, traffic and air pollution. The majority of existing trip mode inference models operate based on human selected features and traditional machine learning algorithms. However, human selected features are sensitive to changes in traffic and environmental conditions and susceptible to personal biases, which can make them inefficient. One way to overcome these problems is to use neural networks capable of extracting high-level features from raw input. In this study, the convolutional neural network (CNN) architecture is used to predict the trip mode distribution based on raw GPS trajectory data. The key innovation of this paper is the design of the layout of the input layer of CNN as well as normalization operation, in a way that is not only compatible with the CNN architecture but can also represent the fundamental features of motion including speed, acceleration, jerk, and Bearing rate. The highest prediction accuracy achieved with the proposed configuration for the convolutional neural network with batch normalization is 85.26%.

Keywords: predicting, deep learning, neural network, urban trip

Procedia PDF Downloads 63
4146 Facial Emotion Recognition Using Deep Learning

Authors: Ashutosh Mishra, Nikhil Goyal


A 3D facial emotion recognition model based on deep learning is proposed in this paper. Two convolution layers and a pooling layer are employed in the deep learning architecture. After the convolution process, the pooling is finished. The probabilities for various classes of human faces are calculated using the sigmoid activation function. To verify the efficiency of deep learning-based systems, a set of faces. The Kaggle dataset is used to verify the accuracy of a deep learning-based face recognition model. The model's accuracy is about 65 percent, which is lower than that of other facial expression recognition techniques. Despite significant gains in representation precision due to the nonlinearity of profound image representations.

Keywords: facial recognition, computational intelligence, convolutional neural network, depth map

Procedia PDF Downloads 84
4145 Urban Land Cover from GF-2 Satellite Images Using Object Based and Neural Network Classifications

Authors: Lamyaa Gamal El-Deen Taha, Ashraf Sharawi


China launched satellite GF-2 in 2014. This study deals with comparing nearest neighbor object-based classification and neural network classification methods for classification of the fused GF-2 image. Firstly, rectification of GF-2 image was performed. Secondly, a comparison between nearest neighbor object-based classification and neural network classification for classification of fused GF-2 was performed. Thirdly, the overall accuracy of classification and kappa index were calculated. Results indicate that nearest neighbor object-based classification is better than neural network classification for urban mapping.

Keywords: GF-2 images, feature extraction-rectification, nearest neighbour object based classification, segmentation algorithms, neural network classification, multilayer perceptron

Procedia PDF Downloads 247
4144 A Neural Network Classifier for Identifying Duplicate Image Entries in Real-Estate Databases

Authors: Sergey Ermolin, Olga Ermolin


A Deep Convolution Neural Network with Triplet Loss is used to identify duplicate images in real-estate advertisements in the presence of image artifacts such as watermarking, cropping, hue/brightness adjustment, and others. The effects of batch normalization, spatial dropout, and various convergence methodologies on the resulting detection accuracy are discussed. For comparative Return-on-Investment study (per industry request), end-2-end performance is benchmarked on both Nvidia Titan GPUs and Intel’s Xeon CPUs. A new real-estate dataset from San Francisco Bay Area is used for this work. Sufficient duplicate detection accuracy is achieved to supplement other database-grounded methods of duplicate removal. The implemented method is used in a Proof-of-Concept project in the real-estate industry.

Keywords: visual recognition, convolutional neural networks, triplet loss, spatial batch normalization with dropout, duplicate removal, advertisement technologies, performance benchmarking

Procedia PDF Downloads 259
4143 Influence of the Refractory Period on Neural Networks Based on the Recognition of Neural Signatures

Authors: José Luis Carrillo-Medina, Roberto Latorre


Experimental evidence has revealed that different living neural systems can sign their output signals with some specific neural signature. Although experimental and modeling results suggest that neural signatures can have an important role in the activity of neural networks in order to identify the source of the information or to contextualize a message, the functional meaning of these neural fingerprints is still unclear. The existence of cellular mechanisms to identify the origin of individual neural signals can be a powerful information processing strategy for the nervous system. We have recently built different models to study the ability of a neural network to process information based on the emission and recognition of specific neural fingerprints. In this paper we further analyze the features that can influence on the information processing ability of this kind of networks. In particular, we focus on the role that the duration of a refractory period in each neuron after emitting a signed message can play in the network collective dynamics.

Keywords: neural signature, neural fingerprint, processing based on signal identification, self-organizing neural network

Procedia PDF Downloads 375
4142 Facial Emotion Recognition with Convolutional Neural Network Based Architecture

Authors: Koray U. Erbas


Neural networks are appealing for many applications since they are able to learn complex non-linear relationships between input and output data. As the number of neurons and layers in a neural network increase, it is possible to represent more complex relationships with automatically extracted features. Nowadays Deep Neural Networks (DNNs) are widely used in Computer Vision problems such as; classification, object detection, segmentation image editing etc. In this work, Facial Emotion Recognition task is performed by proposed Convolutional Neural Network (CNN)-based DNN architecture using FER2013 Dataset. Moreover, the effects of different hyperparameters (activation function, kernel size, initializer, batch size and network size) are investigated and ablation study results for Pooling Layer, Dropout and Batch Normalization are presented.

Keywords: convolutional neural network, deep learning, deep learning based FER, facial emotion recognition

Procedia PDF Downloads 84