Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 12

CNN Related Abstracts

12 Optimizing the Capacity of a Convolutional Neural Network for Image Segmentation and Pattern Recognition

Authors: Yalong Jiang, Zheru Chi

Abstract:

In this paper, we study the factors which determine the capacity of a Convolutional Neural Network (CNN) model and propose the ways to evaluate and adjust the capacity of a CNN model for best matching to a specific pattern recognition task. Firstly, a scheme is proposed to adjust the number of independent functional units within a CNN model to make it be better fitted to a task. Secondly, the number of independent functional units in the capsule network is adjusted to fit it to the training dataset. Thirdly, a method based on Bayesian GAN is proposed to enrich the variances in the current dataset to increase its complexity. Experimental results on the PASCAL VOC 2010 Person Part dataset and the MNIST dataset show that, in both conventional CNN models and capsule networks, the number of independent functional units is an important factor that determines the capacity of a network model. By adjusting the number of functional units, the capacity of a model can better match the complexity of a dataset.

Keywords: Character Recognition, convolutional neural network, semantic segmentation, CNN, capsule network, capacity optimization, data augmentation

Procedia PDF Downloads 23
11 Application of Deep Learning Algorithms in Agriculture: Early Detection of Crop Diseases

Authors: Manaranjan Pradhan, Shailaja Grover, U. Dinesh Kumar

Abstract:

Farming community in India, as well as other parts of the world, is one of the highly stressed communities due to reasons such as increasing input costs (cost of seeds, fertilizers, pesticide), droughts, reduced revenue leading to farmer suicides. Lack of integrated farm advisory system in India adds to the farmers problems. Farmers need right information during the early stages of crop’s lifecycle to prevent damage and loss in revenue. In this paper, we use deep learning techniques to develop an early warning system for detection of crop diseases using images taken by farmers using their smart phone. The research work leads to building a smart assistant using analytics and big data which could help the farmers with early diagnosis of the crop diseases and corrective actions. The classical approach for crop disease management has been to identify diseases at crop level. Recently, ImageNet Classification using the convolutional neural network (CNN) has been successfully used to identify diseases at individual plant level. Our model uses convolution filters, max pooling, dense layers and dropouts (to avoid overfitting). The models are built for binary classification (healthy or not healthy) and multi class classification (identifying which disease). Transfer learning is used to modify the weights of parameters learnt through ImageNet dataset and apply them on crop diseases, which reduces number of epochs to learn. One shot learning is used to learn from very few images, while data augmentation techniques are used to improve accuracy with images taken from farms by using techniques such as rotation, zoom, shift and blurred images. Models built using combination of these techniques are more robust for deploying in the real world. Our model is validated using tomato crop. In India, tomato is affected by 10 different diseases. Our model achieves an accuracy of more than 95% in correctly classifying the diseases. The main contribution of our research is to create a personal assistant for farmers for managing plant disease, although the model was validated using tomato crop, it can be easily extended to other crops. The advancement of technology in computing and availability of large data has made possible the success of deep learning applications in computer vision, natural language processing, image recognition, etc. With these robust models and huge smartphone penetration, feasibility of implementation of these models is high resulting in timely advise to the farmers and thus increasing the farmers' income and reducing the input costs.

Keywords: Image Recognition, Transfer Learning, CNN, data augmentation, analytics in agriculture, crop disease detection, one shot learning

Procedia PDF Downloads 11
10 A Recognition Method of Ancient Yi Script Based on Deep Learning

Authors: Shanxiong Chen, Xu Han, Xiaolong Wang, Hui Ma

Abstract:

Yi is an ethnic group mainly living in mainland China, with its own spoken and written language systems, after development of thousands of years. Ancient Yi is one of the six ancient languages in the world, which keeps a record of the history of the Yi people and offers documents valuable for research into human civilization. Recognition of the characters in ancient Yi helps to transform the documents into an electronic form, making their storage and spreading convenient. Due to historical and regional limitations, research on recognition of ancient characters is still inadequate. Thus, deep learning technology was applied to the recognition of such characters. Five models were developed on the basis of the four-layer convolutional neural network (CNN). Alpha-Beta divergence was taken as a penalty term to re-encode output neurons of the five models. Two fully connected layers fulfilled the compression of the features. Finally, at the softmax layer, the orthographic features of ancient Yi characters were re-evaluated, their probability distributions were obtained, and characters with features of the highest probability were recognized. Tests conducted show that the method has achieved higher precision compared with the traditional CNN model for handwriting recognition of the ancient Yi.

Keywords: Recognition, divergence, CNN, Yi character

Procedia PDF Downloads 7
9 Automatic Number Plate Recognition System Based on Deep Learning

Authors: T. Damak, O. Kriaa, A. Baccar, M. A. Ben Ayed, N. Masmoudi

Abstract:

In the last few years, Automatic Number Plate Recognition (ANPR) systems have become widely used in the safety, the security, and the commercial aspects. Forethought, several methods and techniques are computing to achieve the better levels in terms of accuracy and real time execution. This paper proposed a computer vision algorithm of Number Plate Localization (NPL) and Characters Segmentation (CS). In addition, it proposed an improved method in Optical Character Recognition (OCR) based on Deep Learning (DL) techniques. In order to identify the number of detected plate after NPL and CS steps, the Convolutional Neural Network (CNN) algorithm is proposed. A DL model is developed using four convolution layers, two layers of Maxpooling, and six layers of fully connected. The model was trained by number image database on the Jetson TX2 NVIDIA target. The accuracy result has achieved 95.84%.

Keywords: Deep learning, CNN, ANPR, NPL

Procedia PDF Downloads 2
8 SNR Classification Using Multiple CNNs

Authors: Thinh Ngo, Paul Rad, Brian Kelley

Abstract:

Noise estimation is essential in today wireless systems for power control, adaptive modulation, interference suppression and quality of service. Deep learning (DL) has already been applied in the physical layer for modulation and signal classifications. Unacceptably low accuracy of less than 50% is found to undermine traditional application of DL classification for SNR prediction. In this paper, we use divide-and-conquer algorithm and classifier fusion method to simplify SNR classification and therefore enhances DL learning and prediction. Specifically, multiple CNNs are used for classification rather than a single CNN. Each CNN performs a binary classification of a single SNR with two labels: less than, greater than or equal. Together, multiple CNNs are combined to effectively classify over a range of SNR values from −20 ≤ SNR ≤ 32 dB.We use pre-trained CNNs to predict SNR over a wide range of joint channel parameters including multiple Doppler shifts (0, 60, 120 Hz), power-delay profiles, and signal-modulation types (QPSK,16QAM,64-QAM). The approach achieves individual SNR prediction accuracy of 92%, composite accuracy of 70% and prediction convergence one order of magnitude faster than that of traditional estimation.

Keywords: classification, Deep learning, prediction, SNR, CNN

Procedia PDF Downloads 1
7 GRCNN: Graph Recognition Convolutional Neural Network for Synthesizing Programs from Flow Charts

Authors: Lin Cheng, Zijiang Yang

Abstract:

Program synthesis is the task to automatically generate programs based on user specification. In this paper, we present a framework that synthesizes programs from flow charts that serve as accurate and intuitive specification. In order doing so, we propose a deep neural network called GRCNN that recognizes graph structure from its image. GRCNN is trained end-to-end, which can predict edge and node information of the flow chart simultaneously. Experiments show that the accuracy rate to synthesize a program is 66.4%, and the accuracy rates to recognize edge and node are 94.1% and 67.9%, respectively. On average, it takes about 60 milliseconds to synthesize a program.

Keywords: specification, Program Synthesis, CNN, flow chart, graph recognition

Procedia PDF Downloads 1
6 Detection of Safety Goggles on Humans in Industrial Environment Using Faster-Region Based on Convolutional Neural Network with Rotated Bounding Box

Authors: Ankit Kamboj, Shikha Talwar, Nilesh Powar

Abstract:

To successfully deliver our products in the market, the employees need to be in a safe environment, especially in an industrial and manufacturing environment. The consequences of delinquency in wearing safety glasses while working in industrial plants could be high risk to employees, hence the need to develop a real-time automatic detection system which detects the persons (violators) not wearing safety glasses. In this study a convolutional neural network (CNN) algorithm called faster region based CNN (Faster RCNN) with rotated bounding box has been used for detecting safety glasses on persons; the algorithm has an advantage of detecting safety glasses with different orientation angles on the persons. The proposed method of rotational bounding boxes with a convolutional neural network first detects a person from the images, and then the method detects whether the person is wearing safety glasses or not. The video data is captured at the entrance of restricted zones of the industrial environment (manufacturing plant), which is further converted into images at 2 frames per second. In the first step, the CNN with pre-trained weights on COCO dataset is used for person detection where the detections are cropped as images. Then the safety goggles are labelled on the cropped images using the image labelling tool called roLabelImg, which is used to annotate the ground truth values of rotated objects more accurately, and the annotations obtained are further modified to depict four coordinates of the rectangular bounding box. Next, the faster RCNN with rotated bounding box is used to detect safety goggles, which is then compared with traditional bounding box faster RCNN in terms of detection accuracy (average precision), which shows the effectiveness of the proposed method for detection of rotatory objects. The deep learning benchmarking is done on a Dell workstation with a 16GB Nvidia GPU.

Keywords: Deep learning, CNN, faster RCNN, roLabelImg rotated bounding box, safety goggle detection

Procedia PDF Downloads 1
5 Artificial Neural Network for Forecasting of Daily Reservoir Inflow: Case Study of the Kotmale Reservoir in Sri Lanka

Authors: E. U. Dampage, Ovindi D. Bandara, Vinushi S. Waraketiya, Samitha S. R. De Silva, Yasiru S. Gunarathne

Abstract:

The knowledge of water inflow figures is paramount in decision making on the allocation for consumption for numerous purposes; irrigation, hydropower, domestic and industrial usage, and flood control. The understanding of how reservoir inflows are affected by different climatic and hydrological conditions is crucial to enable effective water management and downstream flood control. In this research, we propose a method using a Long Short Term Memory (LSTM) Artificial Neural Network (ANN) to assist the aforesaid decision-making process. The Kotmale reservoir, which is the uppermost reservoir in the Mahaweli reservoir complex in Sri Lanka, was used as the test bed for this research. The ANN uses the runoff in the Kotmale reservoir catchment area and the effect of Sea Surface Temperatures (SST) to make a forecast for seven days ahead. Three types of ANN are tested; Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), and LSTM. The extensive field trials and validation endeavors found that the LSTM ANN provides superior performance in the aspects of accuracy and latency.

Keywords: Neural Network, multi-layer perceptron, MLP, convolutional neural network, long short-term memory, LSTM, CNN, inflow

Procedia PDF Downloads 1
4 A Survey of Field Programmable Gate Array-Based Convolutional Neural Network Accelerators

Authors: Wei Zhang

Abstract:

With the rapid development of deep learning, neural network and deep learning algorithms play a significant role in various practical applications. Due to the high accuracy and good performance, Convolutional Neural Networks (CNNs) especially have become a research hot spot in the past few years. However, the size of the networks becomes increasingly large scale due to the demands of the practical applications, which poses a significant challenge to construct a high-performance implementation of deep learning neural networks. Meanwhile, many of these application scenarios also have strict requirements on the performance and low-power consumption of hardware devices. Therefore, it is particularly critical to choose a moderate computing platform for hardware acceleration of CNNs. This article aimed to survey the recent advance in Field Programmable Gate Array (FPGA)-based acceleration of CNNs. Various designs and implementations of the accelerator based on FPGA under different devices and network models are overviewed, and the versions of Graphic Processing Units (GPUs), Application Specific Integrated Circuits (ASICs) and Digital Signal Processors (DSPs) are compared to present our own critical analysis and comments. Finally, we give a discussion on different perspectives of these acceleration and optimization methods on FPGA platforms to further explore the opportunities and challenges for future research. More helpfully, we give a prospect for future development of the FPGA-based accelerator.

Keywords: FPGA, Deep learning, Convolutional Neural Networks, field programmable gate array, CNN, hardware accelerator

Procedia PDF Downloads 1
3 Improving the Performance of Deep Learning in Facial Emotion Recognition with Image Sharpening

Authors: Ksheeraj Sai Vepuri, Nada Attar

Abstract:

We as humans use words with accompanying visual and facial cues to communicate effectively. Classifying facial emotion using computer vision methodologies has been an active research area in the computer vision field. In this paper, we propose a simple method for facial expression recognition that enhances accuracy. We tested our method on the FER-2013 dataset that contains static images. Instead of using Histogram equalization to preprocess the dataset, we used Unsharp Mask to emphasize texture and details and sharpened the edges. We also used ImageDataGenerator from Keras library for data augmentation. Then we used Convolutional Neural Networks (CNN) model to classify the images into 7 different facial expressions, yielding an accuracy of 69.46% on the test set. Our results show that using image preprocessing such as the sharpening technique for a CNN model can improve the performance, even when the CNN model is relatively simple.

Keywords: Deep learning, CNN, image preprocessing, facial expression recognittion

Procedia PDF Downloads 1
2 Automated Pothole Detection Using Convolution Neural Networks and 3D Reconstruction Using Stereovision

Authors: Eshta Ranyal, Kamal Jain, Vikrant Ranyal

Abstract:

Potholes are a severe threat to road safety and a major contributing factor towards road distress. In the Indian context, they are a major road hazard. Timely detection of potholes and subsequent repair can prevent the roads from deteriorating. To facilitate the roadway authorities in the timely detection and repair of potholes, we propose a pothole detection methodology using convolutional neural networks. The YOLOv3 model is used as it is fast and accurate in comparison to other state-of-the-art models. You only look once v3 (YOLOv3) is a state-of-the-art, real-time object detection system that features multi-scale detection. A mean average precision(mAP) of 73% was obtained on a training dataset of 200 images. The dataset was then increased to 500 images, resulting in an increase in mAP. We further calculated the depth of the potholes using stereoscopic vision by reconstruction of 3D potholes. This enables calculating pothole volume, its extent, which can then be used to evaluate the pothole severity as low, moderate, high.

Keywords: CNN, pothole detection, YOLO, pothole severity, stereovision

Procedia PDF Downloads 1
1 Lightweight Hybrid Convolutional and Recurrent Neural Networks for Wearable Sensor Based Human Activity Recognition

Authors: Sonia Perez-Gamboa, Qingquan Sun, Yan Zhang

Abstract:

Non-intrusive sensor-based human activity recognition (HAR) is utilized in a spectrum of applications, including fitness tracking devices, gaming, health care monitoring, and smartphone applications. Deep learning models such as convolutional neural networks (CNNs) and long short term memory (LSTM) recurrent neural networks (RNNs) provide a way to achieve HAR accurately and effectively. In this paper, we design a multi-layer hybrid architecture with CNN and LSTM and explore a variety of multi-layer combinations. Based on the exploration, we present a lightweight, hybrid, and multi-layer model, which can improve the recognition performance by integrating local features and scale-invariant with dependencies of activities. The experimental results demonstrate the efficacy of the proposed model, which can achieve a 94.7% activity recognition rate on a benchmark human activity dataset. This model outperforms traditional machine learning and other deep learning methods. Additionally, our implementation achieves a balance between recognition rate and training time consumption.

Keywords: Deep learning, inertial sensor, LSTM, CNN, human activity recognition

Procedia PDF Downloads 1