Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 21

convolutional neural network Related Abstracts

21 Causal Relation Identification Using Convolutional Neural Networks and Knowledge Based Features

Authors: Tharini N. de Silva, Xiao Zhibo, Zhao Rui, Mao Kezhi


Causal relation identification is a crucial task in information extraction and knowledge discovery. In this work, we present two approaches to causal relation identification. The first is a classification model trained on a set of knowledge-based features. The second is a deep learning approach that trains a convolutional neural network (CNN) to classify causal relations. We experiment with several CNN models based on previous work on relation extraction as well as our own research. Our models are able to identify both explicit and implicit causal relations as well as the direction of the causal relation. The results of our experiments show a higher accuracy than previously achieved for causal relation identification tasks.

Keywords: causal relation extraction, relation extraction, convolutional neural network, text representation

Procedia PDF Downloads 328
20 Scene Classification Using Hierarchy Neural Network, Directed Acyclic Graph Structure, and Label Relations

Authors: Po-Jen Chen, Jian-Jiun Ding, Hung-Wei Hsu, Chien-Yao Wang, Jia-Ching Wang


A more accurate scene classification algorithm using label relations and the hierarchy neural network was developed in this work. Many classification algorithms assume that the labels are mutually exclusive. This assumption holds in some specific problems, but it is not reasonable for scene classification: because a photo contains a variety of objects, it is more practical to assign multiple labels to an image. In this paper, two label relations, the exclusive relation and the hierarchical relation, were adopted in the classification process to achieve more accurate multi-label classification results. Moreover, the hierarchy neural network (hierarchy NN) is applied to classify the image, and the directed acyclic graph structure is used to predict a more reasonable result that obeys the exclusive and hierarchical relations. Simulations show that, with these techniques, a much more accurate scene classification result can be achieved.
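As a rough illustration of how exclusive and hierarchical label relations can constrain a multi-label prediction, the sketch below checks a candidate label set against both relation types. The label names, the `PARENT` and `EXCLUSIVE` tables, and the function `is_consistent` are hypothetical and are not taken from the paper.

```python
# Hypothetical label relations, for illustration only.
PARENT = {"beach": "outdoor", "forest": "outdoor", "kitchen": "indoor"}  # child -> parent
EXCLUSIVE = {frozenset({"indoor", "outdoor"})}  # label pairs that cannot co-occur


def is_consistent(labels):
    """Return True if the label set obeys the hierarchical and exclusive relations."""
    labels = set(labels)
    # Hierarchical relation: a child label implies its parent label.
    for label in labels:
        parent = PARENT.get(label)
        if parent is not None and parent not in labels:
            return False
    # Exclusive relation: no exclusive pair may appear together.
    for pair in EXCLUSIVE:
        if pair <= labels:
            return False
    return True
```

A multi-label predictor could use a check of this kind to discard label combinations that violate the relations before choosing the highest-scoring remaining set.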

Keywords: Scene Classification, convolutional neural network, label relation, hierarchy neural network

Procedia PDF Downloads 279
19 Design and Implementation of Machine Learning Model for Short-Term Energy Forecasting in Smart Home Management System

Authors: R. Ramesh, K. K. Shivaraman


The main aim of this paper is to handle the energy requirement efficiently by merging advanced digital communication and control technologies for smart grid applications. In order to reduce home load during peak hours, utilities offer residential customers several incentives through smart meters, such as real-time pricing, time of use, and demand response. However, this method is inconvenient in the sense that the user needs to respond manually to prices that vary in real time. To overcome this inconvenience, this paper proposes a machine learning model combining a convolutional neural network (CNN) with k-means clustering, which is able to forecast the energy requirement in the short term, i.e., by hour of the day or day of the week. Integrating the proposed technique with home energy management based on Bluetooth Low Energy provides predicted values to the user for scheduling appliances in advance. This paper describes the CNN configuration and the k-means clustering algorithm for short-term energy forecasting in detail.
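The abstract does not say how k-means is combined with the CNN, so the following is only a minimal sketch of the clustering step on its own: Lloyd's algorithm applied to scalar load readings. The function name `kmeans_1d`, the sample values, and the starting centers are assumptions made for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm on scalar values, starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest center.
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers


# Hypothetical hourly loads in kWh: a low-usage group and a peak-usage group.
loads = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
low, peak = kmeans_1d(loads, centers=[0.0, 6.0])
```

Grouping readings into usage regimes like this is one plausible way to prepare inputs or targets for a forecasting model.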

Keywords: Fuzzy Logic, convolutional neural network, k-means clustering approach, smart home energy management

Procedia PDF Downloads 191
18 Classification of Multiple Cancer Types with Deep Convolutional Neural Network

Authors: Nan Deng, Zhenqiu Liu


Thousands of patients with metastatic tumors are diagnosed with cancers of unknown primary site each year. The inability to identify the primary cancer site may lead to inappropriate treatment and unexpected prognosis. Nowadays, a large amount of genomics and transcriptomics cancer data has been generated by next-generation sequencing (NGS) technologies, and The Cancer Genome Atlas (TCGA) database has accrued thousands of human cancer tumors and healthy controls, which provides an abundant resource for differentiating cancer types. Meanwhile, deep convolutional neural networks (CNNs) have shown high accuracy in classification among a large number of image object categories. Here, we use 25 primary cancer tumors and 3 normal tissues from TCGA, convert their RNA-Seq gene expression profiles to color images, and train, validate, and test a CNN classifier directly on these images. The results show that our CNN classifier can achieve >80% test accuracy on most of the tumors and normal tissues. Since the gene expression pattern of distant metastases is similar to that of their primary tumors, the CNN classifier may provide a potential computational strategy for identifying the unknown primary origin of metastatic cancer in order to plan appropriate treatment for patients.
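The abstract does not describe how expression profiles are mapped to images. One simple possibility, sketched below, is to min-max scale a profile to 0..255 and pack it row by row into a square grid, zero-padding the tail. This produces a single-channel image rather than the color images used in the paper, and the function name `expression_to_image` is hypothetical.

```python
import math


def expression_to_image(values):
    """Scale a 1-D expression profile to 0..255 and pack it into a square grid."""
    if not values:
        raise ValueError("expression profile must not be empty")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant profiles
    pixels = [round(255 * (v - lo) / span) for v in values]
    side = math.isqrt(len(pixels) - 1) + 1  # smallest side with side * side >= len
    pixels += [0] * (side * side - len(pixels))  # zero-pad the tail
    return [pixels[r * side:(r + 1) * side] for r in range(side)]
```

The same packing idea applies to any fixed-length feature vector that is to be fed to a 2-D CNN.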

Keywords: Bioinformatics, Cancer, convolutional neural network, deep learning, gene expression pattern

Procedia PDF Downloads 169
17 Convolutional Neural Network Based on Random Kernels for Analyzing Visual Imagery

Authors: Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Byung-Woo Hong, Kensuke Nakamura, Ja-Keoung Koo


Machine learning techniques based on convolutional neural networks (CNNs) have been actively developed and successfully applied to a variety of image analysis tasks, including reconstruction, noise reduction, resolution enhancement, segmentation, motion estimation, and object recognition. Classical visual information processing, ranging from low-level to high-level tasks, has been widely developed in the deep learning framework. Deriving visual interpretation from high-dimensional imagery data is generally considered a challenging problem. A CNN is a class of feed-forward artificial neural network that usually consists of deep layers whose connections are established by a series of non-linear operations. The CNN architecture is known to be shift invariant due to its shared weights and translation invariance characteristics. However, it is often computationally intractable to optimize the network, in particular with a large number of convolution layers, because of the large number of unknowns to be optimized with respect to a training set that generally must be large enough for the model to generalize effectively. It is also necessary to limit the size of the convolution kernels because of the computational expense, despite the recent development of effective parallel processing machinery, which leads to the use of uniformly small convolution kernels throughout the deep CNN architecture. However, it is often desirable to consider different scales in the analysis of visual features at different layers in the network. Thus, we propose a CNN model in which convolution kernels of different sizes are applied at each layer based on random projection. We apply random filters of varying sizes and associate the filter responses with scalar weights that correspond to the standard deviation of the random filters. This allows us to use a large number of random filters at the cost of one scalar unknown per filter. The computational cost of the back-propagation procedure does not increase with larger filter sizes, even though additional computational cost is required to compute the convolution in the feed-forward procedure. The use of random kernels of varying sizes makes it possible to analyze image features effectively at multiple scales, leading to better generalization. The robustness and effectiveness of the proposed CNN based on random kernels are demonstrated by numerical experiments in which well-known CNN architectures are compared quantitatively with our models, which simply replace the convolution kernels with random filters. The experimental results indicate that our model achieves better performance with fewer unknown weights. The proposed algorithm has high potential for application to a variety of visual tasks based on the CNN framework. Acknowledgement: This work was supported by the MISP (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by IITP, and NRF-2014R1A2A1A11051941, NRF2017R1A2B4006023.
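The idea of a fixed random filter whose response is scaled by a single trainable scalar can be sketched in one dimension as follows. This is a simplified illustration, not the authors' implementation: the helper names `random_kernel` and `scaled_response` are hypothetical, and the kernel is drawn from a standard normal distribution as an assumption.

```python
import random


def random_kernel(size, seed):
    """Draw a fixed (non-trainable) kernel from a standard normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def scaled_response(signal, kernel, scale):
    """Valid-mode 1-D cross-correlation with a fixed kernel, times one scalar weight."""
    k = len(kernel)
    return [scale * sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


# Kernels of different sizes each contribute only one unknown (their scale).
small = random_kernel(3, seed=0)
large = random_kernel(7, seed=1)
```

Because only `scale` would be learned, enlarging the kernel adds work to the forward pass but no additional unknowns, which matches the cost argument in the abstract.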

Keywords: Object recognition, Deep learning, dimensionality reduction, random projection, convolutional neural network, random kernel

Procedia PDF Downloads 87
16 Stock Market Prediction Using Convolutional Neural Network That Learns from a Graph

Authors: Hyunchul Ahn, Kee-Young Kwahk, Mo-Se Lee, Cheol-Hwi Ahn


Over the past decade, deep learning has been in the spotlight among various machine learning algorithms. In particular, the CNN (Convolutional Neural Network), which is known as an effective solution for recognizing and classifying images, has been widely applied to classification and prediction problems in various fields. In this study, we apply a CNN to stock market prediction, one of the most challenging tasks in machine learning research. Specifically, we propose to use a CNN as a binary classifier that predicts stock market direction (up or down) using a graph as its input. That is, our proposal is to build a machine learning algorithm that mimics a person who looks at the graph and predicts whether the trend will go up or down. Our proposed model consists of four steps. In the first step, it divides the dataset into intervals of 5 days, 10 days, 15 days, and 20 days. In step 2, it creates graphs for each interval. In step 3, CNN classifiers are trained using the graphs generated in the previous step. In step 4, it optimizes the hyperparameters of the trained model using the validation dataset. To validate our model, we will apply it to the prediction of KOSPI200 for 1,986 days in eight years (from 2009 to 2016). The experimental dataset will include 14 technical indicators such as CCI, Momentum, and ROC, and the daily closing price of KOSPI200 on the Korean stock market.
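The first step, cutting the series into fixed-length intervals with an up/down label, might look like the sketch below. The labeling rule (compare the next day's close with the last close in the window) and the function name `make_windows` are assumptions; the abstract does not specify them, and the graph rendering of step 2 is omitted.

```python
def make_windows(closes, length):
    """Pair each window of `length` closing prices with the next day's direction."""
    samples = []
    for start in range(len(closes) - length):
        window = closes[start:start + length]
        next_close = closes[start + length]
        # Assumed labeling rule: "up" if the next close exceeds the window's last close.
        label = "up" if next_close > window[-1] else "down"
        samples.append((window, label))
    return samples


# One dataset per interval length, as in step 1 of the abstract.
closes = [1, 2, 3, 2, 5, 6, 4]
datasets = {n: make_windows(closes, n) for n in (5, 10, 15, 20)}
```

Each window would then be drawn as a chart image and passed to the CNN classifier for that interval length.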

Keywords: Deep learning, stock market prediction, convolutional neural network, Korean stock market

Procedia PDF Downloads 305
15 Prediction on Housing Price Based on Deep Learning

Authors: Yan Wang, Li Yu, Chenlu Jiao, Hongrun Xin, Kaiyang Wang


In order to study the impact of various factors on housing prices, we propose to build different prediction models based on deep learning that analyze existing real estate data in order to predict the housing price or its future trend more accurately. Considering that the factors which affect the housing price vary widely, the proposed prediction models fall into two categories. The first is based on multiple characteristic factors of the real estate. We built a Convolutional Neural Network (CNN) prediction model and a Long Short-Term Memory (LSTM) neural network prediction model based on deep learning, and a logistic regression model was implemented for comparison among these three models. The second category is the time series model. Based on deep learning, we proposed an LSTM-1 model that relies purely on time series, and then implemented and compared the LSTM model and the Auto-Regressive Moving Average (ARMA) model. In this paper, a comprehensive study of second-hand housing prices in Beijing has been conducted from three aspects: crawling and analyzing, housing price prediction, and result comparison. Ultimately, the best model was produced, which is of great significance to the evaluation and prediction of housing prices in the real estate industry.

Keywords: Deep learning, convolutional neural network, LSTM, housing prediction

Procedia PDF Downloads 147
14 Automatic Classification of Periodic Heart Sounds Using Convolutional Neural Network

Authors: Jia Xin Low, Keng Wah Choo


This paper presents an automatic normal and abnormal heart sound classification model based on a deep learning algorithm. The MITHSDB heart sound datasets obtained from the 2016 PhysioNet/Computing in Cardiology Challenge database were used in this research, with the assumption that the electrocardiograms (ECG) were recorded simultaneously with the heart sounds (phonocardiogram, PCG). The PCG time series are segmented per heart beat, and each sub-segment is converted to a square intensity matrix and classified using convolutional neural network (CNN) models. This approach removes the need to provide classification features for the supervised machine learning algorithm. Instead, the features are determined automatically through training, from the time series provided. The results show that the prediction model is able to provide reasonable and comparable classification accuracy despite its simple implementation. This approach can be used for real-time classification of heart sounds in the Internet of Medical Things (IoMT), e.g., remote monitoring applications of PCG signals.

Keywords: Deep learning, discrete wavelet transform, convolutional neural network, heart sound classification

Procedia PDF Downloads 198
13 Makhraj Recognition Using Convolutional Neural Network

Authors: Zan Azma Nasruddin, Irwan Mazlin, Nor Aziah Daud, Fauziah Redzuan, Fariza Hanis Abdul Razak


This paper focuses on a machine learning system that learns the correct pronunciation of Makhraj Huroofs. Usually, people need to find an expert in order to pronounce the Huroofs accurately. In this study, the researchers have developed a system that is able to learn the selected Huroofs, which are ha, tsa, zho, and dza, using a Convolutional Neural Network (CNN). The researchers present the chosen type of CNN architecture, which enables the system to learn the data (Huroofs) as quickly as possible and to produce high accuracy during prediction. The researchers have experimented with the system to measure the accuracy and the cross entropy in the training process.

Keywords: Signal Processing, Speech Recognition, convolutional neural network, Makhraj recognition, tensorflow

Procedia PDF Downloads 197
12 Improving Similarity Search Using Clustered Data

Authors: Deokho Kim, Wonwoo Lee, Jaewoong Lee, Teresa Ng, Gun-Ill Lee, Jiwon Jeong


This paper presents a method for improving object search accuracy using a deep learning model. A major limitation in providing accurate similarity with deep learning is the requirement of a huge amount of data for training pairwise similarity scores (metrics), which is impractical to collect. Thus, similarity scores are usually trained with a relatively small dataset that comes from a different domain, which limits the accuracy of the similarity measurement. For this reason, this paper proposes a deep learning model that can be trained with a significantly smaller amount of data: clustered data in which each cluster contains a set of visually similar images. In order to measure similarity distance with the proposed method, visual features of two images are extracted from intermediate layers of a convolutional neural network with various pooling methods, and the network is trained with pairwise similarity scores, which are defined as zero for images in the same cluster. The proposed method outperforms the state-of-the-art object similarity scoring techniques in an evaluation of finding exact items. The proposed method achieves an accuracy of 86.5%, compared to 59.9% for the state-of-the-art technique. That is, an exact item can be found among four retrieved images with an accuracy of 86.5%, and the other retrieved images may still be similar products. Therefore, the proposed method can reduce the amount of training data by an order of magnitude while also providing a reliable similarity metric.
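Deriving pairwise training targets from cluster membership can be sketched as below. The abstract states only that the score is zero for images in the same cluster; the value 1.0 for images in different clusters, the function name `pairwise_targets`, and the image identifiers are assumptions for illustration.

```python
from itertools import combinations


def pairwise_targets(cluster_of):
    """Map each unordered image pair to a target distance.

    0.0 for images in the same cluster (as stated in the abstract),
    1.0 otherwise (an assumed value for illustration).
    """
    targets = {}
    for a, b in combinations(sorted(cluster_of), 2):
        targets[(a, b)] = 0.0 if cluster_of[a] == cluster_of[b] else 1.0
    return targets


# Hypothetical image ids mapped to cluster ids.
targets = pairwise_targets({"img1": "c1", "img2": "c1", "img3": "c2"})
```

With targets built this way, only cluster assignments need to be collected, not a hand-scored similarity for every pair.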

Keywords: Machine Learning, Deep learning, visual search, convolutional neural network

Procedia PDF Downloads 76
11 Optimizing the Capacity of a Convolutional Neural Network for Image Segmentation and Pattern Recognition

Authors: Yalong Jiang, Zheru Chi


In this paper, we study the factors which determine the capacity of a Convolutional Neural Network (CNN) model and propose ways to evaluate and adjust the capacity of a CNN model to best match a specific pattern recognition task. Firstly, a scheme is proposed to adjust the number of independent functional units within a CNN model so that it better fits a task. Secondly, the number of independent functional units in the capsule network is adjusted to fit it to the training dataset. Thirdly, a method based on Bayesian GAN is proposed to enrich the variances in the current dataset to increase its complexity. Experimental results on the PASCAL VOC 2010 Person Part dataset and the MNIST dataset show that, in both conventional CNN models and capsule networks, the number of independent functional units is an important factor that determines the capacity of a network model. By adjusting the number of functional units, the capacity of a model can better match the complexity of a dataset.

Keywords: Character Recognition, convolutional neural network, semantic segmentation, CNN, capsule network, capacity optimization, data augmentation

Procedia PDF Downloads 23
10 Detection of Keypoint in Press-Fit Curve Based on Convolutional Neural Network

Authors: Xin Chen, Guoqing Ding, Shoujia Fang


The quality of press-fit assembly is closely related to the reliability and safety of a product. This paper proposes a keypoint detection method based on a convolutional neural network to improve the accuracy of keypoint detection in the press-fit curve, which provides an auxiliary basis for judging the quality of press-fit assembly. The press-fit curve is a curve of press-fit force against displacement. Both the force data and the distance data are time-series data. Therefore, a one-dimensional convolutional neural network is used to process the press-fit curve. After the acquired press-fit data are filtered, a multi-layer one-dimensional convolutional neural network automatically learns the features of the press-fit curve, and these features are then sent to a multi-layer perceptron that outputs the keypoint of the curve. We used data from press-fit assembly equipment in an actual production process to train the CNN model, and we used different data from the same equipment to evaluate detection performance. Compared with existing research results, detection performance was significantly improved. This method can provide a reliable basis for judging press-fit quality.
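To make the one-dimensional convolution over a two-channel curve concrete, the sketch below computes a single output channel of a 1-D convolution layer followed by ReLU. The function name `conv1d_relu`, the kernel values, and the sample data are hypothetical; a real model would stack several such layers and learn the kernels.

```python
def conv1d_relu(channels, kernels, bias=0.0):
    """One output channel of a 1-D convolution layer (valid mode) followed by ReLU.

    channels: equal-length input sequences, e.g. [force, displacement]
    kernels:  one kernel per input channel, all of the same length
    """
    k = len(kernels[0])
    n = len(channels[0])
    out = []
    for i in range(n - k + 1):
        total = bias
        # Sum the contributions of every input channel at this position.
        for seq, ker in zip(channels, kernels):
            total += sum(seq[i + j] * ker[j] for j in range(k))
        out.append(max(0.0, total))  # ReLU
    return out


# Hypothetical samples: force rises linearly while displacement stays constant.
force = [0, 1, 2, 3]
displacement = [1, 1, 1, 1]
features = conv1d_relu([force, displacement], [[-1, 1], [0.5, 0.5]])
```

The first kernel responds to the local slope of the force signal, which is the kind of feature a keypoint detector on a press-fit curve would plausibly rely on.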

Keywords: convolutional neural network, keypoint detection, curve feature, press-fit assembly

Procedia PDF Downloads 31
9 1-D Convolutional Neural Network Approach for Wheel Flat Detection for Freight Wagons

Authors: Dachuan Shi, M. Hecht, Y. Ye


With the trend of digitalization in railway freight transport, a large number of freight wagons in Germany have been equipped with telematics devices, commonly placed on the wagon body. A telematics device contains a GPS module for tracking and a 3-axis accelerometer for shock detection. Besides these basic functions, it is desired to use the integrated accelerometer for condition monitoring without any additional sensors. Wheel flats as a common type of failure on wheel tread cause large impacts on wagons and infrastructure as well as impulsive noise. A large wheel flat may even cause safety issues such as derailments. In this sense, this paper proposes a machine learning approach for wheel flat detection by using car body accelerations. Due to suspension systems, impulsive signals caused by wheel flats are damped significantly and thus could be buried in signal noise and disturbances. Therefore, it is very challenging to detect wheel flats using car body accelerations. The proposed algorithm considers the envelope spectrum of car body accelerations to eliminate the effect of noise and disturbances. Subsequently, a 1-D convolutional neural network (CNN), which is well known as a deep learning method, is constructed to automatically extract features in the envelope-frequency domain and conduct classification. The constructed CNN is trained and tested on field test data, which are measured on the underframe of a tank wagon with a wheel flat of 20 mm length in the operational condition. The test results demonstrate the good performance of the proposed algorithm for real-time fault detection.

Keywords: Machine Learning, Fault Detection, convolutional neural network, wheel flat

Procedia PDF Downloads 6
8 A Comprehensive Study and Evaluation on Image Fashion Features Extraction

Authors: Long Chen, Yuanchao Sang, Zhihao Gong, Longsheng Chen


Clothing fashion represents a human’s aesthetic appreciation towards everyday outfits and appetite for fashion, and it reflects the development of status in society, humanity, and economics. However, modelling fashion by machine is extremely challenging because fashion is too abstract to be efficiently described by machines. Even human beings can hardly reach a consensus about fashion. In this paper, we address a fundamental fashion-related problem: what image feature best describes clothing fashion? To answer this question, we have designed and evaluated various image features, ranging from traditional low-level hand-crafted features to mid-level style awareness features to various currently popular deep neural network-based features, which have shown state-of-the-art performance in various vision tasks. In summary, we tested the following 9 feature representations: color, texture, shape, style, convolutional neural networks (CNNs), CNNs with distance metric learning (CNNs&DML), AutoEncoder, CNNs with multiple layer combination (CNNs&MLC) and CNNs with dynamic feature clustering (CNNs&DFC). Finally, we validated the performance of these features on two publicly available datasets. Quantitative and qualitative experimental results on both intra-domain and inter-domain fashion clothing image retrieval showed that deep learning based feature representations far outperform traditional hand-crafted feature representations. Additionally, among all deep learning based methods, CNNs with explicit feature clustering perform best, which shows that feature clustering is essential for discriminative fashion feature representation.

Keywords: Image Processing, machine modelling, convolutional neural network, feature representation

Procedia PDF Downloads 27
7 Foot Recognition Using Deep Learning for Knee Rehabilitation

Authors: Rakkrit Duangsoithong, Jermphiphut Jaruenpunyasak, Alba Garcia


Foot recognition can be applied in many medical fields, such as gait pattern analysis and the knee exercises of patients in rehabilitation. Generally, a camera-based foot recognition system is intended to capture a patient image in a controlled room and background and to recognize the foot from a limited set of views. However, such a system can be inconvenient for monitoring knee exercises at home. In order to overcome these problems, this paper proposes a deep learning method using Convolutional Neural Networks (CNNs) for foot recognition. The results are compared with the traditional classification method using LBP and HOG features with kNN and SVM classifiers. According to the results, the deep learning method recognizes foot images from online databases with better accuracy than the traditional classification method, but with higher complexity.
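For readers unfamiliar with the LBP baseline mentioned above, the sketch below computes the basic 8-neighbour local binary pattern code for the centre pixel of a 3x3 patch. The function name `lbp_code` is hypothetical, and the bit ordering (clockwise from the top-left corner) is one common convention rather than something specified in the paper.

```python
def lbp_code(patch):
    """Basic 8-neighbour local binary pattern code for the centre of a 3x3 patch."""
    center = patch[1][1]
    # Clockwise neighbour order starting at the top-left corner (assumed convention).
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(order):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code
```

A histogram of these codes over an image region is the hand-crafted feature vector that a kNN or SVM classifier would then consume.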

Keywords: Deep learning, convolutional neural network, foot recognition, knee rehabilitation

Procedia PDF Downloads 9
6 Feature Extraction and Impact Analysis for Solid Mechanics Using Supervised Finite Element Analysis

Authors: Matthias Dehmer, Edward Schwalb, Michael Schlenkrich, Farzaneh Taslimi, Ketron Mitchell-Wynne, Horen Kuecuekyan


We present a generalized feature extraction approach for supporting Machine Learning (ML) algorithms which perform tasks similar to Finite-Element Analysis (FEA). We report results for estimating the Head Injury Categorization (HIC) of vehicle engine compartments across various impact scenarios. Our experiments demonstrate that models learned using features derived with a simple discretization approach provide a reasonable approximation of a full simulation. We observe that Decision Trees could be as effective as Neural Networks for the HIC task. The simplicity and performance of the learned Decision Trees could offer a trade-off: an improvement in speed and cost of multiple orders of magnitude over full simulation in exchange for a reasonable approximation. When used as a complement to full simulation, the approach enables rapid approximate feedback to engineering teams before submission for full analysis. The approach produces mesh-independent features and is further agnostic of the assembly structure.

Keywords: FEA, convolutional neural network, mechanical design validation, supervised decision tree

Procedia PDF Downloads 11
5 Malignancy Assessment of Brain Tumors Using Convolutional Neural Network

Authors: Chung-Ming Lo, Kevin Li-Chun Hsieh


The World Health Organization classification of central nervous system tumors defines grade 2, 3, and 4 gliomas according to their aggressiveness. For brain tumors, image examination carries a lower risk than biopsy. Besides, it is a challenge to extract relevant tissues in a biopsy operation. Observing the whole tumor structure and composition can provide a more objective assessment. This study proposes a computer-aided diagnosis (CAD) system based on a convolutional neural network to quantitatively evaluate a tumor's malignancy from brain magnetic resonance imaging. A total of 30 grade 2, 43 grade 3, and 57 grade 4 gliomas were collected in the experiment. Transferred parameters from AlexNet were fine-tuned to classify the target brain tumors and achieved an accuracy of 98% and an area under the receiver operating characteristic curve (Az) of 0.99. Without pre-trained features, an accuracy of only 61% was obtained. The proposed convolutional neural network can accurately and efficiently classify grade 2, 3, and 4 gliomas. The promising accuracy can provide diagnostic suggestions to radiologists in the clinic.
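The area under the ROC curve reported above can be computed for a binary problem as the fraction of (positive, negative) pairs that the classifier ranks correctly, with ties counted as half. The sketch below shows that calculation on made-up scores; the function name `auc` is hypothetical, and how the study aggregated the measure across three grades is not stated in the abstract.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up classifier scores: three of the four positive/negative pairs are ordered correctly.
value = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

A value of 1.0 means every positive case is scored above every negative case, and 0.5 corresponds to random ranking.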

Keywords: Computer-Aided Diagnosis, Magnetic resonance imaging, glioblastoma, convolutional neural network

Procedia PDF Downloads 11
4 Vehicle Timing Motion Detection Based on Multi-Dimensional Dynamic Detection Network

Authors: Jia Li, Xing Wei, Yuchen Hong, Yang Lu


Detecting vehicle behavior has always been a focus of intelligent transportation, but with the explosive growth in the number of vehicles and the complexity of the road environment, the vehicle behavior videos captured by traditional surveillance can no longer satisfy the needs of vehicle behavior research. The traditional method of manually labeling vehicle behavior is too time-consuming and labor-intensive, while existing object detection and tracking algorithms have poor practicability and a low behavioral location detection rate. This paper proposes a vehicle behavior detection algorithm based on a dual-stream convolution network and a multi-dimensional video dynamic detection network. In the videos, the straight-line behavior of the vehicle is treated as the default background behavior. Changing lanes, turning, and turning around are set as target behaviors. The purpose of this model is to automatically mark the target behavior of the vehicle in untrimmed videos. First, the target behavior proposals in the long video are extracted through the dual-stream convolution network. The model uses the dual-stream convolutional network to generate a one-dimensional action score waveform and then extracts segments with scores above a given threshold M as preliminary vehicle behavior proposals. Second, the preliminary proposals are pruned and identified using the multi-dimensional video dynamic detection network. Drawing on hierarchical reinforcement learning, the multi-dimensional network includes a Timer module and a Spacer module, where the Timer module mines time information in the video stream and the Spacer module extracts spatial information in the video frame. The Timer and Spacer modules are implemented by Long Short-Term Memory (LSTM) and start from an all-zero hidden state. The Timer module uses the Transformer mechanism to extract timing information from the video stream and extracts features by linear mapping and other methods. Finally, the model fuses time information and spatial information and obtains the location and category of the behavior through the softmax layer. This paper uses recall and precision to measure the performance of the model. Extensive experiments show that, on the dataset of this paper, the proposed model has obvious advantages compared with existing state-of-the-art behavior detection algorithms. When the Time Intersection over Union (TIoU) threshold is 0.5, the Average-Precision (MP) reaches 36.3% (the MP of baselines is 21.5%). In summary, this paper proposes a vehicle behavior detection model based on a multi-dimensional dynamic detection network. This paper introduces spatial information and temporal information to extract vehicle behaviors in long videos. Experiments show that the proposed algorithm is advanced and accurate in vehicle timing behavior detection. In the future, the focus will be on simultaneously detecting the timing behavior of multiple vehicles in complex traffic scenes (such as a busy street) while ensuring accuracy.
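Two pieces of the pipeline lend themselves to a short sketch: turning a one-dimensional action score waveform into proposals by thresholding, and scoring a proposal against a ground-truth interval with temporal IoU. The function names, the half-open frame-index convention, and the sample scores are assumptions made for illustration.

```python
def segments_above(scores, threshold):
    """Return half-open [start, end) index ranges where scores exceed the threshold."""
    segments, start = [], None
    for i, s in enumerate(scores):
        if s > threshold and start is None:
            start = i  # a new segment begins
        elif s <= threshold and start is not None:
            segments.append((start, i))  # the current segment ends
            start = None
    if start is not None:
        segments.append((start, len(scores)))
    return segments


def temporal_iou(a, b):
    """Intersection over union of two half-open time intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0


# Hypothetical per-frame action scores and a threshold M of 0.5.
proposals = segments_above([0.1, 0.7, 0.8, 0.2, 0.9], threshold=0.5)
```

Under the evaluation described in the abstract, a proposal would count as correct when its temporal IoU with a ground-truth segment is at least 0.5.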

Keywords: Deep learning, convolutional neural network, long short-term memory, vehicle behavior detection

Procedia PDF Downloads 1
3 Slice Bispectrogram Analysis-Based Classification of Environmental Sounds Using Convolutional Neural Network

Authors: Katsumi Hirata


Certain systems can function well only if they recognize the sound environment as humans do. In this research, we focus on sound classification by adopting a convolutional neural network and aim to develop a method that automatically classifies various environmental sounds. Although the neural network is a powerful technique, the performance depends on the type of input data. Therefore, we propose an approach via a slice bispectrogram, which is a third-order spectrogram and is a slice version of the amplitude for the short-time bispectrum. This paper explains the slice bispectrogram and discusses the effectiveness of the derived method by evaluating the experimental results using the ESC‑50 sound dataset. As a result, the proposed scheme gives high accuracy and stability. Furthermore, some relationship between the accuracy and non-Gaussianity of sound signals was confirmed.

Keywords: bispectrum, convolutional neural network, spectrogram, environmental sound, slice bispectrogram

Procedia PDF Downloads 1
2 Real-Time Pedestrian Detection Method Based on Improved YOLOv3

Authors: Yong Wang, Ying Wang, Jingting Luo


Pedestrian detection in image or video data is a very important and challenging task in security surveillance. The difficulty of this task lies in accurately locating and detecting pedestrians of different scales in complex scenes. To solve this problem, a deep neural network (RT-YOLOv3) is proposed to realize real-time pedestrian detection at different scales in security monitoring. RT-YOLOv3 improves on the traditional YOLOv3 algorithm. Firstly, the deep residual network is added to extract vehicle features. Then six convolutional neural networks with different scales are designed and fused with the corresponding scale feature maps in the residual network to form the final feature pyramid used to perform pedestrian detection. This method can better characterize pedestrians. In order to further improve the accuracy and generalization ability of the model, a hybrid pedestrian data set training method is used: pedestrian data are extracted from the VOC data set and used for training together with the INRIA pedestrian data set. Experiments show that the proposed RT-YOLOv3 method achieves an mAP (mean average precision) of 93.57% and a speed of 46.52 frames per second. In terms of accuracy, RT-YOLOv3 performs better than Fast R-CNN, Faster R-CNN, YOLO, SSD, YOLOv2, and YOLOv3. This method reduces the missed detection rate and false detection rate, improves the positioning accuracy, and meets the requirements of real-time detection of pedestrian objects.

Keywords: Pedestrian Detection, feature detection, convolutional neural network, real-time detection, YOLOv3

Procedia PDF Downloads 1
1 Facial Landmark Detection Using Occlusion-Adaptive Deep Network with Attention Mechanism

Authors: Muhammad Sadiq, Daming Shi, Junwei Liang, Meiqin Guo


In this paper, we propose an Occlusion-adaptive Deep Network with an Attention mechanism, referred to as AODN, for Facial Landmark Detection (FLD). FLD is a vital step in facial attribute analysis, the face recognition pipeline, and face verification. Researchers have recently focused on convolutional neural network-based FLD approaches and have attained substantial advancement, but occlusion is still the leading cause of difficulty for a Convolutional Neural Network (CNN) in achieving accurate FLD because of its random and irregular occurrence. Attention has a vital role in the human visual system, and its significance for rich feature representation in computer vision problems has recently been demonstrated by researchers. In short, considering the importance of attention, we extended our already established Occlusion-adaptive Deep Network (ODN) by incorporating Channel-wise Attention (CA) and Spatial Attention (SA) to improve its ability to deal with occlusion and enhance its feature representation ability. The occlusion probability serves as the adaptive weight of high-level features to alleviate the effect of occlusion and to assist in modelling the occlusion. Our contributions can be summarised as follows: i) we extended ODN with two improvements, i.e., occlusion modelling and feature representation, to achieve better performance. ii) As far as we know, we are the first to introduce CA and SA for FLD to model occlusion. iii) Our method reduces the total number of network parameters, which effectively reduces training time and cost and hence makes it more suitable for scalable data processing. The results of our experiments show that our proposed model outperforms current state-of-the-art methods on available benchmark datasets.

Keywords: Deep learning, Spatial Attention, convolutional neural network, facial landmarks detection, channel-wise attention

Procedia PDF Downloads 1