Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 380

Search results for: convolutional architectures

380 Convolutional Neural Networks Architecture Analysis for Image Captioning

Authors: Jun Seung Woo, Shin Dong Ho


Image captioning models with attention have improved significantly over previous models, but they are still unsatisfactory at recognizing images. We perform an extensive search over seven Convolutional Neural Network (CNN) architectures to analyze the behavior of different models for image captioning. We compared the seven CNN architectures, according to batch size, on a public benchmark, the MS-COCO dataset. In our experimental results, DenseNet and InceptionV3 reached about 14% loss and a training time of about 160 seconds per epoch. This was the most satisfactory result among the seven CNN architectures after training for 50 epochs on a GPU.

Keywords: deep learning, image captioning, CNN architectures, densenet, inceptionV3

Procedia PDF Downloads 17
379 Clothes Identification Using Inception ResNet V2 and MobileNet V2

Authors: Subodh Chandra Shakya, Badal Shrestha, Suni Thapa, Ashutosh Chauhan, Saugat Adhikari


To tackle the problem of clothes identification, we used different Convolutional Neural Network architectures. Among them, the results from Inception ResNet V2 and MobileNet V2 seemed promising. Comparing the metrics, we observed that Inception ResNet V2 slightly outperforms MobileNet V2 for this purpose. This paper therefore proposes a clothes identifier using Inception ResNet V2 and also compares the results of Inception ResNet V2 and MobileNet V2. It reports the results and findings of the research that we performed on the DeepFashion dataset. To improve the dataset, we used image preprocessing techniques such as image shearing, image rotation, and denoising. The whole experiment was conducted to test the efficiency of convolutional neural networks for clothes identification, so that we could develop a reliable system that is good at identifying the clothes worn by users. The whole system can be integrated with a recommendation system.

Keywords: inception ResNet, convolutional neural net, deep learning, confusion matrix, data augmentation, data preprocessing

Procedia PDF Downloads 82
378 Glaucoma Detection in Retinal Tomography Using the Vision Transformer

Authors: Sushish Baral, Pratibha Joshi, Yaman Maharjan


Glaucoma is a chronic eye condition that causes vision loss that is irreversible. Early detection and treatment are critical to prevent vision loss because it can be asymptomatic. For the identification of glaucoma, multiple deep learning algorithms are used. Transformer-based architectures, which use the self-attention mechanism to encode long-range dependencies and acquire extremely expressive representations, have recently become popular. Convolutional architectures, on the other hand, lack knowledge of long-range dependencies in the image due to their intrinsic inductive biases. The aforementioned statements inspire this thesis to look at transformer-based solutions and investigate the viability of adopting transformer-based network designs for glaucoma detection. Using retinal fundus images of the optic nerve head to develop a viable algorithm to assess the severity of glaucoma necessitates a large number of well-curated images. Initially, data is generated by augmenting ocular pictures. After that, the ocular images are pre-processed to make them ready for further processing. The system is trained using pre-processed images, and it classifies the input images as normal or glaucoma based on the features retrieved during training. The Vision Transformer (ViT) architecture is well suited to this situation, as it allows the self-attention mechanism to utilise structural modeling. Extensive experiments are run on the common dataset, and the results are thoroughly validated and visualized.

Keywords: glaucoma, vision transformer, convolutional architectures, retinal fundus images, self-attention, deep learning

Procedia PDF Downloads 105
377 2D Convolutional Networks for Automatic Segmentation of Knee Cartilage in 3D MRI

Authors: Ananya Ananya, Karthik Rao


Accurate segmentation of knee cartilage in 3D magnetic resonance (MR) images for quantitative assessment of volume is crucial for studying and diagnosing osteoarthritis (OA) of the knee, one of the major causes of disability in elderly people. Radiologists generally perform this task slice by slice, taking 15-20 minutes per 3D image, which leads to high inter- and intra-observer variability. Hence, automatic methods for knee cartilage segmentation are desirable and are an active field of research. This paper presents the design and experimental evaluation of fully automated methods, based on 2D convolutional neural networks, for knee cartilage segmentation in 3D MRI. The architectures are validated on 40 test images and 60 training images from the SKI10 dataset. The proposed methods segment 2D slices one by one, which are then combined to give a segmentation of the whole 3D image. The proposed methods are modified versions of U-net and dilated convolutions, consisting of a single step that segments the given image into 5 labels: background, femoral cartilage, tibia cartilage, femoral bone and tibia bone, with the cartilages being the primary components of interest. U-net consists of a contracting path and an expanding path, to capture context and localization respectively. Dilated convolutions lead to an exponential expansion of the receptive field with only a linear increase in the number of parameters. A combination of modified U-net and dilated convolutions has also been explored. These architectures segment one 3D image in 8 – 10 seconds, giving average volumetric Dice Score Coefficients (DSC) of 0.950 - 0.962 for femoral cartilage and 0.951 - 0.966 for tibia cartilage, with manual segmentation as the reference.
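The two quantitative ideas in this abstract, the volumetric Dice score and the receptive-field growth of dilated convolutions, can be illustrated with a minimal sketch. It is not the authors' code: the integer label mapping (0 = background, 1 = femoral cartilage, 2 = tibia cartilage), the toy volumes, and the dilation schedule 1, 2, 4, 8 are assumptions chosen for illustration.

```python
def dice_coefficient(pred, ref, label):
    """Volumetric Dice score for one label.

    `pred` and `ref` are equally shaped nested lists indexed
    [slice][row][col], i.e. 2D slice segmentations stacked into a 3D volume.
    """
    pred_count = ref_count = overlap = 0
    for pred_slice, ref_slice in zip(pred, ref):
        for pred_row, ref_row in zip(pred_slice, ref_slice):
            for p, r in zip(pred_row, ref_row):
                pred_count += (p == label)
                ref_count += (r == label)
                overlap += (p == label and r == label)
    total = pred_count + ref_count
    # Convention assumed here: a label absent from both volumes scores 1.0.
    return 1.0 if total == 0 else 2.0 * overlap / total


def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 dilated convolutions."""
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


# Hypothetical label mapping: 0 = background, 1 = femoral, 2 = tibia cartilage.
reference = [[[0, 1], [1, 2]], [[0, 1], [2, 2]]]   # manual segmentation
predicted = [[[0, 1], [1, 2]], [[0, 0], [2, 2]]]   # per-slice outputs, stacked

print(dice_coefficient(predicted, reference, label=1))  # 0.8
print(dice_coefficient(predicted, reference, label=2))  # 1.0

# Four 3x3 layers: doubling the dilation grows the field much faster
# than four undilated layers, with the same number of weights per layer.
print(receptive_field(3, [1, 2, 4, 8]))  # 31
print(receptive_field(3, [1, 1, 1, 1]))  # 9
```

The receptive-field comparison shows why the parameter count grows only linearly with depth while the covered context roughly doubles per layer under this schedule.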

Keywords: convolutional neural networks, dilated convolutions, 3 dimensional, fully automated, knee cartilage, MRI, segmentation, U-net

Procedia PDF Downloads 180
376 Comparison of Deep Convolutional Neural Networks Models for Plant Disease Identification

Authors: Megha Gupta, Nupur Prakash


Identification of plant diseases has been performed using machine learning and deep learning models on datasets containing images of healthy and diseased plant leaves. The current study evaluates some of the deep learning models based on convolutional neural network (CNN) architectures for identification of plant diseases. For this purpose, the publicly available New Plant Diseases Dataset, an augmented version of the PlantVillage dataset, available on the Kaggle platform and containing 87,900 images, has been used. The dataset contains images of 26 diseases of 14 different plants and images of 12 healthy plants. The CNN models selected for the study presented in this paper are AlexNet, ZFNet, VGGNet (four models), GoogLeNet, and ResNet (three models). The selected models are trained using PyTorch, an open-source machine learning library, on Google Colaboratory. A comparative study has been carried out to analyze the high degree of accuracy achieved using these models. The highest test accuracy and F1-score, 99.59% and 0.996 respectively, were achieved by GoogLeNet with the mini-batch momentum-based gradient descent learning algorithm.

Keywords: comparative analysis, convolutional neural networks, deep learning, plant disease identification

Procedia PDF Downloads 117
375 Analytical Comparison of Conventional Algorithms with Vedic Algorithm for Digital Multiplier

Authors: Akhilesh G. Naik, Dipankar Pal


The complexity of digital signal processing (DSP) applications and of various microcontroller architectures has been increasing to such an extent that the traditional approaches to multiplier design in most processors are becoming outdated because they are comparatively slow. Modern processing applications require suitable pipelined approaches and, therefore, algorithms that are friendlier to pipelined architectures. Traditional algorithms such as Wallace Tree, Radix-4 Booth, Radix-8 Booth, and Dadda have proven to be comparatively slow on pipelined architectures. These architectures therefore need to be optimized, or combined with one another, to enhance their performance and make them suitable for pipelined hardware. Recently, the Vedic algorithm has proven to be mathematically efficient, being less complex and needing fewer steps to produce its output, and has assumed renewed importance. This paper describes and shows how the Vedic algorithm can be better suited to pipelined architectures and can also be combined with traditional architectures and algorithms to enhance its capability even further. In this paper, we also establish that, for complex applications on DSP and other microcontroller architectures, the Vedic approach to multiplication proves to be the best available and most efficient option.

Keywords: Wallace Tree, Radix-4 Booth, Radix-8 Booth, Dadda, Vedic, Single-Stage Karatsuba (SSK), Looped Karatsuba (LK)

Procedia PDF Downloads 89
374 Classification of Echo Signals Based on Deep Learning

Authors: Aisulu Tileukulova, Zhexebay Dauren


Radar plays an important role because it is widely used in civil and military fields. Target detection is one of the most important radar applications. The accuracy of detecting inconspicuous aerial objects in radar facilities is lower against the background of noise. Convolutional neural networks can be used to improve the recognition of this type of aerial object. The purpose of this work is to develop an algorithm for recognizing aerial objects using convolutional neural networks, as well as training a neural network. In this paper, the structure of a convolutional neural network (CNN) consists of different types of layers: 8 convolutional layers and 3 layers of a fully connected perceptron. ReLU is used as an activation function in convolutional layers, while the last layer uses softmax. It is necessary to form a data set for training a neural network in order to detect a target. We built a Confusion Matrix of the CNN model to measure the effectiveness of our model. The results showed that the accuracy when testing the model was 95.7%. Classification of echo signals using CNN shows high accuracy and significantly speeds up the process of predicting the target.
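As a rough illustration of the evaluation step described above, the following sketch computes a softmax over output scores and builds a confusion matrix from which accuracy is read off. The two-class labels and the sample predictions are made up for the example and are not the paper's data.

```python
import math


def softmax(logits):
    """Numerically stable softmax, as used on the last layer of the network."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical labels: 0 = noise only, 1 = target echo present.
true_labels = [0, 0, 1, 1, 1, 0]
predicted = [0, 1, 1, 1, 0, 0]

matrix = confusion_matrix(true_labels, predicted, num_classes=2)
print(matrix)            # [[2, 1], [1, 2]]
print(accuracy(matrix))  # 4 correct out of 6 samples
```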

Keywords: radar, neural network, convolutional neural network, echo signals

Procedia PDF Downloads 53
373 Investigation of New Gait Representations for Improving Gait Recognition

Authors: Chirawat Wattanapanich, Hong Wei


This study presents new gait representations for improving gait recognition accuracy across gait appearances, such as normal walking, wearing a coat and carrying a bag. Based on the Gait Energy Image (GEI), two ideas are implemented to generate new gait representations. One is to append lower knee regions to the original GEI, and the other is to apply convolutional operations to the GEI and its variants. A set of new gait representations is created and used for training multi-class Support Vector Machines (SVMs). Tests are conducted on the CASIA dataset B. Various combinations of the gait representations, with different convolutional kernel sizes and different numbers of kernels used in the convolutional processes, are examined. Both the entire images as features and features reduced in dimension by Principal Component Analysis (PCA) are tested in gait recognition. Interestingly, both new techniques, appending the lower knee regions to the original GEI and the convolutional GEI, contribute significantly to the performance improvement in gait recognition. The experimental results show that the average recognition rate can be improved from 75.65% to 87.50%.
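For readers unfamiliar with the representation, a GEI is commonly defined as the pixel-wise average of aligned binary silhouettes over a gait cycle. The sketch below builds one and applies a single convolutional kernel to it. The tiny 3x3 silhouettes and the 2x2 averaging kernel are illustrative assumptions, not the kernels studied in the paper.

```python
def gait_energy_image(silhouettes):
    """Pixel-wise mean of aligned binary silhouettes (one per frame)."""
    num_frames = len(silhouettes)
    rows, cols = len(silhouettes[0]), len(silhouettes[0][0])
    return [
        [sum(frame[r][c] for frame in silhouettes) / num_frames for c in range(cols)]
        for r in range(rows)
    ]


def convolve2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation, the operation CNN layers call convolution."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Two toy silhouette frames: the legs (bottom row) move, the torso does not.
frames = [
    [[0, 1, 0], [0, 1, 0], [1, 0, 1]],
    [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
]
gei = gait_energy_image(frames)
print(gei)  # [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]

# Hypothetical 2x2 averaging kernel producing a "convolutional GEI".
kernel = [[0.25, 0.25], [0.25, 0.25]]
conv_gei = convolve2d_valid(gei, kernel)
```

The flattened `gei` or `conv_gei` values would then serve as the feature vector passed to a classifier such as an SVM.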

Keywords: convolutional image, lower knee, gait

Procedia PDF Downloads 127
372 Evolving Convolutional Filter Using Genetic Algorithm for Image Classification

Authors: Rujia Chen, Ajit Narayanan


Convolutional neural networks (CNN), as typically applied in deep learning, use layer-wise backpropagation (BP) to construct filters and kernels for feature extraction. Such filters are 2D or 3D groups of weights for constructing feature maps at subsequent layers of the CNN and are shared across the entire input. BP as a gradient descent algorithm has well-known problems of getting stuck at local optima. The use of genetic algorithms (GAs) for evolving weights between layers of standard artificial neural networks (ANNs) is a well-established area of neuroevolution. In particular, the use of crossover techniques when optimizing weights can help to overcome problems of local optima. However, the application of GAs for evolving the weights of filters and kernels in CNNs is not yet an established area of neuroevolution. In this paper, a GA-based filter development algorithm is proposed. The results of the proof-of-concept experiments described in this paper show the proposed GA algorithm can find filter weights through evolutionary techniques rather than BP learning. For some simple classification tasks like geometric shape recognition, the proposed algorithm can achieve 100% accuracy. The results for MNIST classification, while not as good as possible through standard filter learning through BP, show that filter and kernel evolution warrants further investigation as a new subarea of neuroevolution for deep architectures.
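The idea of finding filter weights by evolution rather than backpropagation can be sketched with a toy genetic algorithm. This is not the algorithm proposed in the paper: the six 3x3 patches, the squared-error fitness, single-point crossover, Gaussian mutation, and elitism are all assumptions made to keep the example small.

```python
import random

# Toy 3x3 patches flattened row by row; label 1 = contains a vertical bar.
PATCHES = [
    ([0, 0, 1, 0, 0, 1, 0, 0, 1], 1),
    ([1, 0, 0, 1, 0, 0, 1, 0, 0], 1),
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], 1),
    ([1, 1, 1, 0, 0, 0, 0, 0, 0], 0),
    ([0, 0, 0, 0, 0, 0, 1, 1, 1], 0),
    ([0, 0, 0, 0, 0, 0, 0, 0, 0], 0),
]


def response(weights, patch):
    """Filter response: dot product of the 9 filter weights with the patch."""
    return sum(w * x for w, x in zip(weights, patch))


def fitness(weights):
    """Negative squared error between responses and labels (higher is better)."""
    return -sum((response(weights, patch) - label) ** 2 for patch, label in PATCHES)


def evolve_filter(generations=200, population_size=30, mutation_scale=0.1, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.uniform(-1, 1) for _ in range(9)] for _ in range(population_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: population_size // 2]
        children = [population[0][:]]  # elitism: best filter survives unchanged
        while len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, 9)             # single-point crossover
            child = mother[:cut] + father[cut:]
            child = [w + rng.gauss(0, mutation_scale) for w in child]  # mutation
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best_filter, fitness_history = evolve_filter(generations=50)
print(fitness_history[0], fitness_history[-1])
```

Because the best individual is always copied into the next generation, the recorded best fitness never decreases, which makes the search easy to monitor even without gradients.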

Keywords: neuroevolution, convolutional neural network, genetic algorithm, filters, kernels

Procedia PDF Downloads 105
371 Simplifying the Migration of Architectures in Embedded Applications Introducing a Pattern Language to Support the Workforce

Authors: Farha Lakhani, Michael J. Pont


There are two main architectures used to develop software for modern embedded systems: these can be labelled as “event-triggered” (ET) and “time-triggered” (TT). The research presented in this paper is concerned with the issues involved in migration between these two architectures. Although TT architectures are widely used in safety-critical applications, they are less familiar to developers of mainstream embedded systems. The research presented in this paper began from the premise that–for a broad class of systems that have been implemented using an ET architecture–migration to a TT architecture would improve reliability. It may be tempting to assume that conversion between ET and TT designs will simply involve converting all event-handling software routines into periodic activities. However, the required changes to the software architecture are, in many cases, rather more profound. The main contribution of the work presented in this paper is to identify ways in which the significant effort involved in migrating between existing ET architectures and “equivalent” (and effective) TT architectures could be reduced. The research described in this paper has taken an innovative step in this regard by introducing the use of ‘Design patterns’ for this purpose for the first time.

Keywords: embedded applications, software architectures, reliability, pattern

Procedia PDF Downloads 244
370 Comparison of Classical Computer Vision vs. Convolutional Neural Networks Approaches for Weed Mapping in Aerial Images

Authors: Paulo Cesar Pereira Junior, Alexandre Monteiro, Rafael da Luz Ribeiro, Antonio Carlos Sobieranski, Aldo von Wangenheim


In this paper, we present a comparison between convolutional neural networks and classical computer vision approaches for the specific precision agriculture problem of weed mapping in aerial images of sugarcane fields. A systematic literature review was conducted to find which computer vision methods are being used for this specific problem. The most cited methods were implemented, as well as four convolutional neural network models. All implemented approaches were tested using the same dataset, and their results were quantitatively and qualitatively analyzed. The results were validated against a ground truth produced by a human expert. The results indicate that the convolutional neural networks achieve better precision and generalize better than the classical models.

Keywords: convolutional neural networks, deep learning, digital image processing, precision agriculture, semantic segmentation, unmanned aerial vehicles

Procedia PDF Downloads 106
369 Causal Relation Identification Using Convolutional Neural Networks and Knowledge Based Features

Authors: Tharini N. de Silva, Xiao Zhibo, Zhao Rui, Mao Kezhi


Causal relation identification is a crucial task in information extraction and knowledge discovery. In this work, we present two approaches to causal relation identification. The first is a classification model trained on a set of knowledge-based features. The second is a deep learning based approach training a model using convolutional neural networks to classify causal relations. We experiment with several different convolutional neural networks (CNN) models based on previous work on relation extraction as well as our own research. Our models are able to identify both explicit and implicit causal relations as well as the direction of the causal relation. The results of our experiments show a higher accuracy than previously achieved for causal relation identification tasks.

Keywords: causal relation extraction, relation extraction, convolutional neural network, text representation

Procedia PDF Downloads 432
368 Centralizing the Teaching Process in Intelligent Tutoring System Architectures

Authors: Nikolaj Troels Graf Von Malotky, Robin Nicolay, Alke Martens


There exists a plethora of architectures for ITSs (Intelligent Tutoring Systems). A thorough analysis and comparison of these architectures revealed that, in most cases, the architecture extensions have grown evolutionarily, reflecting the state-of-the-art trends of each decade. However, from the perspective of software engineering, the main aspect of an ITS has not yet been reflected in any of these architectures. From the perspective of cognitive research, the construction of the teaching process is what makes an ITS 'intelligent' regarding the spectrum of interaction with the students. Thus, in our approach, we focus on a behavior-based architecture, which is based on the main teaching processes. To create a new general architecture for ITSs, we have to define the prerequisites. This paper analyzes the current state of the existing architectures and derives rules for the behavior of ITSs. It presents a teaching process for ITSs to be used together with the architecture.

Keywords: intelligent tutoring, ITS, tutoring process, system architecture, interaction process

Procedia PDF Downloads 293
367 Image Classification with Localization Using Convolutional Neural Networks

Authors: Bhuyain Mobarok Hossain


Image classification and localization is currently an important research topic in the field of computer vision. The evolution and advancement of deep learning and convolutional neural networks (CNN) have greatly improved the capabilities of object detection and image-based classification. Target detection is an important research area in computer vision, especially in video surveillance systems. To solve this problem, we apply a convolutional neural network at multiple scales and at multiple locations in the image using a sliding window. Most translation networks move away from the bounding box around the area of interest. In contrast to this architecture, we consider the problem to be a classification problem where each pixel of the image is a separate section. Image classification is the task of predicting a single category for a set of data points. It is a subset of the classification problem in which a label is assigned to the image as a whole. For example, an image can be classified as a day or night shot, or images of cars and motorbikes can be automatically placed in their own collections. Deep learning models for image classification generally include convolutional layers; such a model is referred to as a convolutional neural network (CNN).
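The multi-scale, multi-location sliding window mentioned in this abstract can be sketched as a generator of square crops that a classifier would then score. The window sizes, stride, and image dimensions below are assumptions for illustration; the abstract does not state the values used.

```python
def sliding_windows(image_width, image_height, window_sizes, stride):
    """Yield (left, top, size) for square windows at several scales.

    Windows that would extend past the image border are skipped.
    """
    for size in window_sizes:
        for top in range(0, image_height - size + 1, stride):
            for left in range(0, image_width - size + 1, stride):
                yield (left, top, size)


# Hypothetical 8x8 image scanned with 4x4 and 8x8 windows at stride 2.
windows = list(sliding_windows(8, 8, window_sizes=[4, 8], stride=2))
print(len(windows))  # 10: nine 4x4 positions plus one 8x8 window
print(windows[0], windows[-1])  # (0, 0, 4) (0, 0, 8)
```

Each yielded window would be cropped, resized to the network's input size, and classified; the highest-scoring windows indicate where an object is located.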

Keywords: image classification, object detection, localization, particle filter

Procedia PDF Downloads 143
366 Normalized Enterprises Architectures: Portugal's Public Procurement System Application

Authors: Tiago Sampaio, André Vasconcelos, Bruno Fragoso


The Normalized Systems Theory, which is designed to be applied to software architectures, provides a set of theorems, elements and rules, with the purpose of enabling evolution in Information Systems, as well as ensuring that they are ready for change. In order to make that possible, this work’s solution is to apply the Normalized Systems Theory to the domain of enterprise architectures, using Archimate. This application is achieved through the adaptation of the elements of this theory, making them artifacts of the modeling language. The theorems are applied through the identification of the viewpoints to be used in the architectures, as well as the transformation of the theory’s encapsulation rules into architectural rules. This way, it is possible to create normalized enterprise architectures, thus fulfilling the needs and requirements of the business. This solution was demonstrated using the Portuguese Public Procurement System. The Portuguese government aims to make this system as fair as possible, allowing every organization to have the same business opportunities. The aim is for every economic operator to have access to all public tenders, which are published in any of the 6 existing platforms, independently of where they are registered. In order to make this possible, we applied our solution to the construction of two different architectures, which are capable of fulfilling the requirements of the Portuguese government. One of those architectures, TO-BE A, has a Message Broker that performs the communication between the platforms. The other, TO-BE B, represents the scenario in which the platforms communicate with each other directly. Apart from these 2 architectures, we also represent the AS-IS architecture that demonstrates the current behavior of the Public Procurement Systems. Our evaluation is based on a comparison between the AS-IS and the TO-BE architectures, regarding the fulfillment of the rules and theorems of the Normalized Systems Theory and some quality metrics.

Keywords: archimate, architecture, broker, enterprise, evolvable systems, interoperability, normalized architectures, normalized systems, normalized systems theory, platforms

Procedia PDF Downloads 271
365 Classification of Computer Generated Images from Photographic Images Using Convolutional Neural Networks

Authors: Chaitanya Chawla, Divya Panwar, Gurneesh Singh Anand, M. P. S Bhatia


This paper presents a deep-learning mechanism for distinguishing computer generated images from photographic images. The proposed method introduces a convolutional layer capable of automatically learning the correlation between neighbouring pixels. In its standard form, a Convolutional Neural Network (CNN) learns features based on an image's content instead of the structural features of the image. The layer is specifically designed to suppress an image's content and robustly learn the sensor pattern noise features (usually inherited from image processing in a camera) as well as the statistical properties of images. The method was assessed on recent natural and computer generated images, and it was concluded that it performs better than the current state-of-the-art methods.

Keywords: image forensics, computer graphics, classification, deep learning, convolutional neural networks

Procedia PDF Downloads 186
364 Traffic Sign Recognition System Using Convolutional Neural Network

Authors: Devineni Vijay Bhaskar, Yendluri Raja


We propose a model for traffic sign detection based on Convolutional Neural Networks (CNN). We first transform the original image into a gray scale image using support vector machines, then use convolutional neural networks with fixed and learnable layers for detection and recognition. The fixed layer can reduce the number of areas of interest to detect and crop the boundaries very close to the borders of the traffic signs. The learnable layers can increase the accuracy of detection significantly. In addition, we use bootstrap procedures to improve the accuracy and avoid the overfitting problem. On the German Traffic Sign Detection Benchmark, we obtained modest results, with an area under the precision-recall curve (AUC) of 99.49% in the group “Risk”, and an AUC of 96.62% in the group “Obligatory”.

Keywords: convolutional neural network, support vector machine, detection, traffic signs, bootstrap procedures, precision-recall curve

Procedia PDF Downloads 8
363 Detection of Keypoint in Press-Fit Curve Based on Convolutional Neural Network

Authors: Shoujia Fang, Guoqing Ding, Xin Chen


The quality of press-fit assembly is closely related to the reliability and safety of a product. This paper proposes a keypoint detection method based on a convolutional neural network to improve the accuracy of keypoint detection in the press-fit curve, providing an auxiliary basis for judging the quality of press-fit assembly. The press-fit curve is a curve of press-fit force against displacement. Both the force data and the displacement data are time-series data. Therefore, a one-dimensional convolutional neural network is used to process the press-fit curve. After the acquired press-fit data is filtered, a multi-layer one-dimensional convolutional neural network automatically learns the features of the press-fit curve, which are then sent to a multi-layer perceptron that outputs the keypoint of the curve. We used data from press-fit assembly equipment in an actual production process to train the CNN model, and we used different data from the same equipment to evaluate the detection performance. Compared with existing research results, the detection performance was significantly improved. This method can provide a reliable basis for judging press-fit quality.
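To make the one-dimensional convolution step concrete, here is a minimal sketch of a single two-channel 1D convolution followed by ReLU, applied to a toy force and displacement series. The sample values and the hand-set slope-detecting kernel are hypothetical; in the described method the kernels are learned and several such layers are stacked.

```python
def conv1d(channels, kernels, bias=0.0):
    """Valid-mode 1D convolution over a multi-channel series, followed by ReLU.

    `channels` is a list of equally long series (here: force, displacement);
    `kernels` holds one equally wide kernel per channel.
    """
    width = len(kernels[0])
    length = len(channels[0]) - width + 1
    output = []
    for start in range(length):
        total = bias
        for series, kernel in zip(channels, kernels):
            total += sum(series[start + k] * kernel[k] for k in range(width))
        output.append(max(0.0, total))  # ReLU activation
    return output


# Toy press-fit curve: force rises sharply, then flattens.
force = [0, 0, 1, 3, 6, 6, 6]
displacement = [0, 1, 2, 3, 4, 5, 6]

# Hypothetical kernels: respond to the local slope of the force channel only.
kernels = [[-1, 0, 1], [0, 0, 0]]

feature_map = conv1d([force, displacement], kernels)
print(feature_map)  # [1.0, 3.0, 5.0, 3.0, 0.0]
```

The feature map peaks where the force rises fastest, which is the kind of local curve feature a downstream perceptron could use to place a keypoint.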

Keywords: keypoint detection, curve feature, convolutional neural network, press-fit assembly

Procedia PDF Downloads 92
362 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks

Authors: Jiajun Wang, Xiaoge Li


The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence of words in different ranges in the sentence, resulting in deviation when assigning relationship weights to words other than aspect words. To solve these problems, we propose a new aspect-level sentiment analysis model that combines a multi-channel convolutional network and a graph convolutional network (GCN). Firstly, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and a self-attention mechanism. In addition, a multi-channel convolutional network is used to extract the features of words in different ranges. Finally, a graph convolutional network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models and show that our model is better and more effective.
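The final step, propagating node information along the dependency tree with a graph convolution, can be sketched as one layer that averages each word's features with those of its tree neighbours and then applies a linear map and ReLU. The row-normalised form, the three-word sentence, and the identity feature and weight matrices are assumptions for illustration; the paper's exact formulation may differ.

```python
def gcn_layer(adjacency, features, weights):
    """One graph convolution: ReLU(D^-1 (A + I) H W), with row normalisation."""
    n = len(adjacency)
    in_dim, out_dim = len(weights), len(weights[0])
    # Add self-loops so each node keeps its own features.
    a_hat = [
        [adjacency[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    # Average the features of each node and its neighbours.
    aggregated = [
        [sum(a_hat[i][k] * features[k][f] for k in range(n)) / degree[i]
         for f in range(in_dim)]
        for i in range(n)
    ]
    # Linear transform followed by ReLU.
    return [
        [max(0.0, sum(aggregated[i][f] * weights[f][o] for f in range(in_dim)))
         for o in range(out_dim)]
        for i in range(n)
    ]


# Hypothetical dependency tree for "food was great": great-food, great-was.
# Node order: 0 = food, 1 = was, 2 = great.
adjacency = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
features = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # one-hot stand-ins for word vectors
weights = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]    # identity, so only mixing is visible

hidden = gcn_layer(adjacency, features, weights)
print(hidden[0])  # [0.5, 0.0, 0.5]: "food" now carries half of "great"
```

After one layer the aspect word "food" has absorbed information from its syntactic head "great", which is the effect the dependency-tree GCN is meant to achieve.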

Keywords: aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree

Procedia PDF Downloads 37
361 Makhraj Recognition Using Convolutional Neural Network

Authors: Zan Azma Nasruddin, Irwan Mazlin, Nor Aziah Daud, Fauziah Redzuan, Fariza Hanis Abdul Razak


This paper focuses on a machine learning system that learns the correct pronunciation of Makhraj Huroofs. Usually, people need to find an expert in order to pronounce the Huroofs accurately. In this study, the researchers have developed a system that is able to learn the selected Huroofs, which are ha, tsa, zho, and dza, using a Convolutional Neural Network. The researchers present the chosen type of CNN architecture, selected so that the system learns the data (Huroofs) as quickly as possible and produces high accuracy during prediction. The researchers ran experiments on the system to measure the accuracy and the cross entropy in the training process.

Keywords: convolutional neural network, Makhraj recognition, speech recognition, signal processing, tensorflow

Procedia PDF Downloads 246
360 Efficient DCT Architectures

Authors: Mr. P. Suryaprasad, R. Lalitha


This paper presents area- and delay-efficient architectures for the implementation of the one-dimensional and two-dimensional discrete cosine transform (DCT). They support different lengths (4, 8, 16, and 32). DCT blocks are used in different video coding standards for image compression. The 2D-DCT calculation uses the 2D-DCT separability property, such that the whole architecture is divided into two 1D-DCT calculations connected by a transpose buffer. Based on the existing 1D-DCT architecture, two different types of 2D-DCT architecture, folded and parallel, are implemented. Both structures use the same transpose buffer. The proposed transpose buffer occupies less area and is faster than the existing transpose buffer. Hence the area, power and delay of both 2D-DCT architectures are reduced.
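The separability property that the architecture relies on can be checked numerically with a short software sketch: a 1D DCT over the rows, a transpose (the role played by the transpose buffer), and a second 1D DCT give the same result as the direct 2D formula. The unnormalised floating-point DCT-II used here is an assumption made for clarity and is not the paper's hardware transform.

```python
import math


def dct_1d(vector):
    """Unnormalised DCT-II of a sequence."""
    n = len(vector)
    return [
        sum(vector[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def dct_2d_separable(block):
    """2D DCT as two 1D passes with a transpose in between."""
    row_pass = [dct_1d(row) for row in block]
    buffered = transpose(row_pass)          # stands in for the transpose buffer
    column_pass = [dct_1d(row) for row in buffered]
    return transpose(column_pass)


def dct_2d_direct(block):
    """Direct evaluation of the 2D DCT-II double sum, for comparison."""
    n, m = len(block), len(block[0])
    return [
        [
            sum(
                block[i][j]
                * math.cos(math.pi * (2 * i + 1) * u / (2 * n))
                * math.cos(math.pi * (2 * j + 1) * v / (2 * m))
                for i in range(n)
                for j in range(m)
            )
            for v in range(m)
        ]
        for u in range(n)
    ]


block = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
separable = dct_2d_separable(block)
direct = dct_2d_direct(block)

max_difference = max(
    abs(separable[u][v] - direct[u][v]) for u in range(4) for v in range(4)
)
print(max_difference < 1e-9)  # True
```

For an N x N block the separable form needs 2N one-dimensional transforms instead of an N^2-term sum per coefficient, which is why a fast 1D-DCT unit plus a transpose buffer is an attractive hardware structure.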

Keywords: transposition buffer, video compression, discrete cosine transform, high efficiency video coding, two dimensional picture

Procedia PDF Downloads 424
359 Tumor Detection Using Convolutional Neural Networks (CNN) Based Neural Network

Authors: Vinai K. Singh


In Neural Network-based Learning techniques, there are several models of Convolutional Networks. Whenever the methods are deployed with large datasets, only then can their applicability and appropriateness be determined. Clinical and pathological pictures of lobular carcinoma are thought to exhibit a large number of random formations and textures. Working with such pictures is a difficult problem in machine learning. Focusing on wet laboratories and following the outcomes, numerous studies have been published with fresh commentaries in the investigation. In this research, we provide a framework that can operate effectively on raw photos of various resolutions while easing the issues caused by the existence of patterns and texturing. The suggested approach produces very good findings that may be used to make decisions in the diagnosis of cancer.

Keywords: lobular carcinoma, convolutional neural networks (CNN), deep learning, histopathological imagery scans

Procedia PDF Downloads 48
358 Reed: An Approach Towards Quickly Bootstrapping Multilingual Acoustic Models

Authors: Bipasha Sen, Aditya Agarwal


A multilingual automatic speech recognition (ASR) system is a single entity capable of transcribing multiple languages that share a common phone space. The performance of such a system is highly dependent on the compatibility of the languages. State-of-the-art speech recognition systems are built using sequential architectures based on recurrent neural networks (RNN), which limits computational parallelization in training. This poses a significant challenge in terms of the time taken to bootstrap and validate the compatibility of multiple languages for building a robust multilingual system. Complex architectural choices based on self-attention networks are made to improve the parallelization, thereby reducing the training time. In this work, we propose Reed, a simple system based on 1D convolutions which uses very short context to improve the training time. To improve the performance of our system, we use raw time-domain speech signals directly as input. This enables the convolutional layers to learn feature representations rather than relying on handcrafted features such as MFCC. We report improvements in training and inference times by at least a factor of 4x and 7.4x respectively, with comparable WERs, against standard RNN-based baseline systems on SpeechOcean's multilingual low resource dataset.

Keywords: convolutional neural networks, language compatibility, low resource languages, multilingual automatic speech recognition

Procedia PDF Downloads 37
357 Application of Deep Neural Networks to Assess Corporate Credit Rating

Authors: Parisa Golbayani, Dan Wang, Ionuț Florescu


In this work we apply machine learning techniques to financial statement reports in order to assess a company’s credit rating. Specifically, the work analyzes the performance of four neural network architectures (MLP, CNN, CNN2D, LSTM) in predicting corporate credit rating as issued by Standard and Poor’s. The paper focuses on companies from the energy, financial, and healthcare sectors in the US. The goal of this analysis is to improve the application of machine learning algorithms to credit assessment. To accomplish this, the study investigates three questions. First, we investigate whether the algorithms perform better when using a selected subset of important features or whether better performance is obtained by allowing the algorithms to select features themselves. Second, we address the temporal aspect inherent in financial data and study whether it is important for the results obtained by a machine learning algorithm. Third, we aim to answer whether one of the four neural network architectures considered consistently outperforms the others, and if so, under which conditions. This work frames the problem as several case studies to answer these questions and analyzes the results using ANOVA and multiple comparison testing procedures.
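The abstract does not describe the exact ANOVA design, so the following is only a minimal sketch of a one-way ANOVA F-statistic for comparing the scores of several models. The function name `one_way_anova_f`, the group names, and the scores are hypothetical and are not results from the paper.

```python
from typing import Dict, List


def one_way_anova_f(groups: Dict[str, List[float]]) -> float:
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of observations around their own group mean
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical accuracy scores (in percent) from repeated runs of three models.
scores = {
    "model_a": [61.0, 62.0, 63.0],
    "model_b": [62.0, 63.0, 64.0],
    "model_c": [63.0, 64.0, 65.0],
}

f_stat = one_way_anova_f(scores)
print(f_stat)  # 3.0
```

A large F value suggests that at least one group mean differs from the others. A multiple comparison procedure is then needed to identify which pairs of models differ.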

Keywords: convolutional neural network, long short term memory, multilayer perceptron, credit rating

Procedia PDF Downloads 138
356 Slice Bispectrogram Analysis-Based Classification of Environmental Sounds Using Convolutional Neural Network

Authors: Katsumi Hirata


Certain systems can function well only if they recognize the sound environment as humans do. In this research, we focus on sound classification by adopting a convolutional neural network and aim to develop a method that automatically classifies various environmental sounds. Although the neural network is a powerful technique, the performance depends on the type of input data. Therefore, we propose an approach via a slice bispectrogram, which is a third-order spectrogram and is a slice version of the amplitude for the short-time bispectrum. This paper explains the slice bispectrogram and discusses the effectiveness of the derived method by evaluating the experimental results using the ESC‑50 sound dataset. As a result, the proposed scheme gives high accuracy and stability. Furthermore, some relationship between the accuracy and non-Gaussianity of sound signals was confirmed.

Keywords: environmental sound, bispectrum, spectrogram, slice bispectrogram, convolutional neural network

Procedia PDF Downloads 46
355 Malignancy Assessment of Brain Tumors Using Convolutional Neural Network

Authors: Chung-Ming Lo, Kevin Li-Chun Hsieh


The World Health Organization classification of central nervous system tumors defines grade 2, 3, and 4 gliomas according to their aggressiveness. For brain tumors, image examination carries a lower risk than biopsy. In addition, extracting the relevant tissues during a biopsy is a challenge. Observing the whole tumor structure and composition can provide a more objective assessment. This study proposed a computer-aided diagnosis (CAD) system based on a convolutional neural network to quantitatively evaluate a tumor's malignancy from brain magnetic resonance imaging. A total of 30 grade 2, 43 grade 3, and 57 grade 4 gliomas were collected in the experiment. Transferred parameters from AlexNet were fine-tuned to classify the target brain tumors and achieved an accuracy of 98% and an area under the receiver operating characteristic curve (Az) of 0.99. Without pre-trained features, an accuracy of only 61% was obtained. The proposed convolutional neural network can accurately and efficiently classify grade 2, 3, and 4 gliomas. The promising accuracy can provide diagnostic suggestions to radiologists in the clinic.
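For readers unfamiliar with the Az metric, the sketch below computes the area under the ROC curve for a binary case as the fraction of positive-negative pairs that the classifier ranks correctly. The abstract does not state how Az was computed for the three tumor grades, so the binary setting, the function name `roc_auc`, and the labels and scores here are assumptions made for illustration.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a positive scores higher than a negative (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical ground-truth labels and classifier scores (not data from the study).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(labels, scores)
print(auc)  # 0.75
```

An AUC of 1.0 means every positive case is scored above every negative case, while 0.5 corresponds to random ranking.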

Keywords: convolutional neural network, computer-aided diagnosis, glioblastoma, magnetic resonance imaging

Procedia PDF Downloads 69
354 Gradient Overdrive: Avoiding Negative Randomness Effects in Stochastic Gradient Descent (SGD)

Authors: Filip Strzałka, Urszula Markowska-Kaczmar


This work aims to develop a new method that maximally reduces the phenomenon of scrabbling weights in modern Deep Neural Network architectures without losing the positive generalization characteristics of SGD. The goal of the conducted experiments is to tune the proposed method, called Gradient Overdrive (GO), and to try to prove its effectiveness by comparison with similar state-of-the-art methods. The method aims at achieving steeper learning curves in the same training regimes. Although the method is intended to be computationally efficient, the experimental implementation is not guaranteed to be optimal, and optimizing the computation time of the technique is outside the scope of this work.

Keywords: neural network training, SGD, MLP, convolutional network

Procedia PDF Downloads 12
353 An Accurate Computer-Aided Diagnosis - CAD System for Diagnosis of Aortic Enlargement by Using Convolutional Neural Networks

Authors: Mahdi Bazarganigilani


Aortic enlargement, also known as an aortic aneurysm, can occur when the walls of the aorta become weak. This disease can become deadly if overlooked and undiagnosed. In this paper, a Computer-Aided Diagnosis (CAD) system was introduced to accurately diagnose aortic enlargement from chest X-ray images. A novel approach using an optimized Convolutional Neural Network (CNN) was employed. Three main areas, namely the left lung, heart, and right lung, were extracted from the original images. These three areas were then fed to a CNN. The accuracy of the system was evaluated on 1000 sample images by using 4-fold cross-validation. A promising accuracy of 84% was achieved in terms of the F-measure indicator. This encouraged the author to evaluate this method on a larger dataset and even on different CAD systems for further enhancement of this methodology.
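The evaluation protocol mentioned in this abstract can be sketched as follows, assuming a plain, unshuffled 4-fold split and the standard F-measure (the harmonic mean of precision and recall). The helper names `k_fold_indices` and `f_measure` and the toy labels are hypothetical, and the author's actual splitting procedure is not given in the abstract.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k contiguous folds; each fold is the test set once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def f_measure(y_true: List[int], y_pred: List[int]) -> float:
    """F-measure for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# 1000 samples and 4 folds, matching the numbers quoted in the abstract.
folds = k_fold_indices(1000, 4)
print([len(test) for _, test in folds])  # [250, 250, 250, 250]

# Toy labels and predictions (made up) to show the metric itself.
print(f_measure([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```

In practice the indices would usually be shuffled (and often stratified by class) before splitting, and the F-measure would be averaged over the four test folds.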

Keywords: computer-aided diagnosis systems, aortic enlargement, chest X-ray, image processing, convolutional neural networks

Procedia PDF Downloads 63
352 Text Localization in Fixed-Layout Documents Using Convolutional Networks in a Coarse-to-Fine Manner

Authors: Beier Zhu, Rui Zhang, Qi Song


Text contained within fixed-layout documents, such as ID cards, invoices, cheques, and passports, can be of great semantic value and so requires high localization accuracy. Recently, algorithms based on deep convolutional networks have achieved high performance on text detection tasks. However, for text localization in fixed-layout documents, such algorithms detect word bounding boxes individually, which ignores the layout information. This paper presents a novel architecture built on convolutional neural networks (CNNs). A global text localization network and a regional bounding-box regression network are introduced to tackle the problem in a coarse-to-fine manner. The text localization network simultaneously locates word bounding points, which takes the layout information into account. The bounding-box regression network takes as input the features pooled from arbitrarily sized RoIs and refines the localizations. These two networks share their convolutional features and are trained jointly. A typical type of fixed-layout document, the ID card, is selected to evaluate the effectiveness of the proposed system. The networks are trained on data cropped from natural scene images and on synthetic data produced by a synthetic text generation engine. Experiments show that our approach locates word bounding boxes with high accuracy and achieves state-of-the-art performance.

Keywords: bounding box regression, convolutional networks, fixed-layout documents, text localization

Procedia PDF Downloads 101
351 Deep Learning Based, End-to-End Metaphor Detection in Greek with Recurrent and Convolutional Neural Networks

Authors: Konstantinos Perifanos, Eirini Florou, Dionysis Goutsos


This paper presents and benchmarks a number of end-to-end Deep Learning based models for metaphor detection in Greek. We combine Convolutional Neural Networks and Recurrent Neural Networks with representation learning and bring them to bear on the metaphor detection problem for the Greek language. The models presented achieve exceptional accuracy scores, significantly improving on the previous state-of-the-art results, which had already achieved an accuracy of 0.82. Furthermore, no special preprocessing, feature engineering, or linguistic knowledge is used in this work. The methods presented achieve an accuracy of 0.92 and an F-score of 0.92 with Convolutional Neural Networks (CNNs) and bidirectional Long Short Term Memory networks (LSTMs). Comparable results of 0.91 accuracy and 0.91 F-score are also achieved with bidirectional Gated Recurrent Units (GRUs) and Convolutional Recurrent Neural Nets (CRNNs). The models are trained and evaluated only on the basis of training tuples, that is, the related sentences and their labels. The outcome is a state-of-the-art collection of metaphor detection models, trained on limited labelled resources, which can be extended to other languages and similar tasks.

Keywords: metaphor detection, deep learning, representation learning, embeddings

Procedia PDF Downloads 47