Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 972

Search results for: convolutional coding

792 Code Refactoring Using Slice-Based Cohesion Metrics and AOP

Authors: Jagannath Singh, Durga Prasad Mohapatra

Abstract:

Software refactoring is very essential for maintaining the software quality. It is an usual practice that we first design the software and then go for coding. But after coding is completed, if the requirement changes slightly or our expected output is not achieved, then we change the codes. For each small code change, we cannot change the design. In course of time, due to these small changes made to the code, the software design decays. Software refactoring is used to restructure the code in order to improve the design and quality of the software. In this paper, we propose an approach for performing code refactoring. We use slice-based cohesion metrics to identify the target methods which requires refactoring. After identifying the target methods, we use program slicing to divide the target method into two parts. Finally, we have used the concepts of Aspects to adjust the code structure so that the external behaviour of the original module does not change.

Keywords: software refactoring, program slicing, AOP, cohesion metrics, code restructure, AspectJ

Procedia PDF Downloads 512

791 A Novel Search Pattern for Motion Estimation in High Efficiency Video Coding

Authors: Phong Nguyen, Phap Nguyen, Thang Nguyen

Abstract:

High Efficiency Video Coding (HEVC) or H.265 Standard fulfills the demand of high resolution video storage and transmission since it achieves high compression ratio. However, it requires a huge amount of calculation. Since Motion Estimation (ME) block composes about 80 % of calculation load of HEVC, there are a lot of researches to reduce the computation cost. In this paper, we propose a new algorithm to lower the number of Motion Estimation’s searching points. The number of computing points in search pattern is down from 77 for Diamond Pattern and 81 for Square Pattern to only 31. Meanwhile, the Peak Signal to Noise Ratio (PSNR) and bit rate are almost equal to those of conventional patterns. The motion estimation time of new algorithm reduces by at 68.23%, 65.83%compared to the recommended search pattern of diamond pattern, square pattern, respectively.

Keywords: motion estimation, wide diamond, search pattern, H.265, test zone search, HM software

Procedia PDF Downloads 611

790 Oncogenic Functions of Long Non-Coding RNA XIST in Human Nasopharyngeal Carcinoma by Targeting MiR-34a-5p

Authors: Cheng-Cao Sun, Shu-Jun Li, De-Jia Li

Abstract:

Long non-coding RNA (lncRNA) X inactivate-specific transcript (XIST) has been verified as an oncogenic gene in several human malignant tumors, and its dysregulation was closed associated with tumor initiation, development and progression. Nevertheless, whether the aberrant expression of XIST in human nasopharyngeal carcinoma (NPC) is corrected with malignancy, metastasis or prognosis has not been elaborated. Here, we discovered that XIST was up-regulated in NPC tissues and higher expression of XIST contributed to a markedly poorer survival time. In addition, multivariate analysis demonstrated XIST was an independent risk factor for prognosis. XIST over-expression enhanced, while XIST silencing hampered the cell growth in NPC. Additionally, mechanistic analysis revealed that XIST up-regulated the expression of miR-34a-5p targeted gene E2F3 through acting as a competitive ‘sponge’ of miR-34a-5p. Taking all into account, we concluded that XIST functioned as an oncogene in NPC through up-regulating E2F3 in part through ‘spongeing’ miR-34a-5p.

Keywords: X inactivate-specific transcript; hsa-miRNA-34a-5p, miR-34a-5p; E2F3, nasopharyngeal carcinoma, tumorigenesis

Procedia PDF Downloads 240

789 Enhancing Knowledge Graph Convolutional Networks with Structural Adaptive Receptive Fields for Improved Node Representation and Information Aggregation

Authors: Zheng Zhihao

Abstract:

Recently, Knowledge Graph Framework Network (KGCN) has developed powerful capabilities in knowledge representation and reasoning tasks. However, traditional KGCN often uses a fixed weight mechanism when aggregating information, failing to make full use of rich structural information, resulting in a certain expression ability of node representation, and easily causing over-smoothing problems. In order to solve these challenges, the paper proposes an new graph neural network model called KGCN-STAR (Knowledge Graph Convolutional Network with Structural Adaptive Receptive Fields). This model dynamically adjusts the perception of each node by introducing a structural adaptive receptive field. wild range, and a subgraph aggregator is designed to capture local structural information more effectively. Experimental results show that KGCN-STAR shows significant performance improvement on multiple knowledge graph data sets, especially showing considerable capabilities in the task of representation learning of complex structures.

Keywords: knowledge graph, graph neural networks, structural adaptive receptive fields, information aggregation

Procedia PDF Downloads 33

788 Sum Capacity with Regularized Channel Inversion in Multi-Antenna Downlink Systems under Equal Power Constraint

Authors: Attaullah Khawaja, Amna Shabbir

Abstract:

Channel inversion is one of the simplest techniques for multiuser downlink systems with single-antenna users. In this paper regularized channel inversion under equal power constraint in the multiuser multiple input multiple output (MU-MIMO) broadcast channels has been considered. Sum capacity with plain channel inversion also known as Zero Forcing Beam Forming (ZFBF) and optimum sum capacity using Dirty Paper Coding (DPC) has also been investigated. Analysis and simulations show that regularization enhances the system performance and empower linear growth in Sum Capacity and specially work well at low signal to noise ratio (SNRs) regime.

Keywords: broadcast channel, channel inversion, multiple antenna multiple-user wireless, multiple-input multiple-output (MIMO), regularization, dirty paper coding (DPC), sum capacity

Procedia PDF Downloads 527

787 Automatic Product Identification Based on Deep-Learning Theory in an Assembly Line

Authors: Fidel Lòpez Saca, Carlos Avilés-Cruz, Miguel Magos-Rivera, José Antonio Lara-Chávez

Abstract:

Automated object recognition and identification systems are widely used throughout the world, particularly in assembly lines, where they perform quality control and automatic part selection tasks. This article presents the design and implementation of an object recognition system in an assembly line. The proposed shapes-color recognition system is based on deep learning theory in a specially designed convolutional network architecture. The used methodology involve stages such as: image capturing, color filtering, location of object mass centers, horizontal and vertical object boundaries, and object clipping. Once the objects are cut out, they are sent to a convolutional neural network, which automatically identifies the type of figure. The identification system works in real-time. The implementation was done on a Raspberry Pi 3 system and on a Jetson-Nano device. The proposal is used in an assembly course of bachelor’s degree in industrial engineering. The results presented include studying the efficiency of the recognition and processing time.

Keywords: deep-learning, image classification, image identification, industrial engineering.

Procedia PDF Downloads 160

786 Design and Implementation of Machine Learning Model for Short-Term Energy Forecasting in Smart Home Management System

Authors: R. Ramesh, K. K. Shivaraman

Abstract:

The main aim of this paper is to handle the energy requirement in an efficient manner by merging the advanced digital communication and control technologies for smart grid applications. In order to reduce user home load during peak load hours, utility applies several incentives such as real-time pricing, time of use, demand response for residential customer through smart meter. However, this method provides inconvenience in the sense that user needs to respond manually to prices that vary in real time. To overcome these inconvenience, this paper proposes a convolutional neural network (CNN) with k-means clustering machine learning model which have ability to forecast energy requirement in short term, i.e., hour of the day or day of the week. By integrating our proposed technique with home energy management based on Bluetooth low energy provides predicted value to user for scheduling appliance in advanced. This paper describes detail about CNN configuration and k-means clustering algorithm for short-term energy forecasting.

Keywords: convolutional neural network, fuzzy logic, k-means clustering approach, smart home energy management

Procedia PDF Downloads 305

785 Channel Estimation/Equalization with Adaptive Modulation and Coding over Multipath Faded Channels for WiMAX

Authors: B. Siva Kumar Reddy, B. Lakshmi

Abstract:

WiMAX has adopted an Adaptive Modulation and Coding (AMC) in OFDM to endure higher data rates and error free transmission. AMC schemes employ the Channel State Information (CSI) to efficiently utilize the channel and maximize the throughput and for better spectral efficiency. This CSI has given to the transmitter by the channel estimators. In this paper, LSE (Least Square Error) and MMSE (Minimum Mean square Error) estimators are suggested and BER (Bit Error Rate) performance has been analyzed. Channel equalization is also integrated with with AMC-OFDM system and presented with Constant Modulus Algorithm (CMA) and Least Mean Square (LMS) algorithms with convergence rates analysis. Simulation results proved that increment in modulation scheme size causes to improvement in throughput along with BER value. There is a trade-off among modulation size, throughput, BER value and spectral efficiency. Results also reported the requirement of channel estimation and equalization in high data rate systems.

Keywords: AMC, CSI, CMA, OFDM, OFDMA, WiMAX

Procedia PDF Downloads 393

784 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNN) have demonstrated high performance in image analysis, but oftentimes, there is only structured data available regarding a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. In applying a single neural network for analyzing multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNN in multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. This final image is finally analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 123

783 Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine

Authors: Hira Lal Gope, Hidekazu Fukai

Abstract:

The aim of this study is to develop a system which can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is not bad and not a normal bean. The peaberry is born in an only single seed, relatively round seed from a coffee cherry instead of the usual flat-sided pair of beans. It has another value and flavor. To make the taste of the coffee better, it is necessary to separate the peaberry and normal bean before green coffee beans roasting. Otherwise, the taste of total beans will be mixed, and it will be bad. In roaster procedure time, all the beans shape, size, and weight must be unique; otherwise, the larger bean will take more time for roasting inside. The peaberry has a different size and different shape even though they have the same weight as normal beans. The peaberry roasts slower than other normal beans. Therefore, neither technique provides a good option to select the peaberries. Defect beans, e.g., sour, broken, black, and fade bean, are easy to check and pick up manually by hand. On the other hand, the peaberry pick up is very difficult even for trained specialists because the shape and color of the peaberry are similar to normal beans. In this study, we use image processing and machine learning techniques to discriminate the normal and peaberry bean as a part of the sorting system. As the first step, we applied Deep Convolutional Neural Networks (CNN) and Support Vector Machine (SVM) as machine learning techniques to discriminate the peaberry and normal bean. As a result, better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The trained artificial neural network with high performance CPU and GPU in this work will be simply installed into the inexpensive and low in calculation Raspberry Pi system. We assume that this system will be used in under developed countries. The study evaluates and compares the feasibility of the methods in terms of accuracy of classification and processing speed.

Keywords: convolutional neural networks, coffee bean, peaberry, sorting, support vector machine

Procedia PDF Downloads 144

782 Medical Image Augmentation Using Spatial Transformations for Convolutional Neural Network

Authors: Trupti Chavan, Ramachandra Guda, Kameshwar Rao

Abstract:

The lack of data is a pain problem in medical image analysis using a convolutional neural network (CNN). This work uses various spatial transformation techniques to address the medical image augmentation issue for knee detection and localization using an enhanced single shot detector (SSD) network. The spatial transforms like a negative, histogram equalization, power law, sharpening, averaging, gaussian blurring, etc. help to generate more samples, serve as pre-processing methods, and highlight the features of interest. The experimentation is done on the OpenKnee dataset which is a collection of knee images from the openly available online sources. The CNN called enhanced single shot detector (SSD) is utilized for the detection and localization of the knee joint from a given X-ray image. It is an enhanced version of the famous SSD network and is modified in such a way that it will reduce the number of prediction boxes at the output side. It consists of a classification network (VGGNET) and an auxiliary detection network. The performance is measured in mean average precision (mAP), and 99.96% mAP is achieved using the proposed enhanced SSD with spatial transformations. It is also seen that the localization boundary is comparatively more refined and closer to the ground truth in spatial augmentation and gives better detection and localization of knee joints.

Keywords: data augmentation, enhanced SSD, knee detection and localization, medical image analysis, openKnee, Spatial transformations

Procedia PDF Downloads 154

781 Computational Neurosciences: An Inspiration from Biological Neurosciences

Authors: Harsh Sadawarti, Kamal Malik

Abstract:

Humans are the unique and the most powerful creature on this planet just because of the high level of intelligence gifted by nature. Computational Intelligence is highly influenced by the term natural intelligence, neurosciences and mathematics. To deal with the in-depth study of computational intelligence and to utilize it in real-life applications, it is quite important to understand its simulation with the human brain. In this paper, the three important parts, Frontal Lobe, Occipital Lobe and Parietal Lobe of the human brain, are compared with the ANN(Artificial Neural Network), CNN(Convolutional Neural network), and RNN(Recurrent Neural Network), respectively. Intelligent computational systems are created by combining deductive reasoning, logical concepts and high-level algorithms with the simulation and study of the human brain. Human brain is a combination of Physiology, Psychology, emotions, calculations and many other parameters which are of utmost importance that determines the overall intelligence. To create intelligent algorithms, smart machines and to simulate the human brain in an effective manner, it is quite important to have an insight into the human brain and the basic concepts of biological neurosciences.

Keywords: computational intelligence, neurosciences, convolutional neural network, recurrent neural network, artificial neural network, frontal lobe, occipital lobe, parietal lobe

Procedia PDF Downloads 111

780 Lightweight Hybrid Convolutional and Recurrent Neural Networks for Wearable Sensor Based Human Activity Recognition

Authors: Sonia Perez-Gamboa, Qingquan Sun, Yan Zhang

Abstract:

Non-intrusive sensor-based human activity recognition (HAR) is utilized in a spectrum of applications, including fitness tracking devices, gaming, health care monitoring, and smartphone applications. Deep learning models such as convolutional neural networks (CNNs) and long short term memory (LSTM) recurrent neural networks (RNNs) provide a way to achieve HAR accurately and effectively. In this paper, we design a multi-layer hybrid architecture with CNN and LSTM and explore a variety of multi-layer combinations. Based on the exploration, we present a lightweight, hybrid, and multi-layer model, which can improve the recognition performance by integrating local features and scale-invariant with dependencies of activities. The experimental results demonstrate the efficacy of the proposed model, which can achieve a 94.7% activity recognition rate on a benchmark human activity dataset. This model outperforms traditional machine learning and other deep learning methods. Additionally, our implementation achieves a balance between recognition rate and training time consumption.

Keywords: deep learning, LSTM, CNN, human activity recognition, inertial sensor

Procedia PDF Downloads 150

779 Low Light Image Enhancement with Multi-Stage Interconnected Autoencoders Integration in Pix to Pix GAN

Authors: Muhammad Atif, Cang Yan

Abstract:

The enhancement of low-light images is a significant area of study aimed at enhancing the quality of captured images in challenging lighting environments. Recently, methods based on convolutional neural networks (CNN) have gained prominence as they offer state-of-the-art performance. However, many approaches based on CNN rely on increasing the size and complexity of the neural network. In this study, we propose an alternative method for improving low-light images using an autoencoder-based multiscale knowledge transfer model. Our method leverages the power of three autoencoders, where the encoders of the first two autoencoders are directly connected to the decoder of the third autoencoder. Additionally, the decoder of the first two autoencoders is connected to the encoder of the third autoencoder. This architecture enables effective knowledge transfer, allowing the third autoencoder to learn and benefit from the enhanced knowledge extracted by the first two autoencoders. We further integrate the proposed model into the PIX to PIX GAN framework. By integrating our proposed model as the generator in the GAN framework, we aim to produce enhanced images that not only exhibit improved visual quality but also possess a more authentic and realistic appearance. These experimental results, both qualitative and quantitative, show that our method is better than the state-of-the-art methodologies.

Keywords: low light image enhancement, deep learning, convolutional neural network, image processing

Procedia PDF Downloads 80

778 Artificial Neural Network for Forecasting of Daily Reservoir Inflow: Case Study of the Kotmale Reservoir in Sri Lanka

Authors: E. U. Dampage, Ovindi D. Bandara, Vinushi S. Waraketiya, Samitha S. R. De Silva, Yasiru S. Gunarathne

Abstract:

The knowledge of water inflow figures is paramount in decision making on the allocation for consumption for numerous purposes; irrigation, hydropower, domestic and industrial usage, and flood control. The understanding of how reservoir inflows are affected by different climatic and hydrological conditions is crucial to enable effective water management and downstream flood control. In this research, we propose a method using a Long Short Term Memory (LSTM) Artificial Neural Network (ANN) to assist the aforesaid decision-making process. The Kotmale reservoir, which is the uppermost reservoir in the Mahaweli reservoir complex in Sri Lanka, was used as the test bed for this research. The ANN uses the runoff in the Kotmale reservoir catchment area and the effect of Sea Surface Temperatures (SST) to make a forecast for seven days ahead. Three types of ANN are tested; Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), and LSTM. The extensive field trials and validation endeavors found that the LSTM ANN provides superior performance in the aspects of accuracy and latency.

Keywords: convolutional neural network, CNN, inflow, long short-term memory, LSTM, multi-layer perceptron, MLP, neural network

Procedia PDF Downloads 151

777 An Improved Convolution Deep Learning Model for Predicting Trip Mode Scheduling

Authors: Amin Nezarat, Naeime Seifadini

Abstract:

Trip mode selection is a behavioral characteristic of passengers with immense importance for travel demand analysis, transportation planning, and traffic management. Identification of trip mode distribution will allow transportation authorities to adopt appropriate strategies to reduce travel time, traffic and air pollution. The majority of existing trip mode inference models operate based on human selected features and traditional machine learning algorithms. However, human selected features are sensitive to changes in traffic and environmental conditions and susceptible to personal biases, which can make them inefficient. One way to overcome these problems is to use neural networks capable of extracting high-level features from raw input. In this study, the convolutional neural network (CNN) architecture is used to predict the trip mode distribution based on raw GPS trajectory data. The key innovation of this paper is the design of the layout of the input layer of CNN as well as normalization operation, in a way that is not only compatible with the CNN architecture but can also represent the fundamental features of motion including speed, acceleration, jerk, and Bearing rate. The highest prediction accuracy achieved with the proposed configuration for the convolutional neural network with batch normalization is 85.26%.

Keywords: predicting, deep learning, neural network, urban trip

Procedia PDF Downloads 138

776 Groundwater Potential Delineation Using Geodetector Based Convolutional Neural Network in the Gunabay Watershed of Ethiopia

Authors: Asnakew Mulualem Tegegne, Tarun Kumar Lohani, Abunu Atlabachew Eshete

Abstract:

Groundwater potential delineation is essential for efficient water resource utilization and long-term development. The scarcity of potable and irrigation water has become a critical issue due to natural and anthropogenic activities in meeting the demands of human survival and productivity. With these constraints, groundwater resources are now being used extensively in Ethiopia. Therefore, an innovative convolutional neural network (CNN) is successfully applied in the Gunabay watershed to delineate groundwater potential based on the selected major influencing factors. Groundwater recharge, lithology, drainage density, lineament density, transmissivity, and geomorphology were selected as major influencing factors during the groundwater potential of the study area. For dataset training, 70% of samples were selected and 30% were used for serving out of the total 128 samples. The spatial distribution of groundwater potential has been classified into five groups: very low (10.72%), low (25.67%), moderate (31.62%), high (19.93%), and very high (12.06%). The area obtains high rainfall but has a very low amount of recharge due to a lack of proper soil and water conservation structures. The major outcome of the study showed that moderate and low potential is dominant. Geodetoctor results revealed that the magnitude influences on groundwater potential have been ranked as transmissivity (0.48), recharge (0.26), lineament density (0.26), lithology (0.13), drainage density (0.12), and geomorphology (0.06). The model results showed that using a convolutional neural network (CNN), groundwater potentiality can be delineated with higher predictive capability and accuracy. CNN-based AUC validation platform showed that 81.58% and 86.84% were accrued from the accuracy of training and testing values, respectively. Based on the findings, the local government can receive technical assistance for groundwater exploration and sustainable water resource development in the Gunabay watershed. Finally, the use of a detector-based deep learning algorithm can provide a new platform for industrial sectors, groundwater experts, scholars, and decision-makers.

Keywords: CNN, geodetector, groundwater influencing factors, Groundwater potential, Gunabay watershed

Procedia PDF Downloads 21

775 Sparse Coding Based Classification of Electrocardiography Signals Using Data-Driven Complete Dictionary Learning

Authors: Fuad Noman, Sh-Hussain Salleh, Chee-Ming Ting, Hadri Hussain, Syed Rasul

Abstract:

In this paper, a data-driven dictionary approach is proposed for the automatic detection and classification of cardiovascular abnormalities. Electrocardiography (ECG) signal is represented by the trained complete dictionaries that contain prototypes or atoms to avoid the limitations of pre-defined dictionaries. The data-driven trained dictionaries simply take the ECG signal as input rather than extracting features to study the set of parameters that yield the most descriptive dictionary. The approach inherently learns the complicated morphological changes in ECG waveform, which is then used to improve the classification. The classification performance was evaluated with ECG data under two different preprocessing environments. In the first category, QT-database is baseline drift corrected with notch filter and it filters the 60 Hz power line noise. In the second category, the data are further filtered using fast moving average smoother. The experimental results on QT database confirm that our proposed algorithm shows a classification accuracy of 92%.

Keywords: electrocardiogram, dictionary learning, sparse coding, classification

Procedia PDF Downloads 386

774 Performance Evaluation of Distributed Deep Learning Frameworks in Cloud Environment

Authors: Shuen-Tai Wang, Fang-An Kuo, Chau-Yi Chou, Yu-Bin Fang

Abstract:

2016 has become the year of the Artificial Intelligence explosion. AI technologies are getting more and more matured that most world well-known tech giants are making large investment to increase the capabilities in AI. Machine learning is the science of getting computers to act without being explicitly programmed, and deep learning is a subset of machine learning that uses deep neural network to train a machine to learn features directly from data. Deep learning realizes many machine learning applications which expand the field of AI. At the present time, deep learning frameworks have been widely deployed on servers for deep learning applications in both academia and industry. In training deep neural networks, there are many standard processes or algorithms, but the performance of different frameworks might be different. In this paper we evaluate the running performance of two state-of-the-art distributed deep learning frameworks that are running training calculation in parallel over multi GPU and multi nodes in our cloud environment. We evaluate the training performance of the frameworks with ResNet-50 convolutional neural network, and we analyze what factors that result in the performance among both distributed frameworks as well. Through the experimental analysis, we identify the overheads which could be further optimized. The main contribution is that the evaluation results provide further optimization directions in both performance tuning and algorithmic design.

Keywords: artificial intelligence, machine learning, deep learning, convolutional neural networks

Procedia PDF Downloads 211

773 Artificial Intelligence for Traffic Signal Control and Data Collection

Authors: Reggie Chandra

Abstract:

Trafficaccidents and traffic signal optimization are correlated. However, 70-90% of the traffic signals across the USA are not synchronized. The reason behind that is insufficient resources to create and implement timing plans. In this work, we will discuss the use of a breakthrough Artificial Intelligence (AI) technology to optimize traffic flow and collect 24/7/365 accurate traffic data using a vehicle detection system. We will discuss what are recent advances in Artificial Intelligence technology, how does AI work in vehicles, pedestrians, and bike data collection, creating timing plans, and what is the best workflow for that. Apart from that, this paper will showcase how Artificial Intelligence makes signal timing affordable. We will introduce a technology that uses Convolutional Neural Networks (CNN) and deep learning algorithms to detect, collect data, develop timing plans and deploy them in the field. Convolutional Neural Networks are a class of deep learning networks inspired by the biological processes in the visual cortex. A neural net is modeled after the human brain. It consists of millions of densely connected processing nodes. It is a form of machine learning where the neural net learns to recognize vehicles through training - which is called Deep Learning. The well-trained algorithm overcomes most of the issues faced by other detection methods and provides nearly 100% traffic data accuracy. Through this continuous learning-based method, we can constantly update traffic patterns, generate an unlimited number of timing plans and thus improve vehicle flow. Convolutional Neural Networks not only outperform other detection algorithms but also, in cases such as classifying objects into fine-grained categories, outperform humans. Safety is of primary importance to traffic professionals, but they don't have the studies or data to support their decisions. Currently, one-third of transportation agencies do not collect pedestrian and bike data. We will discuss how the use of Artificial Intelligence for data collection can help reduce pedestrian fatalities and enhance the safety of all vulnerable road users. Moreover, it provides traffic engineers with tools that allow them to unleash their potential, instead of dealing with constant complaints, a snapshot of limited handpicked data, dealing with multiple systems requiring additional work for adaptation. The methodologies used and proposed in the research contain a camera model identification method based on deep Convolutional Neural Networks. The proposed application was evaluated on our data sets acquired through a variety of daily real-world road conditions and compared with the performance of the commonly used methods requiring data collection by counting, evaluating, and adapting it, and running it through well-established algorithms, and then deploying it to the field. This work explores themes such as how technologies powered by Artificial Intelligence can benefit your community and how to translate the complex and often overwhelming benefits into a language accessible to elected officials, community leaders, and the public. Exploring such topics empowers citizens with insider knowledge about the potential of better traffic technology to save lives and improve communities. The synergies that Artificial Intelligence brings to traffic signal control and data collection are unsurpassed.

Keywords: artificial intelligence, convolutional neural networks, data collection, signal control, traffic signal

Procedia PDF Downloads 169

772 Real-Time Recognition of Dynamic Hand Postures on a Neuromorphic System

Authors: Qian Liu, Steve Furber

Abstract:

To explore how the brain may recognize objects in its general,accurate and energy-efficient manner, this paper proposes the use of a neuromorphic hardware system formed from a Dynamic Video Sensor~(DVS) silicon retina in concert with the SpiNNaker real-time Spiking Neural Network~(SNN) simulator. As a first step in the exploration on this platform a recognition system for dynamic hand postures is developed, enabling the study of the methods used in the visual pathways of the brain. Inspired by the behaviours of the primary visual cortex, Convolutional Neural Networks (CNNs) are modeled using both linear perceptrons and spiking Leaky Integrate-and-Fire (LIF) neurons. In this study's largest configuration using these approaches, a network of 74,210 neurons and 15,216,512 synapses is created and operated in real-time using 290 SpiNNaker processor cores in parallel and with 93.0% accuracy. A smaller network using only 1/10th of the resources is also created, again operating in real-time, and it is able to recognize the postures with an accuracy of around 86.4% -only 6.6% lower than the much larger system. The recognition rate of the smaller network developed on this neuromorphic system is sufficient for a successful hand posture recognition system, and demonstrates a much-improved cost to performance trade-off in its approach.

Keywords: spiking neural network (SNN), convolutional neural network (CNN), posture recognition, neuromorphic system

Procedia PDF Downloads 472

771 Reed: An Approach Towards Quickly Bootstrapping Multilingual Acoustic Models

Authors: Bipasha Sen, Aditya Agarwal

Abstract:

Multilingual automatic speech recognition (ASR) system is a single entity capable of transcribing multiple languages sharing a common phone space. Performance of such a system is highly dependent on the compatibility of the languages. State of the art speech recognition systems are built using sequential architectures based on recurrent neural networks (RNN) limiting the computational parallelization in training. This poses a significant challenge in terms of time taken to bootstrap and validate the compatibility of multiple languages for building a robust multilingual system. Complex architectural choices based on self-attention networks are made to improve the parallelization thereby reducing the training time. In this work, we propose Reed, a simple system based on 1D convolutions which uses very short context to improve the training time. To improve the performance of our system, we use raw time-domain speech signals directly as input. This enables the convolutional layers to learn feature representations rather than relying on handcrafted features such as MFCC. We report improvement on training and inference times by atleast a factor of 4x and 7.4x respectively with comparable WERs against standard RNN based baseline systems on SpeechOcean's multilingual low resource dataset.

Keywords: convolutional neural networks, language compatibility, low resource languages, multilingual automatic speech recognition

Procedia PDF Downloads 123

770 Reactive and Concurrency-Based Image Resource Management Module for iOS Applications

Authors: Shubham V. Kamdi

Abstract:

This paper aims to serve as an introduction to image resource caching techniques for iOS mobile applications. It will explain how developers can break down multiple image-downloading tasks concurrently using state-of-the-art iOS frameworks, namely Swift Concurrency and Combine. The paper will explain how developers can leverage SwiftUI to develop reactive view components and use declarative coding patterns. Developers will learn to bypass built-in image caching systems by curating the procedure to implement a swift-based LRU cache system. The paper will provide a full architectural overview of a system, helping readers understand how mobile applications are designed professionally. It will cover technical discussion, helping readers understand the low-level details of threads and how they can switch between them, as well as the significance of the main and background threads for requesting HTTP services via mobile applications.

Keywords: main thread, background thread, reactive view components, declarative coding

Procedia PDF Downloads 25

769 Exploring the Development of Communicative Skills in English Teaching Students: A Phenomenological Study During Online Instruction

Authors: Estephanie S. López Contreras, Vicente Aranda Palacios, Daniela Flores Silva, Felipe Oliveros Olivares, Romina Riquelme Escobedo, Iñaki Westerhout Usabiaga

Abstract:

This research explored whether the context of online instruction has influenced the development of first-year English-teaching students' communication skills, being these speaking and listening. The theoretical basis finds its niche in the need to bridge the gap in knowledge about the Chilean online educational context and the development of English communicative skills. An interpretative paradigm and a phenomenological design were implemented in this study. Twenty- two first-year students and two teachers from an English teaching training program participated in the study. The students' ages ranged from 18 to 26 years of age, and the teachers' years of experience ranged from 5 to 13 years in the program. For data collection purposes, semi- structured interviews were applied to both students and teachers. Interview questions were based on the initial conceptualization of the central phenomenon. Observations, field notes, and focus groups with the students are also part of the data collection process. Data analysis considered two-cycle methods. The first included descriptive coding for field notes, initial coding for interviews, and creating a codebook. The second cycle included axial coding for both field notes and interviews. After data analysis, the findings show that students perceived online classes as instances in which active communication cannot always occur. In addition, changes made to the curricula as a consequence of the COVID-19 pandemic have affected students' speaking and listening skills.

Keywords: attitudes, communicative skills, EFL teaching training program, online instruction, and perceptions

Procedia PDF Downloads 119

768 Non-intrusive Hand Control of Drone Using an Inexpensive and Streamlined Convolutional Neural Network Approach

Authors: Evan Lowhorn, Rocio Alba-Flores

Abstract:

The purpose of this work is to develop a method for classifying hand signals and using the output in a drone control algorithm. To achieve this, methods based on Convolutional Neural Networks (CNN) were applied. CNN's are a subset of deep learning, which allows grid-like inputs to be processed and passed through a neural network to be trained for classification. This type of neural network allows for classification via imaging, which is less intrusive than previous methods using biosensors, such as EMG sensors. Classification CNN's operate purely from the pixel values in an image; therefore they can be used without additional exteroceptive sensors. A development bench was constructed using a desktop computer connected to a high-definition webcam mounted on a scissor arm. This allowed the camera to be pointed downwards at the desk to provide a constant solid background for the dataset and a clear detection area for the user. A MATLAB script was created to automate dataset image capture at the development bench and save the images to the desktop. This allowed the user to create their own dataset of 12,000 images within three hours. These images were evenly distributed among seven classes. The defined classes include forward, backward, left, right, idle, and land. The drone has a popular flip function which was also included as an additional class. To simplify control, the corresponding hand signals chosen were the numerical hand signs for one through five for movements, a fist for land, and the universal “ok” sign for the flip command. Transfer learning with PyTorch (Python) was performed using a pre-trained 18-layer residual learning network (ResNet-18) to retrain the network for custom classification. An algorithm was created to interpret the classification and send encoded messages to a Ryze Tello drone over its 2.4 GHz Wi-Fi connection. The drone’s movements were performed in half-meter distance increments at a constant speed. When combined with the drone control algorithm, the classification performed as desired with negligible latency when compared to the delay in the drone’s movement commands.

Keywords: classification, computer vision, convolutional neural networks, drone control

Procedia PDF Downloads 210

767 Facial Behavior Modifications Following the Diffusion of the Use of Protective Masks Due to COVID-19

Authors: Andreas Aceranti, Simonetta Vernocchi, Marco Colorato, Daniel Zaccariello

Abstract:

Our study explores the usefulness of implementing facial expression recognition capabilities and using the Facial Action Coding System (FACS) in contexts where the other person is wearing a mask. In the communication process, the subjects use a plurality of distinct and autonomous reporting systems. Among them, the system of mimicking facial movements is worthy of attention. Basic emotion theorists have identified the existence of specific and universal patterns of facial expressions related to seven basic emotions -anger, disgust, contempt, fear, sadness, surprise, and happiness- that would distinguish one emotion from another. However, due to the COVID-19 pandemic, we have come up against the problem of having the lower half of the face covered and, therefore, not investigable due to the masks. Facial-emotional behavior is a good starting point for understanding: (1) the affective state (such as emotions), (2) cognitive activity (perplexity, concentration, boredom), (3) temperament and personality traits (hostility, sociability, shyness), (4) psychopathology (such as diagnostic information relevant to depression, mania, schizophrenia, and less severe disorders), (5) psychopathological processes that occur during social interactions patient and analyst. There are numerous methods to measure facial movements resulting from the action of muscles, see for example, the measurement of visible facial actions using coding systems (non-intrusive systems that require the presence of an observer who encodes and categorizes behaviors) and the measurement of electrical "discharges" of contracting muscles (facial electromyography; EMG). However, the measuring system invented by Ekman and Friesen (2002) - "Facial Action Coding System - FACS" is the most comprehensive, complete, and versatile. Our study, carried out on about 1,500 subjects over three years of work, allowed us to highlight how the movements of the hands and upper part of the face change depending on whether the subject wears a mask or not. We have been able to identify specific alterations to the subjects’ hand movement patterns and their upper face expressions while wearing masks compared to when not wearing them. We believe that finding correlations between how body language changes when our facial expressions are impaired can provide a better understanding of the link between the face and body non-verbal language.

Keywords: facial action coding system, COVID-19, masks, facial analysis

Procedia PDF Downloads 77

766 Smart Defect Detection in XLPE Cables Using Convolutional Neural Networks

Authors: Tesfaye Mengistu

Abstract:

Power cables play a crucial role in the transmission and distribution of electrical energy. As the electricity generation, transmission, distribution, and storage systems become smarter, there is a growing emphasis on incorporating intelligent approaches to ensure the reliability of power cables. Various types of electrical cables are employed for transmitting and distributing electrical energy, with cross-linked polyethylene (XLPE) cables being widely utilized due to their exceptional electrical and mechanical properties. However, insulation defects can occur in XLPE cables due to subpar manufacturing techniques during production and cable joint installation. To address this issue, experts have proposed different methods for monitoring XLPE cables. Some suggest the use of interdigital capacitive (IDC) technology for online monitoring, while others propose employing continuous wave (CW) terahertz (THz) imaging systems to detect internal defects in XLPE plates used for power cable insulation. In this study, we have developed models that employ a custom dataset collected locally to classify the physical safety status of individual power cables. Our models aim to replace physical inspections with computer vision and image processing techniques to classify defective power cables from non-defective ones. The implementation of our project utilized the Python programming language along with the TensorFlow package and a convolutional neural network (CNN). The CNN-based algorithm was specifically chosen for power cable defect classification. The results of our project demonstrate the effectiveness of CNNs in accurately classifying power cable defects. We recommend the utilization of similar or additional datasets to further enhance and refine our models. Additionally, we believe that our models could be used to develop methodologies for detecting power cable defects from live video feeds. We firmly believe that our work makes a significant contribution to the field of power cable inspection and maintenance. Our models offer a more efficient and cost-effective approach to detecting power cable defects, thereby improving the reliability and safety of power grids.

Keywords: artificial intelligence, computer vision, defect detection, convolutional neural net

Procedia PDF Downloads 112

765 Correlation between Speech Emotion Recognition Deep Learning Models and Noises

Authors: Leah Lee

Abstract:

This paper examines the correlation between deep learning models and emotions with noises to see whether or not noises mask emotions. The deep learning models used are plain convolutional neural networks (CNN), auto-encoder, long short-term memory (LSTM), and Visual Geometry Group-16 (VGG-16). Emotion datasets used are Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D), Toronto Emotional Speech Set (TESS), and Surrey Audio-Visual Expressed Emotion (SAVEE). To make it four times bigger, audio set files, stretch, and pitch augmentations are utilized. From the augmented datasets, five different features are extracted for inputs of the models. There are eight different emotions to be classified. Noise variations are white noise, dog barking, and cough sounds. The variation in the signal-to-noise ratio (SNR) is 0, 20, and 40. In summation, per a deep learning model, nine different sets with noise and SNR variations and just augmented audio files without any noises will be used in the experiment. To compare the results of the deep learning models, the accuracy and receiver operating characteristic (ROC) are checked.

Keywords: auto-encoder, convolutional neural networks, long short-term memory, speech emotion recognition, visual geometry group-16

Procedia PDF Downloads 75

764 A Preliminary Comparative Study Between the United Kingdom and Taiwan: Public Private Collaboration and Cooperation in Tackling Large Scale Cyberattacks

Authors: Chi-Hsuan Cheng

Abstract:

This research aims to evaluate public-private partnerships against cyberattacks by comparing the UK and Taiwan. First, the study analyses major cyberattacks and factors influencing cybersecurity in both countries. Second, it assesses the effectiveness of current cyber defence strategies in combating cyberattacks by comparing the approaches taken in the UK and Taiwan, while also evaluating the cyber resilience of both nations. Lastly, the research evaluates existing public-private partnerships by comparing those in the UK and Taiwan, and proposes recommendations for enhancing cooperation and collaboration mechanisms in tackling cyberattacks. Grounded theory serves as the core research method. Theoretical sampling is used to recruit participants in both the UK and Taiwan, including investigators, police officers, and professionals from cybersecurity firms. Semi-structured interviews are conducted in English in the UK and Mandarin in Taiwan, recorded with consent, and pseudonymised for privacy. Data analysis involves open coding, grouping excerpts into codes, and categorising codes. Axial coding connects codes into categories, leading to the development of a codebook. The process continues iteratively until theoretical saturation is reached. Finally, selective coding identifies the core topic, evaluating public-private cooperation against cyberattacks and its implications for social and policing strategies in the UK and Taiwan, which highlights the current status of the cybersecurity industry, governmental plans for cybersecurity, and contributions to cybersecurity from both government sectors and cybersecurity firms, with a particular focus on public-private partnerships. In summary, this research aims to offer practical recommendations to law enforcement, private sectors, and academia for reflecting on current strategies and tailoring future approaches in cybersecurity

Keywords: cybersecurity, cybercrime, public private partnerships, cyberattack

Procedia PDF Downloads 75

763 Designing Online Professional Development Courses Using Video-Based Instruction to Teach Robotics and Computer Science

Authors: Alaina Caulkett, Audra Selkowitz, Lauren Harter, Aimee DeFoe

Abstract:

Educational robotics is an effective tool for teaching and learning STEM curricula. Yet, most traditional professional development programs do not cover engineering, coding, or robotics. This paper will give an overview of how and why the VEX Professional Development Plus Introductory Training courses were developed to provide guided, simple professional development in the area of robotics and computer science instruction. These training courses guide educators through learning the basics of VEX robotics platforms, including VEX 123, GO, IQ, and EXP. Because many educators do not have experience teaching robotics or computer science, this course is meant to simulate one on one training or tutoring through video-based instruction. These videos, led by education professionals, can be watched at any time, which allows educators to watch at their own pace and create their own personalized professional development timeline. This personalization expands beyond the course itself into an online community where educators at different points in the self-paced course can converse with one another or with instructors from the videos and learn from a growing community of practice. By the end of each course, educators are armed with the skills to introduce robotics or computer science in their classroom or educational setting. The design of the course was guided by a variation of the Understanding by Design (UbD) framework and included hands-on activities and challenges to keep educators engaged and excited about robotics. Some of the concepts covered include, but are not limited to, following build instructions, building a robot, updating firmware, coding the robot to drive and turn autonomously, coding a robot using multiple methods, and considerations for teaching robotics and computer science in the classroom, and more. A secondary goal of this research is to discuss how this professional development approach can serve as an example in the larger educational community and explore ways that it could be further researched or used in the future.

Keywords: computer science education, online professional development, professional development, robotics education, video-based instruction

Procedia PDF Downloads 100