Search results for: image recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4000

Search results for: image recognition

3880 Human Activities Recognition Based on Expert System

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

Recognition of human activities from sensor data is an active research area, and the main objective is to obtain a high recognition rate. In this work, we propose a recognition system based on expert systems. The proposed system makes the recognition based on the objects, object states, and gestures, taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions, and the activity). This work focuses on complex activities which are decomposed into simple easy to recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 99
3879 A Unified Deep Framework for Joint 3d Pose Estimation and Action Recognition from a Single Color Camera

Authors: Huy Hieu Pham, Houssam Salmane, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, Sergio Velastin

Abstract:

We present a deep learning-based multitask framework for joint 3D human pose estimation and action recognition from color video sequences. Our approach proceeds along two stages. In the first, we run a real-time 2D pose detector to determine the precise pixel location of important key points of the body. A two-stream neural network is then designed and trained to map detected 2D keypoints into 3D poses. In the second, we deploy the Efficient Neural Architecture Search (ENAS) algorithm to find an optimal network architecture that is used for modeling the Spatio-temporal evolution of the estimated 3D poses via an image-based intermediate representation and performing action recognition. Experiments on Human3.6M, Microsoft Research Redmond (MSR) Action3D, and Stony Brook University (SBU) Kinect Interaction datasets verify the effectiveness of the proposed method on the targeted tasks. Moreover, we show that our method requires a low computational budget for training and inference.

Keywords: human action recognition, pose estimation, D-CNN, deep learning

Procedia PDF Downloads 110
3878 Non-Targeted Adversarial Object Detection Attack: Fast Gradient Sign Method

Authors: Bandar Alahmadi, Manohar Mareboyana, Lethia Jackson

Abstract:

Today, there are many applications that are using computer vision models, such as face recognition, image classification, and object detection. The accuracy of these models is very important for the performance of these applications. One challenge that facing the computer vision models is the adversarial examples attack. In computer vision, the adversarial example is an image that is intentionally designed to cause the machine learning model to misclassify it. One of very well-known method that is used to attack the Convolution Neural Network (CNN) is Fast Gradient Sign Method (FGSM). The goal of this method is to find the perturbation that can fool the CNN using the gradient of the cost function of CNN. In this paper, we introduce a novel model that can attack Regional-Convolution Neural Network (R-CNN) that use FGSM. We first extract the regions that are detected by R-CNN, and then we resize these regions into the size of regular images. Then, we find the best perturbation of the regions that can fool CNN using FGSM. Next, we add the resulted perturbation to the attacked region to get a new region image that looks similar to the original image to human eyes. Finally, we placed the regions back to the original image and test the R-CNN with the attacked images. Our model could drop the accuracy of the R-CNN when we tested with Pascal VOC 2012 dataset.

Keywords: adversarial examples, attack, computer vision, image processing

Procedia PDF Downloads 156
3877 Texture Analysis of Grayscale Co-Occurrence Matrix on Mammographic Indexed Image

Authors: S. Sushma, S. Balasubramanian, K. C. Latha

Abstract:

The mammographic image of breast cancer compressed and synthesized to get co-efficient values which will be converted (5x5) matrix to get ROI image where we get the highest value of effected region and with the same ideology the technique has been extended to differentiate between Calcification and normal cell image using mean value derived from 5x5 matrix values

Keywords: texture analysis, mammographic image, partitioned gray scale co-oocurance matrix, co-efficient

Procedia PDF Downloads 499
3876 Logistic Model Tree and Expectation-Maximization for Pollen Recognition and Grouping

Authors: Endrick Barnacin, Jean-Luc Henry, Jack Molinié, Jimmy Nagau, Hélène Delatte, Gérard Lebreton

Abstract:

Palynology is a field of interest for many disciplines. It has multiple applications such as chronological dating, climatology, allergy treatment, and even honey characterization. Unfortunately, the analysis of a pollen slide is a complicated and time-consuming task that requires the intervention of experts in the field, which is becoming increasingly rare due to economic and social conditions. So, the automation of this task is a necessity. Pollen slides analysis is mainly a visual process as it is carried out with the naked eye. That is the reason why a primary method to automate palynology is the use of digital image processing. This method presents the lowest cost and has relatively good accuracy in pollen retrieval. In this work, we propose a system combining recognition and grouping of pollen. It consists of using a Logistic Model Tree to classify pollen already known by the proposed system while detecting any unknown species. Then, the unknown pollen species are divided using a cluster-based approach. Success rates for the recognition of known species have been achieved, and automated clustering seems to be a promising approach.

Keywords: pollen recognition, logistic model tree, expectation-maximization, local binary pattern

Procedia PDF Downloads 150
3875 Image Analysis for Obturator Foramen Based on Marker-controlled Watershed Segmentation and Zernike Moments

Authors: Seda Sahin, Emin Akata

Abstract:

Obturator foramen is a specific structure in pelvic bone images and recognition of it is a new concept in medical image processing. Moreover, segmentation of bone structures such as obturator foramen plays an essential role for clinical research in orthopedics. In this paper, we present a novel method to analyze the similarity between the substructures of the imaged region and a hand drawn template, on hip radiographs to detect obturator foramen accurately with integrated usage of Marker-controlled Watershed segmentation and Zernike moment feature descriptor. Marker-controlled Watershed segmentation is applied to seperate obturator foramen from the background effectively. Zernike moment feature descriptor is used to provide matching between binary template image and the segmented binary image for obturator foramens for final extraction. The proposed method is tested on randomly selected 100 hip radiographs. The experimental results represent that our method is able to segment obturator foramens with % 96 accuracy.

Keywords: medical image analysis, segmentation of bone structures on hip radiographs, marker-controlled watershed segmentation, zernike moment feature descriptor

Procedia PDF Downloads 400
3874 Accuracy Improvement of Traffic Participant Classification Using Millimeter-Wave Radar by Leveraging Simulator Based on Domain Adaptation

Authors: Tokihiko Akita, Seiichi Mita

Abstract:

A millimeter-wave radar is the most robust against adverse environments, making it an essential environment recognition sensor for automated driving. However, the reflection signal is sparse and unstable, so it is difficult to obtain the high recognition accuracy. Deep learning provides high accuracy even for them in recognition, but requires large scale datasets with ground truth. Specially, it takes a lot of cost to annotate for a millimeter-wave radar. For the solution, utilizing a simulator that can generate an annotated huge dataset is effective. Simulation of the radar is more difficult to match with real world data than camera image, and recognition by deep learning with higher-order features using the simulator causes further deviation. We have challenged to improve the accuracy of traffic participant classification by fusing simulator and real-world data with domain adaptation technique. Experimental results with the domain adaptation network created by us show that classification accuracy can be improved even with a few real-world data.

Keywords: millimeter-wave radar, object classification, deep learning, simulation, domain adaptation

Procedia PDF Downloads 55
3873 Large Neural Networks Learning From Scratch With Very Few Data and Without Explicit Regularization

Authors: Christoph Linse, Thomas Martinetz

Abstract:

Recent findings have shown that Neural Networks generalize also in over-parametrized regimes with zero training error. This is surprising, since it is completely against traditional machine learning wisdom. In our empirical study we fortify these findings in the domain of fine-grained image classification. We show that very large Convolutional Neural Networks with millions of weights do learn with only a handful of training samples and without image augmentation, explicit regularization or pretraining. We train the architectures ResNet018, ResNet101 and VGG19 on subsets of the difficult benchmark datasets Caltech101, CUB_200_2011, FGVCAircraft, Flowers102 and StanfordCars with 100 classes and more, perform a comprehensive comparative study and draw implications for the practical application of CNNs. Finally, we show that VGG19 with 140 million weights learns to distinguish airplanes and motorbikes with up to 95% accuracy using only 20 training samples per class.

Keywords: convolutional neural networks, fine-grained image classification, generalization, image recognition, over-parameterized, small data sets

Procedia PDF Downloads 54
3872 Size Reduction of Images Using Constraint Optimization Approach for Machine Communications

Authors: Chee Sun Won

Abstract:

This paper presents the size reduction of images for machine-to-machine communications. Here, the salient image regions to be preserved include the image patches of the key-points such as corners and blobs. Based on a saliency image map from the key-points and their image patches, an axis-aligned grid-size optimization is proposed for the reduction of image size. To increase the size-reduction efficiency the aspect ratio constraint is relaxed in the constraint optimization framework. The proposed method yields higher matching accuracy after the size reduction than the conventional content-aware image size-reduction methods.

Keywords: image compression, image matching, key-point detection and description, machine-to-machine communication

Procedia PDF Downloads 381
3871 Scar Removal Stretegy for Fingerprint Using Diffusion

Authors: Mohammad A. U. Khan, Tariq M. Khan, Yinan Kong

Abstract:

Fingerprint image enhancement is one of the most important step in an automatic fingerprint identification recognition (AFIS) system which directly affects the overall efficiency of AFIS. The conventional fingerprint enhancement like Gabor and Anisotropic filters do fill the gaps in ridge lines but they fail to tackle scar lines. To deal with this problem we are proposing a method for enhancing the ridges and valleys with scar so that true minutia points can be extracted with accuracy. Our results have shown an improved performance in terms of enhancement.

Keywords: fingerprint image enhancement, removing noise, coherence, enhanced diffusion

Procedia PDF Downloads 482
3870 Modern Machine Learning Conniptions for Automatic Speech Recognition

Authors: S. Jagadeesh Kumar

Abstract:

This expose presents a luculent of recent machine learning practices as employed in the modern and as pertinent to prospective automatic speech recognition schemes. The aspiration is to promote additional traverse ablution among the machine learning and automatic speech recognition factions that have transpired in the precedent. The manuscript is structured according to the chief machine learning archetypes that are furthermore trendy by now or have latency for building momentous hand-outs to automatic speech recognition expertise. The standards offered and convoluted in this article embraces adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, supervised and unsupervised learning. These learning archetypes are aggravated and conferred in the perspective of automatic speech recognition tools and functions. This manuscript bequeaths and surveys topical advances of deep learning and learning with sparse depictions; further limelight is on their incessant significance in the evolution of automatic speech recognition.

Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning

Procedia PDF Downloads 412
3869 Definition, Structure, and Core Functions of the State Image

Authors: Rosa Nurtazina, Yerkebulan Zhumashov, Maral Tomanova

Abstract:

Humanity is entering an era when 'virtual reality' as the image of the world created by the media with the help of the Internet does not match the reality in many respects, when new communication technologies create a fundamentally different and previously unknown 'global space'. According to these technologies, the state begins to change the basic technology of political communication of the state and society, the state and the state. Nowadays, image of the state becomes the most important tool and technology. Image is a purposefully created image granting political object (person, organization, country, etc.) certain social and political values and promoting more emotional perception. Political image of the state plays an important role in international relations. The success of the country's foreign policy, development of trade and economic relations with other countries depends on whether it is positive or negative. Foreign policy image has an impact on political processes taking place in the state: the negative image of the countries can be used by opposition forces as one of the arguments to criticize the government and its policies.

Keywords: image of the country, country's image classification, function of the country image, country's image components

Procedia PDF Downloads 391
3868 Advances in Artificial intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: speech recognition, acoustic phonetic, artificial intelligence, hidden markov models (HMM), statistical models of speech recognition, human machine performance

Procedia PDF Downloads 440
3867 Biometric Recognition Techniques: A Survey

Authors: Shabir Ahmad Sofi, Shubham Aggarwal, Sanyam Singhal, Roohie Naaz

Abstract:

Biometric recognition refers to an automatic recognition of individuals based on a feature vector(s) derived from their physiological and/or behavioral characteristic. Biometric recognition systems should provide a reliable personal recognition schemes to either confirm or determine the identity of an individual. These features are used to provide an authentication for computer based security systems. Applications of such a system include computer systems security, secure electronic banking, mobile phones, credit cards, secure access to buildings, health and social services. By using biometrics a person could be identified based on 'who she/he is' rather than 'what she/he has' (card, token, key) or 'what she/he knows' (password, PIN). In this paper, a brief overview of biometric methods, both unimodal and multimodal and their advantages and disadvantages, will be presented.

Keywords: biometric, DNA, fingerprint, ear, face, retina scan, gait, iris, voice recognition, unimodal biometric, multimodal biometric

Procedia PDF Downloads 726
3866 A New Scheme for Chain Code Normalization in Arabic and Farsi Scripts

Authors: Reza Shakoori

Abstract:

This paper presents a structural correction of Arabic and Persian strokes using manipulation of their chain codes in order to improve the rate and performance of Persian and Arabic handwritten word recognition systems. It collects pure and effective features to represent a character with one consolidated feature vector and reduces variations in order to decrease the number of training samples and increase the chance of successful classification. Our results also show that how the proposed approaches can simplify classification and consequently recognition by reducing variations and possible noises on the chain code by keeping orientation of characters and their backbone structures.

Keywords: Arabic, chain code normalization, OCR systems, image processing

Procedia PDF Downloads 368
3865 Bitplanes Gray-Level Image Encryption Approach Using Arnold Transform

Authors: Ali Abdrhman M. Ukasha

Abstract:

Data security needed in data transmission, storage, and communication to ensure the security. The single step parallel contour extraction (SSPCE) method is used to create the edge map as a key image from the different Gray level/Binary image. Performing the X-OR operation between the key image and each bit plane of the original image for image pixel values change purpose. The Arnold transform used to changes the locations of image pixels as image scrambling process. Experiments have demonstrated that proposed algorithm can fully encrypt 2D Gary level image and completely reconstructed without any distortion. Also shown that the analyzed algorithm have extremely large security against some attacks like salt & pepper and JPEG compression. Its proof that the Gray level image can be protected with a higher security level. The presented method has easy hardware implementation and suitable for multimedia protection in real time applications such as wireless networks and mobile phone services.

Keywords: SSPCE method, image compression-salt- peppers attacks, bitplanes decomposition, Arnold transform, lossless image encryption

Procedia PDF Downloads 395
3864 Printed Thai Character Recognition Using Particle Swarm Optimization Algorithm

Authors: Phawin Sangsuvan, Chutimet Srinilta

Abstract:

This Paper presents the applications of Particle Swarm Optimization (PSO) Method for Thai optical character recognition (OCR). OCR consists of the pre-processing, character recognition and post-processing. Before enter into recognition process. The Character must be “Prepped” by pre-processing process. The PSO is an optimization method that belongs to the swarm intelligence family based on the imitation of social behavior patterns of animals. Route of each particle is determined by an individual data among neighborhood particles. The interaction of the particles with neighbors is the advantage of Particle Swarm to determine the best solution. So PSO is interested by a lot of researchers in many difficult problems including character recognition. As the previous this research used a Projection Histogram to extract printed digits features and defined the simple Fitness Function for PSO. The results reveal that PSO gives 67.73% for testing dataset. So in the future there can be explored enhancement the better performance of PSO with improve the Fitness Function.

Keywords: character recognition, histogram projection, particle swarm optimization, pattern recognition techniques

Procedia PDF Downloads 437
3863 Integral Image-Based Differential Filters

Authors: Kohei Inoue, Kenji Hara, Kiichi Urahama

Abstract:

We describe a relationship between integral images and differential images. First, we derive a simple difference filter from conventional integral image. In the derivation, we show that an integral image and the corresponding differential image are related to each other by simultaneous linear equations, where the numbers of unknowns and equations are the same, and therefore, we can execute the integration and differentiation by solving the simultaneous equations. We applied the relationship to an image fusion problem, and experimentally verified the effectiveness of the proposed method.

Keywords: integral images, differential images, differential filters, image fusion

Procedia PDF Downloads 471
3862 Water Detection in Aerial Images Using Fuzzy Sets

Authors: Caio Marcelo Nunes, Anderson da Silva Soares, Gustavo Teodoro Laureano, Clarimar Jose Coelho

Abstract:

This paper presents a methodology to pixel recognition in aerial images using fuzzy $c$-means algorithm. This algorithm is a alternative to recognize areas considering uncertainties and inaccuracies. Traditional clustering technics are used in recognizing of multispectral images of earth's surface. This technics recognize well-defined borders that can be easily discretized. However, in the real world there are many areas with uncertainties and inaccuracies which can be mapped by clustering algorithms that use fuzzy sets. The methodology presents in this work is applied to multispectral images obtained from Landsat-5/TM satellite. The pixels are joined using the $c$-means algorithm. After, a classification process identify the types of surface according the patterns obtained from spectral response of image surface. The classes considered are, exposed soil, moist soil, vegetation, turbid water and clean water. The results obtained shows that the fuzzy clustering identify the real type of the earth's surface.

Keywords: aerial images, fuzzy clustering, image processing, pattern recognition

Procedia PDF Downloads 438
3861 RoboWeedSupport-Sub Millimeter Weed Image Acquisition in Cereal Crops with Speeds up till 50 Km/H

Authors: Morten Stigaard Laursen, Rasmus Nyholm Jørgensen, Mads Dyrmann, Robert Poulsen

Abstract:

For the past three years, the Danish project, RoboWeedSupport, has sought to bridge the gap between the potential herbicide savings using a decision support system and the required weed inspections. In order to automate the weed inspections it is desired to generate a map of the weed species present within the field, to generate the map images must be captured with samples covering the field. This paper investigates the economical cost of performing this data collection based on a camera system mounted on a all-terain vehicle (ATV) able to drive and collect data at up to 50 km/h while still maintaining a image quality sufficient for identifying newly emerged grass weeds. The economical estimates are based on approximately 100 hectares recorded at three different locations in Denmark. With an average image density of 99 images per hectare the ATV had an capacity of 28 ha per hour, which is estimated to cost 6.6 EUR/ha. Alternatively relying on a boom solution for an existing tracktor it was estimated that a cost of 2.4 EUR/ha is obtainable under equal conditions.

Keywords: weed mapping, integrated weed management, weed recognition, image acquisition

Procedia PDF Downloads 201
3860 Nation Branding: Guidelines for Identity Development and Image Perception of Thailand Brand in Health and Wellness Tourism

Authors: Jiraporn Prommaha

Abstract:

The purpose of this research is to study the development of Thailand Brand Identity and the perception of its image in order to find any guidelines for the identity development and the image perception of Thailand Brand in Health and Wellness Tourism. The paper is conducted through mixed methods research, both the qualitative and quantitative researches. The qualitative focuses on the in-depth interview of executive administrations from public and private sectors involved scholars and experts in identity and image issue, main 11 people. The quantitative research was done by the questionnaires to collect data from foreign tourists 800; Chinese tourists 400 and UK tourists 400. The technique used for this was the Exploratory Factor Analysis (EFA), this was to determine the relation between the structures of the variables by categorizing the variables into group by applying the Varimax rotation technique. This technique showed recognition the Thailand brand image related to the 2 countries, China and UK. The results found that guidelines for brand identity development and image perception of health and wellness tourism in Thailand; as following (1) Develop communication in order to understanding of the meaning of the word 'Health and beauty tourism' throughout the country, (2) Develop human resources as a national agenda, (3) Develop awareness rising in the conservation and preservation of natural resources of the country, (4) Develop the cooperation of all stakeholders in Health and Wellness Businesses, (5) Develop digital communication throughout the country and (6) Develop safety in Tourism.

Keywords: brand identity, image perception, nation branding, health and wellness tourism, mixed methods research

Procedia PDF Downloads 168
3859 Spatial Object-Oriented Template Matching Algorithm Using Normalized Cross-Correlation Criterion for Tracking Aerial Image Scene

Authors: Jigg Pelayo, Ricardo Villar

Abstract:

Leaning on the development of aerial laser scanning in the Philippine geospatial industry, researches about remote sensing and machine vision technology became a trend. Object detection via template matching is one of its application which characterized to be fast and in real time. The paper purposely attempts to provide application for robust pattern matching algorithm based on the normalized cross correlation (NCC) criterion function subjected in Object-based image analysis (OBIA) utilizing high-resolution aerial imagery and low density LiDAR data. The height information from laser scanning provides effective partitioning order, thus improving the hierarchal class feature pattern which allows to skip unnecessary calculation. Since detection is executed in the object-oriented platform, mathematical morphology and multi-level filter algorithms were established to effectively avoid the influence of noise, small distortion and fluctuating image saturation that affect the rate of recognition of features. Furthermore, the scheme is evaluated to recognized the performance in different situations and inspect the computational complexities of the algorithms. Its effectiveness is demonstrated in areas of Misamis Oriental province, achieving an overall accuracy of 91% above. Also, the garnered results portray the potential and efficiency of the implemented algorithm under different lighting conditions.

Keywords: algorithm, LiDAR, object recognition, OBIA

Procedia PDF Downloads 221
3858 Bitplanes Image Encryption/Decryption Using Edge Map (SSPCE Method) and Arnold Transform

Authors: Ali A. Ukasha

Abstract:

Data security needed in data transmission, storage, and communication to ensure the security. The single step parallel contour extraction (SSPCE) method is used to create the edge map as a key image from the different Gray level/Binary image. Performing the X-OR operation between the key image and each bit plane of the original image for image pixel values change purpose. The Arnold transform used to changes the locations of image pixels as image scrambling process. Experiments have demonstrated that proposed algorithm can fully encrypt 2D Gary level image and completely reconstructed without any distortion. Also shown that the analyzed algorithm have extremely large security against some attacks like salt & pepper and JPEG compression. Its proof that the Gray level image can be protected with a higher security level. The presented method has easy hardware implementation and suitable for multimedia protection in real time applications such as wireless networks and mobile phone services.

Keywords: SSPCE method, image compression, salt and peppers attacks, bitplanes decomposition, Arnold transform, lossless image encryption

Procedia PDF Downloads 454
3857 Automatic Reporting System for Transcriptome Indel Identification and Annotation Based on Snapshot of Next-Generation Sequencing Reads Alignment

Authors: Shuo Mu, Guangzhi Jiang, Jinsa Chen

Abstract:

The analysis of Indel for RNA sequencing of clinical samples is easily affected by sequencing experiment errors and software selection. In order to improve the efficiency and accuracy of analysis, we developed an automatic reporting system for Indel recognition and annotation based on image snapshot of transcriptome reads alignment. This system includes sequence local-assembly and realignment, target point snapshot, and image-based recognition processes. We integrated high-confidence Indel dataset from several known databases as a training set to improve the accuracy of image processing and added a bioinformatical processing module to annotate and filter Indel artifacts. Subsequently, the system will automatically generate data, including data quality levels and images results report. Sanger sequencing verification of the reference Indel mutation of cell line NA12878 showed that the process can achieve 83% sensitivity and 96% specificity. Analysis of the collected clinical samples showed that the interpretation accuracy of the process was equivalent to that of manual inspection, and the processing efficiency showed a significant improvement. This work shows the feasibility of accurate Indel analysis of clinical next-generation sequencing (NGS) transcriptome. This result may be useful for RNA study for clinical samples with microsatellite instability in immunotherapy in the future.

Keywords: automatic reporting, indel, next-generation sequencing, NGS, transcriptome

Procedia PDF Downloads 144
3856 Adversarial Attacks and Defenses on Deep Neural Networks

Authors: Jonathan Sohn

Abstract:

Deep neural networks (DNNs) have shown state-of-the-art performance for many applications, including computer vision, natural language processing, and speech recognition. Recently, adversarial attacks have been studied in the context of deep neural networks, which aim to alter the results of deep neural networks by modifying the inputs slightly. For example, an adversarial attack on a DNN used for object detection can cause the DNN to miss certain objects. As a result, the reliability of DNNs is undermined by their lack of robustness against adversarial attacks, raising concerns about their use in safety-critical applications such as autonomous driving. In this paper, we focus on studying the adversarial attacks and defenses on DNNs for image classification. There are two types of adversarial attacks studied which are fast gradient sign method (FGSM) attack and projected gradient descent (PGD) attack. A DNN forms decision boundaries that separate the input images into different categories. The adversarial attack slightly alters the image to move over the decision boundary, causing the DNN to misclassify the image. FGSM attack obtains the gradient with respect to the image and updates the image once based on the gradients to cross the decision boundary. PGD attack, instead of taking one big step, repeatedly modifies the input image with multiple small steps. There is also another type of attack called the target attack. This adversarial attack is designed to make the machine classify an image to a class chosen by the attacker. We can defend against adversarial attacks by incorporating adversarial examples in training. Specifically, instead of training the neural network with clean examples, we can explicitly let the neural network learn from the adversarial examples. In our experiments, the digit recognition accuracy on the MNIST dataset drops from 97.81% to 39.50% and 34.01% when the DNN is attacked by FGSM and PGD attacks, respectively. If we utilize FGSM training as a defense method, the classification accuracy greatly improves from 39.50% to 92.31% for FGSM attacks and from 34.01% to 75.63% for PGD attacks. To further improve the classification accuracy under adversarial attacks, we can also use a stronger PGD training method. PGD training improves the accuracy by 2.7% under FGSM attacks and 18.4% under PGD attacks over FGSM training. It is worth mentioning that both FGSM and PGD training do not affect the accuracy of clean images. In summary, we find that PGD attacks can greatly degrade the performance of DNNs, and PGD training is a very effective way to defend against such attacks. PGD attacks and defence are overall significantly more effective than FGSM methods.

Keywords: deep neural network, adversarial attack, adversarial defense, adversarial machine learning

Procedia PDF Downloads 153
3855 Human Action Recognition Using Variational Bayesian HMM with Dirichlet Process Mixture of Gaussian Wishart Emission Model

Authors: Wanhyun Cho, Soonja Kang, Sangkyoon Kim, Soonyoung Park

Abstract:

In this paper, we present the human action recognition method using the variational Bayesian HMM with the Dirichlet process mixture (DPM) of the Gaussian-Wishart emission model (GWEM). First, we define the Bayesian HMM based on the Dirichlet process, which allows an infinite number of Gaussian-Wishart components to support continuous emission observations. Second, we have considered an efficient variational Bayesian inference method that can be applied to drive the posterior distribution of hidden variables and model parameters for the proposed model based on training data. And then we have derived the predictive distribution that may be used to classify new action. Third, the paper proposes a process of extracting appropriate spatial-temporal feature vectors that can be used to recognize a wide range of human behaviors from input video image. Finally, we have conducted experiments that can evaluate the performance of the proposed method. The experimental results show that the method presented is more efficient with human action recognition than existing methods.

Keywords: human action recognition, Bayesian HMM, Dirichlet process mixture model, Gaussian-Wishart emission model, Variational Bayesian inference, prior distribution and approximate posterior distribution, KTH dataset

Procedia PDF Downloads 317
3854 An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains

Authors: Qiu Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi

Abstract:

In this paper, we propose an improved face recognition algorithm using histogram-based features in spatial and frequency domains. For adding spatial information of the face to improve recognition performance, a region-division (RD) method is utilized. The facial area is firstly divided into several regions, then feature vectors of each facial part are generated by Binary Vector Quantization (BVQ) histogram using DCT coefficients in low frequency domains, as well as Local Binary Pattern (LBP) histogram in spatial domain. Recognition results with different regions are first obtained separately and then fused by weighted averaging. Publicly available ORL database is used for the evaluation of our proposed algorithm, which is consisted of 40 subjects with 10 images per subject containing variations in lighting, posing, and expressions. It is demonstrated that face recognition using RD method can achieve much higher recognition rate.

Keywords: binary vector quantization (BVQ), DCT coefficients, face recognition, local binary patterns (LBP)

Procedia PDF Downloads 310
3853 Design and Performance Analysis of Advanced B-Spline Algorithm for Image Resolution Enhancement

Authors: M. Z. Kurian, M. V. Chidananda Murthy, H. S. Guruprasad

Abstract:

An approach to super-resolve the low-resolution (LR) image is presented in this paper which is very useful in multimedia communication, medical image enhancement and satellite image enhancement to have a clear view of the information in the image. The proposed Advanced B-Spline method generates a high-resolution (HR) image from single LR image and tries to retain the higher frequency components such as edges in the image. This method uses B-Spline technique and Crispening. This work is evaluated qualitatively and quantitatively using Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR). The method is also suitable for real-time applications. Different combinations of decimation and super-resolution algorithms in the presence of different noise and noise factors are tested.

Keywords: advanced b-spline, image super-resolution, mean square error (MSE), peak signal to noise ratio (PSNR), resolution down converter

Procedia PDF Downloads 373
3852 Deep-Learning Based Approach to Facial Emotion Recognition through Convolutional Neural Network

Authors: Nouha Khediri, Mohammed Ben Ammar, Monji Kherallah

Abstract:

Recently, facial emotion recognition (FER) has become increasingly essential to understand the state of the human mind. Accurately classifying emotion from the face is a challenging task. In this paper, we present a facial emotion recognition approach named CV-FER, benefiting from deep learning, especially CNN and VGG16. First, the data is pre-processed with data cleaning and data rotation. Then, we augment the data and proceed to our FER model, which contains five convolutions layers and five pooling layers. Finally, a softmax classifier is used in the output layer to recognize emotions. Based on the above contents, this paper reviews the works of facial emotion recognition based on deep learning. Experiments show that our model outperforms the other methods using the same FER2013 database and yields a recognition rate of 92%. We also put forward some suggestions for future work.

Keywords: CNN, deep-learning, facial emotion recognition, machine learning

Procedia PDF Downloads 57
3851 Secure Image Retrieval Based on Orthogonal Decomposition under Cloud Environment

Authors: Y. Xu, L. Xiong, Z. Xu

Abstract:

In order to protect data privacy, image with sensitive or private information needs to be encrypted before being outsourced to the cloud. However, this causes difficulties in image retrieval and data management. A secure image retrieval method based on orthogonal decomposition is proposed in the paper. The image is divided into two different components, for which encryption and feature extraction are executed separately. As a result, cloud server can extract features from an encrypted image directly and compare them with the features of the queried images, so that the user can thus obtain the image. Different from other methods, the proposed method has no special requirements to encryption algorithms. Experimental results prove that the proposed method can achieve better security and better retrieval precision.

Keywords: secure image retrieval, secure search, orthogonal decomposition, secure cloud computing

Procedia PDF Downloads 448