Search results for: Arabic speech recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1030

Search results for: Arabic speech recognition

730 OCR for Script Identification of Hindi (Devnagari) Numerals using Feature Sub Selection by Means of End-Point with Neuro-Memetic Model

Authors: Banashree N. P., R. Vasanta

Abstract:

Recognition of Indian languages scripts is challenging problems. In Optical Character Recognition [OCR], a character or symbol to be recognized can be machine printed or handwritten characters/numerals. There are several approaches that deal with problem of recognition of numerals/character depending on the type of feature extracted and different way of extracting them. This paper proposes a recognition scheme for handwritten Hindi (devnagiri) numerals; most admired one in Indian subcontinent. Our work focused on a technique in feature extraction i.e. global based approach using end-points information, which is extracted from images of isolated numerals. These feature vectors are fed to neuro-memetic model [18] that has been trained to recognize a Hindi numeral. The archetype of system has been tested on varieties of image of numerals. . In proposed scheme data sets are fed to neuro-memetic algorithm, which identifies the rule with highest fitness value of nearly 100 % & template associates with this rule is nothing but identified numerals. Experimentation result shows that recognition rate is 92-97 % compared to other models.

Keywords: OCR, Global Feature, End-Points, Neuro-Memetic model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1753
729 Face Localization and Recognition in Varied Expressions and Illumination

Authors: Hui-Yu Huang, Shih-Hang Hsu

Abstract:

In this paper, we propose a robust scheme to work face alignment and recognition under various influences. For face representation, illumination influence and variable expressions are the important factors, especially the accuracy of facial localization and face recognition. In order to solve those of factors, we propose a robust approach to overcome these problems. This approach consists of two phases. One phase is preprocessed for face images by means of the proposed illumination normalization method. The location of facial features can fit more efficient and fast based on the proposed image blending. On the other hand, based on template matching, we further improve the active shape models (called as IASM) to locate the face shape more precise which can gain the recognized rate in the next phase. The other phase is to process feature extraction by using principal component analysis and face recognition by using support vector machine classifiers. The results show that this proposed method can obtain good facial localization and face recognition with varied illumination and local distortion.

Keywords: Gabor filter, improved active shape model (IASM), principal component analysis (PCA), face alignment, face recognition, support vector machine (SVM)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1482
728 Fitness Action Recognition Based on MediaPipe

Authors: Zixuan Xu, Yichun Lou, Yang Song, Zihuai Lin

Abstract:

MediaPipe is an open-source machine learning computer vision framework that can be ported into a multi-platform environment, which makes it easier to use it to recognize human activity. Based on this framework, many human recognition systems have been created, but the fundamental issue is the recognition of human behavior and posture. In this paper, two methods are proposed to recognize human gestures based on MediaPipe, the first one uses the Adaptive Boosting algorithm to recognize a series of fitness gestures, and the second one uses the Fast Dynamic Time Warping algorithm to recognize 413 continuous fitness actions. These two methods are also applicable to any human posture movement recognition.

Keywords: Computer Vision, MediaPipe, Adaptive Boosting, Fast Dynamic Time Warping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 803
727 Smartphone-Based Human Activity Recognition by Machine Learning Methods

Authors: Yanting Cao, Kazumitsu Nawata

Abstract:

As smartphones are continually upgrading, their software and hardware are getting smarter, so the smartphone-based human activity recognition will be described more refined, complex and detailed. In this context, we analyzed a set of experimental data, obtained by observing and measuring 30 volunteers with six activities of daily living (ADL). Due to the large sample size, especially a 561-feature vector with time and frequency domain variables, cleaning these intractable features and training a proper model become extremely challenging. After a series of feature selection and parameters adjustments, a well-performed SVM classifier has been trained. 

Keywords: smart sensors, human activity recognition, artificial intelligence, SVM

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 619
726 Recognition Machine (RM) for On-line and Isolated Flight Deck Officer (FDO) Gestures

Authors: Deniz T. Sodiri, Venkat V S S Sastry

Abstract:

The paper presents an on-line recognition machine (RM) for continuous/isolated, dynamic and static gestures that arise in Flight Deck Officer (FDO) training. RM is based on generic pattern recognition framework. Gestures are represented as templates using summary statistics. The proposed recognition algorithm exploits temporal and spatial characteristics of gestures via dynamic programming and Markovian process. The algorithm predicts corresponding index of incremental input data in the templates in an on-line mode. Accumulated consistency in the sequence of prediction provides a similarity measurement (Score) between input data and the templates. The algorithm provides an intuitive mechanism for automatic detection of start/end frames of continuous gestures. In the present paper, we consider isolated gestures. The performance of RM is evaluated using four datasets - artificial (W TTest), hand motion (Yang) and FDO (tracker, vision-based ). RM achieves comparable results which are in agreement with other on-line and off-line algorithms such as hidden Markov model (HMM) and dynamic time warping (DTW). The proposed algorithm has the additional advantage of providing timely feedback for training purposes.

Keywords: On-line Recognition Algorithm, IsolatedDynamic/Static Gesture Recognition, On-line Markovian/DynamicProgramming, Training in Virtual Environments.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1452
725 Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

Authors: Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Ganesh Naik

Abstract:

We propose a system to real environmental noise and channel mismatch for forensic speaker verification systems. This method is based on suppressing various types of real environmental noise by using independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics of the speech signal. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated by using an Australian forensic voice comparison database, combined with car, street and home noises from QUT-NOISE at a signal to noise ratio (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that the MFCC feature warping-ICA achieves a reduction in equal error rate about (48.22%, 44.66%, and 50.07%) over using MFCC feature warping when the test speech signals are corrupted with random sessions of street, car, and home noises at -10 dB SNR.

Keywords: Noisy forensic speaker verification, ICA algorithm, MFCC, MFCC feature warping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 978
724 Local Steerable Pyramid Binary Pattern Sequence LSPBPS for Face Recognition Method

Authors: Mohamed El Aroussi, Mohammed El Hassouni, Sanaa Ghouzali, Mohammed Rziza, Driss Aboutajdine

Abstract:

In this paper the problem of face recognition under variable illumination conditions is considered. Most of the works in the literature exhibit good performance under strictly controlled acquisition conditions, but the performance drastically drop when changes in pose and illumination occur, so that recently number of approaches have been proposed to deal with such variability. The aim of this work is to introduce an efficient local appearance feature extraction method based steerable pyramid (SP) for face recognition. Local information is extracted from SP sub-bands using LBP(Local binary Pattern). The underlying statistics allow us to reduce the required amount of data to be stored. The experiments carried out on different face databases confirm the effectiveness of the proposed approach.

Keywords: Face recognition (FR), Steerable pyramid (SP), localBinary Pattern (LBP).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2170
723 Effects of Recognition of Customer Feedback on Relationships between Emotional Labor and Job Satisfaction: Focusing on a Call Center that Offers Professional Services

Authors: Kiyoko Yoshimura, Yasunobu Kino

Abstract:

Focusing on professional call centers where workers with expertise perform services, this study aims to clarify the relationships between emotional labor and job satisfaction and the effects of recognition of customer feedback. Since the professional call center operators consist of professional license holders (qualification holders) and those who do not (non-holders), the following three points are analyzed in the two groups by using covariance structure analysis and simultaneous multi-population analysis: 1) The relationship between emotional labor and job satisfaction, 2) customer feedback and job satisfaction, and 3) the intermediation effect between the emotional labor of customer feedback and job satisfaction. The following results are obtained: i) No direct effect is found between job satisfaction and emotional labor for qualification holders and non-holders, ii) for qualification holders and non-holders, recognition of positive feedback and recognition of negative feedback had positive and negative effects on job satisfaction, respectively, iii) for qualification and non-holders, “consideration for colleagues” influences job satisfaction by recognizing positive feedback, and iv) only for qualification holders, the factors “customer-oriented emotional expression” and “emotional disharmony” have a positive and negative effect on job satisfaction, respectively, through recognition of positive feedback and recognition of negative feedback.

Keywords: Call center, emotional labor, professional service, job satisfaction, customer feedback.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38
722 Intelligent Mobile Search Oriented to Global e-Commerce

Authors: Abdelkader Dekdouk

Abstract:

In this paper we propose a novel approach for searching eCommerce products using a mobile phone, illustrated by a prototype eCoMobile. This approach aims to globalize the mobile search by integrating the concept of user multilinguism into it. To show that, we particularly deal with English and Arabic languages. Indeed the mobile user can formulate his query on a commercial product in either language (English/Arabic). The description of his information need on commercial products relies on the ontology that represents the conceptualization of the product catalogue knowledge domain defined in both English and Arabic languages. A query expressed on a mobile device client defines the concept that corresponds to the name of the product followed by a set of pairs (property, value) specifying the characteristics of the product. Once a query is submitted it is then communicated to the server side which analyses it and in its turn performs an http request to an eCommerce application server (like Amazon). This latter responds by returning an XML file representing a set of elements where each element defines an item of the searched product with its specific characteristics. The XML file is analyzed on the server side and then items are displayed on the mobile device client along with its relevant characteristics in the chosen language.

Keywords: Mobile computing, search engine, multilingualglobal eCommerce, ontology, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2082
721 Computationally Efficient Signal Quality Improvement Method for VoIP System

Authors: H. P. Singh, S. Singh

Abstract:

The voice signal in Voice over Internet protocol (VoIP) system is processed through the best effort policy based IP network, which leads to the network degradations including delay, packet loss jitter. The work in this paper presents the implementation of finite impulse response (FIR) filter for voice quality improvement in the VoIP system through distributed arithmetic (DA) algorithm. The VoIP simulations are conducted with AMR-NB 6.70 kbps and G.729a speech coders at different packet loss rates and the performance of the enhanced VoIP signal is evaluated using the perceptual evaluation of speech quality (PESQ) measurement for narrowband signal. The results show reduction in the computational complexity in the system and significant improvement in the quality of the VoIP voice signal.

Keywords: VoIP, Signal Quality, Distributed Arithmetic, Packet Loss, Speech Coder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1819
720 Liveness Detection for Embedded Face Recognition System

Authors: Hyung-Keun Jee, Sung-Uk Jung, Jang-Hee Yoo

Abstract:

To increase reliability of face recognition system, the system must be able to distinguish real face from a copy of face such as a photograph. In this paper, we propose a fast and memory efficient method of live face detection for embedded face recognition system, based on the analysis of the movement of the eyes. We detect eyes in sequential input images and calculate variation of each eye region to determine whether the input face is a real face or not. Experimental results show that the proposed approach is competitive and promising for live face detection.

Keywords: Liveness Detection, Eye detection, SQI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3168
719 Reduced Dynamic Time Warping for Handwriting Recognition Based on Multidimensional Time Series of a Novel Pen Device

Authors: Muzaffar Bashir, Jürgen Kempf

Abstract:

The purpose of this paper is to present a Dynamic Time Warping technique which reduces significantly the data processing time and memory size of multi-dimensional time series sampled by the biometric smart pen device BiSP. The acquisition device is a novel ballpoint pen equipped with a diversity of sensors for monitoring the kinematics and dynamics of handwriting movement. The DTW algorithm has been applied for time series analysis of five different sensor channels providing pressure, acceleration and tilt data of the pen generated during handwriting on a paper pad. But the standard DTW has processing time and memory space problems which limit its practical use for online handwriting recognition. To face with this problem the DTW has been applied to the sum of the five sensor signals after an adequate down-sampling of the data. Preliminary results have shown that processing time and memory size could significantly be reduced without deterioration of performance in single character and word recognition. Further excellent accuracy in recognition was achieved which is mainly due to the reduced dynamic time warping RDTW technique and a novel pen device BiSP.

Keywords: Biometric character recognition, biometric person authentication, biometric smart pen BiSP, dynamic time warping DTW, online-handwriting recognition, multidimensional time series.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2395
718 Voice Command Recognition System Based on MFCC and VQ Algorithms

Authors: Mahdi Shaneh, Azizollah Taheri

Abstract:

The goal of this project is to design a system to recognition voice commands. Most of voice recognition systems contain two main modules as follow “feature extraction" and “feature matching". In this project, MFCC algorithm is used to simulate feature extraction module. Using this algorithm, the cepstral coefficients are calculated on mel frequency scale. VQ (vector quantization) method will be used for reduction of amount of data to decrease computation time. In the feature matching stage Euclidean distance is applied as similarity criterion. Because of high accuracy of used algorithms, the accuracy of this voice command system is high. Using these algorithms, by at least 5 times repetition for each command, in a single training session, and then twice in each testing session zero error rate in recognition of commands is achieved.

Keywords: MFCC, Vector quantization, Vocal tract, Voicecommand.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3142
717 Extracting Tongue Shape Dynamics from Magnetic Resonance Image Sequences

Authors: María S. Avila-García, John N. Carter, Robert I. Damper

Abstract:

An important problem in speech research is the automatic extraction of information about the shape and dimensions of the vocal tract during real-time speech production. We have previously developed Southampton dynamic magnetic resonance imaging (SDMRI) as an approach to the solution of this problem.However, the SDMRI images are very noisy so that shape extraction is a major challenge. In this paper, we address the problem of tongue shape extraction, which poses difficulties because this is a highly deforming non-parametric shape. We show that combining active shape models with the dynamic Hough transform allows the tongue shape to be reliably tracked in the image sequence.

Keywords: Vocal tract imaging, speech production, active shapemodels, dynamic Hough transform, object tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1724
716 Optimal Feature Extraction Dimension in Finger Vein Recognition Using Kernel Principal Component Analysis

Authors: Amir Hajian, Sepehr Damavandinejadmonfared

Abstract:

In this paper the issue of dimensionality reduction is investigated in finger vein recognition systems using kernel Principal Component Analysis (KPCA). One aspect of KPCA is to find the most appropriate kernel function on finger vein recognition as there are several kernel functions which can be used within PCA-based algorithms. In this paper, however, another side of PCA-based algorithms -particularly KPCA- is investigated. The aspect of dimension of feature vector in PCA-based algorithms is of importance especially when it comes to the real-world applications and usage of such algorithms. It means that a fixed dimension of feature vector has to be set to reduce the dimension of the input and output data and extract the features from them. Then a classifier is performed to classify the data and make the final decision. We analyze KPCA (Polynomial, Gaussian, and Laplacian) in details in this paper and investigate the optimal feature extraction dimension in finger vein recognition using KPCA.

Keywords: Biometrics, finger vein recognition, Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1954
715 A Method for Quality Inspection of Motors by Detecting Abnormal Sound

Authors: Tadatsugu Kitamoto

Abstract:

Recently, a quality of motors is inspected by human ears. In this paper, I propose two systems using a method of speech recognition for automation of the inspection. The first system is based on a method of linear processing which uses K-means and Nearest Neighbor method, and the second is based on a method of non-linear processing which uses neural networks. I used motor sounds in these systems, and I successfully recognize 86.67% of motor sounds in the linear processing system and 97.78% in the non-linear processing system.

Keywords: Acoustical diagnosis, Neural networks, K-means, Short-time Fourier transformation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1691
714 Motion Recognition Based On Fuzzy WP Feature Extraction Approach

Authors: Keun-Chang Kwak

Abstract:

This paper is concerned with motion recognition based fuzzy WP(Wavelet Packet) feature extraction approach from Vicon physical data sets. For this purpose, we use an efficient fuzzy mutual-information-based WP transform for feature extraction. This method estimates the required mutual information using a novel approach based on fuzzy membership function. The physical action data set includes 10 normal and 10 aggressive physical actions that measure the human activity. The data have been collected from 10 subjects using the Vicon 3D tracker. The experiments consist of running, seating, and walking as physical activity motion among various activities. The experimental results revealed that the presented feature extraction approach showed good recognition performance.

Keywords: Motion recognition, fuzzy wavelet packet, Vicon physical data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1634
713 Rotation Invariant Face Recognition Based on Hybrid LPT/DCT Features

Authors: Rehab F. Abdel-Kader, Rabab M. Ramadan, Rawya Y. Rizk

Abstract:

The recognition of human faces, especially those with different orientations is a challenging and important problem in image analysis and classification. This paper proposes an effective scheme for rotation invariant face recognition using Log-Polar Transform and Discrete Cosine Transform combined features. The rotation invariant feature extraction for a given face image involves applying the logpolar transform to eliminate the rotation effect and to produce a row shifted log-polar image. The discrete cosine transform is then applied to eliminate the row shift effect and to generate the low-dimensional feature vector. A PSO-based feature selection algorithm is utilized to search the feature vector space for the optimal feature subset. Evolution is driven by a fitness function defined in terms of maximizing the between-class separation (scatter index). Experimental results, based on the ORL face database using testing data sets for images with different orientations; show that the proposed system outperforms other face recognition methods. The overall recognition rate for the rotated test images being 97%, demonstrating that the extracted feature vector is an effective rotation invariant feature set with minimal set of selected features.

Keywords: Discrete Cosine Transform, Face Recognition, Feature Extraction, Log Polar Transform, Particle SwarmOptimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1863
712 Face Recognition Using Morphological Shared-weight Neural Networks

Authors: Hossein Sahoolizadeh, Mahdi Rahimi, Hamid Dehghani

Abstract:

We introduce an algorithm based on the morphological shared-weight neural network. Being nonlinear and translation-invariant, the MSNN can be used to create better generalization during face recognition. Feature extraction is performed on grayscale images using hit-miss transforms that are independent of gray-level shifts. The output is then learned by interacting with the classification process. The feature extraction and classification networks are trained together, allowing the MSNN to simultaneously learn feature extraction and classification for a face. For evaluation, we test for robustness under variations in gray levels and noise while varying the network-s configuration to optimize recognition efficiency and processing time. Results show that the MSNN performs better for grayscale image pattern classification than ordinary neural networks.

Keywords: Face recognition, Neural Networks, Multi-layer Perceptron, masking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1507
711 MarginDistillation: Distillation for Face Recognition Neural Networks with Margin-Based Softmax

Authors: Svitov David, Alyamkin Sergey

Abstract:

The usage of convolutional neural networks (CNNs) in conjunction with the margin-based softmax approach demonstrates the state-of-the-art performance for the face recognition problem. Recently, lightweight neural network models trained with the margin-based softmax have been introduced for the face identification task for edge devices. In this paper, we propose a distillation method for lightweight neural network architectures that outperforms other known methods for the face recognition task on LFW, AgeDB-30 and Megaface datasets. The idea of the proposed method is to use class centers from the teacher network for the student network. Then the student network is trained to get the same angles between the class centers and face embeddings predicted by the teacher network.

Keywords: ArcFace, distillation, face recognition, margin-based softmax.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 612
710 Influence of Loudness Compression on Hearing with Bone Anchored Hearing Implants

Authors: Anja Kurz, Marc Flynn, Tobias Good, Marco Caversaccio, Martin Kompis

Abstract:

Bone Anchored Hearing Implants (BAHI) are  routinely used in patients with conductive or mixed hearing loss, e.g.  if conventional air conduction hearing aids cannot be used. New  sound processors and new fitting software now allow the adjustment  of parameters such as loudness compression ratios or maximum  power output separately. Today it is unclear, how the choice of these  parameters influences aided speech understanding in BAHI users.  In this prospective experimental study, the effect of varying the  compression ratio and lowering the maximum power output in a  BAHI were investigated.  Twelve experienced adult subjects with a mixed hearing loss  participated in this study. Four different compression ratios (1.0; 1.3;  1.6; 2.0) were tested along with two different maximum power output  settings, resulting in a total of eight different programs. Each  participant tested each program during two weeks. A blinded Latin  square design was used to minimize bias.  For each of the eight programs, speech understanding in quiet and  in noise was assessed. For speech in quiet, the Freiburg number test  and the Freiburg monosyllabic word test at 50, 65, and 80 dB SPL  were used. For speech in noise, the Oldenburg sentence test was  administered.  Speech understanding in quiet and in noise was improved  significantly in the aided condition in any program, when compared  to the unaided condition. However, no significant differences were  found between any of the eight programs. In contrast, on a subjective  level there was a significant preference for medium compression  ratios of 1.3 to 1.6 and higher maximum power output.

 

Keywords: Bone Anchored Hearing Implant, Compression, Maximum Power Output, Speech understanding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2047
709 Using Different Aspects of the Signings for Appearance-based Sign Language Recognition

Authors: Morteza Zahedi, Philippe Dreuw, Thomas Deselaers, Hermann Ney

Abstract:

Sign language is used by the deaf and hard of hearing people for communication. Automatic sign language recognition is a challenging research area since sign language often is the only way of communication for the deaf people. Sign language includes different components of visual actions made by the signer using the hands, the face, and the torso, to convey his/her meaning. To use different aspects of signs, we combine the different groups of features which have been extracted from the image frames recorded directly by a stationary camera. We combine the features in two levels by employing three techniques. At the feature level, an early feature combination can be performed by concatenating and weighting different feature groups, or by concatenating feature groups over time and using LDA to choose the most discriminant elements. At the model level, a late fusion of differently trained models can be carried out by a log-linear model combination. In this paper, we investigate these three combination techniques in an automatic sign language recognition system and show that the recognition rate can be significantly improved.

Keywords: American sign language, appearance-based features, Feature combination, Sign language recognition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1385
708 The Modified Eigenface Method using Two Thresholds

Authors: Yan Ma, ShunBao Li

Abstract:

A new approach is adopted in this paper based on Turk and Pentland-s eigenface method. It was found that the probability density function of the distance between the projection vector of the input face image and the average projection vector of the subject in the face database, follows Rayleigh distribution. In order to decrease the false acceptance rate and increase the recognition rate, the input face image has been recognized using two thresholds including the acceptance threshold and the rejection threshold. We also find out that the value of two thresholds will be close to each other as number of trials increases. During the training, in order to reduce the number of trials, the projection vectors for each subject has been averaged. The recognition experiments using the proposed algorithm show that the recognition rate achieves to 92.875% whilst the average number of judgment is only 2.56 times.

Keywords: Eigenface, Face Recognition, Threshold, Rayleigh Distribution, Feature Extraction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1480
707 Vision Based Hand Gesture Recognition Using Generative and Discriminative Stochastic Models

Authors: Mahmoud Elmezain, Samar El-shinawy

Abstract:

Many approaches to pattern recognition are founded on probability theory, and can be broadly characterized as either generative or discriminative according to whether or not the distribution of the image features. Generative and discriminative models have very different characteristics, as well as complementary strengths and weaknesses. In this paper, we study these models to recognize the patterns of alphabet characters (A-Z) and numbers (0-9). To handle isolated pattern, generative model as Hidden Markov Model (HMM) and discriminative models like Conditional Random Field (CRF), Hidden Conditional Random Field (HCRF) and Latent-Dynamic Conditional Random Field (LDCRF) with different number of window size are applied on extracted pattern features. The gesture recognition rate is improved initially as the window size increase, but degrades as window size increase further. Experimental results show that the LDCRF is the best in terms of results than CRF, HCRF and HMM at window size equal 4. Additionally, our results show that; an overall recognition rates are 91.52%, 95.28%, 96.94% and 98.05% for CRF, HCRF, HMM and LDCRF respectively.

Keywords: Statistical Pattern Recognition, Generative Model, Discriminative Model, Human Computer Interaction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2917
706 Comparison of Fricative Vocal Tract Transfer Functions Derived using Two Different Segmentation Techniques

Authors: K. S. Subari, C. H. Shadle, A. Barney, R. I. Damper

Abstract:

The acoustic and articulatory properties of fricative speech sounds are being studied using magnetic resonance imaging (MRI) and acoustic recordings from a single subject. Area functions were derived from a complete set of axial and coronal MR slices using two different methods: the Mermelstein technique and the Blum transform. Area functions derived from the two techniques were shown to differ significantly in some cases. Such differences will lead to different acoustic predictions and it is important to know which is the more accurate. The vocal tract acoustic transfer function (VTTF) was derived from these area functions for each fricative and compared with measured speech signals for the same fricative and same subject. The VTTFs for /f/ in two vowel contexts and the corresponding acoustic spectra are derived here; the Blum transform appears to show a better match between prediction and measurement than the Mermelstein technique.

Keywords: Area functions, fricatives, vocal tract transferfunction, MRI, speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1642
705 Rapid Study on Feature Extraction and Classification Models in Healthcare Applications

Authors: S. Sowmyayani

Abstract:

The advancement of computer-aided design helps the medical force and security force. Some applications include biometric recognition, elderly fall detection, face recognition, cancer recognition, tumor recognition, etc. This paper deals with different machine learning algorithms that are more generically used for any health care system. The most focused problems are classification and regression. With the rise of big data, machine learning has become particularly important for solving problems. Machine learning uses two types of techniques: supervised learning and unsupervised learning. The former trains a model on known input and output data and predicts future outputs. Classification and regression are supervised learning techniques. Unsupervised learning finds hidden patterns in input data. Clustering is one such unsupervised learning technique. The above-mentioned models are discussed briefly in this paper.

Keywords: Supervised learning, unsupervised learning, regression, neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 329
704 On-line Lao Handwritten Recognition with Proportional Invariant Feature

Authors: Khampheth Bounnady, Boontee Kruatrachue, Somkiat Wangsiripitak

Abstract:

This paper proposed high level feature for online Lao handwritten recognition. This feature must be high level enough so that the feature is not change when characters are written by different persons at different speed and different proportion (shorter or longer stroke, head, tail, loop, curve). In this high level feature, a character is divided in to sequence of curve segments where a segment start where curve reverse rotation (counter clockwise and clockwise). In each segment, following features are gathered cumulative change in direction of curve (- for clockwise), cumulative curve length, cumulative length of left to right, right to left, top to bottom and bottom to top ( cumulative change in X and Y axis of segment). This feature is simple yet robust for high accuracy recognition. The feature can be gather from parsing the original time sampling sequence X, Y point of the pen location without re-sampling. We also experiment on other segmentation point such as the maximum curvature point which was widely used by other researcher. Experiments results show that the recognition rates are at 94.62% in comparing to using maximum curvature point 75.07%. This is due to a lot of variations of turning points in handwritten.

Keywords: Handwritten feature, chain code, Lao handwritten recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2024
703 Control Chart Pattern Recognition Using Wavelet Based Neural Networks

Authors: Jun Seok Kim, Cheong-Sool Park, Jun-Geol Baek, Sung-Shick Kim

Abstract:

Control chart pattern recognition is one of the most important tools to identify the process state in statistical process control. The abnormal process state could be classified by the recognition of unnatural patterns that arise from assignable causes. In this study, a wavelet based neural network approach is proposed for the recognition of control chart patterns that have various characteristics. The procedure of proposed control chart pattern recognizer comprises three stages. First, multi-resolution wavelet analysis is used to generate time-shape and time-frequency coefficients that have detail information about the patterns. Second, distance based features are extracted by a bi-directional Kohonen network to make reduced and robust information. Third, a back-propagation network classifier is trained by these features. The accuracy of the proposed method is shown by the performance evaluation with numerical results.

Keywords: Control chart pattern recognition, Multi-resolution wavelet analysis, Bi-directional Kohonen network, Back-propagation network, Feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2466
702 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal to noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speakers dominant time-frequency characteristics. Speech has a high peak to average ratio enabling matching pursuit greedy heuristic of highest inner products to isolate high energy speech components in high noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies, and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms with no significant statistical differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20dB SNR using 30d B SNR as a reference for voice activity.

Keywords: Atomic Decomposition, Gabor, Gammatone, Matching Pursuit, Voice Activity Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1782
701 Local Spectrum Feature Extraction for Face Recognition

Authors: Muhammad Imran Ahmad, Ruzelita Ngadiran, Mohd Nazrin Md Isa, Nor Ashidi Mat Isa, Mohd Zaizu Ilyas, Raja Abdullah Raja Ahmad, Said Amirul Anwar Ab Hamid, Muzammil Jusoh

Abstract:

This paper presents two techniques, local feature extraction using image spectrum and low frequency spectrum modelling using GMM to capture the underlying statistical information to improve the performance of face recognition system. Local spectrum features are extracted using overlap sub block window that are mapped on the face image. For each of this block, spatial domain is transformed to frequency domain using DFT. A low frequency coefficient is preserved by discarding high frequency coefficients by applying rectangular mask on the spectrum of the facial image. Low frequency information is non- Gaussian in the feature space and by using combination of several Gaussian functions that has different statistical properties, the best feature representation can be modelled using probability density function. The recognition process is performed using maximum likelihood value computed using pre-calculated GMM components. The method is tested using FERET datasets and is able to achieved 92% recognition rates.

Keywords: Local features modelling, face recognition system, Gaussian mixture models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2240