Search results for: sound recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1046

Search results for: sound recognition

986 Advances in Artificial Intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: Speech recognition, acoustic phonetic, artificial intelligence, Hidden Markov Models (HMM), statistical models of speech recognition, human machine performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7903
985 Face Recognition Using Double Dimension Reduction

Authors: M. A Anjum, M. Y. Javed, A. Basit

Abstract:

In this paper a new approach to face recognition is presented that achieves double dimension reduction making the system computationally efficient with better recognition results. In pattern recognition techniques, discriminative information of image increases with increase in resolution to a certain extent, consequently face recognition results improve with increase in face image resolution and levels off when arriving at a certain resolution level. In the proposed model of face recognition, first image decimation algorithm is applied on face image for dimension reduction to a certain resolution level which provides best recognition results. Due to better computational speed and feature extraction potential of Discrete Cosine Transform (DCT) it is applied on face image. A subset of coefficients of DCT from low to mid frequencies that represent the face adequately and provides best recognition results is retained. A trade of between decimation factor, number of DCT coefficients retained and recognition rate with minimum computation is obtained. Preprocessing of the image is carried out to increase its robustness against variations in poses and illumination level. This new model has been tested on different databases which include ORL database, Yale database and a color database. The proposed technique has performed much better compared to other techniques. The significance of the model is two fold: (1) dimension reduction up to an effective and suitable face image resolution (2) appropriate DCT coefficients are retained to achieve best recognition results with varying image poses, intensity and illumination level.

Keywords: Biometrics, DCT, Face Recognition, Feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1449
984 Probabilistic Bayesian Framework for Infrared Face Recognition

Authors: Moulay A. Akhloufi, Abdelhakim Bendada

Abstract:

Face recognition in the infrared spectrum has attracted a lot of interest in recent years. Many of the techniques used in infrared are based on their visible counterpart, especially linear techniques like PCA and LDA. In this work, we introduce a probabilistic Bayesian framework for face recognition in the infrared spectrum. In the infrared spectrum, variations can occur between face images of the same individual due to pose, metabolic, time changes, etc. Bayesian approaches permit to reduce intrapersonal variation, thus making them very interesting for infrared face recognition. This framework is compared with classical linear techniques. Non linear techniques we developed recently for infrared face recognition are also presented and compared to the Bayesian face recognition framework. A new approach for infrared face extraction based on SVM is introduced. Experimental results show that the Bayesian technique is promising and lead to interesting results in the infrared spectrum when a sufficient number of face images is used in an intrapersonal learning process.

Keywords: Face recognition, biometrics, probabilistic imageprocessing, infrared imaging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1828
983 Off-Line Hand Written Thai Character Recognition using Ant-Miner Algorithm

Authors: P. Phokharatkul, K. Sankhuangaw, S. Somkuarnpanit, S. Phaiboon, C. Kimpan

Abstract:

Much research into handwritten Thai character recognition have been proposed, such as comparing heads of characters, Fuzzy logic and structure trees, etc. This paper presents a system of handwritten Thai character recognition, which is based on the Ant-minor algorithm (data mining based on Ant colony optimization). Zoning is initially used to determine each character. Then three distinct features (also called attributes) of each character in each zone are extracted. The attributes are Head zone, End point, and Feature code. All attributes are used for construct the classification rules by an Ant-miner algorithm in order to classify 112 Thai characters. For this experiment, the Ant-miner algorithm is adapted, with a small change to increase the recognition rate. The result of this experiment is a 97% recognition rate of the training set (11200 characters) and 82.7% recognition rate of unseen data test (22400 characters).

Keywords: Hand written, Thai character recognition, Ant-mineralgorithm, distinct feature.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1888
982 Performance Improvement of Moving Object Recognition and Tracking Algorithm using Parallel Processing of SURF and Optical Flow

Authors: Jungho Choi, Youngwan Cho

Abstract:

The paper proposes a way of parallel processing of SURF and Optical Flow for moving object recognition and tracking. The object recognition and tracking is one of the most important task in computer vision, however disadvantage are many operations cause processing speed slower so that it can-t do real-time object recognition and tracking. The proposed method uses a typical way of feature extraction SURF and moving object Optical Flow for reduce disadvantage and real-time moving object recognition and tracking, and parallel processing techniques for speed improvement. First analyse that an image from DB and acquired through the camera using SURF for compared to the same object recognition then set ROI (Region of Interest) for tracking movement of feature points using Optical Flow. Secondly, using Multi-Thread is for improved processing speed and recognition by parallel processing. Finally, performance is evaluated and verified efficiency of algorithm throughout the experiment.

Keywords: moving object recognition, moving object tracking, SURF, Optical Flow, Multi-Thread.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2595
981 Automatic Recognition of Emotionally Coloured Speech

Authors: Theologos Athanaselis, Stelios Bakamidis, Ioannis Dologlou

Abstract:

Emotion in speech is an issue that has been attracting the interest of the speech community for many years, both in the context of speech synthesis as well as in automatic speech recognition (ASR). In spite of the remarkable recent progress in Large Vocabulary Recognition (LVR), it is still far behind the ultimate goal of recognising free conversational speech uttered by any speaker in any environment. Current experimental tests prove that using state of the art large vocabulary recognition systems the error rate increases substantially when applied to spontaneous/emotional speech. This paper shows that recognition rate for emotionally coloured speech can be improved by using a language model based on increased representation of emotional utterances.

Keywords: Statistical language model, N-grams, emotionallycoloured speech

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1572
980 Wood Species Recognition System

Authors: Bremananth R, Nithya B, Saipriya R

Abstract:

The proposed system identifies the species of the wood using the textural features present in its barks. Each species of a wood has its own unique patterns in its bark, which enabled the proposed system to identify it accurately. Automatic wood recognition system has not yet been well established mainly due to lack of research in this area and the difficulty in obtaining the wood database. In our work, a wood recognition system has been designed based on pre-processing techniques, feature extraction and by correlating the features of those wood species for their classification. Texture classification is a problem that has been studied and tested using different methods due to its valuable usage in various pattern recognition problems, such as wood recognition, rock classification. The most popular technique used for the textural classification is Gray-level Co-occurrence Matrices (GLCM). The features from the enhanced images are thus extracted using the GLCM is correlated, which determines the classification between the various wood species. The result thus obtained shows a high rate of recognition accuracy proving that the techniques used in suitable to be implemented for commercial purposes.

Keywords: Correlation, Grey Level Co-Occurrence Matrix, ProbabilityDensity Function, Wood Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2413
979 Assessment of Noise Pollution in the City of Biskra, Algeria

Authors: Tallal Abdel Karim Bouzir, Nourdinne Zemmouri, Djihed Berkouk

Abstract:

In this research, a quantitative assessment of the urban sound environment of the city of Biskra, Algeria, was conducted. To determine the quality of the soundscape based on in-situ measurement, using a Landtek SL5868P sound level meter in 47 points, which have been identified to represent the whole city. The result shows that the urban noise level varies from 55.3 dB to 75.8 dB during the weekdays and from 51.7 dB to 74.3 dB during the weekend. On the other hand, we can also note that 70.20% of the results of the weekday measurements and 55.30% of the results of the weekend measurements have levels of sound intensity that exceed the levels allowed by Algerian law and the recommendations of the World Health Organization. These very high urban noise levels affect the quality of life, the acoustic comfort and may even pose multiple risks to people's health.

Keywords: Noise pollution, road traffic, sound intensity, public health, noise monitoring.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1012
978 Speaker Recognition Using LIRA Neural Networks

Authors: Nestor A. Garcia Fragoso, Tetyana Baydyk, Ernst Kussul

Abstract:

This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a recognition system using this classifier for voice recognition. From a specific set of speakers, we can recognize the speaker’s voice. For this purpose, the system uses spectrograms of the voice signals as input to the system, extracts the characteristics and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security system or smart buildings for different types of intelligent devices.

Keywords: Extreme learning, LIRA neural classifier, speaker identification, voice recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 721
977 Investigation of the Acoustic Properties of Recycled Felt Panels and Their Application in Classrooms and Multi-Purpose Halls

Authors: Ivanova B. Natalia, Djambova Т. Svetlana, Hristev S. Ivailo

Abstract:

The acoustic properties of recycled felt panels have been investigated using various methods. Experimentally, the sound insulation of the panels has been evaluated for frequencies in the range of 600 Hz to 4000 Hz, utilizing a small-sized acoustic chamber. Additionally, the sound absorption coefficient for the frequency range of 63 Hz to 4000 Hz was measured according to the EN ISO 354 standard in a laboratory reverberation room. This research was deemed necessary after conducting reverberation time measurements of a university classroom following the EN ISO 3382-2 standard. The measurements indicated values of 2.86 s at 500 Hz, 3.23 s at 1000 Hz, and 2.53 s at 2000 Hz, which significantly exceeded the requirements set by the national regulatory framework (0.6 s) for such premises. For this reason, recycled felt panels have been investigated in the laboratory, showing very good acoustic properties at high frequencies. To enhance performance in the low frequencies, the influence of the distance of the panel spacing was examined. Furthermore, the sound insulation of the panels was studied to expand the possibilities of their application, both for the acoustic treatment of educational and multifunctional halls and for sound insulation purposes (e.g., a suspended ceiling with an air gap passing from room to room). As a conclusion, a theoretical acoustic design of the classroom has been carried out with suggestions for improvements to achieve the necessary acoustic and aesthetic parameters for such rooms.

Keywords: Acoustic panels, recycled felt, sound absorption, sound insulation, classroom acoustics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20
976 Optimizing Feature Selection for Recognizing Handwritten Arabic Characters

Authors: Mohammed Z. Khedher, Gheith A. Abandah, Ahmed M. Al-Khawaldeh

Abstract:

Recognition of characters greatly depends upon the features used. Several features of the handwritten Arabic characters are selected and discussed. An off-line recognition system based on the selected features was built. The system was trained and tested with realistic samples of handwritten Arabic characters. Evaluation of the importance and accuracy of the selected features is made. The recognition based on the selected features give average accuracies of 88% and 70% for the numbers and letters, respectively. Further improvements are achieved by using feature weights based on insights gained from the accuracies of individual features.

Keywords: Arabic handwritten characters, Feature extraction, Off-line recognition, Optical character recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1643
975 On-line Handwritten Character Recognition: An Implementation of Counterpropagation Neural Net

Authors: Muhammad Faisal Zafar, Dzulkifli Mohamad, Razib M. Othman

Abstract:

On-line handwritten scripts are usually dealt with pen tip traces from pen-down to pen-up positions. Time evaluation of the pen coordinates is also considered along with trajectory information. However, the data obtained needs a lot of preprocessing including filtering, smoothing, slant removing and size normalization before recognition process. Instead of doing such lengthy preprocessing, this paper presents a simple approach to extract the useful character information. This work evaluates the use of the counter- propagation neural network (CPN) and presents feature extraction mechanism in full detail to work with on-line handwriting recognition. The obtained recognition rates were 60% to 94% using the CPN for different sets of character samples. This paper also describes a performance study in which a recognition mechanism with multiple thresholds is evaluated for counter-propagation architecture. The results indicate that the application of multiple thresholds has significant effect on recognition mechanism. The method is applicable for off-line character recognition as well. The technique is tested for upper-case English alphabets for a number of different styles from different peoples.

Keywords: On-line character recognition, character digitization, counter-propagation neural networks, extreme coordinates.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2388
974 An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains

Authors: Qiu Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi

Abstract:

In this paper, we propose an improved face recognition algorithm using histogram-based features in spatial and frequency domains. For adding spatial information of the face to improve recognition performance, a region-division (RD) method is utilized. The facial area is firstly divided into several regions, then feature vectors of each facial part are generated by Binary Vector Quantization (BVQ) histogram using DCT coefficients in low frequency domains, as well as Local Binary Pattern (LBP) histogram in spatial domain. Recognition results with different regions are first obtained separately and then fused by weighted averaging. Publicly available ORL database is used for the evaluation of our proposed algorithm, which is consisted of 40 subjects with 10 images per subject containing variations in lighting, posing, and expressions. It is demonstrated that face recognition using RD method can achieve much higher recognition rate.

Keywords: Face recognition, Binary vector quantization (BVQ), Local Binary Patterns (LBP), DCT coefficients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1573
973 A New Pattern for Handwritten Persian/Arabic Digit Recognition

Authors: A. Harifi, A. Aghagolzadeh

Abstract:

The main problem for recognition of handwritten Persian digits using Neural Network is to extract an appropriate feature vector from image matrix. In this research an asymmetrical segmentation pattern is proposed to obtain the feature vector. This pattern can be adjusted as an optimum model thanks to its one degree of freedom as a control point. Since any chosen algorithm depends on digit identity, a Neural Network is used to prevail over this dependence. Inputs of this Network are the moment of inertia and the center of gravity which do not depend on digit identity. Recognizing the digit is carried out using another Neural Network. Simulation results indicate the high recognition rate of 97.6% for new introduced pattern in comparison to the previous models for recognition of digits.

Keywords: Pattern recognition, Persian digits, NeuralNetwork.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1636
972 Deep-Learning Based Approach to Facial Emotion Recognition Through Convolutional Neural Network

Authors: Nouha Khediri, Mohammed Ben Ammar, Monji Kherallah

Abstract:

Recently, facial emotion recognition (FER) has become increasingly essential to understand the state of the human mind. However, accurately classifying emotion from the face is a challenging task. In this paper, we present a facial emotion recognition approach named CV-FER benefiting from deep learning, especially CNN and VGG16. First, the data are pre-processed with data cleaning and data rotation. Then, we augment the data and proceed to our FER model, which contains five convolutions layers and five pooling layers. Finally, a softmax classifier is used in the output layer to recognize emotions. Based on the above contents, this paper reviews the works of facial emotion recognition based on deep learning. Experiments show that our model outperforms the other methods using the same FER2013 database and yields a recognition rate of 92%. We also put forward some suggestions for future work.

Keywords: CNN, deep-learning, facial emotion recognition, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 610
971 Offline Signature Recognition using Radon Transform

Authors: M.Radmehr, S.M.Anisheh, I.Yousefian

Abstract:

In this work a new offline signature recognition system based on Radon Transform, Fractal Dimension (FD) and Support Vector Machine (SVM) is presented. In the first step, projections of original signatures along four specified directions have been performed using radon transform. Then, FDs of four obtained vectors are calculated to construct a feature vector for each signature. These vectors are then fed into SVM classifier for recognition of signatures. In order to evaluate the effectiveness of the system several experiments are carried out. Offline signature database from signature verification competition (SVC) 2004 is used during all of the tests. Experimental result indicates that the proposed method achieved high accuracy rate in signature recognition.

Keywords: Fractal Dimension, Offline Signature Recognition, Radon Transform, Support Vector Machine

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2554
970 Practical Method for Digital Music Matching Robust to Various Sound Qualities

Authors: Bokyung Sung, Jungsoo Kim, Jinman Kwun, Junhyung Park, Jihye Ryeo, Ilju Ko

Abstract:

In this paper, we propose a practical digital music matching system that is robust to variation in sound qualities. The proposed system is subdivided into two parts: client and server. The client part consists of the input, preprocessing and feature extraction modules. The preprocessing module, including the music onset module, revises the value gap occurring on the time axis between identical songs of different formats. The proposed method uses delta-grouped Mel frequency cepstral coefficients (MFCCs) to extract music features that are robust to changes in sound quality. According to the number of sound quality formats (SQFs) used, a music server is constructed with a feature database (FD) that contains different sub feature databases (SFDs). When the proposed system receives a music file, the selection module selects an appropriate SFD from a feature database; the selected SFD is subsequently used by the matching module. In this study, we used 3,000 queries for matching experiments in three cases with different FDs. In each case, we used 1,000 queries constructed by mixing 8 SQFs and 125 songs. The success rate of music matching improved from 88.6% when using single a single SFD to 93.2% when using quadruple SFDs. By this experiment, we proved that the proposed method is robust to various sound qualities.

Keywords: Digital Music, Music Matching, Variation in Sound Qualities, Robust Matching method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1330
969 Face Recognition using a Kernelization of Graph Embedding

Authors: Pang Ying Han, Hiew Fu San, Ooi Shih Yin

Abstract:

Linearization of graph embedding has been emerged as an effective dimensionality reduction technique in pattern recognition. However, it may not be optimal for nonlinearly distributed real world data, such as face, due to its linear nature. So, a kernelization of graph embedding is proposed as a dimensionality reduction technique in face recognition. In order to further boost the recognition capability of the proposed technique, the Fisher-s criterion is opted in the objective function for better data discrimination. The proposed technique is able to characterize the underlying intra-class structure as well as the inter-class separability. Experimental results on FRGC database validate the effectiveness of the proposed technique as a feature descriptor.

Keywords: Face recognition, Fisher discriminant, graph embedding, kernelization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1657
968 Performance Comparison and Evaluation of AdaBoost and SoftBoost Algorithms on Generic Object Recognition

Authors: Doaa Hegazy, Joachim Denzler

Abstract:

SoftBoost is a recently presented boosting algorithm, which trades off the size of achieved classification margin and generalization performance. This paper presents a performance evaluation of SoftBoost algorithm on the generic object recognition problem. An appearance-based generic object recognition model is used. The evaluation experiments are performed using a difficult object recognition benchmark. An assessment with respect to different degrees of label noise as well as a comparison to the well known AdaBoost algorithm is performed. The obtained results reveal that SoftBoost is encouraged to be used in cases when the training data is known to have a high degree of noise. Otherwise, using Adaboost can achieve better performance.

Keywords: SoftBoost algorithm, AdaBoost algorithm, Generic object recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1789
967 The Effects of Spatial Dimensions and Relocation and Dimensions of Sound Absorbers in a Space on the Objective Parameters of Sound

Authors: Mustafa Kavraz

Abstract:

This study investigated the differences in the objective parameters of sound depending on the changes in the lengths of the lateral surfaces of a space and on the replacement of the sound absorbers that are placed on these surfaces. To this end, three models of room were chosen. The widths and heights of these rooms were the same but the lengths of the rooms were changed. The smallest room was 8 m. wide and 10 m. long. The lengths of the other two rooms were 15 m. and 20 m. For each model, the differences in the objective parameters of sound were determined by keeping all the material in the space intact and by changing only the positions of the sound absorbers that were placed on the walls. The sound absorbers that were used on the walls were of two different sizes. The sound absorbers that were placed on the walls were 4 m and 8 m. long and story-height (3 m.). In all model room types, the sound absorbers were placed on the long walls in three different ways: at the end of the long walls where the long walls meet the front wall; at the end of the long walls where the long walls meet the back wall; and in the middle part of the long walls. Except for the specially placed sound absorbers, the ground, wall and ceiling surfaces were covered with three different materials. There were no constructional elements such as doors and windows on the walls. On the surfaces, the materials specified in the Odeon 10 material library were used as coating material. Linoleum was used as flooring material, painted plaster as wall coating material and gypsum boards as ceiling covering (2 layers with a total of 32 mm. thickness). These were preferred due to the fact that they are the commonly used materials for these purposes. This study investigated the differences in the objective parameters of sound depending on the changes in the lengths of the lateral surfaces of a space and on the replacement of the sound absorbers that are placed on these surfaces. To this end, three models of room were chosen. The widths and heights of these rooms were the same but the lengths of the rooms were changed. The smallest room was 8 m. wide and 10 m. long. The lengths of the other two rooms were 15 m. and 20 m. For each model, the differences in the objective parameters of sound were determined by keeping all the material in the space intact and by changing only the positions of the sound absorbers that were placed on the walls. The sound absorbers that were used on the walls were of two different sizes. The sound absorbers that were placed on the walls were 4 m and 8 m. long and story-height (3 m.). In all model room types, the sound absorbers were placed on the long walls in three different ways: at the end of the long walls where the long walls meet the front wall; at the end of the long walls where the long walls meet the back wall; and in the middle part of the long walls. Except for the specially placed sound absorbers, the ground, wall and ceiling surfaces were covered with three different materials. There were no constructional elements such as doors and windows on the walls. On the surfaces, the materials specified in the Odeon 10 material library were used as coating material. Linoleum was used as flooring material, painted plaster as wall coating material and gypsum boards as ceiling covering (2 layers with a total of 32 mm. thickness). These were preferred due to the fact that they are the commonly used materials for these purposes.

Keywords: Jnd, objective parameters of sound, room model, sound absorber.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1500
966 Text-independent Speaker Identification Based on MAP Channel Compensation and Pitch-dependent Features

Authors: Jiqing Han, Rongchun Gao

Abstract:

One major source of performance decline in speaker recognition system is channel mismatch between training and testing. This paper focuses on improving channel robustness of speaker recognition system in two aspects of channel compensation technique and channel robust features. The system is text-independent speaker identification system based on two-stage recognition. In the aspect of channel compensation technique, this paper applies MAP (Maximum A Posterior Probability) channel compensation technique, which was used in speech recognition, to speaker recognition system. In the aspect of channel robust features, this paper introduces pitch-dependent features and pitch-dependent speaker model for the second stage recognition. Based on the first stage recognition to testing speech using GMM (Gaussian Mixture Model), the system uses GMM scores to decide if it needs to be recognized again. If it needs to, the system selects a few speakers from all of the speakers who participate in the first stage recognition for the second stage recognition. For each selected speaker, the system obtains 3 pitch-dependent results from his pitch-dependent speaker model, and then uses ANN (Artificial Neural Network) to unite the 3 pitch-dependent results and 1 GMM score for getting a fused result. The system makes the second stage recognition based on these fused results. The experiments show that the correct rate of two-stage recognition system based on MAP channel compensation technique and pitch-dependent features is 41.7% better than the baseline system for closed-set test.

Keywords: Channel Compensation, Channel Robustness, MAP, Speaker Identification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1502
965 Video-based Face Recognition: A Survey

Authors: Huafeng Wang, Yunhong Wang, Yuan Cao

Abstract:

During the past several years, face recognition in video has received significant attention. Not only the wide range of commercial and law enforcement applications, but also the availability of feasible technologies after several decades of research contributes to the trend. Although current face recognition systems have reached a certain level of maturity, their development is still limited by the conditions brought about by many real applications. For example, recognition images of video sequence acquired in an open environment with changes in illumination and/or pose and/or facial occlusion and/or low resolution of acquired image remains a largely unsolved problem. In other words, current algorithms are yet to be developed. This paper provides an up-to-date survey of video-based face recognition research. To present a comprehensive survey, we categorize existing video based recognition approaches and present detailed descriptions of representative methods within each category. In addition, relevant topics such as real time detection, real time tracking for video, issues such as illumination, pose, 3D and low resolution are covered.

Keywords: Face recognition, video-based, survey

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4074
964 Mobile to Server Face Recognition: A System Overview

Authors: Nurulhuda Ismail, Mas Idayu Md. Sabri

Abstract:

This paper presents a system overview of Mobile to Server Face Recognition, which is a face recognition application developed specifically for mobile phones. Images taken from mobile phone cameras lack of quality due to the low resolution of the cameras. Thus, a prototype is developed to experiment the chosen method. However, this paper shows a result of system backbone without the face recognition functionality. The result demonstrated in this paper indicates that the interaction between mobile phones and server is successfully working. The result shown before the database is completely ready. The system testing is currently going on using real images and a mock-up database to test the functionality of the face recognition algorithm used in this system. An overview of the whole system including screenshots and system flow-chart are presented in this paper. This paper also presents the inspiration or motivation and the justification in developing this system.

Keywords: Mobile to server, face recognition, system overview.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2387
963 Efficient DTW-Based Speech Recognition System for Isolated Words of Arabic Language

Authors: Khalid A. Darabkh, Ala F. Khalifeh, Baraa A. Bathech, Saed W. Sabah

Abstract:

Despite the fact that Arabic language is currently one of the most common languages worldwide, there has been only a little research on Arabic speech recognition relative to other languages such as English and Japanese. Generally, digital speech processing and voice recognition algorithms are of special importance for designing efficient, accurate, as well as fast automatic speech recognition systems. However, the speech recognition process carried out in this paper is divided into three stages as follows: firstly, the signal is preprocessed to reduce noise effects. After that, the signal is digitized and hearingized. Consequently, the voice activity regions are segmented using voice activity detection (VAD) algorithm. Secondly, features are extracted from the speech signal using Mel-frequency cepstral coefficients (MFCC) algorithm. Moreover, delta and acceleration (delta-delta) coefficients have been added for the reason of improving the recognition accuracy. Finally, each test word-s features are compared to the training database using dynamic time warping (DTW) algorithm. Utilizing the best set up made for all affected parameters to the aforementioned techniques, the proposed system achieved a recognition rate of about 98.5% which outperformed other HMM and ANN-based approaches available in the literature.

Keywords: Arabic speech recognition, MFCC, DTW, VAD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4032
962 Pattern Recognition of Partial Discharge by Using Simplified Fuzzy ARTMAP

Authors: S. Boonpoke, B. Marungsri

Abstract:

This paper presents the effectiveness of artificial intelligent technique to apply for pattern recognition and classification of Partial Discharge (PD). Characteristics of PD signal for pattern recognition and classification are computed from the relation of the voltage phase angle, the discharge magnitude and the repeated existing of partial discharges by using statistical and fractal methods. The simplified fuzzy ARTMAP (SFAM) is used for pattern recognition and classification as artificial intelligent technique. PDs quantities, 13 parameters from statistical method and fractal method results, are inputted to Simplified Fuzzy ARTMAP to train system for pattern recognition and classification. The results confirm the effectiveness of purpose technique.

Keywords: Partial discharges, PD Pattern recognition, PDClassification, Artificial intelligent, Simplified Fuzzy ARTMAP

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3025
961 Various Speech Processing Techniques For Speech Compression And Recognition

Authors: Jalal Karam

Abstract:

Years of extensive research in the field of speech processing for compression and recognition in the last five decades, resulted in a severe competition among the various methods and paradigms introduced. In this paper we include the different representations of speech in the time-frequency and time-scale domains for the purpose of compression and recognition. The examination of these representations in a variety of related work is accomplished. In particular, we emphasize methods related to Fourier analysis paradigms and wavelet based ones along with the advantages and disadvantages of both approaches.

Keywords: Time-Scale, Wavelets, Time-Frequency, Compression, Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2293
960 A Neural Approach for the Offline Recognition of the Arabic Handwritten Words of the Algerian Departments

Authors: Salim Ouchtati, Jean Sequeira, Mouldi Bedda

Abstract:

In the context of the handwriting recognition, we propose an off line system for the recognition of the Arabic handwritten words of the Algerian departments. The study is based mainly on the evaluation of neural network performances, trained with the gradient back propagation algorithm. The used parameters to form the input vector of the neural network are extracted on the binary images of the handwritten word by several methods. The Distribution parameters, the centered moments of the different projections of the different segments, the centered moments of the word image coding according to the directions of Freeman, and the Barr features applied binary image of the word and on its different segments. The classification is achieved by a multi layers perceptron. A detailed experiment is carried and satisfactory recognition results are reported.

Keywords: Handwritten word recognition, neural networks, image processing, pattern recognition, features extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1864
959 Segmentation and Recognition of Handwritten Numeric Chains

Authors: Salim Ouchtati, Bedda Mouldi, Abderrazak Lachouri

Abstract:

In this paper we present an off line system for the recognition of the handwritten numeric chains. Our work is divided in two big parts. The first part is the realization of a recognition system of the isolated handwritten digits. In this case the study is based mainly on the evaluation of neural network performances, trained with the gradient back propagation algorithm. The used parameters to form the input vector of the neural network are extracted on the binary images of the digits by several methods: the distribution sequence, the Barr features and the centred moments of the different projections and profiles. The second part is the extension of our system for the reading of the handwritten numeric chains constituted of a variable number of digits. The vertical projection is used to segment the numeric chain at isolated digits and every digit (or segment) will be presented separately to the entry of the system achieved in the first part (recognition system of the isolated handwritten digits). The result of the recognition of the numeric chain will be displayed at the exit of the global system.

Keywords: Optical Characters Recognition, Neural networks, Barr features, Image processing, Pattern Recognition, Featuresextraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1393
958 Automatic Vehicle Identification by Plate Recognition

Authors: Serkan Ozbay, Ergun Ercelebi

Abstract:

Automatic Vehicle Identification (AVI) has many applications in traffic systems (highway electronic toll collection, red light violation enforcement, border and customs checkpoints, etc.). License Plate Recognition is an effective form of AVI systems. In this study, a smart and simple algorithm is presented for vehicle-s license plate recognition system. The proposed algorithm consists of three major parts: Extraction of plate region, segmentation of characters and recognition of plate characters. For extracting the plate region, edge detection algorithms and smearing algorithms are used. In segmentation part, smearing algorithms, filtering and some morphological algorithms are used. And finally statistical based template matching is used for recognition of plate characters. The performance of the proposed algorithm has been tested on real images. Based on the experimental results, we noted that our algorithm shows superior performance in car license plate recognition.

Keywords: Character recognizer, license plate recognition, plate region extraction, segmentation, smearing, template matching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7542
957 LSGENSYS - An Integrated System for Pattern Recognition and Summarisation

Authors: Hema Nair

Abstract:

This paper presents a new system developed in Java® for pattern recognition and pattern summarisation in multi-band (RGB) satellite images. The system design is described in some detail. Results of testing the system to analyse and summarise patterns in SPOT MS images and LANDSAT images are also discussed.

Keywords: Pattern recognition, image analysis, feature extraction, blackboard component, linguistic summary.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1499