Search results for: Arabic speech recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1032

Search results for: Arabic speech recognition

792 Quantitative Analysis of PCA, ICA, LDA and SVM in Face Recognition

Authors: Liton Jude Rozario, Mohammad Reduanul Haque, Md. Ziarul Islam, Mohammad Shorif Uddin

Abstract:

Face recognition is a technique to automatically identify or verify individuals. It receives great attention in identification, authentication, security and many more applications. Diverse methods had been proposed for this purpose and also a lot of comparative studies were performed. However, researchers could not reach unified conclusion. In this paper, we are reporting an extensive quantitative accuracy analysis of four most widely used face recognition algorithms: Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) using AT&T, Sheffield and Bangladeshi people face databases under diverse situations such as illumination, alignment and pose variations.

Keywords: PCA, ICA, LDA, SVM, face recognition, noise.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2431
791 Visual Thing Recognition with Binary Scale-Invariant Feature Transform and Support Vector Machine Classifiers Using Color Information

Authors: Wei-Jong Yang, Wei-Hau Du, Pau-Choo Chang, Jar-Ferr Yang, Pi-Hsia Hung

Abstract:

The demands of smart visual thing recognition in various devices have been increased rapidly for daily smart production, living and learning systems in recent years. This paper proposed a visual thing recognition system, which combines binary scale-invariant feature transform (SIFT), bag of words model (BoW), and support vector machine (SVM) by using color information. Since the traditional SIFT features and SVM classifiers only use the gray information, color information is still an important feature for visual thing recognition. With color-based SIFT features and SVM, we can discard unreliable matching pairs and increase the robustness of matching tasks. The experimental results show that the proposed object recognition system with color-assistant SIFT SVM classifier achieves higher recognition rate than that with the traditional gray SIFT and SVM classification in various situations.

Keywords: Color moments, visual thing recognition system, SIFT, color SIFT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1032
790 Delineating Students’ Speaking Anxieties and Assessment Gaps in Online Speech Performances

Authors: Mary Jane B. Suarez

Abstract:

Speech anxiety is innumerable in any traditional communication classes especially for ESL students. The speech anxiety intensifies when communication skills assessments have taken its toll in an online mode of learning due to the perils of the COVID-19 virus. Teachers and students have experienced vast ambiguity on how to realize a still effective way to teach and learn various speaking skills amidst the pandemic. This mixed method study determined the factors that affected the public speaking skills of students in online performances, delineated the assessment gaps in assessing speaking skills in an online setup, and recommended ways to address students’ speech anxieties. Using convergent parallel design, quantitative data were gathered by examining the desired learning competencies of the English course including a review of the teacher’s class record to analyze how students’ performances reflected a significantly high level of anxiety in online speech delivery. Focus group discussion was also conducted for qualitative data describing students’ public speaking anxiety and assessment gaps. Results showed a significantly high level of students’ speech anxiety affected by time constraints, use of technology, lack of audience response, being conscious of making mistakes, and the use of English as a second language. The study presented recommendations to redesign curricular assessments of English teachers and to have a robust diagnosis of students’ speaking anxiety to better cater to the needs of learners in attempt to bridge any gaps in cultivating public speaking skills of students as educational institutions segue from the pandemic to the post-pandemic milieu.

Keywords: Blended learning, communication skills assessment, online speech delivery, public speaking anxiety, speech anxiety.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 178
789 Inverse Sets-based Recognition of Video Clips

Authors: Alexei M. Mikhailov

Abstract:

The paper discusses the mathematics of pattern indexing and its applications to recognition of visual patterns that are found in video clips. It is shown that (a) pattern indexes can be represented by collections of inverted patterns, (b) solutions to pattern classification problems can be found as intersections and histograms of inverted patterns and, thus, matching of original patterns avoided.

Keywords: Artificial neural cortex, computational biology, data mining, pattern recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2115
788 Ant Colony Optimization for Feature Subset Selection

Authors: Ahmed Al-Ani

Abstract:

The Ant Colony Optimization (ACO) is a metaheuristic inspired by the behavior of real ants in their search for the shortest paths to food sources. It has recently attracted a lot of attention and has been successfully applied to a number of different optimization problems. Due to the importance of the feature selection problem and the potential of ACO, this paper presents a novel method that utilizes the ACO algorithm to implement a feature subset search procedure. Initial results obtained using the classification of speech segments are very promising.

Keywords: Ant Colony Optimization, ant systems, feature selection, pattern recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3141
787 Initialization Method of Reference Vectors for Improvement of Recognition Accuracy in LVQ

Authors: Yuji Mizuno, Hiroshi Mabuchi

Abstract:

Initial values of reference vectors have significant influence on recognition accuracy in LVQ. There are several existing techniques, such as SOM and k-means, for setting initial values of reference vectors, each of which has provided some positive results. However, those results are not sufficient for the improvement of recognition accuracy. This study proposes an ACO-used method for initializing reference vectors with an aim to achieve recognition accuracy higher than those obtained through conventional methods. Moreover, we will demonstrate the effectiveness of the proposed method by applying it to the wine data and English vowel data and comparing its results with those of conventional methods.

Keywords: Clustering, LVQ, ACO, SOM, k-means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1255
786 Posture Recognition using Combined Statistical and Geometrical Feature Vectors based on SVM

Authors: Omer Rashid, Ayoub Al-Hamadi, Axel Panning, Bernd Michaelis

Abstract:

It is hard to percept the interaction process with machines when visual information is not available. In this paper, we have addressed this issue to provide interaction through visual techniques. Posture recognition is done for American Sign Language to recognize static alphabets and numbers. 3D information is exploited to obtain segmentation of hands and face using normal Gaussian distribution and depth information. Features for posture recognition are computed using statistical and geometrical properties which are translation, rotation and scale invariant. Hu-Moment as statistical features and; circularity and rectangularity as geometrical features are incorporated to build the feature vectors. These feature vectors are used to train SVM for classification that recognizes static alphabets and numbers. For the alphabets, curvature analysis is carried out to reduce the misclassifications. The experimental results show that proposed system recognizes posture symbols by achieving recognition rate of 98.65% and 98.6% for ASL alphabets and numbers respectively.

Keywords: Feature Extraction, Posture Recognition, Pattern Recognition, Application.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1519
785 Finger Vein Recognition using PCA-based Methods

Authors: Sepehr Damavandinejadmonfared, Ali Khalili Mobarakeh, Mohsen Pashna, , Jiangping Gou Sayedmehran Mirsafaie Rizi, Saba Nazari, Shadi Mahmoodi Khaniabadi, Mohamad Ali Bagheri

Abstract:

In this paper a novel algorithm is proposed to merit the accuracy of finger vein recognition. The performances of Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA), and Kernel Entropy Component Analysis (KECA) in this algorithm are validated and compared with each other in order to determine which one is the most appropriate one in terms of finger vein recognition.

Keywords: Biometrics, finger vein recognition, PrincipalComponent Analysis (PCA), Kernel Principal Component Analysis(KPCA), Kernel Entropy Component Analysis (KPCA).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2680
784 Orchestra/Percussion Classification Algorithm for United Speech Audio Coding System

Authors: Yueming Wang, Rendong Ying, Sumxin Jiang, Peilin Liu

Abstract:

Unified Speech Audio Coding (USAC), the latest MPEG standardization for unified speech and audio coding, uses a speech/audio classification algorithm to distinguish speech and audio segments of the input signal. The quality of the recovered audio can be increased by well-designed orchestra/percussion classification and subsequent processing. However, owing to the shortcoming of the system, introducing an orchestra/percussion classification and modifying subsequent processing can enormously increase the quality of the recovered audio. This paper proposes an orchestra/percussion classification algorithm for the USAC system which only extracts 3 scales of Mel-Frequency Cepstral Coefficients (MFCCs) rather than traditional 13 scales of MFCCs and use Iterative Dichotomiser 3 (ID3) Decision Tree rather than other complex learning method, thus the proposed algorithm has lower computing complexity than most existing algorithms. Considering that frequent changing of attributes may lead to quality loss of the recovered audio signal, this paper also design a modified subsequent process to help the whole classification system reach an accurate rate as high as 97% which is comparable to classical 99%.

Keywords: ID3 Decision Tree, MFCC, Orchestra/Percussion Classification, USAC

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1673
783 Sentence Modality Recognition in French based on Prosody

Authors: Pavel Král, Jana Klečková, Christophe Cerisara

Abstract:

This paper deals with automatic sentence modality recognition in French. In this work, only prosodic features are considered. The sentences are recognized according to the three following modalities: declarative, interrogative and exclamatory sentences. This information will be used to animate a talking head for deaf and hearing-impaired children. We first statistically study a real radio corpus in order to assess the feasibility of the automatic modeling of sentence types. Then, we test two sets of prosodic features as well as two different classifiers and their combination. We further focus our attention on questions recognition, as this modality is certainly the most important one for the target application.

Keywords: Automatic sentences modality recognition (ASMR), fundamental frequency (F0), energy, modal corpus, prosody.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1678
782 A High Quality Speech Coder at 600 bps

Authors: Yong Zhang, Ruimin Hu

Abstract:

This paper presents a vocoder to obtain high quality synthetic speech at 600 bps. To reduce the bit rate, the algorithm is based on a sinusoidally excited linear prediction model which extracts few coding parameters, and three consecutive frames are grouped into a superframe and jointly vector quantization is used to obtain high coding efficiency. The inter-frame redundancy is exploited with distinct quantization schemes for different unvoiced/voiced frame combinations in the superframe. Experimental results show that the quality of the proposed coder is better than that of 2.4kbps LPC10e and achieves approximately the same as that of 2.4kbps MELP and with high robustness.

Keywords: Speech coding, Vector quantization, linear predicition, Mixed sinusoidal excitation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2187
781 The Performance Improvement of Automatic Modulation Recognition Using Simple Feature Manipulation, Analysis of the HOS, and Voted Decision

Authors: Heroe Wijanto, Sugihartono, Suhartono Tjondronegoro, Kuspriyanto

Abstract:

The use of High Order Statistics (HOS) analysis is expected to provide so many candidates of features that can be selected for pattern recognition. More candidates of the feature can be extracted using simple manipulation through a specific mathematical function prior to the HOS analysis. Feature extraction method using HOS analysis combined with Difference to the Nth-Power manipulation has been examined in application for Automatic Modulation Recognition (AMR) to perform scheme recognition of three digital modulation signal, i.e. QPSK-16QAM-64QAM in the AWGN transmission channel. The simulation results is reported when the analysis of HOS up to order-12 and the manipulation of Difference to the Nth-Power up to N = 4. The obtained accuracy rate of AMR using the method of Simple Decision obtained 90% in SNR > 10 dB in its classifier, while using the method of Voted Decision is 96% in SNR > 2 dB.

Keywords: modulation, automatic modulation recognition, feature analysis, feature manipulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2118
780 An ICA Algorithm for Separation of Convolutive Mixture of Speech Signals

Authors: Rajkishore Prasad, Hiroshi Saruwatari, Kiyohiro Shikano

Abstract:

This paper describes Independent Component Analysis (ICA) based fixed-point algorithm for the blind separation of the convolutive mixture of speech, picked-up by a linear microphone array. The proposed algorithm extracts independent sources by non- Gaussianizing the Time-Frequency Series of Speech (TFSS) in a deflationary way. The degree of non-Gaussianization is measured by negentropy. The relative performances of algorithm under random initialization and Null beamformer (NBF) based initialization are studied. It has been found that an NBF based initial value gives speedy convergence as well as better separation performance

Keywords: Blind signal separation, independent component analysis, negentropy, convolutive mixture.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1776
779 Word Recognition and Learning based on Associative Memories and Hidden Markov Models

Authors: Zöhre Kara Kayikci, Günther Palm

Abstract:

A word recognition architecture based on a network of neural associative memories and hidden Markov models has been developed. The input stream, composed of subword-units like wordinternal triphones consisting of diphones and triphones, is provided to the network of neural associative memories by hidden Markov models. The word recognition network derives words from this input stream. The architecture has the ability to handle ambiguities on subword-unit level and is also able to add new words to the vocabulary during performance. The architecture is implemented to perform the word recognition task in a language processing system for understanding simple command sentences like “bot show apple".

Keywords: Hebbian learning, hidden Markov models, neuralassociative memories, word recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1523
778 An Improved Illumination Normalization based on Anisotropic Smoothing for Face Recognition

Authors: Sanghoon Kim, Sun-Tae Chung, Souhwan Jung, Seongwon Cho

Abstract:

Robust face recognition under various illumination environments is very difficult and needs to be accomplished for successful commercialization. In this paper, we propose an improved illumination normalization method for face recognition. Illumination normalization algorithm based on anisotropic smoothing is well known to be effective among illumination normalization methods but deteriorates the intensity contrast of the original image, and incurs less sharp edges. The proposed method in this paper improves the previous anisotropic smoothing-based illumination normalization method so that it increases the intensity contrast and enhances the edges while diminishing the effect of illumination variations. Due to the result of these improvements, face images preprocessed by the proposed illumination normalization method becomes to have more distinctive feature vectors (Gabor feature vectors) for face recognition. Through experiments of face recognition based on Gabor feature vector similarity, the effectiveness of the proposed illumination normalization method is verified.

Keywords: Illumination Normalization, Face Recognition, Anisotropic smoothing, Gabor feature vector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1548
777 Generating Arabic Fonts Using Rational Cubic Ball Functions

Authors: Fakharuddin Ibrahim, Jamaludin Md. Ali, Ahmad Ramli

Abstract:

In this paper, we will discuss about the data interpolation by using the rational cubic Ball curve. To generate a curve with a better and satisfactory smoothness, the curve segments must be connected with a certain amount of continuity. The continuity that we will consider is of type G1 continuity. The conditions considered are known as the G1 Hermite condition. A simple application of the proposed method is to generate an Arabic font satisfying the required continuity.

Keywords: Continuity, data interpolation, Hermite condition, rational Ball curve.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1467
776 Foot Recognition Using Deep Learning for Knee Rehabilitation

Authors: Rakkrit Duangsoithong, Jermphiphut Jaruenpunyasak, Alba Garcia

Abstract:

The use of foot recognition can be applied in many medical fields such as the gait pattern analysis and the knee exercises of patients in rehabilitation. Generally, a camera-based foot recognition system is intended to capture a patient image in a controlled room and background to recognize the foot in the limited views. However, this system can be inconvenient to monitor the knee exercises at home. In order to overcome these problems, this paper proposes to use the deep learning method using Convolutional Neural Networks (CNNs) for foot recognition. The results are compared with the traditional classification method using LBP and HOG features with kNN and SVM classifiers. According to the results, deep learning method provides better accuracy but with higher complexity to recognize the foot images from online databases than the traditional classification method.

Keywords: Convolutional neural networks, deep learning, foot recognition, knee rehabilitation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1435
775 Robust Face Recognition using AAM and Gabor Features

Authors: Sanghoon Kim, Sun-Tae Chung, Souhwan Jung, Seoungseon Jeon, Jaemin Kim, Seongwon Cho

Abstract:

In this paper, we propose a face recognition algorithm using AAM and Gabor features. Gabor feature vectors which are well known to be robust with respect to small variations of shape, scaling, rotation, distortion, illumination and poses in images are popularly employed for feature vectors for many object detection and recognition algorithms. EBGM, which is prominent among face recognition algorithms employing Gabor feature vectors, requires localization of facial feature points where Gabor feature vectors are extracted. However, localization method employed in EBGM is based on Gabor jet similarity and is sensitive to initial values. Wrong localization of facial feature points affects face recognition rate. AAM is known to be successfully applied to localization of facial feature points. In this paper, we devise a facial feature point localization method which first roughly estimate facial feature points using AAM and refine facial feature points using Gabor jet similarity-based facial feature localization method with initial points set by the rough facial feature points obtained from AAM, and propose a face recognition algorithm using the devised localization method for facial feature localization and Gabor feature vectors. It is observed through experiments that such a cascaded localization method based on both AAM and Gabor jet similarity is more robust than the localization method based on only Gabor jet similarity. Also, it is shown that the proposed face recognition algorithm using this devised localization method and Gabor feature vectors performs better than the conventional face recognition algorithm using Gabor jet similarity-based localization method and Gabor feature vectors like EBGM.

Keywords: Face Recognition, AAM, Gabor features, EBGM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2205
774 Javanese Character Recognition Using Hidden Markov Model

Authors: Anastasia Rita Widiarti, Phalita Nari Wastu

Abstract:

Hidden Markov Model (HMM) is a stochastic method which has been used in various signal processing and character recognition. This study proposes to use HMM to recognize Javanese characters from a number of different handwritings, whereby HMM is used to optimize the number of state and feature extraction. An 85.7 % accuracy is obtained as the best result in 16-stated vertical model using pure HMM. This initial result is satisfactory for prompting further research.

Keywords: Character recognition, off-line handwritingrecognition, Hidden Markov Model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1988
773 Recognition of Grocery Products in Images Captured by Cellular Phones

Authors: Farshideh Einsele, Hassan Foroosh

Abstract:

In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in its size, orientation, style, illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations can not be appropriately defined using well-known geometric transformations such as translation, rotation, affine transformation and shearing, we use the whole character black pixels as our feature vector. Classification is performed with minimum distance classifier using the maximum likelihood criterion, which delivers very promising Character Recognition Rate (CRR) of 89%. We achieve considerably higher Word Recognition Rate (WRR) of 99% when using lower level linguistic knowledge about product words during the recognition process.

Keywords: Camera-based OCR, Feature extraction, Document and image processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2470
772 View-Point Insensitive Human Pose Recognition using Neural Network and CUDA

Authors: Sanghyeok Oh, Keechul Jung

Abstract:

Although lots of research work has been done for human pose recognition, the view-point of cameras is still critical problem of overall recognition system. In this paper, view-point insensitive human pose recognition is proposed. The aims of the proposed system are view-point insensitivity and real-time processing. Recognition system consists of feature extraction module, neural network and real-time feed forward calculation. First, histogram-based method is used to extract feature from silhouette image and it is suitable for represent the shape of human pose. To reduce the dimension of feature vector, Principle Component Analysis(PCA) is used. Second, real-time processing is implemented by using Compute Unified Device Architecture(CUDA) and this architecture improves the speed of feed-forward calculation of neural network. We demonstrate the effectiveness of our approach with experiments on real environment.

Keywords: computer vision, neural network, pose recognition, view-point insensitive, PCA, CUDA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1338
771 Study of Features for Hand-printed Recognition

Authors: Satish Kumar

Abstract:

The feature extraction method(s) used to recognize hand-printed characters play an important role in ICR applications. In order to achieve high recognition rate for a recognition system, the choice of a feature that suits for the given script is certainly an important task. Even if a new feature required to be designed for a given script, it is essential to know the recognition ability of the existing features for that script. Devanagari script is being used in various Indian languages besides Hindi the mother tongue of majority of Indians. This research examines a variety of feature extraction approaches, which have been used in various ICR/OCR applications, in context to Devanagari hand-printed script. The study is conducted theoretically and experimentally on more that 10 feature extraction methods. The various feature extraction methods have been evaluated on Devanagari hand-printed database comprising more than 25000 characters belonging to 43 alphabets. The recognition ability of the features have been evaluated using three classifiers i.e. k-NN, MLP and SVM.

Keywords: Features, Hand-printed, Devanagari, Classifier, Database

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1728
770 An Exploratory Survey Questionnaire to Understand What Emotions Are Important and Difficult to Communicate for People with Dysarthria and Their Methodology of Communicating

Authors: Lubna Alhinti, Heidi Christensen, Stuart Cunningham

Abstract:

People with speech disorders may rely on augmentative and alternative communication (AAC) technologies to help them communicate. However, the limitations of the current AAC technologies act as barriers to the optimal use of these technologies in daily communication settings. The ability to communicate effectively relies on a number of factors that are not limited to the intelligibility of the spoken words. In fact, non-verbal cues play a critical role in the correct comprehension of messages and having to rely on verbal communication only, as is the case with current AAC technology, may contribute to problems in communication. This is especially true for people’s ability to express their feelings and emotions, which are communicated to a large part through non-verbal cues. This paper focuses on understanding more about the non-verbal communication ability of people with dysarthria, with the overarching aim of this research being to improve AAC technology by allowing people with dysarthria to better communicate emotions. Preliminary survey results are presented that gives an understanding of how people with dysarthria convey emotions, what emotions that are important for them to get across, what emotions that are difficult for them to convey, and whether there is a difference in communicating emotions when speaking to familiar versus unfamiliar people.

Keywords: Alternative and augmentative communication technology, dysarthria, speech emotion recognition, VIVOCA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1063
769 Automatic Music Score Recognition System Using Digital Image Processing

Authors: Yuan-Hsiang Chang, Zhong-Xian Peng, Li-Der Jeng

Abstract:

Music has always been an integral part of human’s daily lives. But, for the most people, reading musical score and turning it into melody is not easy. This study aims to develop an Automatic music score recognition system using digital image processing, which can be used to read and analyze musical score images automatically. The technical approaches included: (1) staff region segmentation; (2) image preprocessing; (3) note recognition; and (4) accidental and rest recognition. Digital image processing techniques (e.g., horizontal /vertical projections, connected component labeling, morphological processing, template matching, etc.) were applied according to musical notes, accidents, and rests in staff notations. Preliminary results showed that our system could achieve detection and recognition rates of 96.3% and 91.7%, respectively. In conclusion, we presented an effective automated musical score recognition system that could be integrated in a system with a media player to play music/songs given input images of musical score. Ultimately, this system could also be incorporated in applications for mobile devices as a learning tool, such that a music player could learn to play music/songs.

Keywords: Connected component labeling, image processing, morphological processing, optical musical recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1930
768 A Optimal Subclass Detection Method for Credit Scoring

Authors: Luciano Nieddu, Giuseppe Manfredi, Salvatore D'Acunto, Katia La Regina

Abstract:

In this paper a non-parametric statistical pattern recognition algorithm for the problem of credit scoring will be presented. The proposed algorithm is based on a clustering k- means algorithm and allows for the determination of subclasses of homogenous elements in the data. The algorithm will be tested on two benchmark datasets and its performance compared with other well known pattern recognition algorithm for credit scoring.

Keywords: Constrained clustering, Credit scoring, Statistical pattern recognition, Supervised classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2049
767 Unit Selection Algorithm Using Bi-grams Model For Corpus-Based Speech Synthesis

Authors: Mohamed Ali KAMMOUN, Ahmed Ben HAMIDA

Abstract:

In this paper, we present a novel statistical approach to corpus-based speech synthesis. Classically, phonetic information is defined and considered as acoustic reference to be respected. In this way, many studies were elaborated for acoustical unit classification. This type of classification allows separating units according to their symbolic characteristics. Indeed, target cost and concatenation cost were classically defined for unit selection. In Corpus-Based Speech Synthesis System, when using large text corpora, cost functions were limited to a juxtaposition of symbolic criteria and the acoustic information of units is not exploited in the definition of the target cost. In this manuscript, we token in our consideration the unit phonetic information corresponding to acoustic information. This would be realized by defining a probabilistic linguistic Bi-grams model basically used for unit selection. The selected units would be extracted from the English TIMIT corpora.

Keywords: Unit selection, Corpus-based Speech Synthesis, Bigram model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1440
766 Improved Closed Set Text-Independent Speaker Identification by Combining MFCC with Evidence from Flipped Filter Banks

Authors: Sandipan Chakroborty, Anindya Roy, Goutam Saha

Abstract:

A state of the art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC) modeled on the human auditory system has been used as a standard acoustic feature set for SI applications. However, due to the structure of its filter bank, it captures vocal tract characteristics more effectively in the lower frequency regions. This paper proposes a new set of features using a complementary filter bank structure which improves distinguishability of speaker specific cues present in the higher frequency zone. Unlike high level features that are difficult to extract, the proposed feature set involves little computational burden during the extraction process. When combined with MFCC via a parallel implementation of speaker models, the proposed feature set outperforms baseline MFCC significantly. This proposition is validated by experiments conducted on two different kinds of public databases namely YOHO (microphone speech) and POLYCOST (telephone speech) with Gaussian Mixture Models (GMM) as a Classifier for various model orders.

Keywords: Complementary Information, Filter Bank, GMM, IMFCC, MFCC, Speaker Identification, Speaker Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2294
765 A Motion Dictionary to Real-Time Recognition of Sign Language Alphabet Using Dynamic Time Warping and Artificial Neural Network

Authors: Marcio Leal, Marta Villamil

Abstract:

Computacional recognition of sign languages aims to allow a greater social and digital inclusion of deaf people through interpretation of their language by computer. This article presents a model of recognition of two of global parameters from sign languages; hand configurations and hand movements. Hand motion is captured through an infrared technology and its joints are built into a virtual three-dimensional space. A Multilayer Perceptron Neural Network (MLP) was used to classify hand configurations and Dynamic Time Warping (DWT) recognizes hand motion. Beyond of the method of sign recognition, we provide a dataset of hand configurations and motion capture built with help of fluent professionals in sign languages. Despite this technology can be used to translate any sign from any signs dictionary, Brazilian Sign Language (Libras) was used as case study. Finally, the model presented in this paper achieved a recognition rate of 80.4%.

Keywords: Sign language recognition, computer vision, infrared, artificial neural network, dynamic time warping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 878
764 Face Recognition with Image Rotation Detection, Correction and Reinforced Decision using ANN

Authors: Hemashree Bordoloi, Kandarpa Kumar Sarma

Abstract:

Rotation or tilt present in an image capture by digital means can be detected and corrected using Artificial Neural Network (ANN) for application with a Face Recognition System (FRS). Principal Component Analysis (PCA) features of faces at different angles are used to train an ANN which detects the rotation for an input image and corrected using a set of operations implemented using another system based on ANN. The work also deals with the recognition of human faces with features from the foreheads, eyes, nose and mouths as decision support entities of the system configured using a Generalized Feed Forward Artificial Neural Network (GFFANN). These features are combined to provide a reinforced decision for verification of a person-s identity despite illumination variations. The complete system performing facial image rotation detection, correction and recognition using re-enforced decision support provides a success rate in the higher 90s.

Keywords: Rotation, Face, Recognition, ANN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2061
763 Recognition and Reconstruction of Partially Occluded Objects

Authors: Michela Lecca, Stefano Messelodi

Abstract:

A new automatic system for the recognition and re¬construction of resealed and/or rotated partially occluded objects is presented. The objects to be recognized are described by 2D views and each view is occluded by several half-planes. The whole object views and their visible parts (linear cuts) are then stored in a database. To establish if a region R of an input image represents an object possibly occluded, the system generates a set of linear cuts of R and compare them with the elements in the database. Each linear cut of R is associated to the most similar database linear cut. R is recognized as an instance of the object 0 if the majority of the linear cuts of R are associated to a linear cut of views of 0. In the case of recognition, the system reconstructs the occluded part of R and determines the scale factor and the orientation in the image plane of the recognized object view. The system has been tested on two different datasets of objects, showing good performance both in terms of recognition and reconstruction accuracy.

Keywords: Occluded Object Recognition, Shape Reconstruction, Automatic Self-Adaptive Systems, Linear Cut.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1284