Search results for: Arabic recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 884

Search results for: Arabic recognition

824 Practical Aspects of Face Recognition

Authors: S. Vural, H. Yamauchi

Abstract:

Current systems for face recognition techniques often use either SVM or Adaboost techniques for face detection part and use PCA for face recognition part. In this paper, we offer a novel method for not only a powerful face detection system based on Six-segment-filters (SSR) and Adaboost learning algorithms but also for a face recognition system. A new exclusive face detection algorithm has been developed and connected with the recognition algorithm. As a result of it, we obtained an overall high-system performance compared with current systems. The proposed algorithm was tested on CMU, FERET, UNIBE, MIT face databases and significant performance has obtained.

Keywords: Adaboost, Face Detection, Face recognition, SVM, Gabor filters, PCA-ICA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1598
823 New Adaptive Linear Discriminante Analysis for Face Recognition with SVM

Authors: Mehdi Ghayoumi

Abstract:

We have applied new accelerated algorithm for linear discriminate analysis (LDA) in face recognition with support vector machine. The new algorithm has the advantage of optimal selection of the step size. The gradient descent method and new algorithm has been implemented in software and evaluated on the Yale face database B. The eigenfaces of these approaches have been used to training a KNN. Recognition rate with new algorithm is compared with gradient.

Keywords: lda, adaptive, svm, face recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1422
822 Recognition-based Segmentation in Persian Character Recognition

Authors: Mohsen Zand, Ahmadreza Naghsh Nilchi, S. Amirhassan Monadjemi

Abstract:

Optical character recognition of cursive scripts presents a number of challenging problems in both segmentation and recognition processes in different languages, including Persian. In order to overcome these problems, we use a newly developed Persian word segmentation method and a recognition-based segmentation technique to overcome its segmentation problems. This method is robust as well as flexible. It also increases the system-s tolerances to font variations. The implementation results of this method on a comprehensive database show a high degree of accuracy which meets the requirements for commercial use. Extended with a suitable pre and post-processing, the method offers a simple and fast framework to develop a full OCR system.

Keywords: OCR, Persian, Recognition, Segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840
821 Offline Handwritten Signature Recognition

Authors: Gulzar A. Khuwaja, Mohammad S. Laghari

Abstract:

Biometrics, which refers to identifying an individual based on his or her physiological or behavioral characteristics, has the capability to reliably distinguish between an authorized person and an imposter. Signature verification systems can be categorized as offline (static) and online (dynamic). This paper presents a neural network based recognition of offline handwritten signatures system that is trained with low-resolution scanned signature images.

Keywords: Pattern Recognition, Computer Vision, AdaptiveClassification, Handwritten Signature Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2903
820 A New Approach to Face Recognition Using Dual Dimension Reduction

Authors: M. Almas Anjum, M. Younus Javed, A. Basit

Abstract:

In this paper a new approach to face recognition is presented that achieves double dimension reduction, making the system computationally efficient with better recognition results and out perform common DCT technique of face recognition. In pattern recognition techniques, discriminative information of image increases with increase in resolution to a certain extent, consequently face recognition results change with change in face image resolution and provide optimal results when arriving at a certain resolution level. In the proposed model of face recognition, initially image decimation algorithm is applied on face image for dimension reduction to a certain resolution level which provides best recognition results. Due to increased computational speed and feature extraction potential of Discrete Cosine Transform (DCT), it is applied on face image. A subset of coefficients of DCT from low to mid frequencies that represent the face adequately and provides best recognition results is retained. A tradeoff between decimation factor, number of DCT coefficients retained and recognition rate with minimum computation is obtained. Preprocessing of the image is carried out to increase its robustness against variations in poses and illumination level. This new model has been tested on different databases which include ORL , Yale and EME color database.

Keywords: Biometrics, DCT, Face Recognition, Illumination, Computation, Feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1686
819 Effects of Multimedia-based Instructional Designs for Arabic Language Learning among Pupils of Different Achievement Levels

Authors: Aldalalah, M. Osamah, Soon Fook Fong & Ababneh, W. Ziad

Abstract:

The purpose of this study is to investigate the effects of modality principles in instructional software among first grade pupils- achievements in the learning of Arabic Language. Two modes of instructional software were systematically designed and developed, audio with images (AI), and text with images (TI). The quasi-experimental design was used in the study. The sample consisted of 123 male and female pupils from IRBED Education Directorate, Jordan. The pupils were randomly assigned to any one of the two modes. The independent variable comprised the two modes of the instructional software, the students- achievement levels in the Arabic Language class and gender. The dependent variable was the achievements of the pupils in the Arabic Language test. The theoretical framework of this study was based on Mayer-s Cognitive Theory of Multimedia Learning. Four hypotheses were postulated and tested. Analyses of Variance (ANOVA) showed that pupils using the (AI) mode performed significantly better than those using (TI) mode. This study concluded that the audio with images mode was an important aid to learning as compared to text with images mode.

Keywords: Cognitive theory of Multimedia Learning, ModalityPrinciple, Multimedia, Arabic Language learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2265
818 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: Emotion recognition, facial recognition, signal processing, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2018
817 A Novel Approach to Persian Online Hand Writing Recognition

Authors: Ramin Halavati, Mansour Jamzad, Mahdieh Soleymani

Abstract:

Persian (Farsi) script is totally cursive and each character is written in several different forms depending on its former and later characters in the word. These complexities make automatic handwriting recognition of Persian a very hard problem and there are few contributions trying to work it out. This paper presents a novel practical approach to online recognition of Persian handwriting which is based on representation of inputs and patterns with very simple visual features and comparison of these simple terms. This recognition approach is tested over a set of Persian words and the results have been quite acceptable when the possible words where unknown and they were almost all correct in cases that the words where chosen from a prespecified list.

Keywords: Image Processing, Pattern Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1330
816 Persian/Arabic Document Segmentation Based On Pyramidal Image Structure

Authors: Seyyed Yasser Hashemi, Khalil Monfaredi

Abstract:

Automatic transformation of paper documents into electronic documents requires document segmentation at the first stage. However, some parameters restrictions such as variations in character font sizes, different text line spacing, and also not uniform document layout structures altogether have made it difficult to design a general-purpose document layout analysis algorithm for many years. Thus in most previously reported methods it is inevitable to include these parameters. This problem becomes excessively acute and severe, especially in Persian/Arabic documents. Since the Persian/Arabic scripts differ considerably from the English scripts, most of the proposed methods for the English scripts do not render good results for the Persian scripts. In this paper, we present a novel parameter-free method for segmenting the Persian/Arabic document images which also works well for English scripts. This method segments the document image into maximal homogeneous regions and identifies them as texts and non-texts based on a pyramidal image structure. In other words the proposed method is capable of document segmentation without considering the character font sizes, text line spacing, and document layout structures. This algorithm is examined for 150 Arabic/Persian and English documents and document segmentation process are done successfully for 96 percent of documents.

Keywords: Persian/Arabic document, document segmentation, Pyramidal Image Structure, skew detection and correction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1765
815 Off-Line Signature Recognition Based On Angle Features and GRNN Neural Networks

Authors: Laila Y. Fannas, Ahmed Y. Ben Sasi

Abstract:

This research presents a handwritten signature recognition based on angle feature vector using Artificial Neural Network (ANN). Each signature image will be represented by an Angle vector. The feature vector will constitute the input to the ANN. The collection of signature images will be divided into two sets. One set will be used for training the ANN in a supervised fashion. The other set which is never seen by the ANN will be used for testing. After training, the ANN will be tested for recognition of the signature. When the signature is classified correctly, it is considered correct recognition otherwise it is a failure.

Keywords: Signature Recognition, Artificial Neural Network, Angle Features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2495
814 Optimized Brain Computer Interface System for Unspoken Speech Recognition: Role of Wernicke Area

Authors: Nassib Abdallah, Pierre Chauvet, Abd El Salam Hajjar, Bassam Daya

Abstract:

In this paper, we propose an optimized brain computer interface (BCI) system for unspoken speech recognition, based on the fact that the constructions of unspoken words rely strongly on the Wernicke area, situated in the temporal lobe. Our BCI system has four modules: (i) the EEG Acquisition module based on a non-invasive headset with 14 electrodes; (ii) the Preprocessing module to remove noise and artifacts, using the Common Average Reference method; (iii) the Features Extraction module, using Wavelet Packet Transform (WPT); (iv) the Classification module based on a one-hidden layer artificial neural network. The present study consists of comparing the recognition accuracy of 5 Arabic words, when using all the headset electrodes or only the 4 electrodes situated near the Wernicke area, as well as the selection effect of the subbands produced by the WPT module. After applying the articial neural network on the produced database, we obtain, on the test dataset, an accuracy of 83.4% with all the electrodes and all the subbands of 8 levels of the WPT decomposition. However, by using only the 4 electrodes near Wernicke Area and the 6 middle subbands of the WPT, we obtain a high reduction of the dataset size, equal to approximately 19% of the total dataset, with 67.5% of accuracy rate. This reduction appears particularly important to improve the design of a low cost and simple to use BCI, trained for several words.

Keywords: Brain-computer interface, speech recognition, electroencephalography EEG, Wernicke area, artificial neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 918
813 Possibilities, Challenges and the State of the Art of Automatic Speech Recognition in Air Traffic Control

Authors: Van Nhan Nguyen, Harald Holone

Abstract:

Over the past few years, a lot of research has been conducted to bring Automatic Speech Recognition (ASR) into various areas of Air Traffic Control (ATC), such as air traffic control simulation and training, monitoring live operators for with the aim of safety improvements, air traffic controller workload measurement and conducting analysis on large quantities controller-pilot speech. Due to the high accuracy requirements of the ATC context and its unique challenges, automatic speech recognition has not been widely adopted in this field. With the aim of providing a good starting point for researchers who are interested bringing automatic speech recognition into ATC, this paper gives an overview of possibilities and challenges of applying automatic speech recognition in air traffic control. To provide this overview, we present an updated literature review of speech recognition technologies in general, as well as specific approaches relevant to the ATC context. Based on this literature review, criteria for selecting speech recognition approaches for the ATC domain are presented, and remaining challenges and possible solutions are discussed.

Keywords: Automatic Speech Recognition, ASR, Air Traffic Control, ATC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4042
812 Analysis of Combined Use of NN and MFCC for Speech Recognition

Authors: Safdar Tanweer, Abdul Mobin, Afshar Alam

Abstract:

The performance and analysis of speech recognition system is illustrated in this paper. An approach to recognize the English word corresponding to digit (0-9) spoken by 2 different speakers is captured in noise free environment. For feature extraction, speech Mel frequency cepstral coefficients (MFCC) has been used which gives a set of feature vectors from recorded speech samples. Neural network model is used to enhance the recognition performance. Feed forward neural network with back propagation algorithm model is used. However other speech recognition techniques such as HMM, DTW exist. All experiments are carried out on Matlab.

Keywords: Speech Recognition, MFCC, Neural Network, classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3268
811 Investigation of Combined use of MFCC and LPC Features in Speech Recognition Systems

Authors: К. R. Aida–Zade, C. Ardil, S. S. Rustamov

Abstract:

Statement of the automatic speech recognition problem, the assignment of speech recognition and the application fields are shown in the paper. At the same time as Azerbaijan speech, the establishment principles of speech recognition system and the problems arising in the system are investigated. The computing algorithms of speech features, being the main part of speech recognition system, are analyzed. From this point of view, the determination algorithms of Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) coefficients expressing the basic speech features are developed. Combined use of cepstrals of MFCC and LPC in speech recognition system is suggested to improve the reliability of speech recognition system. To this end, the recognition system is divided into MFCC and LPC-based recognition subsystems. The training and recognition processes are realized in both subsystems separately, and recognition system gets the decision being the same results of each subsystems. This results in decrease of error rate during recognition. The training and recognition processes are realized by artificial neural networks in the automatic speech recognition system. The neural networks are trained by the conjugate gradient method. In the paper the problems observed by the number of speech features at training the neural networks of MFCC and LPC-based speech recognition subsystems are investigated. The variety of results of neural networks trained from different initial points in training process is analyzed. Methodology of combined use of neural networks trained from different initial points in speech recognition system is suggested to improve the reliability of recognition system and increase the recognition quality, and obtained practical results are shown.

Keywords: Speech recognition, cepstral analysis, Voice activation detection algorithm, Mel Frequency Cepstral Coefficients, features of speech, Cepstral Mean Subtraction, neural networks, Linear Predictive Coding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 913
810 Infrared Face Recognition Using Distance Transforms

Authors: Moulay A. Akhloufi, Abdelhakim Bendada

Abstract:

In this work we present an efficient approach for face recognition in the infrared spectrum. In the proposed approach physiological features are extracted from thermal images in order to build a unique thermal faceprint. Then, a distance transform is used to get an invariant representation for face recognition. The obtained physiological features are related to the distribution of blood vessels under the face skin. This blood network is unique to each individual and can be used in infrared face recognition. The obtained results are promising and show the effectiveness of the proposed scheme.

Keywords: Face recognition, biometrics, infrared imaging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1423
809 Gait Recognition System: Bundle Rectangle Approach

Authors: Edward Guillen, Daniel Padilla, Adriana Hernandez, Kenneth Barner

Abstract:

Biometrics methods include recognition techniques such as fingerprint, iris, hand geometry, voice, face, ears and gait. The gait recognition approach has some advantages, for example it does not need the prior concern of the observed subject and it can record many biometric features in order to make deeper analysis, but most of the research proposals use high computational cost. This paper shows a gait recognition system with feature subtraction on a bundle rectangle drawn over the observed person. Statistical results within a database of 500 videos are shown.

Keywords: Autentication, Biometrics, Gait Recognition, Human Identification, Security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1877
808 Investigating Interference Errors Made by Azzawia University 1st year Students of English in Learning English Prepositions

Authors: Aimen Mohamed Almaloul

Abstract:

The main focus of this study is investigating the interference of Arabic in the use of English prepositions by Libyan university students. Prepositions in the tests used in the study were categorized, according to their relation to Arabic, into similar Arabic and English prepositions (SAEP), dissimilar Arabic and English prepositions (DAEP), Arabic prepositions with no English counterparts (APEC), and English prepositions with no Arabic counterparts (EPAC).

The subjects of the study were the first year university students of the English department, Sabrata Faculty of Arts, Azzawia University; both males and females, and they were 100 students. The basic tool for data collection was a test of English prepositions; students are instructed to fill in the blanks with the correct prepositions and to put a zero (0) if no preposition was needed. The test was then handed to the subjects of the study.

The test was then scored and quantitative as well as qualitative results were obtained. Quantitative results indicated the number, percentages and rank order of errors in each of the categories and qualitative results indicated the nature and significance of those errors and their possible sources. Based on the obtained results the researcher could detect that students made more errors in the EPAC category than the other three categories and these errors could be attributed to the lack of knowledge of the different meanings of English prepositions. This lack of knowledge forced the students to adopt what is called the strategy of transfer.

Keywords: Foreign language acquisition, foreign language learning, interference system, interlanguage system, mother tongue interference.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5046
807 The Effect of Culture on User Interface Design of Social Media - A Case Study on Preferences of Saudi Arabians on the Arabic User Interface of Facebook

Authors: Hana Almakky, Reza Sahandi, Jacqui Taylor

Abstract:

Social media continues to grow, and user interfaces may become more appealing if cultural characteristics are incorporated into their design. Facebook was designed in the west, and the original language was English. Subsequently, the words in the user interface were translated to other languages, including Arabic. Arabic words are written from right to left, and English is written from left to right. The translated version may misrepresent the original design and users’ preferences may be influenced by their culture, which should be considered in the user interface design. Previous research indicates that users are more comfortable when interacting with a user interface, which relates to their own culture. Therefore, this paper, using a survey, investigates the preferences of Saudi Arabians on the Arabic version of the user interface of Facebook.

Keywords: Culture, Facebook, Saudi Arabia, Social media, User Interface Design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3655
806 Recognizing an Individual, Their Topic of Conversation, and Cultural Background from 3D Body Movement

Authors: Gheida J. Shahrour, Martin J. Russell

Abstract:

The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects, arranged them into groups according to their culture. We arranged each group into pairs and each pair communicated with each other about different topics. A state-of-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that intersubject differences (i.e. subject’s personality traits) are a major source of variation. When removing these traits from culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and the Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from culture and topic recognition systems respectively.

Keywords: Person Recognition, Topic Recognition, Culture Recognition, 3D Body Movement Signals, Variability Compensation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2174
805 Constructing of Classifier for Face Recognition on the Basis of the Conjugation Indexes

Authors: Vladimir A. Fursov, Nikita E. Kozin

Abstract:

In this work the opportunity of construction of the qualifiers for face-recognition systems based on conjugation criteria is investigated. The linkage between the bipartite conjugation, the conjugation with a subspace and the conjugation with the null-space is shown. The unified solving rule is investigated. It makes the decision on the rating of face to a class considering the linkage between conjugation values. The described recognition method can be successfully applied to the distributed systems of video control and video observation.

Keywords: Conjugation, Eigenfaces, Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1467
804 Face Recognition Based On Vector Quantization Using Fuzzy Neuro Clustering

Authors: Elizabeth B. Varghese, M. Wilscy

Abstract:

A face recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame. A lot of algorithms have been proposed for face recognition. Vector Quantization (VQ) based face recognition is a novel approach for face recognition. Here a new codebook generation for VQ based face recognition using Integrated Adaptive Fuzzy Clustering (IAFC) is proposed. IAFC is a fuzzy neural network which incorporates a fuzzy learning rule into a competitive neural network. The performance of proposed algorithm is demonstrated by using publicly available AT&T database, Yale database, Indian Face database and a small face database, DCSKU database created in our lab. In all the databases the proposed approach got a higher recognition rate than most of the existing methods. In terms of Equal Error Rate (ERR) also the proposed codebook is better than the existing methods.

Keywords: Face Recognition, Vector Quantization, Integrated Adaptive Fuzzy Clustering, Self Organization Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2241
803 Data Gathering and Analysis for Arabic Historical Documents

Authors: Ali Dulla

Abstract:

This paper introduces a new dataset (and the methodology used to generate it) based on a wide range of historical Arabic documents containing clean data simple and homogeneous-page layouts. The experiments are implemented on printed and handwritten documents obtained respectively from some important libraries such as Qatar Digital Library, the British Library and the Library of Congress. We have gathered and commented on 150 archival document images from different locations and time periods. It is based on different documents from the 17th-19th century. The dataset comprises differing page layouts and degradations that challenge text line segmentation methods. Ground truth is produced using the Aletheia tool by PRImA and stored in an XML representation, in the PAGE (Page Analysis and Ground truth Elements) format. The dataset presented will be easily available to researchers world-wide for research into the obstacles facing various historical Arabic documents such as geometric correction of historical Arabic documents.

Keywords: Dataset production, ground truth production, historical documents, arbitrary warping, geometric correction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 865
802 The Effects of the Inference Process in Reading Texts in Arabic

Authors: May George

Abstract:

Inference plays an important role in the learning process and it can lead to a rapid acquisition of a second language. When learning a non-native language i.e., a critical language like Arabic, the students depend on the teacher’s support most of the time to learn new concepts. The students focus on memorizing the new vocabulary and stress on learning all the grammatical rules. Hence, the students became mechanical and cannot produce the language easily. As a result, they are unable to predicate the meaning of words in the context by relying heavily on the teacher, in that they cannot link their prior knowledge or even identify the meaning of the words without the support of the teacher. This study explores how the teacher guides students learning during the inference process and what are the processes of learning that can direct student’s inference.

Keywords: Inference, Reading, Arabic, and Language Acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2052
801 Enhanced Face Recognition with Daisy Descriptors Using 1BT Based Registration

Authors: Sevil Igit, Merve Meric, Sarp Erturk

Abstract:

In this paper, it is proposed to improve Daisy Descriptor based face recognition using a novel One-Bit Transform (1BT) based pre-registration approach. The 1BT based pre-registration procedure is fast and has low computational complexity. It is shown that the face recognition accuracy is improved with the proposed approach. The proposed approach can facilitate highly accurate face recognition using DAISY descriptor with simple matching and thereby facilitate a low-complexity approach.

Keywords: Face Recognition, Daisy Descriptor, One-Bit Transform, Image Registration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1972
800 Ultra High Speed Approach for Document Skew Detection and Correction Based On Centre of Gravity

Authors: Seyyed Yasser Hashemi

Abstract:

Skew detection and correction (SDC) has a direct effect in efficiency and exactitude of documents’ segmentation and analysis and thus is considered as a very important step in documents’ analysis field. Skew is a major problem in documents’ analysis for every language. For Arabic/Persian document scripts this problem is more severe because of special features of these languages. In this paper an efficient and fast algorithm for Document Skew Detection (DSD) based on the concept of segmentation and Center of Gravity (COG) is proposed. This algorithm is examined for 150 Arabic/Persian and English documents and SDC process are done successfully for 93 percent of documents with error rate of less than 1°. This algorithm shows better results for English documents compared to Arabic/Persian documents. The proposed method is also represents favorable results for handwritten, printed and also complicated documents such as newspapers and journals even with very low quality and resolution.

Keywords: Arabic/Persian document, Baseline, Centre of gravity, Document segmentation, Skew detection and correction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1911
799 Improved Weighted Matching for Speaker Recognition

Authors: Ozan Mut, Mehmet Göktürk

Abstract:

Matching algorithms have significant importance in speaker recognition. Feature vectors of the unknown utterance are compared to feature vectors of the modeled speakers as a last step in speaker recognition. A similarity score is found for every model in the speaker database. Depending on the type of speaker recognition, these scores are used to determine the author of unknown speech samples. For speaker verification, similarity score is tested against a predefined threshold and either acceptance or rejection result is obtained. In the case of speaker identification, the result depends on whether the identification is open set or closed set. In closed set identification, the model that yields the best similarity score is accepted. In open set identification, the best score is tested against a threshold, so there is one more possible output satisfying the condition that the speaker is not one of the registered speakers in existing database. This paper focuses on closed set speaker identification using a modified version of a well known matching algorithm. The results of new matching algorithm indicated better performance on YOHO international speaker recognition database.

Keywords: Automatic Speaker Recognition, Voice Recognition, Pattern Recognition, Digital Audio Signal Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1732
798 An Amalgam Approach for DICOM Image Classification and Recognition

Authors: J. Umamaheswari, G. Radhamani

Abstract:

This paper describes about the process of recognition and classification of brain images such as normal and abnormal based on PSO-SVM. Image Classification is becoming more important for medical diagnosis process. In medical area especially for diagnosis the abnormality of the patient is classified, which plays a great role for the doctors to diagnosis the patient according to the severeness of the diseases. In case of DICOM images it is very tough for optimal recognition and early detection of diseases. Our work focuses on recognition and classification of DICOM image based on collective approach of digital image processing. For optimal recognition and classification Particle Swarm Optimization (PSO), Genetic Algorithm (GA) and Support Vector Machine (SVM) are used. The collective approach by using PSO-SVM gives high approximation capability and much faster convergence.

Keywords: Recognition, classification, Relaxed Median Filter, Adaptive thresholding, clustering and Neural Networks

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2259
797 Advances in Artificial Intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: Speech recognition, acoustic phonetic, artificial intelligence, Hidden Markov Models (HMM), statistical models of speech recognition, human machine performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7978
796 Face Recognition Using Double Dimension Reduction

Authors: M. A Anjum, M. Y. Javed, A. Basit

Abstract:

In this paper a new approach to face recognition is presented that achieves double dimension reduction making the system computationally efficient with better recognition results. In pattern recognition techniques, discriminative information of image increases with increase in resolution to a certain extent, consequently face recognition results improve with increase in face image resolution and levels off when arriving at a certain resolution level. In the proposed model of face recognition, first image decimation algorithm is applied on face image for dimension reduction to a certain resolution level which provides best recognition results. Due to better computational speed and feature extraction potential of Discrete Cosine Transform (DCT) it is applied on face image. A subset of coefficients of DCT from low to mid frequencies that represent the face adequately and provides best recognition results is retained. A trade of between decimation factor, number of DCT coefficients retained and recognition rate with minimum computation is obtained. Preprocessing of the image is carried out to increase its robustness against variations in poses and illumination level. This new model has been tested on different databases which include ORL database, Yale database and a color database. The proposed technique has performed much better compared to other techniques. The significance of the model is two fold: (1) dimension reduction up to an effective and suitable face image resolution (2) appropriate DCT coefficients are retained to achieve best recognition results with varying image poses, intensity and illumination level.

Keywords: Biometrics, DCT, Face Recognition, Feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1492
795 Probabilistic Bayesian Framework for Infrared Face Recognition

Authors: Moulay A. Akhloufi, Abdelhakim Bendada

Abstract:

Face recognition in the infrared spectrum has attracted a lot of interest in recent years. Many of the techniques used in infrared are based on their visible counterpart, especially linear techniques like PCA and LDA. In this work, we introduce a probabilistic Bayesian framework for face recognition in the infrared spectrum. In the infrared spectrum, variations can occur between face images of the same individual due to pose, metabolic, time changes, etc. Bayesian approaches permit to reduce intrapersonal variation, thus making them very interesting for infrared face recognition. This framework is compared with classical linear techniques. Non linear techniques we developed recently for infrared face recognition are also presented and compared to the Bayesian face recognition framework. A new approach for infrared face extraction based on SVM is introduced. Experimental results show that the Bayesian technique is promising and lead to interesting results in the infrared spectrum when a sufficient number of face images is used in an intrapersonal learning process.

Keywords: Face recognition, biometrics, probabilistic imageprocessing, infrared imaging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1877