Search results for: Visual Character Recognition.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1493

Search results for: Visual Character Recognition.

1223 Rotation Invariant Face Recognition Based on Hybrid LPT/DCT Features

Authors: Rehab F. Abdel-Kader, Rabab M. Ramadan, Rawya Y. Rizk

Abstract:

The recognition of human faces, especially those with different orientations is a challenging and important problem in image analysis and classification. This paper proposes an effective scheme for rotation invariant face recognition using Log-Polar Transform and Discrete Cosine Transform combined features. The rotation invariant feature extraction for a given face image involves applying the logpolar transform to eliminate the rotation effect and to produce a row shifted log-polar image. The discrete cosine transform is then applied to eliminate the row shift effect and to generate the low-dimensional feature vector. A PSO-based feature selection algorithm is utilized to search the feature vector space for the optimal feature subset. Evolution is driven by a fitness function defined in terms of maximizing the between-class separation (scatter index). Experimental results, based on the ORL face database using testing data sets for images with different orientations; show that the proposed system outperforms other face recognition methods. The overall recognition rate for the rotated test images being 97%, demonstrating that the extracted feature vector is an effective rotation invariant feature set with minimal set of selected features.

Keywords: Discrete Cosine Transform, Face Recognition, Feature Extraction, Log Polar Transform, Particle SwarmOptimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1838
1222 Face Recognition Using Morphological Shared-weight Neural Networks

Authors: Hossein Sahoolizadeh, Mahdi Rahimi, Hamid Dehghani

Abstract:

We introduce an algorithm based on the morphological shared-weight neural network. Being nonlinear and translation-invariant, the MSNN can be used to create better generalization during face recognition. Feature extraction is performed on grayscale images using hit-miss transforms that are independent of gray-level shifts. The output is then learned by interacting with the classification process. The feature extraction and classification networks are trained together, allowing the MSNN to simultaneously learn feature extraction and classification for a face. For evaluation, we test for robustness under variations in gray levels and noise while varying the network-s configuration to optimize recognition efficiency and processing time. Results show that the MSNN performs better for grayscale image pattern classification than ordinary neural networks.

Keywords: Face recognition, Neural Networks, Multi-layer Perceptron, masking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1486
1221 MarginDistillation: Distillation for Face Recognition Neural Networks with Margin-Based Softmax

Authors: Svitov David, Alyamkin Sergey

Abstract:

The usage of convolutional neural networks (CNNs) in conjunction with the margin-based softmax approach demonstrates the state-of-the-art performance for the face recognition problem. Recently, lightweight neural network models trained with the margin-based softmax have been introduced for the face identification task for edge devices. In this paper, we propose a distillation method for lightweight neural network architectures that outperforms other known methods for the face recognition task on LFW, AgeDB-30 and Megaface datasets. The idea of the proposed method is to use class centers from the teacher network for the student network. Then the student network is trained to get the same angles between the class centers and face embeddings predicted by the teacher network.

Keywords: ArcFace, distillation, face recognition, margin-based softmax.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 576
1220 The Modified Eigenface Method using Two Thresholds

Authors: Yan Ma, ShunBao Li

Abstract:

A new approach is adopted in this paper based on Turk and Pentland-s eigenface method. It was found that the probability density function of the distance between the projection vector of the input face image and the average projection vector of the subject in the face database, follows Rayleigh distribution. In order to decrease the false acceptance rate and increase the recognition rate, the input face image has been recognized using two thresholds including the acceptance threshold and the rejection threshold. We also find out that the value of two thresholds will be close to each other as number of trials increases. During the training, in order to reduce the number of trials, the projection vectors for each subject has been averaged. The recognition experiments using the proposed algorithm show that the recognition rate achieves to 92.875% whilst the average number of judgment is only 2.56 times.

Keywords: Eigenface, Face Recognition, Threshold, Rayleigh Distribution, Feature Extraction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1458
1219 Vision Based Hand Gesture Recognition Using Generative and Discriminative Stochastic Models

Authors: Mahmoud Elmezain, Samar El-shinawy

Abstract:

Many approaches to pattern recognition are founded on probability theory, and can be broadly characterized as either generative or discriminative according to whether or not the distribution of the image features. Generative and discriminative models have very different characteristics, as well as complementary strengths and weaknesses. In this paper, we study these models to recognize the patterns of alphabet characters (A-Z) and numbers (0-9). To handle isolated pattern, generative model as Hidden Markov Model (HMM) and discriminative models like Conditional Random Field (CRF), Hidden Conditional Random Field (HCRF) and Latent-Dynamic Conditional Random Field (LDCRF) with different number of window size are applied on extracted pattern features. The gesture recognition rate is improved initially as the window size increase, but degrades as window size increase further. Experimental results show that the LDCRF is the best in terms of results than CRF, HCRF and HMM at window size equal 4. Additionally, our results show that; an overall recognition rates are 91.52%, 95.28%, 96.94% and 98.05% for CRF, HCRF, HMM and LDCRF respectively.

Keywords: Statistical Pattern Recognition, Generative Model, Discriminative Model, Human Computer Interaction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2880
1218 Rapid Study on Feature Extraction and Classification Models in Healthcare Applications

Authors: S. Sowmyayani

Abstract:

The advancement of computer-aided design helps the medical force and security force. Some applications include biometric recognition, elderly fall detection, face recognition, cancer recognition, tumor recognition, etc. This paper deals with different machine learning algorithms that are more generically used for any health care system. The most focused problems are classification and regression. With the rise of big data, machine learning has become particularly important for solving problems. Machine learning uses two types of techniques: supervised learning and unsupervised learning. The former trains a model on known input and output data and predicts future outputs. Classification and regression are supervised learning techniques. Unsupervised learning finds hidden patterns in input data. Clustering is one such unsupervised learning technique. The above-mentioned models are discussed briefly in this paper.

Keywords: Supervised learning, unsupervised learning, regression, neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 286
1217 The Importance of Bridge Health Monitoring

Authors: Punya Chupanit, Chayatan Phromsorn

Abstract:

In the past, there were many bridge-s collapses due to lack of bridge structural capacity information. Most of concrete bridge health was relied on information from visual inspection, which sometime was inadequate. This study was conducted in order to investigate relationship between bridge structural condition and bridge visual condition. This study was a part of a big project conducted at Department of Highways of Thailand. In this study, 31 bridges including slab-type bridges, plank-girder bridges, prestressed box-beam bridges, prestressed I-girder bridges and prestressed multibeam bridges were selected for visual inspection and load test. It was found a positive correlation between bridge appearance and bridge-s load carrying capacity. However, statistical characteristic revealed low correlation between them.

Keywords: Bridge, Visual Inspection, Load Test, Condition Rating, Rating Factor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2007
1216 Continuous Feature Adaptation for Non-Native Speech Recognition

Authors: Y. Deng, X. Li, C. Kwan, B. Raj, R. Stern

Abstract:

The current speech interfaces in many military applications may be adequate for native speakers. However, the recognition rate drops quite a lot for non-native speakers (people with foreign accents). This is mainly because the nonnative speakers have large temporal and intra-phoneme variations when they pronounce the same words. This problem is also complicated by the presence of large environmental noise such as tank noise, helicopter noise, etc. In this paper, we proposed a novel continuous acoustic feature adaptation algorithm for on-line accent and environmental adaptation. Implemented by incremental singular value decomposition (SVD), the algorithm captures local acoustic variation and runs in real-time. This feature-based adaptation method is then integrated with conventional model-based maximum likelihood linear regression (MLLR) algorithm. Extensive experiments have been performed on the NATO non-native speech corpus with baseline acoustic model trained on native American English. The proposed feature-based adaptation algorithm improved the average recognition accuracy by 15%, while the MLLR model based adaptation achieved 11% improvement. The corresponding word error rate (WER) reduction was 25.8% and 2.73%, as compared to that without adaptation. The combined adaptation achieved overall recognition accuracy improvement of 29.5%, and WER reduction of 31.8%, as compared to that without adaptation.

Keywords: speaker adaptation; environment adaptation; robust speech recognition; SVD; non-native speech recognition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3185
1215 Sounds Alike Name Matching for Myanmar Language

Authors: Yuzana, Khin Marlar Tun

Abstract:

Personal name matching system is the core of essential task in national citizen database, text and web mining, information retrieval, online library system, e-commerce and record linkage system. It has necessitated to the all embracing research in the vicinity of name matching. Traditional name matching methods are suitable for English and other Latin based language. Asian languages which have no word boundary such as Myanmar language still requires sounds alike matching system in Unicode based application. Hence we proposed matching algorithm to get analogous sounds alike (phonetic) pattern that is convenient for Myanmar character spelling. According to the nature of Myanmar character, we consider for word boundary fragmentation, collation of character. Thus we use pattern conversion algorithm which fabricates words in pattern with fragmented and collated. We create the Myanmar sounds alike phonetic group to help in the phonetic matching. The experimental results show that fragmentation accuracy in 99.32% and processing time in 1.72 ms.

Keywords: natural language processing, name matching, phonetic matching

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1764
1214 VISUAL JESS: AN Expandable Visual Generator of Oriented Object Expert systems

Authors: Amel Grissa-Touzi, Habib Ounally, Aissa Boulila

Abstract:

The utility of expert system generators has been widely recognized in many applications. Several generators based on concept of the paradigm object, have been recently proposed. The generator of oriented object expert system (GSEOO) offers languages that are often complex and difficult to use. We propose in this paper an extension of the expert system generator, JESS, which permits a friendly use of this expert system. The new tool, called VISUAL JESS, bring two main improvements to JESS. The first improvement concerns the easiness of its utilization while giving back transparency to the syntax and semantic aspects of the JESS programming language. The second improvement permits an easy access and modification of the JESS knowledge basis. The implementation of VISUAL JESS is made so that it is extensible and portable.

Keywords: Generator of Systems Expert, Programming oriented object classifies, object, inheritance, polymorphism.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1566
1213 Control Chart Pattern Recognition Using Wavelet Based Neural Networks

Authors: Jun Seok Kim, Cheong-Sool Park, Jun-Geol Baek, Sung-Shick Kim

Abstract:

Control chart pattern recognition is one of the most important tools to identify the process state in statistical process control. The abnormal process state could be classified by the recognition of unnatural patterns that arise from assignable causes. In this study, a wavelet based neural network approach is proposed for the recognition of control chart patterns that have various characteristics. The procedure of proposed control chart pattern recognizer comprises three stages. First, multi-resolution wavelet analysis is used to generate time-shape and time-frequency coefficients that have detail information about the patterns. Second, distance based features are extracted by a bi-directional Kohonen network to make reduced and robust information. Third, a back-propagation network classifier is trained by these features. The accuracy of the proposed method is shown by the performance evaluation with numerical results.

Keywords: Control chart pattern recognition, Multi-resolution wavelet analysis, Bi-directional Kohonen network, Back-propagation network, Feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2440
1212 A Hidden Markov Model-Based Isolated and Meaningful Hand Gesture Recognition

Authors: Mahmoud Elmezain, Ayoub Al-Hamadi, Jörg Appenrodt, Bernd Michaelis

Abstract:

Gesture recognition is a challenging task for extracting meaningful gesture from continuous hand motion. In this paper, we propose an automatic system that recognizes isolated gesture, in addition meaningful gesture from continuous hand motion for Arabic numbers from 0 to 9 in real-time based on Hidden Markov Models (HMM). In order to handle isolated gesture, HMM using Ergodic, Left-Right (LR) and Left-Right Banded (LRB) topologies is applied over the discrete vector feature that is extracted from stereo color image sequences. These topologies are considered to different number of states ranging from 3 to 10. A new system is developed to recognize the meaningful gesture based on zero-codeword detection with static velocity motion for continuous gesture. Therefore, the LRB topology in conjunction with Baum-Welch (BW) algorithm for training and forward algorithm with Viterbi path for testing presents the best performance. Experimental results show that the proposed system can successfully recognize isolated and meaningful gesture and achieve average rate recognition 98.6% and 94.29% respectively.

Keywords: Computer Vision & Image Processing, Gesture Recognition, Pattern Recognition, Application

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2209
1211 Local Spectrum Feature Extraction for Face Recognition

Authors: Muhammad Imran Ahmad, Ruzelita Ngadiran, Mohd Nazrin Md Isa, Nor Ashidi Mat Isa, Mohd Zaizu Ilyas, Raja Abdullah Raja Ahmad, Said Amirul Anwar Ab Hamid, Muzammil Jusoh

Abstract:

This paper presents two techniques, local feature extraction using image spectrum and low frequency spectrum modelling using GMM to capture the underlying statistical information to improve the performance of face recognition system. Local spectrum features are extracted using overlap sub block window that are mapped on the face image. For each of this block, spatial domain is transformed to frequency domain using DFT. A low frequency coefficient is preserved by discarding high frequency coefficients by applying rectangular mask on the spectrum of the facial image. Low frequency information is non- Gaussian in the feature space and by using combination of several Gaussian functions that has different statistical properties, the best feature representation can be modelled using probability density function. The recognition process is performed using maximum likelihood value computed using pre-calculated GMM components. The method is tested using FERET datasets and is able to achieved 92% recognition rates.

Keywords: Local features modelling, face recognition system, Gaussian mixture models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2215
1210 Self Watermarking based on Visual Cryptography

Authors: Mahmoud A. Hassan, Mohammed A. Khalili

Abstract:

We are proposing a simple watermarking method based on visual cryptography. The method is based on selection of specific pixels from the original image instead of random selection of pixels as per Hwang [1] paper. Verification information is generated which will be used to verify the ownership of the image without the need to embed the watermark pattern into the original digital data. Experimental results show the proposed method can recover the watermark pattern from the marked data even if some changes are made to the original digital data.

Keywords: Watermarking, visual cryptography, visualthreshold.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1710
1209 On Developing an Automatic Speech Recognition System for Standard Arabic Language

Authors: R. Walha, F. Drira, H. El-Abed, A. M. Alimi

Abstract:

The Automatic Speech Recognition (ASR) applied to Arabic language is a challenging task. This is mainly related to the language specificities which make the researchers facing multiple difficulties such as the insufficient linguistic resources and the very limited number of available transcribed Arabic speech corpora. In this paper, we are interested in the development of a HMM-based ASR system for Standard Arabic (SA) language. Our fundamental research goal is to select the most appropriate acoustic parameters describing each audio frame, acoustic models and speech recognition unit. To achieve this purpose, we analyze the effect of varying frame windowing (size and period), acoustic parameter number resulting from features extraction methods traditionally used in ASR, speech recognition unit, Gaussian number per HMM state and number of embedded re-estimations of the Baum-Welch Algorithm. To evaluate the proposed ASR system, a multi-speaker SA connected-digits corpus is collected, transcribed and used throughout all experiments. A further evaluation is conducted on a speaker-independent continue SA speech corpus. The phonemes recognition rate is 94.02% which is relatively high when comparing it with another ASR system evaluated on the same corpus.

Keywords: ASR, HMM, acoustical analysis, acoustic modeling, Standard Arabic language

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1746
1208 Mining and Visual Management of XML-Based Image Collections

Authors: Khalil Shihab, Nida Al-Chalabi

Abstract:

This article describes Uruk, the virtual museum of Iraq that we developed for visual exploration and retrieval of image collections. The system largely exploits the loosely-structured hierarchy of XML documents that provides a useful representation method to store semi-structured or unstructured data, which does not easily fit into existing database. The system offers users the capability to mine and manage the XML-based image collections through a web-based Graphical User Interface (GUI). Typically, at an interactive session with the system, the user can browse a visual structural summary of the XML database in order to select interesting elements. Using this intermediate result, queries combining structure and textual references can be composed and presented to the system. After query evaluation, the full set of answers is presented in a visual and structured way.

Keywords: Data-centric XML, graphical user interfaces, information retrieval, case-based reasoning, fuzzy sets

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1753
1207 Visual Inspection of Work Piece with a Complex Shape by Means of Robot Manipulator

Authors: A. Y. Bani Hashim, N. S. A. Ramdan

Abstract:

Inconsistency in manual inspection is real because humans get tired after some time. Recent trends show that automatic inspection is more appealing for mass production inspections. In such as a case, a robot manipulator seems the best candidate to run a dynamic visual inspection. The purpose of this work is to estimate the optimum workspace where a robot manipulator would perform a visual inspection process onto a work piece where a camera is attached to the end effector. The pseudo codes for the planned path are derived from the number of tool transit points, the delay time at the transit points, the process cycle time, and the configuration space that the distance between the tool and the work piece. It is observed that express start and swift end are acceptable in a robot program because applicable works usually in existence during these moments. However, during the mid-range cycle, there are always practical tasks programmed to be executed. For that reason, it is acceptable to program the robot such as that speedy alteration of actuator displacement is avoided. A dynamic visual inspection system using a robot manipulator seems practical for a work piece with a complex shape.

Keywords: Robot manipulator, Visual inspection, Work piece, Trajectory planning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1628
1206 Spine Evaluation Device with Visual Feedback

Authors: T. Hirata, G. H. Yokoyama, L. H. M. Duque

Abstract:

The posteroanterior manipulation technique is usually include in the procedure of the lumbar spine to evaluate the intervertebral motion according to mechanical resistance. The mechanical device with visual feedback was proposed that allows one to analysis the lumbar segments mobility “in vivo" facilitating for the therapist to take its treatment evolution. The measuring system uses load cell and displacement sensor to estimate spine stiffness. In this work, the device was tested by 2 therapists, female, applying posteroanterior force techniques to 5 volunteers, female, with frequency of approximately 1.2-1.8 Hz. A test-retest procedure was used for 2 periods of day. The visual feedback results small variation of forces and cycle time during 6 cycles rhythmic application. The stiffness values showed good agreement between test-retest procedures when used same order of maximum forces.

Keywords: Biomechanics, lumber spine stiffness, intervertebral manipulation device, visual feedback

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1859
1205 Accent Identification by Clustering and Scoring Formants

Authors: Dejan Stantic, Jun Jo

Abstract:

There have been significant improvements in automatic voice recognition technology. However, existing systems still face difficulties, particularly when used by non-native speakers with accents. In this paper we address a problem of identifying the English accented speech of speakers from different backgrounds. Once an accent is identified the speech recognition software can utilise training set from appropriate accent and therefore improve the efficiency and accuracy of the speech recognition system. We introduced the Q factor, which is defined by the sum of relationships between frequencies of the formants. Four different accents were considered and experimented for this research. A scoring method was introduced in order to effectively analyse accents. The proposed concept indicates that the accent could be identified by analysing their formants.

Keywords: Accent Identification, Formants, Q Factor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2055
1204 A Communication Signal Recognition Algorithm Based on Holder Coefficient Characteristics

Authors: Hui Zhang, Ye Tian, Fang Ye, Ziming Guo

Abstract:

Communication signal modulation recognition technology is one of the key technologies in the field of modern information warfare. At present, communication signal automatic modulation recognition methods are mainly divided into two major categories. One is the maximum likelihood hypothesis testing method based on decision theory, the other is a statistical pattern recognition method based on feature extraction. Now, the most commonly used is a statistical pattern recognition method, which includes feature extraction and classifier design. With the increasingly complex electromagnetic environment of communications, how to effectively extract the features of various signals at low signal-to-noise ratio (SNR) is a hot topic for scholars in various countries. To solve this problem, this paper proposes a feature extraction algorithm for the communication signal based on the improved Holder cloud feature. And the extreme learning machine (ELM) is used which aims at the problem of the real-time in the modern warfare to classify the extracted features. The algorithm extracts the digital features of the improved cloud model without deterministic information in a low SNR environment, and uses the improved cloud model to obtain more stable Holder cloud features and the performance of the algorithm is improved. This algorithm addresses the problem that a simple feature extraction algorithm based on Holder coefficient feature is difficult to recognize at low SNR, and it also has a better recognition accuracy. The results of simulations show that the approach in this paper still has a good classification result at low SNR, even when the SNR is -15dB, the recognition accuracy still reaches 76%.

Keywords: Communication signal, feature extraction, holder coefficient, improved cloud model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 651
1203 Validation Testing for Temporal Neural Networks for RBF Recognition

Authors: Khaled E. A. Negm

Abstract:

A neuron can emit spikes in an irregular time basis and by averaging over a certain time window one would ignore a lot of information. It is known that in the context of fast information processing there is no sufficient time to sample an average firing rate of the spiking neurons. The present work shows that the spiking neurons are capable of computing the radial basis functions by storing the relevant information in the neurons' delays. One of the fundamental findings of the this research also is that when using overlapping receptive fields to encode the data patterns it increases the network-s clustering capacity. The clustering algorithm that is discussed here is interesting from computer science and neuroscience point of view as well as from a perspective.

Keywords: Temporal Neurons, RBF Recognition, Perturbation, On Line Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1454
1202 Reducing the False Rejection Rate of Iris Recognition Using Textural and Topological Features

Authors: M. Vatsa, R. Singh, A. Noore

Abstract:

This paper presents a novel iris recognition system using 1D log polar Gabor wavelet and Euler numbers. 1D log polar Gabor wavelet is used to extract the textural features, and Euler numbers are used to extract topological features of the iris. The proposed decision strategy uses these features to authenticate an individual-s identity while maintaining a low false rejection rate. The algorithm was tested on CASIA iris image database and found to perform better than existing approaches with an overall accuracy of 99.93%.

Keywords: Iris recognition, textural features, topological features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1906
1201 Face Recognition using Features Combination and a New Non-linear Kernel

Authors: Essam Al Daoud

Abstract:

To improve the classification rate of the face recognition, features combination and a novel non-linear kernel are proposed. The feature vector concatenates three different radius of local binary patterns and Gabor wavelet features. Gabor features are the mean, standard deviation and the skew of each scaling and orientation parameter. The aim of the new kernel is to incorporate the power of the kernel methods with the optimal balance between the features. To verify the effectiveness of the proposed method, numerous methods are tested by using four datasets, which are consisting of various emotions, orientations, configuration, expressions and lighting conditions. Empirical results show the superiority of the proposed technique when compared to other methods.

Keywords: Face recognition, Gabor wavelet, LBP, Non-linearkerner

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1507
1200 The Capacity of Mel Frequency Cepstral Coefficients for Speech Recognition

Authors: Fawaz S. Al-Anzi, Dia AbuZeina

Abstract:

Speech recognition is of an important contribution in promoting new technologies in human computer interaction. Today, there is a growing need to employ speech technology in daily life and business activities. However, speech recognition is a challenging task that requires different stages before obtaining the desired output. Among automatic speech recognition (ASR) components is the feature extraction process, which parameterizes the speech signal to produce the corresponding feature vectors. Feature extraction process aims at approximating the linguistic content that is conveyed by the input speech signal. In speech processing field, there are several methods to extract speech features, however, Mel Frequency Cepstral Coefficients (MFCC) is the popular technique. It has been long observed that the MFCC is dominantly used in the well-known recognizers such as the Carnegie Mellon University (CMU) Sphinx and the Markov Model Toolkit (HTK). Hence, this paper focuses on the MFCC method as the standard choice to identify the different speech segments in order to obtain the language phonemes for further training and decoding steps. Due to MFCC good performance, the previous studies show that the MFCC dominates the Arabic ASR research. In this paper, we demonstrate MFCC as well as the intermediate steps that are performed to get these coefficients using the HTK toolkit.

Keywords: Speech recognition, acoustic features, Mel Frequency Cepstral Coefficients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1929
1199 Classification Algorithms in Human Activity Recognition using Smartphones

Authors: Mohd Fikri Azli bin Abdullah, Ali Fahmi Perwira Negara, Md. Shohel Sayeed, Deok-Jai Choi, Kalaiarasi Sonai Muthu

Abstract:

Rapid advancement in computing technology brings computers and humans to be seamlessly integrated in future. The emergence of smartphone has driven computing era towards ubiquitous and pervasive computing. Recognizing human activity has garnered a lot of interest and has raised significant researches- concerns in identifying contextual information useful to human activity recognition. Not only unobtrusive to users in daily life, smartphone has embedded built-in sensors that capable to sense contextual information of its users supported with wide range capability of network connections. In this paper, we will discuss the classification algorithms used in smartphone-based human activity. Existing technologies pertaining to smartphone-based researches in human activity recognition will be highlighted and discussed. Our paper will also present our findings and opinions to formulate improvement ideas in current researches- trends. Understanding research trends will enable researchers to have clearer research direction and common vision on latest smartphone-based human activity recognition area.

Keywords: Classification algorithms, Human Activity Recognition (HAR), Smartphones

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6260
1198 Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition

Authors: Jong Han Joo, Jeong Hun Lee, Young Sun Kim, Jae Young Kang, Seung Ho Choi

Abstract:

In this study, we propose a novel technique for acoustic echo suppression (AES) during speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the estimated echo path transfer function (EPTF) at the current signal segment and to the EPTF estimated until the previous time interval. However, the effects of echo path changes should be considered for eliminating the undesired echoes. We describe a new approach that adaptively updates weight parameters in response to abrupt changes in the acoustic environment due to background noises or double-talk. Furthermore, we devised a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using log-spectral distance measure, as well as cross-correlation coefficients. The experimental results show that the developed techniques can be successfully applied in barge-in speech recognition systems.

Keywords: Acoustic echo suppression, barge-in, speech recognition, echo path transfer function, initial delay estimator, voice activity detector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2277
1197 Students’ Awareness of the Use of Poster, Power Point and Animated Video Presentations: A Case Study of Third Year Students of the Department of English of Batna University

Authors: Bahloul Amel

Abstract:

The present study debates students’ perceptions of the use of technology in learning English as a Foreign Language. Its aim is to explore and understand students’ preparation and presentation of Posters, PowerPoint and Animated Videos by drawing attention to visual and oral elements. The data is collected through observations and semi-structured interviews and analyzed through phenomenological data analysis steps. The themes emerged from the data, visual learning satisfaction in using information and communication technology, providing structure to oral presentation, learning from peers’ presentations, draw attention to using Posters, PowerPoint and Animated Videos as each supports visual learning and organization of thoughts in oral presentations.

Keywords: Animated Videos, EFL, Posters, PowerPoint presentations, Visual Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3016
1196 A Local Invariant Generalized Hough Transform Method for Integrated Circuit Visual Positioning

Authors: Fei Long Wei, Hua Yang, Hai Tao Zhang, Zhou Ping Yin

Abstract:

In this study, an local invariant generalized Houghtransform (LI-GHT) method is proposed for integrated circuit (IC) visual positioning. The original generalized Hough transform (GHT) is robust to external noise; however, it is not suitable for visual positioning of IC chips due to the four-dimensionality (4D) of parameter space which leads to the substantial storage requirement and high computational complexity. The proposed LI-GHT method can reduce the dimensionality of parameter space to 2D thanks to the rotational invariance of local invariant geometric feature and it can estimate the accuracy position and rotation angle of IC chips in real-time under noise and blur influence. The experiment results show that the proposed LI-GHT can estimate position and rotation angle of IC chips with high accuracy and fast speed. The proposed LI-GHT algorithm was implemented in IC visual positioning system of radio frequency identification (RFID) packaging equipment.

Keywords: Integrated Circuit Visual Positioning, Generalized Hough Transform, Local invariant Generalized Hough Transform, ICpacking equipment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2178
1195 Target Tracking by Flying Drone with Fixed Camera

Authors: Guilhem Baccialone, Nicolas Delaunay, Juan-Diego Gonzales, Céline Leclercq, Adrien Leroux, Santa Pallier

Abstract:

This paper presents the software conception of a quadrotor UAV, named SKYWATCHER, which is able to follow a target. This capacity can at a long turn time permit to follow another drone and combine their performance in order to military missions for example.

From a low-cost architecture constructed by five students we implemented a software and added a camera to create a visual servoing. This project demonstrates the possibility to associate the technology of stabilization and the technology of visual enslavement.

Keywords: Quadrotor, visual servoing, student project, image processing, Unmanned Aerial Vehicles, stabilization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2218
1194 Visual Construction of Youth in Czechoslovak Press Photographs: 1959-1989

Authors: Jana Teplá

Abstract:

This text focuses on the visual construction of youth in press photographs in socialist Czechoslovakia. It deals with photographs in a magazine for young readers, Mladý svět, published by the Socialist Union of Youth of Czechoslovakia. The aim of this study was to develop a methodological tool for uncovering the values and the ideological messages in the strategies used in the visual construction of reality in the socialist press. Two methods of visual analysis were applied to the photographs, a quantitative content analysis and a social semiotic analysis. The social semiotic analysis focused on images representing youth in their free time. The study shows that the meaning of a socialist press photograph is a result of a struggle for ideological power between formal and informal ideologies. This struggle takes place within the process of production of the photograph and also within the process of interpretation of the photograph.

Keywords: Ideology, press photography, socialist regime, social semiotics, youth.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 858