Search results for: speaker adaptation; environment adaptation; robust speech recognition; SVD; non-native speech recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4361

Search results for: speaker adaptation; environment adaptation; robust speech recognition; SVD; non-native speech recognition

4001 Horizontal Aspects of Planning Climate Change Adapted Management of Wetlands

Authors: Ákos Malatinszky, Szilvia Ádám

Abstract:

Climate change causes severe effects on natural habitats, especially wetlands. These challenges require the adaptation of their management to probable effects of climate change. A compilation of necessary changes in land management was collected in a Hungarian area being both national park and Natura 2000 SAC and SCI site in favor of increasing the resilience and reducing vulnerability. Several factors, such as ecological aspects, nature conservation and climatic adaptation should be combined with social and economic factors during the process of developing climate change adapted management on vulnerable wetlands. Planning adaptive management should be determined by a priority order of conservation aims and evaluation of factors at the determined planning unit. Mowing techniques, frequency and exact date should be observed as well as grazing species and their breed, due to different grazing, group forming and trampling habits. Integrating landscape history and historical land development into the planning process is essential.

Keywords: Adaptation, climate change, management, wetland.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1648
4000 Representing Collective Unconsciousness Using Neural Networks

Authors: Pierre Abou-Haila, Richard Hall, Mark Dawes

Abstract:

Instead of representing individual cognition only, population cognition is represented using artificial neural networks whilst maintaining individuality. This population network trains continuously, simulating adaptation. An implementation of two coexisting populations is compared to the Lotka-Volterra model of predator-prey interaction. Applications include multi-agent systems such as artificial life or computer games.

Keywords: Collective unconsciousness, neural networks, adaptation, predator-prey simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1784
3999 Speaker Identification by Atomic Decomposition of Learned Features Using Computational Auditory Scene Analysis Principals in Noisy Environments

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

Speaker recognition is performed in high Additive White Gaussian Noise (AWGN) environments using principals of Computational Auditory Scene Analysis (CASA). CASA methods often classify sounds from images in the time-frequency (T-F) plane using spectrograms or cochleargrams as the image. In this paper atomic decomposition implemented by matching pursuit performs a transform from time series speech signals to the T-F plane. The atomic decomposition creates a sparsely populated T-F vector in “weight space” where each populated T-F position contains an amplitude weight. The weight space vector along with the atomic dictionary represents a denoised, compressed version of the original signal. The arraignment or of the atomic indices in the T-F vector are used for classification. Unsupervised feature learning implemented by a sparse autoencoder learns a single dictionary of basis features from a collection of envelope samples from all speakers. The approach is demonstrated using pairs of speakers from the TIMIT data set. Pairs of speakers are selected randomly from a single district. Each speak has 10 sentences. Two are used for training and 8 for testing. Atomic index probabilities are created for each training sentence and also for each test sentence. Classification is performed by finding the lowest Euclidean distance between then probabilities from the training sentences and the test sentences. Training is done at a 30dB Signal-to-Noise Ratio (SNR). Testing is performed at SNR’s of 0 dB, 5 dB, 10 dB and 30dB. The algorithm has a baseline classification accuracy of ~93% averaged over 10 pairs of speakers from the TIMIT data set. The baseline accuracy is attributable to short sequences of training and test data as well as the overall simplicity of the classification algorithm. The accuracy is not affected by AWGN and produces ~93% accuracy at 0dB SNR.

Keywords: Time-frequency plane, atomic decomposition, envelope sampling, Gabor atoms, matching pursuit, sparse dictionary learning, sparse autoencoder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1534
3998 A Robust Diverged Localization and Recognition of License Registration Characters

Authors: M. Sankari, R. Bremananth, C.Meena

Abstract:

Localization and Recognition of License registration characters from the moving vehicle is a computationally complex task in the field of machine vision and is of substantial interest because of its diverse applications such as cross border security, law enforcement and various other intelligent transportation applications. Previous research used the plate specific details such as aspect ratio, character style, color or dimensions of the plate in the complex task of plate localization. In this paper, license registration character is localized by Enhanced Weight based density map (EWBDM) method, which is independent of such constraints. In connection with our previous method, this paper proposes a method that relaxes constraints in lighting conditions, different fonts of character occurred in the plate and plates with hand-drawn characters in various aspect quotients. The robustness of this method is well suited for applications where the appearance of plates seems to be varied widely. Experimental results show that this approach is suited for recognizing license plates in different external environments. 

Keywords: Character segmentation, Connectivity checking, Edge detection, Image analysis, license plate localization, license number recognition, Quality frame selection

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1858
3997 Morphological Description of Cervical Cell Images for the Pathological Recognition

Authors: N. Lassouaoui, L. Hamami, N. Nouali

Abstract:

The tracking allows to detect the tumor affections of cervical cancer, it is particularly complex and consuming time, because it consists in seeking some abnormal cells among a cluster of normal cells. In this paper, we present our proposed computer system for helping the doctors in tracking the cervical cancer. Knowing that the diagnosis of the malignancy is based in the set of atypical morphological details of all cells, herein, we present an unsupervised genetic algorithm for the separation of cell components since the diagnosis is doing by analysis of the core and the cytoplasm. We give also the various algorithms used for computing the morphological characteristics of cells (Ratio core/cytoplasm, cellular deformity, ...) necessary for the recognition of illness.

Keywords: Cervical cell, morphological analysis, recognition, segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1901
3996 An Evaluation of Neural Network Efficacies for Image Recognition on Edge-AI Computer Vision Platform

Authors: Jie Zhao, Meng Su

Abstract:

Image recognition enables machine-like robotics to understand a scene and plays an important role in computer vision applications. Computer vision platforms as physical infrastructure, supporting Neural Networks for image recognition, are deterministic to leverage the performance of different Neural Networks. In this paper, three different computer vision platforms – edge AI (Jetson Nano, with 4GB), a standalone laptop (with RTX 3000s, using CUDA), and a web-based device (Google Colab, using GPU) are investigated. In the case study, four prominent neural network architectures (including AlexNet, VGG16, GoogleNet, and ResNet (34/50)), are deployed. By using public ImageNets (Cifar-10), our findings provide a nuanced perspective on optimizing image recognition tasks across Edge-AI platforms, offering guidance on selecting appropriate neural network structures to maximize performance under hardware constraints.

Keywords: AlexNet, VGG, GoogleNet, ResNet, ImageNet, Cifar-10, Edge AI, Jetson Nano, CUDA, GPU.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 121
3995 Speech Encryption and Decryption Using Linear Feedback Shift Register (LFSR)

Authors: Tin Lai Win, Nant Christina Kyaw

Abstract:

This paper is taken into consideration the problem of cryptanalysis of stream ciphers. There is some attempts need to improve the existing attacks on stream cipher and to make an attempt to distinguish the portions of cipher text obtained by the encryption of plain text in which some parts of the text are random and the rest are non-random. This paper presents a tutorial introduction to symmetric cryptography. The basic information theoretic and computational properties of classic and modern cryptographic systems are presented, followed by an examination of the application of cryptography to the security of VoIP system in computer networks using LFSR algorithm. The implementation program will be developed Java 2. LFSR algorithm is appropriate for the encryption and decryption of online streaming data, e.g. VoIP (voice chatting over IP). This paper is implemented the encryption module of speech signals to cipher text and decryption module of cipher text to speech signals.

Keywords: Linear Feedback Shift Register.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3082
3994 Face Texture Reconstruction for Illumination Variant Face Recognition

Authors: Pengfei Xiong, Lei Huang, Changping Liu

Abstract:

In illumination variant face recognition, existing methods extracting face albedo as light normalized image may lead to loss of extensive facial details, with light template discarded. To improve that, a novel approach for realistic facial texture reconstruction by combining original image and albedo image is proposed. First, light subspaces of different identities are established from the given reference face images; then by projecting the original and albedo image into each light subspace respectively, texture reference images with corresponding lighting are reconstructed and two texture subspaces are formed. According to the projections in texture subspaces, facial texture with normal light can be synthesized. Due to the combination of original image, facial details can be preserved with face albedo. In addition, image partition is applied to improve the synthesization performance. Experiments on Yale B and CMUPIE databases demonstrate that this algorithm outperforms the others both in image representation and in face recognition.

Keywords: texture reconstruction, illumination, face recognition, subspaces

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1451
3993 A Human Activity Recognition System Based On Sensory Data Related to Object Usage

Authors: M. Abdullah-Al-Wadud

Abstract:

Sensor-based Activity Recognition systems usually accounts which sensors have been activated to perform an activity. The system then combines the conditional probabilities of those sensors to represent different activities and takes the decision based on that. However, the information about the sensors which are not activated may also be of great help in deciding which activity has been performed. This paper proposes an approach where the sensory data related to both usage and non-usage of objects are utilized to make the classification of activities. Experimental results also show the promising performance of the proposed method.

Keywords: Naïve Bayesian-based classification, Activity recognition, sensor data, object-usage model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1792
3992 Player Number Localization and Recognition in Soccer Video using HSV Color Space and Internal Contours

Authors: Matko Šaric, Hrvoje Dujmic, Vladan Papic, Nikola Rožic

Abstract:

Detection of player identity is challenging task in sport video content analysis. In case of soccer video player number recognition is effective and precise solution. Jersey numbers can be considered as scene text and difficulties in localization and recognition appear due to variations in orientation, size, illumination, motion etc. This paper proposed new method for player number localization and recognition. By observing hue, saturation and value for 50 different jersey examples we noticed that most often combination of low and high saturated pixels is used to separate number and jersey region. Image segmentation method based on this observation is introduced. Then, novel method for player number localization based on internal contours is proposed. False number candidates are filtered using area and aspect ratio. Before OCR processing extracted numbers are enhanced using image smoothing and rotation normalization.

Keywords: player number, soccer video, HSV color space

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1952
3991 Common Acceptable Cuisine in Multicultural Countries: Towards Building the National Food Identity

Authors: Mohd Zulhilmi Suhaimi, Mohd Salehuddin Mohd Zahari

Abstract:

Common acceptable cuisine usually discussed in the multicultural/ethnic nation as it represents the process of sharing it among the ethnic groups. The common acceptable cuisine is also considered as a precursor in the process of constructing the national food identity within ethnic groups in the multicultural countries. The adaptation of certain ethnic cuisines through its types of food, methods of cooking, ingredients and eating decorum by ethnic groups is believed creating or enhancing the process of formation on common acceptable cuisines in a multicultural country. Malaysia as the multicultural country without doubt is continuing to experience cross-culturing processes among the ethnic groups including cuisine. This study empirically investigates the adaptation level of Malay, Chinese and Indian chefs on each other ethnic cuisine attributes toward the formation on common acceptable cuisines and national food identity.

Keywords: Common acceptable cuisine, adaptation, ethnic, food, identity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3028
3990 Learning Objects Content Presentation Adaptation Model Considering Students' Learning Styles

Authors: Zenaide Carvalho da Silva, Andrey Ricardo Pimentel, Leandro Rodrigues Ferreira

Abstract:

Learning styles (LSs) correspond to the individual preferences of a person regarding the modes and forms in which he/she prefers to learn throughout the teaching/learning process. The content presentation of learning objects (LOs) using knowledge about the students’ LSs offers them digital educational resources tailored to their individual learning preferences. In this context, the most relevant characteristics of the LSs along with the most appropriate forms of LOs' content presentation were mapped and associated. Such was performed in order to define the composition of an adaptive model of LO's content presentation considering the LSs, which was called Adaptation of Content Presentation of Learning Objects Considering Learning Styles (ACPLOLS). LO prototypes were created with interfaces that were adapted to students' LSs. These prototypes were based on a model created for validation of the approaches that were used, which were established through experiments with the students. The results of subjective measures of students' emotional responses demonstrated that the ACPLOLS has reached the desired results in relation to the adequacy of the LOs interface, in accordance with the Felder-Silverman LSs Model.

Keywords: Adaptation, interface, learning styles, learning objects, students.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 490
3989 Printed Arabic Sub-Word Recognition Using Moments

Authors: Ibrahim A. El rube, Mohamed T. El Sonni, Soha S. Saleh

Abstract:

the cursive nature of the Arabic writing makes it difficult to accurately segment characters or even deal with the whole word efficiently. Therefore, in this paper, a printed Arabic sub-word recognition system is proposed. The suggested algorithm utilizes geometrical moments as descriptors for the separated sub-words. Three types of moments are investigated and applied to the printed sub-word images after dividing each image into multiple parts using windowing. Since moments are global descriptors, the windowing mechanism allows the moments to be applied to local regions of the sub-word. The local-global mixture of the proposed scheme increases the discrimination power of the moments while keeping the simplicity and ease of use of moments.

Keywords: Arabic sub-word recognition, windowing, aspectratio, moments.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1531
3988 Matrix-Interleaved Serially Concatenated Block Codes for Speech Transmission in Fixed Wireless Communication Systems

Authors: F. Mehran

Abstract:

In this paper, we study a class of serially concatenated block codes (SCBC) based on matrix interleavers, to be employed in fixed wireless communication systems. The performances of SCBC¬coded systems are investigated under various interleaver dimensions. Numerical results reveal that the matrix interleaver could be a competitive candidate over conventional block interleaver for frame lengths of 200 bits; hence, the SCBC coding based on matrix interleaver is a promising technique to be employed for speech transmission applications in many international standards such as pan-European Global System for Mobile communications (GSM), Digital Cellular Systems (DCS) 1800, and Joint Detection Code Division Multiple Access (JD-CDMA) mobile radio systems, where the speech frame contains around 200 bits.

Keywords: Matrix Interleaver, serial concatenated block codes (SCBC), turbo codes, wireless communications.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1906
3987 Reading against the Grain: Transcodifying Stimulus Meaning

Authors: Aba-Carina Pârlog

Abstract:

The paper shows that on transferring sense from the SL to the TL, the translator’s reading against the grain determines the creation of a faulty pattern of rendering the original meaning in the receiving culture which reflects the use of misleading transformative codes. In this case, the translator is a writer per se who decides what goes in and out of the book, how the style is to be ciphered and what elements of ideology are to be highlighted. The paper also proves that figurative language must not be flattened for the sake of clarity or naturalness. The missing figurative elements make the translated text less interesting, less challenging and less vivid which reflects poorly on the writer. There is a close connection between style and the writer’s person. If the writer’s style is very much altered in a translation, the translation is useless as the original writer and his / her imaginative world can no longer be discovered. The purpose of the paper is to prove that adaptation is a dangerous tool which leads to variants that sometimes reflect the original less than the reader would wish to. It contradicts the very essence of the process of translation which is that of making an original work available in a foreign language. If the adaptive transformative codes are so flexible that they encourage the translator to repeatedly leave out parts of the original work, then a subversive pattern emerges which changes the entire book. In conclusion, as a result of using adaptation, manipulative or subversive effects are created in the translated work. This is generally achieved by adding new words or connotations, creating new figures of speech or using explicitations. The additional meanings of the original work are neglected and the translator creates new meanings, implications, emphases and contexts. Again s/he turns into a new author who enjoys the freedom of expressing his / her own ideas without the constraints of the original text. Reading against the grain is unadvisable during the process of translation and consequently, following personal common sense becomes essential in the field of translation as well as everywhere else, so that translation should not become a source of fantasy.

Keywords: Speculative aesthetics, substance of expression, transformative code, translation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1623
3986 A Self Configuring System for Object Recognition in Color Images

Authors: Michela Lecca

Abstract:

System MEMORI automatically detects and recognizes rotated and/or rescaled versions of the objects of a database within digital color images with cluttered background. This task is accomplished by means of a region grouping algorithm guided by heuristic rules, whose parameters concern some geometrical properties and the recognition score of the database objects. This paper focuses on the strategies implemented in MEMORI for the estimation of the heuristic rule parameters. This estimation, being automatic, makes the system a highly user-friendly tool.

Keywords: Automatic object recognition, clustering, content based image retrieval system, image segmentation, region adjacency graph, region grouping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1379
3985 Automatic Number Plate Recognition System Based on Deep Learning

Authors: T. Damak, O. Kriaa, A. Baccar, M. A. Ben Ayed, N. Masmoudi

Abstract:

In the last few years, Automatic Number Plate Recognition (ANPR) systems have become widely used in the safety, the security, and the commercial aspects. Forethought, several methods and techniques are computing to achieve the better levels in terms of accuracy and real time execution. This paper proposed a computer vision algorithm of Number Plate Localization (NPL) and Characters Segmentation (CS). In addition, it proposed an improved method in Optical Character Recognition (OCR) based on Deep Learning (DL) techniques. In order to identify the number of detected plate after NPL and CS steps, the Convolutional Neural Network (CNN) algorithm is proposed. A DL model is developed using four convolution layers, two layers of Maxpooling, and six layers of fully connected. The model was trained by number image database on the Jetson TX2 NVIDIA target. The accuracy result has achieved 95.84%.

Keywords: Automatic number plate recognition, character segmentation, convolutional neural network, CNN, deep learning, number plate localization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1230
3984 Automatic Recognition of an Unknown and Time-Varying Number of Simultaneous Environmental Sound Sources

Authors: S. Ntalampiras, I. Potamitis, N. Fakotakis, S. Kouzoupis

Abstract:

The present work faces the problem of automatic enumeration and recognition of an unknown and time-varying number of environmental sound sources while using a single microphone. The assumption that is made is that the sound recorded is a realization of sound sources belonging to a group of audio classes which is known a-priori. We describe two variations of the same principle which is to calculate the distance between the current unknown audio frame and all possible combinations of the classes that are assumed to span the soundscene. We concentrate on categorizing environmental sound sources, such as birds, insects etc. in the task of monitoring the biodiversity of a specific habitat.

Keywords: automatic recognition of multiple sound sources, enumeration of sound sources, computational ecology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1526
3983 Technological Environment - International Marketing Strategy Relationship

Authors: Suthawan Chirapanda

Abstract:

International trade involves both large and small firms engaged in business overseas. Possible drivers that force companies to enter international markets include increasing competition at the domestic market, maturing domestic markets, and limited domestic market opportunities. Technology is an important driving factor in shaping international marketing strategy as well as in driving force towards a more global marketplace, especially technology in communication. It includes telephones, the internet, computer systems and e-mail. There are three main marketing strategy choices, namely standardization approach, adaptation approach and middleof- the-road approach that companies implement to overseas markets. The decision depends on situations and factors facing the companies in the international markets. In this paper, the contingency concept is considered that no single strategy can be effective in all contexts. The effect of strategy on performance depends on specific situational variables. Strategic fit is employed to investigate export marketing strategy adaptation under certain environmental conditions, which in turn can lead to superior performance.

Keywords: Contingency approach, international marketing strategy, strategic fit, technological environment

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6718
3982 Realtime Lip Contour Tracking For Audio-Visual Speech Recognition Applications

Authors: Mehran Yazdi, Mehdi Seyfi, Amirhossein Rafati, Meghdad Asadi

Abstract:

Detection and tracking of the lip contour is an important issue in speechreading. While there are solutions for lip tracking once a good contour initialization in the first frame is available, the problem of finding such a good initialization is not yet solved automatically, but done manually. We have developed a new tracking solution for lip contour detection using only few landmarks (15 to 25) and applying the well known Active Shape Models (ASM). The proposed method is a new LMS-like adaptive scheme based on an Auto regressive (AR) model that has been fit on the landmark variations in successive video frames. Moreover, we propose an extra motion compensation model to address more general cases in lip tracking. Computer simulations demonstrate a fair match between the true and the estimated spatial pixels. Significant improvements related to the well known LMS approach has been obtained via a defined Frobenius norm index.

Keywords: Lip contour, Tracking, LMS-Like

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1765
3981 Gender Differences in Spatial Navigation

Authors: Bia Kim, Sewon Lee, Jaesik Lee

Abstract:

This study aims to investigate the gender differences in spatial navigation using the tasks of 2-D matrix navigation and recognition of real driving scene. The results can be summarized as followings. First, female subjects responded faster in 2-D matrix navigation task than male subjects when landmark instructions were provided. Second, in recognition task, male subjects recognized the key elements involved in the past driving scene more accurately than female subjects. In particular, female subjects tended to miss peripheral information. These results suggest the possibility of gender differences in spatial navigation.

Keywords: Gender differences, Spatial navigation, 2-D matrixnavigation, Recognition of driving scene.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2703
3980 Parametric Primitives for Hand Gesture Recognition

Authors: Sanmohan Krüger, Volker Krüger

Abstract:

Imitation learning is considered to be an effective way of teaching humanoid robots and action recognition is the key step to imitation learning. In this paper an online algorithm to recognize parametric actions with object context is presented. Objects are key instruments in understanding an action when there is uncertainty. Ambiguities arising in similar actions can be resolved with objectn context. We classify actions according to the changes they make to the object space. Actions that produce the same state change in the object movement space are classified to belong to the same class. This allow us to define several classes of actions where members of each class are connected with a semantic interpretation.

Keywords: Parametric actions, Action primitives, Hand gesture recognition, Imitation learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1462
3979 Improved Dynamic Bayesian Networks Applied to Arabic on Line Characters Recognition

Authors: Redouane Tlemsani, Abdelkader Benyettou

Abstract:

Work is in on line Arabic character recognition and the principal motivation is to study the Arab manuscript with on line technology.

This system is a Markovian system, which one can see as like a Dynamic Bayesian Network (DBN). One of the major interests of these systems resides in the complete models training (topology and parameters) starting from training data.

Our approach is based on the dynamic Bayesian Networks formalism. The DBNs theory is a Bayesians networks generalization to the dynamic processes. Among our objective, amounts finding better parameters, which represent the links (dependences) between dynamic network variables.

In applications in pattern recognition, one will carry out the fixing of the structure, which obliges us to admit some strong assumptions (for example independence between some variables). Our application will relate to the Arabic isolated characters on line recognition using our laboratory database: NOUN. A neural tester proposed for DBN external optimization.

The DBN scores and DBN mixed are respectively 70.24% and 62.50%, which lets predict their further development; other approaches taking account time were considered and implemented until obtaining a significant recognition rate 94.79%.

Keywords: Arabic on line character recognition, dynamic Bayesian network, pattern recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1713
3978 Extended Set of DCT-TPLBP and DCT-FPLBP for Face Recognition

Authors: El Mahdi Barrah, Said Safi, Abdessamad Malaoui

Abstract:

In this paper, we describe an application for face recognition. Many studies have used local descriptors to characterize a face, the performance of these local descriptors remain low by global descriptors (working on the entire image). The application of local descriptors (cutting image into blocks) must be able to store both the advantages of global and local methods in the Discrete Cosine Transform (DCT) domain. This system uses neural network techniques. The letter method provides a good compromise between the two approaches in terms of simplifying of calculation and classifying performance. Finally, we compare our results with those obtained from other local and global conventional approaches.

Keywords: Face detection, face recognition, discrete cosine transform (DCT), FPLBP, TPLBP, NN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1940
3977 Component-based Segmentation of Words from Handwritten Arabic Text

Authors: Jawad H AlKhateeb, Jianmin Jiang, Jinchang Ren, Stan S Ipson

Abstract:

Efficient preprocessing is very essential for automatic recognition of handwritten documents. In this paper, techniques on segmenting words in handwritten Arabic text are presented. Firstly, connected components (ccs) are extracted, and distances among different components are analyzed. The statistical distribution of this distance is then obtained to determine an optimal threshold for words segmentation. Meanwhile, an improved projection based method is also employed for baseline detection. The proposed method has been successfully tested on IFN/ENIT database consisting of 26459 Arabic words handwritten by 411 different writers, and the results were promising and very encouraging in more accurate detection of the baseline and segmentation of words for further recognition.

Keywords: Arabic OCR, off-line recognition, Baseline estimation, Word segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2174
3976 Iris Recognition Based On the Low Order Norms of Gradient Components

Authors: Iman A. Saad, Loay E. George

Abstract:

Iris pattern is an important biological feature of human body; it becomes very hot topic in both research and practical applications. In this paper, an algorithm is proposed for iris recognition and a simple, efficient and fast method is introduced to extract a set of discriminatory features using first order gradient operator applied on grayscale images. The gradient based features are robust, up to certain extents, against the variations may occur in contrast or brightness of iris image samples; the variations are mostly occur due lightening differences and camera changes. At first, the iris region is located, after that it is remapped to a rectangular area of size 360x60 pixels. Also, a new method is proposed for detecting eyelash and eyelid points; it depends on making image statistical analysis, to mark the eyelash and eyelid as a noise points. In order to cover the features localization (variation), the rectangular iris image is partitioned into N overlapped sub-images (blocks); then from each block a set of different average directional gradient densities values is calculated to be used as texture features vector. The applied gradient operators are taken along the horizontal, vertical and diagonal directions. The low order norms of gradient components were used to establish the feature vector. Euclidean distance based classifier was used as a matching metric for determining the degree of similarity between the features vector extracted from the tested iris image and template features vectors stored in the database. Experimental tests were performed using 2639 iris images from CASIA V4-Interival database, the attained recognition accuracy has reached up to 99.92%.

Keywords: Iris recognition, contrast stretching, gradient features, texture features, Euclidean metric.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1927
3975 Indian License Plate Detection and Recognition Using Morphological Operation and Template Matching

Authors: W. Devapriya, C. Nelson Kennedy Babu, T. Srihari

Abstract:

Automatic License plate recognition (ALPR) is a technology which recognizes the registration plate or number plate or License plate of a vehicle. In this paper, an Indian vehicle number plate is mined and the characters are predicted in efficient manner. ALPR involves four major technique i) Pre-processing ii) License Plate Location Identification iii) Individual Character Segmentation iv) Character Recognition. The opening phase, named pre-processing helps to remove noises and enhances the quality of the image using the conception of Morphological Operation and Image subtraction. The second phase, the most puzzling stage ascertain the location of license plate using the protocol Canny Edge detection, dilation and erosion. In the third phase, each characters characterized by Connected Component Approach (CCA) and in the ending phase, each segmented characters are conceptualized using cross correlation template matching- a scheme specifically appropriate for fixed format. Major application of ALPR is Tolling collection, Border Control, Parking, Stolen cars, Enforcement, Access Control, Traffic control. The database consists of 500 car images taken under dissimilar lighting condition is used. The efficiency of the system is 97%. Our future focus is Indian Vehicle License Plate Validation (Whether License plate of a vehicle is as per Road transport and highway standard).

Keywords: Automatic License plate recognition, Character recognition, Number plate Recognition, Template matching, morphological operation, canny edge detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2358
3974 Interactive Shadow Play Animation System

Authors: Bo Wan, Xiu Wen, Lingling An, Xiaoling Ding

Abstract:

The paper describes a Chinese shadow play animation system based on Kinect. Users, without any professional training, can personally manipulate the shadow characters to finish a shadow play performance by their body actions and get a shadow play video through giving the record command to our system if they want. In our system, Kinect is responsible for capturing human movement and voice commands data. Gesture recognition module is used to control the change of the shadow play scenes. After packaging the data from Kinect and the recognition result from gesture recognition module, VRPN transmits them to the server-side. At last, the server-side uses the information to control the motion of shadow characters and video recording. This system not only achieves human-computer interaction, but also realizes the interaction between people. It brings an entertaining experience to users and easy to operate for all ages. Even more important is that the application background of Chinese shadow play embodies the protection of the art of shadow play animation.

Keywords: Gesture recognition, Kinect, shadow play animation, VRPN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2661
3973 SIFT Accordion: A Space-Time Descriptor Applied to Human Action Recognition

Authors: Olfa.Ben Ahmed, Mahmoud. Mejdoub, Chokri. Ben Amar

Abstract:

Recognizing human action from videos is an active field of research in computer vision and pattern recognition. Human activity recognition has many potential applications such as video surveillance, human machine interaction, sport videos retrieval and robot navigation. Actually, local descriptors and bag of visuals words models achieve state-of-the-art performance for human action recognition. The main challenge in features description is how to represent efficiently the local motion information. Most of the previous works focus on the extension of 2D local descriptors on 3D ones to describe local information around every interest point. In this paper, we propose a new spatio-temporal descriptor based on a spacetime description of moving points. Our description is focused on an Accordion representation of video which is well-suited to recognize human action from 2D local descriptors without the need to 3D extensions. We use the bag of words approach to represent videos. We quantify 2D local descriptor describing both temporal and spatial features with a good compromise between computational complexity and action recognition rates. We have reached impressive results on publicly available action data set

Keywords: Accordion, Bag of Features, Human action, Motion, Moving point, Space-Time Descriptor, SIFT, Video.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2081
3972 ADABeV: Automatic Detection of Abnormal Behavior in Video-surveillance

Authors: Nour Charara, Iman Jarkass, Maria Sokhn, Elena Mugellini, Omar Abou Khaled

Abstract:

Intelligent Video-Surveillance (IVS) systems are being more and more popular in security applications. The analysis and recognition of abnormal behaviours in a video sequence has gradually drawn the attention in the field of IVS, since it allows filtering out a large number of useless information, which guarantees the high efficiency in the security protection, and save a lot of human and material resources. We present in this paper ADABeV, an intelligent video-surveillance framework for event recognition in crowded scene to detect the abnormal human behaviour. This framework is attended to be able to achieve real-time alarming, reducing the lags in traditional monitoring systems. This architecture proposal addresses four main challenges: behaviour understanding in crowded scenes, hard lighting conditions, multiple input kinds of sensors and contextual-based adaptability to recognize the active context of the scene.

Keywords: Behavior recognition, Crowded scene, Data fusion, Pattern recognition, Video-surveillance

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3600