Search results for: sound recognition
692 Deep Learning Application for Object Image Recognition and Robot Automatic Grasping
Authors: Shiuh-Jer Huang, Chen-Zon Yan, C. K. Huang, Chun-Chien Ting
Abstract:
Since the vision system application in industrial environment for autonomous purposes is required intensely, the image recognition technique becomes an important research topic. Here, deep learning algorithm is employed in image system to recognize the industrial object and integrate with a 7A6 Series Manipulator for object automatic gripping task. PC and Graphic Processing Unit (GPU) are chosen to construct the 3D Vision Recognition System. Depth Camera (Intel RealSense SR300) is employed to extract the image for object recognition and coordinate derivation. The YOLOv2 scheme is adopted in Convolution neural network (CNN) structure for object classification and center point prediction. Additionally, image processing strategy is used to find the object contour for calculating the object orientation angle. Then, the specified object location and orientation information are sent to robotic controller. Finally, a six-axis manipulator can grasp the specific object in a random environment based on the user command and the extracted image information. The experimental results show that YOLOv2 has been successfully employed to detect the object location and category with confidence near 0.9 and 3D position error less than 0.4 mm. It is useful for future intelligent robotic application in industrial 4.0 environment.
Keywords: Deep learning, image processing, convolution neural network, YOLOv2, 7A6 series manipulator.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1095691 Mining Image Features in an Automatic Two-Dimensional Shape Recognition System
Authors: R. A. Salam, M.A. Rodrigues
Abstract:
The number of features required to represent an image can be very huge. Using all available features to recognize objects can suffer from curse dimensionality. Feature selection and extraction is the pre-processing step of image mining. Main issues in analyzing images is the effective identification of features and another one is extracting them. The mining problem that has been focused is the grouping of features for different shapes. Experiments have been conducted by using shape outline as the features. Shape outline readings are put through normalization and dimensionality reduction process using an eigenvector based method to produce a new set of readings. After this pre-processing step data will be grouped through their shapes. Through statistical analysis, these readings together with peak measures a robust classification and recognition process is achieved. Tests showed that the suggested methods are able to automatically recognize objects through their shapes. Finally, experiments also demonstrate the system invariance to rotation, translation, scale, reflection and to a small degree of distortion.Keywords: Image mining, feature selection, shape recognition, peak measures.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1458690 Methods of Geodesic Distance in Two-Dimensional Face Recognition
Authors: Rachid Ahdid, Said Safi, Bouzid Manaut
Abstract:
In this paper, we present a comparative study of three methods of 2D face recognition system such as: Iso-Geodesic Curves (IGC), Geodesic Distance (GD) and Geodesic-Intensity Histogram (GIH). These approaches are based on computing of geodesic distance between points of facial surface and between facial curves. In this study we represented the image at gray level as a 2D surface in a 3D space, with the third coordinate proportional to the intensity values of pixels. In the classifying step, we use: Neural Networks (NN), K-Nearest Neighbor (KNN) and Support Vector Machines (SVM). The images used in our experiments are from two wellknown databases of face images ORL and YaleB. ORL data base was used to evaluate the performance of methods under conditions where the pose and sample size are varied, and the database YaleB was used to examine the performance of the systems when the facial expressions and lighting are varied.
Keywords: 2D face recognition, Geodesic distance, Iso-Geodesic Curves, Geodesic-Intensity Histogram, facial surface, Neural Networks, K-Nearest Neighbor, Support Vector Machines.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1815689 Optimizing the Capacity of a Convolutional Neural Network for Image Segmentation and Pattern Recognition
Authors: Yalong Jiang, Zheru Chi
Abstract:
In this paper, we study the factors which determine the capacity of a Convolutional Neural Network (CNN) model and propose the ways to evaluate and adjust the capacity of a CNN model for best matching to a specific pattern recognition task. Firstly, a scheme is proposed to adjust the number of independent functional units within a CNN model to make it be better fitted to a task. Secondly, the number of independent functional units in the capsule network is adjusted to fit it to the training dataset. Thirdly, a method based on Bayesian GAN is proposed to enrich the variances in the current dataset to increase its complexity. Experimental results on the PASCAL VOC 2010 Person Part dataset and the MNIST dataset show that, in both conventional CNN models and capsule networks, the number of independent functional units is an important factor that determines the capacity of a network model. By adjusting the number of functional units, the capacity of a model can better match the complexity of a dataset.Keywords: CNN, capsule network, capacity optimization, character recognition, data augmentation; semantic segmentation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 701688 Burnout Recognition for Call Center Agents by Using Skin Color Detection with Hand Poses
Authors: El Sayed A. Sharara, A. Tsuji, K. Terada
Abstract:
Call centers have been expanding and they have influence on activation in various markets increasingly. A call center’s work is known as one of the most demanding and stressful jobs. In this paper, we propose the fatigue detection system in order to detect burnout of call center agents in the case of a neck pain and upper back pain. Our proposed system is based on the computer vision technique combined skin color detection with the Viola-Jones object detector. To recognize the gesture of hand poses caused by stress sign, the YCbCr color space is used to detect the skin color region including face and hand poses around the area related to neck ache and upper back pain. A cascade of clarifiers by Viola-Jones is used for face recognition to extract from the skin color region. The detection of hand poses is given by the evaluation of neck pain and upper back pain by using skin color detection and face recognition method. The system performance is evaluated using two groups of dataset created in the laboratory to simulate call center environment. Our call center agent burnout detection system has been implemented by using a web camera and has been processed by MATLAB. From the experimental results, our system achieved 96.3% for upper back pain detection and 94.2% for neck pain detection.
Keywords: Call center agents, fatigue, skin color detection, face recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1045687 Pattern Recognition of Biological Signals
Authors: Paulo S. Caparelli, Eduardo Costa, Alexsandro S. Soares, Hipolito Barbosa
Abstract:
This paper presents an evolutionary method for designing electronic circuits and numerical methods associated with monitoring systems. The instruments described here have been used in studies of weather and climate changes due to global warming, and also in medical patient supervision. Genetic Programming systems have been used both for designing circuits and sensors, and also for determining sensor parameters. The authors advance the thesis that the software side of such a system should be written in computer languages with a strong mathematical and logic background in order to prevent software obsolescence, and achieve program correctness.Keywords: Pattern recognition, evolutionary computation, biological signal, functional programming.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1744686 Optimized Brain Computer Interface System for Unspoken Speech Recognition: Role of Wernicke Area
Authors: Nassib Abdallah, Pierre Chauvet, Abd El Salam Hajjar, Bassam Daya
Abstract:
In this paper, we propose an optimized brain computer interface (BCI) system for unspoken speech recognition, based on the fact that the constructions of unspoken words rely strongly on the Wernicke area, situated in the temporal lobe. Our BCI system has four modules: (i) the EEG Acquisition module based on a non-invasive headset with 14 electrodes; (ii) the Preprocessing module to remove noise and artifacts, using the Common Average Reference method; (iii) the Features Extraction module, using Wavelet Packet Transform (WPT); (iv) the Classification module based on a one-hidden layer artificial neural network. The present study consists of comparing the recognition accuracy of 5 Arabic words, when using all the headset electrodes or only the 4 electrodes situated near the Wernicke area, as well as the selection effect of the subbands produced by the WPT module. After applying the articial neural network on the produced database, we obtain, on the test dataset, an accuracy of 83.4% with all the electrodes and all the subbands of 8 levels of the WPT decomposition. However, by using only the 4 electrodes near Wernicke Area and the 6 middle subbands of the WPT, we obtain a high reduction of the dataset size, equal to approximately 19% of the total dataset, with 67.5% of accuracy rate. This reduction appears particularly important to improve the design of a low cost and simple to use BCI, trained for several words.Keywords: Brain-computer interface, speech recognition, electroencephalography EEG, Wernicke area, artificial neural network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 918685 Improved Feature Extraction Technique for Handling Occlusion in Automatic Facial Expression Recognition
Authors: Khadijat T. Bamigbade, Olufade F. W. Onifade
Abstract:
The field of automatic facial expression analysis has been an active research area in the last two decades. Its vast applicability in various domains has drawn so much attention into developing techniques and dataset that mirror real life scenarios. Many techniques such as Local Binary Patterns and its variants (CLBP, LBP-TOP) and lately, deep learning techniques, have been used for facial expression recognition. However, the problem of occlusion has not been sufficiently handled, making their results not applicable in real life situations. This paper develops a simple, yet highly efficient method tagged Local Binary Pattern-Histogram of Gradient (LBP-HOG) with occlusion detection in face image, using a multi-class SVM for Action Unit and in turn expression recognition. Our method was evaluated on three publicly available datasets which are JAFFE, CK, SFEW. Experimental results showed that our approach performed considerably well when compared with state-of-the-art algorithms and gave insight to occlusion detection as a key step to handling expression in wild.
Keywords: Automatic facial expression analysis, local binary pattern, LBP-HOG, occlusion detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 783684 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data
Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad
Abstract:
Advances in spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars, and sensors consists of important geographical information that can be used for remote sensing applications such as region planning, disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered as a classification problem, this task can be performed using machine-learning techniques. Despite of many machine-learning algorithms, the classification is done using supervised classifiers such as Support Vector Machines (SVM) as the area of interest is known. We proposed a classification method, which considers neighboring pixels in a region for feature extraction and it evaluates classifications precisely according to neighboring classes for semantic interpretation of region of interest (ROI). A dataset has been created for training and testing purpose; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be on image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.Keywords: Remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2055683 Comparison of the Music Sound System between Thailand and Vietnam
Authors: Sansanee Jasuwan
Abstract:
Thai and Vietnamese music had been influenced and inspired by the traditional Chinese music. Whereby the differences of the tuning systems as well as the music modes are obviously known . The research examined the character of musical instruments, songs and culture between Thai and Vietnamese. An analyzing of songs and modes and the study of tone vibration as well as timbre had been done accurately. This qualitative research is based on documentary and songs analysis, field study, interviews and focus group discussion of Thai and Vietnamese masters. The research aims are to examine the musical instruments and songs of both Thai and Vietnamese as well as the comparison of the sounding system between Thailand and Vietnam. The finding of the research has revealed that there are similarities in certain kinds of instruments but differences in the sound systems regarding songs and scale of Thailand and Vietnam. Both cultural musical instruments are diverse and synthetic combining native and foreign inspiring. An integral part of Vietnam has been highly impacted by Chinese musical convention. Korea, Mongolia and Japan music have also play an active and effectively influenced as their geographical related. Whereas Thailand has been influenced by Chinese and Indian traditional music. Both Thai and Vietnamese musical instruments can be divided into four groups: plucked strings, bowed strings, winds and percussion. Songs from both countries have their own characteristics. They are playing a role in touching people heart in ceremonies, social functions and an essential element of the native performing arts. The Vietnamese music melodies have been influenced by Chinese music and taken the same character as Chinese songs. Thai song has specific identity and variety showed in its unique melody. Pentatonic scales have effectively been used in composing Thai and Vietnamese songs, but in different implementing concept.
Keywords: Music sound system, Thailand, Vietnam.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4386682 2D Spherical Spaces for Face Relighting under Harsh Illumination
Authors: Amr Almaddah, Sadi Vural, Yasushi Mae, Kenichi Ohara, Tatsuo Arai
Abstract:
In this paper, we propose a robust face relighting technique by using spherical space properties. The proposed method is done for reducing the illumination effects on face recognition. Given a single 2D face image, we relight the face object by extracting the nine spherical harmonic bases and the face spherical illumination coefficients. First, an internal training illumination database is generated by computing face albedo and face normal from 2D images under different lighting conditions. Based on the generated database, we analyze the target face pixels and compare them with the training bootstrap by using pre-generated tiles. In this work, practical real time processing speed and small image size were considered when designing the framework. In contrast to other works, our technique requires no 3D face models for the training process and takes a single 2D image as an input. Experimental results on publicly available databases show that the proposed technique works well under severe lighting conditions with significant improvements on the face recognition rates.Keywords: Face synthesis and recognition, Face illumination recovery, 2D spherical spaces, Vision for graphics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1754681 Localizing Acoustic Touch Impacts using Zip-stuffing in Complex k-space Domain
Authors: R. Bremananth, Andy W. H. Khong, A. Chitra
Abstract:
Visualizing sound and noise often help us to determine an appropriate control over the source localization. Near-field acoustic holography (NAH) is a powerful tool for the ill-posed problem. However, in practice, due to the small finite aperture size, the discrete Fourier transform, FFT based NAH couldn-t predict the activeregion- of-interest (AROI) over the edges of the plane. Theoretically few approaches were proposed for solving finite aperture problem. However most of these methods are not quite compatible for the practical implementation, especially near the edge of the source. In this paper, a zip-stuffing extrapolation approach has suggested with 2D Kaiser window. It is operated on wavenumber complex space to localize the predicted sources. We numerically form a practice environment with touch impact databases to test the localization of sound source. It is observed that zip-stuffing aperture extrapolation and 2D window with evanescent components provide more accuracy especially in the small aperture and its derivatives.Keywords: Acoustic source localization, Near-field acoustic holography (NAH), FFT, Extrapolation, k-space wavenumber errors.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1648680 Combined Automatic Speech Recognition and Machine Translation in Business Correspondence Domain for English-Croatian
Authors: Sanja Seljan, Ivan Dunđer
Abstract:
The paper presents combined automatic speech recognition (ASR) of English and machine translation (MT) for English and Croatian and Croatian-English language pairs in the domain of business correspondence. The first part presents results of training the ASR commercial system on English data sets, enriched by error analysis. The second part presents results of machine translation performed by free online tool for English and Croatian and Croatian-English language pairs. Human evaluation in terms of usability is conducted and internal consistency calculated by Cronbach's alpha coefficient, enriched by error analysis. Automatic evaluation is performed by WER (Word Error Rate) and PER (Position-independent word Error Rate) metrics, followed by investigation of Pearson’s correlation with human evaluation.
Keywords: Automatic machine translation, integrated language technologies, quality evaluation, speech recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2912679 Segmentation Problems and Solutions in Printed Degraded Gurmukhi Script
Authors: M. K. Jindal, G. S. Lehal, R. K. Sharma
Abstract:
Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper we have proposed a complete solution for segmenting touching characters in all the three zones of printed Gurmukhi script. A study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis. Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in middle zone, upper zone and lower zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded printed Gurmukhi script. The algorithms proposed in this paper are applicable only to machine printed text. We have also discussed a new and useful technique to segment the horizontally overlapping lines.Keywords: Character Segmentation, Middle Zone, Upper Zone, Lower Zone, Touching Characters, Horizontally Overlapping Lines.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1696678 Modeling and Numerical Simulation of Sound Radiation by the Boundary Element Method
Authors: Costa, E.S., Borges, E.N.M., Afonso, M.M.
Abstract:
The modeling of sound radiation is of fundamental importance for understanding the propagation of acoustic waves and, consequently, develop mechanisms for reducing acoustic noise. The propagation of acoustic waves, are involved in various phenomena such as radiation, absorption, transmission and reflection. The radiation is studied through the linear equation of the acoustic wave that is obtained through the equation for the Conservation of Momentum, equation of State and Continuity. From these equations, is the Helmholtz differential equation that describes the problem of acoustic radiation. In this paper we obtained the solution of the Helmholtz differential equation for an infinite cylinder in a pulsating through free and homogeneous. The analytical solution is implemented and the results are compared with the literature. A numerical formulation for this problem is obtained using the Boundary Element Method (BEM). This method has great power for solving certain acoustical problems in open field, compared to differential methods. BEM reduces the size of the problem, thereby simplifying the input data to be worked and reducing the computational time used.
Keywords: Acoustic radiation, boundary element
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1476677 Using Speech Emotion Recognition as a Longitudinal Biomarker for Alzheimer’s Disease
Authors: Yishu Gong, Liangliang Yang, Jianyu Zhang, Zhengyu Chen, Sihong He, Xusheng Zhang, Wei Zhang
Abstract:
Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that affects millions of people worldwide and is characterized by cognitive decline and behavioral changes. People living with Alzheimer’s disease often find it hard to complete routine tasks. However, there are limited objective assessments that aim to quantify the difficulty of certain tasks for AD patients compared to non-AD people. In this study, we propose to use speech emotion recognition (SER), especially the frustration level as a potential biomarker for quantifying the difficulty patients experience when describing a picture. We build an SER model using data from the IEMOCAP dataset and apply the model to the DementiaBank data to detect the AD/non-AD group difference and perform longitudinal analysis to track the AD disease progression. Our results show that the frustration level detected from the SER model can possibly be used as a cost-effective tool for objective tracking of AD progression in addition to the Mini-Mental State Examination (MMSE) score.
Keywords: Alzheimer’s disease, Speech Emotion Recognition, longitudinal biomarker, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 274676 Applications of Support Vector Machines on Smart Phone Systems for Emotional Speech Recognition
Authors: Wernhuar Tarng, Yuan-Yuan Chen, Chien-Lung Li, Kun-Rong Hsie, Mingteh Chen
Abstract:
An emotional speech recognition system for the applications on smart phones was proposed in this study to combine with 3G mobile communications and social networks to provide users and their groups with more interaction and care. This study developed a mechanism using the support vector machines (SVM) to recognize the emotions of speech such as happiness, anger, sadness and normal. The mechanism uses a hierarchical classifier to adjust the weights of acoustic features and divides various parameters into the categories of energy and frequency for training. In this study, 28 commonly used acoustic features including pitch and volume were proposed for training. In addition, a time-frequency parameter obtained by continuous wavelet transforms was also used to identify the accent and intonation in a sentence during the recognition process. The Berlin Database of Emotional Speech was used by dividing the speech into male and female data sets for training. According to the experimental results, the accuracies of male and female test sets were increased by 4.6% and 5.2% respectively after using the time-frequency parameter for classifying happy and angry emotions. For the classification of all emotions, the average accuracy, including male and female data, was 63.5% for the test set and 90.9% for the whole data set.Keywords: Smart phones, emotional speech recognition, socialnetworks, support vector machines, time-frequency parameter, Mel-scale frequency cepstral coefficients (MFCC).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1842675 Implementation of the SIP Express Router with Mediaproxy Method on VoIP
Authors: Heru Nurwarsito, R. Arief Setyawan, Rakhmadhany Primananda
Abstract:
Voice Over IP (VoIP) is a technology that could pass the voice traffic and data packet form over an IP network. Network can be used for intranet or Internet. Phone calls using VoIP has advantages in terms of cheaper cost of PSTN phone to more than half, because the cost is calculated by the cost of the global nature of the Internet. Session Initiation Protocol (SIP) is a signaling protocol at the application layer which serves to establish, modify, and terminate a multimedia session involving one or more users. This SIP signaling has SIP message in text form that is used for session management by the SIP components, such as User Agent, Registrar, Redirect Server, and Proxy Server. To build a SIP communication is required SIP Express Router (SER) to be able to receive SIP messages, for handling the basic functions of SIP messages. Problems occur when the NAT through which affects the voice communication will be blocked starting from the sound that is not sent or one side of the sound are sent (half duplex). How that could be used to penetrate NAT is to use a given mediaproxy random RTP port to penetrate NAT.Keywords: VoIP, SIP, SIP Express Router, NAT, Mediaproxy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2558674 Decomposition Method for Neural Multiclass Classification Problem
Authors: H. El Ayech, A. Trabelsi
Abstract:
In this article we are going to discuss the improvement of the multi classes- classification problem using multi layer Perceptron. The considered approach consists in breaking down the n-class problem into two-classes- subproblems. The training of each two-class subproblem is made independently; as for the phase of test, we are going to confront a vector that we want to classify to all two classes- models, the elected class will be the strongest one that won-t lose any competition with the other classes. Rates of recognition gotten with the multi class-s approach by two-class-s decomposition are clearly better that those gotten by the simple multi class-s approach.Keywords: Artificial neural network, letter-recognition, Multi class Classification, Multi Layer Perceptron.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1572673 Spectral Analysis of Speech: A New Technique
Authors: Neeta Awasthy, J.P.Saini, D.S.Chauhan
Abstract:
ICA which is generally used for blind source separation problem has been tested for feature extraction in Speech recognition system to replace the phoneme based approach of MFCC. Applying the Cepstral coefficients generated to ICA as preprocessing has developed a new signal processing approach. This gives much better results against MFCC and ICA separately, both for word and speaker recognition. The mixing matrix A is different before and after MFCC as expected. As Mel is a nonlinear scale. However, cepstrals generated from Linear Predictive Coefficient being independent prove to be the right candidate for ICA. Matlab is the tool used for all comparisons. The database used is samples of ISOLET.Keywords: Cepstral Coefficient, Distance measures, Independent Component Analysis, Linear Predictive Coefficients.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1957672 Object Localization in Medical Images Using Genetic Algorithms
Authors: George Karkavitsas, Maria Rangoussi
Abstract:
We present a genetic algorithm application to the problem of object registration (i.e., object detection, localization and recognition) in a class of medical images containing various types of blood cells. The genetic algorithm approach taken here is seen to be most appropriate for this type of image, due to the characteristics of the objects. Successful cell registration results on real life microscope images of blood cells show the potential of the proposed approach.
Keywords: Genetic algorithms, object registration, pattern recognition, blood cell microscope images.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1969671 Facial Expressions Recognition from Complex Background using Face Context and Adaptively Weighted sub-Pattern PCA
Authors: Md. Zahangir Alom, Mei-Lan Piao, Md. Ashraful Alam, Nam Kim, Jae-Hyeung Park
Abstract:
A new approach for facial expressions recognition based on face context and adaptively weighted sub-pattern PCA (Aw-SpPCA) has been presented in this paper. The facial region and others part of the body have been segmented from the complex environment based on skin color model. An algorithm has been proposed to accurate detection of face region from the segmented image based on constant ratio of height and width of face (δ= 1.618). The paper also discusses on new concept to detect the eye and mouth position. The desired part of the face has been cropped to analysis the expression of a person. Unlike PCA based on a whole image pattern, Aw-SpPCA operates directly on its sub patterns partitioned from an original whole pattern and separately extracts features from them. Aw-SpPCA can adaptively compute the contributions of each part and a classification task in order to enhance the robustness to both expression and illumination variations. Experiments on single standard face with five types of facial expression database shows that the proposed method is competitive.
Keywords: Aw-SpPC, Expressoin Recognition, Face context, Face Detection, PCA
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1721670 A Neuron Model of Facial Recognition and Detection of an Authorized Entity Using Machine Learning System
Authors: J. K. Adedeji, M. O. Oyekanmi
Abstract:
This paper has critically examined the use of Machine Learning procedures in curbing unauthorized access into valuable areas of an organization. The use of passwords, pin codes, user’s identification in recent times has been partially successful in curbing crimes involving identities, hence the need for the design of a system which incorporates biometric characteristics such as DNA and pattern recognition of variations in facial expressions. The facial model used is the OpenCV library which is based on the use of certain physiological features, the Raspberry Pi 3 module is used to compile the OpenCV library, which extracts and stores the detected faces into the datasets directory through the use of camera. The model is trained with 50 epoch run in the database and recognized by the Local Binary Pattern Histogram (LBPH) recognizer contained in the OpenCV. The training algorithm used by the neural network is back propagation coded using python algorithmic language with 200 epoch runs to identify specific resemblance in the exclusive OR (XOR) output neurons. The research however confirmed that physiological parameters are better effective measures to curb crimes relating to identities.
Keywords: Biometric characters, facial recognition, neural network, OpenCV.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 695669 Speech Recognition Using Scaly Neural Networks
Authors: Akram M. Othman, May H. Riadh
Abstract:
This research work is aimed at speech recognition using scaly neural networks. A small vocabulary of 11 words were established first, these words are “word, file, open, print, exit, edit, cut, copy, paste, doc1, doc2". These chosen words involved with executing some computer functions such as opening a file, print certain text document, cutting, copying, pasting, editing and exit. It introduced to the computer then subjected to feature extraction process using LPC (linear prediction coefficients). These features are used as input to an artificial neural network in speaker dependent mode. Half of the words are used for training the artificial neural network and the other half are used for testing the system; those are used for information retrieval. The system components are consist of three parts, speech processing and feature extraction, training and testing by using neural networks and information retrieval. The retrieve process proved to be 79.5-88% successful, which is quite acceptable, considering the variation to surrounding, state of the person, and the microphone type.Keywords: Feature extraction, Liner prediction coefficients, neural network, Speech Recognition, Scaly ANN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1738668 Combining Skin Color and Optical Flow for Computer Vision Systems
Authors: Muhammad Raza Ali, Tim Morris
Abstract:
Skin color is an important visual cue for computer vision systems involving human users. In this paper we combine skin color and optical flow for detection and tracking of skin regions. We apply these techniques to gesture recognition with encouraging results. We propose a novel skin similarity measure. For grouping detected skin regions we propose a novel skin region grouping mechanism. The proposed techniques work with any number of skin regions making them suitable for a multiuser scenario.Keywords: Bayesian tracking, chromaticity space, optical flowgesture recognition
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1928667 OCR for Script Identification of Hindi (Devnagari) Numerals using Error Diffusion Halftoning Algorithm with Neural Classifier
Authors: Banashree N. P., Andhe Dharani, R. Vasanta, P. S. Satyanarayana
Abstract:
The applications on numbers are across-the-board that there is much scope for study. The chic of writing numbers is diverse and comes in a variety of form, size and fonts. Identification of Indian languages scripts is challenging problems. In Optical Character Recognition [OCR], machine printed or handwritten characters/numerals are recognized. There are plentiful approaches that deal with problem of detection of numerals/character depending on the sort of feature extracted and different way of extracting them. This paper proposes a recognition scheme for handwritten Hindi (devnagiri) numerals; most admired one in Indian subcontinent our work focused on a technique in feature extraction i.e. Local-based approach, a method using 16-segment display concept, which is extracted from halftoned images & Binary images of isolated numerals. These feature vectors are fed to neural classifier model that has been trained to recognize a Hindi numeral. The archetype of system has been tested on varieties of image of numerals. Experimentation result shows that recognition rate of halftoned images is 98 % compared to binary images (95%).
Keywords: OCR, Halftoning, Neural classifier, 16-segmentdisplay concept.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1716666 Loudspeaker Parameters Inverse Problem for Improving Sound Frequency Response Simulation
Authors: Y. T. Tsai, Jin H. Huang
Abstract:
The sound pressure level (SPL) of the moving-coil loudspeaker (MCL) is often simulated and analyzed using the lumped parameter model. However, the SPL of a MCL cannot be simulated precisely in the high frequency region, because the value of cone effective area is changed due to the geometry variation in different mode shapes, it is also related to affect the acoustic radiation mass and resistance. Herein, the paper presents the inverse method which has a high ability to measure the value of cone effective area in various frequency points, also can estimate the MCL electroacoustic parameters simultaneously. The proposed inverse method comprises the direct problem, adjoint problem, and sensitivity problem in collaboration with nonlinear conjugate gradient method. Estimated values from the inverse method are validated experimentally which compared with the measured SPL curve result. Results presented in this paper not only improve the accuracy of lumped parameter model but also provide the valuable information on loudspeaker cone design.
Keywords: Inverse problem, cone effective area, loudspeaker, nonlinear conjugate gradient method.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2554665 Performance Evaluation of Purely Mechanical Wireless In-Mould Sensor for Injection Moulding
Authors: Florian Müller, Christian Kukla, Thomas Lucyshyn, Clemens Holzer
Abstract:
In this paper, the influencing parameters of a novel purely mechanical wireless in-mould injection moulding sensor were investigated. The sensor is capable of detecting the melt front at predefined locations inside the mould. The sensor comprises a movable pin which acts as the sensor element generating structure-borne sound triggered by the passing melt front. Due to the sensor design, melt pressure is the driving force. For pressure level measurement during pin movement a pressure transducer located at the same position as the movable pin. By deriving a mathematical model for the mechanical movement, dominant process parameters could be investigated towards their impact on the melt front detection characteristic. It was found that the sensor is not affected by the investigated parameters enabling it for reliable melt front detection. In addition, it could be proved that the novel sensor is in comparable range to conventional melt front detection sensors.
Keywords: Injection Moulding, In-Mould Sensor, Structure-Borne Sound, Wireless Sensor
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2069664 Recognition of Isolated Speech Signals using Simplified Statistical Parameters
Authors: Abhijit Mitra, Bhargav Kumar Mitra, Biswajoy Chatterjee
Abstract:
We present a novel scheme to recognize isolated speech signals using certain statistical parameters derived from those signals. The determination of the statistical estimates is based on extracted signal information rather than the original signal information in order to reduce the computational complexity. Subtle details of these estimates, after extracting the speech signal from ambience noise, are first exploited to segregate the polysyllabic words from the monosyllabic ones. Precise recognition of each distinct word is then carried out by analyzing the histogram, obtained from these information.Keywords: Isolated speech signals, Block overlapping technique, Positive peaks, Histogram analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1428663 Robust Heart Sounds Segmentation Based on the Variation of the Phonocardiogram Curve Length
Authors: Mecheri Zeid Belmecheri, Maamar Ahfir, Izzet Kale
Abstract:
Automatic cardiac auscultation is still a subject of research in order to establish an objective diagnosis. Recorded heart sounds as Phonocardiogram (PCG) signals can be used for automatic segmentation into components that have clinical meanings. These are the first sound, S1, the second sound, S2, and the systolic and diastolic components, respectively. In this paper, an automatic method is proposed for the robust segmentation of heart sounds. This method is based on calculating an intermediate sawtooth-shaped signal from the length variation of the recorded PCG signal in the time domain and, using its positive derivative function that is a binary signal in training a Recurrent Neural Network (RNN). Results obtained in the context of a large database of recorded PCGs with their simultaneously recorded Electrocardiograms (ECGs) from different patients in clinical settings, including normal and abnormal subjects, show on average a segmentation testing performance average of 76% sensitivity and 94% specificity.
Keywords: Heart sounds, PCG segmentation, event detection, Recurrent Neural Networks, PCG curve length.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 322