Search results for: speech encryption.
74 A Talking Head System for Korean Text
Authors: Sang-Wan Kim, Hoon Lee, Kyung-Ho Choi, Soon-Young Park
Abstract:
A talking head system (THS) is presented to animate the face of a speaking 3D avatar in such a way that it realistically pronounces the given Korean text. The proposed system consists of SAPI compliant text-to-speech (TTS) engine and MPEG-4 compliant face animation generator. The input to the THS is a unicode text that is to be spoken with synchronized lip shape. The TTS engine generates a phoneme sequence with their duration and audio data. The TTS applies the coarticulation rules to the phoneme sequence and sends a mouth animation sequence to the face modeler. The proposed THS can make more natural lip sync and facial expression by using the face animation generator than those using the conventional visemes only. The experimental results show that our system has great potential for the implementation of talking head for Korean text.Keywords: Talking head, Lip sync, TTS, MPEG4.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 149173 Adaptive Filtering in Subbands for Supervised Source Separation
Authors: Bruna Luisa Ramos Prado Vasques, Mariane Rembold Petraglia, Antonio Petraglia
Abstract:
This paper investigates MIMO (Multiple-Input Multiple-Output) adaptive filtering techniques for the application of supervised source separation in the context of convolutive mixtures. From the observation that there is correlation among the signals of the different mixtures, an improvement in the NSAF (Normalized Subband Adaptive Filter) algorithm is proposed in order to accelerate its convergence rate. Simulation results with mixtures of speech signals in reverberant environments show the superior performance of the proposed algorithm with respect to the performances of the NLMS (Normalized Least-Mean-Square) and conventional NSAF, considering both the convergence speed and SIR (Signal-to-Interference Ratio) after convergence.Keywords: Adaptive filtering, multirate processing, normalized subband adaptive filter, source separation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 96272 A Hybrid GMM/SVM System for Text Independent Speaker Identification
Authors: Rafik Djemili, Mouldi Bedda, Hocine Bourouba
Abstract:
This paper proposes a novel approach that combines statistical models and support vector machines. A hybrid scheme which appropriately incorporates the advantages of both the generative and discriminant model paradigms is described and evaluated. Support vector machines (SVMs) are trained to divide the whole speakers' space into small subsets of speakers within a hierarchical tree structure. During testing a speech token is assigned to its corresponding group and evaluation using gaussian mixture models (GMMs) is then processed. Experimental results show that the proposed method can significantly improve the performance of text independent speaker identification task. We report improvements of up to 50% reduction in identification error rate compared to the baseline statistical model.Keywords: Speaker identification, Gaussian mixture model (GMM), support vector machine (SVM), hybrid GMM/SVM.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 223771 A Data Hiding Model with High Security Features Combining Finite State Machines and PMM method
Authors: Souvik Bhattacharyya, Gautam Sanyal
Abstract:
Recent years have witnessed the rapid development of the Internet and telecommunication techniques. Information security is becoming more and more important. Applications such as covert communication, copyright protection, etc, stimulate the research of information hiding techniques. Traditionally, encryption is used to realize the communication security. However, important information is not protected once decoded. Steganography is the art and science of communicating in a way which hides the existence of the communication. Important information is firstly hidden in a host data, such as digital image, video or audio, etc, and then transmitted secretly to the receiver.In this paper a data hiding model with high security features combining both cryptography using finite state sequential machine and image based steganography technique for communicating information more securely between two locations is proposed. The authors incorporated the idea of secret key for authentication at both ends in order to achieve high level of security. Before the embedding operation the secret information has been encrypted with the help of finite-state sequential machine and segmented in different parts. The cover image is also segmented in different objects through normalized cut.Each part of the encoded secret information has been embedded with the help of a novel image steganographic method (PMM) on different cuts of the cover image to form different stego objects. Finally stego image is formed by combining different stego objects and transmit to the receiver side. At the receiving end different opposite processes should run to get the back the original secret message.Keywords: Cover Image, Finite state sequential machine, Melaymachine, Pixel Mapping Method (PMM), Stego Image, NCUT.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 226170 Development of Multimodal e-Slide Presentation to Support Self-Learning for the Visually Impaired
Authors: Rustam Asnawi, Wan Fatimah Wan Ahmad
Abstract:
Currently electronic slide (e-slide) is one of the most common styles in educational presentation. Unfortunately, the utilization of e-slide for the visually impaired is uncommon since they are unable to see the content of such e-slides which are usually composed of text, images and animation. This paper proposes a model for presenting e-slide in multimodal presentation i.e. using conventional slide concurrent with voicing, in both languages Malay and English. At the design level, live multimedia presentation concept is used, while at the implementation level several components are used. The text content of each slide is extracted using COM component, Microsoft Speech API for voicing the text in English language and the text in Malay language is voiced using dictionary approach. To support the accessibility, an auditory user interface is provided as an additional feature. A prototype of such model named as VSlide has been developed and introduced.
Keywords: presentation, self-learning, slide, visually impaired
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 156969 Application of Tacit Knowledge from Professional Packaging Designer for Teaching Packaging Design
Authors: Somsri Binraman, Boonliang Kaewnapan, Krittika Tanprasert
Abstract:
In the package design industry, there are a lot of tacit knowledge resided within each designer. The objectives are to capture them and compile it to be used as a teaching resource and to create a video clip of package design process as well as to evaluate its quality and learning effectiveness. Interview were used as a technique for capturing knowledge in brand design concept, differentiation, recognition, rank of recognition factor, consumer survey, knowledge about marketing, research, graphic design, the effect of color, and law and regulation. Video clip about package design were created. The clip consisted of both the speech and clip of actual process. The quality of the video in term of media was ranked as good while the content was ranked as excellent. The students- score on post-test was significantly greater than that of pretest (p>0.001).
Keywords: Tacit knowledge, interview, video, packaging, design.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 148368 Real-Time Hand Tracking and Gesture Recognition System Using Neural Networks
Authors: Tin Hninn Hninn Maung
Abstract:
This paper introduces a hand gesture recognition system to recognize real time gesture in unstrained environments. Efforts should be made to adapt computers to our natural means of communication: Speech and body language. A simple and fast algorithm using orientation histograms will be developed. It will recognize a subset of MAL static hand gestures. A pattern recognition system will be using a transforrn that converts an image into a feature vector, which will be compared with the feature vectors of a training set of gestures. The final system will be Perceptron implementation in MATLAB. This paper includes experiments of 33 hand postures and discusses the results. Experiments shows that the system can achieve a 90% recognition average rate and is suitable for real time applications.
Keywords: Hand gesture recognition, Orientation Histogram, Myanmar Alphabet Language, Perceptronnetwork, MATLAB.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 469767 Emotion Recognition Using Neural Network: A Comparative Study
Authors: Nermine Ahmed Hendy, Hania Farag
Abstract:
Emotion recognition is an important research field that finds lots of applications nowadays. This work emphasizes on recognizing different emotions from speech signal. The extracted features are related to statistics of pitch, formants, and energy contours, as well as spectral, perceptual and temporal features, jitter, and shimmer. The Artificial Neural Networks (ANN) was chosen as the classifier. Working on finding a robust and fast ANN classifier suitable for different real life application is our concern. Several experiments were carried out on different ANN to investigate the different factors that impact the classification success rate. Using a database containing 7 different emotions, it will be shown that with a proper and careful adjustment of features format, training data sorting, number of features selected and even the ANN type and architecture used, a success rate of 85% or even more can be achieved without increasing the system complicity and the computation time
Keywords: Classification, emotion recognition, features extraction, feature selection, neural network
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 469866 On Pseudo-Random and Orthogonal Binary Spreading Sequences
Authors: Abhijit Mitra
Abstract:
Different pseudo-random or pseudo-noise (PN) as well as orthogonal sequences that can be used as spreading codes for code division multiple access (CDMA) cellular networks or can be used for encrypting speech signals to reduce the residual intelligence are investigated. We briefly review the theoretical background for direct sequence CDMA systems and describe the main characteristics of the maximal length, Gold, Barker, and Kasami sequences. We also discuss about variable- and fixed-length orthogonal codes like Walsh- Hadamard codes. The equivalence of PN and orthogonal codes are also derived. Finally, a new PN sequence is proposed which is shown to have certain better properties than the existing codes.
Keywords: Code division multiple access, pseudo-noise codes, maximal length, Gold, Barker, Kasami, Walsh-Hadamard, autocorrelation, crosscorrelation, figure of merit.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 605465 Time Delay Estimation Using Signal Envelopes for Synchronisation of Recordings
Authors: Sergei Aleinik, Mikhail Stolbov
Abstract:
In this work, a method of time delay estimation for dual-channel acoustic signals (speech, music, etc.) recorded under reverberant conditions is investigated. Standard methods based on cross-correlation of the signals show poor results in cases involving strong reverberation, large distances between microphones and asynchronous recordings. Under similar conditions, a method based on cross-correlation of temporal envelopes of the signals delivers a delay estimation of acceptable quality. This method and its properties are described and investigated in detail, including its limits of applicability. The method’s optimal parameter estimation and a comparison with other known methods of time delay estimation are also provided.
Keywords: Cross-correlation, delay estimation, signal envelope, signal processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 306364 Development of an Artificial Ear for Bone-Conducted Objective Occlusion Measurement
Authors: Yu Luan
Abstract:
The bone-conducted objective occlusion effect (OE) is characterized by a discomforting sensation of fullness experienced in an occluded ear. This phenomenon arises from various external stimuli, such as human speech, chewing, and walking, which generate vibrations transmitted through the body to the ear canal walls. The bone-conducted OE occurs due to the pressure build-up inside the occluded ear caused by sound radiating into the ear canal cavity from its walls. In the hearing aid industry, artificial ears are utilized as a tool for developing hearing aids. However, the currently available commercial artificial ears primarily focus on pure acoustics measurements, neglecting the bone-conducted vibration aspect. This research endeavors to develop an artificial ear specifically designed for bone-conducted occlusion measurements. Finite Element Analysis (FEA) modeling has been employed to gain insights into the behavior of the artificial ear.
Keywords: Artificial ear, bone conducted vibration, occlusion measurement, Finite Element Modeling.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18763 An Approach to Noise Variance Estimation in Very Low Signal-to-Noise Ratio Stochastic Signals
Authors: Miljan B. Petrović, Dušan B. Petrović, Goran S. Nikolić
Abstract:
This paper describes a method for AWGN (Additive White Gaussian Noise) variance estimation in noisy stochastic signals, referred to as Multiplicative-Noising Variance Estimation (MNVE). The aim was to develop an estimation algorithm with minimal number of assumptions on the original signal structure. The provided MATLAB simulation and results analysis of the method applied on speech signals showed more accuracy than standardized AR (autoregressive) modeling noise estimation technique. In addition, great performance was observed on very low signal-to-noise ratios, which in general represents the worst case scenario for signal denoising methods. High execution time appears to be the only disadvantage of MNVE. After close examination of all the observed features of the proposed algorithm, it was concluded it is worth of exploring and that with some further adjustments and improvements can be enviably powerful.
Keywords: Noise, signal-to-noise ratio, stochastic signals, variance estimation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 225862 A Cross-Gender Statistical Analysis of Tuvinian Intonation Features in Comparison With Uzbek and Azerbaijani
Authors: D. Beziakina, E. Bulgakova
Abstract:
The paper deals with cross-gender and cross-linguistic comparison of pitch characteristics for Tuvinian with two other Turkic languages - Uzbek and Azerbaijani, based on the results of statistical analysis of pitch parameter values and intonation patterns used by male and female speakers.
The main goal of our work is to obtain the ranges of pitch parameter values typical for Tuvinian speakers for the purpose of automatic language identification. We also propose a cross-gender analysis of declarative intonation in the poorly studied Tuvinian language.
The ranges of pitch parameter values were obtained by means of specially developed software that deals with the distribution of pitch values and allows us to obtain statistical language-specific pitch intervals.
Keywords: Speech analysis, Statistical analysis, Speaker recognition, Identification of person.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 184961 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features
Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim
Abstract:
The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama.
Keywords: Data mining, Korean linguistic feature, literary fiction, relationship extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 179560 Weight Functions for Signal Reconstruction Based On Level Crossings
Authors: Nagesha, G. Hemantha Kumar
Abstract:
Although the level crossing concept has been the subject of intensive investigation over the last few years, certain problems of great interest remain unsolved. One of these concern is distribution of threshold levels. This paper presents a new threshold level allocation schemes for level crossing based on nonuniform sampling. Intuitively, it is more reasonable if the information rich regions of the signal are sampled finer and those with sparse information are sampled coarser. To achieve this objective, we propose non-linear quantization functions which dynamically assign the number of quantization levels depending on the importance of the given amplitude range. Two new approaches to determine the importance of the given amplitude segment are presented. The proposed methods are based on exponential and logarithmic functions. Various aspects of proposed techniques are discussed and experimentally validated. Its efficacy is investigated by comparison with uniform sampling.
Keywords: speech signals, sampling, signal reconstruction, asynchronousdelta modulation, non-linear quantization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 165159 The Security Trade-Offs in Resource Constrained Nodes for IoT Application
Authors: Sultan Alharby, Nick Harris, Alex Weddell, Jeff Reeve
Abstract:
The concept of the Internet of Things (IoT) has received much attention over the last five years. It is predicted that the IoT will influence every aspect of our lifestyles in the near future. Wireless Sensor Networks are one of the key enablers of the operation of IoTs, allowing data to be collected from the surrounding environment. However, due to limited resources, nature of deployment and unattended operation, a WSN is vulnerable to various types of attack. Security is paramount for reliable and safe communication between IoT embedded devices, but it does, however, come at a cost to resources. Nodes are usually equipped with small batteries, which makes energy conservation crucial to IoT devices. Nevertheless, security cost in terms of energy consumption has not been studied sufficiently. Previous research has used a security specification of 802.15.4 for IoT applications, but the energy cost of each security level and the impact on quality of services (QoS) parameters remain unknown. This research focuses on the cost of security at the IoT media access control (MAC) layer. It begins by studying the energy consumption of IEEE 802.15.4 security levels, which is followed by an evaluation for the impact of security on data latency and throughput, and then presents the impact of transmission power on security overhead, and finally shows the effects of security on memory footprint. The results show that security overhead in terms of energy consumption with a payload of 24 bytes fluctuates between 31.5% at minimum level over non-secure packets and 60.4% at the top security level of 802.15.4 security specification. Also, it shows that security cost has less impact at longer packet lengths, and more with smaller packet size. In addition, the results depicts a significant impact on data latency and throughput. Overall, maximum authentication length decreases throughput by almost 53%, and encryption and authentication together by almost 62%.Keywords: Internet of Things, IEEE 802.15.4, security cost evaluation, wireless sensor network, energy consumption.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 149158 Performance Analysis of a Series of Adaptive Filters in Non-Stationary Environment for Noise Cancelling Setup
Authors: Anam Rafique, Syed Sohail Ahmed
Abstract:
One of the essential components of much of DSP application is noise cancellation. Changes in real time signals are quite rapid and swift. In noise cancellation, a reference signal which is an approximation of noise signal (that corrupts the original information signal) is obtained and then subtracted from the noise bearing signal to obtain a noise free signal. This approximation of noise signal is obtained through adaptive filters which are self adjusting. As the changes in real time signals are abrupt, this needs adaptive algorithm that converges fast and is stable. Least mean square (LMS) and normalized LMS (NLMS) are two widely used algorithms because of their plainness in calculations and implementation. But their convergence rates are small. Adaptive averaging filters (AFA) are also used because they have high convergence, but they are less stable. This paper provides the comparative study of LMS and Normalized NLMS, AFA and new enhanced average adaptive (Average NLMS-ANLMS) filters for noise cancelling application using speech signals.Keywords: AFA, ANLMS, LMS, NLMS.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 193457 A Computer Model of Language Acquisition – Syllable Learning – Based on Hebbian Cell Assemblies and Reinforcement Learning
Authors: Sepideh Fazeli, Fariba Bahrami
Abstract:
Investigating language acquisition is one of the most challenging problems in the area of studying language. Syllable learning as a level of language acquisition has a considerable significance since it plays an important role in language acquisition. Because of impossibility of studying language acquisition directly with children, especially in its developmental phases, computer models will be useful in examining language acquisition. In this paper a computer model of early language learning for syllable learning is proposed. It is guided by a conceptual model of syllable learning which is named Directions Into Velocities of Articulators model (DIVA). The computer model uses simple associational and reinforcement learning rules within neural network architecture which are inspired by neuroscience. Our simulation results verify the ability of the proposed computer model in producing phonemes during babbling and early speech. Also, it provides a framework for examining the neural basis of language learning and communication disorders.Keywords: Brain modeling, computer models, language acquisition, reinforcement learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 159056 Comparison of Parameterization Methods in Recognizing Spoken Arabic Digits
Authors: Ali Ganoun
Abstract:
This paper proposes evaluation of sound parameterization methods in recognizing some spoken Arabic words, namely digits from zero to nine. Each isolated spoken word is represented by a single template based on a specific recognition feature, and the recognition is based on the Euclidean distance from those templates. The performance analysis of recognition is based on four parameterization features: the Burg Spectrum Analysis, the Walsh Spectrum Analysis, the Thomson Multitaper Spectrum Analysis and the Mel Frequency Cepstral Coefficients (MFCC) features. The main aim of this paper was to compare, analyze, and discuss the outcomes of spoken Arabic digits recognition systems based on the selected recognition features. The results acqired confirm that the use of MFCC features is a very promising method in recognizing Spoken Arabic digits.
Keywords: Speech Recognition, Spectrum Analysis, Burg Spectrum, Walsh Spectrum Analysis, Thomson Multitaper Spectrum, MFCC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 159355 OPEN_EmoRec_II- A Multimodal Corpus of Human-Computer Interaction
Authors: Stefanie Rukavina, Sascha Gruss, Steffen Walter, Holger Hoffmann, Harald C. Traue
Abstract:
OPEN_EmoRec_II is an open multimodal corpus with experimentally induced emotions. In the first half of the experiment, emotions were induced with standardized picture material and in the second half during a human-computer interaction (HCI), realized with a wizard-of-oz design. The induced emotions are based on the dimensional theory of emotions (valence, arousal and dominance). These emotional sequences - recorded with multimodal data (facial reactions, speech, audio and physiological reactions) during a naturalistic-like HCI-environment one can improve classification methods on a multimodal level. This database is the result of an HCI-experiment, for which 30 subjects in total agreed to a publication of their data including the video material for research purposes*. The now available open corpus contains sensory signal of: video, audio, physiology (SCL, respiration, BVP, EMG Corrugator supercilii, EMG Zygomaticus Major) and facial reactions annotations.Keywords: Open multimodal emotion corpus, annotated labels.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 182054 OPEN_EmoRec_II- A Multimodal Corpus of Human-Computer Interaction
Authors: Stefanie Rukavina, Sascha Gruss, Steffen Walter, Holger Hoffmann, Harald C. Traue
Abstract:
OPEN_EmoRec_II is an open multimodal corpus with experimentally induced emotions. In the first half of the experiment, emotions were induced with standardized picture material and in the second half during a human-computer interaction (HCI), realized with a wizard-of-oz design. The induced emotions are based on the dimensional theory of emotions (valence, arousal and dominance). These emotional sequences - recorded with multimodal data (facial reactions, speech, audio and physiological reactions) during a naturalistic-like HCI-environment one can improve classification methods on a multimodal level. This database is the result of an HCI-experiment, for which 30 subjects in total agreed to a publication of their data including the video material for research purposes*. The now available open corpus contains sensory signal of: video, audio, physiology (SCL, respiration, BVP, EMG Corrugator supercilii, EMG Zygomaticus Major) and facial reactions annotations.Keywords: Open multimodal emotion corpus, annotated labels.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38953 The Importance of Theatrical Language in the Creativeness of the Actor
Authors: Ordabek Khozhamberdiyev
Abstract:
In this article, some methods are mentioned for developing the theatrical language by giving information of “theatrical language" since the arising of the language in obsolete terms, and today, and also by examining the problems. Being able to talk meaningfully in the theater stage is a skillful art. Maybe, to be able to convey the idea of the poet, his/her world outlook and his/her feelings from the bottom of the heart as such, also conveying the speech norms without breaking them to the ear of audience in a fascinating way in adverse of a repellent way is the most difficult one. Because of this, “the word is the mirror of the idea". The importance of the theatrical language should not be perceived as only a post, it is “as the yarn that the culture carpet is weaved from". Thereby, it is a tool which transposes our culture and our life style from generation to generation. At the time of creativeness, the “word" comes out from the poet, “the word and feeling" art comes out from the actor. If it was not so, the audience could read the texts of the work himself/herself instead of going to the theater in order to see the performance. The fundamental works by the Turkish, Kazakh and English scientists have been taken as a basis for the research done.
Keywords: language, sound, stage, theatrical language, voice
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 134652 Neuro-Fuzzy Based Model for Phrase Level Emotion Understanding
Authors: Vadivel Ayyasamy
Abstract:
The present approach deals with the identification of Emotions and classification of Emotional patterns at Phrase-level with respect to Positive and Negative Orientation. The proposed approach considers emotion triggered terms, its co-occurrence terms and also associated sentences for recognizing emotions. The proposed approach uses Part of Speech Tagging and Emotion Actifiers for classification. Here sentence patterns are broken into phrases and Neuro-Fuzzy model is used to classify which results in 16 patterns of emotional phrases. Suitable intensities are assigned for capturing the degree of emotion contents that exist in semantics of patterns. These emotional phrases are assigned weights which supports in deciding the Positive and Negative Orientation of emotions. The approach uses web documents for experimental purpose and the proposed classification approach performs well and achieves good F-Scores.
Keywords: Emotions, sentences, phrases, classification, patterns, fuzzy, positive orientation, negative orientation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 107951 Comparative Analysis of Machine Learning Tools: A Review
Authors: S. Sarumathi, M. Vaishnavi, S. Geetha, P. Ranjetha
Abstract:
Machine learning is a new and exciting area of artificial intelligence nowadays. Machine learning is the most valuable, time, supervised, and cost-effective approach. It is not a narrow learning approach; it also includes a wide range of methods and techniques that can be applied to a wide range of complex realworld problems and time domains. Biological image classification, adaptive testing, computer vision, natural language processing, object detection, cancer detection, face recognition, handwriting recognition, speech recognition, and many other applications of machine learning are widely used in research, industry, and government. Every day, more data are generated, and conventional machine learning techniques are becoming obsolete as users move to distributed and real-time operations. By providing fundamental knowledge of machine learning tools and research opportunities in the field, the aim of this article is to serve as both a comprehensive overview and a guide. A diverse set of machine learning resources is demonstrated and contrasted with the key features in this survey.Keywords: Artificial intelligence, machine learning, deep learning, machine learning algorithms, machine learning tools.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 184850 Measuring Text-Based Semantics Relatedness Using WordNet
Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed
Abstract:
Measuring semantic similarity between texts is calculating semantic relatedness between texts using various techniques. Our web application (Measuring Relatedness of Concepts-MRC) allows user to input two text corpuses and get semantic similarity percentage between both using WordNet. Our application goes through five stages for the computation of semantic relatedness. Those stages are: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into Parts-of-Speech), Synonyms Extraction (retrieves synonyms against each keyword), Measuring Similarity (using keywords and synonyms, similarity is measured) and Visualization (graphical representation of similarity measure). Hence the user can measure similarity on basis of features as well. The end result is a percentage score and the word(s) which form the basis of similarity between both texts with use of different tools on same platform. In future work we look forward for a Web as a live corpus application that provides a simpler and user friendly tool to compare documents and extract useful information.
Keywords: GraphViz representation, semantic relatedness, similarity measurement, WordNet similarity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 83649 Comparison of MFCC and Cepstral Coefficients as a Feature Set for PCG Biometric Systems
Authors: Justin Leo Cheang Loong, Khazaimatol S Subari, Muhammad Kamil Abdullah, Nurul Nadia Ahmad, RosliBesar
Abstract:
Heart sound is an acoustic signal and many techniques used nowadays for human recognition tasks borrow speech recognition techniques. One popular choice for feature extraction of accoustic signals is the Mel Frequency Cepstral Coefficients (MFCC) which maps the signal onto a non-linear Mel-Scale that mimics the human hearing. However the Mel-Scale is almost linear in the frequency region of heart sounds and thus should produce similar results with the standard cepstral coefficients (CC). In this paper, MFCC is investigated to see if it produces superior results for PCG based human identification system compared to CC. Results show that the MFCC system is still superior to CC despite linear filter-banks in the lower frequency range, giving up to 95% correct recognition rate for MFCC and 90% for CC. Further experiments show that the high recognition rate is due to the implementation of filter-banks and not from Mel-Scaling.Keywords: Biometric, Phonocardiogram, Cepstral Coefficients, Mel Frequency
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 355248 Improved Weighted Matching for Speaker Recognition
Authors: Ozan Mut, Mehmet Göktürk
Abstract:
Matching algorithms have significant importance in speaker recognition. Feature vectors of the unknown utterance are compared to feature vectors of the modeled speakers as a last step in speaker recognition. A similarity score is found for every model in the speaker database. Depending on the type of speaker recognition, these scores are used to determine the author of unknown speech samples. For speaker verification, similarity score is tested against a predefined threshold and either acceptance or rejection result is obtained. In the case of speaker identification, the result depends on whether the identification is open set or closed set. In closed set identification, the model that yields the best similarity score is accepted. In open set identification, the best score is tested against a threshold, so there is one more possible output satisfying the condition that the speaker is not one of the registered speakers in existing database. This paper focuses on closed set speaker identification using a modified version of a well known matching algorithm. The results of new matching algorithm indicated better performance on YOHO international speaker recognition database.Keywords: Automatic Speaker Recognition, Voice Recognition, Pattern Recognition, Digital Audio Signal Processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 173247 Effect of Personality Traits on Classification of Political Orientation
Authors: Vesile Evrim, Aliyu Awwal
Abstract:
Today, there is a large number of political transcripts available on the Web to be mined and used for statistical analysis, and product recommendations. As the online political resources are used for various purposes, automatically determining the political orientation on these transcripts becomes crucial. The methodologies used by machine learning algorithms to do an automatic classification are based on different features that are classified under categories such as Linguistic, Personality etc. Considering the ideological differences between Liberals and Conservatives, in this paper, the effect of Personality traits on political orientation classification is studied. The experiments in this study were based on the correlation between LIWC features and the BIG Five Personality traits. Several experiments were conducted using Convote U.S. Congressional- Speech dataset with seven benchmark classification algorithms. The different methodologies were applied on several LIWC feature sets that constituted by 8 to 64 varying number of features that are correlated to five personality traits. As results of experiments, Neuroticism trait was obtained to be the most differentiating personality trait for classification of political orientation. At the same time, it was observed that the personality trait based classification methodology gives better and comparable results with the related work.Keywords: Politics, personality traits, LIWC, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 216246 Automatic Building an Extensive Arabic FA Terms Dictionary
Authors: El-Sayed Atlam, Masao Fuketa, Kazuhiro Morita, Jun-ichi Aoe
Abstract:
Field Association (FA) terms are a limited set of discriminating terms that give us the knowledge to identify document fields which are effective in document classification, similar file retrieval and passage retrieval. But the problem lies in the lack of an effective method to extract automatically relevant Arabic FA Terms to build a comprehensive dictionary. Moreover, all previous studies are based on FA terms in English and Japanese, and the extension of FA terms to other language such Arabic could be definitely strengthen further researches. This paper presents a new method to extract, Arabic FA Terms from domain-specific corpora using part-of-speech (POS) pattern rules and corpora comparison. Experimental evaluation is carried out for 14 different fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhyah news selected average of 2,825 FA Terms (single and compound) per field. From the experimental results, recall and precision are 84% and 79% respectively. Therefore, this method selects higher number of relevant Arabic FA Terms at high precision and recall.
Keywords: Arabic Field Association Terms, information extraction, document classification, information retrieval.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 173445 Multi-Modal Visualization of Working Instructions for Assembly Operations
Authors: Josef Wolfartsberger, Michael Heiml, Georg Schwarz, Sabrina Egger
Abstract:
Growing individualization and higher numbers of variants in industrial assembly products raise the complexity of manufacturing processes. Technical assistance systems considering both procedural and human factors allow for an increase in product quality and a decrease in required learning times by supporting workers with precise working instructions. Due to varying needs of workers, the presentation of working instructions leads to several challenges. This paper presents an approach for a multi-modal visualization application to support assembly work of complex parts. Our approach is integrated within an interconnected assistance system network and supports the presentation of cloud-streamed textual instructions, images, videos, 3D animations and audio files along with multi-modal user interaction, customizable UI, multi-platform support (e.g. tablet-PC, TV screen, smartphone or Augmented Reality devices), automated text translation and speech synthesis. The worker benefits from more accessible and up-to-date instructions presented in an easy-to-read way.Keywords: Assembly, assistive technologies, augmented reality, manufacturing, visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 917