Search results for: speech signals
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 881

Search results for: speech signals

641 Automated Heart Sound Classification from Unsegmented Phonocardiogram Signals Using Time Frequency Features

Authors: Nadia Masood Khan, Muhammad Salman Khan, Gul Muhammad Khan

Abstract:

Cardiologists perform cardiac auscultation to detect abnormalities in heart sounds. Since accurate auscultation is a crucial first step in screening patients with heart diseases, there is a need to develop computer-aided detection/diagnosis (CAD) systems to assist cardiologists in interpreting heart sounds and provide second opinions. In this paper different algorithms are implemented for automated heart sound classification using unsegmented phonocardiogram (PCG) signals. Support vector machine (SVM), artificial neural network (ANN) and cartesian genetic programming evolved artificial neural network (CGPANN) without the application of any segmentation algorithm has been explored in this study. The signals are first pre-processed to remove any unwanted frequencies. Both time and frequency domain features are then extracted for training the different models. The different algorithms are tested in multiple scenarios and their strengths and weaknesses are discussed. Results indicate that SVM outperforms the rest with an accuracy of 73.64%.

Keywords: Pattern recognition, machine learning, computer aided diagnosis, heart sound classification, and feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1208
640 The Analysis of Deceptive and Truthful Speech: A Computational Linguistic Based Method

Authors: Seham El Kareh, Miramar Etman

Abstract:

Recently, detecting liars and extracting features which distinguish them from truth-tellers have been the focus of a wide range of disciplines. To the author’s best knowledge, most of the work has been done on facial expressions and body gestures but only few works have been done on the language used by both liars and truth-tellers. This paper sheds light on four axes. The first axis copes with building an audio corpus for deceptive and truthful speech for Egyptian Arabic speakers. The second axis focuses on examining the human perception of lies and proving our need for computational linguistic-based methods to extract features which characterize truthful and deceptive speech. The third axis is concerned with building a linguistic analysis program that could extract from the corpus the inter- and intra-linguistic cues for deceptive and truthful speech. The program built here is based on selected categories from the Linguistic Inquiry and Word Count program. Our results demonstrated that Egyptian Arabic speakers on one hand preferred to use first-person pronouns and present tense compared to the past tense when lying and their lies lacked of second-person pronouns, and on the other hand, when telling the truth, they preferred to use the verbs related to motion and the nouns related to time. The results also showed that there is a need for bigger data to prove the significance of words related to emotions and numbers.

Keywords: Egyptian Arabic corpus, computational analysis, deceptive features, forensic linguistics, human perception, truthful features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1140
639 A Novel Spectrum Sensing Scheme Based on Periodicity of DVB-T Pilot Signals

Authors: Hyung-Weon Cho, Youngyoon Lee, Seung Goo Kang, Dahae Chong, Myungsoo Lee, Chonghan Song, Seokho Yoon

Abstract:

This paper proposes a novel spectrum sensing technique for the digital video broadcasting-terrestrial (DVB-T) systems, which utilizes the periodicity of pilot signals in the orthogonal frequency division multiplexing (OFDM) symbols. The proposed scheme can overcome the effect of the timing synchronization error by recorrelating the correlation values in the same sample distances. The numerical results demonstrate that the detection probability performance of the proposed scheme outperforms that of the conventional scheme when there exists a timing synchronization error.

Keywords: DVB-T, spectrum sensing, OFDM, timing synchronizationerror.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1882
638 Traffic Signs

Authors: A. Gutiérrez, A. Castillo, J. M. Gómez, J. M. Gutiérrez, A. García-Cabot

Abstract:

Road signs are the elements of roads with a lot of influence in driver-s behavior. So that signals can fulfill its function, they must overcome visibility and durability requirements, particularly needed at night, when the coefficient of retroreflection becomes a decisive factor in ensuring road safety. Accepting that the visibility of the signage has implications for people-s safety, we understand the importance to fulfill its function: to foster the highest standards of service and safety in drivers. The usual conditions of perception of any sign are determined by: age of the driver, reflective material, luminosity, vehicle speed and emplacement. In this way, this paper evaluates the different signals to increase the safety road.

Keywords: Luminosity, orientation, retroreflection, traffic signs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608
637 Dynamic Power Reduction in Sequential Circuits Using Look Ahead Clock Gating Technique

Authors: R. Manjith, C. Muthukumari

Abstract:

In this paper, a novel Linear Feedback Shift Register (LFSR) with Look Ahead Clock Gating (LACG) technique is presented to reduce the power consumption in modern processors and System-on-Chip. Clock gating is a predominant technique used to reduce unwanted switching of clock signals. Several clock gating techniques to reduce the dynamic power have been developed, of which LACG is predominant. LACG computes the clock enabling signals of each flip-flop (FF) one cycle ahead of time, based on the present cycle data of the flip-flops on which it depends. It overcomes the timing problems in the existing clock gating methods like datadriven clock gating and Auto-Gated flip-flops (AGFF) by allotting a full clock cycle for the determination of the clock enabling signals. Further to reduce the power consumption in LACG technique, FFs can be grouped so that they share a common clock enabling signal. Simulation results show that the novel grouped LFSR with LACG achieves 15.03% power savings than conventional LFSR with LACG and 44.87% than data-driven clock gating.

Keywords: AGFF, data-driven, LACG, LFSR.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1698
636 Microcontroller Based EOG Guided Wheelchair

Authors: Jobby K. Chacko, Deepu Oommen, Kevin K. Mathew, Noble Sunny, N. Babu

Abstract:

A new cost effective, eye controlled method was introduced to guide and control a wheel chair for disable people, based on Electrooculography (EOG). The guidance and control is effected by eye ball movements within the socket. The system consists of a standard electric wheelchair with an on-board microcontroller system attached. EOG is a new technology to sense the eye signals for eye movements and these signals are captured using electrodes, signal processed such as amplification, noise filtering, and then given to microcontroller which drives the motors attached with wheel chair for propulsion. This technique could be very useful in applications such as mobility for handicapped and paralyzed persons.

Keywords: Electrooculography, Microcontroller, Signal processing, Wheelchair.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5545
635 Comparison of MFCC and Cepstral Coefficients as a Feature Set for PCG Biometric Systems

Authors: Justin Leo Cheang Loong, Khazaimatol S Subari, Muhammad Kamil Abdullah, Nurul Nadia Ahmad, RosliBesar

Abstract:

Heart sound is an acoustic signal and many techniques used nowadays for human recognition tasks borrow speech recognition techniques. One popular choice for feature extraction of accoustic signals is the Mel Frequency Cepstral Coefficients (MFCC) which maps the signal onto a non-linear Mel-Scale that mimics the human hearing. However the Mel-Scale is almost linear in the frequency region of heart sounds and thus should produce similar results with the standard cepstral coefficients (CC). In this paper, MFCC is investigated to see if it produces superior results for PCG based human identification system compared to CC. Results show that the MFCC system is still superior to CC despite linear filter-banks in the lower frequency range, giving up to 95% correct recognition rate for MFCC and 90% for CC. Further experiments show that the high recognition rate is due to the implementation of filter-banks and not from Mel-Scaling.

Keywords: Biometric, Phonocardiogram, Cepstral Coefficients, Mel Frequency

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3503
634 Feature Extractions of EMG Signals during a Constant Workload Pedaling Exercise

Authors: Bing-Wen Chen, Alvin W. Y. Su, Yu-Lin Wang

Abstract:

Electromyography (EMG) is one of the important indicators during exercise, as it is closely related to the level of muscle activations. This work quantifies the muscle conditions of the lower limbs in a constant workload exercise. Surface EMG signals of the vastus laterals (VL), vastus medialis (VM), rectus femoris (RF), gastrocnemius medianus (GM), gastrocnemius lateral (GL) and Soleus (SOL) were recorded from fourteen healthy males. The EMG signals were segmented in two phases: activation segment (AS) and relaxation segment (RS). Period entropy (PE), peak count (PC), zero crossing (ZC), wave length (WL), mean power frequency (MPF), median frequency (MDF) and root mean square (RMS) are calculated to provide the quantitative information of the measured EMG segments. The outcomes reveal that the PE, PC, ZC and RMS have significantly changed (p<.001); WL presents moderately changed (p<.01); MPF and MDF show no changed (p>.05) during exercise. The results also suggest that the RS is also preferred for performance evaluation, while the results of the extracted features in AS are usually affected directly by the amplitudes. It is further found that the VL exhibits the most significant changes within six muscles during pedaling exercise. The proposed work could be applied to quantify the stamina analysis and to predict the instant muscle status in athletes.

Keywords: EMG, feature extraction, muscle status, pedaling exercise, relaxation segment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1176
633 Study on Performance of Wigner Ville Distribution for Linear FM and Transient Signal Analysis

Authors: Azeemsha Thacham Poyil, Nasimudeen KM

Abstract:

This research paper presents some methods to assess the performance of Wigner Ville Distribution for Time-Frequency representation of non-stationary signals, in comparison with the other representations like STFT, Spectrogram etc. The simultaneous timefrequency resolution of WVD is one of the important properties which makes it preferable for analysis and detection of linear FM and transient signals. There are two algorithms proposed here to assess the resolution and to compare the performance of signal detection. First method is based on the measurement of area under timefrequency plot; in case of a linear FM signal analysis. A second method is based on the instantaneous power calculation and is used in case of transient, non-stationary signals. The implementation is explained briefly for both methods with suitable diagrams. The accuracy of the measurements is validated to show the better performance of WVD representation in comparison with STFT and Spectrograms.

Keywords: WVD: Wigner Ville Distribution, STFT: Short Time Fourier Transform, FT: Fourier Transform, TFR: Time-Frequency Representation, FM: Frequency Modulation, LFM Signal: Linear FM Signal, JTFA: Joint time frequency analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2370
632 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part of Speech Tagging has always been a challenging task in the era of Natural Language Processing. This article presents POS tagging for Nepali text using Hidden Markov Model and Viterbi algorithm. From the Nepali text, annotated corpus training and testing data set are randomly separated. Both methods are employed on the data sets. Viterbi algorithm is found to be computationally faster and accurate as compared to HMM. The accuracy of 95.43% is achieved using Viterbi algorithm. Error analysis where the mismatches took place is elaborately discussed.

Keywords: Hidden Markov model, Viterbi algorithm, POS tagging, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1655
631 On the Quantizer Design for Base Station Cooperation Systems with SC-FDE Techniques

Authors: K. Firsanov, S. Gritsutenko, R. Dinis

Abstract:

By employing BS (Base Station) cooperation we can increase substantially the spectral efficiency and capacity of cellular systems. The signals received at each BS are sent to a central unit that performs the separation of the different MT (Mobile Terminal) using the same physical channel. However, we need accurate sampling and quantization of those signals so as to reduce the backhaul communication requirements. In this paper we consider the optimization of the quantizers for BS cooperation systems. Four different quantizer types are analyzed and optimized to allow better SQNR (Signal-to-Quantization Noise Ratio) and BER (Bit Error Rate) performance.

Keywords: Base Stations cooperation scheme, Bit Error Rate (BER), Quantizer, Signal to Quantization Noise Ratio (SQNR), SCFDE.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1732
630 Retina Based Mouse Control (RBMC)

Authors: Arslan Qamar Malik, Jehanzeb Ahmad

Abstract:

The paper presents a novel idea to control computer mouse cursor movement with human eyes. In this paper, a working of the product has been described as to how it helps the special people share their knowledge with the world. Number of traditional techniques such as Head and Eye Movement Tracking Systems etc. exist for cursor control by making use of image processing in which light is the primary source. Electro-oculography (EOG) is a new technology to sense eye signals with which the mouse cursor can be controlled. The signals captured using sensors, are first amplified, then noise is removed and then digitized, before being transferred to PC for software interfacing.

Keywords: Human Computer Interaction, Real-Time System, Electro-oculography, Signal Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4178
629 Spread Spectrum Image Watermarking for Secured Multimedia Data Communication

Authors: Tirtha S. Das, Ayan K. Sau, Subir K. Sarkar

Abstract:

Digital watermarking is a way to provide the facility of secure multimedia data communication besides its copyright protection approach. The Spread Spectrum modulation principle is widely used in digital watermarking to satisfy the robustness of multimedia signals against various signal-processing operations. Several SS watermarking algorithms have been proposed for multimedia signals but very few works have discussed on the issues responsible for secure data communication and its robustness improvement. The current paper has critically analyzed few such factors namely properties of spreading codes, proper signal decomposition suitable for data embedding, security provided by the key, successive bit cancellation method applied at decoder which have greater impact on the detection reliability, secure communication of significant signal under camouflage of insignificant signals etc. Based on the analysis, robust SS watermarking scheme for secure data communication is proposed in wavelet domain and improvement in secure communication and robustness performance is reported through experimental results. The reported result also shows improvement in visual and statistical invisibility of the hidden data.

Keywords: Spread spectrum modulation, spreading code, signaldecomposition, security, successive bit cancellation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2738
628 Extracting Single Trial Visual Evoked Potentials using Selective Eigen-Rate Principal Components

Authors: Samraj Andrews, Ramaswamy Palaniappan, Nidal Kamel

Abstract:

In single trial analysis, when using Principal Component Analysis (PCA) to extract Visual Evoked Potential (VEP) signals, the selection of principal components (PCs) is an important issue. We propose a new method here that selects only the appropriate PCs. We denote the method as selective eigen-rate (SER). In the method, the VEP is reconstructed based on the rate of the eigen-values of the PCs. When this technique is applied on emulated VEP signals added with background electroencephalogram (EEG), with a focus on extracting the evoked P3 parameter, it is found to be feasible. The improvement in signal to noise ratio (SNR) is superior to two other existing methods of PC selection: Kaiser (KSR) and Residual Power (RP). Though another PC selection method, Spectral Power Ratio (SPR) gives a comparable SNR with high noise factors (i.e. EEGs), SER give more impressive results in such cases. Next, we applied SER method to real VEP signals to analyse the P3 responses for matched and non-matched stimuli. The P3 parameters extracted through our proposed SER method showed higher P3 response for matched stimulus, which confirms to the existing neuroscience knowledge. Single trial PCA using KSR and RP methods failed to indicate any difference for the stimuli.

Keywords: Electroencephalogram, P3, Single trial VEP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1598
627 A Multi-Feature Deep Learning Algorithm for Urban Traffic Classification with Limited Labeled Data

Authors: Rohan Putatunda, Aryya Gangopadhyay

Abstract:

Acoustic sensors, if embedded in smart street lights, can help in capturing the activities (car honking, sirens, events, traffic, etc.) in cities. Needless to say, the acoustic data from such scenarios are complex due to multiple audio streams originating from different events, and when decomposed to independent signals, the amount of retrieved data volume is small in quantity which is inadequate to train deep neural networks. So, in this paper, we address the two challenges: a) separating the mixed signals, and b) developing an efficient acoustic classifier under data paucity. So, to address these challenges, we propose an architecture with supervised deep learning, where the initial captured mixed acoustics data are analyzed with Fast Fourier Transformation (FFT), followed by filtering the noise from the signal, and then decomposed to independent signals by fast independent component analysis (Fast ICA). To address the challenge of data paucity, we propose a multi feature-based deep neural network with high performance that is reflected in our experiments when compared to the conventional convolutional neural network (CNN) and multi-layer perceptron (MLP).

Keywords: FFT, ICA, vehicle classification, multi-feature DNN, CNN, MLP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 362
626 Coherence Analysis between Respiration and PPG Signal by Bivariate AR Model

Authors: Yue-Der Lin, Wei-Ting Liu, Ching-Che Tsai, Wen-Hsiu Chen

Abstract:

PPG is a potential tool in clinical applications. Among such, the relationship between respiration and PPG signal has attracted attention in past decades. In this research, a bivariate AR spectral estimation method was utilized for the coherence analysis between these two signals. Ten healthy subjects participated in this research with signals measured at different respiratory rates. The results demonstrate that high coherence exists between respiration and PPG signal, whereas the coherence disappears in breath-holding experiments. These results imply that PPG signal reveals the respiratory information. The utilized method may provide an attractive alternative approach for the related researches.

Keywords: Coherence analysis, photoplethysmography (PPG), bivariate AR spectral estimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2541
625 A Dynamic Filter for Removal DC - Offset In Current and Voltage Waveforms

Authors: Khaled M.EL-Naggar

Abstract:

In power systems, protective relays must filter their inputs to remove undesirable quantities and retain signal quantities of interest. This job must be performed accurate and fast. A new method for filtering the undesirable components such as DC and harmonic components associated with the fundamental system signals. The method is s based on a dynamic filtering algorithm. The filtering algorithm has many advantages over some other classical methods. It can be used as dynamic on-line filter without the need of parameters readjusting as in the case of classic filters. The proposed filter is tested using different signals. Effects of number of samples and sampling window size are discussed. Results obtained are presented and discussed to show the algorithm capabilities.

Keywords: Protection, DC-offset, Dynamic Filter, Estimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3708
624 Conspiracy Theory in Discussions of the Coronavirus Pandemic in the Gulf Region

Authors: Rasha Salameh

Abstract:

In light of the tense relationship between Saudi Arabia and Iran, this research paper sheds some light on Saudi-owned television network, Al-Arabiya’s reporting of the Coronavirus in the Gulf region. Particularly because most of the cases in the beginning were coming from Iran, some programs of this Saudi channel embraced a conspiracy theory. Hate speech has been used in the talking and discussions about the topic. The results of these discussions will be detailed in this paper in percentages with regard to the research sample, which includes five programs on the Al-Arabiya channel: ‘DNA’, ‘Marraya’ (Mirrors), ‘Panorama’, ‘Tafaolcom’ (Your Interaction) and ‘Diplomatic Street’, in the period between January 19, that is, the date of the first case in Iran, and April 10, 2020. The research shows the use of a conspiracy theory in the programs, in addition to some professional violations. The surveyed sample also shows that the matter receded due to the Arab Gulf states' preoccupation with the successively increasing cases that have appeared there since the start of the pandemic. The results indicate that hate speech was present in the sample at a rate of 98.1%, and that most of the programs that dealt with the Iranian issue under the Coronavirus pandemic on Al Arabiya used the conspiracy theory at a rate of 75.5%.

Keywords: Al-Arabiya, Iran, COVID-19, hate speech, conspiracy theory, politicization of the pandemic

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 410
623 Electrocardiogram Signal Compression Using Multiwavelet Transform

Authors: Morteza Moazami-Goudarzi, Mohammad. H. Moradi

Abstract:

In this paper we are to find the optimum multiwavelet for compression of electrocardiogram (ECG) signals. At present, it is not well known which multiwavelet is the best choice for optimum compression of ECG. In this work, we examine different multiwavelets on 24 sets of ECG data with entirely different characteristics, selected from MITBIH database. For assessing the functionality of the different multiwavelets in compressing ECG signals, in addition to known factors such as Compression Ratio (CR), Percent Root Difference (PRD), Distortion (D), Root Mean Square Error (RMSE) in compression literature, we also employed the Cross Correlation (CC) criterion for studying the morphological relations between the reconstructed and the original ECG signal and Signal to reconstruction Noise Ratio (SNR). The simulation results show that the cardbal2 by the means of identity (Id) prefiltering method to be the best effective transformation.

Keywords: ECG compression, Multiwavelet, Prefiltering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1649
622 Radio over Fiber as a Cost Effective Technology for Transmission of WiMAX Signals

Authors: Mohammad Shaifur Rahman, Jung Hyun Lee, Youngil Park, Ki-Doo Kim

Abstract:

In this paper, an overview of the radio over fiber (RoF) technology is provided. Obstacles for reducing the capital and operational expenses in the existing systems are discussed in various perspectives. Some possible RoF deployment scenarios for WiMAX data transmission are proposed as a means for capital and operational expenses reduction. IEEE 802.16a standard based end-to-end physical layer model is simulated including intensity modulated direct detection RoF technology. Finally the feasibility of RoF technology to carry WiMAX signals between the base station and the remote antenna units is demonstrated using the simulation results.

Keywords: IMDD, Radio over Fiber, Remote Antenna Unit, WiMAX.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2479
621 Transient Analysis of Central Region Void Fraction in a 3x3 Rod Bundle under Bubbly and Cap/Slug Flows

Authors: Ya-Chi Yu, Pei-Syuan Ruan, Shao-Wen Chen, Yu-Hsien Chang, Jin-Der Lee, Jong-Rong Wang, Chunkuan Shih

Abstract:

This study analyzed the transient signals of central region void fraction of air-water two-phase flow in a 3x3 rod bundle. Experimental tests were carried out utilizing a vertical rod bundle test section along with a set of air-water supply/flow control system, and the transient signals of the central region void fraction were collected through the electrical conductivity sensors as well as visualized via high speed photography. By converting the electric signals, transient void fraction can be obtained through the voltage ratios. With a fixed superficial water velocity (Jf=0.094 m/s), two different superficial air velocities (Jg=0.094 m/s and 0.236 m/s) were tested and presented, which were corresponding to the flow conditions of bubbly flows and cap/slug flows, respectively. The time averaged central region void fraction was obtained as 0.109-0.122 with 0.028 standard deviation for the selected bubbly flow and 0.188-0.221with 0.101 standard deviation for the selected cap/slug flow, respectively. Through Fast Fourier Transform (FFT) analysis, no clear frequency peak was found in bubbly flow, while two dominant frequencies were identified around 1.6 Hz and 2.5 Hz in the present cap/slug flow.

Keywords: Central region, rod bundles, transient void fraction, two-phase flow.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 644
620 Control Algorithm for Shunt Active Power Filter using Synchronous Reference Frame Theory

Authors: Consalva J. Msigwa, Beda J. Kundy, Bakari M. M. Mwinyiwiwa,

Abstract:

This paper presents a method for obtaining the desired reference current for Voltage Source Converter (VSC) of the Shunt Active Power Filter (SAPF) using Synchronous Reference Frame Theory. The method relies on the performance of the Proportional-Integral (PI) controller for obtaining the best control performance of the SAPF. To improve the performance of the PI controller, the feedback path to the integral term is introduced to compensate the winding up phenomenon due to integrator. Using Reference Frame Transformation, reference signals are transformed from a - b - c stationery frame to 0 - d - q rotating frame. Using the PI controller, the reference signals in the 0 - d - q rotating frame are controlled to get the desired reference signals for the Pulse Width Modulation. The synchronizer, the Phase Locked Loop (PLL) with PI filter is used for synchronization, with much emphasis on minimizing delays. The system performance is examined with Shunt Active Power Filter simulation model.

Keywords: Phase Locked Loop (PLL), Voltage Source Converter (VSC), Shunt Active Power Filter (SAPF), PI, Pulse Width Modulation (PWM)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3516
619 A Cognitive Model for Frequency Signal Classification

Authors: Rui Antunes, Fernando V. Coito

Abstract:

This article presents the development of a neural network cognitive model for the classification and detection of different frequency signals. The basic structure of the implemented neural network was inspired on the perception process that humans generally make in order to visually distinguish between high and low frequency signals. It is based on the dynamic neural network concept, with delays. A special two-layer feedforward neural net structure was successfully implemented, trained and validated, to achieve minimum target error. Training confirmed that this neural net structure descents and converges to a human perception classification solution, even when far away from the target.

Keywords: Neural Networks, Signal Classification, Adaptative Filters, Cognitive Neuroscience

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1619
618 Improved Text-Independent Speaker Identification using Fused MFCC and IMFCC Feature Sets based on Gaussian Filter

Authors: Sandipan Chakroborty, Goutam Saha

Abstract:

A state of the art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC) modeled on the human auditory system has been used as a standard acoustic feature set for speech related applications. On a recent contribution by authors, it has been shown that the Inverted Mel- Frequency Cepstral Coefficients (IMFCC) is useful feature set for SI, which contains complementary information present in high frequency region. This paper introduces the Gaussian shaped filter (GF) while calculating MFCC and IMFCC in place of typical triangular shaped bins. The objective is to introduce a higher amount of correlation between subband outputs. The performances of both MFCC & IMFCC improve with GF over conventional triangular filter (TF) based implementation, individually as well as in combination. With GMM as speaker modeling paradigm, the performances of proposed GF based MFCC and IMFCC in individual and fused mode have been verified in two standard databases YOHO, (Microphone Speech) and POLYCOST (Telephone Speech) each of which has more than 130 speakers.

Keywords: Gaussian Filter, Triangular Filter, Subbands, Correlation, MFCC, IMFCC, GMM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2391
617 Speaker Independent Quranic Recognizer Basedon Maximum Likelihood Linear Regression

Authors: Ehab Mourtaga, Ahmad Sharieh, Mousa Abdallah

Abstract:

An automatic speech recognition system for the formal Arabic language is needed. The Quran is the most formal spoken book in Arabic, it is spoken all over the world. In this research, an automatic speech recognizer for Quranic based speakerindependent was developed and tested. The system was developed based on the tri-phone Hidden Markov Model and Maximum Likelihood Linear Regression (MLLR). The MLLR computes a set of transformations which reduces the mismatch between an initial model set and the adaptation data. It uses the regression class tree, as well as, estimates a set of linear transformations for the mean and variance parameters of a Gaussian mixture HMM system. The 30th Chapter of the Quran, with five of the most famous readers of the Quran, was used for the training and testing of the data. The chapter includes about 2000 distinct words. The advantages of using the Quranic verses as the database in this developed recognizer are the uniqueness of the words and the high level of orderliness between verses. The level of accuracy from the tested data ranged 68 to 85%.

Keywords: Hidden Markov Model (HMM), MaximumLikelihood Linear Regression (MLLR), Quran, Regression ClassTree, Speech Recognition, Speaker-independent.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1874
616 Face Localization Using Illumination-dependent Face Model for Visual Speech Recognition

Authors: Robert E. Hursig, Jane X. Zhang

Abstract:

A robust still image face localization algorithm capable of operating in an unconstrained visual environment is proposed. First, construction of a robust skin classifier within a shifted HSV color space is described. Then various filtering operations are performed to better isolate face candidates and mitigate the effect of substantial non-skin regions. Finally, a novel Bhattacharyya-based face detection algorithm is used to compare candidate regions of interest with a unique illumination-dependent face model probability distribution function approximation. Experimental results show a 90% face detection success rate despite the demands of the visually noisy environment.

Keywords: Audio-visual speech recognition, Bhattacharyyacoefficient, face detection,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1587
615 An Efficient Hamiltonian for Discrete Fractional Fourier Transform

Authors: Sukrit Shankar, Pardha Saradhi K., Chetana Shanta Patsa, Jaydev Sharma

Abstract:

Fractional Fourier Transform, which is a generalization of the classical Fourier Transform, is a powerful tool for the analysis of transient signals. The discrete Fractional Fourier Transform Hamiltonians have been proposed in the past with varying degrees of correlation between their eigenvectors and Hermite Gaussian functions. In this paper, we propose a new Hamiltonian for the discrete Fractional Fourier Transform and show that the eigenvectors of the proposed matrix has a higher degree of correlation with the Hermite Gaussian functions. Also, the proposed matrix is shown to give better Fractional Fourier responses with various transform orders for different signals.

Keywords: Fractional Fourier Transform, Hamiltonian, Eigen Vectors, Discrete Hermite Gaussians.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1485
614 Assessment of the Occupancy’s Effect on Speech Intelligibility in Al-Madinah Holy Mosque

Authors: Wasim Orfali, Hesham Tolba

Abstract:

This research investigates the acoustical characteristics of Al-Madinah Holy Mosque. Extensive field measurements were conducted in different locations of Al-Madinah Holy Mosque to characterize its acoustic characteristics. The acoustical characteristics are usually evaluated by the use of objective parameters in unoccupied rooms due to practical considerations. However, under normal conditions, the room occupancy can vary such characteristics due to the effect of the additional sound absorption present in the room or by the change in signal-to-noise ratio. Based on the acoustic measurements carried out in Al-Madinah Holy Mosque with and without occupancy, and the analysis of such measurements, the existence of acoustical deficiencies has been confirmed.

Keywords: Worship sound, Al-Madinah Holy Mosque, mosque acoustics, speech intelligibility.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 660
613 Effects of Hidden Unit Sizes and Autoregressive Features in Mental Task Classification

Authors: Ramaswamy Palaniappan, Nai-Jen Huan

Abstract:

Classification of electroencephalogram (EEG) signals extracted during mental tasks is a technique that is actively pursued for Brain Computer Interfaces (BCI) designs. In this paper, we compared the classification performances of univariateautoregressive (AR) and multivariate autoregressive (MAR) models for representing EEG signals that were extracted during different mental tasks. Multilayer Perceptron (MLP) neural network (NN) trained by the backpropagation (BP) algorithm was used to classify these features into the different categories representing the mental tasks. Classification performances were also compared across different mental task combinations and 2 sets of hidden units (HU): 2 to 10 HU in steps of 2 and 20 to 100 HU in steps of 20. Five different mental tasks from 4 subjects were used in the experimental study and combinations of 2 different mental tasks were studied for each subject. Three different feature extraction methods with 6th order were used to extract features from these EEG signals: AR coefficients computed with Burg-s algorithm (ARBG), AR coefficients computed with stepwise least square algorithm (ARLS) and MAR coefficients computed with stepwise least square algorithm. The best results were obtained with 20 to 100 HU using ARBG. It is concluded that i) it is important to choose the suitable mental tasks for different individuals for a successful BCI design, ii) higher HU are more suitable and iii) ARBG is the most suitable feature extraction method.

Keywords: Autoregressive, Brain-Computer Interface, Electroencephalogram, Neural Network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1765
612 ECG-Based Heartbeat Classification Using Convolutional Neural Networks

Authors: Jacqueline R. T. Alipo-on, Francesca I. F. Escobar, Myles J. T. Tan, Hezerul Abdul Karim, Nouar AlDahoul

Abstract:

Electrocardiogram (ECG) signal analysis and processing are crucial in the diagnosis of cardiovascular diseases which are considered as one of the leading causes of mortality worldwide. However, the traditional rule-based analysis of large volumes of ECG data is time-consuming, labor-intensive, and prone to human errors. With the advancement of the programming paradigm, algorithms such as machine learning have been increasingly used to perform an analysis on the ECG signals. In this paper, various deep learning algorithms were adapted to classify five classes of heart beat types. The dataset used in this work is the synthetic MIT-Beth Israel Hospital (MIT-BIH) Arrhythmia dataset produced from generative adversarial networks (GANs). Various deep learning models such as ResNet-50 convolutional neural network (CNN), 1-D CNN, and long short-term memory (LSTM) were evaluated and compared. ResNet-50 was found to outperform other models in terms of recall and F1 score using a five-fold average score of 98.88% and 98.87%, respectively. 1-D CNN, on the other hand, was found to have the highest average precision of 98.93%.

Keywords: Heartbeat classification, convolutional neural network, electrocardiogram signals, ECG signals, generative adversarial networks, long short-term memory, LSTM, ResNet-50.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 114