Search results for: Gaussian linearization
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 336

Search results for: Gaussian linearization

66 Automatic Musical Genre Classification Using Divergence and Average Information Measures

Authors: Hassan Ezzaidi, Jean Rouat

Abstract:

Recently many research has been conducted to retrieve pertinent parameters and adequate models for automatic music genre classification. In this paper, two measures based upon information theory concepts are investigated for mapping the features space to decision space. A Gaussian Mixture Model (GMM) is used as a baseline and reference system. Various strategies are proposed for training and testing sessions with matched or mismatched conditions, long training and long testing, long training and short testing. For all experiments, the file sections used for testing are never been used during training. With matched conditions all examined measures yield the best and similar scores (almost 100%). With mismatched conditions, the proposed measures yield better scores than the GMM baseline system, especially for the short testing case. It is also observed that the average discrimination information measure is most appropriate for music category classifications and on the other hand the divergence measure is more suitable for music subcategory classifications.

Keywords: Audio feature, information measures, music genre.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1538
65 Parametric Modeling Approach for Call Holding Times for IP based Public Safety Networks via EM Algorithm

Authors: Badarch Tuyatsetseg

Abstract:

This paper presents parametric probability density models for call holding times (CHTs) into emergency call center based on the actual data collected for over a week in the public Emergency Information Network (EIN) in Mongolia. When the set of chosen candidates of Gamma distribution family is fitted to the call holding time data, it is observed that the whole area in the CHT empirical histogram is underestimated due to spikes of higher probability and long tails of lower probability in the histogram. Therefore, we provide the Gaussian parametric model of a mixture of lognormal distributions with explicit analytical expressions for the modeling of CHTs of PSNs. Finally, we show that the CHTs for PSNs are fitted reasonably by a mixture of lognormal distributions via the simulation of expectation maximization algorithm. This result is significant as it expresses a useful mathematical tool in an explicit manner of a mixture of lognormal distributions.

Keywords: A mixture of lognormal distributions, modeling call holding times, public safety network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1609
64 Optimal Feature Extraction Dimension in Finger Vein Recognition Using Kernel Principal Component Analysis

Authors: Amir Hajian, Sepehr Damavandinejadmonfared

Abstract:

In this paper the issue of dimensionality reduction is investigated in finger vein recognition systems using kernel Principal Component Analysis (KPCA). One aspect of KPCA is to find the most appropriate kernel function on finger vein recognition as there are several kernel functions which can be used within PCA-based algorithms. In this paper, however, another side of PCA-based algorithms -particularly KPCA- is investigated. The aspect of dimension of feature vector in PCA-based algorithms is of importance especially when it comes to the real-world applications and usage of such algorithms. It means that a fixed dimension of feature vector has to be set to reduce the dimension of the input and output data and extract the features from them. Then a classifier is performed to classify the data and make the final decision. We analyze KPCA (Polynomial, Gaussian, and Laplacian) in details in this paper and investigate the optimal feature extraction dimension in finger vein recognition using KPCA.

Keywords: Biometrics, finger vein recognition, Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1923
63 Morphing Human Faces: Automatic Control Points Selection and Color Transition

Authors: Stephen Karungaru, Minoru Fukumi, Norio Akamatsu

Abstract:

In this paper, we propose a morphing method by which face color images can be freely transformed. The main focus of this work is the transformation of one face image to another. This method is fully automatic in that it can morph two face images by automatically detecting all the control points necessary to perform the morph. A face detection neural network, edge detection and medium filters are employed to detect the face position and features. Five control points, for both the source and target images, are then extracted based on the facial features. Triangulation method is then used to match and warp the source image to the target image using the control points. Finally color interpolation is done using a color Gaussian model that calculates the color for each particular frame depending on the number of frames used. A real coded Genetic algorithm is used in both the image warping and color blending steps to assist in step size decisions and speed up the morphing. This method results in ''very smooth'' morphs and is fast to process.

Keywords: color transition, genetic algorithms morphing, warping

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2777
62 Optimized Fuzzy Control by Particle Swarm Optimization Technique for Control of CSTR

Authors: Saeed Vaneshani, Hooshang Jazayeri-Rad

Abstract:

Fuzzy logic control (FLC) systems have been tested in many technical and industrial applications as a useful modeling tool that can handle the uncertainties and nonlinearities of modern control systems. The main drawback of the FLC methodologies in the industrial environment is challenging for selecting the number of optimum tuning parameters. In this paper, a method has been proposed for finding the optimum membership functions of a fuzzy system using particle swarm optimization (PSO) algorithm. A synthetic algorithm combined from fuzzy logic control and PSO algorithm is used to design a controller for a continuous stirred tank reactor (CSTR) with the aim of achieving the accurate and acceptable desired results. To exhibit the effectiveness of proposed algorithm, it is used to optimize the Gaussian membership functions of the fuzzy model of a nonlinear CSTR system as a case study. It is clearly proved that the optimized membership functions (MFs) provided better performance than a fuzzy model for the same system, when the MFs were heuristically defined.

Keywords: continuous stirred tank reactor (CSTR), fuzzy logiccontrol (FLC), membership function(MF), particle swarmoptimization (PSO)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3157
61 Optimal Power Allocation for the Proposed Asymmetric Turbo Code for 3G Systems

Authors: K. Ramasamy, B. Balamuralithara, Mohammad Umar Siddiqi

Abstract:

We proposed a new class of asymmetric turbo encoder for 3G systems that performs well in both “water fall" and “error floor" regions in [7]. In this paper, a modified (optimal) power allocation scheme for the different bits of new class of asymmetric turbo encoder has been investigated to enhance the performance. The simulation results and performance bound for proposed asymmetric turbo code with modified Unequal Power Allocation (UPA) scheme for the frame length, N=400, code rate, r=1/3 with Log-MAP decoder over Additive White Gaussian Noise (AWGN) channel are obtained and compared with the system with typical UPA and without UPA. The performance tests are extended over AWGN channel for different frame size to verify the possibility of implementation of the modified UPA scheme for the proposed asymmetric turbo code. From the performance results, it is observed that the proposed asymmetric turbo code with modified UPA performs better than the system without UPA and with typical UPA and it provides a coding gain of 0.4 to 0.52dB.

Keywords: Asymmetric turbo code, Generator polynomial, Interleaver, UPA, WCDMA, cdma2000.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840
60 On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as shared memory parallel mode we introduce a transformation of the training data that allows for the usage of an expensive generalized kernel without additional costs. We present experiments for the Gaussian kernel, but usage of other kernel functions is possible, too. In order to further speed up the decomposition algorithm we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and the improvement of overall support vector machine learning performance. Our method allows for using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machine Training, Multi-ParameterKernels, Shared Memory Parallel Computing, Large Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1394
59 General Purpose Graphic Processing Units Based Real Time Video Tracking System

Authors: Mallikarjuna Rao Gundavarapu, Ch. Mallikarjuna Rao, K. Anuradha Bai

Abstract:

Real Time Video Tracking is a challenging task for computing professionals. The performance of video tracking techniques is greatly affected by background detection and elimination process. Local regions of the image frame contain vital information of background and foreground. However, pixel-level processing of local regions consumes a good amount of computational time and memory space by traditional approaches. In our approach we have explored the concurrent computational ability of General Purpose Graphic Processing Units (GPGPU) to address this problem. The Gaussian Mixture Model (GMM) with adaptive weighted kernels is used for detecting the background. The weights of the kernel are influenced by local regions and are updated by inter-frame variations of these corresponding regions. The proposed system has been tested with GPU devices such as GeForce GTX 280, GeForce GTX 280 and Quadro K2000. The results are encouraging with maximum speed up 10X compared to sequential approach.

Keywords: Connected components, Embrace threads, Local weighted kernel, Structuring element.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1121
58 Quality Estimation of Video Transmitted overan Additive WGN Channel based on Digital Watermarking and Wavelet Transform

Authors: Mohamed S. El-Mahallawy, Attalah Hashad, Hazem Hassan Ali, Heba Sami Zaky

Abstract:

This paper presents an evaluation for a wavelet-based digital watermarking technique used in estimating the quality of video sequences transmitted over Additive White Gaussian Noise (AWGN) channel in terms of a classical objective metric, such as Peak Signal-to-Noise Ratio (PSNR) without the need of the original video. In this method, a watermark is embedded into the Discrete Wavelet Transform (DWT) domain of the original video frames using a quantization method. The degradation of the extracted watermark can be used to estimate the video quality in terms of PSNR with good accuracy. We calculated PSNR for video frames contaminated with AWGN and compared the values with those estimated using the Watermarking-DWT based approach. It is found that the calculated and estimated quality measures of the video frames are highly correlated, suggesting that this method can provide a good quality measure for video frames transmitted over AWGN channel without the need of the original video.

Keywords: AWGN, DWT, PSNR, Watermarking, VideoQuality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1794
57 M-band Wavelet and Cosine Transform Based Watermark Algorithm Using Randomization and Principal Component Analysis

Authors: Tong Liu, Xuan Xu, Xiaodi Wang

Abstract:

Computational techniques derived from digital image processing are playing a significant role in the security and digital copyrights of multimedia and visual arts. This technology has the effect within the domain of computers. This research presents discrete M-band wavelet transform (MWT) and cosine transform (DCT) based watermarking algorithm by incorporating the principal component analysis (PCA). The proposed algorithm is expected to achieve higher perceptual transparency. Specifically, the developed watermarking scheme can successfully resist common signal processing, such as geometric distortions, and Gaussian noise. In addition, the proposed algorithm can be parameterized, thus resulting in more security. To meet these requirements, the image is transformed by a combination of MWT & DCT. In order to improve the security further, we randomize the watermark image to create three code books. During the watermark embedding, PCA is applied to the coefficients in approximation sub-band. Finally, first few component bands represent an excellent domain for inserting the watermark.

Keywords: discrete M-band wavelet transform , discrete M-band wavelet transform, randomized watermark, principal component analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1955
56 A Study of Panel Logit Model and Adaptive Neuro-Fuzzy Inference System in the Prediction of Financial Distress Periods

Authors: Ε. Giovanis

Abstract:

The purpose of this paper is to present two different approaches of financial distress pre-warning models appropriate for risk supervisors, investors and policy makers. We examine a sample of the financial institutions and electronic companies of Taiwan Security Exchange (TSE) market from 2002 through 2008. We present a binary logistic regression with paned data analysis. With the pooled binary logistic regression we build a model including more variables in the regression than with random effects, while the in-sample and out-sample forecasting performance is higher in random effects estimation than in pooled regression. On the other hand we estimate an Adaptive Neuro-Fuzzy Inference System (ANFIS) with Gaussian and Generalized Bell (Gbell) functions and we find that ANFIS outperforms significant Logit regressions in both in-sample and out-of-sample periods, indicating that ANFIS is a more appropriate tool for financial risk managers and for the economic policy makers in central banks and national statistical services.

Keywords: ANFIS, Binary logistic regression, Financialdistress, Panel data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2304
55 A Comparison of Some Thresholding Selection Methods for Wavelet Regression

Authors: Alsaidi M. Altaher, Mohd T. Ismail

Abstract:

In wavelet regression, choosing threshold value is a crucial issue. A too large value cuts too many coefficients resulting in over smoothing. Conversely, a too small threshold value allows many coefficients to be included in reconstruction, giving a wiggly estimate which result in under smoothing. However, the proper choice of threshold can be considered as a careful balance of these principles. This paper gives a very brief introduction to some thresholding selection methods. These methods include: Universal, Sure, Ebays, Two fold cross validation and level dependent cross validation. A simulation study on a variety of sample sizes, test functions, signal-to-noise ratios is conducted to compare their numerical performances using three different noise structures. For Gaussian noise, EBayes outperforms in all cases for all used functions while Two fold cross validation provides the best results in the case of long tail noise. For large values of signal-to-noise ratios, level dependent cross validation works well under correlated noises case. As expected, increasing both sample size and level of signal to noise ratio, increases estimation efficiency.

Keywords: wavelet regression, simulation, Threshold.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1722
54 On Developing an Automatic Speech Recognition System for Standard Arabic Language

Authors: R. Walha, F. Drira, H. El-Abed, A. M. Alimi

Abstract:

The Automatic Speech Recognition (ASR) applied to Arabic language is a challenging task. This is mainly related to the language specificities which make the researchers facing multiple difficulties such as the insufficient linguistic resources and the very limited number of available transcribed Arabic speech corpora. In this paper, we are interested in the development of a HMM-based ASR system for Standard Arabic (SA) language. Our fundamental research goal is to select the most appropriate acoustic parameters describing each audio frame, acoustic models and speech recognition unit. To achieve this purpose, we analyze the effect of varying frame windowing (size and period), acoustic parameter number resulting from features extraction methods traditionally used in ASR, speech recognition unit, Gaussian number per HMM state and number of embedded re-estimations of the Baum-Welch Algorithm. To evaluate the proposed ASR system, a multi-speaker SA connected-digits corpus is collected, transcribed and used throughout all experiments. A further evaluation is conducted on a speaker-independent continue SA speech corpus. The phonemes recognition rate is 94.02% which is relatively high when comparing it with another ASR system evaluated on the same corpus.

Keywords: ASR, HMM, acoustical analysis, acoustic modeling, Standard Arabic language

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1732
53 Impact of Modeling Different Fading Channels on Wireless MAN Fixed IEEE802.16d OFDM System with Diversity Transmission Technique

Authors: Shanar Askar, Shahzad Memon, LachhmanDas, MSKalhoro

Abstract:

Wimax (Worldwide Interoperability for Microwave Access) is a promising technology which can offer high speed data, voice and video service to the customer end, which is presently, dominated by the cable and digital subscriber line (DSL) technologies. The performance assessment of Wimax systems is dealt with. The biggest advantage of Broadband wireless application (BWA) over its wired competitors is its increased capacity and ease of deployment. The aims of this paper are to model and simulate the fixed OFDM IEEE 802.16d physical layer under variant combinations of digital modulation (BPSK, QPSK, and 16-QAM) over diverse combination of fading channels (AWGN, SUIs). Stanford University Interim (SUI) Channel serial was proposed to simulate the fixed broadband wireless access channel environments where IEEE 802.16d is to be deployed. It has six channel models that are grouped into three categories according to three typical different outdoor Terrains, in order to give a comprehensive effect of fading channels on the overall performance of the system.

Keywords: WIMAX, OFDM, Additive White Gaussian Noise, Fading Channel, SUI, Doppler Effect.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2081
52 Genetic-Based Multi Resolution Noisy Color Image Segmentation

Authors: Raghad Jawad Ahmed

Abstract:

Segmentation of a color image composed of different kinds of regions can be a hard problem, namely to compute for an exact texture fields. The decision of the optimum number of segmentation areas in an image when it contains similar and/or un stationary texture fields. A novel neighborhood-based segmentation approach is proposed. A genetic algorithm is used in the proposed segment-pass optimization process. In this pass, an energy function, which is defined based on Markov Random Fields, is minimized. In this paper we use an adaptive threshold estimation method for image thresholding in the wavelet domain based on the generalized Gaussian distribution (GGD) modeling of sub band coefficients. This method called Normal Shrink is computationally more efficient and adaptive because the parameters required for estimating the threshold depend on sub band data energy that used in the pre-stage of segmentation. A quad tree is employed to implement the multi resolution framework, which enables the use of different strategies at different resolution levels, and hence, the computation can be accelerated. The experimental results using the proposed segmentation approach are very encouraging.

Keywords: Color image segmentation, Genetic algorithm, Markov random field, Scale space filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1536
51 Optimal Channel Equalization for MIMO Time-Varying Channels

Authors: Ehab F. Badran, Guoxiang Gu

Abstract:

We consider optimal channel equalization for MIMO (multi-input/multi-output) time-varying channels in the sense of MMSE (minimum mean-squared-error), where the observation noise can be non-stationary. We show that all ZF (zero-forcing) receivers can be parameterized in an affine form which eliminates completely the ISI (inter-symbol-interference), and optimal channel equalizers can be designed through minimization of the MSE (mean-squarederror) between the detected signals and the transmitted signals, among all ZF receivers. We demonstrate that the optimal channel equalizer is a modified Kalman filter, and show that under the AWGN (additive white Gaussian noise) assumption, the proposed optimal channel equalizer minimizes the BER (bit error rate) among all possible ZF receivers. Our results are applicable to optimal channel equalization for DWMT (discrete wavelet multitone), multirate transmultiplexers, OFDM (orthogonal frequency division multiplexing), and DS (direct sequence) CDMA (code division multiple access) wireless data communication systems. A design algorithm for optimal channel equalization is developed, and several simulation examples are worked out to illustrate the proposed design algorithm.

Keywords: Channel equalization, Kalman filtering, Time-varying systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1795
50 Motion-Based Detection and Tracking of Multiple Pedestrians

Authors: A. Harras, A. Tsuji, K. Terada

Abstract:

Tracking of moving people has gained a matter of great importance due to rapid technological advancements in the field of computer vision. The objective of this study is to design a motion based detection and tracking multiple walking pedestrians randomly in different directions. In our proposed method, Gaussian mixture model (GMM) is used to determine moving persons in image sequences. It reacts to changes that take place in the scene like different illumination; moving objects start and stop often, etc. Background noise in the scene is eliminated through applying morphological operations and the motions of tracked people which is determined by using the Kalman filter. The Kalman filter is applied to predict the tracked location in each frame and to determine the likelihood of each detection. We used a benchmark data set for the evaluation based on a side wall stationary camera. The actual scenes from the data set are taken on a street including up to eight people in front of the camera in different two scenes, the duration is 53 and 35 seconds, respectively. In the case of walking pedestrians in close proximity, the proposed method has achieved the detection ratio of 87%, and the tracking ratio is 77 % successfully. When they are deferred from each other, the detection ratio is increased to 90% and the tracking ratio is also increased to 79%.

Keywords: Automatic detection, tracking, pedestrians.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 796
49 Support Vector Machine based Intelligent Watermark Decoding for Anticipated Attack

Authors: Syed Fahad Tahir, Asifullah Khan, Abdul Majid, Anwar M. Mirza

Abstract:

In this paper, we present an innovative scheme of blindly extracting message bits from an image distorted by an attack. Support Vector Machine (SVM) is used to nonlinearly classify the bits of the embedded message. Traditionally, a hard decoder is used with the assumption that the underlying modeling of the Discrete Cosine Transform (DCT) coefficients does not appreciably change. In case of an attack, the distribution of the image coefficients is heavily altered. The distribution of the sufficient statistics at the receiving end corresponding to the antipodal signals overlap and a simple hard decoder fails to classify them properly. We are considering message retrieval of antipodal signal as a binary classification problem. Machine learning techniques like SVM is used to retrieve the message, when certain specific class of attacks is most probable. In order to validate SVM based decoding scheme, we have taken Gaussian noise as a test case. We generate a data set using 125 images and 25 different keys. Polynomial kernel of SVM has achieved 100 percent accuracy on test data.

Keywords: Bit Correct Ratio (BCR), Grid Search, Intelligent Decoding, Jackknife Technique, Support Vector Machine (SVM), Watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1624
48 Statistical Distributions of the Lapped Transform Coefficients for Images

Authors: Vijay Kumar Nath, Deepika Hazarika, Anil Mahanta,

Abstract:

Discrete Cosine Transform (DCT) based transform coding is very popular in image, video and speech compression due to its good energy compaction and decorrelating properties. However, at low bit rates, the reconstructed images generally suffer from visually annoying blocking artifacts as a result of coarse quantization. Lapped transform was proposed as an alternative to the DCT with reduced blocking artifacts and increased coding gain. Lapped transforms are popular for their good performance, robustness against oversmoothing and availability of fast implementation algorithms. However, there is no proper study reported in the literature regarding the statistical distributions of block Lapped Orthogonal Transform (LOT) and Lapped Biorthogonal Transform (LBT) coefficients. This study performs two goodness-of-fit tests, the Kolmogorov-Smirnov (KS) test and the 2- test, to determine the distribution that best fits the LOT and LBT coefficients. The experimental results show that the distribution of a majority of the significant AC coefficients can be modeled by the Generalized Gaussian distribution. The knowledge of the statistical distribution of transform coefficients greatly helps in the design of optimal quantizers that may lead to minimum distortion and hence achieve optimal coding efficiency.

Keywords: Lapped orthogonal transform, Lapped biorthogonal transform, Image compression, KS test,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1556
47 Efficiency Improvement for Conventional Rectangular Horn Antenna by Using EBG Technique

Authors: S. Kampeephat, P. Krachodnok, R. Wongsan

Abstract:

The conventional rectangular horn has been used for microwave antenna a long time. Its gain can be increased by enlarging the construction of horn to flare exponentially. This paper presents a study of the shaped woodpile Electromagnetic Band Gap (EBG) to improve its gain for conventional horn without construction enlargement. The gain enhancement synthesis method for shaped woodpile EBG that has to transfer the electromagnetic fields from aperture of a horn antenna through woodpile EBG is presented by using the variety of shaped woodpile EBGs such as planar, triangular, quadratic, circular, gaussian, cosine, and squared cosine structures. The proposed technique has the advantages of low profile, low cost for fabrication and light weight. The antenna characteristics such as reflection coefficient (S11), radiation patterns and gain are simulated by utilized A Computer Simulation Technology (CST) software. With the proposed concept, an antenna prototype was fabricated and experimented. The S11 and radiation patterns obtained from measurements show a good impedance matching and a gain enhancement of the proposed antenna. The gain at dominant frequency of 10 GHz is 25.6 dB, application for X- and Ku-Band Radar, that higher than the gain of the basic rectangular horn antenna around 8 dB with adding only one appropriated EBG structures.

Keywords: Conventional Rectangular Horn Antenna, Electromagnetic Band Gap, Gain Enhancement, X- and Ku-Band Radar.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3740
46 Improved Closed Set Text-Independent Speaker Identification by Combining MFCC with Evidence from Flipped Filter Banks

Authors: Sandipan Chakroborty, Anindya Roy, Goutam Saha

Abstract:

A state of the art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC) modeled on the human auditory system has been used as a standard acoustic feature set for SI applications. However, due to the structure of its filter bank, it captures vocal tract characteristics more effectively in the lower frequency regions. This paper proposes a new set of features using a complementary filter bank structure which improves distinguishability of speaker specific cues present in the higher frequency zone. Unlike high level features that are difficult to extract, the proposed feature set involves little computational burden during the extraction process. When combined with MFCC via a parallel implementation of speaker models, the proposed feature set outperforms baseline MFCC significantly. This proposition is validated by experiments conducted on two different kinds of public databases namely YOHO (microphone speech) and POLYCOST (telephone speech) with Gaussian Mixture Models (GMM) as a Classifier for various model orders.

Keywords: Complementary Information, Filter Bank, GMM, IMFCC, MFCC, Speaker Identification, Speaker Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2210
45 Adaptive Square-Rooting Companding Technique for PAPR Reduction in OFDM Systems

Authors: Wisam F. Al-Azzo, Borhanuddin Mohd. Ali

Abstract:

This paper addresses the problem of peak-to-average power ratio (PAPR) in orthogonal frequency division multiplexing (OFDM) systems. It also introduces a new PAPR reduction technique based on adaptive square-rooting (SQRT) companding process. The SQRT process of the proposed technique changes the statistical characteristics of the OFDM output signals from Rayleigh distribution to Gaussian-like distribution. This change in statistical distribution results changes of both the peak and average power values of OFDM signals, and consequently reduces significantly the PAPR. For the 64QAM OFDM system using 512 subcarriers, up to 6 dB reduction in PAPR was achieved by square-rooting technique with fixed degradation in bit error rate (BER) equal to 3 dB. However, the PAPR is reduced at the expense of only -15 dB out-ofband spectral shoulder re-growth below the in-band signal level. The proposed adaptive SQRT technique is superior in terms of BER performance than the original, non-adaptive, square-rooting technique when the required reduction in PAPR is no more than 5 dB. Also, it provides fixed amount of PAPR reduction in which it is not available in the original SQRT technique.

Keywords: complementary cumulative distribution function(CCDF), OFDM, peak-to-average power ratio (PAPR), adaptivesquare-rooting PAPR reduction technique.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2152
44 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal to noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speakers dominant time-frequency characteristics. Speech has a high peak to average ratio enabling matching pursuit greedy heuristic of highest inner products to isolate high energy speech components in high noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies, and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms with no significant statistical differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20dB SNR using 30d B SNR as a reference for voice activity.

Keywords: Atomic Decomposition, Gabor, Gammatone, Matching Pursuit, Voice Activity Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751
43 Comparisons of Co-Seismic Gravity Changes between GRACE Observations and the Predictions from the Finite-Fault Models for the 2012 Mw = 8.6 Indian Ocean Earthquake Off-Sumatra

Authors: Armin Rahimi

Abstract:

The Gravity Recovery and Climate Experiment (GRACE) has been a very successful project in determining math redistribution within the Earth system. Large deformations caused by earthquakes are in the high frequency band. Unfortunately, GRACE is only capable to provide reliable estimate at the low-to-medium frequency band for the gravitational changes. In this study, we computed the gravity changes after the 2012 Mw8.6 Indian Ocean earthquake off-Sumatra using the GRACE Level-2 monthly spherical harmonic (SH) solutions released by the University of Texas Center for Space Research (UTCSR). Moreover, we calculated gravity changes using different fault models derived from teleseismic data. The model predictions showed non-negligible discrepancies in gravity changes. However, after removing high-frequency signals, using Gaussian filtering 350 km commensurable GRACE spatial resolution, the discrepancies vanished, and the spatial patterns of total gravity changes predicted from all slip models became similar at the spatial resolution attainable by GRACE observations, and predicted-gravity changes were consistent with the GRACE-detected gravity changes. Nevertheless, the fault models, in which give different slip amplitudes, proportionally lead to different amplitude in the predicted gravity changes.

Keywords: Undersea earthquake, GRACE observation, gravity change, dislocation model, slip distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 946
42 Tidal Data Analysis using ANN

Authors: Ritu Vijay, Rekha Govil

Abstract:

The design of a complete expansion that allows for compact representation of certain relevant classes of signals is a central problem in signal processing applications. Achieving such a representation means knowing the signal features for the purpose of denoising, classification, interpolation and forecasting. Multilayer Neural Networks are relatively a new class of techniques that are mathematically proven to approximate any continuous function arbitrarily well. Radial Basis Function Networks, which make use of Gaussian activation function, are also shown to be a universal approximator. In this age of ever-increasing digitization in the storage, processing, analysis and communication of information, there are numerous examples of applications where one needs to construct a continuously defined function or numerical algorithm to approximate, represent and reconstruct the given discrete data of a signal. Many a times one wishes to manipulate the data in a way that requires information not included explicitly in the data, which is done through interpolation and/or extrapolation. Tidal data are a very perfect example of time series and many statistical techniques have been applied for tidal data analysis and representation. ANN is recent addition to such techniques. In the present paper we describe the time series representation capabilities of a special type of ANN- Radial Basis Function networks and present the results of tidal data representation using RBF. Tidal data analysis & representation is one of the important requirements in marine science for forecasting.

Keywords: ANN, RBF, Tidal Data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608
41 Speaker Independent Quranic Recognizer Basedon Maximum Likelihood Linear Regression

Authors: Ehab Mourtaga, Ahmad Sharieh, Mousa Abdallah

Abstract:

An automatic speech recognition system for the formal Arabic language is needed. The Quran is the most formal spoken book in Arabic, it is spoken all over the world. In this research, an automatic speech recognizer for Quranic based speakerindependent was developed and tested. The system was developed based on the tri-phone Hidden Markov Model and Maximum Likelihood Linear Regression (MLLR). The MLLR computes a set of transformations which reduces the mismatch between an initial model set and the adaptation data. It uses the regression class tree, as well as, estimates a set of linear transformations for the mean and variance parameters of a Gaussian mixture HMM system. The 30th Chapter of the Quran, with five of the most famous readers of the Quran, was used for the training and testing of the data. The chapter includes about 2000 distinct words. The advantages of using the Quranic verses as the database in this developed recognizer are the uniqueness of the words and the high level of orderliness between verses. The level of accuracy from the tested data ranged 68 to 85%.

Keywords: Hidden Markov Model (HMM), MaximumLikelihood Linear Regression (MLLR), Quran, Regression ClassTree, Speech Recognition, Speaker-independent.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1873
40 Recognizing an Individual, Their Topic of Conversation, and Cultural Background from 3D Body Movement

Authors: Gheida J. Shahrour, Martin J. Russell

Abstract:

The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects, arranged them into groups according to their culture. We arranged each group into pairs and each pair communicated with each other about different topics. A state-of-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that intersubject differences (i.e. subject’s personality traits) are a major source of variation. When removing these traits from culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and the Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from culture and topic recognition systems respectively.

Keywords: Person Recognition, Topic Recognition, Culture Recognition, 3D Body Movement Signals, Variability Compensation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2118
39 A New Approach for Image Segmentation using Pillar-Kmeans Algorithm

Authors: Ali Ridho Barakbah, Yasushi Kiyoki

Abstract:

This paper presents a new approach for image segmentation by applying Pillar-Kmeans algorithm. This segmentation process includes a new mechanism for clustering the elements of high-resolution images in order to improve precision and reduce computation time. The system applies K-means clustering to the image segmentation after optimized by Pillar Algorithm. The Pillar algorithm considers the pillars- placement which should be located as far as possible from each other to withstand against the pressure distribution of a roof, as identical to the number of centroids amongst the data distribution. This algorithm is able to optimize the K-means clustering for image segmentation in aspects of precision and computation time. It designates the initial centroids- positions by calculating the accumulated distance metric between each data point and all previous centroids, and then selects data points which have the maximum distance as new initial centroids. This algorithm distributes all initial centroids according to the maximum accumulated distance metric. This paper evaluates the proposed approach for image segmentation by comparing with K-means and Gaussian Mixture Model algorithm and involving RGB, HSV, HSL and CIELAB color spaces. The experimental results clarify the effectiveness of our approach to improve the segmentation quality in aspects of precision and computational time.

Keywords: Image segmentation, K-means clustering, Pillaralgorithm, color spaces.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3328
38 Three-Dimensional Simulation of Free Electron Laser with Prebunching and Efficiency Enhancement

Authors: M. Chitsazi, B. Maraghechi, M. H. Rouhani

Abstract:

Three-dimensional simulation of harmonic up generation in free electron laser amplifier operating simultaneously with a cold and relativistic electron beam is presented in steady-state regime where the slippage of the electromagnetic wave with respect to the electron beam is ignored. By using slowly varying envelope approximation and applying the source-dependent expansion to wave equations, electromagnetic fields are represented in terms of the Hermit Gaussian modes which are well suited for the planar wiggler configuration. The electron dynamics is described by the fully threedimensional Lorentz force equation in presence of the realistic planar magnetostatic wiggler and electromagnetic fields. A set of coupled nonlinear first-order differential equations is derived and solved numerically. The fundamental and third harmonic radiation of the beam is considered. In addition to uniform beam, prebunched electron beam has also been studied. For this effect of sinusoidal distribution of entry times for the electron beam on the evolution of radiation is compared with uniform distribution. It is shown that prebunching reduces the saturation length substantially. For efficiency enhancement the wiggler is set to decrease linearly when the radiation of the third harmonic saturates. The optimum starting point of tapering and the slope of radiation in the amplitude of wiggler are found by successive run of the code.

Keywords: Free electron laser, Prebunching, Undulator, Wiggler.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1422
37 Effect of Transmission Codes on Hybrid SC/MRC Diversity Reception MQAM system over Rayleigh Fading Channels

Authors: J.S. Ubhi, M.S. Patterh, T.S. Kamal

Abstract:

In this paper, the effect of transmission codes on the performance of coherent square M-ary quadrature amplitude modulation (CSMQAM) under hybrid selection/maximal-ratio combining (H-S/MRC) diversity is analysed. The fading channels are modeled as frequency non-selective slow independent and identically distributed Rayleigh fading channels corrupted by additive white Gaussian noise (AWGN). The results for coded MQAM are computed numerically for the case of (24,12) extended Golay code and compared with uncoded MQAM under H-S/MRC diversity by plotting error probabilities versus average signal to noise ratio (SNR) for various values L and N in order to examine the improvement in the performance of the digital communications system as the number of selected diversity branches is increased. The results for no diversity, conventional SC and Lth order MRC schemes are also plotted for comparison. Closed form analytical results derived in this paper are sufficiently simple and therefore can be computed numerically without any approximations. The analytical results presented in this paper are expected to provide useful information needed for design and analysis of digital communication systems over wireless fading channels.

Keywords: Error probability, diversity reception, Rayleigh fading channels, wireless digital communications.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1691