Search results for: perceptual speech coding
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 547

Search results for: perceptual speech coding

457 An Analysis of Genetic Algorithm Based Test Data Compression Using Modified PRL Coding

Authors: K. S. Neelukumari, K. B. Jayanthi

Abstract:

In this paper genetic based test data compression is targeted for improving the compression ratio and for reducing the computation time. The genetic algorithm is based on extended pattern run-length coding. The test set contains a large number of X value that can be effectively exploited to improve the test data compression. In this coding method, a reference pattern is set and its compatibility is checked. For this process, a genetic algorithm is proposed to reduce the computation time of encoding algorithm. This coding technique encodes the 2n compatible pattern or the inversely compatible pattern into a single test data segment or multiple test data segment. The experimental result shows that the compression ratio and computation time is reduced.

Keywords: Backtracking, test data compression (TDC), x-filling, x-propagating and genetic algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1831
456 Recognition of Isolated Speech Signals using Simplified Statistical Parameters

Authors: Abhijit Mitra, Bhargav Kumar Mitra, Biswajoy Chatterjee

Abstract:

We present a novel scheme to recognize isolated speech signals using certain statistical parameters derived from those signals. The determination of the statistical estimates is based on extracted signal information rather than the original signal information in order to reduce the computational complexity. Subtle details of these estimates, after extracting the speech signal from ambience noise, are first exploited to segregate the polysyllabic words from the monosyllabic ones. Precise recognition of each distinct word is then carried out by analyzing the histogram, obtained from these information.

Keywords: Isolated speech signals, Block overlapping technique, Positive peaks, Histogram analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1378
455 Method to Improve Channel Coding Using Cryptography

Authors: Ayyaz Mahmood

Abstract:

A new approach for the improvement of coding gain in channel coding using Advanced Encryption Standard (AES) and Maximum A Posteriori (MAP) algorithm is proposed. This new approach uses the avalanche effect of block cipher algorithm AES and soft output values of MAP decoding algorithm. The performance of proposed approach is evaluated in the presence of Additive White Gaussian Noise (AWGN). For the verification of proposed approach, computer simulation results are included.

Keywords: Advanced Encryption Standard (AES), Avalanche Effect, Maximum A Posteriori (MAP), Soft Input Decryption (SID).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1906
454 Efficient DTW-Based Speech Recognition System for Isolated Words of Arabic Language

Authors: Khalid A. Darabkh, Ala F. Khalifeh, Baraa A. Bathech, Saed W. Sabah

Abstract:

Despite the fact that Arabic language is currently one of the most common languages worldwide, there has been only a little research on Arabic speech recognition relative to other languages such as English and Japanese. Generally, digital speech processing and voice recognition algorithms are of special importance for designing efficient, accurate, as well as fast automatic speech recognition systems. However, the speech recognition process carried out in this paper is divided into three stages as follows: firstly, the signal is preprocessed to reduce noise effects. After that, the signal is digitized and hearingized. Consequently, the voice activity regions are segmented using voice activity detection (VAD) algorithm. Secondly, features are extracted from the speech signal using Mel-frequency cepstral coefficients (MFCC) algorithm. Moreover, delta and acceleration (delta-delta) coefficients have been added for the reason of improving the recognition accuracy. Finally, each test word-s features are compared to the training database using dynamic time warping (DTW) algorithm. Utilizing the best set up made for all affected parameters to the aforementioned techniques, the proposed system achieved a recognition rate of about 98.5% which outperformed other HMM and ANN-based approaches available in the literature.

Keywords: Arabic speech recognition, MFCC, DTW, VAD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4032
453 Adaptive Multiple Transforms Hardware Architecture for Versatile Video Coding

Authors: T. Damak, S. Houidi, M. A. Ben Ayed, N. Masmoudi

Abstract:

The Versatile Video Coding standard (VVC) is actually under development by the Joint Video Exploration Team (or JVET). An Adaptive Multiple Transforms (AMT) approach was announced. It is based on different transform modules that provided an efficient coding. However, the AMT solution raises several issues especially regarding the complexity of the selected set of transforms. This can be an important issue, particularly for a future industrial adoption. This paper proposed an efficient hardware implementation of the most used transform in AMT approach: the DCT II. The developed circuit is adapted to different block sizes and can reach a minimum frequency of 192 MHz allowing an optimized execution time.

Keywords: AMT, DCT II, hardware, transform, VVC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 530
452 Intelligibility of Cued Speech in Video

Authors: P. Heribanová, J. Polec, S. Ondrušová, M. Hosťovecký

Abstract:

This paper discusses the cued speech recognition methods in videoconference. Cued speech is a specific gesture language that is used for communication between deaf people. We define the criteria for sentence intelligibility according to answers of testing subjects (deaf people). In our tests we use 30 sample videos coded by H.264 codec with various bit-rates and various speed of cued speech. Additionally, we define the criteria for consonant sign recognizability in single-handed finger alphabet (dactyl) analogically to acoustics. We use another 12 sample videos coded by H.264 codec with various bit-rates in four different video formats. To interpret the results we apply the standard scale for subjective video quality evaluation and the percentual evaluation of intelligibility as in acoustics. From the results we construct the minimum coded bit-rate recommendations for every spatial resolution.

Keywords: cued speech, inteligibility, logatom, video

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1479
451 The Causation and Solution of Ringing Effect in DCT-based Video Coding

Authors: Yu Yuan, David Feng, Yu-Zhuo Zhong

Abstract:

Ringing effect is one of the most annoying visual artifacts in digital video. It is a significant factor of subjective quality deterioration. However, there is a widely-accepted misunderstanding of its cause. In this paper, we propose a reasonable interpretation of the cause of ringing effect. Based on the interpretation, we suggest further two methods to reduce ringing effect in DCT-based video coding. The methods adaptively adjust quantizers according to video features. Our experiments proved that the methods could efficiently improve subjective quality with acceptable additional computing costs.

Keywords: ringing effect, video coding, subjective quality, DCT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1708
450 Detection of Clipped Fragments in Speech Signals

Authors: Sergei Aleinik, Yuri Matveev

Abstract:

In this paper a novel method for the detection of  clipping in speech signals is described. It is shown that the new  method has better performance than known clipping detection  methods, is easy to implement, and is robust to changes in signal  amplitude, size of data, etc. Statistical simulation results are  presented.

 

Keywords: Clipping, clipped signal, speech signal processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2624
449 A New Time-Frequency Speech Analysis Approach Based On Adaptive Fourier Decomposition

Authors: Liming Zhang

Abstract:

In this paper, a new adaptive Fourier decomposition (AFD) based time-frequency speech analysis approach is proposed. Given the fact that the fundamental frequency of speech signals often undergo fluctuation, the classical short-time Fourier transform (STFT) based spectrogram analysis suffers from the difficulty of window size selection. AFD is a newly developed signal decomposition theory. It is designed to deal with time-varying non-stationary signals. Its outstanding characteristic is to provide instantaneous frequency for each decomposed component, so the time-frequency analysis becomes easier. Experiments are conducted based on the sample sentence in TIMIT Acoustic-Phonetic Continuous Speech Corpus. The results show that the AFD based time-frequency distribution outperforms the STFT based one.

Keywords: Adaptive fourier decomposition, instantaneous frequency, speech analysis, time-frequency distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1683
448 EZW Coding System with Artificial Neural Networks

Authors: Saudagar Abdul Khader Jilani, Syed Abdul Sattar

Abstract:

Image compression plays a vital role in today-s communication. The limitation in allocated bandwidth leads to slower communication. To exchange the rate of transmission in the limited bandwidth the Image data must be compressed before transmission. Basically there are two types of compressions, 1) LOSSY compression and 2) LOSSLESS compression. Lossy compression though gives more compression compared to lossless compression; the accuracy in retrievation is less in case of lossy compression as compared to lossless compression. JPEG, JPEG2000 image compression system follows huffman coding for image compression. JPEG 2000 coding system use wavelet transform, which decompose the image into different levels, where the coefficient in each sub band are uncorrelated from coefficient of other sub bands. Embedded Zero tree wavelet (EZW) coding exploits the multi-resolution properties of the wavelet transform to give a computationally simple algorithm with better performance compared to existing wavelet transforms. For further improvement of compression applications other coding methods were recently been suggested. An ANN base approach is one such method. Artificial Neural Network has been applied to many problems in image processing and has demonstrated their superiority over classical methods when dealing with noisy or incomplete data for image compression applications. The performance analysis of different images is proposed with an analysis of EZW coding system with Error Backpropagation algorithm. The implementation and analysis shows approximately 30% more accuracy in retrieved image compare to the existing EZW coding system.

Keywords: Accuracy, Compression, EZW, JPEG2000, Performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1875
447 Speech Enhancement of Vowels Based on Pitch and Formant Frequency

Authors: R. Rishma Rodrigo, R. Radhika, M. Vanitha Lakshmi

Abstract:

Numerous signal processing based speech enhancement systems have been proposed to improve intelligibility in the presence of noise. Traditionally, studies of neural vowel encoding have focused on the representation of formants (peaks in vowel spectra) in the discharge patterns of the population of auditory-nerve (AN) fibers. A method is presented for recording high-frequency speech components into a low-frequency region, to increase audibility for hearing loss listeners. The purpose of the paper is to enhance the formant of the speech based on the Kaiser window. The pitch and formant of the signal is based on the auto correlation, zero crossing and magnitude difference function. The formant enhancement stage aims to restore the representation of formants at the level of the midbrain. A MATLAB software’s are used for the implementation of the system with low complexity is developed.

Keywords: Formant estimation, formant enhancement, pitch detection, speech analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1588
446 Performance Evaluation of One and Two Dimensional Prime Codes for Optical Code Division Multiple Access Systems

Authors: Gurjit Kaur, Neena Gupta

Abstract:

In this paper, we have analyzed and compared the performance of various coding schemes. The basic ID prime sequence codes are unique in only dimension, i.e. time slots, whereas 2D coding techniques are not unique by their time slots but with their wavelengths also. In this research, we have evaluated and compared the performance of 1D and 2D coding techniques constructed using prime sequence coding pattern for Optical Code Division Multiple Access (OCDMA) system on a single platform. Analysis shows that 2D prime code supports lesser number of active users than 1D codes, but they are having large code family and are the most secure codes compared to other codes. The performance of all these codes is analyzed on basis of number of active users supported at a Bit Error Rate (BER) of 10-9.

Keywords: CDMA, OCDMA, BER, OOC, PC, EPC, MPC, 2-D PC/PC, λc, λa.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1081
445 Induction of Expressive Rules using the Binary Coding Method

Authors: Seyed R Mousavi

Abstract:

In most rule-induction algorithms, the only operator used against nominal attributes is the equality operator =. In this paper, we first propose the use of the inequality operator, , in addition to the equality operator, to increase the expressiveness of induced rules. Then, we present a new method, Binary Coding, which can be used along with an arbitrary rule-induction algorithm to make use of the inequality operator without any need to change the algorithm. Experimental results suggest that the Binary Coding method is promising enough for further investigation, especially in cases where the minimum number of rules is desirable.

Keywords: Data mining, Inequality operator, Number of rules, Rule-induction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1215
444 Performance Evaluation of Wavelet Based Coders on Brain MRI Volumetric Medical Datasets for Storage and Wireless Transmission

Authors: D. Dhouib, A. Naït-Ali, C. Olivier, M. S. Naceur

Abstract:

In this paper, we evaluate the performance of some wavelet based coding algorithms such as 3D QT-L, 3D SPIHT and JPEG2K. In the first step we achieve an objective comparison between three coders, namely 3D SPIHT, 3D QT-L and JPEG2K. For this purpose, eight MRI head scan test sets of 256 x 256x124 voxels have been used. Results show superior performance of 3D SPIHT algorithm, whereas 3D QT-L outperforms JPEG2K. The second step consists of evaluating the robustness of 3D SPIHT and JPEG2K coding algorithm over wireless transmission. Compressed dataset images are then transmitted over AWGN wireless channel or over Rayleigh wireless channel. Results show the superiority of JPEG2K over these two models. In fact, it has been deduced that JPEG2K is more robust regarding coding errors. Thus we may conclude the necessity of using corrector codes in order to protect the transmitted medical information.

Keywords: Image coding, medical imaging, wavelet basedcoder, wireless transmission.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1909
443 Effective Context Lossless Image Coding Approach Based on Adaptive Prediction

Authors: Grzegorz Ulacha, Ryszard Stasiński

Abstract:

In the paper an effective context based lossless coding technique is presented. Three principal and few auxiliary contexts are defined. The predictor adaptation technique is an improved CoBALP algorithm, denoted CoBALP+. Cumulated predictor error combining 8 bias estimators is calculated. It is shown experimentally that indeed, the new technique is time-effective while it outperforms the well known methods having reasonable time complexity, and is inferior only to extremely computationally complex ones.

Keywords: Adaptive prediction, context coding, image losslesscoding, prediction error bias correction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1313
442 Efficient Secured Lossless Coding of Medical Images– Using Modified Runlength Coding for Character Representation

Authors: S. Annadurai, P. Geetha

Abstract:

Lossless compression schemes with secure transmission play a key role in telemedicine applications that helps in accurate diagnosis and research. Traditional cryptographic algorithms for data security are not fast enough to process vast amount of data. Hence a novel Secured lossless compression approach proposed in this paper is based on reversible integer wavelet transform, EZW algorithm, new modified runlength coding for character representation and selective bit scrambling. The use of the lifting scheme allows generating truly lossless integer-to-integer wavelet transforms. Images are compressed/decompressed by well-known EZW algorithm. The proposed modified runlength coding greatly improves the compression performance and also increases the security level. This work employs scrambling method which is fast, simple to implement and it provides security. Lossless compression ratios and distortion performance of this proposed method are found to be better than other lossless techniques.

Keywords: EZW algorithm, lifting scheme, losslesscompression, reversible integer wavelet transform, securetransmission, selective bit scrambling, modified runlength coding .

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1329
441 A New Fast Intra Prediction Mode Decision Algorithm for H.264/AVC Encoders

Authors: A. Elyousfi, A. Tamtaoui, E. Bouyakhf

Abstract:

The H.264/AVC video coding standard contains a number of advanced features. Ones of the new features introduced in this standard is the multiple intramode prediction. Its function exploits directional spatial correlation with adjacent block for intra prediction. With this new features, intra coding of H.264/AVC offers a considerably higher improvement in coding efficiency compared to other compression standard, but computational complexity is increased significantly when brut force rate distortion optimization (RDO) algorithm is used. In this paper, we propose a new fast intra prediction mode decision method for the complexity reduction of H.264 video coding. for luma intra prediction, the proposed method consists of two step: in the first step, we make the RDO for four mode of intra 4x4 block, based the distribution of RDO cost of those modes and the idea that the fort correlation with adjacent mode, we select the best mode of intra 4x4 block. In the second step, we based the fact that the dominating direction of a smaller block is similar to that of bigger block, the candidate modes of 8x8 blocks and 16x16 macroblocks are determined. So, in case of chroma intra prediction, the variance of the chroma pixel values is much smaller than that of luma ones, since our proposed uses only the mode DC. Experimental results show that the new fast intra mode decision algorithm increases the speed of intra coding significantly with negligible loss of PSNR.

Keywords: Intra prediction, H264/AVC, video coding, encodercomplexity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2443
440 On Preprocessing of Speech Signals

Authors: Ayaz Keerio, Bhargav Kumar Mitra, Philip Birch, Rupert Young, Chris Chatwin

Abstract:

Preprocessing of speech signals is considered a crucial step in the development of a robust and efficient speech or speaker recognition system. In this paper, we present some popular statistical outlier-detection based strategies to segregate the silence/unvoiced part of the speech signal from the voiced portion. The proposed methods are based on the utilization of the 3 σ edit rule, and the Hampel Identifier which are compared with the conventional techniques: (i) short-time energy (STE) based methods, and (ii) distribution based methods. The results obtained after applying the proposed strategies on some test voice signals are encouraging.

Keywords: STE based methods, Mahalanobis distance, 3 edit σ rule, Hampel Identifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1657
439 An Implementation of Data Reusable MPEG Video Coding Scheme

Authors: Vasily G. Moshnyaga

Abstract:

This paper presents an optimized MPEG2 video codec implementation, which drastically reduces the number of computations and memory accesses required for video compression. Unlike traditional scheme, we reuse data stored in frame memory to omit unnecessary coding operations and memory read/writes for unchanged macroblocks. Due to dynamic memory sharing among reference frames, data-driven macroblock characterization and selective macroblock processing, we perform less than 15% of the total operations required by a conventional coder while maintaining high picture quality.

Keywords: Data reuse, adaptive processing, video coding, MPEG

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1230
438 Convergence and Divergence in Telephone Conversations: A Case of Persian

Authors: Anna Mirzaiyan, Vahid Parvaresh, Mahmoud Hashemian, Masoud Saeedi

Abstract:

People usually have a telephone voice, which means they adjust their speech to fit particular situations and to blend in with other interlocutors. The question is: Do we speak differently to different people? This possibility has been suggested by social psychologists within Accommodation Theory [1]. Converging toward the speech of another person can be regarded as a polite speech strategy while choosing a language not used by the other interlocutor can be considered as the clearest example of speech divergence [2]. The present study sets out to investigate such processes in the course of everyday telephone conversations. Using Joos-s [3] model of formality in spoken English, the researchers try to explore convergence to or divergence from the addressee. The results propound the actuality that lexical choice, and subsequently, patterns of style vary intriguingly in concordance with the person being addressed.

Keywords: Convergence, divergence, lexical formality, speechaccommodation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3474
437 Colour Image Compression Method Based On Fractal Block Coding Technique

Authors: Dibyendu Ghoshal, Shimal Das

Abstract:

Image compression based on fractal coding is a lossy compression method and normally used for gray level images range and domain blocks in rectangular shape. Fractal based digital image compression technique provide a large compression ratio and in this paper, it is proposed using YUV colour space and the fractal theory which is based on iterated transformation. Fractal geometry is mainly applied in the current study towards colour image compression coding. These colour images possesses correlations among the colour components and hence high compression ratio can be achieved by exploiting all these redundancies. The proposed method utilises the self-similarity in the colour image as well as the cross-correlations between them. Experimental results show that the greater compression ratio can be achieved with large domain blocks but more trade off in image quality is good to acceptable at less than 1 bit per pixel.

Keywords: Fractal coding, Iterated Function System (IFS), Image compression, YUV colour space.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1935
436 On Developing an Automatic Speech Recognition System for Standard Arabic Language

Authors: R. Walha, F. Drira, H. El-Abed, A. M. Alimi

Abstract:

The Automatic Speech Recognition (ASR) applied to Arabic language is a challenging task. This is mainly related to the language specificities which make the researchers facing multiple difficulties such as the insufficient linguistic resources and the very limited number of available transcribed Arabic speech corpora. In this paper, we are interested in the development of a HMM-based ASR system for Standard Arabic (SA) language. Our fundamental research goal is to select the most appropriate acoustic parameters describing each audio frame, acoustic models and speech recognition unit. To achieve this purpose, we analyze the effect of varying frame windowing (size and period), acoustic parameter number resulting from features extraction methods traditionally used in ASR, speech recognition unit, Gaussian number per HMM state and number of embedded re-estimations of the Baum-Welch Algorithm. To evaluate the proposed ASR system, a multi-speaker SA connected-digits corpus is collected, transcribed and used throughout all experiments. A further evaluation is conducted on a speaker-independent continue SA speech corpus. The phonemes recognition rate is 94.02% which is relatively high when comparing it with another ASR system evaluated on the same corpus.

Keywords: ASR, HMM, acoustical analysis, acoustic modeling, Standard Arabic language

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1734
435 Transformation of Vocal Characteristics: A Review of Literature

Authors: Dong-Yan Huang, Ee Ping Ong, Susanto Rahardja, Minghui Dong, Haizhou Li

Abstract:

The transformation of vocal characteristics aims at modifying voice such that the intelligibility of aphonic voice is increased or the voice characteristics of a speaker (source speaker) to be perceived as if another speaker (target speaker) had uttered it. In this paper, the current state-of-the-art voice characteristics transformation methodology is reviewed. Special emphasis is placed on voice transformation methodology and issues for improving the transformed speech quality in intelligibility and naturalness are discussed. In particular, it is suggested to use the modulation theory of speech as a base for research on high quality voice transformation. This approach allows one to separate linguistic, expressive, organic and perspective information of speech, based on an analysis of how they are fused when speech is produced. Therefore, this theory provides the fundamentals not only for manipulating non-linguistic, extra-/paralinguistic and intra-linguistic variables for voice transformation, but also for paving the way for easily transposing the existing voice transformation methods to emotion-related voice quality transformation and speaking style transformation. From the perspectives of human speech production and perception, the popular voice transformation techniques are described and classified them based on the underlying principles either from the speech production or perception mechanisms or from both. In addition, the advantages and limitations of voice transformation techniques and the experimental manipulation of vocal cues are discussed through examples from past and present research. Finally, a conclusion and road map are pointed out for more natural voice transformation algorithms in the future.

Keywords: Voice transformation, Voice Quality, Emotion, Individuality, Speaking Style, Speech Production, Speech Perception.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1993
434 Enhanced Spectral Envelope Coding Based On NLMS for G.729.1

Authors: Keunseok Cho, Sangbae Jeong, Hyungwook Chang, Minsoo Hahn

Abstract:

In this paper, a new encoding algorithm of spectral envelope based on NLMS in G.729.1 for VoIP is proposed. In the TDAC part of G.729.1, the spectral envelope and MDCT coefficients extracted in the weighted CELP coding error (lower-band) and the higher-band input signal are encoded. In order to reduce allocation bits for spectral envelope coding, a new quantization algorithm based on NLMS is proposed. Also, reduced bits are used to enhance sound quality. The performance of the proposed algorithm is evaluated by sound quality and bit reduction rates in clean and frame loss conditions.

Keywords: G.729.1, MDCT coefficient, NLMS, spectral envelope.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1623
433 Cognitive SATP for Airborne Radar Based on Slow-Time Coding

Authors: Fanqiang Kong, Jindong Zhang, Daiyin Zhu

Abstract:

Space-time adaptive processing (STAP) techniques have been motivated as a key enabling technology for advanced airborne radar applications. In this paper, the notion of cognitive radar is extended to STAP technique, and cognitive STAP is discussed. The principle for improving signal-to-clutter ratio (SCNR) based on slow-time coding is given, and the corresponding optimization algorithm based on cyclic and power-like algorithms is presented. Numerical examples show the effectiveness of the proposed method.

Keywords: Space-time adaptive processing (STAP), signal-to-clutter ratio, slow-time coding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 810
432 Speech Enhancement by Marginal Statistical Characterization in the Log Gabor Wavelet Domain

Authors: Suman Senapati, Goutam Saha

Abstract:

This work presents a fusion of Log Gabor Wavelet (LGW) and Maximum a Posteriori (MAP) estimator as a speech enhancement tool for acoustical background noise reduction. The probability density function (pdf) of the speech spectral amplitude is approximated by a Generalized Laplacian Distribution (GLD). Compared to earlier estimators the proposed method estimates the underlying statistical model more accurately by appropriately choosing the model parameters of GLD. Experimental results show that the proposed estimator yields a higher improvement in Segmental Signal-to-Noise Ratio (S-SNR) and lower Log-Spectral Distortion (LSD) in two different noisy environments compared to other estimators.

Keywords: Speech Enhancement, Generalized Laplacian Distribution, Log Gabor Wavelet, Bayesian MAP Marginal Estimator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1589
431 A System of Automatic Speech Recognition based on the Technique of Temporal Retiming

Authors: Samir Abdelhamid, Noureddine Bouguechal

Abstract:

We report in this paper the procedure of a system of automatic speech recognition based on techniques of the dynamic programming. The technique of temporal retiming is a technique used to synchronize between two forms to compare. We will see how this technique is adapted to the field of the automatic speech recognition. We will expose, in a first place, the theory of the function of retiming which is used to compare and to adjust an unknown form with a whole of forms of reference constituting the vocabulary of the application. Then we will give, in the second place, the various algorithms necessary to their implementation on machine. The algorithms which we will present were tested on part of the corpus of words in Arab language Arabdic-10 [4] and gave whole satisfaction. These algorithms are effective insofar as we apply them to the small ones or average vocabularies.

Keywords: Continuous speech recognition, temporal retiming, phonetic decoding, algorithms, vocal signal, dynamic programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1302
430 A proposed High-Resolution Time-Frequency Distribution for the Analysis of Multicomponent and Speech Signals

Authors: D. Boutana, B. Barkat , F. Marir

Abstract:

In this paper, we propose a novel time-frequency distribution (TFD) for the analysis of multi-component signals. In particular, we use synthetic as well as real-life speech signals to prove the superiority of the proposed TFD in comparison to some existing ones. In the comparison, we consider the cross-terms suppression and the high energy concentration of the signal around its instantaneous frequency (IF).

Keywords: Cohen's Class, Multicomponent signal, SeparableKernel, Speech signal, Time- frequency resolution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1829
429 Designing Ontology-Based Knowledge Integration for Preprocessing of Medical Data in Enhancing a Machine Learning System for Coding Assignment of a Multi-Label Medical Text

Authors: Phanu Waraporn

Abstract:

This paper discusses the designing of knowledge integration of clinical information extracted from distributed medical ontologies in order to ameliorate a machine learning-based multilabel coding assignment system. The proposed approach is implemented using a decision tree technique of the machine learning on the university hospital data for patients with Coronary Heart Disease (CHD). The preliminary results obtained show a satisfactory finding that the use of medical ontologies improves the overall system performance.

Keywords: Medical Ontology, Knowledge Integration, Machine Learning, Medical Coding, Text Assignment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1807
428 Optimal Image Compression Based on Sign and Magnitude Coding of Wavelet Coefficients

Authors: Mbainaibeye Jérôme, Noureddine Ellouze

Abstract:

Wavelet transforms is a very powerful tools for image compression. One of its advantage is the provision of both spatial and frequency localization of image energy. However, wavelet transform coefficients are defined by both a magnitude and sign. While algorithms exist for efficiently coding the magnitude of the transform coefficients, they are not efficient for the coding of their sign. It is generally assumed that there is no compression gain to be obtained from the coding of the sign. Only recently have some authors begun to investigate the sign of wavelet coefficients in image coding. Some authors have assumed that the sign information bit of wavelet coefficients may be encoded with the estimated probability of 0.5; the same assumption concerns the refinement information bit. In this paper, we propose a new method for Separate Sign Coding (SSC) of wavelet image coefficients. The sign and the magnitude of wavelet image coefficients are examined to obtain their online probabilities. We use the scalar quantization in which the information of the wavelet coefficient to belong to the lower or to the upper sub-interval in the uncertainly interval is also examined. We show that the sign information and the refinement information may be encoded by the probability of approximately 0.5 only after about five bit planes. Two maps are separately entropy encoded: the sign map and the magnitude map. The refinement information of the wavelet coefficient to belong to the lower or to the upper sub-interval in the uncertainly interval is also entropy encoded. An algorithm is developed and simulations are performed on three standard images in grey scale: Lena, Barbara and Cameraman. Five scales are performed using the biorthogonal wavelet transform 9/7 filter bank. The obtained results are compared to JPEG2000 standard in terms of peak signal to noise ration (PSNR) for the three images and in terms of subjective quality (visual quality). It is shown that the proposed method outperforms the JPEG2000. The proposed method is also compared to other codec in the literature. It is shown that the proposed method is very successful and shows its performance in term of PSNR.

Keywords: Image compression, wavelet transform, sign coding, magnitude coding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1632