Search results for: Perceptual speech filtering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 558

Search results for: Perceptual speech filtering

498 Advances in Artificial Intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: Speech recognition, acoustic phonetic, artificial intelligence, Hidden Markov Models (HMM), statistical models of speech recognition, human machine performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7901
497 Speech Enhancement Using Wavelet Coefficients Masking with Local Binary Patterns

Authors: Christian Arcos, Marley Vellasco, Abraham Alcaim

Abstract:

In this paper, we present a wavelet coefficients masking based on Local Binary Patterns (WLBP) approach to enhance the temporal spectra of the wavelet coefficients for speech enhancement. This technique exploits the wavelet denoising scheme, which splits the degraded speech into pyramidal subband components and extracts frequency information without losing temporal information. Speech enhancement in each high-frequency subband is performed by binary labels through the local binary pattern masking that encodes the ratio between the original value of each coefficient and the values of the neighbour coefficients. This approach enhances the high-frequency spectra of the wavelet transform instead of eliminating them through a threshold. A comparative analysis is carried out with conventional speech enhancement algorithms, demonstrating that the proposed technique achieves significant improvements in terms of PESQ, an international recommendation of objective measure for estimating subjective speech quality. Informal listening tests also show that the proposed method in an acoustic context improves the quality of speech, avoiding the annoying musical noise present in other speech enhancement techniques. Experimental results obtained with a DNN based speech recognizer in noisy environments corroborate the superiority of the proposed scheme in the robust speech recognition scenario.

Keywords: Binary labels, local binary patterns, mask, wavelet coefficients, speech enhancement, speech recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 967
496 Investigation of Combined use of MFCC and LPC Features in Speech Recognition Systems

Authors: К. R. Aida–Zade, C. Ardil, S. S. Rustamov

Abstract:

Statement of the automatic speech recognition problem, the assignment of speech recognition and the application fields are shown in the paper. At the same time as Azerbaijan speech, the establishment principles of speech recognition system and the problems arising in the system are investigated. The computing algorithms of speech features, being the main part of speech recognition system, are analyzed. From this point of view, the determination algorithms of Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) coefficients expressing the basic speech features are developed. Combined use of cepstrals of MFCC and LPC in speech recognition system is suggested to improve the reliability of speech recognition system. To this end, the recognition system is divided into MFCC and LPC-based recognition subsystems. The training and recognition processes are realized in both subsystems separately, and recognition system gets the decision being the same results of each subsystems. This results in decrease of error rate during recognition. The training and recognition processes are realized by artificial neural networks in the automatic speech recognition system. The neural networks are trained by the conjugate gradient method. In the paper the problems observed by the number of speech features at training the neural networks of MFCC and LPC-based speech recognition subsystems are investigated. The variety of results of neural networks trained from different initial points in training process is analyzed. Methodology of combined use of neural networks trained from different initial points in speech recognition system is suggested to improve the reliability of recognition system and increase the recognition quality, and obtained practical results are shown.

Keywords: Speech recognition, cepstral analysis, Voice activation detection algorithm, Mel Frequency Cepstral Coefficients, features of speech, Cepstral Mean Subtraction, neural networks, Linear Predictive Coding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 864
495 Study of Adaptive Filtering Algorithms and the Equalization of Radio Mobile Channel

Authors: Said Elkassimi, Said Safi, B. Manaut

Abstract:

This paper presented a study of three algorithms, the equalization algorithm to equalize the transmission channel with ZF and MMSE criteria, application of channel Bran A, and adaptive filtering algorithms LMS and RLS to estimate the parameters of the equalizer filter, i.e. move to the channel estimation and therefore reflect the temporal variations of the channel, and reduce the error in the transmitted signal. So far the performance of the algorithm equalizer with ZF and MMSE criteria both in the case without noise, a comparison of performance of the LMS and RLS algorithm.

Keywords: Adaptive filtering second equalizer, LMS, RLS Bran A, Proakis (B) MMSE, ZF.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2073
494 EEG Signal Processing Methods to Differentiate Mental States

Authors: Sun H. Hwang, Young E. Lee, Yunhan Ga, Gilwon Yoon

Abstract:

EEG is a very complex signal with noises and other bio-potential interferences. EOG is the most distinct interfering signal when EEG signals are measured and analyzed. It is very important how to process raw EEG signals in order to obtain useful information. In this study, the EEG signal processing techniques such as EOG filtering and outlier removal were examined to minimize unwanted EOG signals and other noises. The two different mental states of resting and focusing were examined through EEG analysis. A focused state was induced by letting subjects to watch a red dot on the white screen. EEG data for 32 healthy subjects were measured. EEG data after 60-Hz notch filtering were processed by a commercially available EOG filtering and our presented algorithm based on the removal of outliers. The ratio of beta wave to theta wave was used as a parameter for determining the degree of focusing. The results show that our algorithm was more appropriate than the existing EOG filtering.

Keywords: EEG, focus, mental state, outlier, signal processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1484
493 A Technique for Improving the Performance of Median Smoothers at the Corners Characterized by Low Order Polynomials

Authors: E. Srinivasan, D. Ebenezer

Abstract:

Median filters with larger windows offer greater smoothing and are more robust than the median filters of smaller windows. However, the larger median smoothers (the median filters with the larger windows) fail to track low order polynomial trends in the signals. Due to this, constant regions are produced at the signal corners, leading to the loss of fine details. In this paper, an algorithm, which combines the ability of the 3-point median smoother in preserving the low order polynomial trends and the superior noise filtering characteristics of the larger median smoother, is introduced. The proposed algorithm (called the combiner algorithm in this paper) is evaluated for its performance on a test image corrupted with different types of noise and the results obtained are included.

Keywords: Image filtering, detail preservation, median filters, nonlinear filters, order statistics filtering, Rank order filtering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1335
492 New Subband Adaptive IIR Filter Based On Polyphase Decomposition

Authors: Young-Seok Choi

Abstract:

We present a subband adaptive infinite-impulse response (IIR) filtering method, which is based on a polyphase decomposition of IIR filter. Motivated by the fact that the polyphase structure has benefits in terms of convergence rate and stability, we introduce the polyphase decomposition to subband IIR filtering, i.e., in each subband high order IIR filter is decomposed into polyphase IIR filters with lower order. Computer simulations demonstrate that the proposed method has improved convergence rate over conventional IIR filters.

Keywords: Subband adaptive filter, IIR filtering. Polyphase decomposition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2438
491 Adaptive Noise Reduction Algorithm for Speech Enhancement

Authors: M. Kalamani, S. Valarmathy, M. Krishnamoorthi

Abstract:

In this paper, Least Mean Square (LMS) adaptive noise reduction algorithm is proposed to enhance the speech signal from the noisy speech. In this, the speech signal is enhanced by varying the step size as the function of the input signal. Objective and subjective measures are made under various noises for the proposed and existing algorithms. From the experimental results, it is seen that the proposed LMS adaptive noise reduction algorithm reduces Mean square Error (MSE) and Log Spectral Distance (LSD) as compared to that of the earlier methods under various noise conditions with different input SNR levels. In addition, the proposed algorithm increases the Peak Signal to Noise Ratio (PSNR) and Segmental SNR improvement (ΔSNRseg) values; improves the Mean Opinion Score (MOS) as compared to that of the various existing LMS adaptive noise reduction algorithms. From these experimental results, it is observed that the proposed LMS adaptive noise reduction algorithm reduces the speech distortion and residual noise as compared to that of the existing methods.

Keywords: LMS, speech enhancement, speech quality, residual noise.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2765
490 Speech Intelligibility Improvement Using Variable Level Decomposition DWT

Authors: Samba Raju, Chiluveru, Manoj Tripathy

Abstract:

Intelligibility is an essential characteristic of a speech signal, which is used to help in the understanding of information in speech signal. Background noise in the environment can deteriorate the intelligibility of a recorded speech. In this paper, we presented a simple variance subtracted - variable level discrete wavelet transform, which improve the intelligibility of speech. The proposed algorithm does not require an explicit estimation of noise, i.e., prior knowledge of the noise; hence, it is easy to implement, and it reduces the computational burden. The proposed algorithm decides a separate decomposition level for each frame based on signal dominant and dominant noise criteria. The performance of the proposed algorithm is evaluated with speech intelligibility measure (STOI), and results obtained are compared with Universal Discrete Wavelet Transform (DWT) thresholding and Minimum Mean Square Error (MMSE) methods. The experimental results revealed that the proposed scheme outperformed competing methods

Keywords: Discrete Wavelet Transform, speech intelligibility, STOI, standard deviation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 640
489 Nonlinear Acoustic Echo Cancellation Using Volterra Filtering with a Variable Step-Size GS-PAP Algorithm

Authors: J. B. Seo, K. J. Kim, S. W. Nam

Abstract:

In this paper, a nonlinear acoustic echo cancellation (AEC) system is proposed, whereby 3rd order Volterra filtering is utilized along with a variable step-size Gauss-Seidel pseudo affine projection (VSSGS-PAP) algorithm. In particular, the proposed nonlinear AEC system is developed by considering a double-talk situation with near-end signal variation. Simulation results demonstrate that the proposed approach yields better nonlinear AEC performance than conventional approaches.

Keywords: Acoustic echo cancellation (AEC), Volterra filtering, variable step-size, GS-PAP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1770
488 Comparative Study of QRS Complex Detection in ECG

Authors: Ibtihel Nouira, Asma Ben Abdallah, Ibtissem Kouaja, Mohamed Hèdi Bedoui

Abstract:

The processing of the electrocardiogram (ECG) signal consists essentially in the detection of the characteristic points of signal which are an important tool in the diagnosis of heart diseases. The most suitable are the detection of R waves. In this paper, we present various mathematical tools used for filtering ECG using digital filtering and Discreet Wavelet Transform (DWT) filtering. In addition, this paper will include two main R peak detection methods by applying a windowing process: The first method is based on calculations derived, the second is a time-frequency method based on Dyadic Wavelet Transform DyWT.

Keywords: Derived calculation methods, Electrocardiogram, R peaks, Wavelet Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2517
487 A Family of Minimal Residual Based Algorithm for Adaptive Filtering

Authors: Noor Atinah Ahmad

Abstract:

The Minimal Residual (MR) is modified for adaptive filtering application. Three forms of MR based algorithm are presented: i) the low complexity SPCG, ii) MREDSI, and iii) MREDSII. The low complexity is a reduced complexity version of a previously proposed SPCG algorithm. Approximations introduced reduce the algorithm to an LMS type algorithm, but, maintain the superior convergence of the SPCG algorithm. Both MREDSI and MREDSII are MR based methods with Euclidean direction of search. The choice of Euclidean directions is shown via simulation to give better misadjustment compared to their gradient search counterparts.

Keywords: Adaptive filtering, Adaptive least square, Minimalresidual method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1394
486 Efficient Filtering of Graph Based Data Using Graph Partitioning

Authors: Nileshkumar Vaishnav, Aditya Tatu

Abstract:

An algebraic framework for processing graph signals axiomatically designates the graph adjacency matrix as the shift operator. In this setup, we often encounter a problem wherein we know the filtered output and the filter coefficients, and need to find out the input graph signal. Solution to this problem using direct approach requires O(N3) operations, where N is the number of vertices in graph. In this paper, we adapt the spectral graph partitioning method for partitioning of graphs and use it to reduce the computational cost of the filtering problem. We use the example of denoising of the temperature data to illustrate the efficacy of the approach.

Keywords: Graph signal processing, graph partitioning, inverse filtering on graphs, algebraic signal processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1172
485 Efficient Realization of an ADFE with a New Adaptive Algorithm

Authors: N. Praveen Kumar, Abhijit Mitra, C. Ardil

Abstract:

Decision feedback equalizers are commonly employed to reduce the error caused by intersymbol interference. Here, an adaptive decision feedback equalizer is presented with a new adaptation algorithm. The algorithm follows a block-based approach of normalized least mean square (NLMS) algorithm with set-membership filtering and achieves a significantly less computational complexity over its conventional NLMS counterpart with set-membership filtering. It is shown in the results that the proposed algorithm yields similar type of bit error rate performance over a reasonable signal to noise ratio in comparison with the latter one.

Keywords: Decision feedback equalizer, Adaptive algorithm, Block based computation, Set membership filtering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1628
484 Various Speech Processing Techniques For Speech Compression And Recognition

Authors: Jalal Karam

Abstract:

Years of extensive research in the field of speech processing for compression and recognition in the last five decades, resulted in a severe competition among the various methods and paradigms introduced. In this paper we include the different representations of speech in the time-frequency and time-scale domains for the purpose of compression and recognition. The examination of these representations in a variety of related work is accomplished. In particular, we emphasize methods related to Fourier analysis paradigms and wavelet based ones along with the advantages and disadvantages of both approaches.

Keywords: Time-Scale, Wavelets, Time-Frequency, Compression, Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2292
483 A Two-Stage Adaptation towards Automatic Speech Recognition System for Malay-Speaking Children

Authors: Mumtaz Begum Mustafa, Siti Salwah Salim, Feizal Dani Rahman

Abstract:

Recently, Automatic Speech Recognition (ASR) systems were used to assist children in language acquisition as it has the ability to detect human speech signal. Despite the benefits offered by the ASR system, there is a lack of ASR systems for Malay-speaking children. One of the contributing factors for this is the lack of continuous speech database for the target users. Though cross-lingual adaptation is a common solution for developing ASR systems for under-resourced language, it is not viable for children as there are very limited speech databases as a source model. In this research, we propose a two-stage adaptation for the development of ASR system for Malay-speaking children using a very limited database. The two stage adaptation comprises the cross-lingual adaptation (first stage) and cross-age adaptation. For the first stage, a well-known speech database that is phonetically rich and balanced, is adapted to the medium-sized Malay adults using supervised MLLR. The second stage adaptation uses the speech acoustic model generated from the first adaptation, and the target database is a small-sized database of the target users. We have measured the performance of the proposed technique using word error rate, and then compare them with the conventional benchmark adaptation. The two stage adaptation proposed in this research has better recognition accuracy as compared to the benchmark adaptation in recognizing children’s speech.

Keywords: Automatic speech recognition system, children speech, adaptation, Malay.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1716
482 Adaptive Digital Watermarking Integrating Fuzzy Inference HVS Perceptual Model

Authors: Sherin M. Youssef, Ahmed Abouelfarag, Noha M. Ghatwary

Abstract:

An adaptive Fuzzy Inference Perceptual model has been proposed for watermarking of digital images. The model depends on the human visual characteristics of image sub-regions in the frequency multi-resolution wavelet domain. In the proposed model, a multi-variable fuzzy based architecture has been designed to produce a perceptual membership degree for both candidate embedding sub-regions and strength watermark embedding factor. Different sizes of benchmark images with different sizes of watermarks have been applied on the model. Several experimental attacks have been applied such as JPEG compression, noises and rotation, to ensure the robustness of the scheme. In addition, the model has been compared with different watermarking schemes. The proposed model showed its robustness to attacks and at the same time achieved a high level of imperceptibility.

Keywords: Watermarking, The human visual system (HVS), Fuzzy Inference System (FIS), Local Binary Pattern (LBP), Discrete Wavelet Transform (DWT).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1778
481 Speech Coding and Recognition

Authors: M. Satya Sai Ram, P. Siddaiah, M. Madhavi Latha

Abstract:

This paper investigates the performance of a speech recognizer in an interactive voice response system for various coded speech signals, coded by using a vector quantization technique namely Multi Switched Split Vector Quantization Technique. The process of recognizing the coded output can be used in Voice banking application. The recognition technique used for the recognition of the coded speech signals is the Hidden Markov Model technique. The spectral distortion performance, computational complexity, and memory requirements of Multi Switched Split Vector Quantization Technique and the performance of the speech recognizer at various bit rates have been computed. From results it is found that the speech recognizer is showing better performance at 24 bits/frame and it is found that the percentage of recognition is being varied from 100% to 93.33% for various bit rates.

Keywords: Linear predictive coding, Speech Recognition, Voice banking, Multi Switched Split Vector Quantization, Hidden Markov Model, Linear Predictive Coefficients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1789
480 A Dynamic Filter for Removal DC - Offset In Current and Voltage Waveforms

Authors: Khaled M.EL-Naggar

Abstract:

In power systems, protective relays must filter their inputs to remove undesirable quantities and retain signal quantities of interest. This job must be performed accurate and fast. A new method for filtering the undesirable components such as DC and harmonic components associated with the fundamental system signals. The method is s based on a dynamic filtering algorithm. The filtering algorithm has many advantages over some other classical methods. It can be used as dynamic on-line filter without the need of parameters readjusting as in the case of classic filters. The proposed filter is tested using different signals. Effects of number of samples and sampling window size are discussed. Results obtained are presented and discussed to show the algorithm capabilities.

Keywords: Protection, DC-offset, Dynamic Filter, Estimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3708
479 Slovenian Text-to-Speech Synthesis for Speech User Interfaces

Authors: Jerneja Žganec Gros, Aleš Mihelič, Nikola Pavešić, Mario Žganec, Stanislav Gruden

Abstract:

The paper presents the design concept of a unitselection text-to-speech synthesis system for the Slovenian language. Due to its modular and upgradable architecture, the system can be used in a variety of speech user interface applications, ranging from server carrier-grade voice portal applications, desktop user interfaces to specialized embedded devices. Since memory and processing power requirements are important factors for a possible implementation in embedded devices, lexica and speech corpora need to be reduced. We describe a simple and efficient implementation of a greedy subset selection algorithm that extracts a compact subset of high coverage text sentences. The experiment on a reference text corpus showed that the subset selection algorithm produced a compact sentence subset with a small redundancy. The adequacy of the spoken output was evaluated by several subjective tests as they are recommended by the International Telecommunication Union ITU.

Keywords: text-to-speech synthesis, prosody modeling, speech user interface.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1401
478 3D Anisotropic Diffusion for Liver Segmentation

Authors: Wan Nural Jawahir Wan Yussof, Hans Burkhardt

Abstract:

Liver segmentation is the first significant process for liver diagnosis of the Computed Tomography. It segments the liver structure from other abdominal organs. Sophisticated filtering techniques are indispensable for a proper segmentation. In this paper, we employ a 3D anisotropic diffusion as a preprocessing step. While removing image noise, this technique preserve the significant parts of the image, typically edges, lines or other details that are important for the interpretation of the image. The segmentation task is done by using thresholding with automatic threshold values selection and finally the false liver region is eliminated using 3D connected component. The result shows that by employing the 3D anisotropic filtering, better liver segmentation results could be achieved eventhough simple segmentation technique is used.

Keywords: 3D Anisotropic Diffusion, non-linear filtering, CT Liver.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1548
477 Design of a DCT-based Image Compression with Efficient Enhancement Filter

Authors: Yen-Yu Chen, Pao-Ching Chu, Ya-Ling Tsai

Abstract:

The algorithm represents the DCT coefficients to concentrate signal energy and proposes combination and dictator to eliminate the correlation in the same level subband for encoding the DCT-based images. This work adopts DCT and modifies the SPIHT algorithm to encode DCT coefficients. The proposed algorithm also provides the enhancement function in low bit rate in order to improve the perceptual quality. Experimental results indicate that the proposed technique improves the quality of the reconstructed image in terms of both PSNR and the perceptual results close to JPEG2000 at the same bit rate.

Keywords: JPEG 2000, enhancement filter

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1643
476 Application of Smooth Ergodic Hidden Markov Model in Text to Speech Systems

Authors: Armin Ghayoori, Faramarz Hendessi, Asrar Sheikh

Abstract:

In developing a text-to-speech system, it is well known that the accuracy of information extracted from a text is crucial to produce high quality synthesized speech. In this paper, a new scheme for converting text into its equivalent phonetic spelling is introduced and developed. This method is applicable to many applications in text to speech converting systems and has many advantages over other methods. The proposed method can also complement the other methods with a purpose of improving their performance. The proposed method is a probabilistic model and is based on Smooth Ergodic Hidden Markov Model. This model can be considered as an extension to HMM. The proposed method is applied to Persian language and its accuracy in converting text to speech phonetics is evaluated using simulations.

Keywords: Hidden Markov Models, text, synthesis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1509
475 Environmental Interference Cancellation of Speech with the Radial Basis Function Networks: An Experimental Comparison

Authors: Nima Hatami

Abstract:

In this paper, we use Radial Basis Function Networks (RBFN) for solving the problem of environmental interference cancellation of speech signal. We show that the Second Order Thin- Plate Spline (SOTPS) kernel cancels the interferences effectively. For make comparison, we test our experiments on two conventional most used RBFN kernels: the Gaussian and First order TPS (FOTPS) basis functions. The speech signals used here were taken from the OGI Multi-Language Telephone Speech Corpus database and were corrupted with six type of environmental noise from NOISEX-92 database. Experimental results show that the SOTPS kernel can considerably outperform the Gaussian and FOTPS functions on speech interference cancellation problem.

Keywords: Environmental interference, interference cancellation of speech, Radial Basis Function networks, Gaussian and TPS kernels.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520
474 A Tool for Audio Quality Evaluation Under Hostile Environment

Authors: Akhil Kumar Arya, Jagdeep Singh Lather, Lillie Dewan

Abstract:

In this paper is to evaluate audio and speech quality with the help of Digital Audio Watermarking Technique under the different types of attacks (signal impairments) like Gaussian Noise, Compression Error and Jittering Effect. Further attacks are considered as Hostile Environment. Audio and Speech Quality Evaluation is an important research topic. The traditional way for speech quality evaluation is using subjective tests. They are reliable, but very expensive, time consuming, and cannot be used in certain applications such as online monitoring. Objective models, based on human perception, were developed to predict the results of subjective tests. The existing objective methods require either the original speech or complicated computation model, which makes some applications of quality evaluation impossible.

Keywords: Digital Watermarking, DCT, Speech Quality, Attacks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1582
473 A Hybrid Scheme for on-Line Diagnostic Decision Making Using Optimal Data Representation and Filtering Technique

Authors: Hyun-Woo Cho

Abstract:

The early diagnostic decision making in industrial processes is absolutely necessary to produce high quality final products. It helps to provide early warning for a special event in a process, and finding its assignable cause can be obtained. This work presents a hybrid diagnostic schmes for batch processes. Nonlinear representation of raw process data is combined with classification tree techniques. The nonlinear kernel-based dimension reduction is executed for nonlinear classification decision boundaries for fault classes. In order to enhance diagnosis performance for batch processes, filtering of the data is performed to get rid of the irrelevant information of the process data. For the diagnosis performance of several representation, filtering, and future observation estimation methods, four diagnostic schemes are evaluated. In this work, the performance of the presented diagnosis schemes is demonstrated using batch process data.

Keywords: Diagnostics, batch process, nonlinear representation, data filtering, multivariate statistical approach

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1279
472 Information Filtering using Index Word Selection based on the Topics

Authors: Takeru YOKOI, Hidekazu YANAGIMOTO, Sigeru OMATU

Abstract:

We have proposed an information filtering system using index word selection from a document set based on the topics included in a set of documents. This method narrows down the particularly characteristic words in a document set and the topics are obtained by Sparse Non-negative Matrix Factorization. In information filtering, a document is often represented with the vector in which the elements correspond to the weight of the index words, and the dimension of the vector becomes larger as the number of documents is increased. Therefore, it is possible that useless words as index words for the information filtering are included. In order to address the problem, the dimension needs to be reduced. Our proposal reduces the dimension by selecting index words based on the topics included in a document set. We have applied the Sparse Non-negative Matrix Factorization to the document set to obtain these topics. The filtering is carried out based on a centroid of the learning document set. The centroid is regarded as the user-s interest. In addition, the centroid is represented with a document vector whose elements consist of the weight of the selected index words. Using the English test collection MEDLINE, thus, we confirm the effectiveness of our proposal. Hence, our proposed selection can confirm the improvement of the recommendation accuracy from the other previous methods when selecting the appropriate number of index words. In addition, we discussed the selected index words by our proposal and we found our proposal was able to select the index words covered some minor topics included in the document set.

Keywords: Information Filtering, Sparse NMF, Index wordSelection, User Profile, Chi-squared Measure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1409
471 Author's Approach to the Problem of Correctional Speech Therapy with Children Suffering from Alalia

Authors: Е. V. Kutsina, S. A. Tarasova

Abstract:

In this article we present a methodology which enables preschool and primary school unlanguaged children to remember words, phrases and texts with the help of graphic signs - letters, syllables and words. Reading for a child becomes a support for speech development. Teaching is based on the principle "from simple to complex", "a letter - a syllable - a word - a proposal - a text." Availability of multi-level texts allows using this methodology for working with children who have different levels of speech development.

Keywords: Alalia, analytic-synthetic method, development of coherent speech, formation of vocabulary, learning to read, , sentence formation, three-level stories, unlanguaged children.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1899
470 Recognition of Isolated Speech Signals using Simplified Statistical Parameters

Authors: Abhijit Mitra, Bhargav Kumar Mitra, Biswajoy Chatterjee

Abstract:

We present a novel scheme to recognize isolated speech signals using certain statistical parameters derived from those signals. The determination of the statistical estimates is based on extracted signal information rather than the original signal information in order to reduce the computational complexity. Subtle details of these estimates, after extracting the speech signal from ambience noise, are first exploited to segregate the polysyllabic words from the monosyllabic ones. Precise recognition of each distinct word is then carried out by analyzing the histogram, obtained from these information.

Keywords: Isolated speech signals, Block overlapping technique, Positive peaks, Histogram analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1377
469 Efficient DTW-Based Speech Recognition System for Isolated Words of Arabic Language

Authors: Khalid A. Darabkh, Ala F. Khalifeh, Baraa A. Bathech, Saed W. Sabah

Abstract:

Despite the fact that Arabic language is currently one of the most common languages worldwide, there has been only a little research on Arabic speech recognition relative to other languages such as English and Japanese. Generally, digital speech processing and voice recognition algorithms are of special importance for designing efficient, accurate, as well as fast automatic speech recognition systems. However, the speech recognition process carried out in this paper is divided into three stages as follows: firstly, the signal is preprocessed to reduce noise effects. After that, the signal is digitized and hearingized. Consequently, the voice activity regions are segmented using voice activity detection (VAD) algorithm. Secondly, features are extracted from the speech signal using Mel-frequency cepstral coefficients (MFCC) algorithm. Moreover, delta and acceleration (delta-delta) coefficients have been added for the reason of improving the recognition accuracy. Finally, each test word-s features are compared to the training database using dynamic time warping (DTW) algorithm. Utilizing the best set up made for all affected parameters to the aforementioned techniques, the proposed system achieved a recognition rate of about 98.5% which outperformed other HMM and ANN-based approaches available in the literature.

Keywords: Arabic speech recognition, MFCC, DTW, VAD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4032