Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2161

Search results for: spatial audio processing.

2161 Spatial Audio Player Using Musical Genre Classification

Authors: Jun-Yong Lee, Hyoung-Gook Kim

Abstract:

In this paper, we propose a smart music player that combines the musical genre classification and the spatial audio processing. The musical genre is classified based on content analysis of the musical segment detected from the audio stream. In parallel with the classification, the spatial audio quality is achieved by adding an artificial reverberation in a virtual acoustic space to the input mono sound. Thereafter, the spatial sound is boosted with the given frequency gains based on the musical genre when played back. Experiments measured the accuracy of detecting the musical segment from the audio stream and its musical genre classification. A listening test was performed based on the virtual acoustic space based spatial audio processing.

Keywords: Automatic equalization, genre classification, music segment detection, spatial audio processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1402
2160 Audio User Interface for Visually Impaired Computer Users: in a Two Dimensional Audio Environment

Authors: Ravihansa Rajapakse, Malshika Dias, Kanishka Weerasekara, Anuja Dharmaratne, Prasad Wimalaratne

Abstract:

In this paper we discuss a set of guidelines which could be adapted when designing an audio user interface for the visually impaired. It is based on an audio environment that is focused on audio positioning. Unlike current applications which only interpret Graphical User Interface (GUI) for the visually impaired, this particular audio environment bypasses GUI to provide a direct auditory output. It presents the capability of two dimensional (2D) navigation on audio interfaces. This paper highlights the significance of a 2D audio environment with spatial information in the context of the visually impaired. A thorough usability study has been conducted to prove the applicability of proposed design guidelines for these auditory interfaces. While proving these guidelines, previously unearthed design aspects have been revealed in this study.

Keywords: Human Computer Interaction, Audio User Interfaces, 2D Audio Environment, Visually Impaired Users

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1991
2159 Orchestra/Percussion Classification Algorithm for United Speech Audio Coding System

Authors: Yueming Wang, Rendong Ying, Sumxin Jiang, Peilin Liu

Abstract:

Unified Speech Audio Coding (USAC), the latest MPEG standardization for unified speech and audio coding, uses a speech/audio classification algorithm to distinguish speech and audio segments of the input signal. The quality of the recovered audio can be increased by well-designed orchestra/percussion classification and subsequent processing. However, owing to the shortcoming of the system, introducing an orchestra/percussion classification and modifying subsequent processing can enormously increase the quality of the recovered audio. This paper proposes an orchestra/percussion classification algorithm for the USAC system which only extracts 3 scales of Mel-Frequency Cepstral Coefficients (MFCCs) rather than traditional 13 scales of MFCCs and use Iterative Dichotomiser 3 (ID3) Decision Tree rather than other complex learning method, thus the proposed algorithm has lower computing complexity than most existing algorithms. Considering that frequent changing of attributes may lead to quality loss of the recovered audio signal, this paper also design a modified subsequent process to help the whole classification system reach an accurate rate as high as 97% which is comparable to classical 99%.

Keywords: ID3 Decision Tree, MFCC, Orchestra/Percussion Classification, USAC

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1433
2158 Intelligent Audio Watermarking using Genetic Algorithm in DWT Domain

Authors: M. Ketcham, S. Vongpradhip

Abstract:

In this paper, an innovative watermarking scheme for audio signal based on genetic algorithms (GA) in the discrete wavelet transforms is proposed. It is robust against watermarking attacks, which are commonly employed in literature. In addition, the watermarked image quality is also considered. We employ GA for the optimal localization and intensity of watermark. The watermark detection process can be performed without using the original audio signal. The experimental results demonstrate that watermark is inaudible and robust to many digital signal processing, such as cropping, low pass filter, additive noise.

Keywords: Intelligent Audio Watermarking, GeneticAlgorithm, DWT Domain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1787
2157 Watermark-based Counter for Restricting Digital Audio Consumption

Authors: Mikko Löytynoja, Nedeljko Cvejic, Tapio Seppänen

Abstract:

In this paper we introduce three watermarking methods that can be used to count the number of times that a user has played some content. The proposed methods are tested with audio content in our experimental system using the most common signal processing attacks. The test results show that the watermarking methods used enable the watermark to be extracted under the most common attacks with a low bit error rate.

Keywords: Digital rights management, restricted usage, content protection, spread spectrum, audio watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1241
2156 Encrypted Audio Communication Based On Synchronized Unified Chaotic Systems

Authors: C. Cruz-Hernández, E. Inzunza-González, R.M. López-Gutiérrez H. Serrano-Guerrero, E.E.García-Guerrero

Abstract:

In this paper, encrypted audio communications based on synchronization of coupled unified chaotic systems in master-slave configuration is numerically studied. We transmit the encrypted audio messages by using two unsecure channels. Encoding, transmission, and decoding audio messages in chaotic communication is presented.

Keywords: Audio encrypted, chaos, synchronization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1420
2155 Application of a Novel Audio Compression Scheme in Automatic Music Recommendation, Digital Rights Management and Audio Fingerprinting

Authors: Anindya Roy, Goutam Saha

Abstract:

Rapid progress in audio compression technology has contributed to the explosive growth of music available in digital form today. In a reversal of ideas, this work makes use of a recently proposed efficient audio compression scheme to develop three important applications in the context of Music Information Retrieval (MIR) for the effective manipulation of large music databases, namely automatic music recommendation (AMR), digital rights management (DRM) and audio finger-printing for song identification. The performance of these three applications has been evaluated with respect to a database of songs collected from a diverse set of genres.

Keywords: Audio compression, Music Information Retrieval, Digital Rights Management, Audio Fingerprinting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1279
2154 Spatial Data Mining by Decision Trees

Authors: S. Oujdi, H. Belbachir

Abstract:

Existing methods of data mining cannot be applied on spatial data because they require spatial specificity consideration, as spatial relationships. This paper focuses on the classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches Join materialization and Querying on the fly the different tables. Similar works have been done on these two main approaches, the first - Join materialization - favors the processing time in spite of memory space, whereas the second - Querying on the fly different tables- promotes memory space despite of the processing time. The modified C4.5 algorithm requires three entries tables: a target table, a neighbor table, and a spatial index join that contains the possible spatial relationship among the objects in the target table and those in the neighbor table. Thus, the proposed algorithms are applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works of classification by spatial decision trees will be detailed.

Keywords: C4.5 Algorithm, Decision trees, S-CART, Spatial data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2533
2153 Freedom of Expression and Its Restriction in Audio Visual Media

Authors: Sevil Yildiz

Abstract:

Audio visual communication is a type of collective expression. Due to inform the masses, give direction to opinions, and establish public opinion, audio visual communication must be subjected to special restrictions. This has been stipulated in both the Constitution and the European Human Rights Agreement. This paper aims to review freedom of expression and its restriction in audio visual media. For this purpose, the authorization of the Radio and Television Supreme Council to impose sanctions as an independent administrative authority empowered to regulate the field of audio visual communication has been reviewed with regard to freedom of expression and its limits.

Keywords: Audio visual media, freedom of expression, its limits, Radio and Television Supreme Council.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1278
2152 A Smart-Visio Microphone for Audio-Visual Speech Recognition “Vmike“

Authors: Y. Ni, K. Sebri

Abstract:

The practical implementation of audio-video coupled speech recognition systems is mainly limited by the hardware complexity to integrate two radically different information capturing devices with good temporal synchronisation. In this paper, we propose a solution based on a smart CMOS image sensor in order to simplify the hardware integration difficulties. By using on-chip image processing, this smart sensor can calculate in real time the X/Y projections of the captured image. This on-chip projection reduces considerably the volume of the output data. This data-volume reduction permits a transmission of the condensed visual information via the same audio channel by using a stereophonic input available on most of the standard computation devices such as PC, PDA and mobile phones. A prototype called VMIKE (Visio-Microphone) has been designed and realised by using standard 0.35um CMOS technology. A preliminary experiment gives encouraged results. Its efficiency will be further investigated in a large variety of applications such as biometrics, speech recognition in noisy environments, and vocal control for military or disabled persons, etc.

Keywords: Audio-Visual Speech recognition, CMOS Smartsensor, On-Chip image processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1574
2151 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects for implementing an explicit spatial perspective in econometrics for modelling non-continuous data, in general, and count data, in particular. It provides an overview of the several spatial econometric approaches that are available to model data that are collected with reference to location in space, from the classical spatial econometrics approaches to the recent developments on spatial econometrics to model count data, in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework, necessary for structural consistent spatial econometric count models, incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures for different assumptions, to the constrains and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modeling and analysis of spatial data, in order to look for new possible directions on the processing of count data, in a spatial hierarchical Bayesian econometric context.

Keywords: Spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2147
2150 An Effective Method for Audio Translation between IAX and RSW Protocols

Authors: Hadeel S. Haj Aliwi, Saleh A. Alomari, Putra Sumari

Abstract:

Nowadays, Multimedia Communication has been developed and improved rapidly in order to enable users to communicate between each other over the Internet. In general, the multimedia communication consists of audio and video communication. However, this paper focuses on audio streams. The audio translation between protocols is a very critical issue due to solving the communication problems between any two protocols, as well as it enables people around the world to talk with each other at anywhere and anytime even they use different protocols. In this paper, a proposed method for an audio translation module between two protocols has been presented. These two protocols are InterAsterisk eXchange Protocol (IAX) and Real Time Switching Control Protocol (RSW), which they are widely used to provide two ways audio transfer feature. The result of this work is to introduce possibility of interworking together.

Keywords: Multimedia, VoIP, Interworking, InterAsterisk eXchange Protocol (IAX), Real Time Switching Control Criteria (REW)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1168
2149 A Robust Audio Fingerprinting Algorithm in MP3 Compressed Domain

Authors: Ruili Zhou, Yuesheng Zhu

Abstract:

In this paper, a new robust audio fingerprinting algorithm in MP3 compressed domain is proposed with high robustness to time scale modification (TSM). Instead of simply employing short-term information of the MP3 stream, the new algorithm extracts the long-term features in MP3 compressed domain by using the modulation frequency analysis. Our experiment has demonstrated that the proposed method can achieve a hit rate of above 95% in audio retrieval and resist the attack of 20% TSM. It has lower bit error rate (BER) performance compared to the other algorithms. The proposed algorithm can also be used in other compressed domains, such as AAC.

Keywords: Audio Fingerprinting, MP3, Modulation Frequency, TSM

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1954
2148 A Watermarking Scheme for MP3 Audio Files

Authors: Dimitrios Koukopoulos, Yiannis Stamatiou

Abstract:

In this work, we present for the first time in our perception an efficient digital watermarking scheme for mpeg audio layer 3 files that operates directly in the compressed data domain, while manipulating the time and subband/channel domain. In addition, it does not need the original signal to detect the watermark. Our scheme was implemented taking special care for the efficient usage of the two limited resources of computer systems: time and space. It offers to the industrial user the capability of watermark embedding and detection in time immediately comparable to the real music time of the original audio file that depends on the mpeg compression, while the end user/audience does not face any artifacts or delays hearing the watermarked audio file. Furthermore, it overcomes the disadvantage of algorithms operating in the PCMData domain to be vulnerable to compression/recompression attacks, as it places the watermark in the scale factors domain and not in the digitized sound audio data. The strength of our scheme, that allows it to be used with success in both authentication and copyright protection, relies on the fact that it gives to the users the enhanced capability their ownership of the audio file not to be accomplished simply by detecting the bit pattern that comprises the watermark itself, but by showing that the legal owner knows a hard to compute property of the watermark.

Keywords: Audio watermarking, mpeg audio layer 3, hardinstance generation, NP-completeness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1367
2147 Linux based Embedded Node for Capturing, Compression and Streaming of Digital Audio and Video

Authors: F.J. Suárez, J.C. Granda, J. Molleda, D.F. García

Abstract:

A prototype for audio and video capture and compression in real time on a Linux platform has been developed. It is able to visualize both the captured and the compressed video at the same time, as well as the captured and compressed audio with the goal of comparing their quality. As it is based on free code, the final goal is to run it in an embedded system running Linux. Therefore, we would implement a node to capture and compress such multimedia information. Thus, it would be possible to consider the project within a larger one aimed at live broadcast of audio and video using a streaming server which would communicate with our node. Then, we would have a very powerful and flexible system with several practical applications.

Keywords: Audio and video compression, Linux platform, live streaming, real time, visualization of captured and compressed video.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1295
2146 Audio Watermarking Using Spectral Modifications

Authors: Jyotsna Singh, Parul Garg, Alok Nath De

Abstract:

In this paper, we present a non-blind technique of adding the watermark to the Fourier spectral components of audio signal in a way such that the modified amplitude does not exceed the maximum amplitude spread (MAS). This MAS is due to individual Discrete fourier transform (DFT) coefficients in that particular frame, which is derived from the Energy Spreading function given by Schroeder. Using this technique one can store double the information within a given frame length i.e. overriding the watermark on the host of equal length with least perceptual distortion. The watermark is uniformly floating on the DFT components of original signal. This helps in detecting any intentional manipulations done on the watermarked audio. Also, the scheme is found robust to various signal processing attacks like presence of multiple watermarks, Additive white gaussian noise (AWGN) and mp3 compression.

Keywords: Discrete Fourier Transform, Spreading Function, Watermark, Pseudo Noise Sequence, Spectral Masking Effect

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1411
2145 Genetic Algorithms for Feature Generation in the Context of Audio Classification

Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes

Abstract:

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a useful representation to machine learning tasks. In automatic audio classification tasks, this is interesting since the audio, usually complex information, needs to be transformed into a computationally convenient input to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have being used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.

Keywords: Feature generation, feature learning, genetic algorithm, music information retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 688
2144 An Efficient Watermarking Method for MP3 Audio Files

Authors: Dimitrios Koukopoulos, Yiannis Stamatiou

Abstract:

In this work, we present for the first time in our perception an efficient digital watermarking scheme for mpeg audio layer 3 files that operates directly in the compressed data domain, while manipulating the time and subband/channel domain. In addition, it does not need the original signal to detect the watermark. Our scheme was implemented taking special care for the efficient usage of the two limited resources of computer systems: time and space. It offers to the industrial user the capability of watermark embedding and detection in time immediately comparable to the real music time of the original audio file that depends on the mpeg compression, while the end user/audience does not face any artifacts or delays hearing the watermarked audio file. Furthermore, it overcomes the disadvantage of algorithms operating in the PCMData domain to be vulnerable to compression/recompression attacks, as it places the watermark in the scale factors domain and not in the digitized sound audio data. The strength of our scheme, that allows it to be used with success in both authentication and copyright protection, relies on the fact that it gives to the users the enhanced capability their ownership of the audio file not to be accomplished simply by detecting the bit pattern that comprises the watermark itself, but by showing that the legal owner knows a hard to compute property of the watermark.

Keywords: Audio watermarking, mpeg audio layer 3, hard instance generation, NP-completeness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1583
2143 Genetic Content-Based MP3 Audio Watermarking in MDCT Domain

Authors: N. Moghadam, H. Sadeghi

Abstract:

In this paper a novel scheme for watermarking digital audio during its compression to MPEG-1 Layer III format is proposed. For this purpose we slightly modify some of the selected MDCT coefficients, which are used during MPEG audio compression procedure. Due to the possibility of modifying different MDCT coefficients, there will be different choices for embedding the watermark into audio data, considering robustness and transparency factors. Our proposed method uses a genetic algorithm to select the best coefficients to embed the watermark. This genetic selection is done according to the parameters that are extracted from the perceptual content of the audio to optimize the robustness and transparency of the watermark. On the other hand the watermark security is increased due to the random nature of the genetic selection. The information of the selected MDCT coefficients that carry the watermark bits, are saves in a database for future extraction of the watermark. The proposed method is suitable for online MP3 stores to pursue illegal copies of musical artworks. Experimental results show that the detection ratio of the watermarks at the bitrate of 128kbps remains above 90% while the inaudibility of the watermark is preserved.

Keywords: Content-Based Audio Watermarking, Genetic AudioWatermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1242
2142 Audio Watermarking Based on Compression-expansion Technique

Authors: Say Wei Foo, Qi Dong

Abstract:

A novel robust audio watermarking scheme is proposed in this paper. In the proposed scheme, the host audio signals are segmented into frames. Two consecutive frames are assessed if they are suitable to represent a watermark bit. If so, frequency transform is performed on these two frames. The compressionexpansion technique is adopted to generate distortion over the two frames. The distortion is used to represent one watermark bit. Psychoacoustic model is applied to calculate local auditory mask to ensure that the distortion is not audible. The watermarking schemes using mono and stereo audio signals are designed differently. The correlation-based detection method is used to detect the distortion and extract embedded watermark bits. The experimental results show that the quality degradation caused by the embedded watermarks is perceptually transparent and the proposed schemes are very robust against different types of attacks.

Keywords: Audio watermarking, Compression-expansion, Stereo signals, Robustness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1404
2141 A Dictionary Learning Method Based On EMD for Audio Sparse Representation

Authors: Yueming Wang, Zenghui Zhang, Rendong Ying, Peilin Liu

Abstract:

Sparse representation has long been studied and several dictionary learning methods have been proposed. The dictionary learning methods are widely used because they are adaptive. In this paper, a new dictionary learning method for audio is proposed. Signals are at first decomposed into different degrees of Intrinsic Mode Functions (IMF) using Empirical Mode Decomposition (EMD) technique. Then these IMFs form a learned dictionary. To reduce the size of the dictionary, the K-means method is applied to the dictionary to generate a K-EMD dictionary. Compared to K-SVD algorithm, the K-EMD dictionary decomposes audio signals into structured components, thus the sparsity of the representation is increased by 34.4% and the SNR of the recovered audio signals is increased by 20.9%.

Keywords: Dictionary Learning, EMD, K-means Method, Sparse Representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2339
2140 Spatial thinking Issues: Towards Rural Sociological Research Agenda in the Third Millennium

Authors: Abdel-Samad M. Ali

Abstract:

Does the spatial perspective provide a common thread for rural sociology? Have rural sociologists succeeded in bringing order to their data using spatial analysis models and techniques? A trial answer to such questions, as touchstones of theoretical and applied sociological studies in rural areas, is the point at issue in the present paper. Spatial analyses have changed the way rural sociologists approach scientific problems. Rural sociology is spatial by nature because much, if not most, of its research topics has a spatial “awareness." However, such spatial awareness is not quite the same as spatial analysis because it is not typically associated with underlying theories and hypotheses about spatial patterns that are designed to be tested for their specific spatial content. This paper presents pressing issues for future research to reintroduce mainstream rural sociology to the concept of space.

Keywords: Maps, Rural Sociology, Space, Spatial variations

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1670
2139 Determination and Comparison of Fabric Pills Distribution Using Image Processing and Spatial Data Analysis Tools

Authors: Lenka Techniková, Maroš Tunák, Jiří Janáček

Abstract:

This work deals with the determination and comparison of pill patterns in 2 sets of fabric samples which differ in way of pill creation. The first set contains fabric samples with the pills created by simulation on a Martindale abrasion machine, while pills in the second set originated during normal wearing and maintenance. The goal of the study is to determine whether the pattern of the fabric pills created by simulation is the same as the pattern of naturally occurring pills. The system of determination and comparison of the pills is based on image processing and spatial data analysis tools. Firstly, 3D reconstruction of the fabric surfaces with the pills is realized with using a gradient fields method. The gradient fields method creates a 3D fabric surface from a set of 4 images. Thereafter, the pills are detected in 3D fabric surfaces using image-processing tools in the MATLAB software. Determination and comparison of the pills patterns of two sets of fabric samples is based on spatial data analysis using tools in R software.

Keywords: 3D reconstruction of the surface, image analysis tools, distribution of the pills, spatial data analysis tools.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1870
2138 A Non-Parametric Based Mapping Algorithm for Use in Audio Fingerprinting

Authors: Analise Borg, Paul Micallef

Abstract:

Over the past few years, the online multimedia collection has grown at a fast pace. Several companies showed interest to study the different ways to organise the amount of audio information without the need of human intervention to generate metadata. In the past few years, many applications have emerged on the market which are capable of identifying a piece of music in a short time. Different audio effects and degradation make it much harder to identify the unknown piece. In this paper, an audio fingerprinting system which makes use of a non-parametric based algorithm is presented. Parametric analysis is also performed using Gaussian Mixture Models (GMMs). The feature extraction methods employed are the Mel Spectrum Coefficients and the MPEG-7 basic descriptors. Bin numbers replaced the extracted feature coefficients during the non-parametric modelling. The results show that nonparametric analysis offer potential results as the ones mentioned in the literature.

Keywords: Audio fingerprinting, mapping algorithm, Gaussian Mixture Models, MFCC, MPEG-7.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2039
2137 A Tool for Audio Quality Evaluation Under Hostile Environment

Authors: Akhil Kumar Arya, Jagdeep Singh Lather, Lillie Dewan

Abstract:

In this paper is to evaluate audio and speech quality with the help of Digital Audio Watermarking Technique under the different types of attacks (signal impairments) like Gaussian Noise, Compression Error and Jittering Effect. Further attacks are considered as Hostile Environment. Audio and Speech Quality Evaluation is an important research topic. The traditional way for speech quality evaluation is using subjective tests. They are reliable, but very expensive, time consuming, and cannot be used in certain applications such as online monitoring. Objective models, based on human perception, were developed to predict the results of subjective tests. The existing objective methods require either the original speech or complicated computation model, which makes some applications of quality evaluation impossible.

Keywords: Digital Watermarking, DCT, Speech Quality, Attacks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1370
2136 A Genetic-Algorithm-Based Approach for Audio Steganography

Authors: Mazdak Zamani , Azizah A. Manaf , Rabiah B. Ahmad , Akram M. Zeki , Shahidan Abdullah

Abstract:

In this paper, we present a novel, principled approach to resolve the remained problems of substitution technique of audio steganography. Using the proposed genetic algorithm, message bits are embedded into multiple, vague and higher LSB layers, resulting in increased robustness. The robustness specially would be increased against those intentional attacks which try to reveal the hidden message and also some unintentional attacks like noise addition as well.

Keywords: Artificial Intelligence, Audio Steganography, DataHiding, Genetic Algorithm, Substitution Techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2843
2135 A Real-Time Signal Processing Technique for MIDI Generation

Authors: Farshad Arvin, Shyamala Doraisamy

Abstract:

This paper presents a new hardware interface using a microcontroller which processes audio music signals to standard MIDI data. A technique for processing music signals by extracting note parameters from music signals is described. An algorithm to convert the voice samples for real-time processing without complex calculations is proposed. A high frequency microcontroller as the main processor is deployed to execute the outlined algorithm. The MIDI data generated is transmitted using the EIA-232 protocol. The analyses of data generated show the feasibility of using microcontrollers for real-time MIDI generation hardware interface.

Keywords: Signal processing, MIDI, Microcontroller, EIA-232.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1915
2134 Satisfaction of Distance Education University Students with the Use of Audio Media as a Medium of Instruction: The Case of Mountains of the Moon University in Uganda

Authors: Mark Kaahwa, Chang Zhu, Moses Muhumuza

Abstract:

This study investigates the satisfaction of distance education university students (DEUS) with the use of audio media as a medium of instruction. Studying students’ satisfaction is vital because it shows whether learners are comfortable with a certain instructional strategy or not. Although previous studies have investigated the use of audio media, the satisfaction of students with an instructional strategy that combines radio teaching and podcasts as an independent teaching strategy has not been fully investigated. In this study, all lectures were delivered through the radio and students had no direct contact with their instructors. No modules or any other material in form of text were given to the students. They instead, revised the taught content by listening to podcasts saved on their mobile electronic gadgets. Prior to data collection, DEUS received orientation through workshops on how to use audio media in distance education. To achieve objectives of the study, a survey, naturalistic observations and face-to-face interviews were used to collect data from a sample of 211 undergraduate and graduate students. Findings indicate that there was no statistically significant difference in the levels of satisfaction between male and female students. The results from post hoc analysis show that there is a statistically significant difference in the levels of satisfaction regarding the use of audio media between diploma and graduate students. Diploma students are more satisfied compared to their graduate counterparts. T-test results reveal that there was no statistically significant difference in the general satisfaction with audio media between rural and urban-based students. And ANOVA results indicate that there is no statistically significant difference in the levels of satisfaction with the use of audio media across age groups. Furthermore, results from observations and interviews reveal that DEUS found learning using audio media a pleasurable medium of instruction. This is an indication that audio media can be considered as an instructional strategy on its own merit.

Keywords: Audio media, distance education, distance education university students, medium of instruction, satisfaction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 419
2133 A 24-Bit, 8.1-MS/s D/A Converter for Audio Baseband Channel Applications

Authors: N. Ben Ameur, M. Loulou

Abstract:

This paper study the high-level modelling and design of delta-sigma (ΔΣ) noise shapers for audio Digital-to-Analog Converter (DAC) so as to eliminate the in-band Signal-to-Noise- Ratio (SNR) degradation that accompany one channel mismatch in audio signal. The converter combines a cascaded digital signal interpolation, a noise-shaping single loop delta-sigma modulator with a 5-bit quantizer resolution in the final stage. To reduce sensitivity of Digital-to-Analog Converter (DAC) nonlinearities of the last stage, a high pass second order Data Weighted Averaging (R2DWA) is introduced. This paper presents a MATLAB description modelling approach of the proposed DAC architecture with low distortion and swing suppression integrator designs. The ΔΣ Modulator design can be configured as a 3rd-order and allows 24-bit PCM at sampling rate of 64 kHz for Digital Video Disc (DVD) audio application. The modeling approach provides 139.38 dB of dynamic range for a 32 kHz signal band at -1.6 dBFS input signal level.

Keywords: DVD-audio, DAC, Interpolator and Interpolation Filter, Single-Loop ΔΣ Modulation, R2DWA, Clock Jitter

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2364
2132 Robust and Transparent Spread Spectrum Audio Watermarking

Authors: Ali Akbar Attari, Ali Asghar Beheshti Shirazi

Abstract:

In this paper, we propose a blind and robust audio watermarking scheme based on spread spectrum in Discrete Wavelet Transform (DWT) domain. Watermarks are embedded in the low-frequency coefficients, which is less audible. The key idea is dividing the audio signal into small frames, and magnitude of the 6th level of DWT approximation coefficients is modifying based upon the Direct Sequence Spread Spectrum (DSSS) technique. Also, the psychoacoustic model for enhancing in imperceptibility, as well as Savitsky-Golay filter for increasing accuracy in extraction, is used. The experimental results illustrate high robustness against most common attacks, i.e. Gaussian noise addition, Low pass filter, Resampling, Requantizing, MP3 compression, without significant perceptual distortion (ODG is higher than -1). The proposed scheme has about 83 bps data payload.

Keywords: Audio watermarking, spread spectrum, discrete wavelet transform, psychoacoustic, Savitsky-Golay filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 593