Search results for: audio codec
140 Performance Study on Audio Codec and Session Transfer of Open Source VoIP applications
Authors: Cheng-Suan Lee, Khong Neng Choong, So Gean Koh, Chee Onn Chow, Mazlan Abbas
Abstract:
Voice over Internet Protocol (VoIP) applications, commonly known as softphones, have been developing an increasingly large market in today's telecommunication world, and the trend is expected to continue with the addition of new features. This includes leveraging existing presence services, location and contextual information to enable more ubiquitous and seamless communications. In this paper, we discuss the concept of seamless session transfer for real-time applications such as VoIP and IPTV, and our prototype implementation of this concept on a selected open source VoIP application. The first part of this paper evaluates the performance of some commonly found open source VoIP applications, namely Ekiga, Kphone, Linphone and Twinkle, so as to identify one of them for implementing our design of seamless session transfer. Subjective testing has been carried out to evaluate the audio performance of these VoIP applications and rank them according to their Mean Opinion Score (MOS) results. The second part of this paper discusses the performance evaluation of our prototype implementation of session transfer using Linphone.
Keywords: audio codec, softphone, session transfer.
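As a rough illustration of the MOS ranking step mentioned in the abstract, the sketch below averages listener ratings on the usual 1 to 5 opinion scale and sorts applications by the result. The application labels and ratings are made-up placeholders, not data from the paper, and the function names are hypothetical.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a list of listener ratings on the 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


def rank_by_mos(ratings_by_app):
    """Return (app, MOS) pairs sorted from the highest MOS to the lowest."""
    scores = {app: mean_opinion_score(r) for app, r in ratings_by_app.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder ratings for illustration only (NOT results from the paper).
example_ratings = {
    "app_a": [4, 4, 5, 3],
    "app_b": [3, 3, 4, 2],
    "app_c": [5, 4, 4, 5],
}
ranking = rank_by_mos(example_ratings)
```

With these placeholder values, `ranking` lists `app_c` first, then `app_a`, then `app_b`.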
Downloads 1686
139 High Quality Speech Coding using Combined Parametric and Perceptual Modules
Authors: M. Kulesza, G. Szwoch, A. Czyżewski
Abstract:
A novel approach to speech coding using a hybrid architecture is presented. The advantages of parametric and perceptual coding methods are combined in order to create a speech coding algorithm that assures better signal quality than the traditional CELP parametric codec. Two approaches are discussed. One is based on the selection of voiced signal components that are encoded using a parametric algorithm, unvoiced components that are encoded perceptually, and transients that remain unencoded. The second approach uses perceptual encoding of the residual signal in the CELP codec. The algorithm applied for precise transient selection is described. The signal quality achieved using the proposed hybrid codec is compared to the quality of some standard speech codecs.
Keywords: CELP residual coding, hybrid codec architecture, perceptual speech coding, speech codecs comparison.
Downloads 1530
138 Audio User Interface for Visually Impaired Computer Users: in a Two Dimensional Audio Environment
Authors: Ravihansa Rajapakse, Malshika Dias, Kanishka Weerasekara, Anuja Dharmaratne, Prasad Wimalaratne
Abstract:
In this paper we discuss a set of guidelines which could be adapted when designing an audio user interface for the visually impaired. It is based on an audio environment that is focused on audio positioning. Unlike current applications which only interpret Graphical User Interface (GUI) for the visually impaired, this particular audio environment bypasses GUI to provide a direct auditory output. It presents the capability of two dimensional (2D) navigation on audio interfaces. This paper highlights the significance of a 2D audio environment with spatial information in the context of the visually impaired. A thorough usability study has been conducted to prove the applicability of proposed design guidelines for these auditory interfaces. While proving these guidelines, previously unearthed design aspects have been revealed in this study.
Keywords: Human Computer Interaction, Audio User Interfaces, 2D Audio Environment, Visually Impaired Users
Downloads 2306
137 Encrypted Audio Communication Based On Synchronized Unified Chaotic Systems
Authors: C. Cruz-Hernández, E. Inzunza-González, R.M. López-Gutiérrez, H. Serrano-Guerrero, E.E. García-Guerrero
Abstract:
In this paper, encrypted audio communication based on the synchronization of coupled unified chaotic systems in a master-slave configuration is studied numerically. We transmit the encrypted audio messages using two insecure channels. Encoding, transmission, and decoding of audio messages in chaotic communication are presented.
Keywords: Audio encrypted, chaos, synchronization.
Downloads 1699
136 Application of a Novel Audio Compression Scheme in Automatic Music Recommendation, Digital Rights Management and Audio Fingerprinting
Authors: Anindya Roy, Goutam Saha
Abstract:
Rapid progress in audio compression technology has contributed to the explosive growth of music available in digital form today. In a reversal of ideas, this work makes use of a recently proposed efficient audio compression scheme to develop three important applications in the context of Music Information Retrieval (MIR) for the effective manipulation of large music databases, namely automatic music recommendation (AMR), digital rights management (DRM) and audio finger-printing for song identification. The performance of these three applications has been evaluated with respect to a database of songs collected from a diverse set of genres.
Keywords: Audio compression, Music Information Retrieval, Digital Rights Management, Audio Fingerprinting.
Downloads 1540
135 VoIP and Database Traffic Co-existence over IEEE 802.11b WLAN with Redundancy
Authors: Rizik Al-Sayyed, Colin Pattinson, Tony Dacre
Abstract:
This paper presents the findings of two experiments that were performed on the Redundancy in Wireless Connection Model (RiWC) using the 802.11b standard. The experiments were simulated using OPNET 11.5 Modeler software. The first was aimed at finding the maximum number of simultaneous Voice over Internet Protocol (VoIP) users the model would support under the G.711 and G.729 codec standards when the packetization interval was 10 milliseconds (ms). The second experiment examined the model's VoIP user capacity using the G.729 codec standard along with background traffic, using the same packetization interval as in the first experiment. To determine the capacity of the model under various experiments, we checked three metrics: jitter, delay and data loss. When background traffic was added, we checked the response time in addition to the previous three metrics. The findings of the first experiment indicated that the maximum number of simultaneous VoIP users the model was able to support with the G.711 codec was 5, which is consistent with recent research findings. When using the G.729 codec, the model was able to support up to 16 VoIP users; similar experiments in current literature have indicated a maximum of 7 users. The findings of the second experiment demonstrated that the maximum number of VoIP users the model was able to support was 12 in the presence of background traffic.
Keywords: WLAN, IEEE 802.11b, Codec, VoIP, OPNET, Background traffic, and QoS.
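To make the 10 ms packetization figure concrete, the following sketch computes the per-direction payload size, packet rate and IP-layer bit rate of one call for G.711 (64 kbit/s) and G.729 (8 kbit/s). It assumes plain IPv4 + UDP + RTP headers (40 bytes, no header compression) and deliberately ignores 802.11b MAC and PHY overhead, which the abstract's capacity results depend on; the function name is hypothetical.

```python
# IPv4 (20) + UDP (8) + RTP (12) header bytes per packet, no header compression.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def voip_stream(codec_bitrate_bps, packet_interval_ms):
    """Return (payload_bytes, packets_per_second, ip_bandwidth_bps) for one direction."""
    # Codec bits produced during one packet interval, converted to bytes.
    payload_bytes = codec_bitrate_bps * packet_interval_ms // 8000
    packets_per_second = 1000 // packet_interval_ms
    ip_bandwidth_bps = (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8 * packets_per_second
    return payload_bytes, packets_per_second, ip_bandwidth_bps


g711 = voip_stream(64_000, 10)  # (80, 100, 96000)
g729 = voip_stream(8_000, 10)   # (10, 100, 40000)
```

At this interval each G.729 packet carries only 10 payload bytes under 40 header bytes, so per-packet overhead rather than codec bit rate accounts for most of the bandwidth.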
Downloads 1694
134 Freedom of Expression and Its Restriction in Audio Visual Media
Authors: Sevil Yildiz
Abstract:
Audio visual communication is a type of collective expression. Because it informs the masses, gives direction to opinions, and establishes public opinion, audio visual communication must be subjected to special restrictions. This has been stipulated in both the Constitution and the European Human Rights Agreement. This paper aims to review freedom of expression and its restriction in audio visual media. For this purpose, the authorization of the Radio and Television Supreme Council to impose sanctions as an independent administrative authority empowered to regulate the field of audio visual communication has been reviewed with regard to freedom of expression and its limits.
Keywords: Audio visual media, freedom of expression, its limits, Radio and Television Supreme Council.
Downloads 1694
133 Bandwidth Estimation Algorithms for the Dynamic Adaptation of Voice Codec
Authors: Davide Pierattoni, Ivan Macor, Pier Luca Montessoro
Abstract:
In recent years, multimedia traffic, and VoIP services in particular, have been growing dramatically. We present a new algorithm to control resource utilization and to optimize the voice codec selection during SIP call setup on the basis of the traffic conditions estimated on the network path. The most suitable methodologies and tools for real-time evaluation of the available bandwidth on a network path have been integrated with our proposed algorithm, which selects the best codec for a VoIP call as a function of the instantaneous available bandwidth on the path. The algorithm does not require any explicit feedback from the network, which makes it easily deployable over the Internet. We have also performed intensive tests on real network scenarios with a software prototype, verifying the algorithm's efficiency with different network topologies and traffic patterns between two SIP PBXs. The promising results obtained during the experimental validation of the algorithm are now the basis for the extension towards a larger set of multimedia services and the integration of our methodology with existing PBX appliances.
Keywords: Integrated voice-data communication, computer network performance, resource optimization.
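The idea of choosing a codec from an available-bandwidth estimate can be sketched as follows. This is not the authors' algorithm: it is a simple "highest bit rate that fits" rule, the codec table holds nominal payload bit rates, and the default overhead of 16 kbit/s is an assumption corresponding to 40 bytes of IP/UDP/RTP headers at 50 packets per second. All names are hypothetical.

```python
# Nominal codec payload bit rates in bit/s (illustrative table).
CODEC_BITRATES = {
    "G.711": 64_000,
    "G.726-32": 32_000,
    "GSM-FR": 13_000,
    "G.729": 8_000,
}


def select_codec(available_bps, overhead_bps=16_000, codecs=CODEC_BITRATES):
    """Pick the highest-rate codec whose rate plus header overhead fits.

    Returns the codec name, or None when no codec fits the estimate.
    """
    fitting = {
        name: rate
        for name, rate in codecs.items()
        if rate + overhead_bps <= available_bps
    }
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

For example, under these assumptions an estimate of 50 kbit/s selects `"G.726-32"`, while 25 kbit/s leaves only `"G.729"`.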
Downloads 1693
132 Spatial Audio Player Using Musical Genre Classification
Authors: Jun-Yong Lee, Hyoung-Gook Kim
Abstract:
In this paper, we propose a smart music player that combines the musical genre classification and the spatial audio processing. The musical genre is classified based on content analysis of the musical segment detected from the audio stream. In parallel with the classification, the spatial audio quality is achieved by adding an artificial reverberation in a virtual acoustic space to the input mono sound. Thereafter, the spatial sound is boosted with the given frequency gains based on the musical genre when played back. Experiments measured the accuracy of detecting the musical segment from the audio stream and its musical genre classification. A listening test was performed based on the virtual acoustic space based spatial audio processing.
Keywords: Automatic equalization, genre classification, music segment detection, spatial audio processing.
Downloads 1624
131 An Effective Method for Audio Translation between IAX and RSW Protocols
Authors: Hadeel S. Haj Aliwi, Saleh A. Alomari, Putra Sumari
Abstract:
Multimedia communication has developed rapidly, enabling users to communicate with each other over the Internet. In general, multimedia communication consists of audio and video communication; this paper focuses on audio streams. Audio translation between protocols is a critical issue because it solves the communication problems between any two protocols and enables people around the world to talk with each other anywhere and at any time, even if they use different protocols. In this paper, a method for an audio translation module between two protocols is presented. These two protocols are the InterAsterisk eXchange Protocol (IAX) and the Real Time Switching Control Protocol (RSW), which are widely used to provide two-way audio transfer. The result of this work is to introduce the possibility of the two protocols interworking.
Keywords: Multimedia, VoIP, Interworking, InterAsterisk eXchange Protocol (IAX), Real Time Switching Control Criteria (RSW)
Downloads 1512
130 Intelligent Audio Watermarking using Genetic Algorithm in DWT Domain
Authors: M. Ketcham, S. Vongpradhip
Abstract:
In this paper, an innovative watermarking scheme for audio signals based on genetic algorithms (GA) in the discrete wavelet transform (DWT) domain is proposed. It is robust against the watermarking attacks commonly employed in the literature. In addition, the watermarked image quality is also considered. We employ GA to optimize the localization and intensity of the watermark. The watermark detection process can be performed without using the original audio signal. The experimental results demonstrate that the watermark is inaudible and robust to many digital signal processing operations, such as cropping, low-pass filtering, and additive noise.
Keywords: Intelligent Audio Watermarking, Genetic Algorithm, DWT Domain.
Downloads 2056
129 The Influence of Audio on Perceived Quality of Segmentation
Authors: Silvio R. R. Sanches, Bianca C. Barbosa, Beatriz R. Brum, Cléber G. Corrêa
Abstract:
In order to evaluate the quality of a segmentation algorithm, the researchers use subjective or objective metrics. Although subjective metrics are more accurate than objective ones, objective metrics do not require user feedback to test an algorithm. Objective metrics require subjective experiments only during their development. Subjective experiments typically display to users some videos (generated from frames with segmentation errors) that simulate the environment of an application domain. This user feedback is crucial information for metric definition. In the subjective experiments applied to develop some state-of-the-art metrics used to test segmentation algorithms, the videos displayed during the experiments did not contain audio. Audio is an essential component in applications such as videoconference and augmented reality. If the audio influences the user’s perception, using only videos without audio in subjective experiments can compromise the efficiency of an objective metric generated using data from these experiments. This work aims to identify if the audio influences the user’s perception of segmentation quality in background substitution applications with audio. The proposed approach used a subjective method based on formal video quality assessment methods. The results showed that audio influences the quality of segmentation perceived by a user.
Keywords: Background substitution, influence of audio, segmentation evaluation, segmentation quality.
Downloads 356
128 A Robust Audio Fingerprinting Algorithm in MP3 Compressed Domain
Authors: Ruili Zhou, Yuesheng Zhu
Abstract:
In this paper, a new robust audio fingerprinting algorithm in MP3 compressed domain is proposed with high robustness to time scale modification (TSM). Instead of simply employing short-term information of the MP3 stream, the new algorithm extracts the long-term features in MP3 compressed domain by using the modulation frequency analysis. Our experiment has demonstrated that the proposed method can achieve a hit rate of above 95% in audio retrieval and resist the attack of 20% TSM. It has lower bit error rate (BER) performance compared to the other algorithms. The proposed algorithm can also be used in other compressed domains, such as AAC.
Keywords: Audio Fingerprinting, MP3, Modulation Frequency, TSM
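The bit error rate (BER) mentioned in the abstract is, in general, the fraction of bits that differ between two binary sequences. The sketch below computes it for two fingerprints given as equal-length lists of 0/1 values and applies a match threshold; the threshold value and both function names are assumptions for illustration, not parameters taken from the paper.

```python
def bit_error_rate(fp_a, fp_b):
    """Fraction of positions at which two equal-length bit sequences differ."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    if not fp_a:
        raise ValueError("fingerprints must not be empty")
    errors = sum(a != b for a, b in zip(fp_a, fp_b))
    return errors / len(fp_a)


def is_match(fp_a, fp_b, threshold=0.35):
    """Declare a match when the BER falls below a (hypothetical) threshold."""
    return bit_error_rate(fp_a, fp_b) < threshold


reference = [1, 0, 1, 1, 0, 0, 1, 0]
query = [1, 0, 0, 1, 0, 1, 1, 0]
ber = bit_error_rate(reference, query)  # 2 differing bits out of 8
```

A lower BER between the fingerprint of a distorted clip and that of the reference recording indicates a more robust fingerprint.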
Downloads 2196
127 Orchestra/Percussion Classification Algorithm for United Speech Audio Coding System
Authors: Yueming Wang, Rendong Ying, Sumxin Jiang, Peilin Liu
Abstract:
Unified Speech and Audio Coding (USAC), the latest MPEG standard for unified speech and audio coding, uses a speech/audio classification algorithm to distinguish speech and audio segments of the input signal. The quality of the recovered audio can be increased by a well-designed orchestra/percussion classification and subsequent processing. Owing to the shortcomings of the existing system, introducing an orchestra/percussion classification and modifying the subsequent processing can greatly increase the quality of the recovered audio. This paper proposes an orchestra/percussion classification algorithm for the USAC system which extracts only 3 scales of Mel-Frequency Cepstral Coefficients (MFCCs) rather than the traditional 13 scales and uses an Iterative Dichotomiser 3 (ID3) decision tree rather than other, more complex learning methods; thus the proposed algorithm has lower computational complexity than most existing algorithms. Considering that frequent changes of attributes may lead to quality loss in the recovered audio signal, this paper also designs a modified subsequent process that helps the whole classification system reach an accuracy rate as high as 97%, which is comparable to the classical 99%.
Keywords: ID3 Decision Tree, MFCC, Orchestra/Percussion Classification, USAC
Downloads 1673
126 A Watermarking Scheme for MP3 Audio Files
Authors: Dimitrios Koukopoulos, Yiannis Stamatiou
Abstract:
In this work, we present what is, to our knowledge, the first efficient digital watermarking scheme for MPEG audio layer 3 files that operates directly in the compressed data domain while manipulating the time and subband/channel domain. In addition, it does not need the original signal to detect the watermark. Our scheme was implemented taking special care to use efficiently the two limited resources of computer systems: time and space. It offers the industrial user the capability of watermark embedding and detection in a time directly comparable to the real playing time of the original audio file, which depends on the MPEG compression, while the end user/audience does not face any artifacts or delays when hearing the watermarked audio file. Furthermore, it overcomes the vulnerability of algorithms operating in the PCM data domain to compression/recompression attacks, as it places the watermark in the scale factors domain and not in the digitized audio data. The strength of our scheme, which allows it to be used successfully in both authentication and copyright protection, relies on the fact that users prove their ownership of the audio file not simply by detecting the bit pattern that comprises the watermark itself, but by showing that the legal owner knows a hard-to-compute property of the watermark.
Keywords: Audio watermarking, mpeg audio layer 3, hard instance generation, NP-completeness.
Downloads 1651
125 Linux based Embedded Node for Capturing, Compression and Streaming of Digital Audio and Video
Authors: F.J. Suárez, J.C. Granda, J. Molleda, D.F. García
Abstract:
A prototype for audio and video capture and compression in real time on a Linux platform has been developed. It is able to visualize both the captured and the compressed video at the same time, as well as the captured and compressed audio with the goal of comparing their quality. As it is based on free code, the final goal is to run it in an embedded system running Linux. Therefore, we would implement a node to capture and compress such multimedia information. Thus, it would be possible to consider the project within a larger one aimed at live broadcast of audio and video using a streaming server which would communicate with our node. Then, we would have a very powerful and flexible system with several practical applications.
Keywords: Audio and video compression, Linux platform, live streaming, real time, visualization of captured and compressed video.
Downloads 1554
124 Genetic Algorithms for Feature Generation in the Context of Audio Classification
Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes
Abstract:
Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a representation useful for machine learning tasks. In automatic audio classification tasks, this is interesting since the audio, usually complex information, needs to be transformed into a computationally convenient input to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have been used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.
Keywords: Feature generation, feature learning, genetic algorithm, music information retrieval.
Downloads 1078
123 An Efficient Watermarking Method for MP3 Audio Files
Authors: Dimitrios Koukopoulos, Yiannis Stamatiou
Abstract:
In this work, we present what is, to our knowledge, the first efficient digital watermarking scheme for MPEG audio layer 3 files that operates directly in the compressed data domain while manipulating the time and subband/channel domain. In addition, it does not need the original signal to detect the watermark. Our scheme was implemented taking special care to use efficiently the two limited resources of computer systems: time and space. It offers the industrial user the capability of watermark embedding and detection in a time directly comparable to the real playing time of the original audio file, which depends on the MPEG compression, while the end user/audience does not face any artifacts or delays when hearing the watermarked audio file. Furthermore, it overcomes the vulnerability of algorithms operating in the PCM data domain to compression/recompression attacks, as it places the watermark in the scale factors domain and not in the digitized audio data. The strength of our scheme, which allows it to be used successfully in both authentication and copyright protection, relies on the fact that users prove their ownership of the audio file not simply by detecting the bit pattern that comprises the watermark itself, but by showing that the legal owner knows a hard-to-compute property of the watermark.
Keywords: Audio watermarking, mpeg audio layer 3, hard instance generation, NP-completeness.
Downloads 1834
122 Genetic Content-Based MP3 Audio Watermarking in MDCT Domain
Authors: N. Moghadam, H. Sadeghi
Abstract:
In this paper, a novel scheme for watermarking digital audio during its compression to MPEG-1 Layer III format is proposed. For this purpose, we slightly modify some selected MDCT coefficients, which are used during the MPEG audio compression procedure. Because different MDCT coefficients can be modified, there are different choices for embedding the watermark into the audio data, considering robustness and transparency factors. Our proposed method uses a genetic algorithm to select the best coefficients in which to embed the watermark. This genetic selection is done according to parameters that are extracted from the perceptual content of the audio, in order to optimize the robustness and transparency of the watermark. In addition, the watermark security is increased by the random nature of the genetic selection. The information about the selected MDCT coefficients that carry the watermark bits is saved in a database for future extraction of the watermark. The proposed method is suitable for online MP3 stores to pursue illegal copies of musical artworks. Experimental results show that the detection ratio of the watermarks at a bitrate of 128 kbps remains above 90% while the inaudibility of the watermark is preserved.
Keywords: Content-Based Audio Watermarking, Genetic Audio Watermarking.
Downloads 1517
121 Audio Watermarking Based on Compression-expansion Technique
Authors: Say Wei Foo, Qi Dong
Abstract:
A novel robust audio watermarking scheme is proposed in this paper. In the proposed scheme, the host audio signals are segmented into frames. Two consecutive frames are assessed if they are suitable to represent a watermark bit. If so, frequency transform is performed on these two frames. The compression-expansion technique is adopted to generate distortion over the two frames. The distortion is used to represent one watermark bit. Psychoacoustic model is applied to calculate local auditory mask to ensure that the distortion is not audible. The watermarking schemes using mono and stereo audio signals are designed differently. The correlation-based detection method is used to detect the distortion and extract embedded watermark bits. The experimental results show that the quality degradation caused by the embedded watermarks is perceptually transparent and the proposed schemes are very robust against different types of attacks.
Keywords: Audio watermarking, Compression-expansion, Stereo signals, Robustness.
Downloads 1645
120 Intelligibility of Cued Speech in Video
Authors: P. Heribanová, J. Polec, S. Ondrušová, M. Hosťovecký
Abstract:
This paper discusses cued speech recognition methods in videoconferencing. Cued speech is a specific gesture language that is used for communication between deaf people. We define the criteria for sentence intelligibility according to the answers of test subjects (deaf people). In our tests we use 30 sample videos coded by the H.264 codec with various bit-rates and various speeds of cued speech. Additionally, we define the criteria for consonant sign recognizability in the single-handed finger alphabet (dactyl) by analogy with acoustics. We use another 12 sample videos coded by the H.264 codec with various bit-rates in four different video formats. To interpret the results we apply the standard scale for subjective video quality evaluation and the percentage evaluation of intelligibility as in acoustics. From the results we construct the minimum coded bit-rate recommendations for every spatial resolution.
Keywords: cued speech, intelligibility, logatom, video
Downloads 1530
119 A Dictionary Learning Method Based On EMD for Audio Sparse Representation
Authors: Yueming Wang, Zenghui Zhang, Rendong Ying, Peilin Liu
Abstract:
Sparse representation has long been studied and several dictionary learning methods have been proposed. The dictionary learning methods are widely used because they are adaptive. In this paper, a new dictionary learning method for audio is proposed. Signals are at first decomposed into different degrees of Intrinsic Mode Functions (IMF) using Empirical Mode Decomposition (EMD) technique. Then these IMFs form a learned dictionary. To reduce the size of the dictionary, the K-means method is applied to the dictionary to generate a K-EMD dictionary. Compared to K-SVD algorithm, the K-EMD dictionary decomposes audio signals into structured components, thus the sparsity of the representation is increased by 34.4% and the SNR of the recovered audio signals is increased by 20.9%.
Keywords: Dictionary Learning, EMD, K-means Method, Sparse Representation.
Downloads 2628
118 A Novel Compression Algorithm for Electrocardiogram Signals based on Wavelet Transform and SPIHT
Authors: Sana Ktata, Kaïs Ouni, Noureddine Ellouze
Abstract:
An electrocardiogram (ECG) data compression algorithm is needed that reduces the amount of data to be transmitted, stored and analyzed without losing the clinical information content. A wavelet ECG data codec based on the Set Partitioning In Hierarchical Trees (SPIHT) compression algorithm is proposed in this paper. The SPIHT algorithm has achieved notable success in still image coding. We modified the algorithm for the one-dimensional (1-D) case and applied it to the compression of ECG data. With this compression method, a small percent root mean square difference (PRD) and a high compression ratio are achieved with low implementation complexity. Experiments on selected records from the MIT-BIH arrhythmia database revealed that the proposed codec is significantly more efficient in compression and in computation than previously proposed ECG compression schemes. Compression ratios of up to 48:1 for ECG signals lead to acceptable results for visual inspection.
Keywords: Discrete Wavelet Transform, ECG compression, SPIHT.
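The two figures of merit named in the abstract can be computed as in the sketch below. It uses the common PRD definition, 100 * sqrt(sum((x - y)^2) / sum(x^2)), without mean removal; other PRD variants subtract the signal mean or a baseline first, and the abstract does not say which one the authors used. The function names are hypothetical.

```python
import math


def prd(original, reconstructed):
    """Percent root mean square difference between a signal and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    signal_energy = sum(x ** 2 for x in original)
    if signal_energy == 0:
        raise ValueError("original signal has zero energy")
    return 100.0 * math.sqrt(error_energy / signal_energy)


def compression_ratio(original_bits, compressed_bits):
    """Size of the original data divided by the size of the compressed data."""
    return original_bits / compressed_bits
```

For instance, `compression_ratio(48_000, 1_000)` gives 48.0, which corresponds to the 48:1 ratio quoted above, and a perfect reconstruction gives a PRD of 0.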
Downloads 2131
117 Watermark-based Counter for Restricting Digital Audio Consumption
Authors: Mikko Löytynoja, Nedeljko Cvejic, Tapio Seppänen
Abstract:
In this paper we introduce three watermarking methods that can be used to count the number of times that a user has played some content. The proposed methods are tested with audio content in our experimental system using the most common signal processing attacks. The test results show that the watermarking methods used enable the watermark to be extracted under the most common attacks with a low bit error rate.
Keywords: Digital rights management, restricted usage, content protection, spread spectrum, audio watermarking.
Downloads 1466
116 A Non-Parametric Based Mapping Algorithm for Use in Audio Fingerprinting
Authors: Analise Borg, Paul Micallef
Abstract:
Over the past few years, the online multimedia collection has grown at a fast pace. Several companies have shown interest in studying different ways to organise this amount of audio information without the need for human intervention to generate metadata. In the past few years, many applications have emerged on the market which are capable of identifying a piece of music in a short time. Different audio effects and degradation make it much harder to identify the unknown piece. In this paper, an audio fingerprinting system which makes use of a non-parametric based algorithm is presented. Parametric analysis is also performed using Gaussian Mixture Models (GMMs). The feature extraction methods employed are the Mel Spectrum Coefficients and the MPEG-7 basic descriptors. Bin numbers replaced the extracted feature coefficients during the non-parametric modelling. The results show that non-parametric analysis offers results comparable to those mentioned in the literature.
Keywords: Audio fingerprinting, mapping algorithm, Gaussian Mixture Models, MFCC, MPEG-7.
Downloads 2285
115 A Tool for Audio Quality Evaluation Under Hostile Environment
Authors: Akhil Kumar Arya, Jagdeep Singh Lather, Lillie Dewan
Abstract:
In this paper is to evaluate audio and speech quality with the help of Digital Audio Watermarking Technique under the different types of attacks (signal impairments) like Gaussian Noise, Compression Error and Jittering Effect. Further attacks are considered as Hostile Environment. Audio and Speech Quality Evaluation is an important research topic. The traditional way for speech quality evaluation is using subjective tests. They are reliable, but very expensive, time consuming, and cannot be used in certain applications such as online monitoring. Objective models, based on human perception, were developed to predict the results of subjective tests. The existing objective methods require either the original speech or complicated computation model, which makes some applications of quality evaluation impossible.Keywords: Digital Watermarking, DCT, Speech Quality, Attacks.
Downloads 1624
114 On Musical Information Geometry with Applications to Sonified Image Analysis
Authors: Shannon Steinmetz, Ellen Gethner
Abstract:
In this paper, a theoretical foundation is developed to segment, analyze and associate patterns within audio. We explore this on imagery via sonified audio applied to our segmentation framework. The approach involves a geodesic estimator within the statistical manifold, parameterized by musical centricity. We demonstrate viability by processing a database of random imagery to produce statistically significant clusters of similar imagery content.
Keywords: Sonification, musical information geometry, image content extraction, automated quantification, audio segmentation, pattern recognition.
Downloads 426
113 A Genetic-Algorithm-Based Approach for Audio Steganography
Authors: Mazdak Zamani, Azizah A. Manaf, Rabiah B. Ahmad, Akram M. Zeki, Shahidan Abdullah
Abstract:
In this paper, we present a novel, principled approach to resolving the remaining problems of the substitution technique in audio steganography. Using the proposed genetic algorithm, message bits are embedded into multiple, vague and higher LSB layers, resulting in increased robustness. Robustness is increased especially against intentional attacks that try to reveal the hidden message, and also against some unintentional attacks such as noise addition.
Keywords: Artificial Intelligence, Audio Steganography, Data Hiding, Genetic Algorithm, Substitution Techniques.
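The substitution step that the abstract builds on can be sketched as follows: one message bit overwrites a chosen bit layer of each audio sample. This is only a minimal sketch of plain LSB-layer substitution on non-negative integer samples; it does not implement the genetic algorithm the paper proposes, and the names `embed_bit`, `extract_bit` and `embed_message` are hypothetical.

```python
def embed_bit(sample, bit, layer):
    """Return `sample` with bit position `layer` (0 = least significant) set to `bit`.

    Assumes `sample` is a non-negative integer, e.g. an unsigned PCM value.
    """
    mask = 1 << layer
    return (sample & ~mask) | (mask if bit else 0)


def extract_bit(sample, layer):
    """Read back the bit stored at position `layer`."""
    return (sample >> layer) & 1


def embed_message(samples, bits, layer):
    """Embed one message bit per sample, leaving the remaining samples unchanged."""
    if len(bits) > len(samples):
        raise ValueError("message is longer than the cover signal")
    out = list(samples)
    for i, b in enumerate(bits):
        out[i] = embed_bit(out[i], b, layer)
    return out


if __name__ == "__main__":
    cover = [170, 85, 255, 0]          # made-up sample values
    stego = embed_message(cover, [1, 0, 1], layer=2)
    print([extract_bit(s, 2) for s in stego[:3]])
```

Writing into a higher layer changes each sample by a larger amount (up to `1 << layer`), which is the distortion that a method embedding in higher layers has to compensate for.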
Downloads 3116
112 Satisfaction of Distance Education University Students with the Use of Audio Media as a Medium of Instruction: The Case of Mountains of the Moon University in Uganda
Authors: Mark Kaahwa, Chang Zhu, Moses Muhumuza
Abstract:
This study investigates the satisfaction of distance education university students (DEUS) with the use of audio media as a medium of instruction. Studying students’ satisfaction is vital because it shows whether learners are comfortable with a certain instructional strategy or not. Although previous studies have investigated the use of audio media, the satisfaction of students with an instructional strategy that combines radio teaching and podcasts as an independent teaching strategy has not been fully investigated. In this study, all lectures were delivered through the radio and students had no direct contact with their instructors. No modules or any other material in the form of text were given to the students. Instead, they revised the taught content by listening to podcasts saved on their mobile electronic gadgets. Prior to data collection, DEUS received orientation through workshops on how to use audio media in distance education. To achieve the objectives of the study, a survey, naturalistic observations and face-to-face interviews were used to collect data from a sample of 211 undergraduate and graduate students. Findings indicate that there was no statistically significant difference in the levels of satisfaction between male and female students. The results from post hoc analysis show that there is a statistically significant difference in the levels of satisfaction regarding the use of audio media between diploma and graduate students. Diploma students are more satisfied than their graduate counterparts. T-test results reveal that there was no statistically significant difference in the general satisfaction with audio media between rural and urban-based students. ANOVA results indicate that there is no statistically significant difference in the levels of satisfaction with the use of audio media across age groups. Furthermore, results from observations and interviews reveal that DEUS found audio media a pleasurable medium of instruction. This is an indication that audio media can be considered as an instructional strategy on its own merit.
Keywords: Audio media, distance education, distance education university students, medium of instruction, satisfaction.
Downloads 798
111 A 24-Bit, 8.1-MS/s D/A Converter for Audio Baseband Channel Applications
Authors: N. Ben Ameur, M. Loulou
Abstract:
This paper studies the high-level modelling and design of delta-sigma (ΔΣ) noise shapers for an audio Digital-to-Analog Converter (DAC) so as to eliminate the in-band Signal-to-Noise Ratio (SNR) degradation that accompanies one channel mismatch in the audio signal. The converter combines cascaded digital signal interpolation and a noise-shaping single-loop delta-sigma modulator with a 5-bit quantizer resolution in the final stage. To reduce sensitivity to the DAC nonlinearities of the last stage, a high-pass second-order Data Weighted Averaging (R2DWA) scheme is introduced. This paper presents a MATLAB description modelling approach of the proposed DAC architecture with low-distortion and swing-suppression integrator designs. The ΔΣ modulator design can be configured as 3rd-order and allows 24-bit PCM at a sampling rate of 64 kHz for Digital Video Disc (DVD) audio applications. The modelling approach provides 139.38 dB of dynamic range for a 32 kHz signal band at a -1.6 dBFS input signal level.
Keywords: DVD-audio, DAC, Interpolator and Interpolation Filter, Single-Loop ΔΣ Modulation, R2DWA, Clock Jitter
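To put the reported 139.38 dB figure next to the 24-bit target, the sketch below applies the standard ideal-quantizer relation SNR = 6.02 N + 1.76 dB for a full-scale sine. Treating the reported dynamic range as the SNR in that relation is an assumption made here for illustration, not a calculation taken from the paper, and the function names are hypothetical.

```python
def ideal_snr_db(bits):
    """Ideal quantization SNR in dB for a full-scale sine at `bits` of resolution."""
    return 6.02 * bits + 1.76


def equivalent_bits(snr_db):
    """Invert the ideal relation: resolution (in bits) matching a given SNR in dB."""
    return (snr_db - 1.76) / 6.02


if __name__ == "__main__":
    # 139.38 dB is the dynamic range quoted in the abstract.
    print(f"ideal 24-bit SNR: {ideal_snr_db(24):.2f} dB")
    print(f"bits equivalent to 139.38 dB: {equivalent_bits(139.38):.2f}")
```

Under that assumption, the quoted dynamic range corresponds to slightly under 23 bits, a little below the ideal 24-bit figure.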
Downloads 2623