Search results for: Audio Fingerprinting.
72 A Temporal Synchronization Model for Heterogeneous Data in Distributed Systems
Authors: Jorge Estudillo Ramirez, Saul E. Pomares Hernandez
Abstract:
Multimedia distributed systems deal with heterogeneous data, such as texts, images, graphics, video and audio. The specification of temporal relations among different data types and distributed sources is an open research area. This paper proposes a fully distributed synchronization model to be used in multimedia systems. One original aspect of the model is that it avoids the use of a common reference (e.g. wall clock and shared memory). To achieve this, all possible multimedia temporal relations are specified according to their causal dependencies.
Keywords: Multimedia, Distributed Systems, Partial Ordering, Temporal Synchronization
Downloads 1357
71 Feature-Driven Classification of Musical Styles
Authors: A. Buzzanca, G. Castellano, A.M. Fanelli
Abstract:
In this paper we address the problem of musical style classification, which has a number of applications like indexing in musical databases or automatic composition systems. Starting from MIDI files of real-world improvisations, we extract the melody track and cut it into overlapping segments of equal length. From these fragments, some numerical features are extracted as descriptors of style samples. We show that a standard Bayesian classifier can be conveniently employed to build an effective musical style classifier, once this set of features has been extracted from musical data. Preliminary experimental results show the effectiveness of the developed classifier that represents the first component of a musical audio retrieval system.
Keywords: Musical style, Bayesian classifier.
Downloads 1297
70 A Guide to the Implementation of Ambisonics Super Stereo
Authors: Alessio Mastrorillo, Giuseppe Silvi, Francesco Scagliola
Abstract:
This paper explores the decoding of Ambisonics material into 2-channel mixing formats, addressing challenges related to stereo speakers and headphones. We present the Universal HJ (UHJ) format as a solution, enabling the preservation of the entire horizontal plane and offering versatile spatial audio experiences. Our paper presents a UHJ format decoder, explaining its design, computational aspects, and empirical optimization. We discuss the advantages of UHJ decoding, potential applications, and its significance in music composition. Additionally, we highlight the integration of this decoder within the Envelop for Live (E4L) suite.
Keywords: Ambisonics, UHJ, quadrature filter, virtual reality, Gerzon, decoder, stereo, binaural, biquad.
Downloads 198
69 Design and Study of a DC/DC Converter for High Power, 14.4 V and 300 A for Automotive Applications
Authors: Julio Cesar Lopes de Oliveira, Carlos Henrique Gonçalves Treviso
Abstract:
The shortage of power-source options for high-power car audio systems in the automotive market led to the development of this work. Thus, we developed a source with stabilized voltage and 4320 W effective power. It is designed for a voltage of 14.4 V and a choice of two currents: a 30 A load option for battery banks and 300 A at full load. This source can also be considered a general-purpose commercial source, with a simple analog control circuit based on discrete components. The assembly of the power circuit uses a methodology for higher power than initially stipulated.
Keywords: DC-DC power converters, converters, power conversion, pulse width modulation converters.
Downloads 2908
68 Two Kinds of Self-Oscillating Circuits Mechanically Demonstrated
Authors: Shiang-Hwua Yu, Po-Hsun Wu
Abstract:
This study introduces two types of self-oscillating circuits that are frequently found in power electronics applications. Special effort is made to relate the circuits to the analogous mechanical systems of some important scientific inventions: Galileo’s pendulum clock and Coulomb’s friction model. A little touch of related history and philosophy of science will hopefully encourage curiosity, advance the understanding of self-oscillating systems and satisfy the aspiration of some students for scientific literacy. Finally, the two self-oscillating circuits are applied to design a simple class-D audio amplifier.
Keywords: Self-oscillation, sigma-delta modulator, pendulum clock, Coulomb friction, class-D amplifier.
Downloads 2456
67 Digital Image Forensics: Discovering the History of Digital Images
Authors: Gurinder Singh, Kulbir Singh
Abstract:
Digital multimedia contents such as image, video, and audio can be tampered with easily due to the availability of powerful editing software. Multimedia forensics is devoted to analyzing these contents by using various digital forensic techniques in order to validate their authenticity. Digital image forensics is dedicated to investigating the reliability of digital images by analyzing the integrity of data and by reconstructing the historical information of an image related to its acquisition phase. In this paper, a survey is carried out on forgery detection by considering the most recent and promising digital image forensic techniques.
Keywords: Computer forensics, multimedia forensics, image ballistics, camera source identification, forgery detection.
Downloads 1816
66 Real-Time Digital Oscilloscope Implementation in 90nm CMOS Technology FPGA
Authors: Nasir Mehmood, Jens Ogniewski, Vinodh Ravinath
Abstract:
This paper describes the design of a real-time audio-range digital oscilloscope and its implementation on a 90nm CMOS FPGA platform. The design consists of sample and hold circuits, A/D conversion, audio and video processing, on-chip RAM, clock generation and control logic. The design of internal blocks and modules in 90nm devices in an FPGA is elaborated. The key features and their implementation algorithms are also presented. Finally, the timing waveforms and simulation results are put forward.
Keywords: CMOS, VLSI, Oscilloscope, Field Programmable Gate Array (FPGA), VHDL, Video Graphics Array (VGA)
Downloads 3083
65 The Electronic and Computer-Aided Periodic Table Prepared for the Visually Impaired Individuals
Authors: Ayşe Eldem, Fatih Başçiftçi
Abstract:
Visually impaired individuals cannot lead their lives as comfortably as others. Therefore, new applications are being developed every day in order to make their lives easier. In this study, an electronic and computer-aided audio device was developed with the aim of making the learning of the periodic table easier for the visually impaired. In this device, a board includes buttons for each element of the periodic table. After pressing a button, the visually impaired individual not only hears the name of the element but also feels with his/her hands where that specific element is located.
Keywords: Periodic Table, PIC16F877, Serial port, Visually Impaired Individual.
Downloads 1942
64 Early Installation Effect on the Vibration Generated by Machines
Authors: Maitham Al-Safwani
Abstract:
Motor vibration issues were analyzed and correlated to poor equipment installation. We had a water injection pump that was tested in the factory and exceeded the pump vibration limit. Once the pump was brought to the site, its half-size shim plates were replaced with full-size shim plates, which drastically reduced the vibration. In this study, vibration data were recorded for several similar motors run at the same and at different speeds. The vibration values were recorded for two and a half hours, and the readings were analyzed to determine when they become consistent. This was also supported by recording the audio noise produced by some machines, seeking a relationship between changes in machine noise and machine abnormalities, such as vibration.
Keywords: Vibration, noise, shaft unbalance, shaft misalignment.
Downloads 435
63 Pulse Skipping Modulated DC to DC Step Down Converter Under Discontinuous Conduction Mode
Authors: Ramamurthy S, Ranjan P V, Raghavendiran T A
Abstract:
Reduced switching loss favours Pulse Skipping Modulation mode of switching dc-to-dc converters at light loads. Under certain conditions the converter operates in discontinuous conduction mode (DCM). Inductor current starts from zero in each switching cycle as the switching frequency is constant and not adequately high. A DC-to-DC buck converter is modelled and simulated in this paper under DCM. The effect of the ESR of the filter capacitor on input current frequency components is studied. The converter is studied for its operation under input voltage and load variation. The operating frequency is selected to be close to and above the audio range.
Keywords: Buck converter, Discontinuous conduction mode, Electromagnetic Interference, Pulse Skipping Modulation.
Downloads 4928
62 Face Localization Using Illumination-dependent Face Model for Visual Speech Recognition
Authors: Robert E. Hursig, Jane X. Zhang
Abstract:
A robust still image face localization algorithm capable of operating in an unconstrained visual environment is proposed. First, construction of a robust skin classifier within a shifted HSV color space is described. Then various filtering operations are performed to better isolate face candidates and mitigate the effect of substantial non-skin regions. Finally, a novel Bhattacharyya-based face detection algorithm is used to compare candidate regions of interest with a unique illumination-dependent face model probability distribution function approximation. Experimental results show a 90% face detection success rate despite the demands of the visually noisy environment.
Keywords: Audio-visual speech recognition, Bhattacharyya coefficient, face detection
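The comparison step named in the abstract above relies on the Bhattacharyya coefficient. As a minimal sketch of that measure only (not the authors' detector), the snippet below compares normalized histograms; the bin counts and the names `model`, `candidate_a`, and `candidate_b` are hypothetical illustration values.

```python
import math


def normalize(hist):
    """Scale non-negative bin counts so they sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain positive mass")
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Return sum(sqrt(p_i * q_i)) for two discrete distributions.

    The value is 1 for identical distributions and 0 when they do not overlap.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


# Hypothetical 4-bin histograms: a face model and two candidate regions.
model = normalize([4, 3, 2, 1])
candidate_a = normalize([4, 3, 2, 1])   # same shape as the model
candidate_b = normalize([0, 0, 1, 9])   # mass concentrated elsewhere

score_a = bhattacharyya_coefficient(model, candidate_a)
score_b = bhattacharyya_coefficient(model, candidate_b)
```

A candidate region whose distribution resembles the model scores close to 1, so a threshold on the coefficient is one plausible way to accept or reject regions.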
Downloads 1628
61 A Real-Time Signal Processing Technique for MIDI Generation
Authors: Farshad Arvin, Shyamala Doraisamy
Abstract:
This paper presents a new hardware interface using a microcontroller which converts audio music signals to standard MIDI data. A technique for processing music signals by extracting note parameters from them is described. An algorithm to convert the voice samples for real-time processing without complex calculations is proposed. A high frequency microcontroller is deployed as the main processor to execute the outlined algorithm. The MIDI data generated is transmitted using the EIA-232 protocol. The analyses of the generated data show the feasibility of using microcontrollers for a real-time MIDI generation hardware interface.
Keywords: Signal processing, MIDI, Microcontroller, EIA-232.
Downloads 2127
60 Realtime Lip Contour Tracking For Audio-Visual Speech Recognition Applications
Authors: Mehran Yazdi, Mehdi Seyfi, Amirhossein Rafati, Meghdad Asadi
Abstract:
Detection and tracking of the lip contour is an important issue in speechreading. While there are solutions for lip tracking once a good contour initialization in the first frame is available, the problem of finding such a good initialization is not yet solved automatically, but done manually. We have developed a new tracking solution for lip contour detection using only a few landmarks (15 to 25) and applying the well known Active Shape Models (ASM). The proposed method is a new LMS-like adaptive scheme based on an autoregressive (AR) model that has been fit on the landmark variations in successive video frames. Moreover, we propose an extra motion compensation model to address more general cases in lip tracking. Computer simulations demonstrate a fair match between the true and the estimated spatial pixels. Significant improvements relative to the well known LMS approach have been obtained via a defined Frobenius norm index.
Keywords: Lip contour, Tracking, LMS-Like
Downloads 1796
59 Finite Element Method Analysis of Occluded-Ear Simulator and Natural Human Ear Canal
Authors: M. Sasajima, T. Yamaguchi, Y. Hu, Y. Koike
Abstract:
In this paper, we discuss the propagation of sound in the narrow pathways of an occluded-ear simulator typically used for the measurement of insert-type earphones. The simulator has a standardized frequency response conforming to the international standard (IEC60318-4). In narrow pathways, the speed and phase of sound waves are modified by viscous air damping. In our previous paper, we proposed a new finite element method (FEM) to consider the effects of air viscosity in this type of audio equipment. In this study, we will compare the results from the ear simulator FEM model, and those from a three dimensional human ear canal FEM model made from computed tomography images, with the measured frequency response data from the ear canals of 18 people.
Keywords: Ear simulator, FEM, viscosity, human ear canal.
Downloads 1128
58 A Research of the Influence that MP3 Sound Gives EEG of the Person
Authors: Seiya Teshima, Kazushige Magatani
Abstract:
Currently, many types of irreversibly compressed sound sources, represented by MP3 (MPEG Audio Layer-3), are popular in the world and are widely used to make music file sizes smaller. The sound data created in this way has less information than the pre-compressed data. The objective of this study is to determine, by analyzing EEG, whether people can perceive this difference as a difference in sound. A measurement system that can measure and analyze EEG while a subject listens to music was experimentally developed, and ten subjects were studied with this system. In this experiment, WAVE-formatted music data and MP3-compressed music data made from the WAVE-formatted data were prepared. Each subject was made to hear these music sources at the same volume. From the results of this experiment, clear differences were confirmed between the two sound sources.
Keywords: EEG, Biological signal, Sound, MP3
Downloads 1776
57 Devising and Assessing the Efficacy of Mobile-Assisted Instructional Modes in Mobile Learning
Authors: Majlinda Fetaji, Alajdin Abazi, Zamir Dika, Bekim Fetaji
Abstract:
The assessment of the efficacy of devised Mobile-Assisted Instructional Modes in Mobile Learning was the focus of this research. The study adopted a pre-test, post-test, control-group quasi-experimental design. Research instruments were developed, validated and used for collecting data. Findings revealed that the students exposed to Mobile Task Based Learning Mode (MTBLM) in using Mobile-Assisted Instruction (MAI) performed significantly better. The implication of these findings is that the Audio Tutorial and Practice Mode (ATPM) (stimulus instruments) of MAI was found to be better than the other modes used in the study.
Keywords: Mobile-Assisted instructions, Mobile learning, learning instructions, task based learning.
Downloads 1572
56 Image Steganography Using Least Significant Bit Technique
Authors: Preeti Kumari, Ridhi Kapoor
Abstract:
In any communication, security is the most important issue in today’s world. In this paper, steganography is the process of hiding important data in other data, such as text, audio, video, and image. The interest in this topic is to provide availability, confidentiality, integrity, and authenticity of data. The steganographic technique embeds hidden content in unremarkable cover media so as not to arouse the suspicion of eavesdroppers, third parties, or hackers. Many compression, encryption, decryption, and embedding methods are used for digital image steganography. Compression produces noise in the image. To cope with noise in the image, the LSB insertion technique is used. The performance of the proposed embedding system with respect to providing security to the secret message and robustness is discussed. We also demonstrate the maximum steganography capacity and visual distortion.
Keywords: Steganography, LSB, encoding, information hiding, color image.
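The LSB insertion technique named in the abstract above can be illustrated with a minimal sketch. This is generic least-significant-bit embedding over a flat list of 8-bit pixel values, not the authors' system; the cover values and the function names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(pixels, message):
    """Write the bits of `message` (bytes) into the lowest bit of each pixel."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover is too small for the message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read `n_bytes` bytes back from the lowest bit of each pixel."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)


# Hypothetical 40-pixel grayscale cover and a 2-byte secret.
cover = list(range(100, 140))
secret = b"hi"
stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Each pixel changes by at most one intensity level, which is why the distortion is hard to see. Capacity in this sketch is one bit per pixel value.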
Downloads 1092
55 A Talking Head System for Korean Text
Authors: Sang-Wan Kim, Hoon Lee, Kyung-Ho Choi, Soon-Young Park
Abstract:
A talking head system (THS) is presented to animate the face of a speaking 3D avatar in such a way that it realistically pronounces the given Korean text. The proposed system consists of a SAPI-compliant text-to-speech (TTS) engine and an MPEG-4-compliant face animation generator. The input to the THS is a Unicode text that is to be spoken with synchronized lip shape. The TTS engine generates a phoneme sequence with their duration and audio data. The TTS applies the coarticulation rules to the phoneme sequence and sends a mouth animation sequence to the face modeler. By using the face animation generator, the proposed THS can produce more natural lip sync and facial expression than systems using conventional visemes only. The experimental results show that our system has great potential for the implementation of a talking head for Korean text.
Keywords: Talking head, Lip sync, TTS, MPEG4.
Downloads 1491
54 An Analysis of Compression Methods and Implementation of Medical Images in Wireless Network
Authors: C. Rajan, K. Geetha, S. Geetha
Abstract:
The motivation of image compression is to reduce the irrelevance and redundancy of the image data in order to store or pass data efficiently from one place to another. There are several types of compression methods available. Without compression, the file size is noticeably larger, usually several megabytes, but with compression it is possible to reduce the file size to as little as 10% of the original without noticeable loss in quality. Image compression can be lossless or lossy. Compression can be applied to images, audio, video and text data. This research work mainly concentrates on methods of encoding, DCT, compression methods, security, etc. Different methodologies and network simulations have been analyzed here. Various compression methodologies and their performance metrics have been investigated and presented in tabular form.
Keywords: Image compression techniques, encoding, DCT, lossy compression, lossless compression, JPEG.
Downloads 1188
53 Terrain Classification for Ground Robots Based on Acoustic Features
Authors: Bernd Kiefer, Abraham Gebru Tesfay, Dietrich Klakow
Abstract:
The motivation of our work is to detect different terrain types traversed by a robot based on acoustic data from the robot-terrain interaction. Different acoustic features and classifiers were investigated, such as Mel-frequency cepstral coefficient and Gamma-tone frequency cepstral coefficient for the feature extraction, and Gaussian mixture model and Feed forward neural network for the classification. We analyze the system’s performance by comparing our proposed techniques with some other features surveyed from distinct related works. We achieve precision and recall values between 87% and 100% per class, and an average accuracy of 95.2%. We also study the effect of varying audio chunk size in the application phase of the models and find only a mild impact on performance.
Keywords: Terrain classification, acoustic features, autonomous robots, feature extraction.
Downloads 1132
52 A Survey on Voice over IP over Wireless LANs
Authors: Haniyeh Kazemitabar, Sameha Ahmed, Kashif Nisar, Abas B Said, Halabi B Hasbullah
Abstract:
Voice over Internet Protocol (VoIP) is a form of voice communication that uses audio data to transmit voice signals to the end user. VoIP is one of the most important technologies in the world of communication. After around 20 years of research on VoIP, some problems still remain. During the past decade, with the growth of wireless technologies, many papers have turned their attention from wired LANs to wireless LANs. VoIP over Wireless LAN (WLAN) faces many challenges due to the loose nature of wireless networks. Issues like providing Quality of Service (QoS) at a good level, dedicating capacity for calls and having secure calls are more difficult than in wired LANs. Therefore VoIP over WLAN (VoWLAN) remains a challenging research topic. In this paper we consolidate and address major VoWLAN issues. This research is helpful for researchers who want to do research on Voice over IP technology over WLAN networks.
Keywords: Capacity, QoS, Security, VoIP Issues, WLAN.
Downloads 2245
51 A Robust Image Steganography Method Using PMM in Bit Plane Domain
Authors: Souvik Bhattacharyya, Aparajita Khan, Indradip Banerjee, Gautam Sanyal
Abstract:
Steganography is the art and science that hides the information in an appropriate cover carrier like image, text, audio and video media. In this work the authors propose a new image based steganographic method for hiding information within the complex bit planes of the image. After slicing into bit planes the cover image is analyzed to extract the most complex planes in decreasing order based on their bit plane complexity. The complexity function next determines the complex noisy blocks of the chosen bit plane and finally pixel mapping method (PMM) has been used to embed secret bits into those regions of the bit plane. The novel approach of using pixel mapping method (PMM) in bit plane domain adaptively embeds data on most complex regions of image, provides high embedding capacity, better imperceptibility and resistance to steganalysis attack.
Keywords: PMM (Pixel Mapping Method), Bit Plane, Steganography, SSIM, KL-Divergence.
Downloads 2867
50 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis
Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze
Abstract:
The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems, Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. This paper presents some modifications to the original MFCC method. In our work the effectiveness of the proposed changes to MFCC, called Modified Function Cepstral Coefficients (MODFCC), was tested and compared against the original MFCC and RASTA-MFCC features. Prosodic features such as jitter and shimmer are added to the baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within AURORA databases.
Keywords: Auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter.
Downloads 2323
49 Signed Approach for Mining Web Content Outliers
Authors: G. Poonkuzhali, K. Thiagarajan, K. Sarukesi, G. V. Uma
Abstract:
The emergence of the Internet has brought about a revolution in information storage and retrieval. As most of the data on the web is unstructured and contains a mix of text, video, audio, etc., there is a need to mine information to cater to the specific needs of the users without loss of important hidden information. Thus developing user friendly and automated tools for providing relevant information quickly becomes a major challenge in web mining research. Most of the existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones that are likely to contain outlying data such as noise, irrelevant and redundant data. This paper mainly focuses on the Signed approach and full word matching on the organized domain dictionary for mining web content outliers. This Signed approach gives the relevant web documents as well as outlying web documents. As the dictionary is organized based on the number of characters in a word, searching and retrieval of documents takes less time and less space.
Keywords: Outliers, Relevant document, Signed Approach, Web content mining, Web documents.
Downloads 2349
48 Media Pedagogy - The Medium is the Message
Authors: Syed Sultan Ahmed
Abstract:
The current education system in India is adept in equipping and assessing the scholastic development of children. However, there is an immediate need to strengthen co-scholastic areas like life-skills, values and attitudes to equip students to face real life challenges. Audio-visual technology and its respective media can make a significant contribution to a value based learning curriculum. Thus, co-scholastic skills need to be effectively nurtured by a medium that is entertaining and impactful. Films in general have a tremendous impact in our society. Films with a positive message make a formidable learning experience that can influence and inspire generations of learners. Leveraging this powerful medium, EduMedia India Pvt. Ltd. has introduced School Cinema, a well-researched film-based learning module supported by a fun and exciting workbook, designed to introduce and reaffirm life-skills and values to children, thereby having a positive influence on their attitudes.
Keywords: Co-Scholastics, Entertaining, Educative, Holistic-Development
Downloads 1677
47 Collaborative and Content-based Recommender System for Social Bookmarking Website
Authors: Cheng-Lung Huang, Cheng-Wei Lin
Abstract:
This study proposes a new recommender system based on the collaborative folksonomy. The purpose of the proposed system is to recommend Internet resources (such as books, articles, documents, pictures, audio and video) to users. The proposed method includes four steps: creating the user profile based on the tags, grouping the similar users into clusters using an agglomerative hierarchical clustering, finding similar resources based on the user's past collections by using content-based filtering, and recommending similar items to the target user. This study examines the system's performance for the dataset collected from “del.icio.us,” which is a famous social bookmarking website. Experimental results show that the proposed tag-based collaborative and content-based filtering hybridized recommender system is promising and effective in the folksonomy-based bookmarking website.
Keywords: Collaborative recommendation, Folksonomy, Social tagging
Downloads 2248
46 Specification of Attributes of a Multimedia Presentation for Presentation Manager
Authors: Veli Hakkoymaz, Alpaslan Altunköprü
Abstract:
A multimedia presentation system refers to the integration of a multimedia database with a presentation manager which has the functionality of content selection, organization and playout of multimedia presentations. It requires high performance of involved system components. Starting from multimedia information capture until the presentation delivery, high performance tools are required for accessing, manipulating, storing and retrieving these segments, for transferring and delivering them in a presentation terminal according to a playout order. The organization of presentations is a complex task in that the display order of presentation contents (in time and space) must be specified. A multimedia presentation contains audio, video, images and text media types. The critical decisions for presentation construction include what the contents are, how the contents are organized, and once the decision is made on the organization of the contents of the presentation, it must be conveyed to the end user in the correct organizational order and in a timely fashion. This paper introduces a framework for specification of multimedia presentations and describes the design of sample presentations using this framework from a multimedia database.
Keywords: Multimedia presentation, temporal specification, SMIL, spatial specification.
Downloads 1814
45 Comparison between Haar and Daubechies Wavelet Transformions on FPGA Technology
Authors: Mohamed I. Mahmoud, Moawad I. M. Dessouky, Salah Deyab, Fatma H. Elfouly
Abstract:
Recently, Field Programmable Gate Array (FPGA) technology has offered the potential of designing high performance systems at low cost. The discrete wavelet transform has gained the reputation of being a very effective signal analysis tool for many practical applications. However, due to its computation-intensive nature, current implementation of the transform falls short of meeting the real-time processing requirements of most applications. The objective of this paper is to implement the Haar and Daubechies wavelets using FPGA technology. In addition, the comparison between the Haar and Daubechies wavelets is investigated. The Bit Error Rate (BER) between the input audio signal and the reconstructed output signal for each wavelet is calculated. It is seen that the BER using the Daubechies wavelet technique is lower than with the Haar wavelet. The design procedure has been explained and carried out using state-of-the-art Electronic Design Automation (EDA) tools for system design on FPGA. Simulation, synthesis and implementation on the FPGA target technology have been carried out.
Keywords: Daubechies wavelet, discrete wavelet transform, Haar wavelet, Xilinx FPGA.
Downloads 4845
44 Multimodal Database of Emotional Speech, Video and Gestures
Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari
Abstract:
People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpus contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotion categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to the academic community and might be useful in the study of audio-visual emotion recognition.
Keywords: Body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech.
Downloads 1125
43 Comparison between Haar and Daubechies Wavelet Transformations on FPGA Technology
Authors: Fatma H. Elfouly, Mohamed I. Mahmoud, Moawad I. M. Dessouky, Salah Deyab
Abstract:
Recently, Field Programmable Gate Array (FPGA) technology has offered the potential of designing high performance systems at low cost. The discrete wavelet transform has gained the reputation of being a very effective signal analysis tool for many practical applications. However, due to its computation-intensive nature, current implementation of the transform falls short of meeting the real-time processing requirements of most applications. The objective of this paper is to implement the Haar and Daubechies wavelets using FPGA technology. In addition, the Bit Error Rate (BER) between the input audio signal and the reconstructed output signal for each wavelet is calculated. From the BER, it is seen that the implementations execute the wavelet transform correctly and satisfy the perfect reconstruction conditions. The design procedure has been explained and carried out using state-of-the-art Electronic Design Automation (EDA) tools for system design on FPGA. Simulation, synthesis and implementation on the FPGA target technology have been carried out.
Keywords: Daubechies wavelet, discrete wavelet transform, Haar wavelet, Xilinx FPGA.
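The perfect reconstruction property mentioned in the abstract above can be checked in software with a minimal sketch. This is a one-level orthonormal Haar transform in floating point, not the authors' FPGA design; the sample values and the function names `haar_forward` and `haar_inverse` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level Haar analysis: return (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """One-level Haar synthesis: rebuild the signal from both coefficient sets."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


# Hypothetical audio-like samples.
samples = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
approx, detail = haar_forward(samples)
restored = haar_inverse(approx, detail)

# Largest absolute difference between the input and the reconstruction.
max_error = max(abs(x - y) for x, y in zip(samples, restored))
```

In floating point the reconstruction error is limited to rounding noise. A fixed-point hardware implementation would instead be judged by a metric such as the BER the abstract reports.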
Downloads 7230