Search results for: Audio Fingerprinting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 132

Search results for: Audio Fingerprinting

72 A Temporal Synchronization Model for Heterogeneous Data in Distributed Systems

Authors: Jorge Estudillo Ramirez, Saul E. Pomares Hernandez

Abstract:

Multimedia distributed systems deal with heterogeneous data, such as texts, images, graphics, video and audio. The specification of temporal relations among different data types and distributed sources is an open research area. This paper proposes a fully distributed synchronization model to be used in multimedia systems. One original aspect of the model is that it avoids the use of a common reference (e.g. wall clock and shared memory). To achieve this, all possible multimedia temporal relations are specified according to their causal dependencies.

Keywords: Multimedia, Distributed Systems, Partial Ordering, Temporal Synchronization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1357
71 Feature-Driven Classification of Musical Styles

Authors: A. Buzzanca, G. Castellano, A.M. Fanelli

Abstract:

In this paper we address the problem of musical style classification, which has a number of applications like indexing in musical databases or automatic composition systems. Starting from MIDI files of real-world improvisations, we extract the melody track and cut it into overlapping segments of equal length. From these fragments, some numerical features are extracted as descriptors of style samples. We show that a standard Bayesian classifier can be conveniently employed to build an effective musical style classifier, once this set of features has been extracted from musical data. Preliminary experimental results show the effectiveness of the developed classifier that represents the first component of a musical audio retrieval system

Keywords: Musical style, Bayesian classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1297
70 A Guide to the Implementation of Ambisonics Super Stereo

Authors: Alessio Mastrorillo, Giuseppe Silvi, Francesco Scagliola

Abstract:

This paper explores the decoding of Ambisonics material into 2-channel mixing formats, addressing challenges related to stereo speakers and headphones. We present the Universal HJ (UHJ) format as a solution, enabling the preservation of the entire horizontal plane and offering versatile spatial audio experiences. Our paper presents a UHJ format decoder, explaining its design, computational aspects, and empirical optimization. We discuss the advantages of UHJ decoding, potential applications, and its significance in music composition. Additionally, we highlight the integration of this decoder within the Envelop for Live (E4L) suite.

Keywords: Ambisonics, UHJ, quadrature filter, virtual reality, Gerzon, decoder, stereo, binaural, biquad.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 198
69 Design and Study of a DC/DC Converter for High Power, 14.4 V and 300 A for Automotive Applications

Authors: Julio Cesar Lopes de Oliveira, Carlos Henrique Gonc¸alves Treviso

Abstract:

The shortage of the automotive market in relation to options for sources of high power car audio systems, led to development of this work. Thus, we developed a source with stabilized voltage with 4320 W effective power. Designed to the voltage of 14.4 V and a choice of two currents: 30 A load option in battery banks and 300 A at full load. This source can also be considered as a source of general use dedicated commercial with a simple control circuit in analog form based on discrete components. The assembly of power circuit uses a methodology for higher power than the initially stipulated.

Keywords: DC-DC power converters, converters, power convertion, pulse width modulation converters.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2908
68 Two Kinds of Self-Oscillating Circuits Mechanically Demonstrated

Authors: Shiang-Hwua Yu, Po-Hsun Wu

Abstract:

This study introduces two types of self-oscillating circuits that are frequently found in power electronics applications. Special effort is made to relate the circuits to the analogous mechanical systems of some important scientific inventions: Galileo’s pendulum clock and Coulomb’s friction model. A little touch of related history and philosophy of science will hopefully encourage curiosity, advance the understanding of self-oscillating systems and satisfy the aspiration of some students for scientific literacy. Finally, the two self-oscillating circuits are applied to design a simple class-D audio amplifier.

Keywords: Self-oscillation, sigma-delta modulator, pendulum clock, Coulomb friction, class-D amplifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2456
67 Digital Image Forensics: Discovering the History of Digital Images

Authors: Gurinder Singh, Kulbir Singh

Abstract:

Digital multimedia contents such as image, video, and audio can be tampered easily due to the availability of powerful editing softwares. Multimedia forensics is devoted to analyze these contents by using various digital forensic techniques in order to validate their authenticity. Digital image forensics is dedicated to investigate the reliability of digital images by analyzing the integrity of data and by reconstructing the historical information of an image related to its acquisition phase. In this paper, a survey is carried out on the forgery detection by considering the most recent and promising digital image forensic techniques.

Keywords: Computer forensics, multimedia forensics, image ballistics, camera source identification, forgery detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1816
66 Real-Time Digital Oscilloscope Implementation in 90nm CMOS Technology FPGA

Authors: Nasir Mehmood, Jens Ogniewski, Vinodh Ravinath

Abstract:

This paper describes the design of a real-time audiorange digital oscilloscope and its implementation in 90nm CMOS FPGA platform. The design consists of sample and hold circuits, A/D conversion, audio and video processing, on-chip RAM, clock generation and control logic. The design of internal blocks and modules in 90nm devices in an FPGA is elaborated. Also the key features and their implementation algorithms are presented. Finally, the timing waveforms and simulation results are put forward.

Keywords: CMOS, VLSI, Oscilloscope, Field Programmable Gate Array (FPGA), VHDL, Video Graphics Array (VGA)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3083
65 The Electronic and Computer-Aided Periodic Table Prepared for the Visually Impaired Individuals

Authors: Ayşe Eldem, Fatih Başçiftçi

Abstract:

Visually impaired individuals cannot lead their lives as comfortable as others. Therefore, new applications are being developed every passing day in order to make their lives easier. In this study, an electronic and computer-aided audio device was developed with the aim of making the learning of the periodic table easier for the visually impaired. In this device, a board includes buttons for each element of the periodic table. After pressing a button, the visually impaired individual not only hears the name of the element but also feels with his/her hands where that specific element is located.

Keywords: Periodic Table, PIC16F877, Serial port, Visually Impaired Individual.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1942
64 Early Installation Effect on the Vibration Generated by Machines

Authors: Maitham Al-Safwani

Abstract:

Motor vibration issues were analyzed and correlated to poor equipment installation. We had a water injection pump tested in the factory and exceeded the pump vibration limit. Once the pump was brought to the site, its half-size shim plates were replaced with full-size shims plate that drastically reduced the vibration. In this study, vibration data were recorded for several and similar motors run at the same and different speeds. The vibration values were recorded — for two and a half hours — and the vibration readings analyzed to determine when the readings become consistent. This was as well supported by recording the audio noises produced by some machines seeking a relationship between changes in machine noises and machine abnormalities, such as vibration.

Keywords: Vibration, noise, shaft unbalance, shaft misalignment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 435
63 Pulse Skipping Modulated DC to DC Step Down Converter Under Discontinuous Conduction Mode

Authors: Ramamurthy S, Ranjan P V, Raghavendiran T A

Abstract:

Reduced switching loss favours Pulse Skipping Modulation mode of switching dc-to-dc converters at light loads. Under certain conditions the converter operates in discontinuous conduction mode (DCM). Inductor current starts from zero in each switching cycle as the switching frequency is constant and not adequately high. A DC-to-DC buck converter is modelled and simulated in this paper under DCM. Effect of ESR of the filter capacitor in input current frequency components is studied. The converter is studied for its operation under input voltage and load variation. The operating frequency is selected to be close to and above audio range.

Keywords: Buck converter, Discontinuous conduction mode, Electromagnetic Interference, Pulse Skipping Modulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4928
62 Face Localization Using Illumination-dependent Face Model for Visual Speech Recognition

Authors: Robert E. Hursig, Jane X. Zhang

Abstract:

A robust still image face localization algorithm capable of operating in an unconstrained visual environment is proposed. First, construction of a robust skin classifier within a shifted HSV color space is described. Then various filtering operations are performed to better isolate face candidates and mitigate the effect of substantial non-skin regions. Finally, a novel Bhattacharyya-based face detection algorithm is used to compare candidate regions of interest with a unique illumination-dependent face model probability distribution function approximation. Experimental results show a 90% face detection success rate despite the demands of the visually noisy environment.

Keywords: Audio-visual speech recognition, Bhattacharyyacoefficient, face detection,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1628
61 A Real-Time Signal Processing Technique for MIDI Generation

Authors: Farshad Arvin, Shyamala Doraisamy

Abstract:

This paper presents a new hardware interface using a microcontroller which processes audio music signals to standard MIDI data. A technique for processing music signals by extracting note parameters from music signals is described. An algorithm to convert the voice samples for real-time processing without complex calculations is proposed. A high frequency microcontroller as the main processor is deployed to execute the outlined algorithm. The MIDI data generated is transmitted using the EIA-232 protocol. The analyses of data generated show the feasibility of using microcontrollers for real-time MIDI generation hardware interface.

Keywords: Signal processing, MIDI, Microcontroller, EIA-232.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2127
60 Realtime Lip Contour Tracking For Audio-Visual Speech Recognition Applications

Authors: Mehran Yazdi, Mehdi Seyfi, Amirhossein Rafati, Meghdad Asadi

Abstract:

Detection and tracking of the lip contour is an important issue in speechreading. While there are solutions for lip tracking once a good contour initialization in the first frame is available, the problem of finding such a good initialization is not yet solved automatically, but done manually. We have developed a new tracking solution for lip contour detection using only few landmarks (15 to 25) and applying the well known Active Shape Models (ASM). The proposed method is a new LMS-like adaptive scheme based on an Auto regressive (AR) model that has been fit on the landmark variations in successive video frames. Moreover, we propose an extra motion compensation model to address more general cases in lip tracking. Computer simulations demonstrate a fair match between the true and the estimated spatial pixels. Significant improvements related to the well known LMS approach has been obtained via a defined Frobenius norm index.

Keywords: Lip contour, Tracking, LMS-Like

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1796
59 Finite Element Method Analysis of Occluded-Ear Simulator and Natural Human Ear Canal

Authors: M. Sasajima, T. Yamaguchi, Y. Hu, Y. Koike

Abstract:

In this paper, we discuss the propagation of sound in the narrow pathways of an occluded-ear simulator typically used for the measurement of insert-type earphones. The simulator has a standardized frequency response conforming to the international standard (IEC60318-4). In narrow pathways, the speed and phase of sound waves are modified by viscous air damping. In our previous paper, we proposed a new finite element method (FEM) to consider the effects of air viscosity in this type of audio equipment. In this study, we will compare the results from the ear simulator FEM model, and those from a three dimensional human ear canal FEM model made from computed tomography images, with the measured frequency response data from the ear canals of 18 people.

Keywords: Ear simulator, FEM, viscosity, human ear canal.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1128
58 A Research of the Influence that MP3 Sound Gives EEG of the Person

Authors: Seiya Teshima, Kazushige Magatani

Abstract:

Currently, many types of no-reversible compressed sound source, represented by MP3 (MPEG Audio Layer-3) are popular in the world and they are widely used to make the music file size smaller. The sound data created in this way has less information as compared to pre-compressed data. The objective of this study is by analyzing EEG to determine if people can recognize such difference as differences in sound. A measurement system that can measure and analyze EEG when a subject listens to music were experimentally developed. And ten subjects were studied with this system. In this experiment, a WAVE formatted music data and a MP3 compressed music data that is made from the WAVE formatted data were prepared. Each subject was made to hear these music sources at the same volume. From the results of this experiment, clear differences were confirmed between two wound sources.

Keywords: EEG, Biological signal , Sound , MP3

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1776
57 Devising and Assessing the Efficacy of Mobile-Assisted Instructional Modes in Mobile Learning

Authors: Majlinda Fetaji, Alajdin Abazi, Zamir Dika, Bekim Fetaji

Abstract:

The assessment of the efficacy of devised Mobile- Assisted Instructional Modes in Mobile Learning was the focus of this research. The study adopted pre-test, post-test, control group quasi-experimental design. Research instruments were developed, validated and used for collecting data. Findings revealed that the students exposed to Mobile Task Based Learning Mode (MTBLM) in using Mobile-Assisted Instruction (MAI) performed significantly better. The implication of these findings is that, the Audio tutorial and Practice Mode (ATPM) (Stimulus instruments) of MAI had been found better over the other modes used in the study.

Keywords: Mobile-Assisted instructions, Mobile learning, learning instructions, task based learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1572
56 Image Steganography Using Least Significant Bit Technique

Authors: Preeti Kumari, Ridhi Kapoor

Abstract:

 In any communication, security is the most important issue in today’s world. In this paper, steganography is the process of hiding the important data into other data, such as text, audio, video, and image. The interest in this topic is to provide availability, confidentiality, integrity, and authenticity of data. The steganographic technique that embeds hides content with unremarkable cover media so as not to provoke eavesdropper’s suspicion or third party and hackers. In which many applications of compression, encryption, decryption, and embedding methods are used for digital image steganography. Due to compression, the nose produces in the image. To sustain noise in the image, the LSB insertion technique is used. The performance of the proposed embedding system with respect to providing security to secret message and robustness is discussed. We also demonstrate the maximum steganography capacity and visual distortion.

Keywords: Steganography, LSB, encoding, information hiding, color image.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1092
55 A Talking Head System for Korean Text

Authors: Sang-Wan Kim, Hoon Lee, Kyung-Ho Choi, Soon-Young Park

Abstract:

A talking head system (THS) is presented to animate the face of a speaking 3D avatar in such a way that it realistically pronounces the given Korean text. The proposed system consists of SAPI compliant text-to-speech (TTS) engine and MPEG-4 compliant face animation generator. The input to the THS is a unicode text that is to be spoken with synchronized lip shape. The TTS engine generates a phoneme sequence with their duration and audio data. The TTS applies the coarticulation rules to the phoneme sequence and sends a mouth animation sequence to the face modeler. The proposed THS can make more natural lip sync and facial expression by using the face animation generator than those using the conventional visemes only. The experimental results show that our system has great potential for the implementation of talking head for Korean text.

Keywords: Talking head, Lip sync, TTS, MPEG4.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1491
54 An Analysis of Compression Methods and Implementation of Medical Images in Wireless Network

Authors: C. Rajan, K. Geetha, S. Geetha

Abstract:

The motivation of image compression technique is to reduce the irrelevance and redundancy of the image data in order to store or pass data in an efficient way from one place to another place. There are several types of compression methods available. Without the help of compression technique, the file size is knowingly larger, usually several megabytes, but by doing the compression technique, it is possible to reduce file size up to 10% as of the original without noticeable loss in quality. Image compression can be lossless or lossy. The compression technique can be applied to images, audio, video and text data. This research work mainly concentrates on methods of encoding, DCT, compression methods, security, etc. Different methodologies and network simulations have been analyzed here. Various methods of compression methodologies and its performance metrics has been investigated and presented in a table manner.

Keywords: Image compression techniques, encoding, DCT, lossy compression, lossless compression, JPEG.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1188
53 Terrain Classification for Ground Robots Based on Acoustic Features

Authors: Bernd Kiefer, Abraham Gebru Tesfay, Dietrich Klakow

Abstract:

The motivation of our work is to detect different terrain types traversed by a robot based on acoustic data from the robot-terrain interaction. Different acoustic features and classifiers were investigated, such as Mel-frequency cepstral coefficient and Gamma-tone frequency cepstral coefficient for the feature extraction, and Gaussian mixture model and Feed forward neural network for the classification. We analyze the system’s performance by comparing our proposed techniques with some other features surveyed from distinct related works. We achieve precision and recall values between 87% and 100% per class, and an average accuracy at 95.2%. We also study the effect of varying audio chunk size in the application phase of the models and find only a mild impact on performance.

Keywords: Terrain classification, acoustic features, autonomous robots, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1133
52 A Survey on Voice over IP over Wireless LANs

Authors: Haniyeh Kazemitabar, Sameha Ahmed, Kashif Nisar, Abas B Said, Halabi B Hasbullah

Abstract:

Voice over Internet Protocol (VoIP) is a form of voice communication that uses audio data to transmit voice signals to the end user. VoIP is one of the most important technologies in the World of communication. Around, 20 years of research on VoIP, some problems of VoIP are still remaining. During the past decade and with growing of wireless technologies, we have seen that many papers turn their concentration from Wired-LAN to Wireless-LAN. VoIP over Wireless LAN (WLAN) faces many challenges due to the loose nature of wireless network. Issues like providing Quality of Service (QoS) at a good level, dedicating capacity for calls and having secure calls is more difficult rather than wired LAN. Therefore VoIP over WLAN (VoWLAN) remains a challenging research topic. In this paper we consolidate and address major VoWLAN issues. This research is helpful for those researchers wants to do research in Voice over IP technology over WLAN network.

Keywords: Capacity, QoS, Security, VoIP Issues, WLAN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2245
51 A Robust Image Steganography Method Using PMM in Bit Plane Domain

Authors: Souvik Bhattacharyya, Aparajita Khan, Indradip Banerjee, Gautam Sanyal

Abstract:

Steganography is the art and science that hides the information in an appropriate cover carrier like image, text, audio and video media. In this work the authors propose a new image based steganographic method for hiding information within the complex bit planes of the image. After slicing into bit planes the cover image is analyzed to extract the most complex planes in decreasing order based on their bit plane complexity. The complexity function next determines the complex noisy blocks of the chosen bit plane and finally pixel mapping method (PMM) has been used to embed secret bits into those regions of the bit plane. The novel approach of using pixel mapping method (PMM) in bit plane domain adaptively embeds data on most complex regions of image, provides high embedding capacity, better imperceptibility and resistance to steganalysis attack.

Keywords: PMM (Pixel Mapping Method), Bit Plane, Steganography, SSIM, KL-Divergence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2867
50 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. It will be shown in this paper that it presents some modifications to the original MFCC method. In our work the effectiveness of proposed changes to MFCC called Modified Function Cepstral Coefficients (MODFCC) were tested and compared against the original MFCC and RASTA-MFCC features. The prosodic features such as jitter and shimmer are added to baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within AURORA databases.

Keywords: Auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2323
49 Signed Approach for Mining Web Content Outliers

Authors: G. Poonkuzhali, K.Thiagarajan, K.Sarukesi, G.V.Uma

Abstract:

The emergence of the Internet has brewed the revolution of information storage and retrieval. As most of the data in the web is unstructured, and contains a mix of text, video, audio etc, there is a need to mine information to cater to the specific needs of the users without loss of important hidden information. Thus developing user friendly and automated tools for providing relevant information quickly becomes a major challenge in web mining research. Most of the existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones that are likely to contain outlying data such as noise, irrelevant and redundant data. This paper mainly focuses on Signed approach and full word matching on the organized domain dictionary for mining web content outliers. This Signed approach gives the relevant web documents as well as outlying web documents. As the dictionary is organized based on the number of characters in a word, searching and retrieval of documents takes less time and less space.

Keywords: Outliers, Relevant document, , Signed Approach, Web content mining, Web documents..

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2349
48 Media Pedagogy - The Medium is the Message

Authors: Syed Sultan Ahmed

Abstract:

The current education system in India is adept in equipping and assessing the scholastic development of children. However, there is an immediate need to strengthen co-scholastic areas like life-skills, values and attitudes to equip students to face real life challenges. Audio-visual technology and their respective media can make a significant contribution to a value based learning curriculum. Thus, co-scholastic skills need to be effectively nurtured by a medium that is entertaining and impactful. Films in general have a tremendous impact in our society. Films with a positive message make a formidable learning experience that can influence and inspire generations of learners. Leveraging on this powerful medium, EduMedia India Pvt. Ltd. has introduced School Cinema a well researched film-based learning module supported by a fun and exciting workbook, designed to introduce and reaffirm life-skills and values to children, thereby having a positive influence on their attitudes.

Keywords: Co-Scholastics, Entertaining, Educative, Holistic- Development

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1677
47 Collaborative and Content-based Recommender System for Social Bookmarking Website

Authors: Cheng-Lung Huang, Cheng-Wei Lin

Abstract:

This study proposes a new recommender system based on the collaborative folksonomy. The purpose of the proposed system is to recommend Internet resources (such as books, articles, documents, pictures, audio and video) to users. The proposed method includes four steps: creating the user profile based on the tags, grouping the similar users into clusters using an agglomerative hierarchical clustering, finding similar resources based on the user-s past collections by using content-based filtering, and recommending similar items to the target user. This study examines the system-s performance for the dataset collected from “del.icio.us," which is a famous social bookmarking website. Experimental results show that the proposed tag-based collaborative and content-based filtering hybridized recommender system is promising and effectiveness in the folksonomy-based bookmarking website.

Keywords: Collaborative recommendation, Folksonomy, Social tagging

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2248
46 Specification of Attributes of a Multimedia Presentation for Presentation Manager

Authors: Veli Hakkoymaz, Alpaslan Altunköprü

Abstract:

A multimedia presentation system refers to the integration of a multimedia database with a presentation manager which has the functionality of content selection, organization and playout of multimedia presentations. It requires high performance of involved system components. Starting from multimedia information capture until the presentation delivery, high performance tools are required for accessing, manipulating, storing and retrieving these segments, for transferring and delivering them in a presentation terminal according to a playout order. The organization of presentations is a complex task in that the display order of presentation contents (in time and space) must be specified. A multimedia presentation contains audio, video, images and text media types. The critical decisions for presentation construction include what the contents are, how the contents are organized, and once the decision is made on the organization of the contents of the presentation, it must be conveyed to the end user in the correct organizational order and in a timely fashion. This paper introduces a framework for specification of multimedia presentations and describes the design of sample presentations using this framework from a multimedia database.

Keywords: Multimedia presentation, temporal specification, SMIL, spatial specification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1814
45 Comparison between Haar and Daubechies Wavelet Transformions on FPGA Technology

Authors: Mohamed I. Mahmoud, Moawad I. M. Dessouky, Salah Deyab, Fatma H. Elfouly

Abstract:

Recently, the Field Programmable Gate Array (FPGA) technology offers the potential of designing high performance systems at low cost. The discrete wavelet transform has gained the reputation of being a very effective signal analysis tool for many practical applications. However, due to its computation-intensive nature, current implementation of the transform falls short of meeting real-time processing requirements of most application. The objectives of this paper are implement the Haar and Daubechies wavelets using FPGA technology. In addition, the comparison between the Haar and Daubechies wavelets is investigated. The Bit Error Rat (BER) between the input audio signal and the reconstructed output signal for each wavelet is calculated. It is seen that the BER using Daubechies wavelet techniques is less than Haar wavelet. The design procedure has been explained and designed using the stat-of-art Electronic Design Automation (EDA) tools for system design on FPGA. Simulation, synthesis and implementation on the FPGA target technology has been carried out.

Keywords: Daubechies wavelet, discrete wavelet transform, Haar wavelet, Xilinx FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4845
44 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: Body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1125
43 Comparison between Haar and Daubechies Wavelet Transformations on FPGA Technology

Authors: Fatma H. Elfouly, Mohamed I. Mahmoud, Moawad I. M. Dessouky, Salah Deyab

Abstract:

Recently, the Field Programmable Gate Array (FPGA) technology offers the potential of designing high performance systems at low cost. The discrete wavelet transform has gained the reputation of being a very effective signal analysis tool for many practical applications. However, due to its computation-intensive nature, current implementation of the transform falls short of meeting real-time processing requirements of most application. The objectives of this paper are implement the Haar and Daubechies wavelets using FPGA technology. In addition, the Bit Error Rate (BER) between the input audio signal and the reconstructed output signal for each wavelet is calculated. From the BER, it is seen that the implementations execute the operation of the wavelet transform correctly and satisfying the perfect reconstruction conditions. The design procedure has been explained and designed using the stat-ofart Electronic Design Automation (EDA) tools for system design on FPGA. Simulation, synthesis and implementation on the FPGA target technology has been carried out.

Keywords: Daubechies wavelet, discrete wavelet transform, Haar wavelet, Xilinx FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7230