Search results for: Audio Steganography

64 A Guide to the Implementation of Ambisonics Super Stereo

Authors: Alessio Mastrorillo, Giuseppe Silvi, Francesco Scagliola

Abstract:

This paper explores the decoding of Ambisonics material into 2-channel mixing formats, addressing challenges related to stereo speakers and headphones. We present the Universal HJ (UHJ) format as a solution, enabling the preservation of the entire horizontal plane and offering versatile spatial audio experiences. Our paper presents a UHJ format decoder, explaining its design, computational aspects, and empirical optimization. We discuss the advantages of UHJ decoding, potential applications, and its significance in music composition. Additionally, we highlight the integration of this decoder within the Envelop for Live (E4L) suite.

Keywords: Ambisonics, UHJ, quadrature filter, virtual reality, Gerzon, decoder, stereo, binaural, biquad.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 119

63 Design and Study of a DC/DC Converter for High Power, 14.4 V and 300 A for Automotive Applications

Authors: Julio Cesar Lopes de Oliveira, Carlos Henrique Gonc¸alves Treviso

Abstract:

The shortage of the automotive market in relation to options for sources of high power car audio systems, led to development of this work. Thus, we developed a source with stabilized voltage with 4320 W effective power. Designed to the voltage of 14.4 V and a choice of two currents: 30 A load option in battery banks and 300 A at full load. This source can also be considered as a source of general use dedicated commercial with a simple control circuit in analog form based on discrete components. The assembly of power circuit uses a methodology for higher power than the initially stipulated.

Keywords: DC-DC power converters, converters, power convertion, pulse width modulation converters.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2859

62 Two Kinds of Self-Oscillating Circuits Mechanically Demonstrated

Authors: Shiang-Hwua Yu, Po-Hsun Wu

Abstract:

This study introduces two types of self-oscillating circuits that are frequently found in power electronics applications. Special effort is made to relate the circuits to the analogous mechanical systems of some important scientific inventions: Galileo’s pendulum clock and Coulomb’s friction model. A little touch of related history and philosophy of science will hopefully encourage curiosity, advance the understanding of self-oscillating systems and satisfy the aspiration of some students for scientific literacy. Finally, the two self-oscillating circuits are applied to design a simple class-D audio amplifier.

Keywords: Self-oscillation, sigma-delta modulator, pendulum clock, Coulomb friction, class-D amplifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2403

61 Digital Image Forensics: Discovering the History of Digital Images

Authors: Gurinder Singh, Kulbir Singh

Abstract:

Digital multimedia contents such as image, video, and audio can be tampered easily due to the availability of powerful editing softwares. Multimedia forensics is devoted to analyze these contents by using various digital forensic techniques in order to validate their authenticity. Digital image forensics is dedicated to investigate the reliability of digital images by analyzing the integrity of data and by reconstructing the historical information of an image related to its acquisition phase. In this paper, a survey is carried out on the forgery detection by considering the most recent and promising digital image forensic techniques.

Keywords: Computer forensics, multimedia forensics, image ballistics, camera source identification, forgery detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1730

60 Real-Time Digital Oscilloscope Implementation in 90nm CMOS Technology FPGA

Authors: Nasir Mehmood, Jens Ogniewski, Vinodh Ravinath

Abstract:

This paper describes the design of a real-time audiorange digital oscilloscope and its implementation in 90nm CMOS FPGA platform. The design consists of sample and hold circuits, A/D conversion, audio and video processing, on-chip RAM, clock generation and control logic. The design of internal blocks and modules in 90nm devices in an FPGA is elaborated. Also the key features and their implementation algorithms are presented. Finally, the timing waveforms and simulation results are put forward.

Keywords: CMOS, VLSI, Oscilloscope, Field Programmable Gate Array (FPGA), VHDL, Video Graphics Array (VGA)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3045

59 The Electronic and Computer-Aided Periodic Table Prepared for the Visually Impaired Individuals

Authors: Ayşe Eldem, Fatih Başçiftçi

Abstract:

Visually impaired individuals cannot lead their lives as comfortable as others. Therefore, new applications are being developed every passing day in order to make their lives easier. In this study, an electronic and computer-aided audio device was developed with the aim of making the learning of the periodic table easier for the visually impaired. In this device, a board includes buttons for each element of the periodic table. After pressing a button, the visually impaired individual not only hears the name of the element but also feels with his/her hands where that specific element is located.

Keywords: Periodic Table, PIC16F877, Serial port, Visually Impaired Individual.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1903

58 Early Installation Effect on the Vibration Generated by Machines

Authors: Maitham Al-Safwani

Abstract:

Motor vibration issues were analyzed and correlated to poor equipment installation. We had a water injection pump tested in the factory and exceeded the pump vibration limit. Once the pump was brought to the site, its half-size shim plates were replaced with full-size shims plate that drastically reduced the vibration. In this study, vibration data were recorded for several and similar motors run at the same and different speeds. The vibration values were recorded — for two and a half hours — and the vibration readings analyzed to determine when the readings become consistent. This was as well supported by recording the audio noises produced by some machines seeking a relationship between changes in machine noises and machine abnormalities, such as vibration.

Keywords: Vibration, noise, shaft unbalance, shaft misalignment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 338

57 Pulse Skipping Modulated DC to DC Step Down Converter Under Discontinuous Conduction Mode

Authors: Ramamurthy S, Ranjan P V, Raghavendiran T A

Abstract:

Reduced switching loss favours Pulse Skipping Modulation mode of switching dc-to-dc converters at light loads. Under certain conditions the converter operates in discontinuous conduction mode (DCM). Inductor current starts from zero in each switching cycle as the switching frequency is constant and not adequately high. A DC-to-DC buck converter is modelled and simulated in this paper under DCM. Effect of ESR of the filter capacitor in input current frequency components is studied. The converter is studied for its operation under input voltage and load variation. The operating frequency is selected to be close to and above audio range.

Keywords: Buck converter, Discontinuous conduction mode, Electromagnetic Interference, Pulse Skipping Modulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4868

56 Face Localization Using Illumination-dependent Face Model for Visual Speech Recognition

Authors: Robert E. Hursig, Jane X. Zhang

Abstract:

A robust still image face localization algorithm capable of operating in an unconstrained visual environment is proposed. First, construction of a robust skin classifier within a shifted HSV color space is described. Then various filtering operations are performed to better isolate face candidates and mitigate the effect of substantial non-skin regions. Finally, a novel Bhattacharyya-based face detection algorithm is used to compare candidate regions of interest with a unique illumination-dependent face model probability distribution function approximation. Experimental results show a 90% face detection success rate despite the demands of the visually noisy environment.

Keywords: Audio-visual speech recognition, Bhattacharyyacoefficient, face detection,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1582

55 A Real-Time Signal Processing Technique for MIDI Generation

Authors: Farshad Arvin, Shyamala Doraisamy

Abstract:

This paper presents a new hardware interface using a microcontroller which processes audio music signals to standard MIDI data. A technique for processing music signals by extracting note parameters from music signals is described. An algorithm to convert the voice samples for real-time processing without complex calculations is proposed. A high frequency microcontroller as the main processor is deployed to execute the outlined algorithm. The MIDI data generated is transmitted using the EIA-232 protocol. The analyses of data generated show the feasibility of using microcontrollers for real-time MIDI generation hardware interface.

Keywords: Signal processing, MIDI, Microcontroller, EIA-232.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2085

54 Realtime Lip Contour Tracking For Audio-Visual Speech Recognition Applications

Authors: Mehran Yazdi, Mehdi Seyfi, Amirhossein Rafati, Meghdad Asadi

Abstract:

Detection and tracking of the lip contour is an important issue in speechreading. While there are solutions for lip tracking once a good contour initialization in the first frame is available, the problem of finding such a good initialization is not yet solved automatically, but done manually. We have developed a new tracking solution for lip contour detection using only few landmarks (15 to 25) and applying the well known Active Shape Models (ASM). The proposed method is a new LMS-like adaptive scheme based on an Auto regressive (AR) model that has been fit on the landmark variations in successive video frames. Moreover, we propose an extra motion compensation model to address more general cases in lip tracking. Computer simulations demonstrate a fair match between the true and the estimated spatial pixels. Significant improvements related to the well known LMS approach has been obtained via a defined Frobenius norm index.

Keywords: Lip contour, Tracking, LMS-Like

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1741

53 Finite Element Method Analysis of Occluded-Ear Simulator and Natural Human Ear Canal

Authors: M. Sasajima, T. Yamaguchi, Y. Hu, Y. Koike

Abstract:

In this paper, we discuss the propagation of sound in the narrow pathways of an occluded-ear simulator typically used for the measurement of insert-type earphones. The simulator has a standardized frequency response conforming to the international standard (IEC60318-4). In narrow pathways, the speed and phase of sound waves are modified by viscous air damping. In our previous paper, we proposed a new finite element method (FEM) to consider the effects of air viscosity in this type of audio equipment. In this study, we will compare the results from the ear simulator FEM model, and those from a three dimensional human ear canal FEM model made from computed tomography images, with the measured frequency response data from the ear canals of 18 people.

Keywords: Ear simulator, FEM, viscosity, human ear canal.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1074

52 A Research of the Influence that MP3 Sound Gives EEG of the Person

Authors: Seiya Teshima, Kazushige Magatani

Abstract:

Currently, many types of no-reversible compressed sound source, represented by MP3 (MPEG Audio Layer-3) are popular in the world and they are widely used to make the music file size smaller. The sound data created in this way has less information as compared to pre-compressed data. The objective of this study is by analyzing EEG to determine if people can recognize such difference as differences in sound. A measurement system that can measure and analyze EEG when a subject listens to music were experimentally developed. And ten subjects were studied with this system. In this experiment, a WAVE formatted music data and a MP3 compressed music data that is made from the WAVE formatted data were prepared. Each subject was made to hear these music sources at the same volume. From the results of this experiment, clear differences were confirmed between two wound sources.

Keywords: EEG, Biological signal , Sound , MP3

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1735

51 Devising and Assessing the Efficacy of Mobile-Assisted Instructional Modes in Mobile Learning

Authors: Majlinda Fetaji, Alajdin Abazi, Zamir Dika, Bekim Fetaji

Abstract:

The assessment of the efficacy of devised Mobile- Assisted Instructional Modes in Mobile Learning was the focus of this research. The study adopted pre-test, post-test, control group quasi-experimental design. Research instruments were developed, validated and used for collecting data. Findings revealed that the students exposed to Mobile Task Based Learning Mode (MTBLM) in using Mobile-Assisted Instruction (MAI) performed significantly better. The implication of these findings is that, the Audio tutorial and Practice Mode (ATPM) (Stimulus instruments) of MAI had been found better over the other modes used in the study.

Keywords: Mobile-Assisted instructions, Mobile learning, learning instructions, task based learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1529

50 Data Hiding in Images in Discrete Wavelet Domain Using PMM

Authors: Souvik Bhattacharyya, Gautam Sanyal

Abstract:

Over last two decades, due to hostilities of environment over the internet the concerns about confidentiality of information have increased at phenomenal rate. Therefore to safeguard the information from attacks, number of data/information hiding methods have evolved mostly in spatial and transformation domain.In spatial domain data hiding techniques,the information is embedded directly on the image plane itself. In transform domain data hiding techniques the image is first changed from spatial domain to some other domain and then the secret information is embedded so that the secret information remains more secure from any attack. Information hiding algorithms in time domain or spatial domain have high capacity and relatively lower robustness. In contrast, the algorithms in transform domain, such as DCT, DWT have certain robustness against some multimedia processing.In this work the authors propose a novel steganographic method for hiding information in the transform domain of the gray scale image.The proposed approach works by converting the gray level image in transform domain using discrete integer wavelet technique through lifting scheme.This approach performs a 2-D lifting wavelet decomposition through Haar lifted wavelet of the cover image and computes the approximation coefficients matrix CA and detail coefficients matrices CH, CV, and CD.Next step is to apply the PMM technique in those coefficients to form the stego image. The aim of this paper is to propose a high-capacity image steganography technique that uses pixel mapping method in integer wavelet domain with acceptable levels of imperceptibility and distortion in the cover image and high level of overall security. This solution is independent of the nature of the data to be hidden and produces a stego image with minimum degradation.

Keywords: Cover Image, Pixel Mapping Method (PMM), StegoImage, Integer Wavelet Tranform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2793

49 A Talking Head System for Korean Text

Authors: Sang-Wan Kim, Hoon Lee, Kyung-Ho Choi, Soon-Young Park

Abstract:

A talking head system (THS) is presented to animate the face of a speaking 3D avatar in such a way that it realistically pronounces the given Korean text. The proposed system consists of SAPI compliant text-to-speech (TTS) engine and MPEG-4 compliant face animation generator. The input to the THS is a unicode text that is to be spoken with synchronized lip shape. The TTS engine generates a phoneme sequence with their duration and audio data. The TTS applies the coarticulation rules to the phoneme sequence and sends a mouth animation sequence to the face modeler. The proposed THS can make more natural lip sync and facial expression by using the face animation generator than those using the conventional visemes only. The experimental results show that our system has great potential for the implementation of talking head for Korean text.

Keywords: Talking head, Lip sync, TTS, MPEG4.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1445

48 An Analysis of Compression Methods and Implementation of Medical Images in Wireless Network

Authors: C. Rajan, K. Geetha, S. Geetha

Abstract:

The motivation of image compression technique is to reduce the irrelevance and redundancy of the image data in order to store or pass data in an efficient way from one place to another place. There are several types of compression methods available. Without the help of compression technique, the file size is knowingly larger, usually several megabytes, but by doing the compression technique, it is possible to reduce file size up to 10% as of the original without noticeable loss in quality. Image compression can be lossless or lossy. The compression technique can be applied to images, audio, video and text data. This research work mainly concentrates on methods of encoding, DCT, compression methods, security, etc. Different methodologies and network simulations have been analyzed here. Various methods of compression methodologies and its performance metrics has been investigated and presented in a table manner.

Keywords: Image compression techniques, encoding, DCT, lossy compression, lossless compression, JPEG.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1151

47 Terrain Classification for Ground Robots Based on Acoustic Features

Authors: Bernd Kiefer, Abraham Gebru Tesfay, Dietrich Klakow

Abstract:

The motivation of our work is to detect different terrain types traversed by a robot based on acoustic data from the robot-terrain interaction. Different acoustic features and classifiers were investigated, such as Mel-frequency cepstral coefficient and Gamma-tone frequency cepstral coefficient for the feature extraction, and Gaussian mixture model and Feed forward neural network for the classification. We analyze the system’s performance by comparing our proposed techniques with some other features surveyed from distinct related works. We achieve precision and recall values between 87% and 100% per class, and an average accuracy at 95.2%. We also study the effect of varying audio chunk size in the application phase of the models and find only a mild impact on performance.

Keywords: Terrain classification, acoustic features, autonomous robots, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1075

46 A Survey on Voice over IP over Wireless LANs

Authors: Haniyeh Kazemitabar, Sameha Ahmed, Kashif Nisar, Abas B Said, Halabi B Hasbullah

Abstract:

Voice over Internet Protocol (VoIP) is a form of voice communication that uses audio data to transmit voice signals to the end user. VoIP is one of the most important technologies in the World of communication. Around, 20 years of research on VoIP, some problems of VoIP are still remaining. During the past decade and with growing of wireless technologies, we have seen that many papers turn their concentration from Wired-LAN to Wireless-LAN. VoIP over Wireless LAN (WLAN) faces many challenges due to the loose nature of wireless network. Issues like providing Quality of Service (QoS) at a good level, dedicating capacity for calls and having secure calls is more difficult rather than wired LAN. Therefore VoIP over WLAN (VoWLAN) remains a challenging research topic. In this paper we consolidate and address major VoWLAN issues. This research is helpful for those researchers wants to do research in Voice over IP technology over WLAN network.

Keywords: Capacity, QoS, Security, VoIP Issues, WLAN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2205

45 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. It will be shown in this paper that it presents some modifications to the original MFCC method. In our work the effectiveness of proposed changes to MFCC called Modified Function Cepstral Coefficients (MODFCC) were tested and compared against the original MFCC and RASTA-MFCC features. The prosodic features such as jitter and shimmer are added to baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within AURORA databases.

Keywords: Auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2279

44 Signed Approach for Mining Web Content Outliers

Authors: G. Poonkuzhali, K.Thiagarajan, K.Sarukesi, G.V.Uma

Abstract:

The emergence of the Internet has brewed the revolution of information storage and retrieval. As most of the data in the web is unstructured, and contains a mix of text, video, audio etc, there is a need to mine information to cater to the specific needs of the users without loss of important hidden information. Thus developing user friendly and automated tools for providing relevant information quickly becomes a major challenge in web mining research. Most of the existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones that are likely to contain outlying data such as noise, irrelevant and redundant data. This paper mainly focuses on Signed approach and full word matching on the organized domain dictionary for mining web content outliers. This Signed approach gives the relevant web documents as well as outlying web documents. As the dictionary is organized based on the number of characters in a word, searching and retrieval of documents takes less time and less space.

Keywords: Outliers, Relevant document, , Signed Approach, Web content mining, Web documents..

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2305

43 Media Pedagogy - The Medium is the Message

Authors: Syed Sultan Ahmed

Abstract:

The current education system in India is adept in equipping and assessing the scholastic development of children. However, there is an immediate need to strengthen co-scholastic areas like life-skills, values and attitudes to equip students to face real life challenges. Audio-visual technology and their respective media can make a significant contribution to a value based learning curriculum. Thus, co-scholastic skills need to be effectively nurtured by a medium that is entertaining and impactful. Films in general have a tremendous impact in our society. Films with a positive message make a formidable learning experience that can influence and inspire generations of learners. Leveraging on this powerful medium, EduMedia India Pvt. Ltd. has introduced School Cinema a well researched film-based learning module supported by a fun and exciting workbook, designed to introduce and reaffirm life-skills and values to children, thereby having a positive influence on their attitudes.

Keywords: Co-Scholastics, Entertaining, Educative, Holistic- Development

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1631

42 Collaborative and Content-based Recommender System for Social Bookmarking Website

Authors: Cheng-Lung Huang, Cheng-Wei Lin

Abstract:

This study proposes a new recommender system based on the collaborative folksonomy. The purpose of the proposed system is to recommend Internet resources (such as books, articles, documents, pictures, audio and video) to users. The proposed method includes four steps: creating the user profile based on the tags, grouping the similar users into clusters using an agglomerative hierarchical clustering, finding similar resources based on the user-s past collections by using content-based filtering, and recommending similar items to the target user. This study examines the system-s performance for the dataset collected from “del.icio.us," which is a famous social bookmarking website. Experimental results show that the proposed tag-based collaborative and content-based filtering hybridized recommender system is promising and effectiveness in the folksonomy-based bookmarking website.

Keywords: Collaborative recommendation, Folksonomy, Social tagging

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2197

41 Specification of Attributes of a Multimedia Presentation for Presentation Manager

Authors: Veli Hakkoymaz, Alpaslan Altunköprü

Abstract:

A multimedia presentation system refers to the integration of a multimedia database with a presentation manager which has the functionality of content selection, organization and playout of multimedia presentations. It requires high performance of involved system components. Starting from multimedia information capture until the presentation delivery, high performance tools are required for accessing, manipulating, storing and retrieving these segments, for transferring and delivering them in a presentation terminal according to a playout order. The organization of presentations is a complex task in that the display order of presentation contents (in time and space) must be specified. A multimedia presentation contains audio, video, images and text media types. The critical decisions for presentation construction include what the contents are, how the contents are organized, and once the decision is made on the organization of the contents of the presentation, it must be conveyed to the end user in the correct organizational order and in a timely fashion. This paper introduces a framework for specification of multimedia presentations and describes the design of sample presentations using this framework from a multimedia database.

Keywords: Multimedia presentation, temporal specification, SMIL, spatial specification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1771

40 Comparison between Haar and Daubechies Wavelet Transformions on FPGA Technology

Authors: Mohamed I. Mahmoud, Moawad I. M. Dessouky, Salah Deyab, Fatma H. Elfouly

Abstract:

Recently, the Field Programmable Gate Array (FPGA) technology offers the potential of designing high performance systems at low cost. The discrete wavelet transform has gained the reputation of being a very effective signal analysis tool for many practical applications. However, due to its computation-intensive nature, current implementation of the transform falls short of meeting real-time processing requirements of most application. The objectives of this paper are implement the Haar and Daubechies wavelets using FPGA technology. In addition, the comparison between the Haar and Daubechies wavelets is investigated. The Bit Error Rat (BER) between the input audio signal and the reconstructed output signal for each wavelet is calculated. It is seen that the BER using Daubechies wavelet techniques is less than Haar wavelet. The design procedure has been explained and designed using the stat-of-art Electronic Design Automation (EDA) tools for system design on FPGA. Simulation, synthesis and implementation on the FPGA target technology has been carried out.

Keywords: Daubechies wavelet, discrete wavelet transform, Haar wavelet, Xilinx FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4788

39 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: Body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1068

38 Comparison between Haar and Daubechies Wavelet Transformations on FPGA Technology

Authors: Fatma H. Elfouly, Mohamed I. Mahmoud, Moawad I. M. Dessouky, Salah Deyab

Abstract:

Recently, the Field Programmable Gate Array (FPGA) technology offers the potential of designing high performance systems at low cost. The discrete wavelet transform has gained the reputation of being a very effective signal analysis tool for many practical applications. However, due to its computation-intensive nature, current implementation of the transform falls short of meeting real-time processing requirements of most application. The objectives of this paper are implement the Haar and Daubechies wavelets using FPGA technology. In addition, the Bit Error Rate (BER) between the input audio signal and the reconstructed output signal for each wavelet is calculated. From the BER, it is seen that the implementations execute the operation of the wavelet transform correctly and satisfying the perfect reconstruction conditions. The design procedure has been explained and designed using the stat-ofart Electronic Design Automation (EDA) tools for system design on FPGA. Simulation, synthesis and implementation on the FPGA target technology has been carried out.

Keywords: Daubechies wavelet, discrete wavelet transform, Haar wavelet, Xilinx FPGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7183

37 Improved Weighted Matching for Speaker Recognition

Authors: Ozan Mut, Mehmet Göktürk

Abstract:

Matching algorithms have significant importance in speaker recognition. Feature vectors of the unknown utterance are compared to feature vectors of the modeled speakers as a last step in speaker recognition. A similarity score is found for every model in the speaker database. Depending on the type of speaker recognition, these scores are used to determine the author of unknown speech samples. For speaker verification, similarity score is tested against a predefined threshold and either acceptance or rejection result is obtained. In the case of speaker identification, the result depends on whether the identification is open set or closed set. In closed set identification, the model that yields the best similarity score is accepted. In open set identification, the best score is tested against a threshold, so there is one more possible output satisfying the condition that the speaker is not one of the registered speakers in existing database. This paper focuses on closed set speaker identification using a modified version of a well known matching algorithm. The results of new matching algorithm indicated better performance on YOHO international speaker recognition database.

Keywords: Automatic Speaker Recognition, Voice Recognition, Pattern Recognition, Digital Audio Signal Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1692

36 Development of Mobile Application Social Guidance and Counseling for Junior High School

Authors: Suyoto, Tri Prasetyaningrum

Abstract:

At this paper, we will present the development of mobile application Social Guidance and Counseling (GC) that called “m-NingBK: Social GC”. The application is used for GC services that run on mobile devices. The application is designed specifically for Junior High School student. The methods are a combination of interactive multimedia approaches and educational psychology. Therefore, the design process is carried out three processes, which are digitizing of material social GC services, visualizing wisely and making interactive. This method is intended to make students not only hear and see but also "do" the virtual. There are five components used in multimedia applications "m-NingBK: Social GC" i.e. text, images / graphics, audio / sound, animation and video. Four menus provided by this application is the potential self, social, Expert System and about. The application is built using the Java programming language. This application was tested using a Smartphone with Android Operating System. Based on the test, people give rating: 16.7% excellent, 61.1% good, 19.4% adequate, and 2.8% poor.

Keywords: Expert Systems, Guidance and Counseling, mobile application, multimedia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2861

35 “Moves” for Guiding Presentations in French

Authors: Nuchanat Handumrongkul, Suwaree Yordchim, Anantachai Aeka

Abstract:

Despite four years of study in the tourism industry, the Bachelor’s graduates cannot perform their jobs as experienced tour guides. This research aimed to develop French teaching and studying for Tourism with two main purposes: to analyze ‘Moves’ used in oral presentations at tourist attraction; and to study content in guiding presentations or 'Guide Speak'. The study employed audio recording of these presentations as an interview method in authentic situations, having four tour guides as respondents and information providers. The data was analyzed via moves and content analysis. The results found that there were eight Moves used; namely, Welcoming, Introducing oneself, Drawing someone’s attention, Giving information, Explaining, Highlighting, Persuading and Saying goodbye. In terms of content, the information being presented covered the outstanding characteristics of the places and wellintegrated with other related content. The findings were used as guidelines for curriculum development; in particular, the core content and the presentation forming the basis for students to meet the standard requirements of the labor-market and professional schemes.

Keywords: "Moves", Guiding Presentation, French.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1547