Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 18729

Search results for: audio lingual method

18699 Musical Tesla Coil with Faraday Box Controlled by a GNU Radio

Authors: Jairo Vega, Fabian Chamba, Jordy Urgiles

Abstract:

In this work, the implementation of a Matlab-controlled Musical Tesla Coil and external audio signals was presented. First, the audio signal was obtained from a mobile device and processed in Matlab to modify it, adding noise or other desired effects. Then, the processed signal was passed through a preamplifier to increase its amplitude to a level suitable for further amplification through a power amplifier, which was part of the current driver circuit of the Tesla coil. To get the Tesla coil to generate music, a circuit capable of modulating and generating the audio signal by manipulating electrical discharges was used. To visualize and listen to these discharges, a small Faraday cage was built to attenuate the external electric fields. Finally, the implementation of the musical Tesla coil was concluded. However, it was observed that the audio signal volume was very low, and the components used heated up quickly. Due to these limitations, it was determined that the project could not be connected to power for long periods of time.

Keywords: Tesla coil, plasma, electrical signals, GNU Radio

Procedia PDF Downloads 61

18698 Digital Musical Organology: The Audio Games: The Question of “A-Musicological” Interfaces

Authors: Hervé Zénouda

Abstract:

This article seeks to shed light on an emerging creative field: "Audio games," at the crossroads between video games and computer music. Indeed, many applications, which propose entertaining audio-visual experiences with the objective of musical creation, are available today for different supports (game consoles, computers, cell phones). The originality of this field is the use of the gameplay of video games applied to music composition. Thus, composing music using interfaces but also cognitive logics that we qualify as "a-musicological" seems to us particularly interesting from the perspective of musical digital organology. This field raises questions about the representation of sound and musical structures and develops new instrumental gestures and strategies of musical composition. We will try in this article to define the characteristics of this field by highlighting some historical milestones (abstract cinema, game theory in music, actions, and graphic scores) as well as the novelties brought by digital technologies.

Keywords: audio-games, video games, computer generated music, gameplay, interactivity, synesthesia, sound interfaces, relationships image/sound, audiovisual music

Procedia PDF Downloads 81

18697 On Musical Information Geometry with Applications to Sonified Image Analysis

Authors: Shannon Steinmetz, Ellen Gethner

Abstract:

In this paper, a theoretical foundation is developed for patterned segmentation of audio using the geometry of music and statistical manifold. We demonstrate image content clustering using conic space sonification. The algorithm takes a geodesic curve as a model estimator of the three-parameter Gamma distribution. The random variable is parameterized by musical centricity and centric velocity. Model parameters predict audio segmentation in the form of duration and frame count based on the likelihood of musical geometry transition. We provide an example using a database of randomly selected images, resulting in statistically significant clusters of similar image content.

Keywords: sonification, musical information geometry, image, content extraction, automated quantification, audio segmentation, pattern recognition

Procedia PDF Downloads 185

18696 Method Comprising One to One Web Based Real Time Communications

Authors: Lata Kiran Dey, Rajendra Kumar, Biren Karmakar

Abstract:

Web Real-Time Communications (WebRTC) is a collection of standards and protocols that provides real-time communication capabilities between web browsers and devices. This paper outlines the design and implementation of web real-time communications in secure web applications with audio and video call capabilities. The proposed application is intended to work in both desktop and mobile browsers. WebRTC also provides a set of standard JavaScript RTC APIs, which work on top of the real-time communication framework. These APIs help to build a communication application that enables audio, video, and message transfer between modern browsers with WebRTC support.

Keywords: WebRTC, SIP, RTC, JavaScript, SRTP, secure web sockets, browser

Procedia PDF Downloads 106

18695 A Combined Feature Extraction and Thresholding Technique for Silence Removal in Percussive Sounds

Authors: B. Kishore Kumar, Pogula Rakesh, T. Kishore Kumar

Abstract:

Music analysis is a part of audio content analysis in which music is analyzed using different features of the audio signal. In music analysis, the first step is to divide the music signal into sections based on its feature profiles. In this paper, we present a music segmentation technique that effectively segments the signal, together with a thresholding technique that removes silence from the sounds produced by percussive instruments. The method uses two features of music, namely signal energy and spectral centroid. It imposes thresholds on both features, and these thresholds vary depending on the music signal. Based on the thresholds, the silent parts are removed and the segmentation is performed. The effectiveness of the proposed method is analyzed using MATLAB.
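The thresholding idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the frame length, the hop size, and the rule that each threshold is a fixed fraction of the feature's peak value are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import numpy as np

def frame_features(signal, sample_rate, frame_len=1024, hop=512):
    """Return per-frame short-time energy and spectral centroid (Hz)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    window = np.hanning(frame_len)
    energy = np.zeros(n_frames)
    centroid = np.zeros(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        energy[i] = np.mean(frame ** 2)
        mag = np.abs(np.fft.rfft(frame * window))
        total = mag.sum()
        # A frame with no spectral content gets centroid 0 by convention.
        centroid[i] = (freqs * mag).sum() / total if total > 0 else 0.0
    return energy, centroid

def non_silent_frames(signal, sample_rate, energy_ratio=0.1, centroid_ratio=0.1):
    """Boolean mask of frames kept after thresholding both features.

    Assumption: each threshold is a fraction of that feature's maximum,
    so it adapts to the signal being analysed.
    """
    energy, centroid = frame_features(signal, sample_rate)
    keep_energy = energy > energy_ratio * energy.max()
    keep_centroid = centroid > centroid_ratio * centroid.max()
    return keep_energy & keep_centroid

# Usage: one second of silence, one second of a 440 Hz tone, one second of silence.
sr = 8000
t = np.arange(sr) / sr
demo = np.concatenate([np.zeros(sr), np.sin(2 * np.pi * 440 * t), np.zeros(sr)])
mask = non_silent_frames(demo, sr)
```

Requiring both features to exceed their thresholds means a frame is kept only when it is loud enough and has meaningful spectral content; consecutive kept frames can then be merged into segments.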

Keywords: percussive sounds, spectral centroid, spectral energy, silence removal, feature extraction

Procedia PDF Downloads 559

18694 Carrier Communication through Power Lines

Authors: Pavuluri Gopikrishna, B. Neelima

Abstract:

Power line carrier communication means transmitting audio power over a power line and receiving the amplified audio power at the receiver as a speaker output signal, using the power line as the channel medium. The main objective of this work is to transmit a message signal after frequency modulation with the FM modulator IC LM565, which gives an output proportional to the input voltage of the message signal. The audio power is received from the power line through an isolation circuit and demodulated by the IC LM565, which uses the concept of the phase-locked loop (PLL) and delivers the FM-demodulated signal to the listener. The message signal is transmitted on the carrier signal generated by the FM modulator IC LM565. With this approach the message signal is not damaged, because it has no direct contact with the power line, but noise can still disturb the information.

Keywords: amplification, fm demodulator ic 565, fm modulator ic 565, phase locked loop, power isolation

Procedia PDF Downloads 519

18693 Development of a Method to Prepare In-School Tactile Guide Maps for Visually Impaired School Children

Authors: K. Doi, T. Nishimura, M. Kawano, H. Fujimoto, Y. Tanaka, M. Sawada, S. Oouchi, T. Kaneko, K. Kanamori

Abstract:

As part of reasonable accommodation for people with disabilities in Japan, which has ratified the Convention on the Rights of Persons with Disabilities, tactile guide maps are necessary. Such maps can enable visually impaired children attending schools of special needs education (visual impairments) to grasp the arrangement of classrooms on their school campuses. However, it takes many years to be able to use a tactile guide map without difficulty. Thus, information support, in which audio information is added to tactile information, is required. In the present research, a method to prepare an in-school tactile guide map with an additional audio reading function was developed. This map can enable visually impaired school children attending schools of special needs education (visual impairments) to grasp the arrangement of classrooms on their school campuses.

Keywords: accessible design, visually impaired, braille, tactile map, in-school tactile guide map

Procedia PDF Downloads 334

18692 The Implication of News Segments and Movies for Enhancing Listening Comprehension of Language Learners

Authors: Taher Bahrani

Abstract:

Armed with technological development, the present study aimed at gauging the effectiveness of exposure to news and movies as two types of audio-visual programs on improving language learners’ listening comprehension at the intermediate level. To this end, a listening comprehension test was administered to 108 language learners, and 60 of them were selected as intermediate language learners and randomly divided into group one and group two. During the experiment, group one participants had exposure to audio-visual news stories to work on inside and outside the classroom. In contrast, the participants in group two had exposure only to a sample of selected utterances extracted from different kinds of movies. At the end of the experiment, both groups took another sample listening test to find out to what extent the participants in each group could enhance their listening comprehension. The results obtained from the post-test indicated that the participants who had exposure to news outperformed the participants who had exposure to movies. The findings of the present research seem to indicate that the language input embedded in the type of audio-visual programs which language learners are exposed to is more important than the amount of exposure.

Keywords: audio-visual news, movies, listening comprehension, intermediate level

Procedia PDF Downloads 349

18691 The Influence of Audio-Visual Resources in Teaching Business Subjects in Selected Secondary Schools in Ifako Ijaiye Local Government Area of Lagos State, Nigeria

Authors: Oluwole Victor Falobi, Lawrence Olusola Ige

Abstract:

The main aim of this study is to examine the influence of audio-visual resources in teaching business subjects in selected secondary schools in Ifako Ijaiye Local Government Area of Lagos State, Nigeria. A descriptive survey research design was employed for the study. Using a quantitative research approach, a sample of 120 students was randomly selected from four public schools. Three research questions and one hypothesis guided the study. Data collected were analysed using frequency, mean, and standard deviation for the research questions, and the Pearson Product Moment Correlation (PPMC) was used for the inferential statistics. Findings from the study revealed that the influence of audio-visual resources in teaching business subjects in selected secondary schools in Ifako Ijaiye Local Government Area of Lagos State is low. They further revealed that teachers' knowledge of the use of audio-visual resources is high in Ifako Local Government Area. It was recommended that the government create a timely monitoring system in order to check secondary school laboratories and classrooms, replace outdated facilities, and purchase the facilities needed for effective teaching and learning to take place.

Keywords: audio-visual resources, business subjects, school, teaching

Procedia PDF Downloads 61

18690 Speech and Swallowing Function after Tonsillo-Lingual Sulcus Resection with PMMC Flap Reconstruction: A Case Study

Authors: K. Rhea Devaiah, B. S. Premalatha

Abstract:

Background: The tonsillar lingual sulcus is the area between the tonsils and the base of the tongue. The surgical resection of lesions in the head and neck results in changes in speech and swallowing functions. The severity of the speech and swallowing problem depends upon the site and extent of the lesion, the type and extent of surgery, and the flexibility of the remaining structures. Need of the study: This paper focuses on the importance of speech and swallowing rehabilitation in an individual with a lesion in the tonsillar lingual sulcus and on post-operative functions. Aim: To evaluate the speech and swallowing functions after intensive speech and swallowing rehabilitation. The objectives are to evaluate speech intelligibility and swallowing functions after intensive therapy and to assess the quality of life. Method: The present study reports on a 47-year-old male with a diagnosis of basaloid squamous cell carcinoma of the left tonsillar lingual sulcus (pT2N2M0), who underwent wide local excision with left radical neck dissection and PMMC flap reconstruction. Post-surgery, the patient presented with complaints of reduced speech intelligibility and difficulty in opening the mouth and swallowing. A detailed evaluation of the speech and swallowing functions was carried out, including OPME, an articulation test, speech intelligibility, the different phases of swallowing, and trismus evaluation. Self-reported questionnaires such as SHI-E (Speech Handicap Index - Indian English), DHI (Dysphagia Handicap Index), and SESEQ-K (Self Evaluation of Swallowing Efficiency in Kannada) were also administered to learn how the patient perceives his problem. Based on the evaluation, the patient was diagnosed with pharyngeal phase dysphagia associated with trismus and reduced speech intelligibility. Intensive speech and swallowing therapy was advised twice weekly, with each session lasting 1 hour.
Results: In total, the patient attended 10 intensive speech and swallowing therapy sessions. Results indicated misarticulation of speech sounds such as lingua-palatal sounds. Mouth opening was restricted to one finger width, with difficulty chewing, masticating, and swallowing the bolus. Intervention strategies included oro-motor exercises, indirect swallowing therapy, use of a trismus device to facilitate mouth opening, and a change in food consistency to aid swallowing. A practice session was held with articulation drills to improve the production of speech sounds and speech intelligibility. Significant changes in articulatory production, speech intelligibility, and swallowing abilities were observed. The self-rated quality of life measures such as DHI, SHI, and SESEQ-K revealed no speech handicap and near-normal swallowing ability, indicating improved QOL after the intensive speech and swallowing therapy. Conclusion: Speech and swallowing therapy after carcinoma in the tonsillar lingual sulcus is crucial, as the tongue plays an important role in both speech and swallowing. The role of speech-language and swallowing therapists in oral cancer should be highlighted in treating these patients and improving their overall quality of life. With intensive speech-language and swallowing therapy after surgery for oral cancer, there can be a significant change in the speech outcome and swallowing functions, depending on the site and extent of lesions, which will thereby improve the individual’s QOL.

Keywords: oral cancer, speech and swallowing therapy, speech intelligibility, trismus, quality of life

Procedia PDF Downloads 78

18689 Social Change and Cultural Sustainability in the Wake of Digital Media Revolution in South Asia

Authors: Binod C. Agrawal

Abstract:

In modern history, industrial and media merchandising in South Asia from East Asia, Europe, the United States and other countries of the West is over 200 years old. Hence, continued external technology and media exposure is not a new experience in multi-lingual and multi-religious South Asia, which evolved cultural means to withstand structural change. In the post-World War II phase, media exposure, especially to telecommunication, film, Internet, radio, print media and television, has increased manifold. South Asia did not lose any time in acquiring and adopting digital media, accelerated by the chip revolution, computers and satellite communication. The penetration and utilization of digital media are exceptionally high, though the spread is unequal in intensity, use and effects. The author argues that industrial and media products are “cultural products” apart from being “technological products”; hence their influences are most felt in the cultural domain, which may lead to blunting of unique cultural specifics in the multi-cultural, multi-lingual and multi-religious South Asia. Social scientists, political leaders and parents have voiced concern about “Cultural domination”, “Digital media colonization” and “Westernization”. Increased digital media access has also opened up doors to pornography and other harmful information that have sparked fresh debates and discussions about serious negative, harmful, and undesirable social effects, especially among youth. Within a ‘techno-social’ perspective, based on recent research studies, the paper aims to describe and analyse possible socio-economic change due to digital media penetration. Further, the analysis supports the view that the ancient multi-lingual and multi-religious cultures of South Asia, due to their inner cultural strength, may sustain themselves without setting in motion a process of irreversible structural changes in South Asia.

Keywords: cultural sustainability, digital media effects, digital media impact in South Asia, social change in South Asia

Procedia PDF Downloads 321

18688 Mapping the Sonic Spectrum of Traditional Music and Instruments Used in Malaysian Kavadi Rituals

Authors: Ainolnaim Azizol, Valerie Ross

Abstract:

Music is as old as mankind, and rituals using music such as Kavadi have been associated with social, cultural, and spiritual practices in many traditional and modern societies. Recent literature has provided scientific evidence that music effects psychological and physical changes through stimulation of brainwaves. Despite such advances, the scientific study of the sonic qualities peculiar to traditional instruments and how they impact ritualistic activities is still lacking. This study addresses one such phenomenon. Devotees in Kavadi rituals are known to be in a state of trance and do not experience pain or suffer injury despite the hundreds of needles pierced through their skin. Although scientists have sought to understand how this is possible, less is known about the music that is used to prepare devotees to enter into the trance state. This study fills this gap in knowledge by providing scientific evidence through the identification and mapping of the sonic spectrum or sound fingerprint of the instruments and the repertoire used in these ritualistic forms in their ethnographic environment and in audio-controlled situations. The objectives are to identify and categorize the different types of traditional music used in Kavadi rituals; to record, transcribe and digitally score the musical repertoire used in the oral tradition of Kavadi rituals; and to map the sonic spectrum of ritual music using spectromography and advanced music analytical software. A mixed methodology will be used. This comprises ethnographic field studies using interviews, participant observation, and audio-video recordings, and an audio-methodology using spectromography and advanced audio-technology for sonic mapping and the transcription of audio recordings into digital scores.

Keywords: sonic, traditional, ritual, Kavadi, music

Procedia PDF Downloads 217

18687 Illumina MiSeq Sequencing for Bacteria Identification on Audio-Visual Materials

Authors: Tereza Branyšová, Martina Kračmarová, Kateřina Demnerová, Michal Ďurovič, Hana Stiborová

Abstract:

Microbial deterioration threatens all objects of cultural heritage, including audio-visual materials. Fungi are commonly known to be the main factor in audio-visual material deterioration. However, although often neglected, bacteria also play a significant role. In addition to microbial contamination of materials, it is also essential to analyse air as a possible contamination source. This work aims to identify bacterial species in the archives of the Czech Republic that occur on audio-visual materials as well as in the air in the archives. For sampling purposes, smears from the materials were taken with sterile polyurethane sponges, and the air was collected using a MAS-100 aeroscope. Metagenomic DNA from all collected samples was immediately isolated and stored at -20 °C. A DNA library for the 16S rRNA gene was prepared using two-step PCR and specific primers, and a concentration step was included due to meagre yields of the DNA. After that, the samples were sent to the University of Fairbanks, Alaska, for Illumina MiSeq sequencing. Subsequently, the analysis of the sequences was conducted in R software. The obtained sequences were assigned to the corresponding bacterial species using the DADA2 package. The impact of air contamination and the impact of the different photosensitive layers that audio-visual materials were made of, such as gelatine, albumen, and collodion, were evaluated. As a next step, we will focus more closely on air contamination. We will select an appropriate culture-dependent approach along with a culture-independent approach to observe metabolically active species in the air. Acknowledgment: This project is supported by grant no. DG18P02OVV062 of the Ministry of Culture of the Czech Republic.

Keywords: cultural heritage, Illumina MiSeq, metagenomics, microbial identification

Procedia PDF Downloads 125

18686 Crosssampler: A Digital Convolution Cross Synthesis Instrument

Authors: Jimmy Eadie

Abstract:

Convolutional Cross Synthesis (CCS) has emerged as a powerful technique for blending input signals to create hybrid sounds. It has significantly expanded the horizons of digital signal processing, enabling artists to explore audio effects. However, the conventional applications of CCS primarily revolve around reverberation and room simulation rather than being utilized as a creative synthesis method. In this paper, we present the design of a digital instrument called CrossSampler that harnesses a parametric approach to convolution cross-synthesis, which involves using adjustable parameters to control the blending of audio signals through convolution. These parameters allow for customization of the resulting sound, offering greater creative control and flexibility. It enables users to shape the output by manipulating factors such as duration, intensity, and spectral characteristics. This approach facilitates experimentation and exploration in sound design and opens new sonic possibilities.
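To make the idea of parametric convolution cross-synthesis concrete, here is a minimal Python sketch of FFT-based convolution of two signals. It does not reproduce the CrossSampler instrument: the function name and the `mix` parameter (a simple dry/wet control standing in for "intensity") are hypothetical, and peak normalisation of the convolved signal is an assumption.

```python
import numpy as np

def cross_synthesize(a, b, mix=1.0):
    """Convolve signal `a` with signal `b` and blend the result with dry `a`.

    mix = 0.0 returns only the (zero-padded) dry signal,
    mix = 1.0 returns only the peak-normalised convolution.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a) + len(b) - 1                 # length of the full linear convolution
    n_fft = 1 << (n - 1).bit_length()       # next power of two, avoids circular wrap-around
    spectrum = np.fft.rfft(a, n_fft) * np.fft.rfft(b, n_fft)
    wet = np.fft.irfft(spectrum, n_fft)[:n]
    peak = np.max(np.abs(wet))
    if peak > 0:
        wet = wet / peak                    # keep the hybrid sound within [-1, 1]
    dry = np.zeros(n)
    dry[: len(a)] = a
    return (1.0 - mix) * dry + mix * wet

# Usage: blend a short "source" with a short "impulse" at two mix settings.
source = np.array([1.0, 2.0, 3.0])
impulse = np.array([0.0, 1.0, 0.5])
full_wet = cross_synthesize(source, impulse, mix=1.0)
half_wet = cross_synthesize(source, impulse, mix=0.5)
```

Multiplying the two spectra is equivalent to convolving the signals in time, which is why the output carries spectral characteristics of both inputs. Further parameters, such as truncating `b` to control duration, could be layered on top in the same way.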

Keywords: convolution, synthesis, sampling, virtual instrument

Procedia PDF Downloads 19

18685 Correlation between Speech Emotion Recognition Deep Learning Models and Noises

Authors: Leah Lee

Abstract:

This paper examines the correlation between deep learning models and emotions with noises to see whether or not noises mask emotions. The deep learning models used are plain convolutional neural networks (CNN), auto-encoder, long short-term memory (LSTM), and Visual Geometry Group-16 (VGG-16). The emotion datasets used are the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), the Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D), the Toronto Emotional Speech Set (TESS), and Surrey Audio-Visual Expressed Emotion (SAVEE). To make the dataset four times larger, stretch and pitch augmentations are applied to the audio files. From the augmented datasets, five different features are extracted as inputs to the models. There are eight different emotions to be classified. The noise variations are white noise, dog barking, and cough sounds. The signal-to-noise ratio (SNR) takes the values 0, 20, and 40. In sum, for each deep learning model, nine different sets with noise and SNR variations, plus the augmented audio files without any noise, are used in the experiment. To compare the results of the deep learning models, the accuracy and receiver operating characteristic (ROC) are checked.
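The noise conditions described here rely on mixing a noise recording into speech at a chosen SNR. A minimal Python sketch of that step follows. It assumes the SNR values are in decibels, which the abstract does not state, and the function name is hypothetical.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` so that the result has the requested SNR in dB.

    Assumes both inputs are non-silent 1-D arrays at the same sample rate.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Loop or trim the noise so that it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Choose the gain so that 10 * log10(p_speech / (gain**2 * p_noise)) == snr_db.
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

# Usage: a synthetic "utterance" mixed with white noise at the three SNR levels.
rng = np.random.default_rng(0)
utterance = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
white = rng.standard_normal(4000)
noisy = {snr: mix_at_snr(utterance, white, snr) for snr in (0, 20, 40)}
```

At 0 dB the noise carries as much power as the speech, while at 40 dB it is 10,000 times weaker, so under this assumption the three levels span heavily masked to nearly clean conditions.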

Keywords: auto-encoder, convolutional neural networks, long short-term memory, speech emotion recognition, visual geometry group-16

Procedia PDF Downloads 38

18684 Investment Projects Selection Problem under Hesitant Fuzzy Environment

Authors: Irina Khutsishvili

Abstract:

In the present research, a decision support methodology for the multi-attribute group decision-making (MAGDM) problem is developed, namely for the selection of investment projects. The objective of the investment project selection problem is to choose the best project among the set of projects, seeking investment, or to rank all projects in descending order. The project selection is made considering a set of weighted attributes. To evaluate the attributes in our approach, expert assessments are used. In the proposed methodology, lingual expressions (linguistic terms) given by all experts are used as initial attribute evaluations, since they are the most natural and convenient representation of experts' evaluations. Then lingual evaluations are converted into trapezoidal fuzzy numbers, and the aggregate trapezoidal hesitant fuzzy decision matrix will be built. The case is considered when information on the attribute weights is completely unknown. The attribute weights are identified based on the De Luca and Termini information entropy concept, determined in the context of hesitant fuzzy sets. The decisions are made using the extended Technique for Order Performance by Similarity to Ideal Solution (TOPSIS) method under a hesitant fuzzy environment. Hence, a methodology is based on a trapezoidal valued hesitant fuzzy TOPSIS decision-making model with entropy weights. The ranking of alternatives is performed by the proximity of their distances to both the fuzzy positive-ideal solution (FPIS) and the fuzzy negative-ideal solution (FNIS). For this purpose, the weighted hesitant Hamming distance is used. An example of investment decision-making is shown that clearly explains the procedure of the proposed methodology.
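The weighting and ranking steps can be sketched in Python under a strong simplification: each expert-aggregated evaluation is a single crisp membership degree in [0, 1] rather than a trapezoidal hesitant fuzzy element, and all attributes are benefit attributes. The function names and the example matrix are hypothetical. The sketch shows De Luca-Termini entropy weights, weighted Hamming distances to the positive and negative ideal solutions, and the resulting closeness coefficient.

```python
import numpy as np

def entropy_weights(matrix):
    """Attribute weights from De Luca-Termini fuzzy entropy.

    An attribute whose evaluations are close to 0.5 (maximally fuzzy)
    has entropy near 1 and therefore receives a weight near 0.
    """
    mu = np.clip(np.asarray(matrix, dtype=float), 1e-12, 1 - 1e-12)  # avoid log(0)
    h = -(mu * np.log(mu) + (1 - mu) * np.log(1 - mu))
    entropy = h.mean(axis=0) / np.log(2)      # normalised to [0, 1] per attribute
    divergence = 1.0 - entropy
    return divergence / divergence.sum()      # assumes some attribute has entropy < 1

def topsis_closeness(matrix, weights):
    """Closeness of each alternative (row) to the ideal solution."""
    x = np.asarray(matrix, dtype=float)
    positive_ideal = x.max(axis=0)            # best value per attribute
    negative_ideal = x.min(axis=0)            # worst value per attribute
    d_pos = (weights * np.abs(x - positive_ideal)).sum(axis=1)  # weighted Hamming distance
    d_neg = (weights * np.abs(x - negative_ideal)).sum(axis=1)
    return d_neg / (d_pos + d_neg)            # assumes the alternatives are not all identical

# Usage: three projects (rows) evaluated on two attributes (columns).
scores = np.array([[0.9, 0.5],
                   [0.1, 0.5],
                   [0.8, 0.5]])
w = entropy_weights(scores)
closeness = topsis_closeness(scores, w)
ranking = np.argsort(-closeness)              # indices of projects, best first
```

In this example the second attribute is 0.5 for every project, so it carries no information and its weight is effectively zero; the ranking is decided by the first attribute alone.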

Procedia PDF Downloads 87

18683 The Audio-Visual and Syntactic Priming Effect on Specific Language Impairment and Gender in Modern Standard Arabic

Authors: Mohammad Al-Dawoody

Abstract:

This study aims at exploring whether priming is affected by gender in Modern Standard Arabic and whether it is restricted solely to subjects with no specific language impairment (SLI). The sample in this study consists of 74 subjects, between the ages of 11;1 and 11;10, distributed into (a) 2 SLI experimental groups of 38 subjects divided into two gender groups of 18 females and 20 males and (b) 2 non-SLI control groups of 36 subjects divided into two gender groups of 17 females and 19 males. Employing a mixed research design, the researcher conducted this study within the framework of relevance theory (RT), whose main assumption is that human beings are endowed with a biological ability to magnify the relevance of the incoming stimuli. Each of the four groups was given two different priming stimuli: audio-visual priming (T1) and syntactic priming (T2). The results showed that the priming effect was clearly distinct among SLI participants, especially when retrieving typical responses (TR) in T1 and T2, with slight superiority of males over females. The results also revealed that non-SLI females showed stronger original response (OR) priming in T1 than males and that non-SLI males in T2 outperformed females in OR priming. Furthermore, the results suggested that audio-visual priming has a stronger effect on SLI females than on non-SLI females and that syntactic priming seems to have the same effect on the two groups (non-SLI and SLI females). The conclusion is that the priming effect varies according to gender and is not confined merely to non-SLI subjects.

Keywords: specific language impairment, relevance theory, audio-visual priming, syntactic priming, modern standard Arabic

Procedia PDF Downloads 143

18682 The Arabic Literary Text, between Proficiency and Pedagogy

Authors: Abdul Rahman M. Chamseddine, Mahmoud El-ashiri

Abstract:

In the field of language teaching, communication skills are essential for the learner to achieve. However, these skills in general might not support the comprehension of some texts of a literary or artistic nature, such as poetry. Understanding sentences and expressions is not enough to understand a poem; other skills are needed in order to understand the special structure of a text whose literary meaning is inapprehensible even when the lingual meaning is well comprehended. There is also a need for many other components that go beyond one text to other similar texts that can be understood through solid traditions, which do not form an obstacle in the face of change and progress. This is not exclusive to texts that are classified as literary; it is the same with some short everyday phrases and indicatively charged expressions that can be classified as literary or bear a taste of literary nature. These can be found in newspaper titles, TV news reports, and maybe football commentaries. The need to understand this special lingual use, described as literary, is highly important for understanding this discourse, which can generally be classified as very far from literature. This work will try to explore the role of the literary text in the language class and the way it is covered or dealt with throughout all levels of acquiring proficiency. It will also attempt to survey the position of the literary text in some of the most important books for teaching Arabic around the world. In the same way that grammar is needed to understand the language, another (literary) grammar is needed for understanding literature.

Keywords: language teaching, Arabic, literature, pedagogy, language proficiency

Procedia PDF Downloads 243

18681 Online Delivery Approaches of Post Secondary Virtual Inclusive Media Education

Authors: Margot Whitfield, Andrea Ducent, Marie Catherine Rombaut, Katia Iassinovskaia, Deborah Fels

Abstract:

Learning how to create inclusive media, such as closed captioning (CC) and audio description (AD), in North America is restricted to the private sector, proprietary company-based training. We are delivering (through synchronous and asynchronous online learning) the first Canadian post-secondary, practice-based continuing education course package in inclusive media for broadcast production and processes. Despite the prevalence of CC and AD taught within the field of translation studies in Europe, North America has no comparable field of study. This novel approach to audio visual translation (AVT) education develops evidence-based methodology innovations, stemming from user study research with blind/low vision and Deaf/hard of hearing audiences for television and theatre, undertaken at Ryerson University. Knowledge outcomes from the courses include a) Understanding how CC/AD fit within disability/regulatory frameworks in Canada. b) Knowledge of how CC/AD could be employed in the initial stages of production development within broadcasting. c) Writing and/or speaking techniques designed for media. d) Hands-on practice in captioning re-speaking techniques and open source technologies, or in AD techniques. e) Understanding of audio production technologies and editing techniques. The case study of the curriculum development and deployment, involving first-time online course delivery from academic and practitioner-based instructors in introductory Captioning and Audio Description courses (CDIM 101 and 102), will compare two different instructors' approaches to learning design, including the ratio of synchronous and asynchronous classroom time and technological engagement tools on meeting software platform such as breakout rooms and polling. Student reception of these two different approaches will be analysed using qualitative thematic and quantitative survey analysis. 
Thus far, anecdotal conversations with students suggest that they prefer synchronous to asynchronous learning within our hands-on online course delivery method.

Keywords: inclusive media theory, broadcasting practices, AVT post secondary education, respeaking, audio description, learning design, virtual education

Procedia PDF Downloads 160

18680 Multimodal Convolutional Neural Network for Musical Instrument Recognition

Authors: Yagya Raj Pandeya, Joonwhoan Lee

Abstract:

The dynamic behavior of music and video makes it difficult for a computer system to evaluate musical instrument playing in a video. Any television or film video clip with music information is a rich source for analyzing musical instruments using modern machine learning technologies. In this research, we integrate the audio and video information sources using a convolutional neural network (CNN) and pass the learned features through a recurrent neural network (RNN) to preserve the dynamic behaviors of audio and video. We use different pre-trained CNNs for music and video feature extraction and then fine-tune each model. The music network uses a 2D convolutional network and the video network uses 3D convolution (C3D). Finally, we concatenate the music and video features while preserving the time-varying features. A long short-term memory (LSTM) network is used for long-term dynamic feature characterization, followed by late fusion with a generalized mean. The proposed audio-video multimodal neural network achieves better performance in recognizing musical instruments.
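The late fusion step named in the abstract can be illustrated with a minimal sketch. It assumes that each stream (audio and video) outputs one score per instrument class and that the fused score is the generalized (power) mean of the per-stream scores, renormalized to sum to one. The function names, the exponent `p`, and the example scores are illustrative assumptions, not details taken from the paper.

```python
import math


def generalized_mean(values, p):
    """Power mean of positive values; p = 1 is the arithmetic mean."""
    if p == 0:
        # Limit case p -> 0: geometric mean.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def late_fusion(score_vectors, p=1.0):
    """Fuse per-stream class scores class by class, then renormalize."""
    fused = [generalized_mean(scores, p) for scores in zip(*score_vectors)]
    total = sum(fused)
    return [f / total for f in fused]


# Hypothetical class scores for three instruments from each stream.
audio_scores = [0.7, 0.2, 0.1]
video_scores = [0.5, 0.4, 0.1]
fused_scores = late_fusion([audio_scores, video_scores], p=1.0)
```

Larger values of `p` push the fused score toward the more confident stream, while values near zero behave like a geometric mean and penalize disagreement between streams.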

Keywords: multimodal, 3D convolution, music-video feature extraction, generalized mean

Procedia PDF Downloads 186

18679 A Novel Method for Silence Removal in Sounds Produced by Percussive Instruments

Authors: B. Kishore Kumar, Rakesh Pogula, T. Kishore Kumar

Abstract:

The steepness of an audio signal produced by musical instruments, specifically percussive instruments, is the perception of how high or low a tone is, and it can be considered a frequency closely related to the fundamental frequency. This paper presents a novel method for silence removal and segmentation of music signals produced by percussive instruments, and the performance of the proposed method is studied with the help of MATLAB simulations. The method is based on two simple features, namely the signal energy and the spectral centroid. Once the feature sequences are extracted, a simple thresholding criterion is applied to remove the silent areas in the sound signal. The simulations were carried out on various instruments such as drum, flute and guitar, and the results of the proposed method were analyzed.
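The two features and the thresholding step described in the abstract can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB code. It assumes non-overlapping frames, fixed thresholds passed in by the caller, and a rule that keeps a frame only when both features exceed their thresholds; the abstract does not specify any of these choices.

```python
import cmath
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz (naive DFT, short frames only)."""
    n = len(frame)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total > 0 else 0.0


def remove_silence(signal, sample_rate, frame_len, energy_thr, centroid_thr):
    """Return (kept frame indices, signal with rejected frames dropped)."""
    kept, output = [], []
    for index in range(len(signal) // frame_len):
        frame = signal[index * frame_len:(index + 1) * frame_len]
        if (frame_energy(frame) > energy_thr
                and spectral_centroid(frame, sample_rate) > centroid_thr):
            kept.append(index)
            output.extend(frame)
    return kept, output


# Synthetic test signal: silence, a 1 kHz tone, then silence again.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(256)]
signal = [0.0] * 256 + tone + [0.0] * 256
kept, trimmed = remove_silence(signal, SAMPLE_RATE, frame_len=64,
                               energy_thr=0.01, centroid_thr=100.0)
```

In practice the thresholds would be derived from the feature sequences of the recording itself; the hard-coded values here are placeholders for the synthetic signal.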

Keywords: percussive instruments, spectral energy, spectral centroid, silence removal

Procedia PDF Downloads 372

18678 A Two-Stage Adaptation towards Automatic Speech Recognition System for Malay-Speaking Children

Authors: Mumtaz Begum Mustafa, Siti Salwah Salim, Feizal Dani Rahman

Abstract:

Recently, Automatic Speech Recognition (ASR) systems have been used to assist children in language acquisition, as they are able to detect human speech signals. Despite the benefits offered by ASR systems, there is a lack of ASR systems for Malay-speaking children. One of the contributing factors is the lack of a continuous speech database for the target users. Though cross-lingual adaptation is a common solution for developing ASR systems for under-resourced languages, it is not viable for children as there are very limited speech databases to serve as a source model. In this research, we propose a two-stage adaptation for the development of an ASR system for Malay-speaking children using a very limited database. The two-stage adaptation comprises cross-lingual adaptation (first stage) and cross-age adaptation (second stage). In the first stage, a well-known speech database that is phonetically rich and balanced is adapted to a medium-sized database of Malay adult speech using supervised MLLR. The second stage uses the acoustic model generated in the first stage, and the target database is a small database of the target users. We measured the performance of the proposed technique using word error rate and compared it with the conventional benchmark adaptation. The two-stage adaptation proposed in this research achieves better recognition accuracy than the benchmark adaptation in recognizing children’s speech.
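Word error rate, the evaluation metric mentioned in the abstract, is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that standard definition; the function name and the English example sentences are illustrative and are not drawn from the study's data.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over six words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

A lower value indicates better recognition, and the value can exceed 1.0 when the hypothesis contains many insertions.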

Keywords: Automatic Speech Recognition System, children speech, adaptation, Malay

Procedia PDF Downloads 366

18677 A Comparison of Proxemics and Postural Head Movements during Pop Music versus Matched Music Videos

Authors: Harry J. Witchel, James Ackah, Carlos P. Santos, Nachiappan Chockalingam, Carina E. I. Westling

Abstract:

Introduction: Proxemics is the study of how people perceive and use space. It is commonly proposed that when people like or engage with a person/object, they will move slightly closer to it, often quite subtly and subconsciously. Music videos are known to add entertainment value to a pop song. Our hypothesis was that adding an appropriately matched video to a pop song would lead to a net approach of the head to the monitor screen compared to simply listening to an audio-only version of the song. Methods: We presented to 27 participants (ages 21.00 ± 2.89, 15 female) seated in front of a 47.5 x 27 cm monitor two musical stimuli in a counterbalanced order; all stimuli were based on music videos by the band OK Go: Here It Goes Again (HIGA, boredom ratings (0-100) = 15.00 ± 4.76, mean ± SEM, standard-error-of-the-mean) and Do What You Want (DWYW, boredom ratings = 23.93 ± 5.98), which did not differ in boredom elicited (P = 0.21, rank-sum test). Each participant experienced each song only once, and one song (counterbalanced) as audio-only versus the other song as a music video. The movement was measured by video-tracking using Kinovea 0.8, based on recording from a lateral aspect; before beginning, each participant had a reflective motion tracking marker placed on the outer canthus of the left eye. Analysis of the Kinovea X-Y coordinate output in comma-separated-values format was performed in Matlab, as were non-parametric statistical tests. Results: We found that the audio-only stimuli (combined for both HIGA and DWYW, mean ± SEM, 35.71 ± 5.36) were significantly more boring than the music video versions (19.46 ± 3.83, P = 0.0066, Wilcoxon Signed Rank Test (WSRT), Cohen's d = 0.658, N = 28). We also found that participants' heads moved around twice as much during the audio-only versions (speed = 0.590 ± 0.095 mm/sec) compared to the video versions (0.301 ± 0.063 mm/sec, P = 0.00077, WSRT).
However, the participants' mean head-to-screen distances were not detectably smaller (i.e. head closer to the screen) during the music videos (74.4 ± 1.8 cm) compared to the audio-only stimuli (73.9 ± 1.8 cm, P = 0.37, WSRT). If anything, during the audio-only condition, they were slightly closer. Interestingly, the ranges of the head-to-screen distances were smaller during the music video (8.6 ± 1.4 cm) compared to the audio-only (12.9 ± 1.7 cm, P = 0.0057, WSRT), the standard deviations were also smaller (P = 0.0027, WSRT), and their heads were held 7 mm higher (video 116.1 ± 0.8 vs. audio-only 116.8 ± 0.8 cm above floor, P = 0.049, WSRT). Discussion: As predicted, sitting and listening to experimenter-selected pop music was more boring than when the music was accompanied by a matched, professionally-made video. However, we did not find that the proxemics of the situation led to approaching the screen. Instead, adding video led to efforts to control the head to a more central and upright viewing position and to suppress head fidgeting.

Keywords: boredom, engagement, music videos, posture, proxemics

Procedia PDF Downloads 138

18676 First Record of Eotragus noyei from the Middle Siwalik Dhok Pathan Formation of Pakistan

Authors: Abdul M. Khan, Hafiza I. Naz, Ayesha Iqbal, Muhammad Akhtar

Abstract:

The fossil remains described in this study were recovered during fieldwork by the authors from the Dhok Pathan Formation of the Middle Siwaliks, Pakistan, in December 2015. The sample comprises maxillary and mandibular fragments along with isolated upper and lower teeth. The morphometric analysis of the specimens led us to recognize the sample as belonging to Eotragus noyei, which has been considered the smallest and oldest bovid in the Siwaliks. Eotragus noyei is characterized by brachydont teeth, finely rugose enamel, more inclined buccal walls of the molars and small lingual cingula. The inclination of the metaconal area has caused rotation of the metastyle in relation to the antero-posterior tooth axis, so that the metastyle is situated more lingually. The protocone in the second upper premolar is well developed, is situated posteriorly, and has an anterior lingual constriction. The metaconule in the third upper molar is smaller than the protocone. The dentition of Eotragus noyei is smaller than that of Eotragus sansaniensis and Eotragus lampangensis. In Eotragus noyei the buccal walls of the molars are more inclined, while in Eotragus sansaniensis they are less inclined. The genus Eotragus has been reported previously in the Lower and Middle Siwaliks of Pakistan; however, the recognition of the present sample as Eotragus noyei extends the range of this species from the Lower to the Middle Siwaliks of Pakistan.

Keywords: Boselaphini, Chakwal, Dhok Pathan, late miocene

Procedia PDF Downloads 263

18675 A Guide to the Implementation of Ambisonics Super Stereo

Authors: Alessio Mastrorillo, Giuseppe Silvi, Francesco Scagliola

Abstract:

In this work, we introduce an Ambisonics decoder with an implementation of the C-format, also called Super Stereo. This format is an alternative to conventional stereo and binaural decoding. Unlike those, this format conveys audio information from the horizontal plane and works with stereo speakers and headphones. The two C-format channels can also return a reconstructed planar B-format. This work provides an open-source implementation for this format. We implement an all-pass filter for signal quadrature, as required by the decoding equations. This filter works with six biquads in a cascade configuration, with values for control frequency and quality factor discovered experimentally. The phase response of the filter delivers a small error in the 20-14,000 Hz range. The decoder has been tested with audio sources up to a 192 kHz sample rate, returning pristine sound quality and a detailed stereo image. It has been included in the Envelop for Live suite and is available as an open-source repository. This decoder has applications in Virtual Reality and 360° audio productions, music composition, and online streaming.
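A cascade of all-pass biquads, as mentioned in the abstract, can be sketched as follows. The coefficient formulas are the widely used second-order all-pass design from the Audio EQ Cookbook, which may differ from the authors' implementation. The six (frequency, Q) pairs are placeholders, since the abstract says the real values were found experimentally and does not list them. The sketch only demonstrates the defining property of such a cascade: unit magnitude at every frequency, with all shaping done in the phase.

```python
import cmath
import math


def allpass_biquad(f0, q, sample_rate):
    """Second-order all-pass coefficients (b, a) for centre frequency f0 and Q."""
    w0 = 2.0 * math.pi * f0 / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    b = (1.0 - alpha, -2.0 * cos_w0, 1.0 + alpha)
    a = (1.0 + alpha, -2.0 * cos_w0, 1.0 - alpha)
    return b, a


def cascade_response(stages, freq, sample_rate):
    """Complex frequency response of the biquads applied in series."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq / sample_rate)  # z^-1
    response = 1.0 + 0.0j
    for b, a in stages:
        numerator = b[0] + b[1] * z1 + b[2] * z1 * z1
        denominator = a[0] + a[1] * z1 + a[2] * z1 * z1
        response *= numerator / denominator
    return response


SAMPLE_RATE = 48000
# Placeholder (frequency in Hz, Q) pairs, NOT the values used by the authors.
PLACEHOLDER_STAGES = [(50.0, 0.5), (200.0, 0.5), (800.0, 0.5),
                      (3200.0, 0.5), (9000.0, 0.5), (15000.0, 0.5)]
stages = [allpass_biquad(f, q, SAMPLE_RATE) for f, q in PLACEHOLDER_STAGES]
probe_frequencies = (20.0, 1000.0, 14000.0)
magnitudes = [abs(cascade_response(stages, f, SAMPLE_RATE))
              for f in probe_frequencies]
phases = [cmath.phase(cascade_response(stages, f, SAMPLE_RATE))
          for f in probe_frequencies]
```

Signal quadrature is commonly obtained by running two such chains in parallel and tuning them so that their phase responses differ by about 90 degrees across the band of interest; whether the decoder described here does this is not stated in the abstract.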

Keywords: ambisonics, UHJ, quadrature filter, virtual reality, Gerzon, decoder, stereo, binaural, biquad

Procedia PDF Downloads 64

18674 Teachers Handbook: A Key to Imparting Teaching in Multilingual Classrooms at Kalinga Institute of Social Sciences (KISS)

Authors: Sushree Sangita Mohanty

Abstract:

The pedagogic system used to work with indigenous groups, who have different socio-economic, socio-cultural and multi-lingual conditions and differing cognitive capabilities, makes the education situation complex. As a result, educating indigenous people has become merely the dissemination of facts and information, while advancement in knowledge and possibilities is lost. This gap gives rise to complexities due to the language barrier, and teachers from a conventional background of teaching practices are unable to understand or connect with the students in the schools. This paper presents the research work of the Mother Tongue Based Multilingual Education (MTB-MLE) project, which has developed a creative pedagogic endeavor for the students of Kalinga Institute of Social Sciences (KISS) to facilitate Multilingual Education (MLE) teaching. KISS is a home for 25,000 indigenous children. The students enrolled here are from 62 different indigenous communities who speak around 24 different languages with geographical articulation. The contents of the handbook include concepts, understanding languages, similarities among languages, the need for the mother tongue in teaching and learning, skill development (Listening-Speaking-Reading-Writing), teachers' activities for teaching in multilingual schools, the process of teaching, a training format for multilingual teaching, and procedures for basic data collection regarding multilingual schools and classroom handling.

Keywords: indigenous, multi-lingual, pedagogic, teachers, teaching practices

Procedia PDF Downloads 255

18673 Subtitled Based-Approach for Learning Foreign Arabic Language

Authors: Elleuch Imen

Abstract:

In this paper, we propose a new approach for learning Arabic as a foreign language via audio-visual translation, particularly subtitling. The approach consists of developing video sequences appropriate to different levels of learning (from A1 to C2) containing conversations, quizzes, games and other activities. Each video aims to achieve a specific objective, such as the correct pronunciation of Arabic words, the correct syntactic structuring of Arabic sentences, the recognition of the morphological characteristics of terms and the semantic understanding of statements. The resulting subtitled videos can be incorporated into different tools for learning Arabic as a second language, such as MOOCs, websites, platforms, etc.

Keywords: Arabic foreign language, learning, audio-visual translation, subtitled videos

Procedia PDF Downloads 30

18672 An Assessment of Inferior Dental (IDN) and Lingual Nerve (LN) Injuries Following Third Molar Removal Under LA, IVS, and GA - An Audit and Case-Series

Authors: Aamna Tufail, Catherine Anyanwu

Abstract:

Introduction/Aims: Neurosensory deficits following third molar removal affect the quality of life markedly. The purpose of this audit was to evaluate the incidence of IDN and LN damage and to compare departmental rates to an established standard. A secondary objective was to provide a descriptive summary of identified cases for clinical learning. Materials and Methods: A retrospective audit was conducted by a telephone survey of 101 patients who had third molar extractions performed under LA, IVS, or GA from January 2019 to June 2020 at a District General Hospital. The results were compared to a clinical standard identified as Cheng et al.¹ Data collection included mode of surgery, mode of anaesthesia, grade of clinician, assessment of difficulty, severity, and duration of symptoms. Results/Statistics: A total of 101 patients had 136 third molars extracted. Age range was 18-84 years. 44% of extractions were under LA, 52% under GA, and 4% under IV sedation. 30% were simple extractions, 68% were surgical removals, 2% were unspecified. 89% of extractions were performed by an Associate Specialist, 5% by a consultant, and 6% by an unspecified grade of clinician. The rate of IDN injuries was 2.9% (n=4), higher than standard (0.3%). The rate of LN injuries was 0.7% (n=1), same as standard (0.7%). The 5 cases of neurosensory deficits are discussed in detail. Conclusions/Clinical Relevance: The rate of ID nerve injuries was higher than the standard. The rate of LN complications was lower than the standard.

Keywords: inferior dental nerve, lingual nerve, nerve injuries, third molars

Procedia PDF Downloads 72

18671 Neuropsychological Testing in a Multi-Lingual Society: Normative Data for South African Adults in More Than Eight Languages

Authors: Sharon Truter, Ann B. Shuttleworth-Edwards

Abstract:

South Africa is a developing country with significant diversity in languages spoken and quality of education available, creating challenges for fair and accurate neuropsychological assessments when most available neuropsychological tests are obtained from English-speaking developed countries. The aim of this research was to compare normative data on a spectrum of commonly used neuropsychological tests for English- and Afrikaans-speaking South Africans with relatively high quality of education and South Africans with relatively low quality of education who speak Afrikaans, Sesotho, Setswana, Sepedi, Tsonga, Venda, Xhosa or Zulu. The participants were all healthy adults aged 18-60 years, with 8-12 years of education. All the participants were tested in their first language on the following tests: two non-verbal tests (Rey Osterrieth Complex Figure Test and Bell Cancellation Test), four verbal fluency tests (category, phonemic, verb and 'any words'), one verbal learning test (Rey Auditory Verbal Learning Test) and three tests that have a verbal component (Trail Making Test A & B; Symbol Digit Modalities Test and Digit Span). Descriptive comparisons of mean scores and standard deviations across the language groups and between the groups with relatively high versus low quality of education highlight the importance of using normative data that takes into account language and quality of education.

Keywords: cross-cultural, language, multi-lingual, neuropsychological testing, quality of education

Procedia PDF Downloads 126

18670 Methodologies, Findings, Discussion, and Limitations in Global, Multi-Lingual Research: We Are All Alone - Chinese Internet Drama

Authors: Patricia Portugal Marques de Carvalho Lourenco

Abstract:

A three-phase methodological multi-lingual path was designed, constructed and carried out using the 2020 Chinese Internet Drama Series We Are All Alone as a case study. Phase one, the backbone of the research, comprised secondary data analysis, providing the structure on which the next two phases would be built. Phase one incorporated a Google Scholar and a Baidu Index analysis, the Star Network Influence Index and the top two drama reviews on Mydramalist.com, along with an article written about the drama and scrutiny of China-related blogs and websites. Phase two was field research carried out across Latin Europe, and phase three was social media focused, taking into account that perceptions are memory-conditioned, based on the recall of past ideas. Overall, the research has shown the poor cultural expression of Chinese entertainment in Latin Europe and demonstrated the absence of Chinese content in French, Italian, Portuguese and Spanish Business to Consumer retailers; a reflection of its low significance in Latin European markets and of the short life cycle of entertainment products in general, which are bubble-gum, disposable goods without a mid- to long-term effect on consumers' lives. The process of conducting comprehensive international research was complex and time-consuming, complicated by data not always being available in Mandarin, the researcher’s linguistic deficiency, limited knowledge of Chinese culture, and issues of cultural equivalence. Despite steps taken to minimize the limitations of the proposed international research, theoretical limitations concerning Latin Europe and China still occurred. Data accuracy was disputable; sampling and data collection/analysis methods were heterogeneous; and ascertaining data requirements and the method of analysis to achieve construct equivalence was challenging and slow to operationalize. Secondary data was also often not readily available in Mandarin; yet, in spite of the array of limitations, the research was done, and results were produced.

Keywords: research methodologies, international research, primary data, secondary data, research limitations, online dramas, China, Latin Europe

Procedia PDF Downloads 45