Search results for: audio fingerprinting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 461

Search results for: audio fingerprinting

431 Using Audio-Visual Aids and Computer-Assisted Language Instruction to Overcome Learning Difficulties of Sound System in Students of Special Needs

Authors: Sadeq Al Yaari, Ayman Al Yaari, Adham Al Yaari, Montaha Al Yaari, Aayah Al Yaari, Sajedah Al Yaari

Abstract:

Background & Objectives: Audio-visual aids and computer-assisted language instruction (CALI) effects are strong in teaching language components (sound system, grammatical structures and vocabulary) to students of special needs. To explore the effects of the audio-visual aids and CALI in teaching sound system to this class of students by speech language therapists (SLTs), an experiment has been undertaken to evaluate their performance during their study of the sound system course. Methods: Forty students (males and females) of special needs at al-Malādh school for teaching students of special needs in Dhamar (Yemen) range between 8 and 18 years old underwent this experimental study while they were studying language sound system course. Pre-and-posttests have been administered at the begging and end of the semester. Students' treatment was compared to a similar group (control group) of the same number under the same environment. Whereas the first group was taught using audio-visual aids and CALI, the second was not. Students' performances were linguistically and statistically evaluated. Results & conclusions: Compared with the control group, the treatment group showed significantly higher scores in the posttest (72.32% vs. 31%). Compared with females, males scored higher marks (1421 vs. 1472). Thus, we should take the audio-visual aids and CALI into consideration in teaching sound system to students of special needs.

Keywords: language components, sound system, audio-visual aids, CALI, students, special needs, SLTs

Procedia PDF Downloads 46
430 On Musical Information Geometry with Applications to Sonified Image Analysis

Authors: Shannon Steinmetz, Ellen Gethner

Abstract:

In this paper, a theoretical foundation is developed for patterned segmentation of audio using the geometry of music and statistical manifold. We demonstrate image content clustering using conic space sonification. The algorithm takes a geodesic curve as a model estimator of the three-parameter Gamma distribution. The random variable is parameterized by musical centricity and centric velocity. Model parameters predict audio segmentation in the form of duration and frame count based on the likelihood of musical geometry transition. We provide an example using a database of randomly selected images, resulting in statistically significant clusters of similar image content.

Keywords: sonification, musical information geometry, image, content extraction, automated quantification, audio segmentation, pattern recognition

Procedia PDF Downloads 237
429 Audio-Lingual Method and the English-Speaking Proficiency of Grade 11 Students

Authors: Marthadale Acibo Semacio

Abstract:

Speaking skill is a crucial part of English language teaching and learning. This actually shows the great importance of this skill in English language classes. Through speaking, ideas and thoughts are shared with other people, and a smooth interaction between people takes place. The study examined the levels of speaking proficiency of the control and experimental groups on pronunciation, grammatical accuracy, and fluency. As a quasi-experimental study, it also determined the presence or absence of significant changes in their speaking proficiency levels in terms of pronouncing the words correctly, the accuracy of grammar and fluency of a language given the two methods to the groups of students in the English language, using the traditional and audio-lingual methods. Descriptive and inferential statistics were employed according to the stated specific problems. The study employed a video presentation with prior information about it. In the video, the teacher acts as model one, giving instructions on what is going to be done, and then the students will perform the activity. The students were paired purposively based on their learning capabilities. Observing proper ethics, their performance was audio recorded to help the researcher assess the learner using the modified speaking rubric. The study revealed that those under the traditional method were more fluent than those in the audio-lingual method. With respect to the way in which each method deals with the feelings of the student, the audio-lingual one fails to provide a principle that would relate to this area and follows the assumption that the intrinsic motivation of the students to learn the target language will spring from their interest in the structure of the language. However, the speaking proficiency levels of the students were remarkably reinforced in reading different words through the aid of aural media with their teachers. The study concluded that using an audio-lingual method of teaching is not a stand-alone method but only an aid of the teacher in helping the students improve their speaking proficiency in the English Language. Hence, audio-lingual approach is encouraged to be used in teaching English language, on top of the chalk-talk or traditional method, to improve the speaking proficiency of students.

Keywords: audio-lingual, speaking, grammar, pronunciation, accuracy, fluency, proficiency

Procedia PDF Downloads 69
428 Carrier Communication through Power Lines

Authors: Pavuluri Gopikrishna, B. Neelima

Abstract:

Power line carrier communication means audio power transmission via power line and reception of the amplified audio power at the receiver as in the form of speaker output signal using power line as the channel medium. The main objective of this suggested work is to transmit our message signal after frequency modulation by the help of FM modulator IC LM565 which gives output proportional to the input voltage of the input message signal. And this audio power is received from the power line by the help of isolation circuit and demodulated from IC LM565 which uses the concept of the PLL and produces FM demodulated signal to the listener. Message signal will be transmitted over the carrier signal that will be generated from the FM modulator IC LM565. Using this message signal will not damage because of no direct contact of message signal from the power line, but noise can disturb our information.

Keywords: amplification, fm demodulator ic 565, fm modulator ic 565, phase locked loop, power isolation

Procedia PDF Downloads 552
427 Molecular and Phytochemical Fingerprinting of Anti-Cancer Drug Yielding Plants in South India

Authors: Alexis John de Britto

Abstract:

Studies were performed to select the superior genotypes based on intra-specific variations, caused by phytogeographical, climatic and edaphic parameters of three anti cancer drug yielding mangrove plants such as Acanthus ilicifolius L., Calophyllum inophyllum L. and Excoecaria agallocha L. using ISSR (Inter Simple Sequence Repeats) markers and phytochemical analysis such as preliminary phytochemical tests, TLC, HPTLC, HPLC and antioxidant tests. The plants were collected from five different geographical locations of the East Coast of south India. Genetic heterozygosity, Nei’s gene diversity, Shannon’s information index and Percentage of polymorphism between the populations were calculated using POPGENE software. Cluster analysis was performed using UPGMA algorithm. AMOVA and correlations between genetic diversity and soil factors were analyzed. Combining the molecular and phytochemical variations superior genotypes were selected. Conservation constraints and methods of efficient exploitation of the species are discussed.

Keywords: anti-cancer drug yielding plants, DNA fingerprinting, phytochemical analysis, selection of superior genotypes

Procedia PDF Downloads 330
426 The Implication of News Segments and Movies for Enhancing Listening Comprehension of Language Learners

Authors: Taher Bahrani

Abstract:

Armed with technological development, the present study aimed at gauging the effectiveness of exposure to news and movies as two types of audio-visual programs on improving language learners’ listening comprehension at the intermediate level. To this end, a listening comprehension test was administered to 108 language learners and finally 60 language learners were selected as intermediate language learners and randomly divided into group one and group two. During the experiment, group one participants had exposure to audio-visual news stories to work on in-and out-side the classroom. On the contrary, the participants in group two had only exposure to a sample selected utterances extracted from different kinds of movies. At the end of the experiment, both groups took another sample listening test to find out to what extent the participants in each group could enhance their listening comprehension. The results obtained from the post-test were indicative of the fact that the participants who had exposure to news outperformed the participants who had exposure to movies. The findings of the present research seem to indicate that the language input embedded in the type of audio-visual programs which language learners are exposed to is more important than the amount of exposure.

Keywords: audio-visual news, movies, listening comprehension, intermediate level

Procedia PDF Downloads 382
425 The Influence of Audio-Visual Resources in Teaching Business Subjects in Selected Secondary Schools in Ifako Ijaiye Local Government Area of Lagos State, Nigeria

Authors: Oluwole Victor Falobi, Lawrence Olusola Ige

Abstract:

The cardinal drawing force of this study is to examine the influence of audio-visual resources in teaching business subjects in selected secondary schools in IfakoIjaiye Local Government Area of Lagos State, Nigeria. A descriptive survey research design was employed for the study. By using a quantitative research approach and a sample size of 120 students were randomly selected from four public schools. Three research questions with one hypothesis guided the study. Data collected were analysed using frequency, the mean and standard deviation for the research questions, and Pearson Product Moment Correlation PPMC were used to analysed the inferential statistic. Findings from the study revealed that the Influence of audio-visual resources in teaching business subjects in selected secondary schools in IfakoIjaiye Local Government Area of Lagos State is low. It further revealed data the knowledge of teachers on the use of audio-visual resources is high in Ifako Local Government Area. It was recommended that government should create a timely monitoring system in other to check secondary school laboratories and classrooms to replace outdated facilities and also purchase needed facilities for effective teaching and learning to take place.

Keywords: audio-visual resources, business subjects, school, teaching

Procedia PDF Downloads 99
424 Atomic Decomposition Audio Data Compression and Denoising Using Sparse Dictionary Feature Learning

Authors: T. Bryan , V. Kepuska, I. Kostnaic

Abstract:

A method of data compression and denoising is introduced that is based on atomic decomposition of audio data using “basis vectors” that are learned from the audio data itself. The basis vectors are shown to have higher data compression and better signal-to-noise enhancement than the Gabor and gammatone “seed atoms” that were used to generate them. The basis vectors are the input weights of a Sparse AutoEncoder (SAE) that is trained using “envelope samples” of windowed segments of the audio data. The envelope samples are extracted from the audio data by performing atomic decomposition with Gabor or gammatone seed atoms. This process identifies segments of audio data that are locally coherent with the seed atoms. Envelope samples are extracted by identifying locally coherent audio data segments with Gabor or gammatone seed atoms, found by matching pursuit. The envelope samples are formed by taking the kronecker products of the atomic envelopes with the locally coherent data segments. Oracle signal-to-noise ratio (SNR) verses data compression curves are generated for the seed atoms as well as the basis vectors learned from Gabor and gammatone seed atoms. SNR data compression curves are generated for speech signals as well as early American music recordings. The basis vectors are shown to have higher denoising capability for data compression rates ranging from 90% to 99.84% for speech as well as music. Envelope samples are displayed as images by folding the time series into column vectors. This display method is used to compare of the output of the SAE with the envelope samples that produced them. The basis vectors are also displayed as images. Sparsity is shown to play an important role in producing the highest denoising basis vectors.

Keywords: sparse dictionary learning, autoencoder, sparse autoencoder, basis vectors, atomic decomposition, envelope sampling, envelope samples, Gabor, gammatone, matching pursuit

Procedia PDF Downloads 253
423 Tracing Sources of Sediment in an Arid River, Southern Iran

Authors: Hesam Gholami

Abstract:

Elevated suspended sediment loads in riverine systems resulting from accelerated erosion due to human activities are a serious threat to the sustainable management of watersheds and ecosystem services therein worldwide. Therefore, mitigation of deleterious sediment effects as a distributed or non-point pollution source in the catchments requires reliable provenance information. Sediment tracing or sediment fingerprinting, as a combined process consisting of sampling, laboratory measurements, different statistical tests, and the application of mixing or unmixing models, is a useful technique for discriminating the sources of sediments. From 1996 to the present, different aspects of this technique, such as grouping the sources (spatial and individual sources), discriminating the potential sources by different statistical techniques, and modification of mixing and unmixing models, have been introduced and modified by many researchers worldwide, and have been applied to identify the provenance of fine materials in agricultural, rural, mountainous, and coastal catchments, and in large catchments with numerous lakes and reservoirs. In the last two decades, efforts exploring the uncertainties associated with sediment fingerprinting results have attracted increasing attention. The frameworks used to quantify the uncertainty associated with fingerprinting estimates can be divided into three groups comprising Monte Carlo simulation, Bayesian approaches and generalized likelihood uncertainty estimation (GLUE). Given the above background, the primary goal of this study was to apply geochemical fingerprinting within the GLUE framework in the estimation of sub-basin spatial sediment source contributions in the arid Mehran River catchment in southern Iran, which drains into the Persian Gulf. The accuracy of GLUE predictions generated using four different sets of statistical tests for discriminating three sub-basin spatial sources was evaluated using 10 virtual sediments (VS) samples with known source contributions using the root mean square error (RMSE) and mean absolute error (MAE). Based on the results, the contributions modeled by GLUE for the western, central and eastern sub-basins are 1-42% (overall mean 20%), 0.5-30% (overall mean 12%) and 55-84% (overall mean 68%), respectively. According to the mean absolute fit (MAF; ≥ 95% for all target sediment samples) and goodness-of-fit (GOF; ≥ 99% for all samples), our suggested modeling approach is an accurate technique to quantify the source of sediments in the catchments. Overall, the estimated source proportions can help watershed engineers plan the targeting of conservation programs for soil and water resources.

Keywords: sediment source tracing, generalized likelihood uncertainty estimation, virtual sediment mixtures, Iran

Procedia PDF Downloads 74
422 DNA Fingerprinting of Some Major Genera of Subterranean Termites (Isoptera) (Anacanthotermes, Psammotermes and Microtermes) from Western Saudi Arabia

Authors: AbdelRahman A. Faragalla, Mohamed H. Alqhtani, Mohamed M. M.Ahmed

Abstract:

Saudi Arabia has currently been beset by a barrage of bizarre assemblages of subterranean termite fauna, inflicting heavy catastrophic havocs on human valued properties in various homes, storage facilities, warehouses, agricultural and horticultural crops including okra, sweet pepper, tomatoes, sorghum, date palm trees, citruses and many forest domains and green lush desert oases. The most pressing urgent priority is to use modern technologies to alleviate the painstaking obstacle of taxonomic identification of these injurious noxious pests that might lead to effective pest control in both infested agricultural commodities and field crops. Our study has indicated the use of DNA fingerprinting technologies, in order to generate basic information of the genetic similarity between 3 predominant families containing the most destructive termite species. The methodologies included extraction and DNA isolation from members of the major families and the use of randomly selected primers and PCR amplifications with the nucleotide sequences. GC content and annealing temperatures for all primers, PCR amplifications and agarose gel electrophoresis were also conducted in addition to the scoring and analysis of Random Amplification Polymorphic DNA-PCR (RAPDs). A phylogenetic analysis for different species using statistical computer program on the basis of RAPD-DNA results, represented as a dendrogram based on the average of band sharing ratio between different species. Our study aims to shed more light on this intriguing subject, which may lead to an expedited display of the kinship and relatedness of species in an ambitious undertaking to arrive at correct taxonomic classification of termite species, discover sibling species, so that a logistic rational pest management strategy could be delineated.

Keywords: DNA fingerprinting, Western Saudi Arabia, DNA primers, RAPD

Procedia PDF Downloads 430
421 Multi-Modal Feature Fusion Network for Speaker Recognition Task

Authors: Xiang Shijie, Zhou Dong, Tian Dan

Abstract:

Speaker recognition is a crucial task in the field of speech processing, aimed at identifying individuals based on their vocal characteristics. However, existing speaker recognition methods face numerous challenges. Traditional methods primarily rely on audio signals, which often suffer from limitations in noisy environments, variations in speaking style, and insufficient sample sizes. Additionally, relying solely on audio features can sometimes fail to capture the unique identity of the speaker comprehensively, impacting recognition accuracy. To address these issues, we propose a multi-modal network architecture that simultaneously processes both audio and text signals. By gradually integrating audio and text features, we leverage the strengths of both modalities to enhance the robustness and accuracy of speaker recognition. Our experiments demonstrate significant improvements with this multi-modal approach, particularly in complex environments, where recognition performance has been notably enhanced. Our research not only highlights the limitations of current speaker recognition methods but also showcases the effectiveness of multi-modal fusion techniques in overcoming these limitations, providing valuable insights for future research.

Keywords: feature fusion, memory network, multimodal input, speaker recognition

Procedia PDF Downloads 33
420 Mapping the Sonic Spectrum of Traditional Music and Instruments Used in Malaysian Kavadi Rituals

Authors: Ainolnaim Azizol, Valerie Ross

Abstract:

Music is as old as mankind and rituals using music such as Kavadi have been associated with social, cultural, and spiritual practices in many traditional and modern societies. Recent literature has provided scientific evidence that music affects psychological and physical changes through stimulation of brainwave. Despite such advances, the scientific study of the sonic qualities peculiar to traditional instruments and how it impacts on ritualistic activities is still lacking. This study addresses one such phenomenon. Devotees in Kavadi rituals are known to be in a state of trance state and do not experience pain nor suffer injury despite the hundreds of needles pierced through their skins. Although scientists have sought to understand how this is possible, lesser is known about the music that is used to prepare devotees to enter into the trance state. This study fills this gap of knowledge by providing scientific evidence through the identification and mapping of the sonic spectrum or sound fingerprint of the instruments and the repertoire used in these ritualistic forms in their ethnographic environment and in audio-controlled situations. The objectives are to identify and categorize the different types of traditional music used in Kavadi rituals; to record, transcribe and digitally score the musical repertoire used in the oral tradition of Kavadi rituals; to map the sonic spectrum of ritual music using spectromography and advanced music analytical software a mixed methodology will be used. This comprises ethnographic field studies using interviews, participant observation, audio-video recordings and audio-methodology using spectromography and advanced audio-technology for sonic mapping and the transcription of audio recordings into digital scores.

Keywords: sonic, traditional, ritual, Kavadi, music

Procedia PDF Downloads 242
419 Development of DNA Fingerprints in Selected Medicinal Plants of India

Authors: V. Verma, Hazi Raja

Abstract:

Conventionally, morphological descriptors are routinely used for establishing the identity of varieties. But these morphological descriptors suffer from many drawbacks such as influence of environment on trait expression, epistatic interactions, pleiotrophic effects etc. Furthermore, the paucity of a sufficient number of these descriptors for unequivocal identification of increasing number of reference collection varieties enforces to look for alternatives. Therefore, DNA based finger-print based techniques were selected to define the systematic position of the selected medicinal plants like Plumbago zeylanica, Desmodium gangeticum, Uraria picta. DNA fingerprinting of herbal plants can be useful in authenticating the various claims of medical uses related to the plants, in germplasm characterization and conservation. In plants it has not only helped in identifying species but also in defining a new realm in plant genomics, plant breeding and in conserving the biodiversity. With world paving way for developments in biotechnology, DNA fingerprinting promises a very powerful tool in our future endeavors. Data will be presented on the development of microsatellite markers (SSR) used to fingerprint, characterize, and assess genetic diversity among 12 accessions of both Plumbago zeylanica, 4 accessions of Desmodium gengaticum, 4 accessions of Uraria Picta.

Keywords: Plumbago zeylanica, Desmodium gangeticum, Uraria picta, microsaetllite markers

Procedia PDF Downloads 216
418 ISSR Based Molecular Phylogeny in Naturally Growing Suaeda Populations of Saudi Arabia

Authors: Mohammed Abdullah Basahi

Abstract:

The objective of the present study was to identify the phylogenetic relationships and determine genetic diversity among Suaeda genotypes growing in Saudi Arabia and to find out whether these could be a potential source for genetic diversity. A set of nineteen genotypes was analyzed using twenty-four ISSR primers. Clear amplified polymorphic DNA products were obtained from the screening of twenty-four ISSR primers on nineteen genotypes that allowed selection of ten primers and the results were reproducible. Nineteen genotypes were revealed a unique profile with ten ISSR primers and thus it can be used for the DNA fingerprinting. Different primers produced a different level of polymorphism among the nineteen genotypes. The number of polymorphic bands per primer varied from 5 to 14 with an average of 8 bands per primer. The results revealed that the genotypes differed for ISSR markers. The genetic similarity based on Nei and Li’s ranged from 0.450 to 0.930. Cluster analysis was conducted based on ISSR data to group the Suaeda genotypes and to construct a dendrogram. Four groups can be distinguished by truncating the dendrogram at GS value of 0.54. ISSR markers showed high level of polymorphism among the genotypes examined. The present study indicates that ISSR markers could be successfully used in genetic characterization and diversity in Suaeda.

Keywords: suaeda, DNA fingerprinting, ISSR, Saudi Arabia

Procedia PDF Downloads 331
417 Illumina MiSeq Sequencing for Bacteria Identification on Audio-Visual Materials

Authors: Tereza Branyšová, Martina Kračmarová, Kateřina Demnerová, Michal Ďurovič, Hana Stiborová

Abstract:

Microbial deterioration threatens all objects of cultural heritage, including audio-visual materials. Fungi are commonly known to be the main factor in audio-visual material deterioration. However, although being neglected, bacteria also play a significant role. In addition to microbial contamination of materials, it is also essential to analyse air as a possible contamination source. This work aims to identify bacterial species in the archives of the Czech Republic that occur on audio-visual materials as well as in the air in the archives. For sampling purposes, the smears from the materials were taken by sterile polyurethane sponges, and the air was collected using a MAS-100 aeroscope. Metagenomic DNA from all collected samples was immediately isolated and stored at -20 °C. DNA library for the 16S rRNA gene was prepared using two-step PCR and specific primers and the concentration step was included due to meagre yields of the DNA. After that, the samples were sent to the University of Fairbanks, Alaska, for Illumina MiSeq sequencing. Subsequently, the analysis of the sequences was conducted in R software. The obtained sequences were assigned to the corresponding bacterial species using the DADA2 package. The impact of air contamination and the impact of different photosensitive layers that audio-visual materials were made of, such as gelatine, albumen, and collodion, were evaluated. As a next step, we will take a deeper focus on air contamination. We will select an appropriate culture-dependent approach along with a culture-independent approach to observe a metabolically active species in the air. Acknowledgment: This project is supported by grant no. DG18P02OVV062 of the Ministry of Culture of the Czech Republic.

Keywords: cultural heritage, Illumina MiSeq, metagenomics, microbial identification

Procedia PDF Downloads 156
416 Isolation and Identification of Diacylglycerol Acyltransferase Type-2 (GAT2) Genes from Three Egyptian Olive Cultivars

Authors: Yahia I. Mohamed, Ahmed I. Marzouk, Mohamed A. Yacout

Abstract:

Aim of this work was to study the genetic basis for oil accumulation in olive fruit via tracking DGAT2 (Diacylglycerol acyltransferase type-2) gene in three Egyptian Origen Olive cultivars namely Toffahi, Hamed and Maraki using molecular marker techniques and bioinformatics tools. Results illustrate that, firstly: specific genomic band of Maraki cultivars was identified as DGAT2 (Diacylglycerol acyltransferase type-2) and identical for this gene in Olea europaea with 100 % of similarity. Secondly, differential genomic band of Maraki cultivars which produced from RAPD fingerprinting technique reflected predicted distinguished sequence which identified as DGAT2 (Diacylglycerol acyltransferase type-2) in Fragaria vesca subsp. Vesca with 76% of sequential similarity. Third and finally, specific genomic specific band of Hamed cultivars was indentified as two fragments, 1-Olea europaea cultivar Koroneiki diacylglycerol acyltransferase type 2 mRNA, complete cds with two matches regions with 99% or 2-PREDICTED: Fragaria vesca subsp. vesca diacylglycerol O-acyltransferase 2-like (LOC101313050), mRNA with 86% of similarity.

Keywords: Olea europaea, fingerprinting, diacylglycerol acyltransferase type-2 (DGAT2), Egypt

Procedia PDF Downloads 503
415 Analytical Authentication of Butter Using Fourier Transform Infrared Spectroscopy Coupled with Chemometrics

Authors: M. Bodner, M. Scampicchio

Abstract:

Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the content of margarine in adulterated butter samples was investigated. Fingerprinting region (1400-800 cm–1) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprinting region shows a clustering of the two sample types. All samples were classified in their rightful class by SIMCA approach; however, nine adulterated samples (between 1% and 30% w/w of margarine) were classified as belonging both at the butter class and at the non-butter one. In the two-class PLS-DA model’s (R2 = 0.73, RMSEP, Root Mean Square Error of Prediction = 0.26% w/w) sensitivity was 71.4% and Positive Predictive Value (PPV) 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, PLS-R model (R2 = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. Results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to identify pure butter samples from adulterated ones and to determine the grade of adulteration of margarine in butter samples.

Keywords: adulterated butter, margarine, PCA, PLS-DA, PLS-R, SIMCA

Procedia PDF Downloads 143
414 Correlation between Speech Emotion Recognition Deep Learning Models and Noises

Authors: Leah Lee

Abstract:

This paper examines the correlation between deep learning models and emotions with noises to see whether or not noises mask emotions. The deep learning models used are plain convolutional neural networks (CNN), auto-encoder, long short-term memory (LSTM), and Visual Geometry Group-16 (VGG-16). Emotion datasets used are Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D), Toronto Emotional Speech Set (TESS), and Surrey Audio-Visual Expressed Emotion (SAVEE). To make it four times bigger, audio set files, stretch, and pitch augmentations are utilized. From the augmented datasets, five different features are extracted for inputs of the models. There are eight different emotions to be classified. Noise variations are white noise, dog barking, and cough sounds. The variation in the signal-to-noise ratio (SNR) is 0, 20, and 40. In summation, per a deep learning model, nine different sets with noise and SNR variations and just augmented audio files without any noises will be used in the experiment. To compare the results of the deep learning models, the accuracy and receiver operating characteristic (ROC) are checked.

Keywords: auto-encoder, convolutional neural networks, long short-term memory, speech emotion recognition, visual geometry group-16

Procedia PDF Downloads 75
413 Using Audio-Visual Aids and Computer-Assisted Language Instruction to Overcome Learning Difficulties of Vocabulary in Students of Special Needs

Authors: Sadeq Al Yaari, Ayman Al Yaari, Adham Al Yaari, Montaha Al Yaari, Aayah Al Yaari, Sajedah Al Yaar

Abstract:

Objectives: To assess the effect of using audio-visual aids and computer-assisted/ aided language instruction (CALI) in the performance of students of special needs studying vocabulary course. Methods: The performance of forty students of special needs (males and females) who used audiovisual aids and CALI in their vocabulary course at al-Malādh school for students of special needs was compared to that of another group (control group) of the same number and age (8-18). Again, subjects in the experimental group were given lessons using audio-visual aids and CALI, while those in the control group were given lessons using ordinary educational aids only, although both groups almost shared the same features (class environment, speech language therapist (SLT), etc.). Pre-andposttest was given at the beginning and end of the semester and a qualitative and quantitative analysis followed. Results & conclusions: Results of the present experimental study's pre-and-posttests indicated that the performance of the students in the first group was higher than that of those of the second group (34.27%, 73.82% vs. 33.57%, 34.92%, respectively). Compared with females, males’ performance was higher (1515 scores vs. 1438 scores). Such findings suggest that the presence of these audiovisual aids and CALI in the classes of students of special needs, especially if they are studying vocabulary building course is very important due to their usefulness in the improvement of performance of the students of special needs.

Keywords: language components, vocabulary, audio-visual aids, CALI, special needs, students, SLTs

Procedia PDF Downloads 50
412 The Audio-Visual and Syntactic Priming Effect on Specific Language Impairment and Gender in Modern Standard Arabic

Authors: Mohammad Al-Dawoody

Abstract:

This study aims at exploring if priming is affected by gender in Modern Standard Arabic and if it is restricted solely to subjects with no specific language impairment (SLI). The sample in this study consists of 74 subjects, between the ages of 11;1 and 11;10, distributed into (a) 2 SLI experimental groups of 38 subjects divided into two gender groups of 18 females and 20 males and (b) 2 non-SLI control groups of 36 subjects divided into two gender groups of 17 females and 19 males. Employing a mixed research design, the researcher conducted this study within the framework of the relevance theory (RT) whose main assumption is that human beings are endowed with a biological ability to magnify the relevance of the incoming stimuli. Each of the four groups was given two different priming stimuli: audio-visual priming (T1) and syntactic priming (T2). The results showed that the priming effect was sheer distinct among SLI participants especially when retrieving typical responses (TR) in T1 and T2 with slight superiority of males over females. The results also revealed that non-SLI females showed stronger original response (OR) priming in T1 than males and that non-SLI males in T2 excelled in OR priming than females. Furthermore, the results suggested that the audio-visual priming has a stronger effect on SLI females than non-SLI females and that syntactic priming seems to have the same effect on the two groups (non-SLI and SLI females). The conclusion is that the priming effect varies according to gender and is not confined merely to non-SLI subjects.

Keywords: specific language impairment, relevance theory, audio-visual priming, syntactic priming, modern standard Arabic

Procedia PDF Downloads 176
411 Multimodal Convolutional Neural Network for Musical Instrument Recognition

Authors: Yagya Raj Pandeya, Joonwhoan Lee

Abstract:

The dynamic behavior of music and video makes it difficult to evaluate musical instrument playing in a video by computer system. Any television or film video clip with music information are rich sources for analyzing musical instruments using modern machine learning technologies. In this research, we integrate the audio and video information sources using convolutional neural network (CNN) and pass network learned features through recurrent neural network (RNN) to preserve the dynamic behaviors of audio and video. We use different pre-trained CNN for music and video feature extraction and then fine tune each model. The music network use 2D convolutional network and video network use 3D convolution (C3D). Finally, we concatenate each music and video feature by preserving the time varying features. The long short term memory (LSTM) network is used for long-term dynamic feature characterization and then use late fusion with generalized mean. The proposed network performs better performance to recognize the musical instrument using audio-video multimodal neural network.

Keywords: multimodal, 3D convolution, music-video feature extraction, generalized mean

Procedia PDF Downloads 215
410 A Comparison of Proxemics and Postural Head Movements during Pop Music versus Matched Music Videos

Authors: Harry J. Witchel, James Ackah, Carlos P. Santos, Nachiappan Chockalingam, Carina E. I. Westling

Abstract:

Introduction: Proxemics is the study of how people perceive and use space. It is commonly proposed that when people like or engage with a person/object, they will move slightly closer to it, often quite subtly and subconsciously. Music videos are known to add entertainment value to a pop song. Our hypothesis was that by adding appropriately matched video to a pop song, it would lead to a net approach of the head to the monitor screen compared to simply listening to an audio-only version of the song. Methods: We presented to 27 participants (ages 21.00 ± 2.89, 15 female) seated in front of 47.5 x 27 cm monitor two musical stimuli in a counterbalanced order; all stimuli were based on music videos by the band OK Go: Here It Goes Again (HIGA, boredom ratings (0-100) = 15.00 ± 4.76, mean ± SEM, standard-error-of-the-mean) and Do What You Want (DWYW, boredom ratings = 23.93 ± 5.98), which did not differ in boredom elicited (P = 0.21, rank-sum test). Each participant experienced each song only once, and one song (counterbalanced) as audio-only versus the other song as a music video. The movement was measured by video-tracking using Kinovea 0.8, based on recording from a lateral aspect; before beginning, each participant had a reflective motion tracking marker placed on the outer canthus of the left eye. Analysis of the Kinovea X-Y coordinate output in comma-separated-variables format was performed in Matlab, as were non-parametric statistical tests. Results: We found that the audio-only stimuli (combined for both HIGA and DWYW, mean ± SEM, 35.71 ± 5.36) were significantly more boring than the music video versions (19.46 ± 3.83, P = 0.0066 Wilcoxon Signed Rank Test (WSRT), Cohen's d = 0.658, N = 28). We also found that participants' heads moved around twice as much during the audio-only versions (speed = 0.590 ± 0.095 mm/sec) compared to the video versions (0.301 ± 0.063 mm/sec, P = 0.00077, WSRT). However, the participants' mean head-to-screen distances were not detectably smaller (i.e. head closer to the screen) during the music videos (74.4 ± 1.8 cm) compared to the audio-only stimuli (73.9 ± 1.8 cm, P = 0.37, WSRT). If anything, during the audio-only condition, they were slightly closer. Interestingly, the ranges of the head-to-screen distances were smaller during the music video (8.6 ± 1.4 cm) compared to the audio-only (12.9 ± 1.7 cm, P = 0.0057, WSRT), the standard deviations were also smaller (P = 0.0027, WSRT), and their heads were held 7 mm higher (video 116.1 ± 0.8 vs. audio-only 116.8 ± 0.8 cm above floor, P = 0.049, WSRT). Discussion: As predicted, sitting and listening to experimenter-selected pop music was more boring than when the music was accompanied by a matched, professionally-made video. However, we did not find that the proxemics of the situation led to approaching the screen. Instead, adding video led to efforts to control the head to a more central and upright viewing position and to suppress head fidgeting.

Keywords: boredom, engagement, music videos, posture, proxemics

Procedia PDF Downloads 167
409 A Guide to the Implementation of Ambisonics Super Stereo

Authors: Alessio Mastrorillo, Giuseppe Silvi, Francesco Scagliola

Abstract:

In this work, we introduce an Ambisonics decoder with an implementation of the C-format, also called Super Stereo. This format is an alternative to conventional stereo and binaural decoding. Unlike those, this format conveys audio information from the horizontal plane and works with stereo speakers and headphones. The two C-format channels can also return a reconstructed planar B-format. This work provides an open-source implementation for this format. We implement an all-pass filter for signal quadrature, as required by the decoding equations. This filter works with six Biquads in a cascade configuration, with values for control frequency and quality factor discovered experimentally. The phase response of the filter delivers a small error in the 20-14.000Hz range. The decoder has been tested with audio sources up to 192kHz sample rate, returning pristine sound quality and detailed stereo image. It has been included in the Envelop for Live suite and is available as an open-source repository. This decoder has applications in Virtual Reality and 360° audio productions, music composition, and online streaming.

Keywords: ambisonics, UHJ, quadrature filter, virtual reality, Gerzon, decoder, stereo, binaural, biquad

Procedia PDF Downloads 91
408 Subtitled Based-Approach for Learning Foreign Arabic Language

Authors: Elleuch Imen

Abstract:

In this paper, it propose a new approach for learning Arabic as a foreign language via audio-visual translation, particularly subtitling. The approach consists of developing video sequences appropriate to different levels of learning (from A1 to C2) containing conversations, quizzes, games and others. Each video aims to achieve a specific objective, such as the correct pronunciation of Arabic words, the correct syntactic structuring of Arabic sentences, the recognition of the morphological characteristics of terms and the semantic understanding of statements. The subtitled videos obtained can be incorporated into different Arabic second language learning tools such as Moocs, websites, platforms, etc.

Keywords: arabic foreign language, learning, audio-visuel translation, subtitled videos

Procedia PDF Downloads 61
407 Method Comprising One to One Web Based Real Time Communications

Authors: Lata Kiran Dey, Rajendra Kumar, Biren Karmakar

Abstract:

Web Real Time Communications is a collection of standards, protocols, which provides real-time communications capabilities between web browsers and devices. This paper outlines the design and further implementation of web real-time communications on secure web applications having audio and video call capabilities. This proposed application may put up a system that will be able to work over both desktops as well as the mobile browser. Though, WebRTC also gives a set of JavaScript standard RTC APIs, which primarily works over the real-time communication framework. This helps to build a suitable communication application, which enables the audio, video, and message transfer in between the today’s modern browsers having WebRTC support.

Keywords: WebRTC, SIP, RTC, JavaScript, SRTP, secure web sockets, browser

Procedia PDF Downloads 148
406 Preliminary Evaluation of Echinacea Species by UV-VIS Spectroscopy Fingerprinting of Phenolic Compounds

Authors: Elena Ionescu, Elena Iacob, Marie-Louise Ionescu, Carmen Elena Tebrencu, Oana Teodora Ciuperca

Abstract:

Echinacea species (Asteraceae) has received a global attention because it is widely used for treatment of cold, flu and upper respiratory tract infections. Echinacea species contain a great variety of chemical components that contribute to their activity. The most important components responsible for the biological activity are those with high molecular-weight such as polysaccharides, polyacetylenes, highly unsaturated alkamides and caffeic acid derivatives. The principal factors that may influence the chemical composition of Echinacea include the species and the part of plant used (aerial parts or roots ). In recent years the market for Echinacea has grown rapidly and also the cases of adultery/replacement especially for Echinacea root. The identification of presence or absence of same biomarkers provide information for safe use of Echinacea species in food supplements industry. The aim of the study was the preliminary evaluation and fingerprinting by UV-VISIBLE spectroscopy of biomarkers in terms of content in phenolic derivatives of some Echinacea species (E. purpurea, E. angustifolia and E. pallida) for identification and authentication of the species. The steps of the study were: (1) samples (extracts) preparation from Echinacea species (non-hydrolyzed and hydrolyzed ethanol extracts); (2) samples preparation of reference substances (polyphenol acids: caftaric acid, caffeic acid, chlorogenic acid, ferulic acid; flavonoids: rutoside, hyperoside, isoquercitrin and their aglycones: quercitri, quercetol, luteolin, kaempferol and apigenin); (3) identification of specific absorption at wavelengths between 700-200 nm; (4) identify the phenolic compounds from Echinacea species based on spectral characteristics and the specific absorption; each class of compounds corresponds to a maximum absorption in the UV spectrum. The phytochemical compounds were identified at specific wavelengths between 700-200 nm. The absorption intensities were measured. The obtained results proved that ethanolic extract showed absorption peaks attributed to: phenolic compounds (free phenolic acids and phenolic acids derivatives) registrated between 220-280 nm, unsymmetrical chemical structure compounds (caffeic acid, chlorogenic acid, ferulic acid) with maximum absorption peak and absorption "shoulder" that may be due to substitution of hydroxyl or methoxy group, flavonoid compounds (in free form or glycosides) between 330-360 nm, due to the double bond in position 2,3 and carbonyl group in position 4 flavonols. UV spectra showed two major peaks of absorption (quercetin glycoside, rutin, etc.). The results obtained by UV-VIS spectroscopy has revealed the presence of phenolic derivatives such as cicoric acid (240 nm), caftaric acid (329 nm), caffeic acid (240 nm), rutoside (205 nm), quercetin (255 nm), luteolin (235 nm) in all three species of Echinacea. The echinacoside is absent. This profile mentioned above and the absence of phenolic compound echinacoside leads to the conclusion that species harvested as Echinacea angustifolia and Echinacea pallida are Echinacea purpurea also; It can be said that preliminary fingerprinting of Echinacea species through correspondence with the phenolic derivatives profile can be achieved by UV-VIS spectroscopic investigation, which is an adequate technique for preliminary identification and authentication of Echinacea in medicinal herbs.

Keywords: Echinacea species, Fingerprinting, Phenolic compounds, UV-VIS spectroscopy

Procedia PDF Downloads 261
405 Teaching Speaking Skills to Adult English Language Learners through ALM

Authors: Wichuda Kunnu, Aungkana Sukwises

Abstract:

Audio-lingual method (ALM) is a teaching approach that is claimed that ineffective for teaching second/foreign languages. Because some linguists and second/foreign language teachers believe that ALM is a rote learning style. However, this study is done on a belief that ALM will be able to solve Thais’ English speaking problem. This paper aims to report the findings on teaching English speaking to adult learners with an “adapted ALM”, one distinction of which is to use Thai as the medium language of instruction. The participants are consisted of 9 adult learners. They were allowed to speak English more freely using both the materials presented in the class and their background knowledge of English. At the end of the course, they spoke English more fluently, more confidently, to the extent that they applied what they learnt both in and outside the class.

Keywords: teaching English, audio lingual method, cognitive science, psychology

Procedia PDF Downloads 418
404 Intelligent Indoor Localization Using WLAN Fingerprinting

Authors: Gideon C. Joseph

Abstract:

The ability to localize mobile devices is quite important, as some applications may require location information of these devices to operate or deliver better services to the users. Although there are several ways of acquiring location data of mobile devices, the WLAN fingerprinting approach has been considered in this work. This approach uses the Received Signal Strength Indicator (RSSI) measurement as a function of the position of the mobile device. RSSI is a quantitative technique of describing the radio frequency power carried by a signal. RSSI may be used to determine RF link quality and is very useful in dense traffic scenarios where interference is of major concern, for example, indoor environments. This research aims to design a system that can predict the location of a mobile device, when supplied with the mobile’s RSSIs. The developed system takes as input the RSSIs relating to the mobile device, and outputs parameters that describe the location of the device such as the longitude, latitude, floor, and building. The relationship between the Received Signal Strengths (RSSs) of mobile devices and their corresponding locations is meant to be modelled; hence, subsequent locations of mobile devices can be predicted using the developed model. It is obvious that describing mathematical relationships between the RSSIs measurements and localization parameters is one option to modelling the problem, but the complexity of such an approach is a serious turn-off. In contrast, we propose an intelligent system that can learn the mapping of such RSSIs measurements to the localization parameters to be predicted. The system is capable of upgrading its performance as more experiential knowledge is acquired. The most appealing consideration to using such a system for this task is that complicated mathematical analysis and theoretical frameworks are excluded or not needed; the intelligent system on its own learns the underlying relationship in the supplied data (RSSI levels) that corresponds to the localization parameters. These localization parameters to be predicted are of two different tasks: Longitude and latitude of mobile devices are real values (regression problem), while the floor and building of the mobile devices are of integer values or categorical (classification problem). This research work presents artificial neural network based intelligent systems to model the relationship between the RSSIs predictors and the mobile device localization parameters. The designed systems were trained and validated on the collected WLAN fingerprint database. The trained networks were then tested with another supplied database to obtain the performance of trained systems on achieved Mean Absolute Error (MAE) and error rates for the regression and classification tasks involved therein.

Keywords: indoor localization, WLAN fingerprinting, neural networks, classification, regression

Procedia PDF Downloads 347
403 Improving Fingerprinting-Based Localization System Using Generative AI

Authors: Getaneh Berie Tarekegn

Abstract:

A precise localization system is crucial for many artificial intelligence Internet of Things (AI-IoT) applications in the era of smart cities. Their applications include traffic monitoring, emergency alarming, environmental monitoring, location-based advertising, intelligent transportation, and smart health care. The most common method for providing continuous positioning services in outdoor environments is by using a global navigation satellite system (GNSS). Due to nonline-of-sight, multipath, and weather conditions, GNSS systems do not perform well in dense urban, urban, and suburban areas.This paper proposes a generative AI-based positioning scheme for large-scale wireless settings using fingerprinting techniques. In this article, we presented a semi-supervised deep convolutional generative adversarial network (S-DCGAN)-based radio map construction method for real-time device localization. It also employed a reliable signal fingerprint feature extraction method with t-distributed stochastic neighbor embedding (t-SNE), which extracts dominant features while eliminating noise from hybrid WLAN and long-term evolution (LTE) fingerprints. The proposed scheme reduced the workload of site surveying required to build the fingerprint database by up to 78.5% and significantly improved positioning accuracy. The results show that the average positioning error of GAILoc is less than 0.39 m, and more than 90% of the errors are less than 0.82 m. According to numerical results, SRCLoc improves positioning performance and reduces radio map construction costs significantly compared to traditional methods.

Keywords: location-aware services, feature extraction technique, generative adversarial network, long short-term memory, support vector machine

Procedia PDF Downloads 60
402 Online Delivery Approaches of Post Secondary Virtual Inclusive Media Education

Authors: Margot Whitfield, Andrea Ducent, Marie Catherine Rombaut, Katia Iassinovskaia, Deborah Fels

Abstract:

Learning how to create inclusive media, such as closed captioning (CC) and audio description (AD), in North America is restricted to the private sector, proprietary company-based training. We are delivering (through synchronous and asynchronous online learning) the first Canadian post-secondary, practice-based continuing education course package in inclusive media for broadcast production and processes. Despite the prevalence of CC and AD taught within the field of translation studies in Europe, North America has no comparable field of study. This novel approach to audio visual translation (AVT) education develops evidence-based methodology innovations, stemming from user study research with blind/low vision and Deaf/hard of hearing audiences for television and theatre, undertaken at Ryerson University. Knowledge outcomes from the courses include a) Understanding how CC/AD fit within disability/regulatory frameworks in Canada. b) Knowledge of how CC/AD could be employed in the initial stages of production development within broadcasting. c) Writing and/or speaking techniques designed for media. d) Hands-on practice in captioning re-speaking techniques and open source technologies, or in AD techniques. e) Understanding of audio production technologies and editing techniques. The case study of the curriculum development and deployment, involving first-time online course delivery from academic and practitioner-based instructors in introductory Captioning and Audio Description courses (CDIM 101 and 102), will compare two different instructors' approaches to learning design, including the ratio of synchronous and asynchronous classroom time and technological engagement tools on meeting software platform such as breakout rooms and polling. Student reception of these two different approaches will be analysed using qualitative thematic and quantitative survey analysis. Thus far, anecdotal conversations with students suggests that they prefer synchronous compared with asynchronous learning within our hands-on online course delivery method.

Keywords: inclusive media theory, broadcasting practices, AVT post secondary education, respeaking, audio description, learning design, virtual education

Procedia PDF Downloads 183