Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 43

Search results for: Auditory hallucinations

13 An Approach for Blind Source Separation using the Sliding DFT and Time Domain Independent Component Analysis

Authors: Koji Yamanouchi, Masaru Fujieda, Takahiro Murakami, Yoshihisa Ishida

Abstract:

''Cocktail party problem'' is well known as one of the human auditory abilities. We can recognize the specific sound that we want to listen by this ability even if a lot of undesirable sounds or noises are mixed. Blind source separation (BSS) based on independent component analysis (ICA) is one of the methods by which we can separate only a special signal from their mixed signals with simple hypothesis. In this paper, we propose an online approach for blind source separation using the sliding DFT and the time domain independent component analysis. The proposed method can reduce calculation complexity in comparison with conventional methods, and can be applied to parallel processing by using digital signal processors (DSPs) and so on. We evaluate this method and show its availability.

Keywords: Cocktail party problem, blind Source Separation(BSS), independent component analysis, sliding DFT, onlineprocessing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1638

12 Effects of Mobile Phone Generated High Frequency Electromagnetic Field on the Viability and Biofilm Formation of Staphylococcus aureus

Authors: Zaini Mohd-Zain, Mohd-Saufee A.F. Mohd-Ismail, Norlida Buniyamin

Abstract:

Staphylococcus aureus, one of the microflora in a human external auditory canal (EAC) is frequently exposed to highfrequency electromagnetic field (HF-EMF) generated by mobile phones. It is normally non-pathogenic but in certain circumstances, it can cause infections. This study investigates the changes in the physiology of S. aureus when exposed to HF-EMF of a mobile phone. Exponentially grown S. aureus were exposed to two conditions of EMF irradiation (standby-mode and on-call mode) at four durations; 15, 30, 45 and 60 min. Changes in the viability and biofilm production of the S. aureus were compared between the two conditions of exposure. EMF from the standby-mode has enhanced the growth of S. aureus but during on-call, the growth was suppressed. No significant difference in the amount of biofilm produced in both modes of exposure was observed. Thus, HF-EMF of mobile phone affects the viability of S. aureus but not its ability to produce biofilm.

Keywords: Electromagnetic field, mobile phone, biofilm, Staphylococcus aureus

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2014

11 Self-Supervised Pretraining on Paired Sequences of fMRI Data for Transfer Learning to Brain Decoding Tasks

Authors: Sean Paulsen, Michael Casey

Abstract:

In this work, we present a self-supervised pretraining framework for transformers on functional Magnetic Resonance Imaging (fMRI) data. First, we pretrain our architecture on two self-supervised tasks simultaneously to teach the model a general understanding of the temporal and spatial dynamics of human auditory cortex during music listening. Our pretraining results are the first to suggest a synergistic effect of multitask training on fMRI data. Second, we finetune the pretrained models and train additional fresh models on a supervised fMRI classification task. We observe significantly improved accuracy on held-out runs with the finetuned models, which demonstrates the ability of our pretraining tasks to facilitate transfer learning. This work contributes to the growing body of literature on transformer architectures for pretraining and transfer learning with fMRI data, and serves as a proof of concept for our pretraining tasks and multitask pretraining on fMRI data.

Keywords: Transfer learning, fMRI, self-supervised, brain decoding, transformer, multitask training.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 151

10 The Creative Unfolding of “Reduced Descriptive Structures” in Musical Cognition: Technical and Theoretical Insights Based on the OpenMusic and PWGL Long-Term Feedback

Authors: Jacopo Baboni Schilingi

Abstract:

We here describe the theoretical and philosophical understanding of a long term use and development of algorithmic computer-based tools applied to music composition. The findings of our research lead us to interrogate some specific processes and systems of communication engaged in the discovery of specific cultural artworks: artistic creation in the sono-musical domain. Our hypothesis is that the patterns of auditory learning cannot be only understood in terms of social transmission but would gain to be questioned in the way they rely on various ranges of acoustic stimuli modes of consciousness and how the different types of memories engaged in the percept-action expressive systems of our cultural communities also relies on these shadowy conscious entities we named “Reduced Descriptive Structures”.

Keywords: Algorithmic sonic computation, corrected and self-correcting learning patterns in acoustic perception, morphological derivations in sensorial patterns, social unconscious modes of communication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 609

9 Designing a Motivated Tangible Multimedia System for Preschoolers

Authors: Kien Tsong Chau, Zarina Samsudin, Wan Ahmad Jaafar Wan Yahaya

Abstract:

The paper examined the capability of a prototype of a tangible multimedia system that was augmented with tangible objects in motivating young preschoolers in learning. Preschoolers’ learning behaviour is highly captivated and motivated by external physical stimuli. Hence, conventional multimedia which solely dependent on digital visual and auditory formats for knowledge delivery could potentially place them in inappropriate state of circumstances that are frustrating, boring, or worse, impede overall learning motivations. This paper begins by discussion with the objectives of the research, followed by research questions, hypotheses, ARCS model of motivation adopted in the process of macro-design, and the research instrumentation, Persuasive Multimedia Motivational Scale was deployed for measuring the level of motivation of subjects towards the experimental tangible multimedia. At the close, a succinct description of the findings of a relevant research is provided. In the research, a total of 248 preschoolers recruited from seven Malaysian kindergartens were examined. Analyses revealed that the tangible multimedia system improved preschoolers’ learning motivation significantly more than conventional multimedia. Overall, the findings led to the conclusion that the tangible multimedia system is a motivation conducive multimedia for preschoolers.

Keywords: Tangible multimedia, preschooler, motivation, multimedia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1298

8 Improved Text-Independent Speaker Identification using Fused MFCC and IMFCC Feature Sets based on Gaussian Filter

Authors: Sandipan Chakroborty, Goutam Saha

Abstract:

A state of the art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC) modeled on the human auditory system has been used as a standard acoustic feature set for speech related applications. On a recent contribution by authors, it has been shown that the Inverted Mel- Frequency Cepstral Coefficients (IMFCC) is useful feature set for SI, which contains complementary information present in high frequency region. This paper introduces the Gaussian shaped filter (GF) while calculating MFCC and IMFCC in place of typical triangular shaped bins. The objective is to introduce a higher amount of correlation between subband outputs. The performances of both MFCC & IMFCC improve with GF over conventional triangular filter (TF) based implementation, individually as well as in combination. With GMM as speaker modeling paradigm, the performances of proposed GF based MFCC and IMFCC in individual and fused mode have been verified in two standard databases YOHO, (Microphone Speech) and POLYCOST (Telephone Speech) each of which has more than 130 speakers.

Keywords: Gaussian Filter, Triangular Filter, Subbands, Correlation, MFCC, IMFCC, GMM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2450

7 Improved Closed Set Text-Independent Speaker Identification by Combining MFCC with Evidence from Flipped Filter Banks

Authors: Sandipan Chakroborty, Anindya Roy, Goutam Saha

Abstract:

A state of the art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC) modeled on the human auditory system has been used as a standard acoustic feature set for SI applications. However, due to the structure of its filter bank, it captures vocal tract characteristics more effectively in the lower frequency regions. This paper proposes a new set of features using a complementary filter bank structure which improves distinguishability of speaker specific cues present in the higher frequency zone. Unlike high level features that are difficult to extract, the proposed feature set involves little computational burden during the extraction process. When combined with MFCC via a parallel implementation of speaker models, the proposed feature set outperforms baseline MFCC significantly. This proposition is validated by experiments conducted on two different kinds of public databases namely YOHO (microphone speech) and POLYCOST (telephone speech) with Gaussian Mixture Models (GMM) as a Classifier for various model orders.

Keywords: Complementary Information, Filter Bank, GMM, IMFCC, MFCC, Speaker Identification, Speaker Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2296

6 Retrieval Augmented Generation against the Machine: Merging Human Cyber Security Expertise with Generative AI

Authors: Brennan Lodge

Abstract:

Amidst a complex regulatory landscape, Retrieval Augmented Generation (RAG) emerges as a transformative tool for Governance Risk and Compliance (GRC) officers. This paper details the application of RAG in synthesizing Large Language Models (LLMs) with external knowledge bases, offering GRC professionals an advanced means to adapt to rapid changes in compliance requirements. While the development for standalone LLMs is exciting, such models do have their downsides. LLMs cannot easily expand or revise their memory, and they cannot straightforwardly provide insight into their predictions, and may produce “hallucinations.” Leveraging a pre-trained seq2seq transformer and a dense vector index of domain-specific data, this approach integrates real-time data retrieval into the generative process, enabling gap analysis and the dynamic generation of compliance and risk management content. We delve into the mechanics of RAG, focusing on its dual structure that pairs parametric knowledge contained within the transformer model with non-parametric data extracted from an updatable corpus. This hybrid model enhances decision-making through context-rich insights, drawing from the most current and relevant information, thereby enabling GRC officers to maintain a proactive compliance stance. Our methodology aligns with the latest advances in neural network fine-tuning, providing a granular, token-level application of retrieved information to inform and generate compliance narratives. By employing RAG, we exhibit a scalable solution that can adapt to novel regulatory challenges and cybersecurity threats, offering GRC officers a robust, predictive tool that augments their expertise. The granular application of RAG’s dual structure not only improves compliance and risk management protocols but also informs the development of compliance narratives with pinpoint accuracy. It underscores AI’s emerging role in strategic risk mitigation and proactive policy formation, positioning GRC officers to anticipate and navigate the complexities of regulatory evolution confidently.

Keywords: Retrieval Augmented Generation, Governance Risk and Compliance, Cybersecurity, AI-driven Compliance, Risk Management, Generative AI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 124

5 Sonic Localization Cues for Classrooms: A Structural Model Proposal

Authors: Abhijit Mitra, C. Ardil

Abstract:

We investigate sonic cues for binaural sound localization within classrooms and present a structural model for the same. Two of the primary cues for localization, interaural time difference (ITD) and interaural level difference (ILD) created between the two ears by sounds from a particular point in space, are used. Although these cues do not lend any information about the elevation of a sound source, the torso, head, and outer ear carry out elevation dependent spectral filtering of sounds before they reach the inner ear. This effect is commonly captured in head related transfer function (HRTF) which aids in resolving the ambiguity from the ITDs and ILDs alone and helps localize sounds in free space. The proposed structural model of HRTF produces well controlled horizontal as well as vertical effects. The implemented HRTF is a signal processing model which tries to mimic the physical effects of the sounds interacting with different parts of the body. The effectiveness of the method is tested by synthesizing spatial audio, in MATLAB, for use in listening tests with human subjects and is found to yield satisfactory results in comparison with existing models.

Keywords: Auditory localization, Binaural sound, Head related impulse response, Head related transfer function, Interaural level difference, Interaural time difference, Localization cues.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1729

4 Noninvasive Brain-Machine Interface to Control Both Mecha TE Robotic Hands Using Emotiv EEG Neuroheadset

Authors: Adrienne Kline, Jaydip Desai

Abstract:

Electroencephalogram (EEG) is a noninvasive technique that registers signals originating from the firing of neurons in the brain. The Emotiv EEG Neuroheadset is a consumer product comprised of 14 EEG channels and was used to record the reactions of the neurons within the brain to two forms of stimuli in 10 participants. These stimuli consisted of auditory and visual formats that provided directions of ‘right’ or ‘left.’ Participants were instructed to raise their right or left arm in accordance with the instruction given. A scenario in OpenViBE was generated to both stimulate the participants while recording their data. In OpenViBE, the Graz Motor BCI Stimulator algorithm was configured to govern the duration and number of visual stimuli. Utilizing EEGLAB under the cross platform MATLAB®, the electrodes most stimulated during the study were defined. Data outputs from EEGLAB were analyzed using IBM SPSS Statistics® Version 20. This aided in determining the electrodes to use in the development of a brain-machine interface (BMI) using real-time EEG signals from the Emotiv EEG Neuroheadset. Signal processing and feature extraction were accomplished via the Simulink® signal processing toolbox. An Arduino™ Duemilanove microcontroller was used to link the Emotiv EEG Neuroheadset and the right and left Mecha TE™ Hands.

Keywords: Brain-machine interface, EEGLAB, emotiv EEG neuroheadset, openViBE, simulink.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2804

3 Analysis of Brain Activities due to Differences in Running Shoe Properties

Authors: K. Okubo, Y. Kurihara, T. Kaburagi, K. Watanabe

Abstract:

Many of the ever-growing elderly population require exercise, such as running, for health management. One important element of a runner’s training is the choice of shoes for exercise; shoes are important because they provide the interface between the feet and road. When we purchase shoes, we may instinctively choose a pair after trying on many different pairs of shoes. Selecting the shoes instinctively may work, but it does not guarantee a suitable fit for running activities. Therefore, if we could select suitable shoes for each runner from the viewpoint of brain activities, it would be helpful for validating shoe selection. In this paper, we describe how brain activities show different characteristics during particular task, corresponding to different properties of shoes. Using five subjects, we performed a verification experiment, applying weight, softness, and flexibility as shoe properties. In order to affect the shoe property’s differences to the brain, subjects run for 10 min. Before and after running, subjects conducted a paced auditory serial addition task (PASAT) as the particular task; and the subjects’ brain activities during the PASAT are evaluated based on oxyhemoglobin and deoxyhemoglobin relative concentration changes, measured by near-infrared spectroscopy (NIRS). When the brain works actively, oxihemoglobin and deoxyhemoglobin concentration drastically changes; therefore, we calculate the maximum values of concentration changes. In order to normalize relative concentration changes after running, the maximum value are divided by before running maximum value as evaluation parameters. The classification of the groups of shoes is expressed on a self-organizing map (SOM). As a result, deoxyhemoglobin can make clusters for two of the three types of shoes.

Keywords: Brain activities, NIRS, PASAT, running shoes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2313

2 Study of Integrated Vehicle Image System Including LDW, FCW, and AFS

Authors: Yi-Feng Su, Chia-Tseng Chen, Hsueh-Lung Liao

Abstract:

The objective of this research is to develop an advanced driver assistance system characterized with the functions of lane departure warning (LDW), forward collision warning (FCW) and adaptive front-lighting system (AFS). The system is mainly configured a CCD/CMOS camera to acquire the images of roadway ahead in association with the analysis made by an image-processing unit concerning the lane ahead and the preceding vehicles. The input image captured by a camera is used to recognize the lane and the preceding vehicle positions by image detection and DROI (Dynamic Range of Interesting) algorithms. Therefore, the system is able to issue real-time auditory and visual outputs of warning when a driver is departing the lane or driving too close to approach the preceding vehicle unwittingly so that the danger could be prevented from occurring. During the nighttime, in addition to the foregoing warning functions, the system is able to control the bending light of headlamp to provide an immediate light illumination when making a turn at a curved lane and adjust the level automatically to reduce the lighting interference against the oncoming vehicles driving in the opposite direction by the curvature of lane and the vanishing point estimations. The experimental results show that the integrated vehicle image system is robust to most environments such as the lane detection and preceding vehicle detection average accuracy performances are both above 90 %.

Keywords: Lane mark detection, lane departure warning (LDW), dynamic range of interesting (DROI), forward collision warning (FCW), adaptive front-lighting system (AFS).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2157

1 Design and Modeling of Human Middle Ear for Harmonic Response Analysis

Authors: Shende Suraj Balu, A. B. Deoghare, K. M. Pandey

Abstract:

The human middle ear (ME) is a delicate and vital organ. It has a complex structure that performs various functions such as receiving sound pressure and producing vibrations of eardrum and propagating it to inner ear. It consists of Tympanic Membrane (TM), three auditory ossicles, various ligament structures and muscles. Incidents such as traumata, infections, ossification of ossicular structures and other pathologies may damage the ME organs. The conditions can be surgically treated by employing prosthesis. However, the suitability of the prosthesis needs to be examined in advance prior to the surgery. Few decades ago, this issue was addressed and analyzed by developing an equivalent representation either in the form of spring mass system, electrical system using R-L-C circuit or developing an approximated CAD model. But, nowadays a three-dimensional ME model can be constructed using micro X-Ray Computed Tomography (μCT) scan data. Moreover, the concern about patient specific integrity pertaining to the disease can be examined well in advance. The current research work emphasizes to develop the ME model from the stacks of μCT images which are used as input file to MIMICS Research 19.0 (Materialise Interactive Medical Image Control System) software. A stack of CT images is converted into geometrical surface model to build accurate morphology of ME. The work is further extended to understand the dynamic behaviour of Harmonic response of the stapes footplate and umbo for different sound pressure levels applied at lateral side of eardrum using finite element approach. The pathological condition Cholesteatoma of ME is investigated to obtain peak to peak displacement of stapes footplate and umbo. Apart from this condition, other pathologies, mainly, changes in the stiffness of stapedial ligament, TM thickness and ossicular chain separation and fixation are also explored. The developed model of ME for pathologies is validated by comparing the results available in the literatures and also with the results of a normal ME to calculate the percentage loss in hearing capability.

Keywords: Computed tomography, human middle ear, harmonic response, pathologies, tympanic membrane.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1013