Search results for: speech signal processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5626

Search results for: speech signal processing

5386 Performance Analysis of VoIP Coders for Different Modulations Under Pervasive Environment

Authors: Jasbinder Singh, Harjit Pal Singh, S. A. Khan

Abstract:

The work, in this paper, presents the comparison of encoded speech signals by different VoIP narrow-band and wide-band codecs for different modulation schemes. The simulation results indicate that codec has an impact on the speech quality and also effected by modulation schemes.

Keywords: VoIP, coders, modulations, BER, MOS

Procedia PDF Downloads 516
5385 Heart Murmurs and Heart Sounds Extraction Using an Algorithm Process Separation

Authors: Fatima Mokeddem

Abstract:

The phonocardiogram signal (PCG) is a physiological signal that reflects heart mechanical activity, is a promising tool for curious researchers in this field because it is full of indications and useful information for medical diagnosis. PCG segmentation is a basic step to benefit from this signal. Therefore, this paper presents an algorithm that serves the separation of heart sounds and heart murmurs in case they exist in order to use them in several applications and heart sounds analysis. The separation process presents here is founded on three essential steps filtering, envelope detection, and heart sounds segmentation. The algorithm separates the PCG signal into S1 and S2 and extract cardiac murmurs.

Keywords: phonocardiogram signal, filtering, Envelope, Detection, murmurs, heart sounds

Procedia PDF Downloads 140
5384 Detection Characteristics of the Random and Deterministic Signals in Antenna Arrays

Authors: Olesya Bolkhovskaya, Alexey Davydov, Alexander Maltsev

Abstract:

In this paper approach to incoherent signal detection in multi-element antenna array are researched and modeled. Two types of useful signals with unknown wavefront were considered. First one is deterministic (Barker code), the second one is random (Gaussian distribution). The derivation of the sufficient statistics took into account the linearity of the antenna array. The performance characteristics and detecting curves are modeled and compared for different useful signals parameters and for different number of elements of the antenna array. Results of researches in case of some additional conditions can be applied to a digital communications systems.

Keywords: antenna array, detection curves, performance characteristics, quadrature processing, signal detection

Procedia PDF Downloads 405
5383 [Keynote Speech]: Bridge Damage Detection Using Frequency Response Function

Authors: Ahmed Noor Al-Qayyim

Abstract:

During the past decades, the bridge structures are considered very important portions of transportation networks, due to the fast urban sprawling. With the failure of bridges that under operating conditions lead to focus on updating the default bridge inspection methodology. The structures health monitoring (SHM) using the vibration response appeared as a promising method to evaluate the condition of structures. The rapid development in the sensors technology and the condition assessment techniques based on the vibration-based damage detection made the SHM an efficient and economical ways to assess the bridges. SHM is set to assess state and expects probable failures of designated bridges. In this paper, a presentation for Frequency Response function method that uses the captured vibration test information of structures to evaluate the structure condition. Furthermore, the main steps of the assessment of bridge using the vibration information are presented. The Frequency Response function method is applied to the experimental data of a full-scale bridge.

Keywords: bridge assessment, health monitoring, damage detection, frequency response function (FRF), signal processing, structure identification

Procedia PDF Downloads 347
5382 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition

Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie

Abstract:

In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. It provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is expressed as a case study of multimodal data fusion approaches, and the open issues through the limitations of the current studies are presented. This paper can be considered a powerful guide for interested researchers in the field of multimodal data fusion and audiovisual speech recognition particularly.

Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks

Procedia PDF Downloads 111
5381 Smooth Second Order Nonsingular Terminal Sliding Mode Control for a 6 DOF Quadrotor UAV

Authors: V. Tabrizi, A. Vali, R. GHasemi, V. Behnamgol

Abstract:

In this article, a nonlinear model of an under actuated six degrees of freedom (6 DOF) quadrotor UAV is derived on the basis of the Newton-Euler formula. The derivation comprises determining equations of the motion of the quadrotor in three dimensions and approximating the actuation forces through the modeling of aerodynamic coefficients and electric motor dynamics. The robust nonlinear control strategy includes a smooth second order non-singular terminal sliding mode control which is applied to stabilizing this model. The control method is on the basis of super twisting algorithm for removing the chattering and producing smooth control signal. Also, nonsingular terminal sliding mode idea is used for introducing a nonlinear sliding variable that guarantees the finite time convergence in sliding phase. Simulation results show that the proposed algorithm is robust against uncertainty or disturbance and guarantees a fast and precise control signal.

Keywords: quadrotor UAV, nonsingular terminal sliding mode, second order sliding mode t, electronics, control, signal processing

Procedia PDF Downloads 440
5380 Optimized Processing of Neural Sensory Information with Unwanted Artifacts

Authors: John Lachapelle

Abstract:

Introduction: Neural stimulation is increasingly targeted toward treatment of back pain, PTSD, Parkinson’s disease, and for sensory perception. Sensory recording during stimulation is important in order to examine neural response to stimulation. Most neural amplifiers (headstages) focus on noise efficiency factor (NEF). Conversely, neural headstages need to handle artifacts from several sources including power lines, movement (EMG), and neural stimulation itself. In this work a layered approach to artifact rejection is used to reduce corruption of the neural ENG signal by 60dBv, resulting in recovery of sensory signals in rats and primates that would previously not be possible. Methods: The approach combines analog techniques to reduce and handle unwanted signal amplitudes. The methods include optimized (1) sensory electrode placement, (2) amplifier configuration, and (3) artifact blanking when necessary. The techniques together are like concentric moats protecting a castle; only the wanted neural signal can penetrate. There are two conditions in which the headstage operates: unwanted artifact < 50mV, linear operation, and artifact > 50mV, fast-settle gain reduction signal limiting (covered in more detail in a separate paper). Unwanted Signals at the headstage input: Consider: (a) EMG signals are by nature < 10mV. (b) 60 Hz power line signals may be > 50mV with poor electrode cable conditions; with careful routing much of the signal is common to both reference and active electrode and rejected in the differential amplifier with <50mV remaining. (c) An unwanted (to the neural recorder) stimulation signal is attenuated from stimulation to sensory electrode. The voltage seen at the sensory electrode can be modeled Φ_m=I_o/4πσr. For a 1 mA stimulation signal, with 1 cm spacing between electrodes, the signal is <20mV at the headstage. Headstage ASIC design: The front end ASIC design is designed to produce < 1% THD at 50mV input; 50 times higher than typical headstage ASICs, with no increase in noise floor. This requires careful balance of amplifier stages in the headstage ASIC, as well as consideration of the electrodes effect on noise. The ASIC is designed to allow extremely small signal extraction on low impedance (< 10kohm) electrodes with configuration of the headstage ASIC noise floor to < 700nV/rt-Hz. Smaller high impedance electrodes (> 100kohm) are typically located closer to neural sources and transduce higher amplitude signals (> 10uV); the ASIC low-power mode conserves power with 2uV/rt-Hz noise. Findings: The enhanced neural processing ASIC has been compared with a commercial neural recording amplifier IC. Chronically implanted primates at MGH demonstrated the presence of commercial neural amplifier saturation as a result of large environmental artifacts. The enhanced artifact suppression headstage ASIC, in the same setup, was able to recover and process the wanted neural signal separately from the suppressed unwanted artifacts. Separately, the enhanced artifact suppression headstage ASIC was able to separate sensory neural signals from unwanted artifacts in mouse-implanted peripheral intrafascicular electrodes. Conclusion: Optimizing headstage ASICs allow observation of neural signals in the presence of large artifacts that will be present in real-life implanted applications, and are targeted toward human implantation in the DARPA HAPTIX program.

Keywords: ASIC, biosensors, biomedical signal processing, biomedical sensors

Procedia PDF Downloads 330
5379 Denoising of Magnetotelluric Signals by Filtering

Authors: Rodrigo Montufar-Chaveznava, Fernando Brambila-Paz, Ivette Caldelas

Abstract:

In this paper, we present the advances corresponding to the denoising processing of magnetotelluric signals using several filters. In particular, we use the most common spatial domain filters such as median and mean, but we are also using the Fourier and wavelet transform for frequency domain filtering. We employ three datasets obtained at the different sampling rate (128, 4096 and 8192 bps) and evaluate the mean square error, signal-to-noise relation, and peak signal-to-noise relation to compare the kernels and determine the most suitable for each case. The magnetotelluric signals correspond to earth exploration when water is searched. The object is to find a denoising strategy different to the one included in the commercial equipment that is employed in this task.

Keywords: denoising, filtering, magnetotelluric signals, wavelet transform

Procedia PDF Downloads 370
5378 Analysis of Linguistic Disfluencies in Bilingual Children’s Discourse

Authors: Sheena Christabel Pravin, M. Palanivelan

Abstract:

Speech disfluencies are common in spontaneous speech. The primary purpose of this study was to distinguish linguistic disfluencies from stuttering disfluencies in bilingual Tamil–English (TE) speaking children. The secondary purpose was to determine whether their disfluencies are mediated by native language dominance and/or on an early onset of developmental stuttering at childhood. A detailed study was carried out to identify the prosodic and acoustic features that uniquely represent the disfluent regions of speech. This paper focuses on statistical modeling of repetitions, prolongations, pauses and interjections in the speech corpus encompassing bilingual spontaneous utterances from school going children – English and Tamil. Two classifiers including Hidden Markov Models (HMM) and the Multilayer Perceptron (MLP), which is a class of feed-forward artificial neural network, were compared in the classification of disfluencies. The results of the classifiers document the patterns of disfluency in spontaneous speech samples of school-aged children to distinguish between Children Who Stutter (CWS) and Children with Language Impairment CLI). The ability of the models in classifying the disfluencies was measured in terms of F-measure, Recall, and Precision.

Keywords: bi-lingual, children who stutter, children with language impairment, hidden markov models, multi-layer perceptron, linguistic disfluencies, stuttering disfluencies

Procedia PDF Downloads 217
5377 An Agent-Based Modelling Simulation Approach to Calculate Processing Delay of GEO Satellite Payload

Authors: V. Vicente E. Mujica, Gustavo Gonzalez

Abstract:

The global coverage of broadband multimedia and internet-based services in terrestrial-satellite networks demand particular interests for satellite providers in order to enhance services with low latencies and high signal quality to diverse users. In particular, the delay of on-board processing is an inherent source of latency in a satellite communication that sometimes is discarded for the end-to-end delay of the satellite link. The frame work for this paper includes modelling of an on-orbit satellite payload using an agent model that can reproduce the properties of processing delays. In essence, a comparison of different spatial interpolation methods is carried out to evaluate physical data obtained by an GEO satellite in order to define a discretization function for determining that delay. Furthermore, the performance of the proposed agent and the development of a delay discretization function are together validated by simulating an hybrid satellite and terrestrial network. Simulation results show high accuracy according to the characteristics of initial data points of processing delay for Ku bands.

Keywords: terrestrial-satellite networks, latency, on-orbit satellite payload, simulation

Procedia PDF Downloads 271
5376 Theoretical BER Analyzing of MPSK Signals Based on the Signal Space

Authors: Jing Qing-feng, Liu Danmei

Abstract:

Based on the optimum detection, signal projection and Maximum A Posteriori (MAP) rule, Proakis has deduced the theoretical BER equation of Gray coded MPSK signals. Proakis analyzed the BER theoretical equations mainly based on the projection of signals, which is difficult to be understood. This article solve the same problem based on the signal space, which explains the vectors relations among the sending signals, received signals and noises. The more explicit and easy-deduced process is illustrated in this article based on the signal space, which can illustrated the relations among the signals and noises clearly. This kind of deduction has a univocal geometry meaning. It can explain the correlation between the production and calculation of BER in vector level.

Keywords: MPSK, MAP, signal space, BER

Procedia PDF Downloads 346
5375 Cooperative Jamming for Implantable Medical Device Security

Authors: Kim Lytle, Tim Talty, Alan Michaels, Jeff Reed

Abstract:

Implantable medical devices (IMDs) are medically necessary devices embedded in the human body that monitor chronic disorders or automatically deliver therapies. Most IMDs have wireless capabilities that allow them to share data with an offboard programming device to help medical providers monitor the patient’s health while giving the patient more insight into their condition. However, serious security concerns have arisen as researchers demonstrated these devices could be hacked to obtain sensitive information or harm the patient. Cooperative jamming can be used to prevent privileged information leaks by maintaining an adequate signal-to-noise ratio at the intended receiver while minimizing signal power elsewhere. This paper uses ray tracing to demonstrate how a low number of friendly nodes abiding by Bluetooth Low Energy (BLE) transmission regulations can enhance IMD communication security in an office environment, which in turn may inform how companies and individuals can protect their proprietary and personal information.

Keywords: implantable biomedical devices, communication system security, array signal processing, ray tracing

Procedia PDF Downloads 114
5374 Emotional and Physiological Reaction While Listening the Speech of Adults Who Stutter

Authors: Xharavina V., Gallopeni F., Ahmeti K.

Abstract:

Stuttered speech is filled with intermittent sound prolongations and/or rapid part word repetitions. Oftentimes, these aberrant acoustic behaviors are associated with intermittent physical tension and struggle behaviors such as head jerks, arm jerks, finger tapping, excessive eye-blinks, etc. Additionally, the jarring nature of acoustic and physical manifestations that often accompanies moderate-severe stuttering may induce negative emotional responses in listeners, which alters communication between the person who stutters and their listeners. However, researches for the influence of negative emotions in the communication and for physical reaction are limited. Therefore, to compare psycho-physiological responses of fluent adults, while listening the speech of adults who speak fluency and adults who stutter, are necessary. This study comprises the experimental method, with total of 104 participants (average age-20 years old, SD=2.1), divided into 3 groups. All participants self-reported no impairments in speech, language, or hearing. Exploring the responses of the participants, there were used two records speeches; a voice who speaks fluently and the voice who stutters. Heartbeats and the pulse were measured by the digital blood pressure monitor called 'Tensoval', as a physiological response to the fluent and stuttering sample. Meanwhile, the emotional responses of participants were measured by the self-reporting questionnaire (Steenbarger, 2001). Results showed an increase in heartbeats during the stuttering speech compared with the fluent sample (p < 0.5). The listeners also self-reported themselves as more alive, unhappy, nervous, repulsive, sad, tense, distracted and upset when listening the stuttering words versus the words of the fluent adult (where it was reported to experience positive emotions). These data support the notions that speech with stuttering can bring a psycho-physical reaction to the listeners. Speech pathologists should be aware that listeners show intolerable physiological reactions to stuttering that remain visible over time.

Keywords: emotional, physiological, stuttering, fluent speech

Procedia PDF Downloads 142
5373 Speech Acts of Selected Classroom Encounters: Analyzing the Speech Acts of a Career Technology Lesson

Authors: Michael Amankwaa Adu

Abstract:

Effective communication in the classroom plays a vital role in ensuring successful teaching and learning. In particular, the types of language and speech acts teachers use shape classroom interactions and influence student engagement. This study aims to analyze the speech acts employed by a Career Technology teacher in a junior high school. While much research has focused on speech acts in language classrooms, less attention has been given to how these acts operate in non-language subject areas like technical education. The study explores how different types of speech acts—directives, assertives, expressives, and commissives—are used during three classroom encounters: lesson introduction, content delivery, and classroom management. This research seeks to fill the gap in understanding how teachers of non-language subjects use speech acts to manage classroom dynamics and facilitate learning. The study employs a mixed-methods design, combining qualitative and quantitative approaches. Data was collected through direct classroom observation and audio recordings of a one-hour Career Technology lesson. The transcriptions of the lesson were analyzed using John Searle’s taxonomy of speech acts, classifying the teacher’s utterances into directives, assertives, expressives, and commissives. Results show that directives were the most frequently used speech act, accounting for 59.3% of the teacher's utterances. These speech acts were essential in guiding student behavior, giving instructions, and maintaining classroom control. Assertives made up 20.4% of the speech acts, primarily used for stating facts and reinforcing content. Expressives, at 14.2%, expressed emotions such as approval or frustration, helping to manage the emotional atmosphere of the classroom. Commissives were the least used, representing 6.2% of the speech acts, often used to set expectations or outline future actions. No declarations were observed during the lesson. The findings of this study reveal the critical role that speech acts play in managing classroom behavior and delivering content in technical subjects. Directives were crucial for ensuring students followed instructions and completed tasks, while assertives helped in reinforcing lesson objectives. Expressives contributed to motivating or disciplining students, and commissives, though less frequent, helped set clear expectations for students’ future actions. The absence of declarations suggests that the teacher prioritized guiding students over making formal pronouncements. These insights can inform teaching strategies across various subject areas, demonstrating that a diverse use of speech acts can create a balanced and interactive learning environment. This study contributes to the growing field of pragmatics in education and offers practical recommendations for educators, particularly in non-language classrooms, on how to utilize speech acts to enhance both classroom management and student engagement.

Keywords: classroom interaction, pragmatics, speech acts, teacher communication, career technology

Procedia PDF Downloads 20
5372 Analysis of Matching Pursuit Features of EEG Signal for Mental Tasks Classification

Authors: Zin Mar Lwin

Abstract:

Brain Computer Interface (BCI) Systems have developed for people who suffer from severe motor disabilities and challenging to communicate with their environment. BCI allows them for communication by a non-muscular way. For communication between human and computer, BCI uses a type of signal called Electroencephalogram (EEG) signal which is recorded from the human„s brain by means of an electrode. The electroencephalogram (EEG) signal is an important information source for knowing brain processes for the non-invasive BCI. Translating human‟s thought, it needs to classify acquired EEG signal accurately. This paper proposed a typical EEG signal classification system which experiments the Dataset from “Purdue University.” Independent Component Analysis (ICA) method via EEGLab Tools for removing artifacts which are caused by eye blinks. For features extraction, the Time and Frequency features of non-stationary EEG signals are extracted by Matching Pursuit (MP) algorithm. The classification of one of five mental tasks is performed by Multi_Class Support Vector Machine (SVM). For SVMs, the comparisons have been carried out for both 1-against-1 and 1-against-all methods.

Keywords: BCI, EEG, ICA, SVM

Procedia PDF Downloads 277
5371 The Importance of the Historical Approach in the Linguistic Research

Authors: Zoran Spasovski

Abstract:

The paper shortly discusses the significance and the benefits of the historical approach in the research of languages by presenting examples of it in the fields of phonetics and phonology, lexicology, morphology, syntax, and even in the onomastics (toponomy and anthroponomy). The examples from the field of phonetics/phonology include insights into animal speech and its evolution into human speech, the evolution of the sounds of human speech from vocals to glides and consonants and from velar consonants to palatal, etc., on well-known examples of former researchers. Those from the field of lexicology show shortly the formation of the lexemes and their evolution; the morphology and syntax are explained by examples of the development of grammar and syntax forms, and the importance of the historical approach in the research of place-names and personal names is briefly outlined through examples of place-names and personal names and surnames, and the conclusions that come from it, in different languages.

Keywords: animal speech, glotogenesis, grammar forms, lexicology, place-names, personal names, surnames, syntax categories

Procedia PDF Downloads 83
5370 The Importance of Visual Communication in Artificial Intelligence

Authors: Manjitsingh Rajput

Abstract:

Visual communication plays an important role in artificial intelligence (AI) because it enables machines to understand and interpret visual information, similar to how humans do. This abstract explores the importance of visual communication in AI and emphasizes the importance of various applications such as computer vision, object emphasis recognition, image classification and autonomous systems. In going deeper, with deep learning techniques and neural networks that modify visual understanding, In addition to AI programming, the abstract discusses challenges facing visual interfaces for AI, such as data scarcity, domain optimization, and interpretability. Visual communication and other approaches, such as natural language processing and speech recognition, have also been explored. Overall, this abstract highlights the critical role that visual communication plays in advancing AI capabilities and enabling machines to perceive and understand the world around them. The abstract also explores the integration of visual communication with other modalities like natural language processing and speech recognition, emphasizing the critical role of visual communication in AI capabilities. This methodology explores the importance of visual communication in AI development and implementation, highlighting its potential to enhance the effectiveness and accessibility of AI systems. It provides a comprehensive approach to integrating visual elements into AI systems, making them more user-friendly and efficient. In conclusion, Visual communication is crucial in AI systems for object recognition, facial analysis, and augmented reality, but challenges like data quality, interpretability, and ethics must be addressed. Visual communication enhances user experience, decision-making, accessibility, and collaboration. Developers can integrate visual elements for efficient and accessible AI systems.

Keywords: visual communication AI, computer vision, visual aid in communication, essence of visual communication.

Procedia PDF Downloads 95
5369 Exact Formulas of the End-To-End Green’s Functions in Non-hermitian Systems

Authors: Haoshu Li, Shaolong Wan

Abstract:

The recent focus has been on directional signal amplification of a signal input at one end of a one-dimensional chain and measured at the other end. The amplification rate is given by the end-to-end Green’s functions of the system. In this work, we derive the exact formulas for the end-to-end Green's functions of non-Hermitian single-band systems. While in the bulk region, it is found that the Green's functions are displaced from the prior established integral formula by O(e⁻ᵇᴸ). The results confirm the correspondence between the signal amplification and the non-Hermitian skin effect.

Keywords: non-Hermitian, Green's function, non-Hermitian skin effect, signal amplification

Procedia PDF Downloads 141
5368 A Deep Learning Based Approach for Dynamically Selecting Pre-processing Technique for Images

Authors: Revoti Prasad Bora, Nikita Katyal, Saurabh Yadav

Abstract:

Pre-processing plays an important role in various image processing applications. Most of the time due to the similar nature of images, a particular pre-processing or a set of pre-processing steps are sufficient to produce the desired results. However, in the education domain, there is a wide variety of images in various aspects like images with line-based diagrams, chemical formulas, mathematical equations, etc. Hence a single pre-processing or a set of pre-processing steps may not yield good results. Therefore, a Deep Learning based approach for dynamically selecting a relevant pre-processing technique for each image is proposed. The proposed method works as a classifier to detect hidden patterns in the images and predicts the relevant pre-processing technique needed for the image. This approach experimented for an image similarity matching problem but it can be adapted to other use cases too. Experimental results showed significant improvement in average similarity ranking with the proposed method as opposed to static pre-processing techniques.

Keywords: deep-learning, classification, pre-processing, computer vision, image processing, educational data mining

Procedia PDF Downloads 163
5367 EEG Diagnosis Based on Phase Space with Wavelet Transforms for Epilepsy Detection

Authors: Mohmmad A. Obeidat, Amjed Al Fahoum, Ayman M. Mansour

Abstract:

The recognition of an abnormal activity of the brain functionality is a vital issue. To determine the type of the abnormal activity either a brain image or brain signal are usually considered. Imaging localizes the defect within the brain area and relates this area with somebody functionalities. However, some functions may be disturbed without affecting the brain as in epilepsy. In this case, imaging may not provide the symptoms of the problem. A cheaper yet efficient approach that can be utilized to detect abnormal activity is the measurement and analysis of the electroencephalogram (EEG) signals. The main goal of this work is to come up with a new method to facilitate the classification of the abnormal and disorder activities within the brain directly using EEG signal processing, which makes it possible to be applied in an on-line monitoring system.

Keywords: EEG, wavelet, epilepsy, detection

Procedia PDF Downloads 538
5366 Anonymous Editing Prevention Technique Using Gradient Method for High-Quality Video

Authors: Jiwon Lee, Chanho Jung, Si-Hwan Jang, Kyung-Ill Kim, Sanghyun Joo, Wook-Ho Son

Abstract:

Since the advances in digital imaging technologies have led to development of high quality digital devices, there are a lot of illegal copies of copyrighted video content on the internet. Thus, we propose a high-quality (HQ) video watermarking scheme that can prevent these illegal copies from spreading out. The proposed scheme is applied spatial and temporal gradient methods to improve the fidelity and detection performance. Also, the scheme duplicates the watermark signal temporally to alleviate the signal reduction caused by geometric and signal-processing distortions. Experimental results show that the proposed scheme achieves better performance than previously proposed schemes and it has high fidelity. The proposed scheme can be used in broadcast monitoring or traitor tracking applications which need fast detection process to prevent illegally recorded video content from spreading out.

Keywords: editing prevention technique, gradient method, luminance change, video watermarking

Procedia PDF Downloads 456
5365 Semiautomatic Calculation of Ejection Fraction Using Echocardiographic Image Processing

Authors: Diana Pombo, Maria Loaiza, Mauricio Quijano, Alberto Cadena, Juan Pablo Tello

Abstract:

In this paper, we present a semi-automatic tool for calculating ejection fraction from an echocardiographic video signal which is derived from a database in DICOM format, of Clinica de la Costa - Barranquilla. Described in this paper are each of the steps and methods used to find the respective calculation that includes acquisition and formation of the test samples, processing and finally the calculation of the parameters to obtain the ejection fraction. Two imaging segmentation methods were compared following a methodological framework that is similar only in the initial stages of processing (process of filtering and image enhancement) and differ in the end when algorithms are implemented (Active Contour and Region Growing Algorithms). The results were compared with the measurements obtained by two different medical specialists in cardiology who calculated the ejection fraction of the study samples using the traditional method, which consists of drawing the region of interest directly from the computer using echocardiography equipment and a simple equation to calculate the desired value. The results showed that if the quality of video samples are good (i.e., after the pre-processing there is evidence of an improvement in the contrast), the values provided by the tool are substantially close to those reported by physicians; also the correlation between physicians does not vary significantly.

Keywords: echocardiography, DICOM, processing, segmentation, EDV, ESV, ejection fraction

Procedia PDF Downloads 426
5364 Hand Gesture Detection via EmguCV Canny Pruning

Authors: N. N. Mosola, S. J. Molete, L. S. Masoebe, M. Letsae

Abstract:

Hand gesture recognition is a technique used to locate, detect, and recognize a hand gesture. Detection and recognition are concepts of Artificial Intelligence (AI). AI concepts are applicable in Human Computer Interaction (HCI), Expert systems (ES), etc. Hand gesture recognition can be used in sign language interpretation. Sign language is a visual communication tool. This tool is used mostly by deaf societies and those with speech disorder. Communication barriers exist when societies with speech disorder interact with others. This research aims to build a hand recognition system for Lesotho’s Sesotho and English language interpretation. The system will help to bridge the communication problems encountered by the mentioned societies. The system has various processing modules. The modules consist of a hand detection engine, image processing engine, feature extraction, and sign recognition. Detection is a process of identifying an object. The proposed system uses Canny pruning Haar and Haarcascade detection algorithms. Canny pruning implements the Canny edge detection. This is an optimal image processing algorithm. It is used to detect edges of an object. The system employs a skin detection algorithm. The skin detection performs background subtraction, computes the convex hull, and the centroid to assist in the detection process. Recognition is a process of gesture classification. Template matching classifies each hand gesture in real-time. The system was tested using various experiments. The results obtained show that time, distance, and light are factors that affect the rate of detection and ultimately recognition. Detection rate is directly proportional to the distance of the hand from the camera. Different lighting conditions were considered. The more the light intensity, the faster the detection rate. Based on the results obtained from this research, the applied methodologies are efficient and provide a plausible solution towards a light-weight, inexpensive system which can be used for sign language interpretation.

Keywords: canny pruning, hand recognition, machine learning, skin tracking

Procedia PDF Downloads 185
5363 Frequency of Consonant Production Errors in Children with Speech Sound Disorder: A Retrospective-Descriptive Study

Authors: Amulya P. Rao, Prathima S., Sreedevi N.

Abstract:

Speech sound disorders (SSD) encompass the major concern in younger population of India with highest prevalence rate among the speech disorders. Children with SSD if not identified and rehabilitated at the earliest, are at risk for academic difficulties. This necessitates early identification using screening tools assessing the frequently misarticulated speech sounds. The literature on frequently misarticulated speech sounds is ample in English and other western languages targeting individuals with various communication disorders. Articulation is language specific, and there are limited studies reporting the same in Kannada, a Dravidian Language. Hence, the present study aimed to identify the frequently misarticulated consonants in Kannada and also to examine the error type. A retrospective, descriptive study was carried out using secondary data analysis of 41 participants (34-phonetic type and 7-phonemic type) with SSD in the age range 3-to 12-years. All the consonants of Kannada were analyzed by considering three words for each speech sound from the Kannada Diagnostic Photo Articulation test (KDPAT). Picture naming task was carried out, and responses were audio recorded. The recorded data were transcribed using IPA 2018 broad transcription. A criterion of 2/3 or 3/3 error productions was set to consider the speech sound to be an error. Number of error productions was calculated for each consonant in each participant. Then, the percentage of participants meeting the criteria were documented for each consonant to identify the frequently misarticulated speech sound. Overall results indicated that velar /k/ (48.78%) and /g/ (43.90%) were frequently misarticulated followed by voiced retroflex /ɖ/ (36.58%) and trill /r/ (36.58%). The lateral retroflex /ɭ/ was misarticulated by 31.70% of the children with SSD. Dentals (/t/, /n/), bilabials (/p/, /b/, /m/) and labiodental /v/ were produced correctly by all the participants. The highly misarticulated velars /k/ and /g/ were frequently substituted by dentals /t/ and /d/ respectively or omitted. Participants with SSD-phonemic type had multiple substitutions for one speech sound whereas, SSD-phonetic type had consistent single sound substitutions. Intra- and inter-judge reliability for 10% of the data using Cronbach’s Alpha revealed good reliability (0.8 ≤ α < 0.9). Analyzing a larger sample by replicating such studies will validate the present study results.

Keywords: consonant, frequently misarticulated, Kannada, SSD

Procedia PDF Downloads 134
5362 Machine Learning Approach for Yield Prediction in Semiconductor Production

Authors: Heramb Somthankar, Anujoy Chakraborty

Abstract:

This paper presents a classification study on yield prediction in semiconductor production using machine learning approaches. A complicated semiconductor production process is generally monitored continuously by signals acquired from sensors and measurement sites. A monitoring system contains a variety of signals, all of which contain useful information, irrelevant information, and noise. In the case of each signal being considered a feature, "Feature Selection" is used to find the most relevant signals. The open-source UCI SECOM Dataset provides 1567 such samples, out of which 104 fail in quality assurance. Feature extraction and selection are performed on the dataset, and useful signals were considered for further study. Afterward, common machine learning algorithms were employed to predict whether the signal yields pass or fail. The most relevant algorithm is selected for prediction based on the accuracy and loss of the ML model.

Keywords: deep learning, feature extraction, feature selection, machine learning classification algorithms, semiconductor production monitoring, signal processing, time-series analysis

Procedia PDF Downloads 109
5361 Noninvasive Brain-Machine Interface to Control Both Mecha TE Robotic Hands Using Emotiv EEG Neuroheadset

Authors: Adrienne Kline, Jaydip Desai

Abstract:

Electroencephalogram (EEG) is a noninvasive technique that registers signals originating from the firing of neurons in the brain. The Emotiv EEG Neuroheadset is a consumer product comprised of 14 EEG channels and was used to record the reactions of the neurons within the brain to two forms of stimuli in 10 participants. These stimuli consisted of auditory and visual formats that provided directions of ‘right’ or ‘left.’ Participants were instructed to raise their right or left arm in accordance with the instruction given. A scenario in OpenViBE was generated to both stimulate the participants while recording their data. In OpenViBE, the Graz Motor BCI Stimulator algorithm was configured to govern the duration and number of visual stimuli. Utilizing EEGLAB under the cross platform MATLAB®, the electrodes most stimulated during the study were defined. Data outputs from EEGLAB were analyzed using IBM SPSS Statistics® Version 20. This aided in determining the electrodes to use in the development of a brain-machine interface (BMI) using real-time EEG signals from the Emotiv EEG Neuroheadset. Signal processing and feature extraction were accomplished via the Simulink® signal processing toolbox. An Arduino™ Duemilanove microcontroller was used to link the Emotiv EEG Neuroheadset and the right and left Mecha TE™ Hands.

Keywords: brain-machine interface, EEGLAB, emotiv EEG neuroheadset, OpenViBE, simulink

Procedia PDF Downloads 502
5360 Design and Development of Ssvep-Based Brain-Computer Interface for Limb Disabled Patients

Authors: Zerihun Ketema Tadesse, Dabbu Suman Reddy

Abstract:

Brain-Computer Interfaces (BCIs) give the possibility for disabled people to communicate and control devices. This work aims at developing steady-state visual evoked potential (SSVEP)-based BCI for patients with limb disabilities. In hospitals, devices like nurse emergency call devices, lights, and TV sets are what patients use most frequently, but these devices are operated manually or using the remote control. Thus, disabled patients are not able to operate these devices by themselves. Hence, SSVEP-based BCI system that can allow disabled patients to control nurse calling device and other devices is proposed in this work. Portable LED visual stimulator that flickers at specific frequencies of 7Hz, 8Hz, 9Hz and 10Hz were developed as part of this project. Disabled patients can stare at specific flickering LED of visual stimulator and Emotiv EPOC used to acquire EEG signal in a non-invasive way. The acquired EEG signal can be processed to generate various control signals depending upon the amplitude and duration of signal components. MATLAB software is used for signal processing and analysis and also for command generation. Arduino is used as a hardware interface device to receive and transmit command signals to the experimental setup. Therefore, this study is focused on the design and development of Steady-state visually evoked potential (SSVEP)-based BCI for limb disabled patients, which helps them to operate and control devices in the hospital room/wards.

Keywords: SSVEP-BCI, Limb Disabled Patients, LED Visual Stimulator, EEG signal, control devices, hospital room/wards

Procedia PDF Downloads 221
5359 Analysis of Injection-Lock in Oscillators versus Phase Variation of Injected Signal

Authors: M. Yousefi, N. Nasirzadeh

Abstract:

In this paper, behavior of an oscillator under injection of another signal has been investigated. Also, variation of output signal amplitude versus injected signal phase variation, the effect of varying the amplitude of injected signal and quality factor of the oscillator has been investigated. The results show that the locking time depends on phase and the best locking time happens at 180-degrees phase. Also, the effect of injected lock has been discussed. Simulations show that the locking time decreases with signal injection to bulk. Locking time has been investigated versus various phase differences. The effect of phase and amplitude changes on locking time of a typical LC oscillator in 180 nm technology has been investigated.

Keywords: analysis, oscillator, injection-lock oscillator, phase modulation

Procedia PDF Downloads 348
5358 Atomic Decomposition Audio Data Compression and Denoising Using Sparse Dictionary Feature Learning

Authors: T. Bryan , V. Kepuska, I. Kostnaic

Abstract:

A method of data compression and denoising is introduced that is based on atomic decomposition of audio data using “basis vectors” that are learned from the audio data itself. The basis vectors are shown to have higher data compression and better signal-to-noise enhancement than the Gabor and gammatone “seed atoms” that were used to generate them. The basis vectors are the input weights of a Sparse AutoEncoder (SAE) that is trained using “envelope samples” of windowed segments of the audio data. The envelope samples are extracted from the audio data by performing atomic decomposition with Gabor or gammatone seed atoms. This process identifies segments of audio data that are locally coherent with the seed atoms. Envelope samples are extracted by identifying locally coherent audio data segments with Gabor or gammatone seed atoms, found by matching pursuit. The envelope samples are formed by taking the kronecker products of the atomic envelopes with the locally coherent data segments. Oracle signal-to-noise ratio (SNR) verses data compression curves are generated for the seed atoms as well as the basis vectors learned from Gabor and gammatone seed atoms. SNR data compression curves are generated for speech signals as well as early American music recordings. The basis vectors are shown to have higher denoising capability for data compression rates ranging from 90% to 99.84% for speech as well as music. Envelope samples are displayed as images by folding the time series into column vectors. This display method is used to compare of the output of the SAE with the envelope samples that produced them. The basis vectors are also displayed as images. Sparsity is shown to play an important role in producing the highest denoising basis vectors.

Keywords: sparse dictionary learning, autoencoder, sparse autoencoder, basis vectors, atomic decomposition, envelope sampling, envelope samples, Gabor, gammatone, matching pursuit

Procedia PDF Downloads 252
5357 Speech Perception by Video Hosting Services Actors: Urban Planning Conflicts

Authors: M. Pilgun

Abstract:

The report presents the results of a study of the specifics of speech perception by actors of video hosting services on the material of urban planning conflicts. To analyze the content, the multimodal approach using neural network technologies is employed. Analysis of word associations and associative networks of relevant stimulus revealed the evaluative reactions of the actors. Analysis of the data identified key topics that generated negative and positive perceptions from the participants. The calculation of social stress and social well-being indices based on user-generated content made it possible to build a rating of road transport construction objects according to the degree of negative and positive perception by actors.

Keywords: social media, speech perception, video hosting, networks

Procedia PDF Downloads 147