Search results for: speech signal processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5424

5214 Clutter Suppression Based on Singular Value Decomposition and Fast Wavelet Algorithm

Authors: Ruomeng Xiao, Zhulin Zong, Longfa Yang

Abstract:

To address the difficulty of detecting target signals in strong ground clutter, this paper proposes a clutter suppression algorithm that combines singular value decomposition (SVD) with the Mallat fast wavelet algorithm. The method first applies SVD to the radar echo data matrix and thresholds the singular values to achieve an initial separation of target and clutter. It then applies wavelet decomposition to the echo data to locate the target and uses a discard method to select the appropriate decomposition layers for reconstructing the target signal, which minimizes the loss of target information while suppressing the clutter. Verification on measured data shows that the method is effective for target extraction at low signal-to-clutter ratio (SCR), that the target can be reconstructed without prior information about its position, and that the output SCR improves over the traditional single wavelet processing method.
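The SVD stage described above can be sketched in a few lines of NumPy. This is only an illustration, not the authors' implementation: it assumes that clutter dominates the largest singular values, and the function name `svd_clutter_suppress` and the relative threshold `ratio` are hypothetical. The wavelet stage is not shown.

```python
import numpy as np

def svd_clutter_suppress(echo, ratio=0.1):
    """Zero the singular values assumed to be clutter-dominated.

    echo  : 2-D radar echo matrix (for example pulses x range bins)
    ratio : hypothetical relative threshold; components whose singular
            value is at least ratio * largest are treated as clutter
    """
    u, s, vh = np.linalg.svd(echo, full_matrices=False)
    keep = s < ratio * s[0]             # weak components: assumed target
    s_filtered = np.where(keep, s, 0.0)
    return (u * s_filtered) @ vh        # rebuild without the strong components
```

With a strong rank-one background and one weak point target, the largest entry of the output should sit at the target cell. Real clutter is rarely rank one, so the threshold would have to be chosen from the data.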

Keywords: clutter suppression, singular value decomposition, wavelet transform, Mallat algorithm, low SCR

Procedia PDF Downloads 78
5213 Musical Tesla Coil Controlled by an Audio Signal Processed in Matlab

Authors: Sandra Cuenca, Danilo Santana, Anderson Reyes

Abstract:

This project is based on the manipulation of audio signals in Matlab. An audio signal is modified in software, and the result, output through the computer's auxiliary port, is passed through a signal amplifier whose output is connected to a Tesla coil. The coil behaves like a VU meter: the flashes at its output increase and decrease in intensity depending on the audio signal in the computer and on the voltage source that supplies it. The amplified signal drives the Tesla coil, and the resulting flashes are displayed in a plasma sphere. The activation is controlled by the parameters specified in the Matlab algorithm, which contains the digital filters used to manipulate the audio signal. The flashes in the plasma sphere combine colors, commonly pink and purple, that vary with the tone of the song.

Keywords: auxiliary port, tesla coil, vumeter, plasma sphere

Procedia PDF Downloads 46
5212 A Generalized Sparse Bayesian Learning Algorithm for Near-Field Synthetic Aperture Radar Imaging: By Exploiting Impropriety and Noncircularity

Authors: Pan Long, Bi Dongjie, Li Xifeng, Xie Yongle

Abstract:

Near-field synthetic aperture radar (SAR) imaging is an advanced nondestructive testing and evaluation (NDT&E) technique. This paper investigates the complex-valued signal processing related to the near-field SAR imaging system, where the measurement data turn out to be noncircular and improper, meaning that the complex-valued data are correlated with their complex conjugate. Furthermore, we find that the degree of impropriety of the measurement data and that of the target image can be highly correlated in near-field SAR imaging. Based on these observations, a modified generalized sparse Bayesian learning algorithm is proposed that takes impropriety and noncircularity into account. Numerical results show that the proposed algorithm provides a performance gain with the help of the noncircular assumption on the signals.
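As a small illustration of what "improper" means here, the sketch below estimates a standard circularity coefficient, |E[z²]| / E[|z|²], from zero-mean complex samples. It is 0 for proper data and 1 for maximally improper data such as purely real samples. The function name is hypothetical, and this particular measure is an assumption: the abstract does not say how the authors quantify the degree of impropriety.

```python
import numpy as np

def circularity_coefficient(z):
    """Estimate |E[z^2]| / E[|z|^2] after removing the sample mean."""
    z = np.asarray(z, dtype=complex)
    z = z - z.mean()
    pseudo_variance = np.mean(z * z)       # correlation of z with conj(z)
    variance = np.mean(np.abs(z) ** 2)
    return np.abs(pseudo_variance) / variance
```

For example, the four points 1, j, -1, -j give a coefficient of 0, while any real-valued sample set gives 1.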

Keywords: complex-valued signal processing, synthetic aperture radar, 2-D radar imaging, compressive sensing, sparse Bayesian learning

Procedia PDF Downloads 92
5211 Graphical User Interface for Presenting Matlab Work for Reduction of Chromatic Dispersion Using Digital Signal Processing for Optical Communication

Authors: Muhammad Faiz Liew Abdullah, Bhagwan Das, Nor Shahida, Abdul Fattah Chandio

Abstract:

This study presents the design features of a graphical user interface (GUI) for chromatic dispersion (CD) reduction using digital signal processing (DSP) techniques. The GUI is designed specifically for the Windows platform and presents simulation results obtained from Matlab. Once the results are imported from Matlab, the GUI can present the work on Windows 7 and later versions without the Matlab software. The first part of the GUI contains the block diagram of the research methodology; the second part shows the output of each stage in a separate area reserved for result display. Each stage of the methodology has a caption for its results. The GUI is helpful during presentations: instead of preparing slides, the presenter can show all the work without other software such as Matlab, LabVIEW, or MS PowerPoint. The GUI is developed in C using MS Visio Studio.

Keywords: Matlab simulation results, C programming, MS VISIO studio, chromatic dispersion

Procedia PDF Downloads 426
5210 A Pole Radius Varying Notch Filter with Transient Suppression for Electrocardiogram

Authors: Ramesh Rajagopalan, Adam Dahlstrom

Abstract:

Noise removal techniques play a vital role in the performance of electrocardiographic (ECG) signal processing systems. ECG signals can be corrupted by various kinds of noise, such as baseline wander, electromyographic interference, and power-line interference. One of the significant challenges in ECG signal processing is the degradation caused by additive 50 or 60 Hz power-line interference. This work investigates the removal of power-line interference and the suppression of the transient response when filtering noise-corrupted ECG signals. We demonstrate the effectiveness of an Infinite Impulse Response (IIR) notch filter with a time-varying pole radius for improving the transient behavior. The temporary change in the pole radius of the filter diminishes the transient behavior. Simulation results show that the proposed IIR filter with time-varying pole radius outperforms traditional IIR notch filters in terms of mean square error and transient suppression.
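The idea of a notch filter whose pole radius changes over time can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' design: it uses a standard second-order notch with zeros on the unit circle at the interference frequency and poles at radius r, and it ramps r linearly from a small value (fast transient decay, wide notch) to a value near 1 (narrow notch). The function name, the linear ramp, and all default parameters are hypothetical.

```python
import numpy as np

def varying_radius_notch(x, f0, fs, r_start=0.7, r_end=0.98, ramp=200):
    """Second-order IIR notch at f0 with a pole radius ramped over time.

    x      : input samples
    f0, fs : notch frequency and sampling frequency in Hz
    r_start, r_end, ramp : hypothetical ramp settings (radius goes from
                           r_start to r_end over `ramp` samples)
    """
    x = np.asarray(x, dtype=float)
    c = np.cos(2.0 * np.pi * f0 / fs)
    n = np.arange(len(x))
    r = r_start + (r_end - r_start) * np.minimum(n / ramp, 1.0)
    y = np.zeros(len(x))
    for k in range(len(x)):
        x1 = x[k - 1] if k >= 1 else 0.0
        x2 = x[k - 2] if k >= 2 else 0.0
        y1 = y[k - 1] if k >= 1 else 0.0
        y2 = y[k - 2] if k >= 2 else 0.0
        # zeros at exp(+-j*w0); poles at r[k] * exp(+-j*w0)
        y[k] = x[k] - 2.0 * c * x1 + x2 + 2.0 * r[k] * c * y1 - r[k] ** 2 * y2
    return y
```

Feeding a pure 50 Hz sinusoid sampled at 500 Hz through this filter should leave an output that decays towards zero once the start-up transient has died out. The passband gain is not normalized in this sketch.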

Keywords: notch filter, ECG, transient, pole radius

Procedia PDF Downloads 350
5209 A Contactless Capacitive Biosensor for Muscle Activity Measurement

Authors: Charn Loong Ng, Mamun Bin Ibne Reaz

Abstract:

As the elderly population grows globally, the percentage of people diagnosed with musculoskeletal disorders (MSDs) increases proportionally. Electromyography (EMG) is an important biosignal that contributes to the clinical diagnosis of and recovery from MSDs. Conventional conductive electrodes have many disadvantages in continuous EMG measurement applications. This research designs a new surface EMG biosensor based on the parallel-plate capacitive coupling principle. The biosensor is built on a double-sided PCB: one side carries the high-input-impedance circuitry, while the copper (Cu) plate on the other side functions as the biosignal-sensing metal plate. The metal plate is insulated with Kapton tape for contactless application. The results indicate that the capacitive biosensor can continuously capture EMG signals without galvanic contact with the skin surface. However, noticeable noise couples into the measured signal, so post-processing is needed to obtain a clean and meaningful EMG signal. A complete design of a single-ended, non-contact, high-input-impedance front-end EMG biosensor is presented in this paper.

Keywords: contactless, capacitive, biosensor, electromyography

Procedia PDF Downloads 421
5208 From Electroencephalogram to Epileptic Seizures Detection by Using Artificial Neural Networks

Authors: Gaetano Zazzaro, Angelo Martone, Roberto V. Montaquila, Luigi Pavone

Abstract:

Seizures are the main factor affecting the quality of life of epileptic patients. The diagnosis of epilepsy, and hence the identification of the epileptogenic zone, is commonly made by continuous electroencephalogram (EEG) signal monitoring. Seizure identification on EEG signals is performed manually by epileptologists, a process that is usually very long and error-prone. The aim of this paper is to describe an automated method that detects seizures in EEG signals using the knowledge discovery in databases process and data mining methods and algorithms, and that can support physicians during the seizure detection process. Our detection method is based on an Artificial Neural Network classifier, trained by applying the multilayer perceptron algorithm, and on a software application called Training Builder, which has been developed for the massive extraction of features from EEG signals. This tool covers all the data preparation steps, ranging from signal processing to data analysis techniques, including the sliding window paradigm, dimensionality reduction algorithms, information theory, and feature selection measures. The final model shows excellent performance, reaching an accuracy of over 99% in tests on data from a single patient retrieved from a publicly available EEG dataset.

Keywords: artificial neural network, data mining, electroencephalogram, epilepsy, feature extraction, seizure detection, signal processing

Procedia PDF Downloads 150
5207 Investigating the English Speech Processing System of EFL Japanese Older Children

Authors: Hiromi Kawai

Abstract:

This study investigates the nature of EFL older children’s L2 perceptive and productive abilities using classroom data, in order to find a pedagogical solution to the teaching of L2 sounds at an early stage of learning in a formal school setting. It is still inconclusive whether older children with only EFL formal school instruction at the initial stage of L2 learning are able to attain native-like perception and production in English within the very limited amount of exposure to the target language available. Based on the notion of the lack of study of EFL Japanese children’s acquisition of English segments, the researcher uses a model of L1 speech processing which was developed for investigating L1 English children’s speech and literacy difficulties using a psycholinguistic framework. The model is composed of input channel, output channel, and lexical representation, and examines how a child receives information from spoken or written language, remembers and stores it within the lexical representations and how the child selects and produces spoken or written words. Concerning language universality and language specificity in the language acquisitional process, the aim of finding any sound errors in L1 English children seemed to conform to the author’s intention to find abilities of English sounds in older Japanese children at the novice level of English in an EFL setting. 104 students in Grade 5 (between the ages of 10 and 11 years old) of an elementary school in Tokyo participated in this study. Four tests to measure their perceptive ability and three oral repetition tests to measure their productive ability were conducted with/without reference to lexical representation. All the test items were analyzed to calculate item facility (IF) indices, and correlational analyses and Structural Equation Modeling (SEM) were conducted to examine the relationship between the receptive ability and the productive ability. 
IF analysis showed that (1) the participants were better at perceiving a segment than producing a segment, (2) they had difficulty in auditory discrimination of paired consonants when one of them does not exist in the Japanese inventory, (3) they had difficulty in both perceiving and producing English vowels, and (4) their L1 loan word knowledge had an influence on their ability to perceive and produce L2 sounds. The result of the Multiple Regression Modeling showed that the two production tests could predict the participants’ auditory ability of real words in English. The result of SEM showed that the hypothesis that perceptive ability affects productive ability was supported. Based on these findings, the author discusses the possible explicit method of teaching English segments to EFL older children in a formal school setting.

Keywords: EFL older children, English segments, perception, production, speech processing system

Procedia PDF Downloads 217
5206 An Improved Two-dimensional Ordered Statistical Constant False Alarm Detection

Authors: Weihao Wang, Zhulin Zong

Abstract:

Two-dimensional ordered statistical constant false alarm detection is a widely used method for detecting weak target signals in radar signal processing applications. The method is based on analyzing the statistical characteristics of the noise and clutter present in the radar signal and then using this information to set an appropriate detection threshold. In this approach, the reference cell of the unit to be detected is divided into several reference subunits. These subunits are used to estimate the noise level and adjust the detection threshold, with the aim of minimizing the false alarm rate. By using an ordered statistical approach, the method is able to effectively suppress the influence of clutter and noise, resulting in a low false alarm rate. The detection process involves a number of steps, including filtering the input radar signal to remove any noise or clutter, estimating the noise level based on the statistical characteristics of the reference subunits, and finally, setting the detection threshold based on the estimated noise level. One of the main advantages of two-dimensional ordered statistical constant false alarm detection is its ability to detect weak target signals in the presence of strong clutter and noise. This is achieved by carefully analyzing the statistical properties of the signal and using an ordered statistical approach to estimate the noise level and adjust the detection threshold. In conclusion, two-dimensional ordered statistical constant false alarm detection is a powerful technique for detecting weak target signals in radar signal processing applications. By dividing the reference cell into several subunits and using an ordered statistical approach to estimate the noise level and adjust the detection threshold, this method is able to effectively suppress the influence of clutter and noise and maintain a low false alarm rate.
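A plain two-dimensional ordered-statistic CFAR detector, the baseline the abstract builds on, can be sketched as follows. This is only a sketch: it does not implement the subunit partition of the reference cells described above, and the function name, window sizes, rank fraction `k_frac`, and scale factor `alpha` are all hypothetical choices.

```python
import numpy as np

def os_cfar_2d(power, guard=1, ref=3, k_frac=0.75, alpha=5.0):
    """Basic 2-D OS-CFAR on a power map (for example range x Doppler).

    guard  : guard cells on each side of the cell under test
    ref    : reference cells on each side beyond the guard cells
    k_frac : rank of the ordered statistic, as a fraction of the
             number of reference cells
    alpha  : hypothetical threshold scale factor
    """
    rows, cols = power.shape
    half = guard + ref
    detections = np.zeros(power.shape, dtype=bool)
    # Reference cells: the full window minus the guard block around the centre.
    mask = np.ones((2 * half + 1, 2 * half + 1), dtype=bool)
    mask[ref:ref + 2 * guard + 1, ref:ref + 2 * guard + 1] = False
    for i in range(half, rows - half):
        for j in range(half, cols - half):
            window = power[i - half:i + half + 1, j - half:j + half + 1]
            ordered = np.sort(window[mask])
            k = int(k_frac * (ordered.size - 1))
            noise_level = ordered[k]         # k-th ordered statistic
            detections[i, j] = power[i, j] > alpha * noise_level
    return detections
```

Because the noise estimate is an ordered statistic rather than a mean, a single strong cell inside the reference window does not raise the threshold for its neighbours. Cells closer than `guard + ref` to the border are left undetected in this sketch.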

Keywords: two-dimensional, ordered statistical, constant false alarm, detection, weak target signals

Procedia PDF Downloads 44
5205 Compressed Sensing of Fetal Electrocardiogram Signals Based on Joint Block Multi-Orthogonal Least Squares Algorithm

Authors: Xiang Jianhong, Wang Cong, Wang Linyu

Abstract:

With the rise of medical IoT technologies, wireless body area networks (WBANs) can collect fetal electrocardiogram (FECG) signals to support telemedicine analysis. A compressed sensing (CS)-based WBAN system can avoid sampling a large amount of redundant information and reduce the complexity and computing time of data processing, but existing algorithms have poor signal compression and reconstruction performance. In this paper, a joint block multi-orthogonal least squares (JBMOLS) algorithm is proposed. We apply the FECG signal to the joint block sparse model (JBSM) and carry out a comparative study of sparse transformations and measurement matrices. An FECG signal compression transmission mode based on the Rbio5.5 wavelet, a Bernoulli measurement matrix, and the JBMOLS algorithm is proposed to improve the compression and reconstruction performance of FECG signals in CS-based WBANs. Experimental results show that the compression ratio (CR) required for accurate reconstruction with this transmission mode is increased by nearly 10%, and the runtime is reduced by about 30%.

Keywords: telemedicine, fetal ECG, compressed sensing, joint sparse reconstruction, block sparse signal

Procedia PDF Downloads 95
5204 Investigating the Online Effect of Language on Gesture in Advanced Bilinguals of Two Structurally Different Languages in Comparison to L1 Native Speakers of L2 and Explores Whether Bilinguals Will Follow Target L2 Patterns in Speech and Co-speech

Authors: Armita Ghobadi, Samantha Emerson, Seyda Ozcaliskan

Abstract:

Being a bilingual involves mastery of both speech and gesture patterns in a second language (L2). We know from earlier work in first language (L1) production contexts that speech and co-speech gesture form a tightly integrated system: co-speech gesture mirrors the patterns observed in speech, suggesting an online effect of language on nonverbal representation of events in gesture during the act of speaking (i.e., “thinking for speaking”). Relatively less is known about the online effect of language on gesture in bilinguals speaking structurally different languages. The few existing studies, mostly with small sample sizes, have produced inconclusive findings: some show greater achievement of L2 patterns in gesture with more advanced L2 speech production, while others show preferences for L1 gesture patterns even in advanced bilinguals. In this study, we focus on advanced bilingual speakers of two structurally different languages (Spanish L1 with English L2) in comparison to L1 English speakers. We ask whether bilingual speakers will follow target L2 patterns not only in speech but also in gesture, or alternatively, follow L2 patterns in speech but resort to L1 patterns in gesture. We examined this question by studying speech and gestures produced by 23 advanced adult Spanish (L1)-English (L2) bilinguals (Mage=22; SD=7) and 23 monolingual English speakers (Mage=20; SD=2). Participants were shown 16 animated motion event scenes that included distinct manner and path components (e.g., "run over the bridge"). We recorded and transcribed all participant responses for speech and segmented them into sentence units that included at least one motion verb and its associated arguments. We also coded all gestures that accompanied each sentence unit. We focused on motion event descriptions because they show strong crosslinguistic differences in the packaging of motion elements in speech and co-speech gesture in first language production contexts.
English speakers synthesize manner and path into a single clause or gesture (he runs over the bridge; running fingers forward), while Spanish speakers express each component separately (manner-only: el corre=he is running; circle arms next to body conveying running; path-only: el cruza el puente=he crosses the bridge; trace finger forward conveying trajectory). We tallied all responses by group and packaging type, separately for speech and co-speech gesture. Our preliminary results (n=4/group) showed that productions in English L1 and Spanish L1 differed, with greater preference for conflated packaging in L1 English and separated packaging in L1 Spanish—a pattern that was also largely evident in co-speech gesture. Bilinguals’ production in L2 English, however, followed the patterns of the target language in speech—with greater preference for conflated packaging—but not in gesture. Bilinguals used separated and conflated strategies in gesture in roughly similar rates in their L2 English, showing an effect of both L1 and L2 on co-speech gesture. Our results suggest that online production of L2 language has more limited effects on L2 gestures and that mastery of native-like patterns in L2 gesture might take longer than native-like L2 speech patterns.

Keywords: bilingualism, cross-linguistic variation, gesture, second language acquisition, thinking for speaking hypothesis

Procedia PDF Downloads 45
5203 Cognitive Semantics Study of Conceptual and Metonymical Expressions in Johnson's Speeches about COVID-19

Authors: Hussain Hameed Mayuuf

Abstract:

This study investigates the conceptual metonymies used in political discourse about COVID-19 and analyzes how the conceptual metonymies in Johnson's speeches about the coronavirus are constructed. It aims to identify how metonymies are relevant to understanding the messages in Boris Johnson's speeches, to find out how conceptual blending theory (CBT) can help people understand the messages in political speech about COVID-19, and to point out which kinds of integration networks are common in political speech. The study is based on the hypotheses that CBT is a powerful tool for investigating the intended messages in Johnson's speeches and that different processes of blending networks and conceptual mapping enable listeners to identify the messages in political speech. It presents a qualitative and quantitative analysis of four speeches about COVID-19 delivered by Boris Johnson. The selected data are examined from a cognitive-semantic perspective, adopting CBT as the model for the analysis. The study concludes that CBT is applicable to the analysis of metonymies in political discourse and that its mechanisms enable listeners to analyze and understand these speeches. Listeners can also identify and understand the hidden messages in Biden and Johnson's discourse about COVID-19 by using different conceptual networks. Finally, it concludes that double-scope networks are the most common type of blending of metonymies in political speech.

Keywords: cognitive, semantics, conceptual, metonymical, Covid-19

Procedia PDF Downloads 75
5202 On the Implementation of The Pulse Coupled Neural Network (PCNN) in the Vision of Cognitive Systems

Authors: Hala Zaghloul, Taymoor Nazmy

Abstract:

One of the great challenges of the 21st century is to build a robot that can perceive and act within its environment and communicate with people, while also exhibiting the cognitive capabilities that lead to human-like performance. The Pulse Coupled Neural Network (PCNN) is a relatively new ANN model, derived from a mammalian neural model, with great potential in image processing as well as in target recognition, feature extraction, speech recognition, combinatorial optimization, and compressed encoding. PCNN has unique features among neural network types, which make it a candidate for an important approach to perception in cognitive systems. This work shows and emphasizes the potential of PCNN to perform different tasks related to image processing. The main obstacle to direct implementation of the technique is the need to find a way to control the PCNN parameters so that the network performs a specific task. This paper evaluates the performance of the standard PCNN model on images with different properties, selects the important parameters that give significant results, and examines approaches to adapting the PCNN parameters to a specific task.

Keywords: cognitive system, image processing, segmentation, PCNN kernels

Procedia PDF Downloads 248
5201 Bidirectional Dynamic Time Warping Algorithm for the Recognition of Isolated Words Impacted by Transient Noise Pulses

Authors: G. Tamulevičius, A. Serackis, T. Sledevič, D. Navakauskas

Abstract:

We consider the biggest challenge in speech recognition: noise reduction. Traditionally, detected transient noise pulses are removed together with the corrupted speech using pulse models. In this paper, we propose to cope with the problem directly in the dynamic time warping domain. A bidirectional dynamic time warping algorithm for the recognition of isolated words impacted by transient noise pulses is proposed. It uses a simple transient noise pulse detector, employs bidirectional computation of dynamic time warping, and directly manipulates the warping results. An experimental investigation with several alternative solutions confirms the effectiveness of the proposed algorithm in reducing the impact of noise on the recognition process: a 3.9% increase in noisy speech recognition is achieved.
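For reference, classical dynamic time warping between two one-dimensional sequences can be sketched as follows. This is the standard textbook recurrence, not the bidirectional variant proposed in the paper, and it omits the noise pulse detector. The function name and the use of absolute difference as the local cost are assumptions for illustration; a speech recognizer would compare feature vectors rather than scalars.

```python
import numpy as np

def dtw_distance(a, b):
    """Accumulated DTW cost between sequences a and b (absolute-difference cost)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible predecessor paths
            cost[i, j] = local + min(cost[i - 1, j],
                                     cost[i, j - 1],
                                     cost[i - 1, j - 1])
    return cost[n, m]
```

A sequence and a time-stretched copy of it, such as [1, 2, 3] and [1, 2, 2, 3], have distance 0. A bidirectional scheme would additionally run the same recurrence from the ends of the sequences backwards.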

Keywords: transient noise pulses, noise reduction, dynamic time warping, speech recognition

Procedia PDF Downloads 521
5200 The Combination of the Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), JITTER and SHIMMER Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech

Authors: Brahim-Fares Zaidi, Malika Boudraa, Sid-Ahmed Selouani

Abstract:

Our work aims to improve our Automatic Recognition System for Dysarthric Speech (ARSDS), which is based on hidden Markov models (HMM) and the Hidden Markov Model Toolkit (HTK), to help people with pronunciation problems. We applied two speech parameterization techniques, Mel Frequency Cepstral Coefficients (MFCCs) and Perceptual Linear Prediction (PLP) coefficients, and concatenated them with jitter and shimmer coefficients in order to increase the recognition rate for dysarthric speech. For our tests, we used the NEMOURS database, which contains speakers with dysarthria and normal speakers.

Keywords: hidden Markov model toolkit (HTK), hidden Markov models (HMM), Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP)

Procedia PDF Downloads 128
5199 Cultural-Creative Design with Language Figures of Speech

Authors: Wei Chen Chang, Ming Yu Hsiao

Abstract:

A commodity functions as a kind of sign. How designers construct and interpret it, how users make use of it, and how its message is conveyed effectively have always been important issues in design education. Cultural-creative design refers to signifying cultural heritage in product design. In terms of Peirce’s Semiotic Triangle (signifying elements, object, interpretant), the signifying elements are the outcomes of design, the object is cultural heritage, and the interpretant is the positioning and description of the product design. How to elaborate the positioning, design, and development of a product is a narrative issue of the interpretant, and how to shape the signifying elements of a product by modifying and adapting styles is a matter of rhetoric. This study investigated the rhetoric of the elements signifying products in order to develop a rhetoric model with cultural style. Figures of speech are a rhetorical method in narrative. By adapting figures of speech to the interpretant, this study developed the rhetorical context of the cultural context by narrative means. In this two-phase study, phase I defines figures of speech and phase II analyzes existing cultural-creative products in terms of figures of speech to develop a rhetoric of style model. We expect the model to serve as a reference for the future development of cultural-creative design.

Keywords: cultural-creative design, cultural-creative products, figures of speech, Peirce’s semiotic triangle, rhetoric of style model

Procedia PDF Downloads 344
5198 Efficacy of Phonological Awareness Intervention for People with Language Impairment

Authors: I. Wardana Ketut, I. Suparwa Nyoman

Abstract:

This study investigated the form and characteristics of speech sounds produced by three Balinese subjects who had recovered from aphasia, and applied an intervention for their language impairment from both linguistic and neuronal points of view. The failure to judge speech sounds was caused by impairment of the motor cortex, which indicated lesions in the left-hemisphere language zone. Sound articulation phenomena took the form of phoneme deletion, replacement, or assimilation in individual words, and of meaning building in anomic aphasia. The Balinese sound patterns were therefore elicited by showing pictures to the subjects and recorded in order to identify which individual consonants or vowels they produced unclearly and to find out how the sound disorder occurred. The physiology of sound production by the subjects' speech organs could show not only the accuracy of articulation but also the severity of the lesion they suffered from. The subjects' speech sounds were investigated, classified, and analyzed to determine how poor the lingual units were, and observed to clarify weaknesses in sound character for either place or manner of articulation. Many fricative and stop consonants were replaced by glottal or palatal sounds because cranial nerves, such as the facial, trigeminal, and hypoglossal nerves, were impaired after the stroke. The phonological intervention was applied through a technique called phonemic articulation drill, and an examination was conducted to determine whether any change had been obtained. The findings showed that some weak articulations became clearer and that simple meanings could be conveyed. The hierarchy of functional parts of the brain played an important role in language formulation and processing. These findings support the view that the role of the right hemisphere in recovery from aphasia is associated with functional brain reorganization.

Keywords: aphasia, intervention, phonology, stroke

Procedia PDF Downloads 174
5197 Normalized P-Laplacian: From Stochastic Game to Image Processing

Authors: Abderrahim Elmoataz

Abstract:

More and more contemporary applications involve data in the form of functions defined on irregular and topologically complicated domains (images, meshes, point clouds, networks, etc.). Such data are not organized as familiar digital signals and images sampled on regular lattices. However, they can be conveniently represented as graphs where each vertex represents measured data and each edge represents a relationship (connectivity, affinity, or interaction) between two vertices. Processing and analyzing these types of data is a major challenge for both the image processing and machine learning communities. Hence, it is very important to transfer to graphs and networks many of the mathematical tools that were initially developed on usual Euclidean spaces and have proven efficient for many inverse problems and applications dealing with usual image and signal domains. Historically, the main tools for the study of graphs or networks have come from combinatorics and graph theory. In recent years, there has been increasing interest in carrying over to graphs one of the major mathematical tools for signal and image analysis: partial differential equations (PDEs) and variational methods. The normalized p-Laplacian operator was recently introduced to model a stochastic game called tug-of-war with noise. Part of the interest in this class of operators arises from the fact that it includes, as particular cases, the infinity Laplacian, the mean curvature operator, and the traditional Laplacian operators, which have been used extensively to model and solve problems in image processing. The purpose of this paper is to introduce and study a new class of normalized p-Laplacians on graphs. The introduction is based on an extension of the p-harmonious functions introduced as discrete approximations for both infinity Laplacian and p-Laplacian equations. Finally, we propose to use these operators as a framework for solving many inverse problems in image processing.

Keywords: normalized p-laplacian, image processing, stochastic game, inverse problems

Procedia PDF Downloads 480
5196 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification

Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro

Abstract:

Voice recognition algorithms such as automatic speech recognition and text-to-speech systems for African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify user responses as inputs for an interactive voice response system. A dataset of the Wolof words ‘yes’ and ‘no’ is collected as audio recordings. A two-stage data augmentation approach is adopted to increase the dataset to the size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. To perform voice response classification, the recordings are transformed into sound frequency feature spectra, and an image classification methodology is then applied using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications on both web and mobile platforms.

Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification

Procedia PDF Downloads 80
5195 Forensic Analysis of Signal Messenger on Android

Authors: Ward Bakker, Shadi Alhakimi

Abstract:

The number of people moving towards more privacy-focused instant messaging applications has grown significantly. Signal is one of these applications, which makes it interesting for digital investigators. In this research, we evaluate the artifacts that are generated by the Signal messenger for Android. The evaluation was done by using the features that Signal provides to create artifacts, after which we made an image of the internal storage and the process memory. This image was analysed manually. The manual analysis revealed the content that Signal stores in different locations during its operation. From our research, we were able to identify the artifacts and interpret how they were used. We also examined the source code of Signal. Using the knowledge obtained from the source code, we developed a tool that decrypts some of the artifacts using the key stored in the Android Keystore. In general, we found that most artifacts are encrypted and encoded, even after decrypting some of the artifacts. During data visualization, some further observations were made, such as that Signal does not use relationships between the data. In this research, two interesting groups of artifacts were identified: those related to the database and those stored in the process memory dump. In the database, we found plaintext private and group chats, and in the memory dump, we were able to retrieve the plaintext access code to the application. Nevertheless, we conclude that Signal contains a wealth of artifacts that could be very valuable to a digital forensic investigation.

Keywords: forensic, signal, Android, digital

Procedia PDF Downloads 43
5194 Neural Network Based Path Loss Prediction for Global System for Mobile Communication in an Urban Environment

Authors: Danladi Ali

Abstract:

In this paper, we measured GSM signal strength in the city of Dnepropetrovsk in order to predict path loss in the study area using nonlinear autoregressive neural network prediction, and we also used neural network clustering to determine the average GSM signal strength received in the study area. The nonlinear autoregressive neural network predicted the GSM signal attenuation with a mean square error (MSE) of 2.6748 dB; this attenuation value is used to modify the COST 231 Hata and Okumura-Hata models. The neural network clustering revealed that -75 dB to -95 dB is received most frequently. This means that the signal received in the study area is mostly weak.
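
For reference, the unmodified COST 231 Hata model can be sketched as below. The formula is the standard published one (frequency in MHz, distance in km, antenna heights in m). The `correction_db` parameter is a hypothetical hook showing where a measured offset could be added; how the paper actually applies its correction is not stated in the abstract, and the example values are assumptions.

```python
import math


def cost231_hata_db(f_mhz, h_base_m, h_mobile_m, d_km,
                    metropolitan=False, correction_db=0.0):
    """Median path loss in dB from the COST 231 Hata model.

    Nominal validity: 1500-2000 MHz, base height 30-200 m,
    mobile height 1-10 m, distance 1-20 km.
    """
    log_f = math.log10(f_mhz)
    log_hb = math.log10(h_base_m)
    # Mobile antenna correction; the small/medium-city form is used
    # throughout here as a simplification.
    a_hm = (1.1 * log_f - 0.7) * h_mobile_m - (1.56 * log_f - 0.8)
    c_m = 3.0 if metropolitan else 0.0
    loss = (46.3 + 33.9 * log_f - 13.82 * log_hb - a_hm
            + (44.9 - 6.55 * log_hb) * math.log10(d_km) + c_m)
    # correction_db is a hypothetical empirical offset, not part of the model.
    return loss + correction_db


# Hypothetical link: 1800 MHz, 30 m base station, 1.5 m handset, 1 km range.
loss_1km = cost231_hata_db(1800.0, 30.0, 1.5, 1.0)
loss_5km = cost231_hata_db(1800.0, 30.0, 1.5, 5.0)
```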

Keywords: one-dimensional multilevel wavelets, path loss, GSM signal strength, propagation, urban environment and model

Procedia PDF Downloads 350
5193 Quantum Cum Synaptic-Neuronal Paradigm and Schema for Human Speech Output and Autism

Authors: Gobinathan Devathasan, Kezia Devathasan

Abstract:

Objective: To improve the current modified Broca-Wernicke-Lichtheim-Kussmaul speech schema and provide insight into autism. Methods: We reviewed the pertinent literature. Current findings, involving Brodmann areas 22, 46, 9, 44, 45, 6, and 4, are based on neuropathology and functional MRI studies. However, in primary autism, there is no lucid explanation, and the changes described, whether from neuropathology or functional MRI, appear consequential. Findings: We put forward an enhanced model which may explain the enigma related to autism. Vowel output is subcortical and does need cortical representation whereas consonant speech is cortical in origin. Left lateralization is needed to commence the circuitry spin as life has evolved with L-amino acids and left spin of electrons. A fundamental species difference is that we are capable of three-syllable consonants and bi-syllable expression whereas cetaceans and songbirds are confined to single or dual consonants. The 4 key sites for speech are the superior auditory cortex, Broca’s two areas, and the supplementary motor cortex. Using Argand’s diagram and Riemann’s projection, we theorize that the Euclidean three-dimensional synaptic neuronal circuits of speech are quantized to coherent waves, and then decoherence takes place at area 6 (spherical representation). In this quantum state complex, 3-consonant languages are instantaneously integrated and multiple languages can be learned, verbalized and differentiated. Conclusion: We postulate that evolutionary human speech is elevated to quantum interaction, unlike that of cetaceans and birds, to achieve the three consonants/bi-syllable speech. In classical primary autism, the sudden speech switches off and on noted in several cases could now be explained not by any anatomical lesion but by failure of coherence. Area 6 projects directly into the prefrontal saccadic area (8), and this further explains the second primary feature in autism: lack of eye contact. The third feature, repetitive finger gestures controlled from areas located adjacent to the speech/motor areas, are actual attempts to communicate with the autistic child akin to sign language for the deaf.

Keywords: quantum neuronal paradigm, cetaceans and human speech, autism and rapid magnetic stimulation, coherence and decoherence of speech

Procedia PDF Downloads 158
5192 Performance Analysis of VoIP Coders for Different Modulations Under Pervasive Environment

Authors: Jasbinder Singh, Harjit Pal Singh, S. A. Khan

Abstract:

The work in this paper presents a comparison of speech signals encoded by different narrow-band and wide-band VoIP codecs under different modulation schemes. The simulation results indicate that the codec has an impact on speech quality, which is also affected by the modulation scheme.

Keywords: VoIP, coders, modulations, BER, MOS

Procedia PDF Downloads 475
5191 Low Probability of Intercept (LPI) Signal Detection and Analysis Using Choi-Williams Distribution

Authors: V. S. S. Kumar, V. Ramya

Abstract:

In modern electronic warfare, the signal scenario is changing at a rapid pace with the introduction of Low Probability of Intercept (LPI) radars. On the modern battlefield, radar systems face serious threats from passive intercept receivers and from Electronic Attack (EA) and Anti-Radiation Missiles (ARMs). To perform the necessary target detection and tracking while hiding themselves from enemy attack, radar systems should be LPI. These LPI radars use a variety of complex signal modulation schemes together with pulse compression, aided by advances in the radar's signal processing capabilities, so that the radar performs target detection and tracking while simultaneously hiding itself from enemy attack such as EA, thus posing a major challenge to ES/ELINT receivers. Today, an increasing number of LPI radars are being introduced into modern platforms and weapon systems, creating a requirement for the armed forces to develop new techniques, strategies, and equipment to counter them. This paper presents various modulation techniques used in the generation of LPI signals and the development of time-frequency algorithms to analyse those signals.
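
As one concrete example of an LPI modulation, the sketch below generates a Frank polyphase code, a phase-coded waveform often cited in the LPI literature. The abstract does not say which modulations the paper covers, so the choice of the Frank code and the function names are assumptions. For code order M, the phase of sub-pulse (i, j) is (2π/M)·i·j with i, j = 0..M-1, giving M² sub-pulses.

```python
import cmath
import math


def frank_code_phases(m):
    """Phases (radians) of an order-m Frank code, row by row (m*m values)."""
    return [(2.0 * math.pi / m) * i * j for i in range(m) for j in range(m)]


def frank_code_samples(m):
    """Unit-amplitude complex baseband samples, one per sub-pulse."""
    return [cmath.exp(1j * phi) for phi in frank_code_phases(m)]


phases = frank_code_phases(4)    # 16 sub-pulse phases
samples = frank_code_samples(4)  # constant-modulus complex samples
```

A time-frequency method such as the Choi-Williams distribution would then be applied to samples like these to reveal the stepped frequency structure. That analysis step is not shown here.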

Keywords: anti-radiation missiles, cross terms, electronic attack, electronic intelligence, electronic warfare, intercept receiver, low probability of intercept

Procedia PDF Downloads 405
5190 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition

Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie

Abstract:

In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. The paper provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each. Furthermore, the audiovisual speech recognition task is presented as a case study of multimodal data fusion approaches, and the open issues arising from the limitations of current studies are discussed. This paper can serve as a guide for researchers interested in multimodal data fusion, and in audiovisual speech recognition in particular.
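
Two fusion strategies usually counted among the model-agnostic techniques are early fusion (combining features before a single model) and late fusion (combining the decisions of per-modality models). The sketch below contrasts them for an audio and a visual stream. The function names, the weighting scheme, and the example numbers are assumptions for illustration, not the survey's own formulation.

```python
def early_fusion(audio_features, visual_features):
    """Early fusion: concatenate per-modality features into one vector."""
    return list(audio_features) + list(visual_features)


def late_fusion(audio_probs, visual_probs, audio_weight=0.5):
    """Late fusion: weighted average of per-modality class probabilities."""
    if len(audio_probs) != len(visual_probs):
        raise ValueError("both modalities must score the same classes")
    w = audio_weight
    return [w * a + (1.0 - w) * v for a, v in zip(audio_probs, visual_probs)]


# Hypothetical two-class example: the modalities disagree, fusion decides.
joint_features = early_fusion([0.1, 0.7], [0.3, 0.2, 0.9])
fused = late_fusion([0.6, 0.4], [0.2, 0.8], audio_weight=0.5)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In noisy acoustic conditions, a lower `audio_weight` would let the visual stream dominate, which is one reason late fusion is attractive for audiovisual speech recognition.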

Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks

Procedia PDF Downloads 73
5189 Analysis of Linguistic Disfluencies in Bilingual Children’s Discourse

Authors: Sheena Christabel Pravin, M. Palanivelan

Abstract:

Speech disfluencies are common in spontaneous speech. The primary purpose of this study was to distinguish linguistic disfluencies from stuttering disfluencies in bilingual Tamil–English (TE) speaking children. The secondary purpose was to determine whether their disfluencies are mediated by native language dominance and/or by an early onset of developmental stuttering in childhood. A detailed study was carried out to identify the prosodic and acoustic features that uniquely represent the disfluent regions of speech. This paper focuses on statistical modeling of repetitions, prolongations, pauses and interjections in a speech corpus encompassing bilingual spontaneous utterances (English and Tamil) from school-going children. Two classifiers, Hidden Markov Models (HMM) and the Multilayer Perceptron (MLP), which is a class of feed-forward artificial neural network, were compared in the classification of disfluencies. The results of the classifiers document the patterns of disfluency in spontaneous speech samples of school-aged children to distinguish between Children Who Stutter (CWS) and Children with Language Impairment (CLI). The ability of the models to classify the disfluencies was measured in terms of F-measure, Recall, and Precision.
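
The three evaluation measures named above can be computed from true and predicted labels as in the following sketch. The label names and the example data are hypothetical and do not come from the study.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F-measure (F1) for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical disfluency labels for five speech segments.
y_true = ["repetition", "repetition", "pause", "repetition", "pause"]
y_pred = ["repetition", "pause", "pause", "repetition", "repetition"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, "repetition")
```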

Keywords: bi-lingual, children who stutter, children with language impairment, hidden markov models, multi-layer perceptron, linguistic disfluencies, stuttering disfluencies

Procedia PDF Downloads 181
5188 System for Electromyography Signal Emulation Through the Use of Embedded Systems

Authors: Valentina Narvaez Gaitan, Laura Valentina Rodriguez Leguizamon, Ruben Dario Hernandez B.

Abstract:

This work describes a physiological signal emulation system that starts from electromyography (EMG) signals obtained from muscle sensors. Characteristics extracted from these signals are used to model and emulate specific arm movements. The main objective of this effort is to develop a new biomedical software system capable of generating physiological signals through the use of embedded systems by establishing the characteristics of the acquired signals. The acquisition system used was Biosignals, which contains two EMG electrodes used to acquire signals from the forearm, placed on the extensor and flexor muscles. Processing algorithms were implemented to classify the signals generated by the arm muscles when performing specific movements such as wrist flexion-extension, palmar grip, and wrist pronation-supination. Matlab software was used to condition and preprocess the signals for subsequent classification. Subsequently, each signal is modeled mathematically so that it can be generated by the embedded system, and the accuracy of the generated signal is validated using the percentage of cross-correlation, obtaining a precision of 96%. The equations are then discretized to be emulated in the embedded system, obtaining a system capable of generating physiological signals according to the characteristics of medical analysis.
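
One plausible reading of the "percentage of cross-correlation" is the zero-lag normalized cross-correlation between the recorded and the emulated signal, expressed as a percentage. The sketch below implements that reading. The exact definition used by the authors is not given in the abstract, so this definition, the function name, and the test signal are assumptions.

```python
import math


def percent_cross_correlation(reference, emulated):
    """Zero-lag normalized cross-correlation of two signals, in percent."""
    if len(reference) != len(emulated):
        raise ValueError("signals must have the same length")
    mean_r = sum(reference) / len(reference)
    mean_e = sum(emulated) / len(emulated)
    num = sum((r - mean_r) * (e - mean_e)
              for r, e in zip(reference, emulated))
    den = math.sqrt(sum((r - mean_r) ** 2 for r in reference)
                    * sum((e - mean_e) ** 2 for e in emulated))
    if den == 0.0:
        raise ValueError("a constant signal has no defined correlation")
    return 100.0 * num / den


# Hypothetical reference burst and a scaled copy of it.
reference = [math.sin(0.2 * n) * math.exp(-0.01 * n) for n in range(200)]
scaled_copy = [0.5 * x for x in reference]
score = percent_cross_correlation(reference, scaled_copy)
```

Because the measure is normalized, a copy that differs only in amplitude still scores close to 100%, so it checks waveform shape rather than signal level.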

Keywords: classification, electromyography, embedded system, emulation, physiological signals

Procedia PDF Downloads 58
5187 Sleep Apnea Hypopnea Syndrom Diagnosis Using Advanced ANN Techniques

Authors: Sachin Singh, Thomas Penzel, Dinesh Nandan

Abstract:

Accurate diagnosis of Sleep Apnea Hypopnea Syndrome (SAHS) is a difficult problem for human experts because of variability among persons and unwanted noise. This paper proposes the diagnosis of SAHS using airflow, ECG, pulse, and SaO2 signals. The features of each of these signal types are extracted using statistical methods and ANN learning methods. The extracted features are used to approximate the patient's Apnea Hypopnea Index (AHI) using sample signals in the model. Advanced signal processing is also applied to the snore sound signal to locate snore events, and the SaO2 signal is used to confirm whether a detected snore event is true or noise. Finally, the Apnea Hypopnea Index (AHI) is calculated from the true snore events detected. Experimental results show that the sensitivity can reach up to 96% and the specificity 96% for AHI greater than or equal to 5.
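
The quantities reported above follow standard definitions: AHI is the number of apnea and hypopnea events per hour of sleep, and sensitivity and specificity are computed from the confusion counts at a chosen AHI threshold. A minimal sketch, with hypothetical function names and example counts:

```python
def apnea_hypopnea_index(n_events, sleep_hours):
    """AHI: apnea and hypopnea events per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return n_events / sleep_hours


def is_positive(ahi, threshold=5.0):
    """Screening decision at the threshold used above (AHI >= 5)."""
    return ahi >= threshold


def sensitivity(true_pos, false_neg):
    """Fraction of truly positive patients that were detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly negative patients that were cleared."""
    return true_neg / (true_neg + false_pos)


# Hypothetical night: 30 confirmed events over 6 hours of sleep.
ahi = apnea_hypopnea_index(30, 6.0)
```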

Keywords: neural network, AHI, statistical methods, autoregressive models

Procedia PDF Downloads 94
5186 [Keynote Speech]: Bridge Damage Detection Using Frequency Response Function

Authors: Ahmed Noor Al-Qayyim

Abstract:

Over the past decades, bridge structures have come to be considered very important parts of transportation networks due to fast urban sprawl. Failures of bridges under operating conditions have led to a focus on updating the default bridge inspection methodology. Structural health monitoring (SHM) using the vibration response has appeared as a promising method to evaluate the condition of structures. The rapid development of sensor technology and of condition assessment techniques based on vibration-based damage detection has made SHM an efficient and economical way to assess bridges. SHM is intended to assess the state of designated bridges and anticipate probable failures. This paper presents the Frequency Response Function method, which uses the captured vibration test information of structures to evaluate the structure condition. Furthermore, the main steps of bridge assessment using the vibration information are presented. The Frequency Response Function method is applied to the experimental data of a full-scale bridge.
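
To show why a frequency response function is sensitive to damage, the sketch below evaluates the textbook FRF of a single-degree-of-freedom oscillator, H(ω) = 1 / (k - mω² + jcω), and locates its resonance peak before and after a stiffness loss. This is a simplified stand-in and not the paper's procedure. The mass, damping, stiffness values and the 20% stiffness reduction are assumptions.

```python
import math


def receptance(freq_hz, mass, stiffness, damping):
    """Displacement/force FRF of a single-degree-of-freedom oscillator."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / complex(stiffness - mass * omega ** 2, damping * omega)


def peak_frequency(freqs_hz, mass, stiffness, damping):
    """Frequency (Hz) on the grid where the FRF magnitude is largest."""
    return max(freqs_hz,
               key=lambda f: abs(receptance(f, mass, stiffness, damping)))


# Hypothetical structure tuned to 5 Hz, then "damaged" by a 20% stiffness loss.
freqs = [0.01 * i for i in range(1, 1001)]  # 0.01 Hz to 10 Hz
k_healthy = (2.0 * math.pi * 5.0) ** 2      # stiffness for unit mass
f_healthy = peak_frequency(freqs, 1.0, k_healthy, 0.5)
f_damaged = peak_frequency(freqs, 1.0, 0.8 * k_healthy, 0.5)
```

The downward shift of the peak is the kind of change a vibration-based assessment looks for. On a real bridge, the FRF would be estimated from measured excitation and response records rather than from a known model.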

Keywords: bridge assessment, health monitoring, damage detection, frequency response function (FRF), signal processing, structure identification

Procedia PDF Downloads 318
5185 Denoising Transient Electromagnetic Data

Authors: Lingerew Nebere Kassie, Ping-Yu Chang, Hsin-Hua Huang, Chaw-Son Chen

Abstract:

Transient electromagnetic (TEM) data plays a crucial role in hydrogeological and environmental applications, providing valuable insights into geological structures and resistivity variations. However, the presence of noise often hinders the interpretation and reliability of these data. Our study addresses this issue by utilizing a FASTSNAP system for the TEM survey, which operates at different modes (low, medium, and high) with continuous adjustments to discretization, gain, and current. We employ a denoising approach that processes the raw data obtained from each acquisition mode to improve signal quality and enhance data reliability. We use a signal-averaging technique for each mode, increasing the signal-to-noise ratio. Additionally, we utilize wavelet transform to suppress noise further while preserving the integrity of the underlying signals. This approach significantly improves the data quality, notably suppressing severe noise at late times. The resulting denoised data exhibits a substantially improved signal-to-noise ratio, leading to increased accuracy in parameter estimation. By effectively denoising TEM data, our study contributes to a more reliable interpretation and analysis of underground structures. Moreover, the proposed denoising approach can be seamlessly integrated into existing ground-based TEM data processing workflows, facilitating the extraction of meaningful information from noisy measurements and enhancing the overall quality and reliability of the acquired data.
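
The two denoising steps described above can be sketched as follows: repeated records are stacked (averaged), and the result is then shrunk in a wavelet domain. A one-level Haar transform with soft thresholding is used here for brevity. The wavelet, the threshold value, and all names are assumptions, since the abstract does not specify them.

```python
import math

_S = 1.0 / math.sqrt(2.0)


def stack_average(records):
    """Average repeated records sample by sample."""
    n = len(records)
    return [sum(column) / n for column in zip(*records)]


def haar_forward(x):
    """One-level Haar transform of an even-length signal."""
    approx = [(x[i] + x[i + 1]) * _S for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * _S for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def soft_threshold(value, threshold):
    """Shrink a coefficient towards zero by the threshold."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def haar_denoise(x, threshold):
    """Soft-threshold the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(x)
    return haar_inverse(approx, [soft_threshold(d, threshold) for d in detail])
```

For uncorrelated noise, stacking N records reduces the noise standard deviation by roughly a factor of √N. The wavelet step then targets the residual noise that remains in the detail coefficients.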

Keywords: data quality, signal averaging, transient electromagnetic, wavelet transform

Procedia PDF Downloads 54