Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1814

Search results for: noise speech

1634 Speech Identification Test for Individuals with High-Frequency Sloping Hearing Loss in Telugu

Authors: S. B. Rathna Kumar, Sandya K. Varudhini, Aparna Ravichandran

Abstract:

Telugu is a south central Dravidian language spoken in Andhra Pradesh, a southern state of India. The available speech identification tests in Telugu have been developed to determine the communication problems of individuals having a flat frequency hearing loss. These conventional speech audiometric tests would provide redundant information when used on individuals with high-frequency sloping hearing loss because of better hearing sensitivity in the low- and mid-frequency regions. Hence, conventional speech identification tests do not indicate the true nature of the communication problem of individuals with high-frequency sloping hearing loss. It is highly possible that a person with a high-frequency sloping hearing loss may get maximum scores if conventional speech identification tests are used. Hence, there is a need to develop speech identification test materials that are specifically designed to assess the speech identification performance of individuals with high-frequency sloping hearing loss. The present study aimed to develop a speech identification test for individuals with high-frequency sloping hearing loss in Telugu. Individuals with high-frequency sloping hearing loss have difficulty in perception of voiceless consonants whose spectral energy is above 1000 Hz. Hence, the word lists constructed with phonemes having mid- and high-frequency spectral energy will estimate speech identification performance better for such individuals. The phonemes /k/, /g/, /c/, /ṭ/, /t/, /p/, /s/, /ś/, /ṣ/ and /h/ are preferred for the construction of words as these phonemes have spectral energy distributed predominantly in the frequencies above 1000 Hz. The present study developed two word lists in Telugu (each word list contained 25 words) for evaluating speech identification performance of individuals with high-frequency sloping hearing loss.
The performance of individuals with high-frequency sloping hearing loss was evaluated using both conventional and high-frequency word lists under recorded voice condition. The results revealed that the developed word lists were found to be more sensitive in identifying the true nature of the communication problem of individuals with high-frequency sloping hearing loss.

Keywords: speech identification test, high-frequency sloping hearing loss, recorded voice condition, Telugu

Procedia PDF Downloads 394

1633 A Corpus-Based Contrastive Analysis of Directive Speech Act Verbs in English and Chinese Legal Texts

Authors: Wujian Han

Abstract:

In the process of human interaction and communication, speech act verbs are considered to be the most active component and the main means for information transmission, and are also taken as an indication of the structure of linguistic behavior. The theoretical value and practical significance of such everyday built-in metalanguage have long been recognized. This paper, which is part of a bigger study, aims to provide useful insights for a more precise and systematic approach to translating speech act verbs between English and Chinese, especially with regard to the degree to which generic integrity is maintained in the practice of translation of legal documents. In this study, the corpus, i.e. Chinese legal texts and their English translations, English legal texts, ordinary Chinese texts, and ordinary English texts, serves as a testing ground for examining contrastively the usage of English and Chinese directive speech act verbs in the legal genre. The scope of this paper is relatively wide and essentially covers all directive speech act verbs which are used in ordinary English and Chinese, such as order, command, request, prohibit, threaten, advise, warn and permit. The researcher, by combining the corpus methodology with a contrastive perspective, explored a range of characteristics of English and Chinese directive speech act verbs including their semantic, syntactic and pragmatic features, and then contrasted them in a structured way. It has been found that there are similarities between English and Chinese directive speech act verbs in the legal genre, such as similar semantic components between English speech act verbs and their translation equivalents in Chinese, and formal and accurate usage of English and Chinese directive speech act verbs in legal contexts. However, notable differences have been identified between their usage in the original Chinese and English legal texts, such as valency patterns and frequency of occurrence.
For example, the subjects of some directive speech act verbs are very frequently omitted in Chinese legal texts, but this is not the case in English legal texts. One of the practicable methods to achieve adequacy and conciseness in speech act verb translation from Chinese into English in legal genre is to repeat the subjects or the message with discrepancy, and vice versa. In addition, translation effects such as overuse and underuse of certain directive speech act verbs are also found in the translated English texts compared to the original English texts. Legal texts constitute a particularly valuable material for speech act verb study. Building up such a contrastive picture of the Chinese and English speech act verbs in legal language would yield results of value and interest to legal translators and students of language for legal purposes and have practical application to legal translation between English and Chinese.

Keywords: contrastive analysis, corpus-based, directive speech act verbs, legal texts, translation between English and Chinese

Procedia PDF Downloads 452

1632 Subband Coding and Glottal Closure Instant (GCI) Using SEDREAMS Algorithm

Authors: Harisudha Kuresan, Dhanalakshmi Samiappan, T. Rama Rao

Abstract:

In modern telecommunication applications, locating Glottal Closure Instants (GCIs) is important, and they are evaluated directly from the speech waveform. Here, we study GCI detection using the Speech Event Detection using the Residual Excitation And a Mean-based Signal (SEDREAMS) algorithm. Speech coding uses parameter estimation based on audio signal processing techniques to model the speech signal, combined with generic data compression algorithms to represent the resulting modeled parameters in a compact bit stream. This paper proposes a sub-band coder (SBC), which is a type of transform coding, and evaluates its performance for GCI detection using SEDREAMS. In an SBC, the speech signal is divided into two or more frequency bands and each of these sub-band signals is coded individually. The sub-bands after being processed are recombined to form the output signal, whose bandwidth covers the whole frequency spectrum. Then the signal is decomposed into low- and high-frequency components, and decimation and interpolation in the frequency domain are performed. The proposed structure significantly reduces error, and precise locations of GCIs are found using the SEDREAMS algorithm.
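The band-splitting idea described in this abstract can be illustrated with a minimal two-band sketch. It assumes the simplest possible filter pair (Haar sum and difference filters) and a uniform quantizer per band; the abstract does not state which filters or quantizers the authors use, so the function names and step sizes below are purely illustrative.

```python
import math

def analyze(x):
    """Split an even-length signal into decimated low/high bands (Haar filters)."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def quantize(band, step):
    """Code one band on its own with a uniform quantizer (assumed step size)."""
    return [round(v / step) * step for v in band]

def synthesize(low, high):
    """Interpolate and recombine the two bands into a full-band signal."""
    s = math.sqrt(2.0)
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / s)
        out.append((l - h) / s)
    return out

# A slow sine stands in for a speech frame (illustrative input only).
signal = [math.sin(2 * math.pi * 0.05 * n) for n in range(64)]
low, high = analyze(signal)

# Without quantization the two bands reconstruct the input exactly.
exact = synthesize(low, high)
# Each band gets its own step: finer for the low band, coarser for the high band.
coded = synthesize(quantize(low, 0.05), quantize(high, 0.2))

max_exact_error = max(abs(a - b) for a, b in zip(signal, exact))
max_coded_error = max(abs(a - b) for a, b in zip(signal, coded))
```

Each band holds half as many samples as the input, so the total sample count is unchanged, and the per-band step sizes are where a coder can spend fewer bits on the band that matters less.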

Keywords: SEDREAMS, GCI, SBC, GOI

Procedia PDF Downloads 327

1631 Tuning of Kalman Filter Using Genetic Algorithm

Authors: Hesham Abdin, Mohamed Zakaria, Talaat Abd-Elmonaem, Alaa El-Din Sayed Hafez

Abstract:

The Kalman filter algorithm is an estimator known as the workhorse of estimation. It has an important application in missile guidance, especially when accurate data on the target is lacking due to noise or uncertainty. In this paper, a Kalman filter is used as a tracking filter in a simulated target-interceptor scenario with noise. It estimates the position, velocity, and acceleration of the target in the presence of noise. These estimates are needed for both proportional navigation and differential geometry guidance laws. A Kalman filter performs well at low noise, but large noise causes considerable errors that lead to performance degradation. Therefore, a new technique is required to overcome this defect, using tuning factors to adapt the Kalman filter to increasing noise. The values of the tuning factors are between 0.8 and 1.2; they have a specific value for the first half of the range and a different value for the second half, and they are multiplied by the estimated values. These factors have optimum values and are altered with the change of the target heading. A genetic algorithm updates these selections to increase the maximum effective range, which was previously reduced by noise. The results show that the selected factors have other benefits, such as decreasing the minimum effective range that was increased earlier due to noise. In addition, the selected factors decrease the miss distance for all ranges in this direction of the target and expand the effective range, which leads to an increased probability of kill.
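As a rough illustration of where a multiplicative tuning factor could enter, the sketch below runs a scalar Kalman filter on noisy measurements of a constant and scales the reported estimate. This is an assumption-laden simplification: the paper tracks position, velocity and acceleration and selects its factors with a genetic algorithm, neither of which is reproduced here, and the `factor` parameter is a hypothetical stand-in for the tuning factors.

```python
import random

def scalar_kalman(measurements, meas_var, process_var=0.0, factor=1.0):
    """Track a scalar state and return the reported estimates.

    `factor` is a hypothetical tuning multiplier applied to each reported
    estimate; 1.0 leaves the standard Kalman filter output unchanged.
    """
    x, p = measurements[0], meas_var   # initialise from the first sample
    reported = [factor * x]
    for z in measurements[1:]:
        p = p + process_var            # predict: uncertainty grows
        k = p / (p + meas_var)         # Kalman gain
        x = x + k * (z - x)            # update with the innovation
        p = (1.0 - k) * p              # posterior variance
        reported.append(factor * x)
    return reported

rng = random.Random(0)                 # fixed seed for repeatability
true_value = 10.0
noisy = [true_value + rng.gauss(0.0, 2.0) for _ in range(200)]

plain = scalar_kalman(noisy, meas_var=4.0)
tuned = scalar_kalman(noisy, meas_var=4.0, factor=1.05)
```

With `factor=1.0` the filter settles close to the true value; any other factor deliberately biases the reported estimate, which is why a search procedure would be needed to pick values that help rather than hurt in a given scenario.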

Keywords: proportional navigation, differential geometry, Kalman filter, genetic algorithm

Procedia PDF Downloads 484

1630 Google Translate: AI Application

Authors: Shaima Almalhan, Lubna Shukri, Miriam Talal, Safaa Teskieh

Abstract:

Since artificial intelligence is a rapidly evolving topic that has had a significant impact on technical growth and innovation, this paper examines people's awareness, use, and engagement with the Google Translate application. To see how familiar users are with the app and its features, quantitative and qualitative research was conducted. The findings revealed that consumers have a high level of confidence in the application, how far they benefit from this sort of innovation, and how convenient it makes communication.

Keywords: artificial intelligence, google translate, speech recognition, language translation, camera translation, speech to text, text to speech

Procedia PDF Downloads 123

1629 Design of Aesthetic Acoustic Metamaterials Window Panel Based on Sierpiński Fractal Triangle for Sound-Silencing with Free Airflow

Authors: Sanjeet Kumar Singh, Shantanu Bhatacharya

Abstract:

The design of a high-efficiency, low-frequency (<1000 Hz) soundproof window or wall absorber that is transparent to airflow is presented. Due to the massive rise in human population and modernization, environmental noise has risen significantly worldwide. Prolonged noise exposure can cause severe physiological and psychological symptoms like nausea, headaches, fatigue, and insomnia. There has been continuous growth in building construction and infrastructure like offices, bus stops, and airports due to the urban population. Generally, a ventilated window is used for getting fresh air into the room, but at the same time, unwanted noise comes along. Researchers have used traditional approaches like noise barrier mats in front of the window or designed the entire window using sound-absorbing materials. However, these solutions are not aesthetically pleasing, and at the same time, they are heavy and not adequate for low-frequency noise shielding. To address this challenge, we design a transparent hexagonal panel based on the Sierpiński fractal triangle, which is aesthetically pleasing and demonstrates a normal incident sound absorption coefficient of more than 0.96 around 700 Hz and transmission loss of around 23 dB while maintaining air circulation through the triangular cutout. Next, we present a concept for fabricating large acoustic panels for large-scale applications, which could help suppress urban noise pollution.

Keywords: acoustic metamaterials, ventilation, urban noise pollution, noise control

Procedia PDF Downloads 85

1628 ML-Based Blind Frequency Offset Estimation Schemes for OFDM Systems in Non-Gaussian Noise Environments

Authors: Keunhong Chae, Seokho Yoon

Abstract:

This paper proposes frequency offset (FO) estimation schemes robust to the non-Gaussian noise for orthogonal frequency division multiplexing (OFDM) systems. A maximum-likelihood (ML) scheme and a low-complexity estimation scheme are proposed by applying the probability density function of the cyclic prefix of OFDM symbols to the ML criterion. From simulation results, it is confirmed that the proposed schemes offer a significant FO estimation performance improvement over the conventional estimation scheme in non-Gaussian noise environments.
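For orientation, the sketch below shows the well-known cyclic-prefix correlation estimator, which is the ML frequency offset estimate under Gaussian noise once symbol timing is known. It is offered as an assumed example of a conventional baseline, not as the non-Gaussian schemes this paper proposes; the symbol is random time-domain data rather than the output of an IFFT, and all names are illustrative.

```python
import cmath
import math
import random

def add_cyclic_prefix(symbol, cp_len):
    """Prepend the last cp_len samples of the symbol as its cyclic prefix."""
    return symbol[-cp_len:] + symbol

def apply_frequency_offset(samples, eps, n_fft):
    """Rotate sample k by 2*pi*eps*k/n_fft (eps is normalised to the subcarrier spacing)."""
    return [s * cmath.exp(2j * math.pi * eps * k / n_fft)
            for k, s in enumerate(samples)]

def estimate_offset(received, n_fft, cp_len):
    """Correlate the prefix with its copy n_fft samples later and read the phase.

    The two copies differ only by a rotation of 2*pi*eps, so the estimate is
    unambiguous for |eps| < 0.5.
    """
    corr = sum(received[k].conjugate() * received[k + n_fft]
               for k in range(cp_len))
    return cmath.phase(corr) / (2 * math.pi)

rng = random.Random(1)                 # fixed seed for repeatability
N_FFT, CP_LEN = 64, 16                 # assumed sizes, chosen for illustration
symbol = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N_FFT)]

tx = add_cyclic_prefix(symbol, CP_LEN)
rx = apply_frequency_offset(tx, 0.1, N_FFT)   # noiseless channel
estimated = estimate_offset(rx, N_FFT, CP_LEN)
```

In this noiseless setting the estimate recovers the applied offset up to floating-point error. Heavy-tailed noise can let a few large samples dominate the correlation sum, which is the kind of weakness a non-Gaussian design would aim to address.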

Keywords: frequency offset, cyclic prefix, maximum-likelihood, non-Gaussian noise, OFDM

Procedia PDF Downloads 448

1627 Effect of Noise Reducing Headphones on the Short-Term Memory Recall of College Students

Authors: Gregory W. Smith, Paul J. Riccomini

Abstract:

The goal of this empirical inquiry is to explore the effect of noise reducing headphones on the short-term memory recall of college students. Immediately following the presentation (via PowerPoint) of 12 unrelated and randomly selected one- and two-syllable words, students were asked to recall as many words as possible. Using a linear model with conditions marked with binary indicators, we examined the frequency and accuracy of words that were recalled. The findings indicate that for some students, a reduction of noise has a significant positive impact on their ability to recall information. As classrooms become more aurally distracting due to the implementation of cooperative learning activities, these findings highlight the need for a quiet learning environment for some learners.
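A linear model with a binary condition indicator can be sketched as below. The recall counts are made up purely for illustration and are not the study's data; the variable names are likewise hypothetical, and the study's actual model may include more indicators than the single one shown.

```python
def fit_indicator_model(scores, indicator):
    """Ordinary least squares fit of score = intercept + effect * indicator.

    With a 0/1 indicator, `intercept` is the mean of the 0 group and
    `effect` is the difference between the two group means.
    """
    n = len(scores)
    mean_x = sum(indicator) / n
    mean_y = sum(scores) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(indicator, scores))
    sxx = sum((x - mean_x) ** 2 for x in indicator)
    effect = sxy / sxx
    intercept = mean_y - effect * mean_x
    return intercept, effect

# Made-up recall counts (words out of 12), for illustration only.
scores = [5, 6, 7, 6, 8, 9, 7, 8]
wore_headphones = [0, 0, 0, 0, 1, 1, 1, 1]   # 1 = noise-reducing headphones

intercept, effect = fit_indicator_model(scores, wore_headphones)
```

Coding the condition as 0 or 1 makes the slope directly interpretable as the change in mean recall associated with the headphone condition.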

Keywords: auditory distraction, education, instruction, noise, working memory

Procedia PDF Downloads 299

1626 Recognition by the Voice and Speech Features of the Emotional State of Children by Adults and Automatically

Authors: Elena E. Lyakso, Olga V. Frolova, Yuri N. Matveev, Aleksey S. Grigorev, Alexander S. Nikolaev, Viktor A. Gorodnyi

Abstract:

The study of the children’s emotional sphere depending on age and psychoneurological state is of great importance for the design of educational programs for children and their social adaptation. Atypical development may be accompanied by violations or specificities of the emotional sphere. To study how the emotional state is reflected in the voice and speech features of children, a perceptual study with the participation of adults and automatic recognition of speech were conducted. Speech of children with typical development (TD), with Down syndrome (DS), and with autism spectrum disorders (ASD) aged 6-12 years was recorded. To obtain emotional speech in children, model situations were created, including a dialogue between the child and the experimenter containing questions that can cause various emotional states in the child and playing with a standard set of toys. The questions and toys were selected taking into account the child’s age, developmental characteristics, and speech skills. For the perceptual experiment by adults, test sequences containing speech material of 30 children (TD, DS, and ASD) were created. The listeners were 100 adults (age 19.3 ± 2.3 years). The listeners were tasked with determining the children’s emotional state as “comfort – neutral – discomfort” while listening to the test material. Spectrographic analysis of speech signals was conducted. For automatic recognition of the emotional state, 6594 speech files containing speech material of children were prepared. Automatic recognition of three states, “comfort – neutral – discomfort,” was performed using automatically extracted acoustic features from two sets: the Geneva Minimalistic Acoustic Parameter Set (GeMAPS) and the extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS). The results showed that the emotional state was determined less accurately from the speech of TD children (comfort – 58% of correct answers, discomfort – 56%).
Listeners better recognized discomfort in children with ASD and DS (78% of answers) than comfort (70% and 67%, respectively, for children with DS and ASD). The neutral state is better recognized by the speech of children with ASD (67%) than by the speech of children with DS (52%) and TD children (54%). According to the automatic recognition data using the acoustic feature set GeMAPSv01b, the accuracy of automatic recognition of emotional states for children with ASD is 0.687; children with DS – 0.725; TD children – 0.641. When using the acoustic feature set eGeMAPSv01b, the accuracy of automatic recognition of emotional states for children with ASD is 0.671; children with DS – 0.717; TD children – 0.631. The use of different models showed similar results, with better recognition of emotional states by the speech of children with DS than by the speech of children with ASD. The state of comfort is automatically determined better by the speech of TD children (precision – 0.546) and children with ASD (0.523), discomfort – children with DS (0.504). The data on the specificities of recognition by adults of the children’s emotional state by their speech may be used in recruitment for working with children with atypical development. Automatic recognition data can be used to create alternative communication systems and automatic human-computer interfaces for social-emotional learning. Acknowledgment: This work was financially supported by the Russian Science Foundation (project 18-18-00063).

Keywords: autism spectrum disorders, automatic recognition of speech, child’s emotional speech, Down syndrome, perceptual experiment

Procedia PDF Downloads 161

1625 YOLO-IR: Infrared Small Object Detection in High Noise Images

Authors: Yufeng Li, Yinan Ma, Jing Wu, Chengnian Long

Abstract:

Infrared object detection aims at separating small and dim targets from cluttered backgrounds, and its capabilities extend beyond the limits of visible light, making it invaluable in a wide range of applications, such as improving safety, security, efficiency, and functionality. However, existing methods are usually sensitive to the noise of the input infrared image, leading to a decrease in target detection accuracy and an increase in the false alarm rate in high-noise environments. To address this issue, an infrared small target detection algorithm called YOLO-IR is proposed in this paper to improve the robustness to high infrared noise. To address the problem that high noise significantly reduces the clarity and reliability of target features in infrared images, we design a soft-threshold coordinate attention mechanism to improve the model’s ability to extract target features and its robustness to noise. Since the noise may overwhelm the local details of the target, resulting in the loss of small target features during depth down-sampling, we propose a deep and shallow feature fusion neck to improve the detection accuracy. In addition, because the generalized Intersection over Union (IoU)-based loss functions may be sensitive to noise and lead to unstable training in high-noise environments, we introduce a Wasserstein-distance based loss function to improve the training of the model. The experimental results show that YOLO-IR achieves a 5.0% improvement in recall and a 6.6% improvement in the F1 score over the existing state-of-the-art model.
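The motivation for a Wasserstein-distance loss can be seen in a small sketch. It assumes the common formulation in which each box (cx, cy, w, h) is modelled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4); the abstract does not say whether YOLO-IR uses exactly this form, and the `scale` constant is an illustrative choice.

```python
import math

def iou(a, b):
    """Intersection over Union of two boxes given as (cx, cy, w, h)."""
    ax0, ay0 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax1, ay1 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx0, by0 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx1, by1 = b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union

def wasserstein_distance(a, b):
    """2-Wasserstein distance between the Gaussian models of two boxes."""
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
                     + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)

def wasserstein_similarity(a, b, scale=10.0):
    """Map the distance into (0, 1]; `scale` is an assumed normalising constant."""
    return math.exp(-wasserstein_distance(a, b) / scale)

target = (10.0, 10.0, 4.0, 4.0)   # small ground-truth box
near = (15.0, 10.0, 4.0, 4.0)     # prediction that just misses the target
far = (30.0, 10.0, 4.0, 4.0)      # prediction that is far away

near_score = wasserstein_similarity(target, near)
far_score = wasserstein_similarity(target, far)
```

Both predictions have zero IoU with the target, so an IoU-based score cannot tell them apart, whereas the Wasserstein-based similarity still ranks the nearer prediction higher. For very small targets, where a few pixels of shift removes all overlap, this keeps the training signal informative.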

Keywords: infrared small target detection, high noise, robustness, soft-threshold coordinate attention, feature fusion

Procedia PDF Downloads 27

1624 Compensatory Articulation of Pressure Consonants in Telugu Cleft Palate Speech: A Spectrographic Analysis

Authors: Indira Kothalanka

Abstract:

For individuals born with a cleft palate (CP), there is no separation between the nasal cavity and the oral cavity, due to which they cannot build up enough air pressure in the mouth for speech. Therefore, it is common for them to have speech problems. Common cleft type speech errors include abnormal articulation (compensatory or obligatory) and abnormal resonance (hyper, hypo and mixed nasality). These are generally resolved after palate repair. However, in some individuals, articulation problems do persist even after the palate repair. Such individuals develop variant articulations in an attempt to compensate for the inability to produce the target phonemes. A spectrographic analysis is used to investigate the compensatory articulatory behaviours of pressure consonants in the speech of 10 Telugu-speaking individuals aged 7 to 17 years with a history of cleft palate. Telugu is a Dravidian language which is spoken in the states of Andhra Pradesh and Telangana in India. It is the language with the third largest number of native speakers in India and the most spoken Dravidian language. The speech of the informants is analysed using a single-word list, sentences, a passage and conversation. Spectrographic analysis is carried out using PRAAT, a speech analysis software package. The place and manner of articulation of consonant sounds are studied through spectrograms with the help of various acoustic cues. The types of compensatory articulation identified are glottal stops, palatal stops, uvular, velar stops and nasal fricatives which are non-native in Telugu.

Keywords: cleft palate, compensatory articulation, spectrographic analysis, PRAAT

Procedia PDF Downloads 419

1623 Experimental Analysis of Structure Borne Noise in an Enclosure

Authors: Waziralilah N. Fathiah, A. Aminudin, U. Alyaa Hashim, T. Vikneshvaran D. Shakirah Shukor

Abstract:

This paper presents the experimental analysis conducted on structure-borne noise in a rectangular enclosure prototype made by joining sheet aluminum metal and plywood. The study is significant as many do not realize the annoyance caused by structure-borne noise. In this study, modal analysis is carried out to examine the structure’s behaviour in order to identify the characteristics of the enclosure in the frequency domain ranging from 0 Hz to 200 Hz. Here, a number of modes are identified and the characteristics of the mode shapes are categorized. A modal experiment is used to diagnose the structural behaviour while a microphone is used to diagnose the sound. Spectral testing is performed on the enclosure. It is acoustically excited using a shaker and, as it vibrates, the vibrational and noise responses sensed by tri-axis accelerometer and microphone sensors, respectively, are recorded. Experimental work is performed on each node lying on the gridded surface of the enclosure. Both experimental measurements are carried out simultaneously. The experimental results for the modes are validated by simulation performed using MSC Nastran software. To reduce the structure-borne noise, a mitigation method is used whereby stiffener plates are placed perpendicularly on the sheet aluminum metal. By using this method, a reduction in structure-borne noise is successfully achieved at the end of the study.

Keywords: enclosure, modal analysis, sound analysis, structure borne-noise

Procedia PDF Downloads 401

1622 Reduction of Speckle Noise in Echocardiographic Images: A Survey

Authors: Fathi Kallel, Saida Khachira, Mohamed Ben Slima, Ahmed Ben Hamida

Abstract:

Speckle noise is a main characteristic of cardiac ultrasound images; it corresponds to a grainy appearance that degrades the image quality. For this reason, ultrasound images are difficult to use automatically in clinical practice, and processing is required for this type of image. A filtering procedure is therefore necessary to eliminate the speckle noise and to improve the quality of the ultrasound images, which are then segmented to extract the structures they contain. In this paper, we present the importance of the pre-processing step for segmentation. This work is applied to cardiac ultrasound images. In a first step, a comparative study of speckle filtering methods is presented, and then we use a segmentation algorithm to locate and extract cardiac structures.

Keywords: medical image processing, ultrasound images, Speckle noise, image enhancement, speckle filtering, segmentation, snakes

Procedia PDF Downloads 499

1621 Study of Effect of Gear Tooth Accuracy on Transmission Mount Vibration

Authors: Kalyan Deepak Kolla, Ketan Paua, Rajkumar Bhagate

Abstract:

Transmission dynamics play a major role in customer perception of the product, in terms of both touch and quality of sound. The quantity and quality of sound perceived is mostly related to the whine noise of the engaged gears. Whine noise is tonal in nature, and tonal noises cause fatigue and irritation to customers, which in turn affects the quality of the product. Transmission error is the usual suspect for whine noise, and it can be caused by misalignments, tolerances, and manufacturing variability. In-cabin noise is also more sensitive to the gear design. As the details of the gear tooth design and manufacturing are in microns, anything out of the tolerance zone, either in design or manufacturing, will cause a whine noise. This will also cause high variation in stress and deformation due to changes in the load, leading to fatigue failure of the gears. Hence, gear design and development take priority in the transmission development process. This paper aims to study such variability by considering five pairs of helical spur gears and their effect on the transmission error, contact pattern and vibration level on the transmission.

Keywords: gears, whine noise, manufacturing variability, mount vibration variability

Procedia PDF Downloads 127

1620 Virtual Reality Based 3D Video Games and Speech-Lip Synchronization Superseding Algebraic Code Excited Linear Prediction

Authors: P. S. Jagadeesh Kumar, S. Meenakshi Sundaram, Wenli Hu, Yang Yung

Abstract:

In 3D video games, the dominance of production is unceasingly growing with a protruding level of affordability in terms of budget. The automation of speech-lip synchronization is customarily onerous and has become a critical research subject in virtual reality based 3D video games. This paper presents one such automatic tool, focused precisely on the synchronization of the speech and the lip movement of the game characters. A robust and precise speech recognition segment, systematized with the Algebraic Code Excited Linear Prediction (ACELP) method, is developed, which unconventionally delivers lip sync results. The ACELP algorithm is based on that used in code-excited linear prediction, but ACELP codebooks have an explicit algebraic structure imposed upon them. This affords a quicker alternative to software implementations of lip sync algorithms and thus improves quality-of-service factors and reduces production cost.

Keywords: algebraic code excited linear prediction, speech-lip synchronization, video games, virtual reality

Procedia PDF Downloads 442

1619 Auditory Effects among 18-45 Years Old Workers of a Textile Plant in Seeduwa, Sri Lanka

Authors: P. G. S. Madushani, L. D. Illeperuma

Abstract:

Noise is one of the most common physical hazards in industrial settings. The prevalence of Noise Induced Hearing Loss (NIHL) is on the rise with increased duration of exposure and the increase in the severity of hearing loss. The purpose of the study was to determine auditory effects among textile workers and to establish associations between the degree of hearing loss and exposure duration, degree of hearing loss and noise level and the proportion of hearing related complaints. A cross sectional descriptive study using purposive sampling was carried out. An interviewer administered questionnaire and Distortion Product Oto Acoustic Emission (DPOAE) hearing screening on 127 (72 female and 55 male) textile workers of the selected textile plant in Seeduwa, Sri Lanka was done (Age: M = 31.16, SD = 7.75). Noise measurements were done in six sections of the factory and average noise levels were obtained. Diagnostic hearing evaluations were done for 60 (57.75%) subjects, referred from the DPOAE hearing screening test. The degree of hearing loss and the exposure duration had a significant association in the high frequency region of 4 kHz to 8 kHz (p < 0.05). Noise levels fluctuated between 90.3 ± 0.8 dBA and 50.6 ± 0.52 dBA. 30.83% of workers reported having NIHL. Most of the workers (33.9%) complained of difficulty in conversing in noisy backgrounds. Other complaints such as tinnitus, dizziness, ear fullness and headache were reported in less than 30%. Workers who were exposed to noise for more than 15 years were affected with NIHL in the high frequency region. Administrative controls and engineering controls need to be implemented to manage hazardous noise levels in industrial settings. Hearing Conservation Programs should be initiated and implemented for textile workers.

Keywords: textile industry, NIHL, degree of hearing loss, noise levels, auditory effects

Procedia PDF Downloads 112

1618 Estimation of Endogenous Brain Noise from Brain Response to Flickering Visual Stimulation Magnetoencephalography Visual Perception Speed

Authors: Alexander N. Pisarchik, Parth Chholak

Abstract:

Intrinsic brain noise was estimated via magneto-encephalograms (MEG) recorded during perception of flickering visual stimuli with frequencies of 6.67 and 8.57 Hz. First, we measured the mean phase difference between the flicker signal and steady-state event-related field (SSERF) in the occipital area where the brain response at the flicker frequencies and their harmonics appeared in the power spectrum. Then, we calculated the probability distribution of the phase fluctuations in the regions of frequency locking and computed its kurtosis. Since kurtosis is a measure of the distribution’s sharpness, we suppose that inverse kurtosis is related to intrinsic brain noise. In our experiments, the kurtosis value varied among subjects from K = 3 to K = 5 for 6.67 Hz and from 2.6 to 4 for 8.57 Hz. The majority of subjects demonstrated a leptokurtic distribution (K > 3), i.e., the distribution tails approached zero more slowly than Gaussian. In addition, we found a strong correlation between kurtosis and brain complexity measured as the correlation dimension, so that the MEGs of subjects with higher kurtosis exhibited lower complexity. The obtained results are discussed in the framework of nonlinear dynamics and complex network theories. Specifically, in a network of coupled oscillators, phase synchronization is mainly determined by two antagonistic factors: noise and the coupling strength. While noise worsens phase synchronization, the coupling improves it. If we assume that each neuron and each synapse contribute to brain noise, the larger neuronal network should have stronger noise, and therefore phase synchronization should be worse, which results in smaller kurtosis. The described method for brain noise estimation can be useful for diagnostics of some brain pathologies associated with abnormal brain noise.
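The kurtosis computation at the core of this method can be sketched as follows, using the convention in which a Gaussian distribution has K = 3. The sample lists are synthetic and chosen so the result can be checked by hand; `noise_index` is a hypothetical name for the inverse-kurtosis quantity the abstract relates to brain noise, and phase wrapping is ignored on the assumption that fluctuations stay small.

```python
def kurtosis(samples):
    """Pearson kurtosis m4 / m2**2 (a Gaussian distribution gives 3)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((v - mean) ** 2 for v in samples) / n
    m4 = sum((v - mean) ** 4 for v in samples) / n
    return m4 / (m2 ** 2)

def noise_index(samples):
    """Hypothetical noise proxy: the inverse of the kurtosis."""
    return 1.0 / kurtosis(samples)

# Broad, flat spread of phase differences: the lowest possible kurtosis.
broad = [-1.0, 1.0] * 50
# Tightly locked phase with two rare excursions: a sharp peak with heavy tails.
sharp = [0.0] * 98 + [10.0, -10.0]

k_broad = kurtosis(broad)   # 1.0
k_sharp = kurtosis(sharp)   # 50.0
```

The sharply peaked sample has the higher kurtosis and therefore the lower noise index, matching the abstract's reasoning that stronger noise broadens the phase distribution and lowers its kurtosis.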

Keywords: brain, flickering, magnetoencephalography, MEG, visual perception, perception time

Procedia PDF Downloads 113

1617 A Cross-Dialect Statistical Analysis of Final Declarative Intonation in Tuvinian

Authors: D. Beziakina, E. Bulgakova

Abstract:

This study continues the research on Tuvinian intonation and presents a general cross-dialect analysis of intonation of Tuvinian declarative utterances, specifically the character of the tone movement in order to test the hypothesis about the prevalence of level tone in some Tuvinian dialects. The results of the analysis of basic pitch characteristics of Tuvinian speech (in general and in comparison with two other Turkic languages - Uzbek and Azerbaijani) are also given in this paper. The goal of our work was to obtain the ranges of pitch parameter values typical for Tuvinian speech. Such language-specific values can be used in speaker identification systems in order to get more accurate results of ethnic speech analysis. We also present the results of a cross-dialect analysis of declarative intonation in the poorly studied Tuvinian language.

Keywords: speech analysis, statistical analysis, speaker recognition, identification of person

Procedia PDF Downloads 443

1616 A Profile of the Patients at the Hearing and Speech Clinic at the University of Jordan: A Retrospective Study

Authors: Maisa Haj-Tas, Jehad Alaraifi

Abstract:

The significance of the study: This retrospective study examined the speech and language profiles of patients who received clinical services at the University of Jordan Hearing and Speech Clinic (UJ-HSC) from 2009 to 2014. The UJ-HSC is located in the capital, Amman, and was established in the late 1990s. It is the first hearing and speech clinic in Jordan and one of the first speech and hearing clinics in the Middle East. This clinic provides services to an annual average of 2000 patients who are diagnosed with different communication disorders. Examining the speech and language profiles of patients in this clinic could provide insight into the most common disorders seen in patients who attend similar clinics in Jordan. It could also provide information about community awareness of the role of speech therapists in the management of speech and language disorders. Methodology: The researchers examined the clinical records of 1140 patients (797 males and 343 females) who received clinical services at the UJ-HSC between 2009 and 2014. The main variables examined in the study were disorder type and gender. Participants were divided into four age groups: children, adolescents, adults, and older adults. The examined disorders were classified as either speech disorders, language disorders, or dysphagia (i.e., swallowing problems). The disorders were further classified as childhood language impairments, articulation disorders, stuttering, cluttering, voice disorders, aphasia, and dysphagia. Results: The results indicated that the prevalence of language disorders was the highest (50.7%), followed by speech disorders (48.3%) and dysphagia (0.9%).
The majority of patients who were seen at the UJ-HSC were diagnosed with childhood language impairments (47.3%), followed in order by articulation disorders (21.1%), stuttering (16.3%), voice disorders (12.1%), aphasia (2.2%), dysphagia (0.9%), and cluttering (0.2%). As for gender, the majority of patients seen at the clinic were males in all disorders except for voice disorders and cluttering. Discussion: The results of the present study indicate that the majority of examined patients were diagnosed with childhood language impairments. Based on this result, the researchers suggest that there seems to be a high prevalence of childhood language impairments among children in Jordan compared to other types of speech and language disorders. The researchers also suggest that there is a need for further examination of the actual prevalence data on speech and language disorders in Jordan. The fact that many of the children seen at the UJ-HSC were brought to the clinic either as a result of parental concern or teacher referral indicates that there seems to be an increased awareness among parents and teachers of the services speech pathologists can provide in the assessment and treatment of childhood speech and language disorders. The small percentage of other disorders (i.e., stuttering, cluttering, dysphagia, aphasia, and voice disorders) seen at the UJ-HSC may indicate limited awareness in the local community of the role of speech pathologists in the assessment and treatment of these disorders.

Keywords: clinic, disorders, language, profile, speech

Procedia PDF Downloads 291

1615 Atomic Decomposition Audio Data Compression and Denoising Using Sparse Dictionary Feature Learning

Authors: T. Bryan, V. Kepuska, I. Kostnaic

Abstract:

A method of data compression and denoising is introduced that is based on atomic decomposition of audio data using “basis vectors” that are learned from the audio data itself. The basis vectors are shown to have higher data compression and better signal-to-noise enhancement than the Gabor and gammatone “seed atoms” that were used to generate them. The basis vectors are the input weights of a Sparse AutoEncoder (SAE) that is trained using “envelope samples” of windowed segments of the audio data. The envelope samples are extracted from the audio data by performing atomic decomposition with Gabor or gammatone seed atoms. This process identifies segments of audio data that are locally coherent with the seed atoms. Envelope samples are extracted by identifying locally coherent audio data segments with Gabor or gammatone seed atoms, found by matching pursuit. The envelope samples are formed by taking the Kronecker products of the atomic envelopes with the locally coherent data segments. Oracle signal-to-noise ratio (SNR) versus data compression curves are generated for the seed atoms as well as the basis vectors learned from Gabor and gammatone seed atoms. SNR data compression curves are generated for speech signals as well as early American music recordings. The basis vectors are shown to have higher denoising capability for data compression rates ranging from 90% to 99.84% for speech as well as music. Envelope samples are displayed as images by folding the time series into column vectors. This display method is used to compare the output of the SAE with the envelope samples that produced it. The basis vectors are also displayed as images. Sparsity is shown to play an important role in producing the basis vectors with the highest denoising capability.
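The matching pursuit step mentioned in this abstract can be sketched as follows. This is a generic textbook form of matching pursuit over a small dictionary of Gabor-like atoms, not the authors' implementation; the helper names (`gabor_atom`, `matching_pursuit`), the atom parameters, and the toy signal are all assumptions made for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def gabor_atom(length, center, width, freq):
    """Unit-norm Gaussian-windowed cosine (hypothetical parameterization)."""
    return normalize([
        math.exp(-0.5 * ((n - center) / width) ** 2)
        * math.cos(2.0 * math.pi * freq * (n - center))
        for n in range(length)
    ])


def matching_pursuit(signal, atoms, n_iter):
    """Greedily pick the atom most correlated with the residual each round."""
    residual = list(signal)
    picks = []  # list of (atom index, coefficient)
    for _ in range(n_iter):
        coeffs = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(coeffs[i]))
        c = coeffs[k]
        residual = [r - c * a for r, a in zip(residual, atoms[k])]
        picks.append((k, c))
    return picks, residual


# Toy dictionary and a signal built from one atom (assumed values).
atoms = [
    gabor_atom(64, 16, 4.0, 0.10),
    gabor_atom(64, 32, 4.0, 0.20),
    gabor_atom(64, 48, 4.0, 0.05),
]
signal = [2.0 * x for x in atoms[1]]
picks, residual = matching_pursuit(signal, atoms, n_iter=1)
print(picks[0], dot(residual, residual))
```

Because the atoms have unit norm, each iteration removes the projection of the residual onto the selected atom, so the residual energy never increases.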

Keywords: sparse dictionary learning, autoencoder, sparse autoencoder, basis vectors, atomic decomposition, envelope sampling, envelope samples, Gabor, gammatone, matching pursuit

Procedia PDF Downloads 228

1614 Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition

Authors: Jong Han Joo, Jung Hoon Lee, Young Sun Kim, Jae Young Kang, Seung Ho Choi

Abstract:

In this study, we propose a novel technique for acoustic echo suppression (AES) during speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the estimated echo path transfer function (EPTF) at the current signal segment and to the EPTF estimated until the previous time interval. We propose a new approach that adaptively updates weight parameters in response to abrupt changes in the acoustic environment due to background noises or double-talk. Furthermore, we devised a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using log-spectral distance measure, as well as cross-correlation coefficients. The experimental results show that the developed techniques can be successfully applied in barge-in speech recognition systems.
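The cross-correlation part of the initial time-delay estimate can be illustrated with a minimal sketch. It covers only a plain normalized cross-correlation search over integer lags and does not reproduce the log-spectral distance measure or the authors' estimator; the function names, the `max_lag` parameter, and the synthetic signals are assumptions.

```python
import math
import random


def normalized_correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(
        sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    )
    return num / den if den > 0.0 else 0.0


def estimate_delay(reference, observed, max_lag):
    """Return the lag (in samples) that maximizes the correlation."""
    best_lag, best_score = 0, -math.inf
    for lag in range(max_lag + 1):
        n = min(len(reference), len(observed) - lag)
        if n < 2:
            break
        score = normalized_correlation(reference[:n], observed[lag:lag + n])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Synthetic far-end signal and an echo delayed by 5 samples (assumed values).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]
observed = [0.0] * 5 + [0.6 * s for s in reference]
lag, score = estimate_delay(reference, observed, max_lag=20)
print(lag, score)
```

In a real barge-in setting the observed signal would also contain near-end speech and noise, so the correlation peak would be lower and less sharp than in this clean example.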

Keywords: acoustic echo suppression, barge-in, speech recognition, echo path transfer function, initial delay estimator, voice activity detector

Procedia PDF Downloads 347

1613 Role of Speech Articulation in English Language Learning

Authors: Khadija Rafi, Neha Jamil, Laiba Khalid, Meerub Nawaz, Mahwish Farooq

Abstract:

Speech articulation is the complex process of producing intelligible sounds through precise movements of various structures within the vocal tract. These structures are called articulators and comprise the lips, teeth, tongue, and palate. The articulators work together to produce a range of distinct phonemes, which form the basis of language. The process starts with the airstream from the lungs passing through the trachea and into the oral and nasal cavities. As the air passes through the mouth, the tongue and the muscles around it coordinate to create particular sounds. This can be seen when the tongue is placed in different positions, such as near the alveolar ridge, the soft palate, the roof of the mouth, or the back of the teeth, which gives each phoneme its unique quality. Vowels are articulated with an open vocal tract, with the height and position of the tongue differing from vowel to vowel, while consonants are produced by creating obstructions in the airflow. For instance, the sound ‘b’ is a plosive and can be produced only by briefly closing the lips. Articulation disorders can not only affect communication but can also be a hurdle in speech production. To improve the articulation skills of such individuals, doctors often recommend speech therapy, which involves various kinds of exercises such as jaw exercises and tongue twisters. These disorders are more common in children, who may experience developmental articulation issues from birth, whereas in adults they can be caused by injury, neurological conditions, or other speech-related disorders. In short, speech articulation is an essential aspect of productive communication, which requires coordination of specific articulators to produce the different intelligible sounds that are a vital part of spoken language.

Keywords: linguistics, speech articulation, speech therapy, language learning

Procedia PDF Downloads 34

1612 Hate Speech in Selected Nigerian Newspapers

Authors: Laurel Chikwado Madumere, Kevin O. Ugorji

Abstract:

A speech is said to be full of hate when it appropriates disparaging and vituperative locutions and/or appellations, which are riddled with prejudices and misconceptions about an antagonizing party on the grounds of gender, race, political orientation, religious affiliations, tribe, etc. Due largely to the dichotomies and polarities that exist in Nigeria across the political ideological spectrum, tribal affiliations, and gender contradistinctions, there are possibilities for the existence of socioeconomic, religious and political conditions that would induce, provoke and catalyze hate speeches in Nigeria’s mainstream media. Therefore, the aim of this paper is to investigate, using select daily newspapers in Nigeria, the extent and complexity of those likely hate speeches that emanate from the pluralism in Nigeria and to set into relief the discrepancies and contrariety in the interpretation of those hate words. To achieve this, the paper shall be qualitative in orientation, using the Speech Act Theory of J. L. Austin and J. R. Searle to interpret and evaluate the hate speeches in the select Nigerian daily newspapers. This paper shall also help to elucidate the conditions that generate hate, and inform the government and NGOs how best to approach those conditions and put an end to the possible violence and extremism that emanate from extreme cases of hate.

Keywords: extremism, gender, hate speech, pluralism, prejudice, speech act theory

Procedia PDF Downloads 122

1611 Integration of Acoustic Solutions for Classrooms

Authors: Eyibo Ebengeobong Eddie, Halil Zafer Alibaba

Abstract:

Classroom acoustics are neglected in most educational facilities, even though hearing and listening are central to the learning process in such facilities. A classroom should therefore be an environment that encourages listening, without obstacles to understanding what is being taught. Although different studies have shown that teachers complain that noise is the everyday factor that causes stress in the classroom, the capacity of individuals to understand speech is further affected by echoes, reverberation, and room modes. It is therefore necessary for classrooms to have ideal acoustics to aid speech intelligibility for students in the learning process. The influence of these acoustical parameters on learning and teaching in schools needs further research in order to enhance the teaching and learning capacity of both teachers and students. For this reason, there is a strong need to collect and analyse data to define the acoustic quality needed for a learning environment. Research has shown that acoustical problems are still experienced in both newer and older schools. Recently, however, the principles of acoustics have been analysed, and room acoustics can now be measured with various technologies and sound systems to address acoustic problems in classrooms. These acoustic solutions, materials, construction methods and integration processes are discussed in this paper.

Keywords: classroom, acoustics, materials, integration, speech intelligibility

Procedia PDF Downloads 392

1610 Diversity of Voices: Audio Visual Continuous Speech Recognition with Traditional Approach

Authors: Partha Protim Majumder, Sajeeb Das, Sharun Akter Khushbu

Abstract:

Bengali is widely spoken in the world, but Bengali speech recognition has not received much attention. The task in our study is particularly difficult because recognition must be performed in noisy environments. Another challenge we address is collecting speech data from third-gender speakers, and our approach is to recognize the gender of the speaker. All of the Bangla speech samples used in this study were short and were taken from real-life situations. We employed the male, female, and third-gender categories of speech. In this study, we derive the features from the spoken words. We used MFCC (1-20), ZCR, rolloff, spec_cen, RMSE, and chroma_stft. We used the algorithms Gboost, Random Forest, K-Nearest Neighbors (KNN), Decision Tree, Naive Bayes, and Logistic Regression (LR) to assess recognition performance, and Random Forest gave the highest performance in recognizing the gender of the speaker.
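Two of the listed features, the zero-crossing rate (ZCR) and the root-mean-square energy (RMSE), have simple frame-level definitions that can be sketched directly. The code below is an illustration under assumed conventions (sign changes divided by the number of adjacent sample pairs, non-overlapping normalization choices, hypothetical function names and frame sizes); it is not the authors' feature pipeline, and audio libraries may normalize these quantities differently.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    signs = [s >= 0.0 for s in frame]
    crossings = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root-mean-square amplitude of a frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def frame_features(signal, frame_len, hop):
    """Return one (ZCR, RMS) pair per full frame of the signal."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        features.append((zero_crossing_rate(frame), rms_energy(frame)))
    return features


# Toy signals at an assumed 8 kHz sample rate: a low and a high tone.
sample_rate = 8000
low = [math.sin(2.0 * math.pi * 100.0 * n / sample_rate) for n in range(800)]
high = [math.sin(2.0 * math.pi * 1000.0 * n / sample_rate) for n in range(800)]
low_zcr = frame_features(low, frame_len=400, hop=200)[0][0]
high_zcr = frame_features(high, frame_len=400, hop=200)[0][0]
print(low_zcr, high_zcr)
```

The higher-frequency tone crosses zero more often per frame, which is why ZCR is commonly used as a coarse indicator of spectral content alongside MFCCs.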

Keywords: MFCC, ZCR, Bengali, LR, RMSE, roll-off, Gboost

Procedia PDF Downloads 36

1609 Robust Adaptation to Background Noise in Multichannel C-OTDR Monitoring Systems

Authors: Andrey V. Timofeev, Viktor M. Denisov

Abstract:

A robust sequential nonparametric method is proposed for real-time adaptation to background noise parameters. The distribution of background noise was modelled as a Huber contamination mixture. The method is designed to operate as an adaptation unit inside the detection subsystem of an integrated multichannel monitoring system. The proposed method guarantees a given size of the nonasymptotic confidence set for the noise parameters. Properties of the suggested method are rigorously proved. The proposed algorithm has been successfully tested in real conditions of a functioning C-OTDR monitoring system, which was designed to monitor railways.

Keywords: guaranteed estimation, multichannel monitoring systems, non-asymptotic confidence set, contamination mixture

Procedia PDF Downloads 399

1608 Absence of Developmental Change in Epenthetic Vowel Duration in Japanese Speakers’ English

Authors: Takayuki Konishi, Kakeru Yazawa, Mariko Kondo

Abstract:

This study examines developmental change in the production of epenthetic vowels by Japanese learners of English in relation to acquisition of L2 English speech rhythm. Seventy-two Japanese learners of English in the J-AESOP corpus were divided into lower- and higher-level learners according to their proficiency score and the frequency of vowel epenthesis. Three learners were excluded because no vowel epenthesis was observed in their utterances. The analysis of their read English speech data showed no statistical difference between lower- and higher-level learners, implying the absence of any developmental change in durations of epenthetic vowels. This result, together with the findings of previous studies, will be discussed in relation to the transfer of L1 phonology and manifestation of L2 English rhythm.

Keywords: vowel epenthesis, Japanese learners of English, L2 speech corpus, speech rhythm

Procedia PDF Downloads 244

1607 Evaluation of Simulated Noise Levels through the Analysis of Temperature and Rainfall: A Case Study of Nairobi Central Business District

Authors: Emmanuel Yussuf, John Muthama, John Ng'ang'A

Abstract:

Noise levels have been increasing all over the world in the last decade. Many factors contribute to this increase, which is causing health-related effects in humans. Developing countries are not left out of the picture, as they are still growing and advancing their development. Motor vehicles are increasing on urban roads, infrastructure is expanding due to the rising population, and the number of industries providing goods is growing, among many other activities. All these activities lead to high noise levels in cities. This study was conducted in Nairobi’s Central Business District (CBD) with the main objective of simulating noise levels in order to understand the noise to which people within the urban area are exposed, in relation to the weather parameters temperature, rainfall and wind field. The study used the Neighbourhood Proximity Model and time series analysis, with data obtained from proxies and satellite remote sensing, to establish the noise levels to which people in Nairobi CBD are exposed. The findings showed an increase in temperature (0.1°C per year) and a decrease in precipitation (40 mm per year), while noise levels in the area are increasing. The study also found that the noise levels to which people in Nairobi CBD are exposed were roughly between 61 and 63 decibels and have been increasing; this level is high and likely to cause adverse physical and psychological effects on the human body, and air temperature, precipitation and wind contribute substantially to the spread of noise. As noise reduction measures, the study recommended the use of soundproof materials in buildings close to busy roads, strict laws targeting the largest noise sources, and further research on the topic.
The data used for this study ranged from the year 2000 to 2015, rainfall being in millimeters (mm), temperature in degrees Celsius (°C) and the urban form characteristics being in meters (m).

Keywords: simulation, noise exposure, weather, proxy

Procedia PDF Downloads 351

1606 Rough Neural Networks in Adapting Cellular Automata Rule for Reducing Image Noise

Authors: Yasser F. Hassan

Abstract:

The reduction or removal of noise in a color image is an essential part of image processing, whether the final information is used for human perception or for automatic inspection and analysis. This paper describes a modeling system that uses a rough neural network model to adapt cellular automata for various image processing tasks, including noise removal. We consider the problem of object processing in color images, using rough neural networks to help derive the rules that the cellular automata apply to noisy images. The proposed method is compared with some classical and recent methods. The results demonstrate that the new model is capable of being trained to perform many different tasks, and that the quality of its results is comparable to or better than that of established specialized algorithms.

Keywords: rough sets, rough neural networks, cellular automata, image processing

Procedia PDF Downloads 402

1605 Grammatical and Lexical Cohesion in the Japan’s Prime Minister Shinzo Abe’s Speech Text ‘Nihon wa Modottekimashita’

Authors: Nadya Inda Syartanti

Abstract:

This research aims to identify, classify, and analyze descriptively the aspects of grammatical and lexical cohesion in the speech text of Japan’s Prime Minister Shinzo Abe entitled Nihon wa Modotte kimashita, delivered in Washington DC, the United States, on February 23, 2013, which serves as the research data source. The method used is qualitative research, which uses verbal description and is applied by analyzing the aspects of grammatical and lexical cohesion proposed by Halliday and Hasan (1976). The aspects of grammatical cohesion consist of reference (personal, demonstrative, interrogative pronouns), substitution, ellipsis, and conjunction. In contrast, lexical cohesion consists of reiteration (repetition, synonym, antonym, hyponym, meronym) and collocation. Data classification is based on these six aspects of cohesion. Through these aspects of cohesion, this research tries to find out the frequency of use of grammatical and lexical cohesion in Shinzo Abe's speech text entitled Nihon wa Modotte kimashita. The results of this research are expected to help overcome the difficulty of understanding speech texts in Japanese. Therefore, this research can be a reference for learners, researchers, and anyone who is interested in the field of discourse analysis.

Keywords: cohesion, grammatical cohesion, lexical cohesion, speech text, Shinzo Abe

Procedia PDF Downloads 134