Search results for: speech noise reduction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6358

6088 Identifying Model to Predict Deterioration of Water Mains Using Robust Analysis

Authors: Go Bong Choi, Shin Je Lee, Sung Jin Yoo, Gibaek Lee, Jong Min Lee

Abstract:

In South Korea, it is difficult to obtain data for statistical pipe assessment. To address this issue, this paper examines how various previously published statistical models respond to data mixed with noise and whether they can be applied in South Korea. Three major types of model are studied. Where data are presented in the original paper, we add noise to the data and observe how the model response changes. Moreover, we generate data from the models in the papers and analyse the effect of noise. From this, we assess the robustness of each model and its applicability in Korea.

Keywords: proportional hazard model, survival model, water main deterioration, ecological sciences

Procedia PDF Downloads 705
6087 Learning from Small Amount of Medical Data with Noisy Labels: A Meta-Learning Approach

Authors: Gorkem Algan, Ilkay Ulusoy, Saban Gonul, Banu Turgut, Berker Bakbak

Abstract:

Computer vision systems recently made a big leap thanks to deep neural networks. However, these systems require large, correctly labeled datasets in order to be trained properly, and such datasets are very difficult to obtain for medical applications. Two main reasons for label noise in medical applications are the high complexity of the data and conflicting opinions of experts. Moreover, medical imaging datasets are commonly tiny, which makes each data point very important in learning. As a result, if not handled properly, label noise significantly degrades performance. Therefore, a label-noise-robust learning algorithm that makes use of the meta-learning paradigm is proposed in this article. The proposed solution is tested on a retinopathy of prematurity (ROP) dataset with a very high label noise of 68%. Results show that the proposed algorithm significantly improves the classification algorithm's performance in the presence of noisy labels.

Keywords: deep learning, label noise, robust learning, meta-learning, retinopathy of prematurity

Procedia PDF Downloads 130
6086 Weighted G2 Multi-Degree Reduction of Bezier Curves

Authors: Salisu Ibrahim, Abdalla Rababah

Abstract:

In this research, we use weighted G2 multi-degree reduction to reduce a Bezier curve of degree n to a Bezier curve of degree m, m < n. Degree reduction of Bezier curves is used to represent a given Bezier curve of degree n by a Bezier curve of degree m, m < n. Exact degree reduction is not possible, so degree reduction is by nature an approximate process. We derive a weighted degree-reducing method that is geometrically continuous at the end points. Different norms are considered, and several error minimizations are given. The proposed methods produce error functions that are smaller than the errors of existing methods.

Keywords: Bezier curves, multiple degree reduction, geometric continuity, error function

Procedia PDF Downloads 450
6085 Performance Analysis of VoIP Coders for Different Modulations Under Pervasive Environment

Authors: Jasbinder Singh, Harjit Pal Singh, S. A. Khan

Abstract:

This paper compares speech signals encoded by different narrow-band and wide-band VoIP codecs for different modulation schemes. The simulation results indicate that the codec has an impact on speech quality, which is also affected by the modulation scheme.

Keywords: VoIP, coders, modulations, BER, MOS

Procedia PDF Downloads 481
6084 The Evaluation of the Performance of Different Filtering Approaches in Tracking Problem and the Effect of Noise Variance

Authors: Mohammad Javad Mollakazemi, Farhad Asadi, Aref Ghafouri

Abstract:

The performance of different filtering approaches depends on the modeling of the dynamical system and on the algorithm structure. For modeling and smoothing the data, the evaluation of the posterior distribution in each filtering approach should be chosen carefully. In this paper, different filtering approaches, such as the Kalman filter, EKF, UKF, EKS and the RTS smoother, are simulated in trajectory tracking problems, and the accuracy and limitations of these approaches are explained. The model probability under the different filters is then compared, and finally the effect of the noise variance on estimation is described with simulation results.
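
As a rough illustration of how noise variance enters a Kalman-type filter, the following minimal sketch implements a scalar Kalman filter for a random-walk state. It is not the authors' simulation: the function name `kalman_1d`, the random-walk model, and the parameter names `q` (process noise variance) and `r` (measurement noise variance) are assumptions chosen for illustration.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter (illustrative sketch).

    Assumed model: x_k = x_{k-1} + w_k,  z_k = x_k + v_k,
    with Var(w_k) = q (process noise) and Var(v_k) = r (measurement noise).
    Returns the filtered estimates and the Kalman gain at each step.
    """
    x, p = x0, p0
    estimates, gains = [], []
    for z in measurements:
        p = p + q                # predict: uncertainty grows by the process noise
        k = p / (p + r)          # gain: how much the new measurement is trusted
        x = x + k * (z - x)      # update the state estimate with the innovation
        p = (1.0 - k) * p        # update the estimate variance
        estimates.append(x)
        gains.append(k)
    return estimates, gains


if __name__ == "__main__":
    z = [1.2, 0.8, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0]
    for r in (0.1, 1.0):
        est, gains = kalman_1d(z, q=0.01, r=r)
        print(f"r={r}: last estimate={est[-1]:.3f}, last gain={gains[-1]:.3f}")
```

Under these assumptions, a larger measurement noise variance `r` gives a smaller gain, so the filter follows the measurements more slowly and relies more on its prediction.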

Keywords: Gaussian approximation, Kalman smoother, parameter estimation, noise variance

Procedia PDF Downloads 403
6083 Audio-Visual Co-Data Processing Pipeline

Authors: Rita Chattopadhyay, Vivek Anand Thoutam

Abstract:

Speech is the most accepted means of communication, through which we can quickly exchange feelings and thoughts. Quite often, people can communicate orally but cannot interact or work with computers or devices. It is easier and quicker to give speech commands than to type commands into a computer. In the same way, it is easier to listen to audio played from a device than to extract output from computers or devices. Especially with robotics being an emerging market with applications in warehouses, the hospitality industry, consumer electronics, assistive technology, etc., speech-based human-machine interaction is emerging as a lucrative feature for robot manufacturers. Considering this factor, the objective of this paper is to design the “Audio-Visual Co-Data Processing Pipeline.” This pipeline integrates automatic speech recognition, a natural language model for text understanding, object detection, and text-to-speech modules. There are many deep learning models for each of these module types, but OpenVINO Model Zoo models are used because the OpenVINO toolkit covers both computer vision and non-computer vision workloads across Intel hardware, maximizes performance, and accelerates application development. A speech command is given as input; it contains information about the target objects to be detected and the start and end times used to extract the required interval from the video. Speech is converted to text using the QuartzNet automatic speech recognition model. A summary is extracted from the text using the natural language model Generative Pre-Trained Transformer-3 (GPT-3). Based on the summary, essential frames are extracted from the video, and the You Only Look Once (YOLO) object detection model detects objects in these extracted frames. Frame numbers that contain target objects (the objects specified in the speech command) are saved as text. Finally, this text (the frame numbers) is converted to speech using a text-to-speech model and played from the device. This project is developed for 80 YOLO labels, and the user can extract frames based on only one or two target labels. The pipeline can easily be extended to more than two target labels by making appropriate changes in the object detection module. The project supports four different speech command formats by including sample examples in the prompt used by the GPT-3 model. Based on user preference, one can define a new speech command format by including some examples of that format in the prompt. This pipeline can be used in many projects, such as human-machine interfaces, human-robot interaction, and surveillance through speech commands. All object detection projects can be upgraded using this pipeline so that one can give speech commands and the output is played from the device.

Keywords: OpenVINO, automatic speech recognition, natural language processing, object detection, text to speech

Procedia PDF Downloads 49
6082 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition

Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie

Abstract:

In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. It provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is expressed as a case study of multimodal data fusion approaches, and the open issues through the limitations of the current studies are presented. This paper can be considered a powerful guide for interested researchers in the field of multimodal data fusion and audiovisual speech recognition particularly.

Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks

Procedia PDF Downloads 80
6081 Ocean Planner: A Web-Based Decision Aid to Design Measures to Best Mitigate Underwater Noise

Authors: Thomas Folegot, Arnaud Levaufre, Léna Bourven, Nicolas Kermagoret, Alexis Caillard, Roger Gallou

Abstract:

Concern about the negative impacts of anthropogenic noise on the ocean’s ecosystems has increased over recent decades. This concern has led to a similarly increased willingness to regulate noise-generating activities, of which shipping is one of the most significant. Dealing with ship noise requires knowledge not only of the noise from individual ships, but also of how ship noise is distributed in time and space within the habitats of concern. Marine mammals, but also fish, sea turtles, larvae and invertebrates, are mostly dependent on the sounds they use to hunt, feed, avoid predators, socialize and communicate during reproduction, or defend a territory. In the marine environment, sight is only useful up to a few tens of meters, whereas sound can propagate over hundreds or even thousands of kilometers. Directive 2008/56/EC of the European Parliament and of the Council of June 17, 2008, called the Marine Strategy Framework Directive (MSFD), requires the Member States of the European Union to take the necessary measures to reduce the impacts of maritime activities in order to achieve and maintain a good environmental status of the marine environment. Ocean-Planner is a web-based platform that provides regulators, managers of protected or sensitive areas, etc. with a decision support tool for anticipating and quantifying the effectiveness of management measures in terms of reducing or modifying the distribution of underwater noise, in response to Descriptor 11 of the MSFD and to the Marine Spatial Planning Directive. Based on the operational sound modelling tool Quonops Online Service, Ocean-Planner allows the user, via an intuitive geographical interface, to define management measures at local (Marine Protected Area, Natura 2000 sites, harbors, etc.) or global (Particularly Sensitive Sea Area) scales, seasonal (regulation over a period of time) or permanent, partial (focused on some maritime activities) or complete (all maritime activities), etc. Speed limits, exclusion areas, traffic separation schemes (TSS), and vessel sound level limitations are among the measures supported by the tool. Ocean-Planner helps to decide on the most effective measure to apply in order to maintain or restore the biodiversity and functioning of the ecosystems of the coastal seabed, maintain a good state of conservation of sensitive areas, and maintain or restore the populations of marine species.

Keywords: underwater noise, marine biodiversity, marine spatial planning, mitigation measures, prediction

Procedia PDF Downloads 89
6080 Analysis of Linguistic Disfluencies in Bilingual Children’s Discourse

Authors: Sheena Christabel Pravin, M. Palanivelan

Abstract:

Speech disfluencies are common in spontaneous speech. The primary purpose of this study was to distinguish linguistic disfluencies from stuttering disfluencies in bilingual Tamil–English (TE) speaking children. The secondary purpose was to determine whether their disfluencies are mediated by native language dominance and/or by an early onset of developmental stuttering in childhood. A detailed study was carried out to identify the prosodic and acoustic features that uniquely represent the disfluent regions of speech. This paper focuses on statistical modeling of repetitions, prolongations, pauses and interjections in a speech corpus of bilingual spontaneous utterances, in English and Tamil, from school-going children. Two classifiers, Hidden Markov Models (HMM) and the Multilayer Perceptron (MLP), which is a class of feed-forward artificial neural network, were compared in the classification of disfluencies. The results of the classifiers document the patterns of disfluency in spontaneous speech samples of school-aged children to distinguish between Children Who Stutter (CWS) and Children with Language Impairment (CLI). The ability of the models to classify the disfluencies was measured in terms of F-measure, recall, and precision.
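
The three evaluation measures named above can be computed from raw classification counts. The sketch below uses the standard definitions of precision, recall and the balanced F-measure (F1); the function name and the example counts are illustrative assumptions and are not taken from the study.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Return (precision, recall, F1) for one class from raw counts.

    tp: items correctly labelled as the class (e.g. a disfluency type)
    fp: items wrongly labelled as the class
    fn: items of the class that the classifier missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts for a single disfluency class.
    p, r, f = precision_recall_f1(tp=8, fp=2, fn=2)
    print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

F1 is the harmonic mean of precision and recall, so it is high only when both are high.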

Keywords: bi-lingual, children who stutter, children with language impairment, hidden markov models, multi-layer perceptron, linguistic disfluencies, stuttering disfluencies

Procedia PDF Downloads 190
6079 Affordable and Environmental Friendly Small Commuter Aircraft Improving European Mobility

Authors: Diego Giuseppe Romano, Gianvito Apuleo, Jiri Duda

Abstract:

Mobility is one of the most important societal needs for leisure, business activities and health. Transport needs are therefore continuously increasing, with a consequent increase in traffic congestion and pollution. Aeronautic effort aims at smarter use of infrastructure and at introducing greener concepts. A possible solution to address the abovementioned topics is the development of a Small Air Transport (SAT) system, able to guarantee operability from today's underused airfields in an affordable and green way, while also helping to reduce travel time. In the framework of Horizon 2020, the EU (European Union) has funded the Clean Sky 2 SAT TA (Transverse Activity) initiative to address market innovations able to reduce SAT operational cost and environmental impact while ensuring good levels of operational safety. Nowadays, most of the key technologies to improve passenger comfort and to reduce community noise, DOC (Direct Operating Costs) and pilot workload for SAT have reached an intermediate level of maturity, TRL (Technology Readiness Level) 3/4. Thus, the key technologies must be developed, validated and integrated on dedicated ground and flying aircraft demonstrators to reach higher TRLs (5/6). In particular, SAT TA focuses on the integration at aircraft level of the following technologies [1]: 1) Low-cost composite wing box and engine nacelle using OoA (Out of Autoclave) technology, LRI (Liquid Resin Infusion) and advanced automation processes. 2) Innovative high-lift devices, allowing aircraft operations from short airfields (< 800 m). 3) Affordable small aircraft manufacturing of metallic fuselage using FSW (Friction Stir Welding) and LMD (Laser Metal Deposition). 4) Affordable fly-by-wire architecture for small aircraft (CS23 certification rules). 5) More electric systems replacing pneumatic and hydraulic systems (high voltage EPGDS, Electrical Power Generation and Distribution System, hybrid de-ice system, landing gear and brakes). 6) Advanced avionics for small aircraft, reducing pilot workload. 7) Advanced cabin comfort with new interior materials and more comfortable seats. 8) New generation of turboprop engine with reduced fuel consumption, emissions, noise and maintenance costs for 19-seat aircraft. 9) Alternative diesel engine for 9-seat commuter aircraft. To address the abovementioned market innovations, two different platforms have been designed: the Reference and Green aircraft. The Reference aircraft is a virtual aircraft designed considering 2014 technologies with an existing engine assuring the requested take-off power; the Green aircraft is designed integrating the technologies addressed in Clean Sky 2. Preliminary integration of the proposed technologies shows an encouraging reduction of emissions and operational costs of small aircraft: about 20% CO2 reduction, about 24% NOx reduction, about 10 dB(A) noise reduction at the measurement point and about 25% DOC reduction. A detailed description of the studies, analyses and validations performed for each technology, as well as the expected benefit at aircraft level, is reported in the present paper.

Keywords: affordable, European, green, mobility, technologies development, travel time reduction

Procedia PDF Downloads 75
6078 The Acoustic Performance of Double-skin Wind Energy Facade

Authors: Sara Mota Carmo

Abstract:

Wind energy applied in architecture has been largely abandoned due to the uncomfortable noise it causes. This study aims to investigate the acoustical performance of a double-skin wind energy façade in the urban environment and the indoor environment. Measurements of sound transmission were recorded using a hand-held sound meter on a reduced-scale prototype of a wind energy façade. The applied wind intensities ranged between 2 m/s and 8 m/s, and the increase in sound produced was proportional to the wind intensity. The study validates the acoustic performance of a wind energy façade using a double-skin façade system, showing an indoor noise reduction of approximately 30 to 35 dB. However, the results show that above a wind intensity of 6 m/s, in the urban environment, the wind energy system applied to the façade exceeds the maximum of 50 dB recommended by the World Health Organization and needs some adjustments.

Keywords: double-skin wind energy facade, acoustic energy facade, wind energy in architecture, wind energy prototype

Procedia PDF Downloads 63
6077 Market Illiquidity and Pricing Errors in the Term Structure of CDS

Authors: Lidia Sanchis-Marco, Antonio Rubia, Pedro Serrano

Abstract:

This paper studies the informational content of pricing errors in the term structure of sovereign CDS spreads. The residuals from a non-arbitrage model are employed to construct a price discrepancy estimate, or noise measure. The noise estimate is understood as an indicator of market distress and reflects frictions such as illiquidity. Empirically, the noise measure is computed for an extensive panel of CDS spreads. Our results reveal that an important fraction of systematic risk is not priced in default swap contracts. When projecting the noise measure onto a set of financial variables, the panel-data estimates show that greater price discrepancies are systematically related to a higher level of offsetting transactions of CDS contracts. This evidence suggests that arbitrage capital flows exit the marketplace during times of distress, which is consistent with a market segmentation among investors and arbitrageurs in which professional arbitrageurs are particularly ineffective at bringing prices to their fundamental values during turbulent periods. Our empirical findings are robust for the most common CDS pricing models employed in the industry.

Keywords: credit default swaps, noise measure, illiquidity, capital arbitrage

Procedia PDF Downloads 545
6076 Non-Universality in Barkhausen Noise Signatures of Thin Iron Films

Authors: Arnab Roy, P. S. Anil Kumar

Abstract:

We discuss angle dependent changes to the Barkhausen noise signatures of thin epitaxial Fe films upon altering the angle of the applied field. We observe a sub-critical to critical phase transition in the hysteresis loop of the sample upon increasing the out-of-plane component of the applied field. The observations are discussed in the light of simulations of a 2D Gaussian Random Field Ising Model with references to a reducible form of the Random Anisotropy Ising Model.

Keywords: Barkhausen noise, Planar Hall effect, Random Field Ising Model, Random Anisotropy Ising Model

Procedia PDF Downloads 365
6075 Emotional and Physiological Reaction While Listening the Speech of Adults Who Stutter

Authors: Xharavina V., Gallopeni F., Ahmeti K.

Abstract:

Stuttered speech is filled with intermittent sound prolongations and/or rapid part-word repetitions. Oftentimes, these aberrant acoustic behaviors are associated with intermittent physical tension and struggle behaviors such as head jerks, arm jerks, finger tapping, excessive eye-blinks, etc. Additionally, the jarring nature of the acoustic and physical manifestations that often accompany moderate-to-severe stuttering may induce negative emotional responses in listeners, which alters communication between the person who stutters and their listeners. However, research on the influence of negative emotions on communication and on physical reactions is limited. Therefore, it is necessary to compare the psycho-physiological responses of fluent adults while listening to the speech of adults who speak fluently and of adults who stutter. This study uses an experimental method with a total of 104 participants (average age 20 years, SD = 2.1), divided into 3 groups. All participants self-reported no impairments in speech, language, or hearing. To explore the participants' responses, two recorded speech samples were used: one of a voice that speaks fluently and one of a voice that stutters. Heartbeat and pulse were measured with a digital blood pressure monitor called 'Tensoval', as a physiological response to the fluent and stuttering samples. Meanwhile, the emotional responses of participants were measured with a self-report questionnaire (Steenbarger, 2001). Results showed an increase in heartbeat during the stuttered speech compared with the fluent sample (p < 0.5). The listeners also reported feeling more alive, unhappy, nervous, repulsive, sad, tense, distracted and upset when listening to the stuttered words than when listening to the words of the fluent adult (for which they reported experiencing positive emotions). These data support the notion that stuttered speech can bring about a psycho-physical reaction in listeners. Speech pathologists should be aware that listeners show intolerable physiological reactions to stuttering that remain visible over time.

Keywords: emotional, physiological, stuttering, fluent speech

Procedia PDF Downloads 118
6074 Enhanced Constraint-Based Optical Network (ECON) for Enhancing OSNR

Authors: G. R. Kavitha, T. S. Indumathi

Abstract:

With the constantly rising demand for multimedia services, the requirements of long-haul transport networks are constantly changing in the area of optical networking. Maximizing data transmission by optimizing the communication channel poses the biggest challenge. Although there has been constant focus on this area over the past decade, there is no evidence that a significant result has been accomplished. After reviewing some potential optical network designs from the literature, it was understood that the optical signal-to-noise ratio (OSNR) is one of the elementary attributes that can define the performance of an optical network. In this paper, we propose a framework termed ECON (Enhanced Constraint-based Optical Network) that primarily optimizes the OSNR using a ROADM. The simulation is performed in Matlab, and the OSNR is extracted considering the system matrix. The outcome of the proposed study shows an optimized OSNR compared with existing studies.

Keywords: component, optical network, reconfigurable optical add-drop multiplexer, optical signal-to-noise ratio

Procedia PDF Downloads 461
6073 Effect of Signal Acquisition Procedure on Imagined Speech Classification Accuracy

Authors: M.R Asghari Bejestani, Gh. R. Mohammad Khani, V.R. Nafisi

Abstract:

Imagined speech recognition is one of the most interesting approaches to BCI development, and a lot of work has been done in this area. Many different experiments have been designed, and hundreds of combinations of feature extraction methods and classifiers have been examined. Reported classification accuracies range from chance level to more than 90%. Based on the non-stationary nature of brain signals, we introduce 3 classification modes according to the time difference between inter-class and intra-class samples. The modes can explain the diversity of reported results and predict the range of expected classification accuracies from the brain signal acquisition procedure. In this paper, a few samples are illustrated by inspecting the results of some previous works.

Keywords: brain computer interface, silent talk, imagined speech, classification, signal processing

Procedia PDF Downloads 121
6072 The Importance of the Historical Approach in the Linguistic Research

Authors: Zoran Spasovski

Abstract:

The paper briefly discusses the significance and the benefits of the historical approach in the research of languages by presenting examples of it in the fields of phonetics and phonology, lexicology, morphology, syntax, and even onomastics (toponymy and anthroponymy). The examples from the field of phonetics/phonology include insights into animal speech and its evolution into human speech, and the evolution of the sounds of human speech from vowels to glides and consonants and from velar consonants to palatal ones, etc., drawing on well-known examples from former researchers. Those from the field of lexicology briefly show the formation of lexemes and their evolution. Morphology and syntax are explained by examples of the development of grammatical and syntactic forms. The importance of the historical approach in the research of place-names and personal names is briefly outlined through examples of place-names, personal names and surnames in different languages, and the conclusions that follow from them.

Keywords: animal speech, glotogenesis, grammar forms, lexicology, place-names, personal names, surnames, syntax categories

Procedia PDF Downloads 45
6071 An Automatic Speech Recognition of Conversational Telephone Speech in Malay Language

Authors: M. Draman, S. Z. Muhamad Yassin, M. S. Alias, Z. Lambak, M. I. Zulkifli, S. N. Padhi, K. N. Baharim, F. Maskuriy, A. I. A. Rahim

Abstract:

The performance of a Malay automatic speech recognition (ASR) system for the call centre environment is presented. The system uses the Kaldi toolkit as the platform for the entire library and the algorithms used in performing the ASR task. The acoustic model implemented in this system uses a deep neural network (DNN) method to model the acoustic signal and the standard n-gram model for language modelling. With 80 hours of training data from call centre recordings, the ASR system achieves 72% accuracy, which corresponds to a word error rate (WER) of 28%. The testing was done using 20 hours of audio data. Despite the implementation of a DNN, the system shows low accuracy owing to the variety of noises, accents and dialects that typically occur in the Malaysian call centre environment. This significant variation between speakers is reflected in the large standard deviation of the average word error rate (WERav) (i.e., ~10%). The lowest WER (13.8%) was obtained from a recording sample of a native speaker with a standard Malay dialect (central Malaysia), compared with 49% for the sample with the highest WER, which contains the conversation of a speaker using a non-standard Malay dialect.
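
The relation between accuracy and WER quoted above (72% accuracy, 28% WER) follows from the usual convention accuracy = 100% − WER, where WER is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a generic illustration of that definition, not the scoring tool used in the study; the function name and the example sentences are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (Levenshtein distance on words).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion in 4 words.
    wer = word_error_rate("saya mahu semak baki", "saya mau semak")
    print(f"WER = {wer:.0%}, word accuracy = {1 - wer:.0%}")
```

Because insertions are counted, WER computed this way can exceed 100% for very poor hypotheses, in which case the simple "100% − WER" accuracy becomes negative.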

Keywords: conversational speech recognition, deep neural network, Malay language, speech recognition

Procedia PDF Downloads 296
6070 Numerical Simulation of Supersonic Gas Jet Flows and Acoustics Fields

Authors: Lei Zhang, Wen-jun Ruan, Hao Wang, Peng-Xin Wang

Abstract:

Jet noise is generated by the rocket exhaust plume during rocket engine testing. A domain decomposition approach is applied to jet noise prediction in this paper. The aerodynamic noise coupling is based on splitting the problem into acoustic source generation and sound propagation in separate physical domains. Large Eddy Simulation (LES) is used to simulate the supersonic jet flow. Based on the simulated flow fields, the sound pressure level distribution of the jet noise is obtained by applying the Ffowcs Williams-Hawkings (FW-H) acoustics equation and the Fourier transform. The calculation results show that complex structures of expansion waves, compression waves and the turbulent boundary layer can occur due to the strong interaction between the gas jet and the ambient air. In addition, the jet core region, the shock cell and the sound pressure level of the gas jet increase as the nozzle size increases. Importantly, the numerical simulation results for the far-field sound are in good agreement with the experimental measurements in directivity.

Keywords: supersonic gas jet, Large Eddy Simulation (LES), acoustic noise, Ffowcs Williams-Hawkings (FW-H) equations, nozzle size

Procedia PDF Downloads 382
6069 Performance of LTE Multicast Systems in the Presence of the Colored Noise Jamming

Authors: S. Malisuwan, J. Sivaraks, N. Madan, N. Suriyakrai

Abstract:

The ongoing evolution of advanced wireless technologies makes it financially impossible for military operations to manufacture all of their own equipment. Therefore, Commercial-Off-The-Shelf (COTS) and Modified-Off-The-Shelf (MOTS) equipment with low-cost modifications is being considered for military missions. In this paper, we focus on LTE multicast systems for military communication in tactical environments under jamming conditions. We examine the influence of colored noise jamming on the performance of LTE multicast systems in terms of average throughput. The simulation results demonstrate the degradation of the average throughput for different dynamic ranges of the colored noise jamming versus average SNR.

Keywords: performance, LTE, multicast, jamming, throughput

Procedia PDF Downloads 394
6068 A Comprehensive Methodology for Voice Segmentation of Large Sets of Speech Files Recorded in Naturalistic Environments

Authors: Ana Londral, Burcu Demiray, Marcus Cheetham

Abstract:

Speech recording is a methodology used in many different studies related to cognitive and behaviour research. Modern advances in digital equipment have made it possible to continuously record hours of speech in naturalistic environments and to build rich sets of sound files. Speech analysis can then extract multiple features from these files for different scopes of research in Language and Communication. However, tools for analysing a large set of sound files and automatically extracting relevant features from them are often inaccessible to researchers who are not familiar with programming languages. Manual analysis is a common alternative, with a high cost in time and efficiency. In the analysis of long sound files, the first step is voice segmentation, i.e. detecting and labelling segments containing speech. We present a comprehensive methodology that aims to support researchers in voice segmentation, as the first step in the data analysis of a large set of sound files. Praat, an open source software package, is suggested as a tool to run a voice detection algorithm, label segments and files, and extract other quantitative features from a structure of folders containing a large number of sound files. We present the validation of our methodology with a set of 5000 sound files that were collected in the daily life of a group of voluntary participants aged over 65. A smartphone was used to collect sound using the Electronically Activated Recorder (EAR), an app programmed to record 30-second sound samples that were randomly distributed throughout the day. Results demonstrated that automatic segmentation and labelling of files containing speech segments was 74% faster than a manual analysis performed by two independent coders. Furthermore, the methodology presented allows manual adjustment of voiced segments with visualisation of the sound signal, and the automatic extraction of quantitative information on speech. 
In conclusion, we propose a comprehensive methodology for voice segmentation, to be used by researchers that have to work with large sets of sound files and are not familiar with programming tools.
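
The idea of detecting and labelling speech segments can be illustrated with a minimal sketch. This is not Praat's voice detection algorithm and not the authors' script; it is a simple energy-threshold approach, and the function names, frame length, threshold, and synthetic signal are all assumptions chosen for illustration.

```python
import math

def frame_rms(samples, frame_len):
    """Return the RMS energy of each consecutive, non-overlapping frame."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    return [
        math.sqrt(sum(x * x for x in samples[s:s + frame_len]) / frame_len)
        for s in starts
    ]

def label_segments(samples, frame_len, threshold):
    """Merge consecutive frames with the same label.

    Returns a list of (label, first_frame_index, last_frame_index) tuples,
    where label is "speech" if the frame RMS reaches the threshold
    (hypothetical rule) and "silence" otherwise.
    """
    segments = []
    for idx, rms in enumerate(frame_rms(samples, frame_len)):
        label = "speech" if rms >= threshold else "silence"
        if segments and segments[-1][0] == label:
            # Extend the current segment to include this frame.
            segments[-1] = (label, segments[-1][1], idx)
        else:
            segments.append((label, idx, idx))
    return segments

# Synthetic signal (assumption): 4 quiet frames, 4 loud frames, 2 quiet frames.
quiet = [0.001] * 100
loud = [0.5 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
signal = quiet * 4 + loud * 4 + quiet * 2

segments = label_segments(signal, frame_len=100, threshold=0.05)
for label, first, last in segments:
    print(label, first, last)
```

A fixed energy threshold is fragile on real recordings from naturalistic environments, which is one reason manual adjustment of the detected segments, as described in the abstract, remains useful.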

Keywords: automatic speech analysis, behavior analysis, naturalistic environments, voice segmentation

Procedia PDF Downloads 258
6067 Urban Noise and Air Quality: Correlation between Air and Noise Pollution; Sensors, Data Collection, Analysis and Mapping in Urban Planning

Authors: Massimiliano Condotta, Paolo Ruggeri, Chiara Scanagatta, Giovanni Borga

Abstract:

Architects and urban planners, when designing and renewing cities, face a complex set of problems, including noise and air pollution, which are considered hot topics (e.g., the Clean Air Act of London and the Soundscape definition). It is usually taken for granted that these problems go together, because the noise pollution present in cities is often linked to traffic and industry, which produce air pollutants as well. Traffic congestion can create both noise pollution and air pollution, because NO₂ is mostly created from the oxidation of NO, and both are produced by combustion processes at high temperatures (e.g., car engines or thermal power stations). The same process applies to industrial plants. What has to be investigated, and is the topic of this paper, is whether there really is a correlation between noise pollution and air pollution (taking NO₂ into account) in urban areas. To evaluate whether there is a correlation, some low-cost methodologies will be used. For noise measurements, the OpeNoise App will be installed on an Android phone. The smartphone will be positioned inside a waterproof box, so that it can stay outdoors, with an external battery to allow it to collect data continuously. The box will have a small hole for an external microphone, connected to the smartphone, which will be calibrated to collect the most accurate data. For air pollution measurements, the AirMonitor device will be used: an Arduino board to which the sensors and all the other components are plugged. After assembly, the sensors will be coupled (one noise and one air sensor) and placed in different critical locations in the area of Mestre (Venice) to map the existing situation. The sensors will collect data for a fixed period of time covering both weekdays and weekend days, so that changes over the course of the week can be observed.
The novelty is that the data will be compared to check whether there is a correlation between the two pollutants, using graphs that show the percentage of pollution instead of the raw values obtained with the sensors. To do so, the data will be converted to fit a scale that goes up to 100% and will be shown through a map of the measurements produced with GIS methods. Another relevant aspect is that this comparison can help to choose the right mitigation solutions to apply in the area of analysis, because it will make it possible to address both the noise and the air pollution problem with a single intervention. The mitigation solutions must consider not only the health aspect but also how to create a more livable space for citizens. The paper will describe in detail the methodology and the technical solution adopted for the realization of the sensors, the data collection, the noise and pollution mapping, and the analysis.
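
The conversion to a percentage scale and the correlation check might look like the sketch below. The abstract does not state how the conversion is done, so min-max scaling is an assumption here, as are the function names and the sample readings, which are invented for illustration and are not measurements from the study.

```python
import math

def to_percent(values):
    """Min-max scale values so the smallest maps to 0% and the largest to 100%.

    Assumption: the abstract only says the data are fitted to a scale that
    goes up to 100%; min-max scaling is one plausible way to do that.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [100.0 * (v - lo) / (hi - lo) for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical paired readings from one coupled sensor location.
noise_db = [55, 60, 65, 70, 75]      # sound level, dBA (made up)
no2_ugm3 = [20, 25, 35, 40, 50]      # NO2 concentration, µg/m³ (made up)

noise_pct = to_percent(noise_db)
no2_pct = to_percent(no2_ugm3)

r_raw = pearson(noise_db, no2_ugm3)
r_pct = pearson(noise_pct, no2_pct)
print(noise_pct, no2_pct, round(r_pct, 3))
```

Pearson's coefficient is unchanged by this kind of linear rescaling, so under this assumption the percentage scale makes the two pollutants comparable on one map or graph without altering the measured correlation.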

Keywords: air quality, data analysis, data collection, NO₂, noise mapping, noise pollution, particulate matter

Procedia PDF Downloads 180
6066 Frequency of Consonant Production Errors in Children with Speech Sound Disorder: A Retrospective-Descriptive Study

Authors: Amulya P. Rao, Prathima S., Sreedevi N.

Abstract:

Speech sound disorders (SSD) are a major concern in the younger population of India, with the highest prevalence rate among speech disorders. Children with SSD, if not identified and rehabilitated at the earliest, are at risk of academic difficulties. This necessitates early identification using screening tools that assess the frequently misarticulated speech sounds. The literature on frequently misarticulated speech sounds is ample in English and other western languages, targeting individuals with various communication disorders. Articulation is language specific, and there are limited studies reporting the same in Kannada, a Dravidian language. Hence, the present study aimed to identify the frequently misarticulated consonants in Kannada and also to examine the error type. A retrospective, descriptive study was carried out using secondary data analysis of 41 participants (34 phonetic type and 7 phonemic type) with SSD in the age range of 3 to 12 years. All the consonants of Kannada were analyzed by considering three words for each speech sound from the Kannada Diagnostic Photo Articulation test (KDPAT). A picture naming task was carried out, and responses were audio recorded. The recorded data were transcribed using IPA 2018 broad transcription. A criterion of 2/3 or 3/3 error productions was set to consider the speech sound to be an error. The number of error productions was calculated for each consonant in each participant. Then, the percentage of participants meeting the criterion was documented for each consonant to identify the frequently misarticulated speech sounds. Overall results indicated that velar /k/ (48.78%) and /g/ (43.90%) were frequently misarticulated, followed by voiced retroflex /ɖ/ (36.58%) and trill /r/ (36.58%). The lateral retroflex /ɭ/ was misarticulated by 31.70% of the children with SSD. Dentals (/t/, /n/), bilabials (/p/, /b/, /m/) and labiodental /v/ were produced correctly by all the participants.
The highly misarticulated velars /k/ and /g/ were frequently substituted by dentals /t/ and /d/, respectively, or omitted. Participants with SSD-phonemic type had multiple substitutions for one speech sound, whereas those with SSD-phonetic type had consistent single-sound substitutions. Intra- and inter-judge reliability for 10% of the data using Cronbach’s Alpha revealed good reliability (0.8 ≤ α < 0.9). Analyzing a larger sample by replicating such studies will validate the present study results.
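
The scoring rule described above (a sound counts as an error when at least 2 of its 3 test words are misarticulated, and the percentage of participants meeting that criterion is reported per consonant) can be sketched as follows. The function names and the per-participant counts are hypothetical; the counts were chosen only so that 20 of 41 participants meet the criterion, which corresponds arithmetically to the 48.78% reported for /k/.

```python
def is_error_sound(error_count, min_errors=2):
    """Apply the 2/3-or-3/3 criterion to one participant's error count (0 to 3)."""
    return error_count >= min_errors

def percent_meeting_criterion(error_counts, min_errors=2):
    """Percentage of participants whose error count meets the criterion."""
    flagged = sum(1 for count in error_counts if is_error_sound(count, min_errors))
    return 100.0 * flagged / len(error_counts)

# Hypothetical error counts for one consonant across 41 participants:
# 12 participants with 3/3 errors, 8 with 2/3, 5 with 1/3, 16 with none.
counts = [3] * 12 + [2] * 8 + [1] * 5 + [0] * 16

pct = percent_meeting_criterion(counts)
print(f"{pct:.2f}% of {len(counts)} participants meet the criterion")
```

Participants with a single error out of three productions are not counted, which keeps occasional slips from inflating the percentage for a consonant.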

Keywords: consonant, frequently misarticulated, Kannada, SSD

Procedia PDF Downloads 94
6065 Programmed Speech to Text Summarization Using Graph-Based Algorithm

Authors: Hamsini Pulugurtha, P. V. S. L. Jagadamba

Abstract:

Programmed speech-to-text conversion and text summarization using graph-based algorithms can be used in meetings to obtain a short description of the meeting for future reference. The system provides signature verification using a Siamese neural network to confirm the identity of the user, and converts the user-provided audio file, which is in English, into English text using the speech recognition package available in Python. At times, only the summary of the meeting is required; the answer to this is text summarization. Thus, the transcript is then summarized using natural language processing approaches such as unsupervised extractive text summarization algorithms.

Keywords: Siamese neural network, English speech, English text, natural language processing, unsupervised extractive text summarization

Procedia PDF Downloads 183
6064 Steady State Rolling and Dynamic Response of a Tire at Low Frequency

Authors: Md Monir Hossain, Anne Staples, Kuya Takami, Tomonari Furukawa

Abstract:

Tire noise has a significant impact on ride quality and vehicle interior comfort, even at low frequency. Reduction of tire noise is especially important due to strict state and federal environmental regulations. The primary sources of tire noise are the low frequency structure-borne noise and the noise that originates from the release of trapped air between the tire tread and road surface during each revolution of the tire. The frequency response of the tire changes at low and high frequency. At low frequency, the tension and bending moment become dominant, while the internal structure and local deformation become dominant at higher frequencies. Here, we analyze tire response in terms of deformation and rolling velocity at low revolution frequency. An Abaqus FEA finite element model is used to calculate the static and dynamic response of a rolling tire under different rolling conditions. The natural frequencies and mode shapes of a deformed tire are calculated with the FEA package where the subspace-based steady state dynamic analysis calculates dynamic response of tire subjected to harmonic excitation. The analysis was conducted on the dynamic response at the road (contact point of tire and road surface) and side nodes of a static and rolling tire when the tire was excited with 200 N vertical load for a frequency ranging from 20 to 200 Hz. The results show that frequency has little effect on tire deformation up to 80 Hz. But between 80 and 200 Hz, the radial and lateral components of displacement of the road and side nodes exhibited significant oscillation. For the static analysis, the fluctuation was sharp and frequent and decreased with frequency. In contrast, the fluctuation was periodic in nature for the dynamic response of the rolling tire. In addition to the dynamic analysis, a steady state rolling analysis was also performed on the tire traveling at ground velocity with a constant angular motion. 
The purpose of the computation was to demonstrate the effect of rotating motion on deformation and rolling velocity with respect to a fixed Newtonian reference point. The analysis showed a significant variation in deformation and rolling velocity due to centrifugal and Coriolis acceleration with respect to a fixed Newtonian point on ground.

Keywords: natural frequency, rotational motion, steady state rolling, subspace-based steady state dynamic analysis

Procedia PDF Downloads 335
6063 Fluctuations of Transfer Factor of the Mixer Based on Schottky Diode

Authors: Alexey V. Klyuev, Arkady V. Yakimov, Mikhail I. Ryzhkin, Andrey V. Klyuev

Abstract:

Fluctuations of Schottky diode parameters in a mixer structure are investigated. These fluctuations manifest in two ways. First, they lead to fluctuations in the transfer factor, which in turn lead to amplitude fluctuations in the intermediate-frequency signal. On the basis of measurement data for the 1/f noise of the diode at forward current, the spectrum of relative fluctuations in the transfer factor of the mixer is estimated. The dependence of this spectrum on the current and on the amplitude of the heterodyne signal is investigated. Second, fluctuations in the diode parameters give rise to 1/f noise in the output signal of the mixer. This noise limits the sensitivity of the mixer to the received signal.

Keywords: current-voltage characteristic, fluctuations, mixer, Schottky diode, 1/f noise

Procedia PDF Downloads 555
6062 An Evolutionary Perspective on the Role of Extrinsic Noise in Filtering Transcript Variability in Small RNA Regulation in Bacteria

Authors: Rinat Arbel-Goren, Joel Stavans

Abstract:

Cell-to-cell variations in transcript or protein abundance, called noise, may give rise to phenotypic variability between isogenic cells, enhancing the probability of survival under stress conditions. These variations may be introduced by post-transcriptional regulatory processes such as the stoichiometric degradation of target transcripts by non-coding small RNAs in bacteria. As a model system, we study the iron homeostasis network in Escherichia coli, in which the RyhB small RNA regulates the expression of various targets. Using fluorescence reporter genes to detect protein levels and single-molecule fluorescence in situ hybridization to monitor transcript levels in individual cells, we compare noise at both the transcript and protein levels. The experimental results and computer simulations show that extrinsic noise buffers, through a feed-forward loop configuration, the increase in variability introduced at the transcript level by iron deprivation, illuminating the important role that extrinsic noise plays during stress. Surprisingly, extrinsic noise also decouples the fluctuations of two different targets, in spite of RyhB being a common upstream factor degrading both. Thus, phenotypic variability increases under stress conditions through the decoupling of target fluctuations in the same cell rather than through an increase in the noise of each. We also present preliminary results on the adaptation of cells to prolonged iron deprivation in order to shed light on the evolutionary role of post-transcriptional downregulation by small RNAs.

Keywords: cell-to-cell variability, Escherichia coli, noise, single-molecule fluorescence in situ hybridization (smFISH), transcript

Procedia PDF Downloads 139
6061 Reducing Weight and Fuel Consumption of Civil Aircraft by EML

Authors: Luca Bertola, Tom Cox, Pat Wheeler, Seamus Garvey, Herve Morvan

Abstract:

Electromagnetic launch systems have been proposed for military applications to accelerate jet planes on aircraft carriers. This paper proposes the implementation of similar technology to aid civil aircraft take-off, which can provide significant economic, environmental and technical benefits. Assisted launch has the potential to reduce ground noise and emissions near airports and to improve overall aircraft efficiency by reducing engine thrust requirements. This paper presents a take-off performance analysis for an Airbus A320-200 taking off with and without the assistance of the electromagnetic catapult. Assisted take-off allows for a significant reduction in take-off field length, giving more capacity with existing airport footprints and reducing the necessary footprint of new airports, which will both reduce costs and increase the number of suitable sites. The electromagnetic catapult may allow the installation of smaller engines with lower rated thrust. The consequent reductions in fuel consumption and operational cost are estimated. The potential to reduce aircraft operational costs and the required runway length makes the electromagnetic launch system an attractive solution to air traffic growth at busy airports.

Keywords: electromagnetic launch, fuel consumption, take-off analysis, weight reduction

Procedia PDF Downloads 302
6060 Hearing Threshold Levels among Steel Industry Workers in Samut Prakan Province, Thailand

Authors: Petcharat Kerdonfag, Surasak Taneepanichskul, Winai Wadwongtham

Abstract:

Industrial noise is usually considered a major environmental health and safety concern because exposure to it can cause serious, permanent hearing damage. Despite strict hearing protection standards and extensive campaigns to raise health awareness among industrial workers in Thailand, noise-induced hearing loss has remained a major obstacle to workers’ health. The aims of the study were to explore and specify the hearing threshold levels among steel industry workers assigned to work zones with higher noise levels and to examine the relationships between hearing loss and workers’ age and length of employment in Samut Prakan province, Thailand. A cross-sectional study design was used. Ninety-three steel industry workers from two factories, who worked in the designated higher-noise zone (> 85 dBA), had more than 1 year of employment, and were available to participate, were selected by simple random sampling and assessed by audiometric screening at the regional Samut Prakan hospital. Screening data were collected from October to December 2016 by an occupational medicine physician and a qualified occupational nurse. All participants were examined by the same examiners to ensure validity. Audiometric testing was performed at least 14 hours after the last workplace noise exposure. Workers’ age and length of employment were gathered using a purpose-developed occupational record form. Results: Workers’ ages ranged from 23 to 59 years (Mean = 41.67, SD = 9.69), and the length of employment ranged from 1 to 39 years (Mean = 13.99, SD = 9.88). Fifty-three (60.0%) of the participants had been exposed to hazardous workplace noise for more than 10 years. Twenty-three (24.7%) had been exposed for 5 years or less. Seventeen (18.3%) had been exposed for 5 to 10 years.
Using a cut point of 25 dBA or less for hearing thresholds, the mean hearing thresholds at 4, 6, and 8 kHz were 31.34, 29.62, and 25.64 dB, respectively, for the right ear and 40.15, 32.20, and 25.48 dB, respectively, for the left ear. The older the workers in the hazardous-noise work zone, the higher their hearing thresholds at 4, 6, and 8 kHz for the right ear (p = .012, p = .026, and p = .024, respectively) and at 4 kHz only for the left ear (p = .009). Conclusion: Participants’ age in the hazardous-noise work zone was significantly associated with hearing loss at different levels, whereas length of employment was not significantly associated with hearing loss. Thus, hearing threshold levels among industrial workers should be assessed regularly, and workers’ hearing should be protected from the start of employment.

Keywords: hearing threshold levels, hazard of noise, hearing loss, audiometric testing

Procedia PDF Downloads 198
6059 Reconstructed Phase Space Features for Estimating Post Traumatic Stress Disorder

Authors: Andre Wittenborn, Jarek Krajewski

Abstract:

Trauma-related sadness in speech can alter the voice in several ways. The generation of non-linear aerodynamic phenomena within the vocal tract is crucial when analyzing trauma-influenced speech production. These phenomena include non-laminar flow and the formation of jets rather than well-behaved laminar flow. In particular, state-space reconstruction methods based on chaotic dynamics and fractal theory have been suggested to describe these turbulence-related aerodynamic phenomena of the speech production system. To extract the non-linear properties of the speech signal, we used the time delay embedding method to reconstruct the phase space from a scalar time series (reconstructed phase space, RPS). This approach results in the extraction of 7238 features per .wav file (N = 47, 32 m, 15 f). The speech material was elicited by asking participants to talk about sadness-inducing autobiographical experiences (sampling rate 16 kHz, 8-bit resolution). After combining these features in a support vector machine based machine learning approach (leave-one-sample-out validation), we achieved a correlation of r = .41 with the well-established, self-report ground truth measure (RATS) of post-traumatic stress disorder (PTSD).
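
Time delay embedding itself is compact enough to sketch. The version below is a generic illustration, not the authors' feature extraction pipeline; the function name, the embedding dimension, and the delay are assumptions, since the abstract does not report the values used.

```python
def delay_embed(series, dim, tau):
    """Reconstruct phase-space points from a scalar time series.

    Each point is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).
    dim (embedding dimension) and tau (delay in samples) are free
    parameters; the values used below are illustrative only.
    """
    n_points = len(series) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("series too short for this dim and tau")
    return [
        tuple(series[i + k * tau] for k in range(dim))
        for i in range(n_points)
    ]

# Toy series standing in for speech samples.
samples = [0, 1, 2, 3, 4, 5, 6, 7]
points = delay_embed(samples, dim=3, tau=2)
print(points)  # → [(0, 2, 4), (1, 3, 5), (2, 4, 6), (3, 5, 7)]
```

Features such as those described in the abstract would then be computed from the geometry of the resulting point cloud rather than from the raw waveform.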

Keywords: non-linear dynamics features, post traumatic stress disorder, reconstructed phase space, support vector machine

Procedia PDF Downloads 79