Search results for: statistical parametric speech synthesis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7407

7257 Students' Statistical Reasoning and Attitudes towards Statistics in Blended Learning, E-Learning and On-Campus Learning

Authors: Petros Roussos

Abstract:

The present study focused on students' statistical reasoning related to Null Hypothesis Statistical Testing and p-values. Its objective was to test the hypothesis that neither the place (classroom, at a distance, online) nor the medium that actually supports the learning (ICT, internet, books) has an effect on understanding of statistical concepts. In addition, it was expected that students' attitudes towards statistics would not predict understanding of statistical concepts. The sample consisted of 385 undergraduate and postgraduate students from six state and private universities (five in Greece and one in Cyprus). Students were administered two questionnaires: a) the Greek version of the Survey of Attitudes Toward Statistics, and b) a short instrument which measures students' understanding of statistical significance and p-values. Results suggest that attitudes towards statistics do not predict students' understanding of statistical concepts and that the medium did not have an effect.

Keywords: attitudes towards statistics, blended learning, e-learning, statistical reasoning

Procedia PDF Downloads 302
7256 Tuning of the Thermal Capacity of an Envelope for Peak Demand Reduction

Authors: Isha Rathore, Peeyush Jain, Elangovan Rajasekar

Abstract:

The thermal capacity of the envelope impacts the cooling and heating demand of a building and modulates the peak electricity demand. This paper presents the thermal capacity tuning of a building envelope to minimize peak electricity demand for space cooling. We consider a 40 m² residential testbed located in Hyderabad, India (composite climate). An EnergyPlus model is validated using real-time data. A parametric simulation framework for thermal capacity tuning is created using the Honeybee plugin. Diffusivity, thickness, layer position, orientation and fenestration size of the exterior envelope are parametrized considering a five-layered wall system. A total of 1824 parametric runs are performed, and the optimum wall configuration leading to minimum peak cooling demand is presented.

Keywords: thermal capacity, tuning, peak demand reduction, parametric analysis

Procedia PDF Downloads 177
7255 Quantum Cum Synaptic-Neuronal Paradigm and Schema for Human Speech Output and Autism

Authors: Gobinathan Devathasan, Kezia Devathasan

Abstract:

Objective: To improve the current modified Broca-Wernicke-Lichtheim-Kussmaul speech schema and provide insight into autism. Methods: We reviewed the pertinent literature. Current findings, involving Brodmann areas 22, 46, 9, 44, 45, 6 and 4, are based on neuropathology and functional MRI studies. However, in primary autism there is no lucid explanation, and the changes described, whether in neuropathology or functional MRI, appear consequential. Findings: We put forward an enhanced model which may explain the enigma related to autism. Vowel output is subcortical and does need cortical representation whereas consonant speech is cortical in origin. Left lateralization is needed to commence the circuitry spin, as life has evolved with L-amino acids and left spin of electrons. A fundamental species difference is that we are capable of three syllable-consonants and bi-syllable expression, whereas cetaceans and songbirds are confined to single or dual consonants. The 4 key sites for speech are the superior auditory cortex, Broca’s two areas, and the supplementary motor cortex. Using the Argand diagram and Riemann’s projection, we theorize that the Euclidean three-dimensional synaptic neuronal circuits of speech are quantized to coherent waves, and then decoherence takes place at area 6 (spherical representation). In this quantum state complex, 3-consonant languages are instantaneously integrated, and multiple languages can be learned, verbalized and differentiated. Conclusion: We postulate that evolutionary human speech is elevated to quantum interaction, unlike that of cetaceans and birds, to achieve the three consonants/bi-syllable speech. In classical primary autism, the sudden switching off and on of speech noted in several cases could now be explained not by any anatomical lesion but by failure of coherence. Area 6 projects directly into the prefrontal saccadic area (8), and this further explains the second primary feature in autism: lack of eye contact. The third feature, repetitive finger gestures, located adjacent to the speech/motor areas, represents actual attempts to communicate with the autistic child, akin to sign language for the deaf.

Keywords: quantum neuronal paradigm, cetaceans and human speech, autism and rapid magnetic stimulation, coherence and decoherence of speech

Procedia PDF Downloads 183
7254 Refusal Speech Acts in French Learners of Mandarin Chinese

Authors: Jui-Hsueh Hu

Abstract:

This study investigated various models of refusal speech acts among three target groups: French learners of Mandarin Chinese (FM), Taiwanese native Mandarin speakers (TM), and native French speakers (NF). The refusal responses were analyzed in terms of their options, frequencies, and sequences and the contents of their semantic formulas. This study also examined differences in refusal strategies, as determined by social status and social distance, among the three groups. The difficulties of refusal speech acts encountered by FM were then generalized. The results indicated that Mandarin instructors of NF should focus on the different reasons for the pragmatic failure of French learners and should assist these learners in mastering refusal speech acts that rely on abundant cultural information. In this study, refusal policies were mainly classified according to the research of Beebe et al. (1990). Discourse completion questionnaires were collected from TM, FM, and NF, and their responses were compared to determine how refusal policies differed among the groups. This study not only emphasized the dissimilarities of refusal strategies between native Mandarin speakers and second-language Mandarin learners but also used NF as a control group. The results demonstrated that, regarding overall strategies, FM were biased toward NF in terms of strategy choice, order, and content, resulting in pragmatic transfer. Under the influence of social factors such as 'social status' and 'social distance,' the strategy choices of FM were still closer to those of NF, which again revealed pragmatic transfer by FM. Regarding the refusal difficulties among the three groups, the F-test in the analysis of variance revealed that statistical significance was achieved for Role Playing Items 13 and 14 (P < 0.05). A difference was observed in the average number of refusal difficulties between the participants. However, after multiple comparisons, it was found that item 13 (unrecognized heterosexual junior colleague requesting contacts) was significantly more difficult for NF than for TM and FM, and item 14 (contacts requested by an unrecognized classmate of the opposite sex) was significantly more difficult to refuse for NF than for TM. This study summarized the pragmatic language errors that FM most often make, including the misuse or absence of modal words, hedging expressions, and empty words at the end of sentences, as the reasons for pragmatic failures. The common social pragmatic failures of FM include inaccurately applying the level of directness and formality.

Keywords: French Mandarin, interlanguage refusal, pragmatic transfer, speech acts

Procedia PDF Downloads 247
7253 Genetic Algorithm and Multi-Parametric Programming Based Cascade Control System for Unmanned Aerial Vehicles

Authors: Dao Phuong Nam, Do Trong Tan, Pham Tam Thanh, Le Duy Tung, Tran Hoang Anh

Abstract:

This paper considers the problem of designing a cascade control system for unmanned aerial vehicles (UAVs). Because UAV models are complicated, it is necessary to separate them into two subsystems. The proposed cascade control structure is a hierarchical scheme comprising a robust controller for the inner subsystem based on H infinity theory, a trajectory generator using a genetic algorithm (GA), and an outer-loop control law based on the multi-parametric programming (MPP) technique to overcome the disadvantage of a large amount of computation. Simulation results are presented to show that the equivalent path has been found and obtained by the proposed cascade control scheme.

Keywords: genetic algorithm, GA, H infinity, multi-parametric programming, MPP, unmanned aerial vehicles, UAVs

Procedia PDF Downloads 209
7252 Performance Analysis of VoIP Coders for Different Modulations Under Pervasive Environment

Authors: Jasbinder Singh, Harjit Pal Singh, S. A. Khan

Abstract:

The work in this paper presents a comparison of speech signals encoded by different VoIP narrow-band and wide-band codecs for different modulation schemes. The simulation results indicate that the codec has an impact on the speech quality, which is also affected by the modulation scheme.

Keywords: VoIP, coders, modulations, BER, MOS

Procedia PDF Downloads 507
7251 Audio-Visual Co-Data Processing Pipeline

Authors: Rita Chattopadhyay, Vivek Anand Thoutam

Abstract:

Speech is the most acceptable means of communication, through which we can quickly exchange our feelings and thoughts. Quite often, people can communicate orally but cannot interact or work with computers or devices. It is easier and quicker to give speech commands than to type commands to computers. In the same way, it is easier to listen to audio played from a device than to extract output from computers or devices. Especially with robotics being an emerging market with applications in warehouses, the hospitality industry, consumer electronics, assistive technology, etc., speech-based human-machine interaction is emerging as a lucrative feature for robot manufacturers. Considering this factor, the objective of this paper is to design the “Audio-Visual Co-Data Processing Pipeline.” This pipeline integrates automatic speech recognition, a natural language model for text understanding, object detection, and text-to-speech modules. There are many deep learning models for each type of module mentioned above, but OpenVINO Model Zoo models are used because the OpenVINO toolkit covers both computer vision and non-computer vision workloads across Intel hardware, maximizes performance, and accelerates application development. A speech command is given as input; it contains information about the target objects to be detected and the start and end times used to extract the required interval from the video. Speech is converted to text using the QuartzNet automatic speech recognition model. A summary is extracted from the text using the natural language model Generative Pre-Trained Transformer-3 (GPT-3). Based on the summary, essential frames are extracted from the video, and the You Only Look Once (YOLO) object detection model detects objects in these extracted frames. The numbers of the frames that contain target objects (the objects specified in the speech command) are saved as text. Finally, this text (frame numbers) is converted to speech using a text-to-speech model and played from the device. This project is developed for 80 YOLO labels, and the user can extract frames based on only one or two target labels. The pipeline can easily be extended to more than two target labels by making appropriate changes in the object detection module. This project is developed for four different speech command formats by including sample examples in the prompt used by the GPT-3 model. Based on user preference, one can come up with a new speech command format by including some examples of the respective format in the prompt used by the GPT-3 model. This pipeline can be used in many projects, such as human-machine interfaces, human-robot interaction, and surveillance through speech commands. All object detection projects can be upgraded using this pipeline so that one can give speech commands and the output is played from the device.

Keywords: OpenVINO, automatic speech recognition, natural language processing, object detection, text to speech

Procedia PDF Downloads 75
7250 Writing a Parametric Design Algorithm Based on Recreation and Structural Analysis of Patkane Model: The Case Study of Oshtorjan Mosque

Authors: Behnoush Moghiminia, Jesus Anaya Diaz

Abstract:

The current study attempts to present the relationship between structural development, Patkaneh as one of the Iranian geometric patterns, and parametric algorithms by introducing two practical methods. While having a structural function, Patkaneh is also used as an ornamental element. It can be helpful in the scientific and practical review of Patkaneh. The current study aims to use Patkaneh as a parametric form generator based on the algorithm. The paper attempts to express how a more complete algorithm of this covering can be obtained based on the parametric study and analysis of a sample of a Patkaneh, and also to investigate the relationship between the development of the geometrical pattern of Patkaneh as a structural-decorative element of Iranian architecture and digital design. To achieve the research purposes, the researchers investigated the oldest type of Patkaneh in the architectural history of Iran, such as the Northern Entrance Patkaneh of Oshtorjan Jame’ Mosque. The historical background was investigated carefully to answer the questions. Then, by investigating the structural behavior of Patkaneh, its decorative or structural-decorative role was investigated to eliminate the ambiguity. Next, the geometrical structure of Patkaneh was analyzed by introducing two practical methods. The first method is based on the constituent units of Patkaneh (square and diamond) and investigates the interactive relationships between them in 2D and 3D. This method is appropriate for cases where there are rational and regular geometrical relationships. The second method is based on the separation of the floors and the investigation of their interrelation. It is practical when the constituent units are not geometrically regular and are highly diverse. Finally, the parametric form algorithm of these methods was codified.

Keywords: geometric properties, parametric design, Patkaneh, structural analysis

Procedia PDF Downloads 146
7249 An Approach for Modeling CMOS Gates

Authors: Spyridon Nikolaidis

Abstract:

A modeling approach for CMOS gates is presented based on the use of the equivalent inverter. A new model for the inverter has been developed using a simplified transistor current model which incorporates the nanoscale effects for the planar technology. Parametric expressions for the output voltage are provided, as well as the values of the output and supply current, to be compatible with the CCS technology. The model is parametric according to the input signal slew, output load, transistor widths, supply voltage, temperature and process. The transistor widths of the equivalent inverter are determined by HSPICE simulations, and parametric expressions are developed for them using a fitting procedure. Results for the NAND gate show that the proposed approach offers sufficient accuracy, with an average error in propagation delay of about 5%.

Keywords: CMOS gate modeling, inverter modeling, transistor current mode, timing model

Procedia PDF Downloads 417
7248 Evaluation of Collagen Synthesis in Macrophages/Fibroblasts Co-Culture Using Polylactic Acid Particles as Stimulants

Authors: Feng Ju Chuang, Yu Wen Wang, Tai Jung Hsieh, Shyh Ming Kuo

Abstract:

Polylactic acid, a synthetic polymer with good biocompatibility and degradability, is widely used in clinical applications. In this study, we utilized polylactic acid particles as stimulants for macrophages, and the collagen synthesis of co-cultured fibroblasts was evaluated. The results of the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay indicated that polylactic acid particles were nontoxic to cells. No obvious inflammatory effect was observed in the TNF-α assay (under a PLLA concentration of 1 mg/mL) after 24-h co-culture of Raw264.7 and NIH3T3 cells. The addition of PLLA particles to the Raw264.7 and NIH3T3 co-cultures increased the synthesis of collagen; the highest collagen synthesis by the fibroblasts occurred at 0.2 mg/mL (an increase of approximately 60% compared with no addition of polylactic acid particles). Moreover, a co-axial atomization delivery device was used to percutaneously introduce polylactic acid particles into the dermis layer, stimulating macrophages to secrete growth factors that promote collagen production by fibroblasts. The preliminary results demonstrated that collagen synthesis was mildly increased at 28 d post implantation of the polylactic acid particles. H&E staining examination showed that the polylactic acid particles could be successfully introduced into the dermis layer; however, the optimum concentration of polylactic acid particles and the time period for collagen synthesis still need to be evaluated.

Keywords: collagen synthesis, macrophage, NIH3T3 cells, polylactic acid particles

Procedia PDF Downloads 102
7247 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition

Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie

Abstract:

In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. It provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is expressed as a case study of multimodal data fusion approaches, and the open issues through the limitations of the current studies are presented. This paper can be considered a powerful guide for interested researchers in the field of multimodal data fusion and audiovisual speech recognition particularly.

Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks

Procedia PDF Downloads 103
7246 Thermodynamic Modeling of Three Pressure Level Reheat HRSG, Parametric Analysis and Optimization Using PSO

Authors: Mahmoud Nadir, Adel Ghenaiet

Abstract:

The main purpose of this study is the thermodynamic modeling, parametric analysis, and optimization of a three-pressure-level reheat HRSG (Heat Recovery Steam Generator) using the PSO (Particle Swarm Optimization) method. In this paper, a parametric analysis followed by a thermodynamic optimization is presented. The chosen objective function is the specific work of the steam cycle, which may be, in the case of a combined cycle (CC), a good criterion of thermodynamic performance, in contrast to conventional steam turbines, for which the thermal efficiency could also be an important criterion. The technological constraints, such as maximal steam cycle temperature, minimal steam fraction at the steam turbine outlet, maximal steam pressure, minimal stack temperature, minimal pinch point, and maximal superheater effectiveness, are also considered. The parametric analyses made it possible to understand the effect of the design parameters and the constraints on the variation of the steam cycle specific work. The PSO algorithm was used successfully in HRSG optimization, and the achieved results are in accordance with those of previous studies in which genetic algorithms were used. Moreover, this method is easy to implement compared with the other methods.
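
As a rough illustration of the optimization step described in this abstract, the following minimal sketch shows a generic particle swarm maximizer with box bounds. It is not the authors' HRSG model: the function name `pso_maximize`, the toy objective, the swarm settings (`w`, `c1`, `c2`) and the use of simple clamping to stand in for the technological constraints are all assumptions made for the example.

```python
import random


def pso_maximize(objective, bounds, n_particles=20, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic PSO sketch: maximize `objective` inside box `bounds`.

    bounds is a list of (low, high) pairs, one per design variable.
    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp to the box; a stand-in for real design constraints.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(x):
    # Hypothetical smooth objective with its maximum at (1, -2).
    return -(x[0] - 1.0) ** 2 - (x[1] + 2.0) ** 2


best_x, best_val = pso_maximize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a real study the toy objective would be replaced by the steam cycle specific work computed from the thermodynamic model, and the constraints listed in the abstract would need a proper handling scheme (for example penalties), which this sketch does not attempt.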

Keywords: combined cycle, HRSG thermodynamic modeling, optimization, PSO, steam cycle specific work

Procedia PDF Downloads 376
7245 Synthesis of Filtering in Stochastic Systems on Continuous-Time Memory Observations in the Presence of Anomalous Noises

Authors: S. Rozhkova, O. Rozhkova, A. Harlova, V. Lasukov

Abstract:

We have carried out the optimal synthesis of a root-mean-square filter to estimate the state vector for the case in which, within an observation channel with memory, anomalous noises with unknown mathematical expectation are present in addition to the regular noises. The synthesis has been carried out for continuous-time linear stochastic systems.

Keywords: mathematical expectation, filtration, anomalous noise, memory

Procedia PDF Downloads 239
7244 Hybrid Temporal Correlation Based on Gaussian Mixture Model Framework for View Synthesis

Authors: Deng Zengming, Wang Mingjiang

Abstract:

As 3D video has been a hot research topic over the last few decades, free-viewpoint TV (FTV) is no doubt a promising field owing to its better visual experience and incomparable interactivity. View synthesis is a crucial technology for FTV; it enables images to be rendered at an unlimited number of virtual viewpoints using information from a limited number of reference views. In this paper, a novel hybrid synthesis framework is proposed and blending priority is explored. In contrast to the commonly used View Synthesis Reference Software (VSRS), the presented synthesis process takes into account the temporal correlation of image sequences. The temporal correlations are exploited to produce fine synthesis results even near the foreground boundaries. As for the blending priority, the scheme selects one of the two reference views as the main reference view based on the distance between the reference views and the virtual view; the other view is chosen as the auxiliary viewpoint and only assists in filling hole pixels with the help of background information. The proposed approach shows significant improvement over the state-of-the-art pixel-based virtual view synthesis method: the experimental results show that subjective gains can be observed, objective PSNR average gains range from 0.5 to 1.3 dB, and SSIM average gains range from 0.01 to 0.05.
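
For readers unfamiliar with the objective metric quoted above, the sketch below computes PSNR between a reference view and a synthesized view. It assumes 8-bit images flattened into lists of pixel values with a peak of 255; the function name `psnr` and the sample pixel values are hypothetical and are not taken from the paper.

```python
import math


def psnr(reference, synthesized, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    `peak` is the maximum possible pixel value (255 assumed for 8-bit images).
    """
    if len(reference) != len(synthesized) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthesized)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 4-pixel example: every pixel is off by one grey level (MSE = 1).
example_db = psnr([10, 20, 30, 40], [11, 19, 31, 39])
```

A gain of 0.5 to 1.3 dB, as reported in the abstract, would correspond to the difference between two such values computed for the proposed method and for the baseline against the same ground-truth view.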

Keywords: fusion method, Gaussian mixture model, hybrid framework, view synthesis

Procedia PDF Downloads 245
7243 Emotional and Physiological Reaction While Listening the Speech of Adults Who Stutter

Authors: Xharavina V., Gallopeni F., Ahmeti K.

Abstract:

Stuttered speech is filled with intermittent sound prolongations and/or rapid part-word repetitions. Oftentimes, these aberrant acoustic behaviors are associated with intermittent physical tension and struggle behaviors such as head jerks, arm jerks, finger tapping, excessive eye-blinks, etc. Additionally, the jarring nature of the acoustic and physical manifestations that often accompany moderate-to-severe stuttering may induce negative emotional responses in listeners, which alters communication between the person who stutters and their listeners. However, research on the influence of negative emotions on communication and on physical reactions is limited. Therefore, it is necessary to compare the psycho-physiological responses of fluent adults while listening to the speech of adults who speak fluently and of adults who stutter. This study used an experimental method with a total of 104 participants (average age 20 years, SD = 2.1), divided into 3 groups. All participants self-reported no impairments in speech, language, or hearing. To explore the responses of the participants, two recorded speech samples were used: a voice that speaks fluently and a voice that stutters. Heartbeat and pulse were measured with a digital blood pressure monitor called 'Tensoval' as a physiological response to the fluent and stuttering samples. Meanwhile, the emotional responses of participants were measured with a self-report questionnaire (Steenbarger, 2001). Results showed an increase in heartbeat during the stuttering speech compared with the fluent sample (p < 0.5). The listeners also self-reported themselves as more alive, unhappy, nervous, repulsive, sad, tense, distracted and upset when listening to the stuttering words versus the words of the fluent adult (for which they reported experiencing positive emotions). These data support the notion that stuttered speech can bring about a psycho-physical reaction in listeners. Speech pathologists should be aware that listeners show intolerable physiological reactions to stuttering that remain visible over time.

Keywords: emotional, physiological, stuttering, fluent speech

Procedia PDF Downloads 137
7242 A Multilevel-Synthesis Approach with Reduced Number of Switches for 99-Level Inverter

Authors: P. Satish Kumar, V. Ramu, K. Ramakrishna

Abstract:

In this paper, an efficient multilevel waveform synthesis technique is proposed and applied to a 99-level inverter. The basic principle of the proposed scheme is that the continuous output voltage levels can be synthesized by the addition or subtraction of the instantaneous voltages generated from different voltage levels. This synthesis technique can be realized by an array of switching devices composing full-bridge inverter modules and proper mixing of the bi-directional switch modules. The main difference from the conventional approach to synthesizing the multilevel output waveform is the use of a combination of bi-directional switches and full-bridge inverter modules with a reduced number of components. The 99-level inverter consists of three full-bridge modules and six bi-directional switch modules. The validity of the proposed scheme is verified by simulation.

Keywords: cascaded connection, multilevel inverter, synthesis, total harmonic distortion

Procedia PDF Downloads 523
7241 Stochastic Modeling for Parameters of Modified Car-Following Model in Area-Based Traffic Flow

Authors: N. C. Sarkar, A. Bhaskar, Z. Zheng

Abstract:

The driving behavior in area-based (i.e., non-lane-based) traffic is induced by the presence of other individuals in the choice space within the driver’s visual perception area. The driving behavior of a subject vehicle is constrained by the potential leaders, and the leaders change frequently over time. This paper aims to determine a stochastic model for the parameters of a modified intelligent driver model (MIDM) in area-based traffic (as in developing countries). Parametric and non-parametric distributions are presented to fit the parameters of MIDM. The goodness of fit for each parameter is measured in two ways: graphically and statistically. The quantile-quantile (Q-Q) plot is used for a graphical comparison of a parameter with a theoretical distribution, and the Kolmogorov-Smirnov (K-S) test is used as a statistical measure of the fit of a parameter to a theoretical distribution. The distributions are fitted to a set of estimated parameters of MIDM. The parameters are estimated from real vehicle trajectory data from India. The fitness of each parameter with a stochastic model is well represented. The results support the applicability of the proposed modeling for parameters of MIDM in area-based traffic flow simulation.
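
The statistical goodness-of-fit check mentioned above can be sketched as follows. This is a minimal one-sample Kolmogorov-Smirnov statistic written in plain Python; the helper names `ks_statistic` and `normal_cdf`, and the sample values, are assumptions for illustration and do not come from the paper, which does not state which distributions or software were used.

```python
import math


def ks_statistic(samples, cdf):
    """One-sample K-S statistic: largest gap between the empirical CDF
    of `samples` and the theoretical `cdf` (a function of one value)."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, one possible candidate model."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Hypothetical check against a uniform(0, 1) model, whose CDF is F(x) = x.
d_uniform = ks_statistic([0.25, 0.5, 0.75], lambda x: x)
```

A smaller statistic indicates a closer match between the estimated parameter values and the candidate distribution; turning it into a formal test additionally requires a critical value or p-value, which is left out of this sketch.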

Keywords: area-based traffic, car-following model, micro-simulation, stochastic modeling

Procedia PDF Downloads 144
7240 Effect of Signal Acquisition Procedure on Imagined Speech Classification Accuracy

Authors: M.R Asghari Bejestani, Gh. R. Mohammad Khani, V.R. Nafisi

Abstract:

Imagined speech recognition is one of the most interesting approaches to BCI development, and a lot of work has been done in this area. Many different experiments have been designed, and hundreds of combinations of feature extraction methods and classifiers have been examined. Reported classification accuracies range from the chance level to more than 90%. Based on the non-stationary nature of brain signals, we have introduced 3 classification modes according to the time difference between inter- and intra-class samples. The modes can explain the diversity of reported results and predict the range of expected classification accuracies from the brain signal acquisition procedure. In this paper, a few examples are illustrated by inspecting the results of some previous works.

Keywords: brain computer interface, silent talk, imagined speech, classification, signal processing

Procedia PDF Downloads 149
7239 Design of Single Point Mooring Buoy System by Parametric Analysis

Authors: Chul-Hee Jo, Do-Youb Kim, Seok-Jin Cho, Yu-Ho Rho

Abstract:

The Catenary Anchor Leg Mooring (CALM) Single Point Mooring (SPM) buoy system is the most popular and widely used type of offshore loading terminal. SPM buoy mooring systems have been deployed worldwide for a variety of applications, water depths and vessel sizes ranging from small production carriers to Very Large Crude Carriers (VLCCs). Because of safe and easy berthing and un-berthing operations, the SPM buoy mooring system is also preferred for offshore terminals. The SPM buoy consists of a buoy that is permanently moored to the seabed by means of multiple mooring lines. The buoy contains a bearing system that allows a part of it to rotate around the moored geostatic part. When moored to the rotating part of the buoy, a vessel is able to freely weathervane around the buoy. This study verified the effects of design variables in order to design an SPM buoy mooring system through parametric analysis. The design variables have independent and nonlinear characteristics. Using parametric analysis, this research found that the fairlead departure angle, wave height and period, chain diameter and line length affect the mooring top tension, buoy excursion and line layback.

Keywords: Single Point Mooring (SPM), Catenary Anchor Leg Mooring (CALM), design variables, parametric analysis, mooring system optimization

Procedia PDF Downloads 385
7238 Synthesis of TiO2 Nanoparticles by Sol-Gel and Sonochemical Combination

Authors: Sabriye Piskin, Sibel Kasap, Muge Sari Yilmaz

Abstract:

Nanocrystalline TiO2 particles were successfully synthesized via a sol-gel and sonochemical combination using titanium tetraisopropoxide as a precursor at a lower temperature for a short time. The effects of the reaction parameters (hydrolysis media, acid media, and reaction temperatures) on the synthesis of TiO2 particles were investigated in the present study. The synthesized samples were characterized by X-ray diffraction (XRD) analysis. It was shown that the reaction parameters played a significant role in the synthesis of TiO2 particles.

Keywords: crystalline TiO2, sonochemical mechanism, sol-gel reaction, XRD

Procedia PDF Downloads 451
7237 The Importance of the Historical Approach in the Linguistic Research

Authors: Zoran Spasovski

Abstract:

The paper briefly discusses the significance and the benefits of the historical approach in the research of languages by presenting examples of it in the fields of phonetics and phonology, lexicology, morphology, syntax, and even onomastics (toponymy and anthroponymy). The examples from the field of phonetics/phonology include insights into animal speech and its evolution into human speech, and the evolution of the sounds of human speech from vowels to glides and consonants and from velar consonants to palatal ones, etc., based on well-known examples from former researchers. Those from the field of lexicology briefly show the formation of lexemes and their evolution; morphology and syntax are explained by examples of the development of grammar and syntax forms; and the importance of the historical approach in the research of place-names and personal names is briefly outlined through examples of place-names, personal names and surnames in different languages, and the conclusions that follow from them.

Keywords: animal speech, glottogenesis, grammar forms, lexicology, place-names, personal names, surnames, syntax categories

Procedia PDF Downloads 74
7236 An Automatic Speech Recognition of Conversational Telephone Speech in Malay Language

Authors: M. Draman, S. Z. Muhamad Yassin, M. S. Alias, Z. Lambak, M. I. Zulkifli, S. N. Padhi, K. N. Baharim, F. Maskuriy, A. I. A. Rahim

Abstract:

The performance of a Malay automatic speech recognition (ASR) system for the call centre environment is presented. The system uses the Kaldi toolkit as the platform for all the libraries and algorithms used in performing the ASR task. The acoustic model implemented in this system uses a deep neural network (DNN) to model the acoustic signal and a standard n-gram model for language modelling. With 80 hours of training data from call centre recordings, the ASR system achieves 72% accuracy, which corresponds to a word error rate (WER) of 28%. The testing was done using 20 hours of audio data. Despite the use of a DNN, the system shows low accuracy owing to the variety of noise, accents, and dialects that typically occur in the Malaysian call centre environment. This significant variation among speakers is reflected in the large standard deviation of the average word error rate (WERav) (i.e., ~10%). The lowest WER (13.8%) was obtained from a recording sample of a native speaker using a standard Malay dialect (central Malaysia), compared with 49% for the sample with the highest WER, which contains conversation by a speaker using a non-standard Malay dialect.
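The relation between accuracy and WER quoted in the abstract can be illustrated with a minimal sketch. It assumes the common definition of WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and word accuracy as 1 - WER; the function name `word_error_rate` is hypothetical and this is not the scoring code used by the authors or by Kaldi.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Toy example: one substitution and one deletion against four reference words.
wer = word_error_rate("a b c d", "a x c")
accuracy = 1.0 - wer
```

Under this definition a WER of 0.28 corresponds to a word accuracy of 0.72, matching the pairing of figures reported above. Note that WER can exceed 1 when the hypothesis contains many insertions.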

Keywords: conversational speech recognition, deep neural network, Malay language, speech recognition

Procedia PDF Downloads 318
7235 A Mixing Matrix Estimation Algorithm for Speech Signals under the Under-Determined Blind Source Separation Model

Authors: Jing Wu, Wei Lv, Yibing Li, Yuanfan You

Abstract:

The separation of speech signals has become a research hotspot in the field of signal processing in recent years. It has many applications in teleconferencing, hearing aids, machine speech recognition, and so on. The sounds received are usually noisy. Identifying the sounds of interest and obtaining clear sounds in such an environment is a problem worth exploring, namely the problem of blind source separation. This paper focuses on under-determined blind source separation (UBSS). Sparse component analysis is generally used for the problem of under-determined blind source separation. The method is mainly divided into two parts. First, a clustering algorithm is used to estimate the mixing matrix from the observed signals. Then the signal is separated based on the known mixing matrix. In this paper, the problem of mixing matrix estimation is studied. This paper proposes an improved algorithm to estimate the mixing matrix for speech signals in the UBSS model. The traditional potential algorithm is not accurate for mixing matrix estimation, especially at low signal-to-noise ratio (SNR). In response to this problem, this paper considers an improved potential function method to estimate the mixing matrix. The algorithm not only avoids the influence of insufficient prior information in traditional clustering algorithms, but also improves the estimation accuracy of the mixing matrix. This paper takes the mixing of four speech signals into two channels as an example. The results of simulations show that the approach in this paper not only improves the accuracy of estimation, but also applies to any mixing matrix.
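The clustering step described in the abstract can be sketched for the two-channel case as follows. This is a generic, simplified potential-function estimate and not the improved algorithm proposed by the authors: it assumes the sources are sparse enough that most observation samples are dominated by one source, so that sample directions cluster around the columns of the mixing matrix. The function name, the Gaussian kernel, and the parameters `grid_size` and `sigma` are illustrative assumptions.

```python
import math


def estimate_mixing_columns(x1, x2, n_sources, grid_size=360, sigma=0.02):
    """Estimate unit-norm mixing-matrix columns from two observed channels."""
    # Map every observation to a direction angle in [0, pi); the sign of the
    # source amplitude is irrelevant, so opposite directions are folded together.
    angles, weights = [], []
    for a, b in zip(x1, x2):
        radius = math.hypot(a, b)
        if radius > 1e-9:
            angles.append(math.atan2(b, a) % math.pi)
            weights.append(radius)

    # Potential function: magnitude-weighted Gaussian kernels summed on a grid.
    grid = [math.pi * k / grid_size for k in range(grid_size)]
    potential = []
    for g in grid:
        total = 0.0
        for theta, w in zip(angles, weights):
            d = abs(g - theta)
            d = min(d, math.pi - d)  # circular distance on [0, pi)
            total += w * math.exp(-d * d / (2.0 * sigma * sigma))
        potential.append(total)

    # Local maxima of the potential are candidate column directions.
    peaks = [
        k for k in range(grid_size)
        if potential[k] > potential[k - 1]
        and potential[k] >= potential[(k + 1) % grid_size]
    ]
    peaks.sort(key=lambda k: potential[k], reverse=True)
    best = sorted(grid[k] for k in peaks[:n_sources])
    return [(math.cos(t), math.sin(t)) for t in best]


# Toy data: four perfectly sparse sources mixed into two channels.
true_angles = [0.3, 0.9, 1.6, 2.4]
amplitudes = [1.0, -0.5, 2.0, 0.8]
obs1 = [s * math.cos(t) for t in true_angles for s in amplitudes]
obs2 = [s * math.sin(t) for t in true_angles for s in amplitudes]
columns = estimate_mixing_columns(obs1, obs2, n_sources=4)
estimated_angles = [math.atan2(c[1], c[0]) for c in columns]
```

With real speech the sparsity assumption is usually only approximately met after a time-frequency transform, and the kernel width has to be tuned to the noise level, which is the regime the abstract identifies as difficult for the traditional potential function.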

Keywords: DBSCAN, potential function, speech signal, the UBSS model

Procedia PDF Downloads 132
7234 Facile Synthesis and Structure Characterization of Europium (III) Tungstate Nanoparticles

Authors: Mehdi Rahimi-Nasrabadi, Seied Mahdi Pourmortazavi

Abstract:

Taguchi robust design, a statistical method, was applied to optimize the process parameters in order to achieve tunable, simple, and fast synthesis of europium (III) tungstate nanoparticles. Europium (III) tungstate nanoparticles were synthesized by a chemical precipitation reaction involving direct addition of an aqueous europium ion solution to the tungstate reagent dissolved in aqueous media. The effects of several synthesis variables, i.e., europium and tungstate concentrations, flow rate of cation reagent addition, and reactor temperature, on the particle size of europium (III) tungstate nanoparticles were studied experimentally in order to tune the particle size. Analysis of variance shows the importance of controlling tungstate concentration, cation feeding flow rate, and temperature in the preparation of europium (III) tungstate nanoparticles by the proposed chemical precipitation reaction. Finally, europium (III) tungstate nanoparticles were synthesized at the optimum conditions of the proposed method, and the morphology and chemical composition of the prepared nano-material were characterized by means of X-ray diffraction, scanning electron microscopy, transmission electron microscopy, FT-IR spectroscopy, and fluorescence.

Keywords: europium (III) tungstate, nano-material, particle size control, procedure optimization

Procedia PDF Downloads 391
7233 Theoretical Comparisons and Empirical Illustration of Malmquist, Hicks–Moorsteen, and Luenberger Productivity Indices

Authors: Fatemeh Abbasi, Sahand Daneshvar

Abstract:

Productivity is one of the essential goals of companies seeking to improve performance and, as a strategy-oriented measure, determines the basis of a company's economic growth. The history of productivity goes back centuries, but in the early twentieth century most researchers defined productivity as the relationship between a product and the factors used in its production. Productivity as the optimal use of available resources means that "more output using less input" can increase companies' economic growth and prosperity capacity. Also, a quality life based on economic progress depends on productivity growth in that society. Therefore, productivity is a national priority for any developed country. There are several methods for measuring productivity growth, which can be divided into parametric and non-parametric methods. Parametric methods rely on the existence of a function in their hypotheses, while non-parametric methods do not require a function and are based on empirical evidence. One of the most popular non-parametric methods is Data Envelopment Analysis (DEA), which measures changes in productivity over time. DEA evaluates the productivity of decision-making units (DMUs) based on mathematical models. This method uses multiple inputs and outputs to compare the productivity of similar DMUs such as banks, government agencies, companies, airports, etc. Non-parametric methods are themselves divided into frontier and non-frontier approaches. The Malmquist productivity index (MPI) proposed by Caves, Christensen, and Diewert (1982), the Hicks–Moorsteen productivity index (HMPI) proposed by Bjurek (1996), and the Luenberger productivity indicator (LPI) proposed by Chambers (2002) are powerful tools for measuring productivity changes over time. This study compares the Malmquist, Hicks–Moorsteen, and Luenberger indices theoretically and empirically based on DEA models and reviews their strengths and weaknesses.
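As an illustration of how one of these indices is assembled, the sketch below computes the Malmquist productivity index in its widely used geometric-mean form from four distance-function values, together with the usual decomposition into efficiency change and technical change. It assumes output-oriented distance functions that have already been obtained (for example from DEA models); the function and argument names are hypothetical and the numbers are toy values, not results from the study.

```python
import math


def malmquist_index(d_t_xt, d_t_xt1, d_t1_xt, d_t1_xt1):
    """Malmquist index from four distance-function values.

    d_t_xt   : period-t data evaluated against the period-t frontier
    d_t_xt1  : period-(t+1) data evaluated against the period-t frontier
    d_t1_xt  : period-t data evaluated against the period-(t+1) frontier
    d_t1_xt1 : period-(t+1) data evaluated against the period-(t+1) frontier
    """
    # Geometric mean of the productivity ratios measured against each frontier.
    index = math.sqrt((d_t_xt1 / d_t_xt) * (d_t1_xt1 / d_t1_xt))
    # Efficiency change: movement of the unit relative to its own frontier.
    efficiency_change = d_t1_xt1 / d_t_xt
    # Technical change: shift of the frontier between the two periods.
    technical_change = math.sqrt((d_t_xt1 / d_t1_xt1) * (d_t_xt / d_t1_xt))
    return index, efficiency_change, technical_change


# Toy values: unchanged efficiency, frontier shifted outward by 25%.
mpi, ec, tc = malmquist_index(0.8, 1.0, 0.64, 0.8)
```

In the output-oriented convention assumed here, a value above 1 indicates productivity growth between the two periods, and the index equals the product of the efficiency-change and technical-change terms.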

Keywords: data envelopment analysis, Hicks–Moorsteen productivity index, Luenberger productivity indicator, Malmquist productivity index

Procedia PDF Downloads 188
7232 A Comprehensive Methodology for Voice Segmentation of Large Sets of Speech Files Recorded in Naturalistic Environments

Authors: Ana Londral, Burcu Demiray, Marcus Cheetham

Abstract:

Speech recording is a methodology used in many different studies in cognitive and behavioural research. Modern advances in digital equipment have made it possible to continuously record hours of speech in naturalistic environments and to build rich sets of sound files. Speech analysis can then extract multiple features from these files for different areas of research in Language and Communication. However, tools for analysing a large set of sound files and automatically extracting relevant features from them are often inaccessible to researchers who are not familiar with programming languages. Manual analysis is a common alternative, with a high cost in time and efficiency. In the analysis of long sound files, the first step is voice segmentation, i.e. detecting and labelling segments containing speech. We present a comprehensive methodology that aims to support researchers in voice segmentation as the first step in the data analysis of a large set of sound files. Praat, an open-source software package, is suggested as a tool to run a voice detection algorithm, label segments and files, and extract other quantitative features from a folder structure containing a large number of sound files. We present the validation of our methodology with a set of 5000 sound files that were collected in the daily life of a group of voluntary participants aged over 65. A smartphone was used to collect sound using the Electronically Activated Recorder (EAR), an app programmed to record 30-second sound samples randomly distributed throughout the day. Results demonstrated that automatic segmentation and labelling of files containing speech segments was 74% faster than a manual analysis performed by two independent coders. Furthermore, the methodology presented allows manual adjustment of voiced segments with visualisation of the sound signal, and the automatic extraction of quantitative information on speech.
In conclusion, we propose a comprehensive methodology for voice segmentation, to be used by researchers who have to work with large sets of sound files and are not familiar with programming tools.
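To make the notion of voice segmentation concrete, here is a minimal energy-based sketch that labels stretches of a signal whose frame-level RMS exceeds a threshold. It is only an illustration of the general idea: it is not Praat's voice detection algorithm nor the authors' pipeline, and the function name, frame length, and threshold are assumptions chosen for the toy signal.

```python
import math


def detect_voiced_segments(samples, frame_len, rms_threshold):
    """Return (start, end) sample indices of runs of frames above the threshold."""
    segments = []
    start = None
    n_frames = len(samples) // frame_len  # trailing partial frame is ignored
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        if rms >= rms_threshold:
            if start is None:
                start = f * frame_len  # a new segment begins with this frame
        elif start is not None:
            segments.append((start, f * frame_len))  # segment ended before this frame
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_len))
    return segments


# Toy signal: silence, a constant-amplitude burst, then silence again.
signal = [0.0] * 100 + [0.5] * 200 + [0.0] * 100
voiced = detect_voiced_segments(signal, frame_len=50, rms_threshold=0.1)
```

A plain energy threshold cannot distinguish speech from other loud sounds, which is one reason recordings made in naturalistic environments typically need a dedicated detector and the manual adjustment step mentioned in the abstract.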

Keywords: automatic speech analysis, behavior analysis, naturalistic environments, voice segmentation

Procedia PDF Downloads 278
7231 Frequency of Consonant Production Errors in Children with Speech Sound Disorder: A Retrospective-Descriptive Study

Authors: Amulya P. Rao, Prathima S., Sreedevi N.

Abstract:

Speech sound disorders (SSD) are a major concern in the younger population of India, with the highest prevalence rate among speech disorders. Children with SSD, if not identified and rehabilitated at the earliest, are at risk of academic difficulties. This necessitates early identification using screening tools that assess the frequently misarticulated speech sounds. The literature on frequently misarticulated speech sounds is ample in English and other western languages, targeting individuals with various communication disorders. Articulation is language specific, and there are limited studies reporting the same in Kannada, a Dravidian language. Hence, the present study aimed to identify the frequently misarticulated consonants in Kannada and also to examine the error type. A retrospective, descriptive study was carried out using secondary data analysis of 41 participants (34 phonetic type and 7 phonemic type) with SSD in the age range of 3 to 12 years. All the consonants of Kannada were analyzed by considering three words for each speech sound from the Kannada Diagnostic Photo Articulation test (KDPAT). A picture naming task was carried out, and responses were audio recorded. The recorded data were transcribed using IPA 2018 broad transcription. A criterion of 2/3 or 3/3 error productions was set to consider the speech sound to be an error. The number of error productions was calculated for each consonant in each participant. Then, the percentage of participants meeting the criterion was documented for each consonant to identify the frequently misarticulated speech sounds. Overall results indicated that velar /k/ (48.78%) and /g/ (43.90%) were frequently misarticulated, followed by voiced retroflex /ɖ/ (36.58%) and trill /r/ (36.58%). The lateral retroflex /ɭ/ was misarticulated by 31.70% of the children with SSD. Dentals (/t/, /n/), bilabials (/p/, /b/, /m/) and labiodental /v/ were produced correctly by all the participants.
The highly misarticulated velars /k/ and /g/ were frequently substituted by dentals /t/ and /d/ respectively or omitted. Participants with SSD-phonemic type had multiple substitutions for one speech sound, whereas those with SSD-phonetic type had consistent single-sound substitutions. Intra- and inter-judge reliability for 10% of the data using Cronbach’s Alpha revealed good reliability (0.8 ≤ α < 0.9). Analyzing a larger sample by replicating such studies will validate the present study results.
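The scoring rule described above (a consonant counts as an error for a participant when at least 2 of its 3 test words are misproduced, and the share of such participants is then reported per consonant) can be sketched as follows. The function name and the toy error counts are hypothetical and are not the study's data.

```python
def percent_misarticulated(error_counts, min_errors=2):
    """Percentage of participants meeting the error criterion, per consonant.

    error_counts maps each consonant to a list with one entry per participant:
    the number of test words (0 to 3) in which the consonant was misproduced.
    """
    result = {}
    for consonant, counts in error_counts.items():
        flagged = sum(1 for c in counts if c >= min_errors)
        result[consonant] = 100.0 * flagged / len(counts)
    return result


# Toy data for four hypothetical participants.
toy_counts = {"k": [3, 2, 0, 1], "t": [0, 0, 1, 0]}
percentages = percent_misarticulated(toy_counts)
```

With 41 participants each additional participant changes the percentage by about 2.44 points; for example, 20 of 41 corresponds to 48.78%.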

Keywords: consonant, frequently misarticulated, Kannada, SSD

Procedia PDF Downloads 121
7230 The Effect of Speech-Shaped Noise and Speaker’s Voice Quality on First-Grade Children’s Speech Perception and Listening Comprehension

Authors: I. Schiller, D. Morsomme, A. Remacle

Abstract:

Children’s ability to process spoken language develops until the late teenage years. At school, where efficient spoken language processing is key to academic achievement, listening conditions are often unfavorable. High background noise and poor voice quality of the teacher represent typical sources of interference. It can be assumed that these factors particularly affect primary school children, because their language and literacy skills are still low. While it is generally accepted that background noise and impaired voice impede spoken language processing, there is an increasing need for analyzing impacts within specific linguistic areas. Against this background, the aim of the study was to investigate the effect of speech-shaped noise and imitated dysphonic voice on first-grade primary school children’s speech perception and sentence comprehension. Via headphones, 5- to 6-year-old children, recruited within the French-speaking community of Belgium, listened to the stimuli and performed a minimal-pair discrimination task and a sentence-picture matching task. Stimuli were randomly presented according to four experimental conditions: (1) normal voice / no noise, (2) normal voice / noise, (3) impaired voice / no noise, and (4) impaired voice / noise. The primary outcome measure was task score. How did performance vary with respect to listening condition? Preliminary results will be presented with respect to speech perception and sentence comprehension and carefully interpreted in the light of past findings. This study helps to support our understanding of children’s language processing skills under adverse conditions. Results shall serve as a starting point for probing new measures to optimize children’s learning environment.

Keywords: impaired voice, sentence comprehension, speech perception, speech-shaped noise, spoken language processing

Procedia PDF Downloads 185
7229 Programmed Speech to Text Summarization Using Graph-Based Algorithm

Authors: Hamsini Pulugurtha, P. V. S. L. Jagadamba

Abstract:

Automatic speech-to-text conversion and text summarization using graph-based algorithms can be used in meetings to obtain a short description of the meeting for future reference. The system provides signature verification using a Siamese neural network to confirm the identity of the user, and converts the user-provided audio recording, which is in English, into English text using the speech recognition package available in Python. Sometimes only the outline of the meeting is required; the solution for this is text summarization. Thus, the transcript is then summarized using natural language processing approaches such as unsupervised extractive text summarization algorithms.
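A graph-based extractive summarizer of the kind named in the abstract can be sketched in the style of TextRank: sentences are nodes, edge weights reflect word overlap, and a PageRank-style iteration ranks the sentences. This is an illustrative variant and not the authors' implementation; the function name, the smoothed normalisation (log of length plus one), and the toy transcript are assumptions.

```python
import math
import re


def summarize(text, n_sentences=1, damping=0.85, iterations=50):
    """Return the top-ranked sentences of `text`, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    n = len(sentences)
    if n == 0:
        return []
    words = [set(re.findall(r"[a-z']+", s.lower())) for s in sentences]

    # Edge weights: shared words, normalised by the sentence lengths.
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and words[i] and words[j]:
                overlap = len(words[i] & words[j])
                norm = math.log(len(words[i]) + 1) + math.log(len(words[j]) + 1)
                sim[i][j] = overlap / norm
    out_weight = [sum(row) for row in sim]

    # Weighted PageRank by power iteration.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            rank = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0
            )
            updated.append((1.0 - damping) / n + damping * rank)
        scores = updated

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return [sentences[i] for i in sorted(top)]


transcript = (
    "The meeting covered the budget. "
    "The budget needs approval from the board. "
    "The board will meet on Friday. "
    "Lunch was served."
)
summary = summarize(transcript, n_sentences=1)
```

On the toy transcript the middle sentence shares words with both of its neighbours and is therefore ranked highest, while the unrelated final sentence receives only the baseline score. A practical system would usually remove stop words before computing the overlap.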

Keywords: Siamese neural network, English speech, English text, natural language processing, unsupervised extractive text summarization

Procedia PDF Downloads 210
7228 Asymmetric Synthesis and Biological Study of Suberosanes

Authors: Mohammad Kousara, Françoise Dumas, Rama Ibrahim, Joëlle Dubois, Joël Raingeaud

Abstract:

Suberosanes are a small group of marine natural sesquiterpenes discovered from 1996 onward by Boyd, Sheu and Qi from three gorgonians. Their skeleton was previously found in quadranes produced by the terrestrial fungus Aspergillus terreus. To date, eleven suberosanes have been described, of which (-)-suberosanone and (-)-suberosenol A reach picomolar cytotoxicity levels on human solid tumor cell lines. Due to their impressive cytotoxic properties and their limited availability, we undertook an asymmetric synthesis of the most active members of this family in order to gain insight into their absolute configurations and their biological properties. The challenge of their synthesis is the regio- and stereoselective elaboration of the compact bridged tricyclic skeleton with up to five adjacent asymmetric centers, including a central quaternary carbon. Our strategy is based on an aza-ene-synthesis key step which is regio- and stereo-controlled by the choice of a chiral amine enantiomer. This approach is concise and flexible, the enantiopure ABC tricyclic intermediate that has been synthesized being the common precursor of suberosanes.

Keywords: suberosanes, asymmetric synthesis, sesquiterpenes, quadranes

Procedia PDF Downloads 85