Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 10

Search results for: formants

10 Comparing Sounds of the Singing Voice

Abstract:

This experiment aims at showing that classical singing and belting have both different singing qualities, but singing with a speaking voice has no singing quality. For this purpose, a singing female voice was recorded on four different tone pitches, singing the vowel ‘a’ by using 3 different kinds of singing - classical trained voice, belting voice and speaking voice. The recordings have been entered in the Software Praat. Then the formants of each recorded tone were compared to each other and put in relationship to the singer’s formant. The visible results are taken as an indicator of comparable sound qualities of a classical trained female voice and a belting female voice concerning the concentration of overtones in F1 to F5 and a lack of sound quality in the speaking voice for singing purpose. The results also show that classical singing and belting are both valuable vocal techniques for singing due to their richness of overtones and that belting is not comparable to shouting or screaming. Singing with a speaking voice in contrast should not be called singing due to the lack of overtones which means by definition that there is no musical tone.

Keywords: formants, overtone, singer’s formant, singing voice, belting, classical singing, singing with the speaking voice

Procedia PDF Downloads 307

9 Detection of Phoneme [S] Mispronounciation for Sigmatism Diagnosis in Adults

Authors: Michal Krecichwost, Zauzanna Miodonska, Pawel Badura

Abstract:

The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular in the oral cavity of the patient. Speech processing can allow to objectify the therapy and simplify the verification of its progress. In the described study the methodology for classification of incorrectly pronounced phoneme [s] is proposed. The recordings come from adults. They were registered with the speech recorder at the sampling rate of 44.1 kHz and the resolution of 16 bit. The database of pathological and normative speech has been collected for the study including reference assessments provided by the speech therapy experts. Ten adult subjects were asked to simulate a certain type of stigmatism under the speech therapy expert supervision. In the recordings, the analyzed phone [s] was surrounded by vowels, viz: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame being a part of the analyzed phoneme. Additionally, 3 fricative formants along with corresponding amplitudes are determined for the entire segment. In order to aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed method of features aggregation reduces the size of the feature vector used in the classification. Binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. The first group consists of pathological phones, while the other of the normative ones. The proposed feature vector yields classification sensitivity and specificity measures above 90% level in case of individual logo phones. The employment of a fricative formants-based information improves the sole-MFCC classification results average of 5 percentage points. The study shows that the employment of specific parameters for the selected phones improves the efficiency of pathology detection referred to the traditional methods of speech signal parameterization.

Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing

Procedia PDF Downloads 262

8 Features of Normative and Pathological Realizations of Sibilant Sounds for Computer-Aided Pronunciation Evaluation in Children

Authors: Zuzanna Miodonska, Michal Krecichwost, Pawel Badura

Abstract:

Sigmatism (lisping) is a speech disorder in which sibilant consonants are mispronounced. The diagnosis of this phenomenon is usually based on the auditory assessment. However, the progress in speech analysis techniques creates a possibility of developing computer-aided sigmatism diagnosis tools. The aim of the study is to statistically verify whether specific acoustic features of sibilant sounds may be related to pronunciation correctness. Such knowledge can be of great importance while implementing classifiers and designing novel tools for automatic sibilants pronunciation evaluation. The study covers analysis of various speech signal measures, including features proposed in the literature for the description of normative sibilants realization. Amplitudes and frequencies of three fricative formants (FF) are extracted based on local spectral maxima of the friction noise. Skewness, kurtosis, four normalized spectral moments (SM) and 13 mel-frequency cepstral coefficients (MFCC) with their 1st and 2nd derivatives (13 Delta and 13 Delta-Delta MFCC) are included in the analysis as well. The resulting feature vector contains 51 measures. The experiments are performed on the speech corpus containing words with selected sibilant sounds (/ʃ, ʒ/) pronounced by 60 preschool children with proper pronunciation or with natural pathologies. In total, 224 /ʃ/ segments and 191 /ʒ/ segments are employed in the study. The Mann-Whitney U test is employed for the analysis of stigmatism and normative pronunciation. Statistically, significant differences are obtained in most of the proposed features in children divided into these two groups at p < 0.05. All spectral moments and fricative formants appear to be distinctive between pathology and proper pronunciation. These metrics describe the friction noise characteristic for sibilants, which makes them particularly promising for the use in sibilants evaluation tools. Correspondences found between phoneme feature values and an expert evaluation of the pronunciation correctness encourage to involve speech analysis tools in diagnosis and therapy of sigmatism. Proposed feature extraction methods could be used in a computer-assisted stigmatism diagnosis or therapy systems.

Keywords: computer-aided pronunciation evaluation, sigmatism diagnosis, speech signal analysis, statistical verification

Procedia PDF Downloads 282

7 Effect of Helium and Sulfur Hexafluoride Gas Inhalation on Voice Resonances

Authors: Pallavi Marathe

Abstract:

Voice is considered to be a unique biometric property of human beings. Unlike other biometric evidence, for example, fingerprints and retina scans, etc., voice can be easily changed or mimicked. The present paper talks about how the inhalation of helium and sulfur hexafluoride (SF6) gas affects the voice formant frequencies that are the resonant frequencies of the vocal tract. Helium gas is low-density gas; hence, the voice travels with a higher speed than that of air. On the other side in SF6 gas voice travels with lower speed than that of air due to its higher density. These results in decreasing the resonant frequencies of voice in helium and increasing in SF6. Results are presented with the help of Praat software, which is used for voice analysis.

Keywords: voice formants, helium, sulfur hexafluoride, gas inhalation

Procedia PDF Downloads 106

6 Employee Inventor Compensation: A New Quest for Comparative Law

Authors: Andrea Borroni

Abstract:

The evolution of technology, the global scale of economy, and the new short-term employment contracts make a very peculiar set of disposition of raising interest for the legal interpreter: the employee inventor compensation. Around the globe, this issue is differently regulated according to the legal systems; therefore, it is extremely fragmented. Of course, employers with transnational businesses should face this issue from a comparative perspective. Different legal regimes are available worldwide awarding, as a consequence, diverse compensation to the inventor and according to their own methodology. Given these premises, the recourse to comparative law methodology (legal formants, diachronic and synchronic methodology, common core approach) is the best equipped to face all these different national approaches in order to achieve a tidy systematic. This research, so, elaborates a map of the specific criteria to grant the compensation for the inventor and to show the criteria to calculate them. This finding has been the first step to find out a common core of the discipline given by the common features present in the different legal systems.

Keywords: comparative law, employee invention, intellectual property, legal transplant

Procedia PDF Downloads 318

5 Coping Orientation of Academic Community in the Time of COVID-19 Pandemic: A Pilot Survey Study

Authors: Fereshteh Ahmadi, Önver Cetrez, Said Zandi, Sharareh Akhavan

Abstract:

In this paper, we have mapped the coping methods used to address the coronavirus pandemic by members of the academic community. We conducted an anonymous survey of a convenient sample of 674 faculty/staff members and students from September to December 2020. A modified version of the RCOPE scale was used for data collection. The results indicate that both religious and existential coping methods were used by respondents. The study also indicates that even though 71% of in-formants believed in God or another religious figure, 61% reported that they had tried to gain control of the situation directly without the help of God or another religious figure. The ranking of the coping strategies used indicates that the first five methods used by informants were all non-religious coping methods (i.e., secular existential coping methods): regarding life as a part of a greater whole, regarding nature as an important resource, listening to the sound of surrounding nature, being alone and con-templating, and walking/engaging in any activities outdoors giving a spiritual feeling. Our results contribute to the new area of research on academic community’s coping with pandemic-related stress and challenges.

Keywords: academic staff, academics, coping strategies, coronavirus epidemic, higher education.

Procedia PDF Downloads 62

4 Tensor Deep Stacking Neural Networks and Bilinear Mapping Based Speech Emotion Classification Using Facial Electromyography

Authors: P. S. Jagadeesh Kumar, Yang Yung, Wenli Hu

Abstract:

Speech emotion classification is a dominant research field in finding a sturdy and profligate classifier appropriate for different real-life applications. This effort accentuates on classifying different emotions from speech signal quarried from the features related to pitch, formants, energy contours, jitter, shimmer, spectral, perceptual and temporal features. Tensor deep stacking neural networks were supported to examine the factors that influence the classification success rate. Facial electromyography signals were composed of several forms of focuses in a controlled atmosphere by means of audio-visual stimuli. Proficient facial electromyography signals were pre-processed using moving average filter, and a set of arithmetical features were excavated. Extracted features were mapped into consistent emotions using bilinear mapping. With facial electromyography signals, a database comprising diverse emotions will be exposed with a suitable fine-tuning of features and training data. A success rate of 92% can be attained deprived of increasing the system connivance and the computation time for sorting diverse emotional states.

Keywords: speech emotion classification, tensor deep stacking neural networks, facial electromyography, bilinear mapping, audio-visual stimuli

Procedia PDF Downloads 225

3 Perceptual and Ultrasound Articulatory Training Effects on English L2 Vowels Production by Italian Learners

Authors: I. Sonia d’Apolito, Bianca Sisinni, Mirko Grimaldi, Barbara Gili Fivela

Abstract:

The American English contrast /ɑ-ʌ/ (cop-cup) is difficult to be produced by Italian learners since they realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively, due to differences in phonetic-phonological systems and also in grapheme-to-phoneme conversion rules. In this paper, we try to answer the following research questions: Can a short training improve the production of English /ɑ-ʌ/ by Italian learners? Is a perceptual training better than an articulatory (ultrasound - US) training? Thus, we compare a perceptual training with an US articulatory one to observe: 1) the effects of short trainings on L2-/ɑ-ʌ/ productions; 2) if the US articulatory training improves the pronunciation better than the perceptual training. In this pilot study, 9 Salento-Italian monolingual adults participated: 3 subjects performed a 1-hour perceptual training (ES-P); 3 subjects performed a 1-hour US training (ES-US); and 3 control subjects did not receive any training (CS). Verbal instructions about the phonetic properties of L2-/ɑ-ʌ/ and L1-/ɔ-a/ and their differences (representation on F1-F2 plane) were provided during both trainings. After these instructions, the ES-P group performed an identification training based on the High Variability Phonetic Training procedure, while the ES-US group performed the articulatory training, by means of US video of tongue gestures in L2-/ɑ-ʌ/ production and dynamic view of their own tongue movements and position using a probe under their chin. The acoustic data were analyzed and the first three formants were calculated. Independent t-tests were run to compare: 1) /ɑ-ʌ/ in pre- vs. post-test respectively; /ɑ-ʌ/ in pre- and post-test vs. L1-/a-ɔ/ respectively. Results show that in the pre-test all speakers realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively. Contrary to CS and ES-P groups, the ES-US group in the post-test differentiates the L2 vowels from those produced in the pre-test as well as from the L1 vowels, although only one ES-US subject produces both L2 vowels accurately. The articulatory training seems more effective than the perceptual one since it favors the production of vowels in the correct direction of L2 vowels and differently from the similar L1 vowels.

Keywords: L2 vowel production, perceptual training, articulatory training, ultrasound

Procedia PDF Downloads 236

2 Pharyngealization Spread in Ibbi Dialect of Yemeni Arabic: An Acoustic Study

Authors: Fadhl Qutaish

Abstract:

This paper examines the pharyngealization spread in one of the Yemeni Arabic dialects, namely, Ibbi Arabic (IA). It investigates how pharyngealized sounds spread their acoustic features onto the neighboring vowels and change their default features. This feature has been investigated quietly well in MSA but still has to be deeply studied in the different dialect of Arabic which will bring about a clearer picture of the similarities and the differences among these dialects and help in mapping them based on the way this feature is utilized. Though the studies are numerous, no one of them has illustrated how far in the multi-syllabic word the spread can be and whether it takes a steady or gradient manner. This study tries to fill this gap and give a satisfactory explanation of the pharyngealization spread in Ibbi Dialect. This study is the first step towards a larger investigation of the different dialects of Yemeni Arabic in the future. The data recorded are represented in minimal pairs in which the trigger (pharyngealized or the non-pharyngealized sound) is in the initial or final position of monosyllabic and multisyllabic words. A group of 24 words were divided into four groups and repeated three times by three subjects which will yield 216 tokens that are tested and analyzed. The subjects are three male speakers aged between 28 and 31 with no history of neurological, speaking or hearing problems. All of them are bilingual speakers of Arabic and English and native speakers of Ibbi-Dialect. Recordings were done in a sound-proof room and praat software was used for the analysis and coding of the trajectories of F1 and F2 for the low vowel /a/ to see the effect of pharyngealization on the formant trajectory within the same syllable and in other syllables of the same word by comparing the F1 and F2 formants to the non-pharyngealized environment. The results show that pharyngealization spread is gradient (progressively and regressively). The spread is reflected in the gradual raising of F1 as we move closer towards the trigger and the gradual lowering of F2 as well. The results of the F1 mean values in tri-syllabic words when the trigger is word initially show that there is a raise of 37.9 HZ in the first syllable, 26.8HZ in the second syllable and 14.2HZ in the third syllable. F2 mean values undergo a lowering of 239 HZ in the first syllable, 211.7 HZ in the second syllable and 176.5 in the third syllable. This gradual decrease in the difference of F2 values in the non-pharyngealized and pharyngealized context illustrates that the spread is gradient. A similar result was found when the trigger is word-final which proves that the spread is gradient (progressively and regressively.

Keywords: pharyngealization, Yemeni Arabic, Ibbi dialect, pharyngealization spread

Procedia PDF Downloads 201

1 The Roles of Mandarin and Local Dialect in the Acquisition of L2 English Consonants Among Chinese Learners of English: Evidence From Suzhou Dialect Areas

Authors: Weijing Zhou, Yuting Lei, Francis Nolan

Abstract:

In the domain of second language acquisition, whenever pronunciation errors or acquisition difficulties are found, researchers habitually attribute them to the negative transfer of the native language or local dialect. To what extent do Mandarin and local dialects affect English phonological acquisition for Chinese learners of English as a foreign language (EFL)? Little evidence, however, has been found via empirical research in China. To address this core issue, the present study conducted phonetic experiments to explore the roles of local dialects and Mandarin in Chinese EFL learners’ acquisition of L2 English consonants. Besides Mandarin, the sole national language in China, Suzhou dialect was selected as the target local dialect because of its distinct phonology from Mandarin. The experimental group consisted of 30 junior English majors at Yangzhou University, who were born and lived in Suzhou, acquired Suzhou Dialect since their early childhood, and were able to communicate freely and fluently with each other in Suzhou Dialect, Mandarin as well as English. The consonantal target segments were all the consonants of English, Mandarin and Suzhou Dialect in typical carrier words embedded in the carrier sentence Say again. The control group consisted of two Suzhou Dialect experts, two Mandarin radio broadcasters, and two British RP phoneticians, who served as the standard speakers of the three languages. The reading corpus was recorded and sampled in the phonetic laboratories at Yangzhou University, Soochow University and Cambridge University, respectively, then transcribed, segmented and analyzed acoustically via Praat software, and finally analyzed statistically via EXCEL and SPSS software. The main findings are as follows: First, in terms of correct acquisition rates (CARs) of all the consonants, Mandarin ranked top (92.83%), English second (74.81%) and Suzhou Dialect last (70.35%), and significant differences were found only between the CARs of Mandarin and English and between the CARs of Mandarin and Suzhou Dialect, demonstrating Mandarin was overwhelmingly more robust than English or Suzhou Dialect in subjects’ multilingual phonological ecology. Second, in terms of typical acoustic features, the average duration of all the consonants plus the voice onset time (VOT) of plosives, fricatives, and affricatives in 3 languages were much longer than those of standard speakers; the intensities of English fricatives and affricatives were higher than RP speakers but lower than Mandarin and Suzhou Dialect standard speakers; the formants of English nasals and approximants were significantly different from those of Mandarin and Suzhou Dialects, illustrating the inconsistent acoustic variations between the 3 languages. Thirdly, in terms of typical pronunciation variations or errors, there were significant interlingual interactions between the 3 consonant systems, in which Mandarin consonants were absolutely dominant, accounting for the strong transfer from L1 Mandarin to L2 English instead of from earlier-acquired L1 local dialect to L2 English. This is largely because the subjects were knowingly exposed to Mandarin since their nursery and were strictly required to speak in Mandarin through all the formal education periods from primary school to university.

Keywords: acquisition of L2 English consonants, role of Mandarin, role of local dialect, Chinese EFL learners from Suzhou Dialect areas

Procedia PDF Downloads 76