Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 58

Search results for: vowel

28 Equivalences and Contrasts in the Morphological Formation of Echo Words in Two Indo-Aryan Languages: Bengali and Odia

Authors: Subhanan Mandal, Bidisha Hore

Abstract:

The linguistic process whereby repetition of all or part of the base word with or without internal change before or after the base itself takes place is regarded as reduplication. The reduplicated morphological construction annotates with itself a new grammatical category and meaning. Reduplication is a very frequent and abundant phenomenon in the eastern Indian languages from the states of West Bengal and Odisha, i.e. Bengali and Odia respectively. Bengali, an Indo-Aryan language and a part of the Indo-European language family is one of the largest spoken languages in India and is the national language of Bangladesh. Despite this classification, Bengali has certain influences in terms of vocabulary and grammar due to its geographical proximity to Tibeto-Burman and Austro-Asiatic language speaking communities. Bengali along with Odia belonged to a single linguistic branch. But with time and gradual linguistic changes due to various factors, Odia was the first to break away and develop as a separate distinct language. However, less of contrasts and more of similarities still exist among these languages along the line of linguistics, leaving apart the script. This paper deals with the procedure of echo word formations in Bengali and Odia. The morphological research of the two languages concerning the field of reduplication reveals several linguistic processes. The revelation is based on the information elicited from native language speakers and also on the analysis of echo words found in discourse and conversational patterns. For the purpose of partial reduplication analysis, prefixed class and suffixed class word formations are taken into consideration which show specific rule based changes. For example, in suffixed class categorization, both consonant and vowel alterations are found, following the rules: i) CVx à tVX, ii) CVCV à CVCi. Further classifications were also found on sentential studies of both languages which revealed complete reduplication complexities while forming echo words where the head word lose its original meaning. Complexities based on onomatopoetic/phonetic imitation of natural phenomena and not according to any rule-based occurrences were also found. Taking these aspects into consideration which are very prevalent in both the languages, inferences are drawn from the study which bring out many similarities in both the languages in this area in spite of branching away from each other several years ago.

Keywords: consonant alteration, onomatopoetic, partial reduplication and complete reduplication, reduplication, vowel alteration

Procedia PDF Downloads 217

27 Reconstructing the Segmental System of Proto-Graeco-Phrygian: a Bottom-Up Approach

Authors: Aljoša Šorgo

Abstract:

Recent scholarship on Phrygian has begun to more closely examine the long-held belief that Greek and Phrygian are two very closely related languages. It is now clear that Graeco-Phrygian can be firmly postulated as a subclade of the Indo-European languages. The present paper will focus on the reconstruction of the phonological and phonetic segments of Proto-Graeco-Phrygian (= PGPh.) by providing relevant correspondence sets and reconstructing the classes of segments. The PGPh. basic vowel system consisted of ten phonemic oral vowels: */a e o ā ē ī ō ū/. The correspondences of the vowels are clear and leave little open to ambiguity. There were four resonants and two semi-vowels in PGPh.: */r l m n i̯ u̯/, which could appear in both a consonantal and a syllabic function, with the distribution between the two still being phonotactically predictable. Of note is the fact that the segments *m and *n seem to have merged when their phonotactic position would see them used in a syllabic function. Whether the segment resulting from this merger was a nasalized vowel (most likely *[ã]) or a syllabic nasal *[N̥] (underspecified for place of articulation) cannot be determined at this stage. There were three fricatives in PGPh.: */s h ç/. *s and *h are easily identifiable. The existence of *ç, which may seem unexpected, is postulated on the basis of the correspondence Gr. ὄς ~ Phr. yos/ιος. It is of note that Bozzone has previously proposed the existence of *ç ( < PIE *h₁i̯-) in an early stage of Greek even without taking into account Phrygian data. Finally, the system of stops in PGPh. distinguished four places of articulation (labial, dental, velar, and labiovelar) and three phonation types. The question of which three phonation types were actually present in PGPh. is one of great importance for the ongoing debate on the realization of the three series in PIE. Since the matter is still very much in dispute, we ought to, at this stage, endeavour to reconstruct the PGPh. system without recourse to the other IE languages. The three series of correspondences are: 1. Gr. T (= tenuis) ~ Phr. T; 2. Gr. D (= media) ~ Phr. T; 3. Gr. TA (= tenuis aspirata) ~ Phr. M. The first series must clearly be reconstructed as composed of voiceless stops. The second and third series are more problematic. With a bottom-up approach, neither the second nor the third series of correspondences are compatible with simple modal voicing, and the reflexes differ greatly in voice onset time. Rather, the defining feature distinguishing the two series was [±spread glottis], with ancillary vibration of the vocal cords. In PGPh. the second series was undergoing further spreading of the glottis. As the two languages split, this process would continue, but be affected by dissimilar changes in VOT, which was ultimately phonemicized in both languages as the defining feature distinguishing between their series of stops.

Keywords: bottom-up reconstruction, Proto-Graeco-Phrygian, spread glottis, syllabic resonant

Procedia PDF Downloads 11

26 The French, the Yoruba, and the H-Thing: Sharing and Realising Same Phenomenon Differently

Authors: Rose-Juliet Anyanwu

Abstract:

The principal objective of this paper is to investigate whether some sort of phonological processes, such as elision, aspiration, glottalisation, and hardening can be used to account for the behaviour of the glottal fricative (or approximant, as the case may be) ‘h’ in both French and Yoruba. French and Yoruba speakers generally tend to say, for instance ‘ockey’ and ‘amburger’, instead of ‘hockey’ and ‘hamburger’, respectively. Whereas the Yoruba conversely say, for instance ‘hadd’ for ‘add’, ‘heat’ for ‘eat’ on the one hand and ‘ard’ for ‘hard’, ‘eat’ for ‘heat’ on the other hand, on a similar note, it is not quite clear whether the French, however, if not at least in rare instances, would tend to force themselves to pronounce (in any form whatsoever) the h-sound. Recorded sentences containing h-initial as well as vowel-initial words will be used for the investigation. The present paper is meant to contribute to work on aspiration, compensation, elision, and glottalisation, as well as hardening.

Keywords: aspiration, compensation, glottalisation, hardening

Procedia PDF Downloads 141

25 Kannudi- A Reference Editor for Kannada (Based on OPOK! and OHOK! Principles, and Domain Knowledge)

Authors: Vishweshwar V. Dixit

Abstract:

Kannudi is a reference editor introducing a method of input for Kannada, called OHOK!, that is, Ottu Hāku Ottu Koḍu!. This is especially suited for pressure-sensitive input devices, though the current online implementation uses the regular mechanical keyboard. OHOK! has three possible modes, namely, sva-ottu (self-conjunct), kandante (as you see), and andante (as you say). It may be noted that kandante mode does not follow the phonetic order. However, this model may work well for those who are inclined to visualize as they type rather than vocalize the sounds. Kannudi also demonstrates how domain knowledge can be effectively used to potentially increase speed, accuracy, and user-friendliness. For example, selection of a default vowel, automatic shunyification, and arkification. Also implemented are four types of Deletes that are necessary for phono-syllabic languages like Kannada.

Keywords: kannada, conjunct, reference editor, pressure input

Procedia PDF Downloads 69

24 The Contrastive Survey of Phonetic Structure in Two Iranian Dialects

Authors: Iran Kalbasi, Foroozandeh Zardashti

Abstract:

Dialectology is a branch of social linguistics that studies systematic language variations. Dialects are the branches of a unique language that have structural, morphological and phonetic differences with each other. In Iran, these dialects and language variations themselves have a lot of cultural loads, and studying them have linguistic and cultural importance. In this study, phonetic structure of two Iranian dialects, Bakhtiyari Lori of Masjedsoleyman and Shushtari in Khuzestan Province of Iran have been surveyed. Its statistical community includes twenty speakers of two dialects. The theoretic bases of this research is based on structuralism. Its data have been collected by interviewing the questionnaire that consist of 3000 words, 410 sentences and 110 complex and simple verbs. These datas are analysed and described synchronically. Then, the phonetic characteristics of these two dialects and standard Persian have been compared. Therefore, we can say that in phonetic level of these two dialects and standard Persian, there are clearly differences.

Keywords: standard language, dialectology, bakhtiyari lori dialect of Masjedsoleyman, Shushtari dialect, vowel, consonant

Procedia PDF Downloads 564

23 English Loanwords in the Egyptian Variety of Arabic: Morphological and Phonological Changes

Authors: Mohamed Yacoub

Abstract:

This paper investigates the English loanwords in the Egyptian variety of Arabic and reaches three findings. Data, in the first finding, were collected from Egyptian movies and soap operas; over two hundred words have been borrowed from English, code-switching was not included. These words then have been put into eleven different categories according to their use and part of speech. Finding two addresses the morphological and phonological change that occurred to these words. Regarding the phonological change, eight categories were found in both consonant and vowel variation, five for consonants and three for vowels. Examples were given for each. Regarding the morphological change, five categories were found including the masculine, feminine, dual, broken, and non-pluralize-able nouns. The last finding is the answers to a four-question survey that addresses forty eight native speakers of Egyptian Arabic and found that most participants did not recognize English borrowed words and thought they were originally Arabic and could not give Arabic equivalents for the loanwords that they could recognize.

Keywords: sociolinguistics, loanwords, borrowing, morphology, phonology, variation, Egyptian dialect

Procedia PDF Downloads 354

22 Comparing Sounds of the Singing Voice

Authors: Christel Elisabeth Bonin

Abstract:

This experiment aims at showing that classical singing and belting have both different singing qualities, but singing with a speaking voice has no singing quality. For this purpose, a singing female voice was recorded on four different tone pitches, singing the vowel ‘a’ by using 3 different kinds of singing - classical trained voice, belting voice and speaking voice. The recordings have been entered in the Software Praat. Then the formants of each recorded tone were compared to each other and put in relationship to the singer’s formant. The visible results are taken as an indicator of comparable sound qualities of a classical trained female voice and a belting female voice concerning the concentration of overtones in F1 to F5 and a lack of sound quality in the speaking voice for singing purpose. The results also show that classical singing and belting are both valuable vocal techniques for singing due to their richness of overtones and that belting is not comparable to shouting or screaming. Singing with a speaking voice in contrast should not be called singing due to the lack of overtones which means by definition that there is no musical tone.

Keywords: formants, overtone, singer’s formant, singing voice, belting, classical singing, singing with the speaking voice

Procedia PDF Downloads 299

21 Acoustic Characteristics of Ḫijaiyaḫ Letters Pronunciation by Indonesian Native Speaker

Authors: Romi Hardiyansyah, Raden Sugeng Joko Sarwono, Agus Samsi

Abstract:

Indonesian people have a mother language but not Arabic. Meanwhile, they must be able to pronounce the Arabic because Islam is the biggest religion in Indonesia. Arabic is composed by ḫijaiyaḫ letters which has its own pronunciation. Sound production process in humans can be divided into three physiological processes, namely: the formation of airflow from the lungs, the change in airflow from the lungs into the sound, and articulation (the modulation/sound setting into a specific sound). Ḫijaiyaḫ letters has its own articulation, some of which seem strange for most people in Indonesia. Those letters come out from the middle and upper throat so that the letters has its own acoustic characteristics. Acoustic characteristics of voice can be observed by source-filter approach that has parameters: pitch, formant, and formant bandwidth. Pitch is the basic tone in every human being. Formant is the resonance frequency of the human voice. Formant bandwidth is the time-width of a formant. After recording the sound from 21 subjects, data is processed by software Praat version 5.3.39. The analysis showed that each pronunciation, syakal (vowel changer), and the place of discharge letters has the same timbre which are determined by third and fourth formant.

Keywords: ḫijaiyaḫ, articulation, pitch, formant, formant bandwidth, timbre

Procedia PDF Downloads 363

20 Analysis of Vocal Pathologies Through Subglottic Pressure Measurement

Authors: Perla Elizabeth Jimarez Rocha, Carolina Daniela Tejeda Franco, Arturo Minor Martínez, Annel Gomez Coello

Abstract:

One of the biggest problems in developing new therapies for the management and treatment of voice disorders is the difficulty of objectively evaluating the results of each treatment. A system was proposed that captures and records voice signals, in addition to analyzing the vocal quality (fundamental frequency, zero crossings, energy, and amplitude spectrum), as well as the subglottic pressure (cm H2O) during the sustained phonation of the vowel / a /; a recording system is implemented, as well as an interactive system that records information on subglottic pressure. In Mexico City, a control group of 31 patients with phoniatric pathology is proposed; non-invasive tests were performed for these most common vocal pathologies (Nodules, Polyps, Irritative Laryngitis, Ventricular Dysphonia, Laryngeal Cancer, Dysphonia, and Dysphagia). The most common pathology was irritative laryngitis (32%), followed by vocal fold paralysis (unilateral and bilateral,19.4 %). We take into consideration men and women in the pathological groups due to the physiological difference. They were separated in gender by the difference in the morphology of the respiratory tract.

Keywords: amplitude spectrum, energy, fundamental frequency, subglottic pressure, zero crossings

Procedia PDF Downloads 91

19 Descriptive Analysis of Variations in Maguindanaon Language

Authors: Fhajema Kunso

Abstract:

People who live in the same region and who seemed to speak the same language still vary in some aspects of their language. The variation may occur in terms of pronunciation, lexicon, morphology, and syntax. This qualitative study described the phonological, morphological, and lexical variations of the Maguindanaon language among the ten Maguindanao municipalities. Purposive sampling, in-depth interviews, focus group discussion, and sorting and classifying of words according to phonological and morphological as well as lexical structures in data analysis were employed. The variations occurred through phonemic changes and other phonological processes and morphological processes. Phonological processes consisted of vowel lengthening and deletion while morphological processes included affixation, borrowing, and coinage. In the phonological variation, it was observed that there were phonemic changes in one dialect to another. For example, there was a change of phoneme /r/ to /l/. The phoneme /r/ was most likely to occur in Kabuntalan like /biru/, /kurIt/, and /kɘmɅr/ whereas in the rest of the dialects these were /bilu/, /kuIɪt/, and /kɘmɅl/ respectively. Morphologically, the affixation was the main way to know the tenses. For example, the root sarig (expect) when inserted with im becomes simarig, i.e. s + im + arig = simarig (expected). Lexical variation also existed in the Maguindanaon language. Results revealed that the variation in phonology, morphology, and lexicon were observed to be associated primarily on geographic distribution.

Keywords: applied linguistics, language, lexicon, Maguindanao, morphology, Philippines, phonology, processes, qualitative, variation

Procedia PDF Downloads 356

18 The Development, Composition, and Implementation of Vocalises as a Method of Technical Training for the Adult Musical Theatre Singer

Authors: Casey Keenan Joiner, Shayna Tayloe

Abstract:

Classical voice training for the novice singer has long relied on the guidance and instruction of vocalise collections, such as those written and compiled by Marchesi, Lütgen, Vaccai, and Lamperti. These vocalise collections purport to encourage healthy vocal habits and instill technical longevity in both aspiring and established singers, though their scope has long been somewhat confined to the classical idiom. For pedagogues and students specializing in other vocal genres, such as musical theatre and CCM (contemporary commercial music,) low-impact and pertinent vocal training aids are in short supply, and much of the suggested literature derives from classical methodology. While the tenants of healthy vocal production remain ubiquitous, specific stylistic needs and technical emphases differ from genre to genre and may require a specified extension of vocal acuity. As musical theatre continues to grow in popularity at both the professional and collegiate levels, the need for specialized training grows as well. Pedagogical literature geared specifically towards musical theatre (MT) singing and vocal production, while relatively uncommon, is readily accessible to the contemporary educator. Practitioners such as Norman Spivey, Mary Saunders Barton, Claudia Friedlander, Wendy Leborgne, and Marci Rosenberg continue to publish relevant research in the field of musical theatre voice pedagogy and have successfully identified many common MT vocal faults, their subsequent diagnoses, and their eventual corrections. Where classical methodology would suggest specific vocalises or training exercises to maintain corrected vocal posture following successful fault diagnosis, musical theatre finds itself without a relevant body of work towards which to transition. By analyzing the existing vocalise literature by means of a specialized set of parameters, including but not limited to melodic variation, rhythmic complexity, vowel utilization, and technical targeting, we have composed a set of vocalises meant specifically to address the training and conditioning of adult musical theatre voices. These vocalises target many pedagogical tenants in the musical theatre genre, including but not limited to thyroarytenoid-dominant production, twang resonance, lateral vowel formation, and “belt-mix.” By implementing these vocalises in the musical theatre voice studio, pedagogues can efficiently communicate proper musical theatre vocal posture and kinesthetic connection to their students, regardless of age or level of experience. The composition of these vocalises serves MT pedagogues on both a technical level as well as a sociological one. MT is a relative newcomer on the collegiate stage and the academization of musical theatre methodologies has been a slow and arduous process. The conflation of classical and MT techniques and training methods has long plagued the world of voice pedagogy and teachers often find themselves in positions of “cross-training,” that is, teaching students of both genres in one combined voice studio. As MT continues to establish itself on academic platforms worldwide, genre-specific literature and focused studies are both rare and invaluable. To ensure that modern students receive exacting and definitive training in their chosen fields, it becomes increasingly necessary for genres such as musical theatre to boast specified literature and a collection of musical theatre-specific vocalises only aids in this effort. This collection of musical theatre vocalises is the first of its kind and provides genre-specific studios with a basis upon which to grow healthy, balanced voices built for the harsh conditions of the modern theatre stage.

Keywords: voice pedagogy, targeted methodology, musical theatre, singing

Procedia PDF Downloads 127

17 Phonetics and Phonological Investigation of Geminates and Gemination in Some Indic Languages

Authors: Hifzur Ansary

Abstract:

The aim and scope of the present research are to delve into the form of geminates and the process of gemination with special reference to Indic Languages. This work presents the results of a cross-linguistic investigation of word-medial geminate consonants. This study is a theoretical as well as experimental, that is, it is based not only on impressionistic data from Indic languages but also on an instrumental analysis of the data. The primary data have been collected from the native speakers. The secondary data have been collected from printed materials such as journals, grammar books and other published articles. The observations made in this study have been checked with a number of educated native speakers of Bangla and Telugu. The study focuses on geminates and gemination in Bangla (Indo-Aryan Language Family) and Telugu (Dravidian Language family) exhaustively. Thus this study also attempts to posit the valid geminates in Bangali and Telugu and provides an account of gemination in these languages. It also makes a comparison of singletons and geminated consonants. It describes the distribution of geminate phonemes and non-geminate phonemes of Bangla and Telugu. The present research would also investigate the vowel lengthening in Bangla with respect to gemination. The study also explains how gemination processes present in Indian Languages are transferred to Indian English.

Keywords: geminate consonant, singleton-geminate contrast, different types of assimilation, gemination derives from borrowed words

Procedia PDF Downloads 245

16 Uvulars Alternation in Hasawi Arabic: A Harmonic Serialism Approach

Authors: Huda Ahmed Al Taisan

Abstract:

This paper investigates a phonological phenomenon, which exhibits variation ‘alternation’ in terms of the uvular consonants [q] and [ʁ] in Hasawi Arabic. This dialect is spoken in Alahsa city, which is located in the Eastern province of Saudi Arabia. To the best of our knowledge, no such research has systematically studied this phenomenon in Hasawi Arabic dialect. This paper is significant because it fills the gap in the literature about this alternation phenomenon in this understudied dialect. A large amount of the data is extracted from several interviews the author has conducted with 10 participants, native speakers of the dialect, and complemented by additional forms from social media. The latter method of collecting the data adds to the significance of the research. The analysis of the data is carried out in Harmonic Serialism Optimality Theory (HS-OT), a version of the Optimality Theoretic (OT) framework, which holds that linguistic forms are the outcome of the interaction among violable universal constraints, and in the recent development of OT into a model that accounts for linguistic variation in harmonic derivational steps. This alternation process is assumed to be phonologically unconditioned and in free variation in other varieties of Arabic dialects in the area. The goal of this paper is to investigate whether this phenomenon is in free variation or governed, what governs this alternation between [q] and [ʁ] and whether the alternation is phonological or other linguistic constraints are in action. The results show that the [q] and [ʁ] alternation is not free and it occurs due to different assimilation processes. Positional, segmental sequence and vowel adjacency factors are in action in Hasawi Arabic.

Keywords: harmonic serialism, Hasawi, uvular, variation

Procedia PDF Downloads 474

15 Visual Speech Perception of Arabic Emphatics

Authors: Maha Saliba Foster

Abstract:

Speech perception has been recognized as a bi-sensory process involving the auditory and visual channels. Compared to the auditory modality, the contribution of the visual signal to speech perception is not very well understood. Studying how the visual modality affects speech recognition can have pedagogical implications in second language learning, as well as clinical application in speech therapy. The current investigation explores the potential effect of speech visual cues on the perception of Arabic emphatics (AEs). The corpus consists of 36 minimal pairs each containing two contrasting consonants, an AE versus a non-emphatic (NE). Movies of four Lebanese speakers were edited to allow perceivers to have partial view of facial regions: lips only, lips-cheeks, lips-chin, lips-cheeks-chin, lips-cheeks-chin-neck. In the absence of any auditory information and relying solely on visual speech, perceivers were above chance at correctly identifying AEs or NEs across vowel contexts; moreover, the models were able to predict the probability of perceivers’ accuracy in identifying some of the COIs produced by certain speakers; additionally, results showed an overlap between the measurements selected by the computer and those selected by human perceivers. The lack of significant face effect on the perception of AEs seems to point to the lips, present in all of the videos, as the most important and often sufficient facial feature for emphasis recognition. Future investigations will aim at refining the analyses of visual cues used by perceivers by using Principal Component Analysis and including time evolution of facial feature measurements.

Keywords: Arabic emphatics, machine learning, speech perception, visual speech perception

Procedia PDF Downloads 272

14 The Effect of Phonetics Factors in Interpretation of Japanese Degree Adverbs

Authors: Yan Lyu

Abstract:

Japanese degree adverbs can be explained in different ways, which is hard for Japanese learners to comprehend. For instance, when ‘tyotto’ is used as a degree word, it can be interpreted literally or not. In the sentence ‘Ano mise, tyotto oishi yo. zehi iku to ii yo.’, ‘tyotto’ can be interpreted as a high degree contextually. Despite pragmatic factors, phonetics factors can also affect the interpretation of such ‘tyotto’. Concentrating on the pattern of ‘tyotto +adjective’, the paper aims to investigate the correlation between the interpretation of ‘tyotto’ and the phonetic factors in some specific contexts based on a listening experiment via PRAAT. It is also investigated that how the phonetic factors affect the interpretation of high degree adverbs, including ‘soutou’ , ‘totemo’ , ‘kanari’ and ‘sugoku’. In the experiment, Japanese speakers listened to sentences which were composed of degree adverbs and adjectives in different intonations and judged which degree the sentences expressed. Two conclusions can be drawn from the experiment results. Firstly, for adverbs expressing a high degree, in the pattern of ‘degree adverb + adjective’, either degree adverb or adjective is pronounced in a higher pitch, or both are highly pronounced, a higher degree can be expressed. Besides, with the insertion of geminate consonant and the extension of the vowel, the longer the duration of the degree adverb becomes, the higher degree can be expressed. Secondly, for ‘tyotto’, which expresses a low degree, the interpretation will be influenced by both phonetic and contextual factors. Phonetically, there are three factors causing ‘tyotto’ to be interpreted as a common degree or a high degree. The three factors are the high pitch of the modified adjective, the extended silence period of the geminate consonant and the change in the intonations of ‘tyotto’. In some contexts just like the comparison sentences, no matter how ‘tyotto + adjective’ is pronounced, ‘tyotto’ tends to be interpreted as a low degree literally.

Keywords: contextual interpretation, Japanese degree adverbs, phonetic interpretation, PRAAT

Procedia PDF Downloads 239

13 Role of Speech Articulation in English Language Learning

Authors: Khadija Rafi, Neha Jamil, Laiba Khalid, Meerub Nawaz, Mahwish Farooq

Abstract:

Speech articulation is a complex process to produce intelligible sounds with the help of precise movements of various structures within the vocal tract. All these structures in the vocal tract are named as articulators, which comprise lips, teeth, tongue, and palate. These articulators work together to produce a range of distinct phonemes, which happen to be the basis of language. It starts with the airstream from the lungs passing through the trachea and into oral and nasal cavities. When the air passes through the mouth, the tongue and the muscles around it form such coordination it creates certain sounds. It can be seen when the tongue is placed in different positions- sometimes near the alveolar ridge, soft palate, roof of the mouth or the back of the teeth which end up creating unique qualities of each phoneme. We can articulate vowels with open vocal tracts, but the height and position of the tongue is different every time depending upon each vowel, while consonants can be pronounced when we create obstructions in the airflow. For instance, the alphabet ‘b’ is a plosive and can be produced only by briefly closing the lips. Articulation disorders can not only affect communication but can also be a hurdle in speech production. To improve articulation skills for such individuals, doctors often recommend speech therapy, which involves various kinds of exercises like jaw exercises and tongue twisters. However, this disorder is more common in children who are going through developmental articulation issues right after birth, but in adults, it can be caused by injury, neurological conditions, or other speech-related disorders. In short, speech articulation is an essential aspect of productive communication, which also includes coordination of the specific articulators to produce different intelligible sounds, which are a vital part of spoken language.

Keywords: linguistics, speech articulation, speech therapy, language learning

Procedia PDF Downloads 32

12 Perceptual and Ultrasound Articulatory Training Effects on English L2 Vowels Production by Italian Learners

Authors: I. Sonia d’Apolito, Bianca Sisinni, Mirko Grimaldi, Barbara Gili Fivela

Abstract:

The American English contrast /ɑ-ʌ/ (cop-cup) is difficult to be produced by Italian learners since they realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively, due to differences in phonetic-phonological systems and also in grapheme-to-phoneme conversion rules. In this paper, we try to answer the following research questions: Can a short training improve the production of English /ɑ-ʌ/ by Italian learners? Is a perceptual training better than an articulatory (ultrasound - US) training? Thus, we compare a perceptual training with an US articulatory one to observe: 1) the effects of short trainings on L2-/ɑ-ʌ/ productions; 2) if the US articulatory training improves the pronunciation better than the perceptual training. In this pilot study, 9 Salento-Italian monolingual adults participated: 3 subjects performed a 1-hour perceptual training (ES-P); 3 subjects performed a 1-hour US training (ES-US); and 3 control subjects did not receive any training (CS). Verbal instructions about the phonetic properties of L2-/ɑ-ʌ/ and L1-/ɔ-a/ and their differences (representation on F1-F2 plane) were provided during both trainings. After these instructions, the ES-P group performed an identification training based on the High Variability Phonetic Training procedure, while the ES-US group performed the articulatory training, by means of US video of tongue gestures in L2-/ɑ-ʌ/ production and dynamic view of their own tongue movements and position using a probe under their chin. The acoustic data were analyzed and the first three formants were calculated. Independent t-tests were run to compare: 1) /ɑ-ʌ/ in pre- vs. post-test respectively; /ɑ-ʌ/ in pre- and post-test vs. L1-/a-ɔ/ respectively. Results show that in the pre-test all speakers realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively. Contrary to CS and ES-P groups, the ES-US group in the post-test differentiates the L2 vowels from those produced in the pre-test as well as from the L1 vowels, although only one ES-US subject produces both L2 vowels accurately. The articulatory training seems more effective than the perceptual one since it favors the production of vowels in the correct direction of L2 vowels and differently from the similar L1 vowels.

Keywords: L2 vowel production, perceptual training, articulatory training, ultrasound

Procedia PDF Downloads 220

11 Translation of the Verbal Nouns (Masadars) Originating from Three-Letter Verbs in the Holy Quran: Verbal Noun with More than One Pattern (Wazn) As a Model

Authors: Montasser Mohamed Abdelwahab Mahmoud, Abdelwahab Saber Esawi

Abstract:

The language of the Qur’an has a wide range of understanding, reflection, and meanings. Therefore, translation of the Qur’an is inevitably nothing but a translation of the interpretation of the meanings of the Qur’an. It requires special competencies and skills for translators so that they can get close to the intended meaning of the verse of the Qur’an and convey it with precision. In the Arabic language, the verbal noun “AlMasdar” is a very important derivative that properly expresses the verbal idea in the form of a noun. It sounds the same as the base form of the verb with minor changes in the vowel pattern. It is one of the important topics in morphology. The morphologists divided verbal nouns into auditory and analogical, and they stated that that the verbal nouns (Masadars) originating from three-letter verbs are auditory, although they set controls for some of them in order to preserve them. As for the lexicographers, they mentioned the verbal nouns while talking about the lexical materials, and in some cases, their explanation of them exceeded that made by the morphologists, especially in their discussion of structures that the morphologists did not refer to in their books. The verb kafara (disbelief), for example, has three patterns, namely: al-kufْr, al-kufrān, and al-kufūr, and it was mentioned in the Holy Qur’an with different connotations. The verb ṣāma (fasted) with his two patterns (al-ṣaūm and al-ṣīām) was mentioned in the Holy Qur’an while their semantic meaning is different. The problem discussed in this research paper lied in the "linguistic loss" committed by translators when dealing with Islamic religious texts, especially the Qur'an. The study tried to identify the strategy adopted by translators of the Holy Qur'an in translating words that were classified as verbal nouns through analyzing the translation rendered by five translations of the Qur’an into English: Yusuf Ali, Pickthall, Mohsin Khan, Muhammad Sarwar, and Shakir. This study was limited to the verbal nouns in the Quraan that originate from three-letter verbs and have different semantic meanings.

Keywords: pattern, three-letter verbs, translation of the Quran, verbal nouns

Procedia PDF Downloads 123

10 An Event-Related Potential Investigation of Speech-in-Noise Recognition in Native and Nonnative Speakers of English

Authors: Zahra Fotovatnia, Jeffery A. Jones, Alexandra Gottardo

Abstract:

Speech communication often occurs in environments where noise conceals part of a message. Listeners should compensate for the lack of auditory information by picking up distinct acoustic cues and using semantic and sentential context to recreate the speaker’s intended message. This situation seems to be more challenging in a nonnative than native language. On the other hand, early bilinguals are expected to show an advantage over the late bilingual and monolingual speakers of a language due to their better executive functioning components. In this study, English monolingual speakers were compared with early and late nonnative speakers of English to understand speech in noise processing (SIN) and the underlying neurobiological features of this phenomenon. Auditory mismatch negativities (MMNs) were recorded using a double-oddball paradigm in response to a minimal pair that differed in their middle vowel (beat/bit) at Wilfrid Laurier University in Ontario, Canada. The results did not show any significant structural and electroneural differences across groups. However, vocabulary knowledge correlated positively with performance on tests that measured SIN processing in participants who learned English after age 6. Moreover, their performance on the test negatively correlated with the integral area amplitudes in the left superior temporal gyrus (STG). In addition, the STG was engaged before the inferior frontal gyrus (IFG) in noise-free and low-noise test conditions in all groups. We infer that the pre-attentive processing of words engages temporal lobes earlier than the fronto-central areas and that vocabulary knowledge helps the nonnative perception of degraded speech.

Keywords: degraded speech perception, event-related brain potentials, mismatch negativities, brain regions

Procedia PDF Downloads 70

9 Experimental Research and Analyses of Yoruba Native Speakers’ Chinese Phonetic Errors

Authors: Obasa Joshua Ifeoluwa

Abstract:

Phonetics is the foundation and most important part of language learning. This article, through an acoustic experiment as well as using Praat software, uses Yoruba students’ Chinese consonants, vowels, and tones pronunciation to carry out a visual comparison with that of native Chinese speakers. This article is aimed at Yoruba native speakers learning Chinese phonetics; therefore, Yoruba students are selected. The students surveyed are required to be at an elementary level and have learned Chinese for less than six months. The students selected are all undergraduates majoring in Chinese Studies at the University of Lagos. These students have already learned Chinese Pinyin and are all familiar with the pinyin used in the provided questionnaire. The Chinese students selected are those that have passed the level two Mandarin proficiency examination, which serves as an assurance that their pronunciation is standard. It is discovered in this work that in terms of Mandarin’s consonants pronunciation, Yoruba students cannot distinguish between the voiced and voiceless as well as the aspirated and non-aspirated phonetics features. For instance, while pronouncing [ph] it is clearly shown in the spectrogram that the Voice Onset Time (VOT) of a Chinese speaker is higher than that of a Yoruba native speaker, which means that the Yoruba speaker is pronouncing the unaspirated counterpart [p]. Another difficulty is to pronounce some affricates like [tʂ]、[tʂʰ]、[ʂ]、[ʐ]、 [tɕ]、[tɕʰ]、[ɕ]. This is because these sounds are not in the phonetic system of the Yoruba language. In terms of vowels, some students find it difficult to pronounce some allophonic high vowels such as [ɿ] and [ʅ], therefore pronouncing them as their phoneme [i]; another pronunciation error is pronouncing [y] as [u], also as shown in the spectrogram, a student pronounced [y] as [iu]. In terms of tone, it is most difficult for students to differentiate between the second (rising) and third (falling and rising) tones because these tones’ emphasis is on the rising pitch. This work concludes that the major error made by Yoruba students while pronouncing Chinese sounds is caused by the interference of their first language (LI) and sometimes by their lingua franca.

Keywords: Chinese, Yoruba, error analysis, experimental phonetics, consonant, vowel, tone

Procedia PDF Downloads 76

8 Symo-syl: A Meta-Phonological Intervention to Support Italian Pre-Schoolers’ Emergent Literacy Skills

Authors: Tamara Bastianello, Rachele Ferrari, Marinella Majorano

Abstract:

The adoption of the syllabic approach in preschool programmes could support and reinforce meta-phonological awareness and literacy skills in children. The introduction of a meta-phonological intervention in preschool could facilitate the transition to primary school, especially for children with learning fragilities. In the present contribution, we want to investigate the efficacy of "Simo-syl" intervention in enhancing emergent literacy skills in children (especially for reading). Simo-syl is a 12 weeks multimedia programme developed for children to improve their language and communication skills and later literacy development in preschool. During the intervention, Simo-syl, an invented character, leads children in a series of meta-phonological games. Forty-six Italian preschool children (i.e., the Simo-syl group) participated in the programme; seventeen preschool children (i.e., the control group) did not participate in the intervention. Children in the two groups were between 4;10 and 5;9 years. They were assessed on their vocabulary, morpho-syntactical, meta-phonological, phonological, and phono-articulatory skills twice: 1) at the beginning of the last year of the preschool through standardised paper-based assessment tools and 2) one week after the intervention. All children in the Simo-syl group took part in the meta-phonological programme based on the syllabic approach. The intervention lasted 12 weeks (three activities per week; week 1: activities focused on syllable blending and spelling and a first approach to the written code; weeks 2-11: activities focused on syllables recognition; week 12: activities focused on vowels recognition). Very few children (Simo-syl group = 21, control group = 9) were tested again (post-test) one week after the intervention. Before starting the intervention programme, the Simo-syl and the control groups had similar meta-phonological, phonological, lexical skills (all ps > .05). One week after the intervention, a significant difference emerged between the two groups in their meta-phonological skills (syllable blending, p = .029; syllable spelling, p = .032), in their vowel recognition ability (p = .032) and their word reading skills (p = .05). An ANOVA confirmed the effect of the group membership on the developmental growth for the word reading task (F (1,28) = 6.83, p = .014, ηp2 = .196). Taking part in the Simo-syl intervention has a positive effect on the ability to read in preschool children.

Keywords: intervention programme, literacy skills, meta-phonological skills, syllabic approach

Procedia PDF Downloads 134

7 Pharyngealization Spread in Ibbi Dialect of Yemeni Arabic: An Acoustic Study

Authors: Fadhl Qutaish

Abstract:

This paper examines the pharyngealization spread in one of the Yemeni Arabic dialects, namely, Ibbi Arabic (IA). It investigates how pharyngealized sounds spread their acoustic features onto the neighboring vowels and change their default features. This feature has been investigated quietly well in MSA but still has to be deeply studied in the different dialect of Arabic which will bring about a clearer picture of the similarities and the differences among these dialects and help in mapping them based on the way this feature is utilized. Though the studies are numerous, no one of them has illustrated how far in the multi-syllabic word the spread can be and whether it takes a steady or gradient manner. This study tries to fill this gap and give a satisfactory explanation of the pharyngealization spread in Ibbi Dialect. This study is the first step towards a larger investigation of the different dialects of Yemeni Arabic in the future. The data recorded are represented in minimal pairs in which the trigger (pharyngealized or the non-pharyngealized sound) is in the initial or final position of monosyllabic and multisyllabic words. A group of 24 words were divided into four groups and repeated three times by three subjects which will yield 216 tokens that are tested and analyzed. The subjects are three male speakers aged between 28 and 31 with no history of neurological, speaking or hearing problems. All of them are bilingual speakers of Arabic and English and native speakers of Ibbi-Dialect. Recordings were done in a sound-proof room and praat software was used for the analysis and coding of the trajectories of F1 and F2 for the low vowel /a/ to see the effect of pharyngealization on the formant trajectory within the same syllable and in other syllables of the same word by comparing the F1 and F2 formants to the non-pharyngealized environment. The results show that pharyngealization spread is gradient (progressively and regressively). The spread is reflected in the gradual raising of F1 as we move closer towards the trigger and the gradual lowering of F2 as well. The results of the F1 mean values in tri-syllabic words when the trigger is word initially show that there is a raise of 37.9 HZ in the first syllable, 26.8HZ in the second syllable and 14.2HZ in the third syllable. F2 mean values undergo a lowering of 239 HZ in the first syllable, 211.7 HZ in the second syllable and 176.5 in the third syllable. This gradual decrease in the difference of F2 values in the non-pharyngealized and pharyngealized context illustrates that the spread is gradient. A similar result was found when the trigger is word-final which proves that the spread is gradient (progressively and regressively.

Keywords: pharyngealization, Yemeni Arabic, Ibbi dialect, pharyngealization spread

Procedia PDF Downloads 193

6 Second Language Perception of Japanese /Cju/ and /Cjo/ Sequences by Mandarin-Speaking Learners of Japanese

Authors: Yili Liu, Honghao Ren, Mariko Kondo

Abstract:

In the field of second language (L2) speech learning, it is well-known that that learner’s first language (L1) phonetic and phonological characteristics will be transferred into their L2 production and perception, which lead to foreign accent. For L1 Mandarin learners of Japanese, the confusion of /u/ and /o/ in /CjV/ sequences has been observed in their utterance frequently. L1 transfer is considered to be the cause of this issue, however, other factors which influence the identification of /Cju/ and /Cjo/ sequences still under investigation. This study investigates the perception of Japanese /Cju/ and /Cjo/ units by L1 Mandarin learners of Japanese. It further examined whether learners’ proficiency, syllable position, phonetic features of preceding consonants and background noise affect learners’ performance in perception. Fifty-two Mandarin-speaking learners of Japanese and nine native Japanese speakers were recruited to participate in an identification task. Learners were divided into beginner, intermediate and advanced level according to their Japanese proficiency. The average correct rate was used to evaluate learners’ perceptual performance. Furthermore, the comparison of the correct rate between learners’ groups and the control group was conducted as well to examine learners’ nativelikeness. Results showed that background noise tends to pose an adverse effect on distinguishing /u/ and /o/ in /CjV/ sequences. Secondly, Japanese proficiency has no influence on learners’ perceptual performance in the quiet and in background noise. Then all learners did not reach a native-like level without the distraction of noise. Beginner level learners performed less native-like, although higher level learners appeared to have achieved nativelikeness in the multi-talker babble noise. Finally, syllable position tends to affect distinguishing /Cju/ and /Cjo/ only under the noisy condition. Phonetic features of preceding consonants did not impact learners’ perception in any listening conditions. Findings in this study can give an insight into a further understanding of Japanese vowel acquisition by L1 Mandarin learners of Japanese. In addition, this study indicates that L1 transfer is not the only explanation for the confusion of /u/ and /o/ in /CjV/ sequences, factors such as listening condition and syllable position are also needed to take into consideration in future research. It also suggests the importance of perceiving speech in a noisy environment, which is close to the actual conversation required more attention to pedagogy.

Keywords: background noise, Chinese learners of Japanese, /Cju/ and /Cjo/ sequences, second language perception

Procedia PDF Downloads 135

5 The Strategies and Mediating Processes of Learning the Inflectional Morphology in English: A Case Study for Taiwanese English Learners

Authors: Hsiu-Ling Hsu, En-Minh (John) Lan

Abstract:

Pronunciation has received more and more language researchers’ and teachers’ attention because it is important for effective or even successful communication. How to consistently and correctly orally produce verbal morphology, such as English regular past tense inflection, has been a big challenge and troublesome for FL learners. The research aims to explore EFL (English as a foreign language) learners’ developmental trajectory of the inflectional morphology, that is, what mediating processes and strategies EFL learners use, to attain native-like prosodic structure of inflectional morphemes (e.g., –ed and –s suffixes) by comparing the differences among EFL learners at different English levels. This research adopted a self-repair analysis and Prosodic Transfer Hypothesis with three developmental stages as a theoretical framework. To answer the research questions, we conducted two experiments, grammatical tense test written production (Experiment 1) and read-aloud oral production (Experiment 2), and recruited 30 participants who were divided into three groups, low-, middle-, and advanced EFL learners. Experiment 1 was conducted to ensure that participants had learned the knowledge of forming the English regular past tense rules and Experiment 2 was carried out to compare the data across FL English learner groups at different English levels. The EFL learners’ self-repair data showed at least four interesting findings. First, low achievers were more sensitive to the plural suffix -s than the past tense suffix -ed. Middle achievers exhibited a greater responsiveness to the past tense suffix, while high achievers demonstrated equal sensitivity to both suffixes. Additionally, two strategies used by EFL English learners to produce verbs and nouns with inflectional morphemes were to delete internal syllable and to divide a four-syllable verb (e.g., ‘graduated’) into two prosodic structures (e.g., ‘gradu’ and ‘ated’ or ‘gradua’ and ‘ted’). Third, true vowel epenthesis was found only in the low EFL achievers. Moreover fortition (native-like sound) was observed in the low and middle EFL achievers. These findings and self-repair data disclosed mediating processes between the developmental stages and provided insight on how Taiwan EFL learners attained the adjunction prosodic structures of inflectional Morphemes in English.

Keywords: inflectional morphology, prosodic structure, developmental trajectory, strategies and mediating processes, English as a foreign language

Procedia PDF Downloads 22

4 Quantum Cum Synaptic-Neuronal Paradigm and Schema for Human Speech Output and Autism

Authors: Gobinathan Devathasan, Kezia Devathasan

Abstract:

Objective: To improve the current modified Broca-Wernicke-Lichtheim-Kussmaul speech schema and provide insight into autism. Methods: We reviewed the pertinent literature. Current findings, involving Brodmann areas 22, 46, 9,44,45,6,4 are based on neuropathology and functional MRI studies. However, in primary autism, there is no lucid explanation and changes described, whether neuropathology or functional MRI, appear consequential. Findings: We forward an enhanced model which may explain the enigma related to autism. Vowel output is subcortical and does need cortical representation whereas consonant speech is cortical in origin. Left lateralization is needed to commence the circuitry spin as our life have evolved with L-amino acids and left spin of electrons. A fundamental species difference is we are capable of three syllable-consonants and bi-syllable expression whereas cetaceans and songbirds are confined to single or dual consonants. The 4 key sites for speech are superior auditory cortex, Broca’s two areas, and the supplementary motor cortex. Using the Argand’s diagram and Reimann’s projection, we theorize that the Euclidean three dimensional synaptic neuronal circuits of speech are quantized to coherent waves, and then decoherence takes place at area 6 (spherical representation). In this quantum state complex, 3-consonant languages are instantaneously integrated and multiple languages can be learned, verbalized and differentiated. Conclusion: We postulate that evolutionary human speech is elevated to quantum interaction unlike cetaceans and birds to achieve the three consonants/bi-syllable speech. In classical primary autism, the sudden speech switches off and on noted in several cases could now be explained not by any anatomical lesion but failure of coherence. Area 6 projects directly into prefrontal saccadic area (8); and this further explains the second primary feature in autism: lack of eye contact. The third feature which is repetitive finger gestures, located adjacent to the speech/motor areas, are actual attempts to communicate with the autistic child akin to sign language for the deaf.

Keywords: quantum neuronal paradigm, cetaceans and human speech, autism and rapid magnetic stimulation, coherence and decoherence of speech

Procedia PDF Downloads 160

3 Using Locus Equations for Berber Consonants Labiovellarization

Authors: Ali Benali Djouher Leila

Abstract:

Labiovelarization of velar consonants and labials is a very widespread phenomenon. It is attested in all the major northern Berber dialects. Only the Tuareg is totally unaware of it. But, even within the large Berber-speaking regions of the north, it is very unstable: it may be completely absent in certain dialects (such as the Bougie region in Kabylie), and its extension and frequency can vary appreciably between the dialects which know it. Some dialects of Great Kabylia or the Chleuh domain, for example, "labiovélarize" more than others from the same region. Thus, in Great Kabylia, the adjective "large" will be pronounced: amqqwran with the At Yiraten and amqqran with the At Yanni, a few kilometers away. One of the problems with them is deciding whether it is one or two phonemes. All the criteria used by linguists in this kind of case lead to the conclusion that they are unique phonemes (a phoneme and not a succession of two phonemes, / k + w /, for example). The phonetic and phonological criteria are moreover clearly confirmed by the morphological data since, in the system of verbal alternations, these complex segments are treated as single phonemes: agree, "to draw, to fetch water," akwer, "to fly," have exactly the same morphology as as "jealous," arem" taste," Ames, "dirty" or afeg, "steal" ... verbs with two radical consonants (type aCC). At the level of notation, both scientific and usual, it is, therefore, necessary to represent the labiovélarized by a single letter, possibly accompanied by a diacritic. In fact, actual practices are diverse. - The scientific representation of type does not seem adequate for current use because its realization is easy only on a microcomputer. The Berber Documentation File used a small ° (of n °) above the writing line: k °, g ° ... which has the advantage of being easy to achieve since it is part of general typographical conventions in Latin script and that it is present on a typewriter keyboard. Mouloud Mammeri, then the Berber Study Group of Vincennes (Tisuraf review), and a majority of Kabyle practitioners over the last twenty years have used the succession "consonant +" semi-vowel / w / "(CW) on the same line of writing; for all the reasons explained previously, this practice is not a good solution and should be abandoned, especially as it particularizes Kabyle in the Berber ensemble. In this study, we were interested in two velar consonants, / g / and / k /, labiovellarized: / gw / and the / kw / (we adopted the addition of the "w") for the representation for ease of writing in graphical mode. It is a question of trying to characterize these four consonants in order to see if they have different places of articulation and if they are distinct (if these velars are distinct from their labiovellarized counterpart). This characterization is done using locus equations.

Keywords: berber consonants;, labiovelarization, locus equations, acoustical caracterization, kabylian dialect, algerian language

Procedia PDF Downloads 45

2 The Effects of Culture and Language on Social Impression Formation from Voice Pleasantness: A Study with French and Iranian People

Authors: L. Bruckert, A. Mansourzadeh

Abstract:

The voice has a major influence on interpersonal communication in everyday life via the perception of pleasantness. The evolutionary perspective postulates that the mechanisms underlying the pleasantness judgments are universal adaptations that have evolved in the service of choosing a mate (through the process of sexual selection). From this point of view, the favorite voices would be those with more marked sexually dimorphic characteristics; for example, in men with lower voice pitch, pitch is the main criterion. On the other hand, one can postulate that the mechanisms involved are gradually established since childhood through exposure to the environment, and thus the prosodic elements could take precedence in everyday life communication as it conveys information about the speaker's attitude (willingness to communicate, interest toward the interlocutors). Our study focuses on voice pleasantness and its relationship with social impression formation, exploring both the spectral aspects (pitch, timbre) and the prosodic ones. In our study, we recorded the voices through two vocal corpus (five vowels and a reading text) of 25 French males speaking French and 25 Iranian males speaking Farsi. French listeners (40 male/40 female) listened to the French voices and made a judgment either on the voice's pleasantness or on the speaker (judgment about his intelligence, honesty, sociability). The regression analyses from our acoustic measures showed that the prosodic elements (for example, the intonation and the speech rate) are the most important criteria concerning pleasantness, whatever the corpus or the listener's gender. Moreover, the correlation analyses showed that the speakers with the voices judged as the most pleasant are considered the most intelligent, sociable, and honest. The voices in Farsi have been judged by 80 other French listeners (40 male/40 female), and we found the same effect of intonation concerning the judgment of pleasantness with the corpus «vowel» whereas with the corpus «text» the pitch is more important than the prosody. It may suggest that voice perception contains some elements invariant across culture/language, whereas others are influenced by the cultural/linguistic background of the listener. Shortly in the future, Iranian people will be asked to listen either to the French voices for half of them or to the Farsi voices for the other half and produce the same judgments as the French listeners. This experimental design could potentially make it possible to distinguish what is linked to culture and what is linked to language in the case of differences in voice perception.

Keywords: cross-cultural psychology, impression formation, pleasantness, voice perception

Procedia PDF Downloads 41

1 Italian Speech Vowels Landmark Detection through the Legacy Tool 'xkl' with Integration of Combined CNNs and RNNs

Authors: Kaleem Kashif, Tayyaba Anam, Yizhi Wu

Abstract:

This paper introduces a methodology for advancing Italian speech vowels landmark detection within the distinctive feature-based speech recognition domain. Leveraging the legacy tool 'xkl' by integrating combined convolutional neural networks (CNNs) and recurrent neural networks (RNNs), the study presents a comprehensive enhancement to the 'xkl' legacy software. This integration incorporates re-assigned spectrogram methodologies, enabling meticulous acoustic analysis. Simultaneously, our proposed model, integrating combined CNNs and RNNs, demonstrates unprecedented precision and robustness in landmark detection. The augmentation of re-assigned spectrogram fusion within the 'xkl' software signifies a meticulous advancement, particularly enhancing precision related to vowel formant estimation. This augmentation catalyzes unparalleled accuracy in landmark detection, resulting in a substantial performance leap compared to conventional methods. The proposed model emerges as a state-of-the-art solution in the distinctive feature-based speech recognition systems domain. In the realm of deep learning, a synergistic integration of combined CNNs and RNNs is introduced, endowed with specialized temporal embeddings, harnessing self-attention mechanisms, and positional embeddings. The proposed model allows it to excel in capturing intricate dependencies within Italian speech vowels, rendering it highly adaptable and sophisticated in the distinctive feature domain. Furthermore, our advanced temporal modeling approach employs Bayesian temporal encoding, refining the measurement of inter-landmark intervals. Comparative analysis against state-of-the-art models reveals a substantial improvement in accuracy, highlighting the robustness and efficacy of the proposed methodology. Upon rigorous testing on a database (LaMIT) speech recorded in a silent room by four Italian native speakers, the landmark detector demonstrates exceptional performance, achieving a 95% true detection rate and a 10% false detection rate. A majority of missed landmarks were observed in proximity to reduced vowels. These promising results underscore the robust identifiability of landmarks within the speech waveform, establishing the feasibility of employing a landmark detector as a front end in a speech recognition system. The synergistic integration of re-assigned spectrogram fusion, CNNs, RNNs, and Bayesian temporal encoding not only signifies a significant advancement in Italian speech vowels landmark detection but also positions the proposed model as a leader in the field. The model offers distinct advantages, including unparalleled accuracy, adaptability, and sophistication, marking a milestone in the intersection of deep learning and distinctive feature-based speech recognition. This work contributes to the broader scientific community by presenting a methodologically rigorous framework for enhancing landmark detection accuracy in Italian speech vowels. The integration of cutting-edge techniques establishes a foundation for future advancements in speech signal processing, emphasizing the potential of the proposed model in practical applications across various domains requiring robust speech recognition systems.

Keywords: landmark detection, acoustic analysis, convolutional neural network, recurrent neural network

Procedia PDF Downloads 14