Search results for: speech enhancement
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2053

Search results for: speech enhancement

1813 Underwater Image Enhancement and Reconstruction Using CNN and the MultiUNet Model

Authors: Snehal G. Teli, R. J. Shelke

Abstract:

CNN and MultiUNet models are the framework for the proposed method for enhancing and reconstructing underwater images. Multiscale merging of features and regeneration are both performed by the MultiUNet. CNN collects relevant features. Extensive tests on benchmark datasets show that the proposed strategy performs better than the latest methods. As a result of this work, underwater images can be represented and interpreted in a number of underwater applications with greater clarity. This strategy will advance underwater exploration and marine research by enhancing real-time underwater image processing systems, underwater robotic vision, and underwater surveillance.

Keywords: convolutional neural network, image enhancement, machine learning, multiunet, underwater images

Procedia PDF Downloads 42
1812 Speech and LanguageTherapists’ Advices for Multilingual Children with Developmental Language Disorders

Authors: Rudinë Fetahaj, Flaka Isufi, Kristina Hansson

Abstract:

While evidence shows that in most European countries’ multilingualism is rising, unfortunately, the focus of Speech and Language Therapy (SLT) is still monolingualism. Furthermore, there is sparse information on how the needs of multilingual children with language disorders such as Developmental Language Disorder (DLD) are being met and which factors affect the intervention approach of SLTs when treating DLD. This study aims to examine the relationship and correlation between the number of languages SLTs speak, years of experience, and length of education with the advice they give to parents of multilingual children with DLD regarding which language to be spoken. This is a cross-sectional study where a survey was completed online by 2608 SLTs across Europe and data has been used from a 2017 COST-action project. IBM-SPSS-28 was used where descriptive analysis, correlation and Kruskal-Wallis test were performed.SLTs mainly advise the parents of multilingual children with DLD to speak their native language at home. Besides years of experience, language status and the level of education showed to have no association with the type of advice SLTs give. Results showed a non-significant moderate positive correlation between SLTs years of experience and their advice regarding the native language, whereas language status and length of education showed no correlation with the advice SLTs give to parents.

Keywords: quantitative study, developmental language disorders, multilingualism, speech and language therapy, children, European context

Procedia PDF Downloads 55
1811 Refusal Speech Acts in French Learners of Mandarin Chinese

Authors: Jui-Hsueh Hu

Abstract:

This study investigated various models of refusal speech acts among three target groups: French learners of Mandarin Chinese (FM), Taiwanese native Mandarin speakers (TM), and native French speakers (NF). The refusal responses were analyzed in terms of their options, frequencies, and sequences and the contents of their semantic formulas. This study also examined differences in refusal strategies, as determined by social status and social distance, among the three groups. The difficulties of refusal speech acts encountered by FM were then generalized. The results indicated that Mandarin instructors of NF should focus on the different reasons for the pragmatic failure of French learners and should assist these learners in mastering refusal speech acts that rely on abundant cultural information. In this study, refusal policies were mainly classified according to the research of Beebe et al. (1990). Discourse completion questionnaires were collected from TM, FM, and NF, and their responses were compared to determine how refusal policies differed among the groups. This study not only emphasized the dissimilarities of refusal strategies between native Mandarin speakers and second-language Mandarin learners but also used NF as a control group. The results of this study demonstrated that regarding overall strategies, FM were biased toward NF in terms of strategy choice, order, and content, resulting in pragmatic transfer under the influence of social factors such as 'social status' and 'social distance,' strategy choices of FM were still closer to those of NF, and the phenomenon of pragmatic transfer of FM was revealed. Regarding the refusal difficulties among the three groups, the F-test in the analysis of variance revealed statistical significance was achieved for Role Playing Items 13 and 14 (P < 0.05). A difference was observed in the average number of refusal difficulties between the participants. However, after multiple comparisons, it was found that item 13 (unrecognized heterosexual junior colleague requesting contacts) was significantly more difficult for NF than for TM and FM; item 14 (contacts requested by an unrecognized classmate of the opposite sex) was significantly more difficult to refuse for NF than for TM. This study summarized the pragmatic language errors that most FM often perform, including the misuse or absence of modal words, hedging expressions, and empty words at the end of sentences, as the reasons for pragmatic failures. The common social pragmatic failures of FM include inaccurately applying the level of directness and formality.

Keywords: French Mandarin, interlanguage refusal, pragmatic transfer, speech acts

Procedia PDF Downloads 227
1810 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition

Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar

Abstract:

In the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN),logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features like Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and MelSpectogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results.

Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers

Procedia PDF Downloads 12
1809 The Mirage of Progress? a Longitudinal Study of Japanese Students’ L2 Oral Grammar

Authors: Robert Long, Hiroaki Watanabe

Abstract:

This longitudinal study examines the grammatical errors of Japanese university students’ dialogues with a native speaker over an academic year. The L2 interactions of 15 Japanese speakers were taken from the JUSFC2018 corpus (April/May 2018) and the JUSFC2019 corpus (January/February). The corpora were based on a self-introduction monologue and a three-question dialogue; however, this study examines the grammatical accuracy found in the dialogues. Research questions focused on a possible significant difference in grammatical accuracy from the first interview session in 2018 and the second one the following year, specifically regarding errors in clauses per 100 words, global errors and local errors, and with specific errors related to parts of speech. The investigation also focused on which forms showed the least improvement or had worsened? Descriptive statistics showed that error-free clauses/errors per 100 words decreased slightly while clauses with errors/100 words increased by one clause. Global errors showed a significant decline, while local errors increased from 97 to 158 errors. For errors related to parts of speech, a t-test confirmed there was a significant difference between the two speech corpora with more error frequency occurring in the 2019 corpus. This data highlights the difficulty in having students self-edit themselves.

Keywords: clause analysis, global vs. local errors, grammatical accuracy, L2 output, longitudinal study

Procedia PDF Downloads 100
1808 Thermal Analysis of Automobile Radiator Using Nanofluids

Authors: S. Sumanth, Babu Rao Ponangi, K. N. Seetharamu

Abstract:

As the technology is emerging day by day, there is a need for some better methodology which will enhance the performance of radiator. Nanofluid is the one area which has promised the enhancement of the radiator performance. Currently, nanofluid has got a well effective solution for enhancing the performance of the automobile radiators. Suspending the nano sized particle in the base fluid, which has got better thermal conductivity value when compared to a base fluid, is preferably considered for nanofluid. In the current work, at first mathematical formulation has been carried out, which will govern the performance of the radiator. Current work is justified by plotting the graph for different parameters. Current work justifies the enhancement of radiator performance using nanofluid.

Keywords: nanofluid, radiator performance, graphene, gamma aluminium oxide (γ-Al2O3), titanium dioxide (TiO2)

Procedia PDF Downloads 216
1807 Functional Outcome of Speech, Voice and Swallowing Following Excision of Glomus Jugulare Tumor

Authors: B. S. Premalatha, Kausalya Sahani

Abstract:

Background: Glomus jugulare tumors arise within the jugular foramen and are commonly seen in females particularly on the left side. Surgical excision of the tumor may cause lower cranial nerve deficits. Cranial nerve involvement produces hoarseness of voice, slurred speech, and dysphagia along with other physical symptoms, thereby affecting the quality of life of individuals. Though oncological clearance is mainly emphasized on while treating these individuals, little importance is given to their communication, voice and swallowing problems, which play a crucial part in daily functioning. Objective: To examine the functions of voice, speech and swallowing outcomes of the subjects, following excision of glomus jugulare tumor. Methods: Two female subjects aged 56 and 62 years had come with a complaint of change in voice, inability to swallow and reduced clarity of speech following surgery for left glomus jugulare tumor were participants of the study. Their surgical information revealed multiple cranial nerve palsies involving the left facial, left superior and recurrent branches of the vagus nerve, left pharyngeal, left soft palate, left hypoglossal and vestibular nerves. Functional outcomes of voice, speech and swallowing were evaluated by perceptual and objective assessment procedures. Assessment included the examination of oral structures and functions, dysarthria by Frenchey dysarthria assessment, cranial nerve functions and swallowing functions. MDVP and Dr. Speech software were used to evaluate acoustic parameters of voice and quality of voice respectively. Results: The study revealed that both the subjects, subsequent to excision of glomus jugulare tumor, showed a varied picture of affected oral structure and functions, articulation, voice and swallowing functions. The cranial nerve assessment showed impairment of the vagus, hypoglossal, facial and glossopharyngeal nerves. Voice examination indicated vocal cord paralysis associated with breathy quality of voice, weak voluntary cough, reduced pitch and loudness range, and poor respiratory support. Perturbation parameters as jitter, shimmer were affected along with s/z ratio indicative of voice fold pathology. Reduced MPD(Maximum Phonation Duration) of vowels indicated that disturbed coordination between respiratory and laryngeal systems. Hypernasality was found to be a prominent feature which reduced speech intelligibility. Imprecise articulation was seen in both the subjects as the hypoglossal nerve was affected following surgery. Injury to vagus, hypoglossal, gloss pharyngeal and facial nerves disturbed the function of swallowing. All the phases of swallow were affected. Aspiration was observed before and during the swallow, confirming the oropharyngeal dysphagia. All the subsystems were affected as per Frenchey Dysarthria Assessment signifying the diagnosis of flaccid dysarthria. Conclusion: There is an observable communication and swallowing difficulty seen following excision of glomus jugulare tumor. Even with complete resection, extensive rehabilitation may be necessary due to significant lower cranial nerve dysfunction. The finding of the present study stresses the need for involvement of as speech and swallowing therapist for pre-operative counseling and assessment of functional outcomes.

Keywords: functional outcome, glomus jugulare tumor excision, multiple cranial nerve impairment, speech and swallowing

Procedia PDF Downloads 228
1806 An Early Attempt of Artificial Intelligence-Assisted Language Oral Practice and Assessment

Authors: Paul Lam, Kevin Wong, Chi Him Chan

Abstract:

Constant practicing and accurate, immediate feedback are the keys to improving students’ speaking skills. However, traditional oral examination often fails to provide such opportunities to students. The traditional, face-to-face oral assessment is often time consuming – attending the oral needs of one student often leads to the negligence of others. Hence, teachers can only provide limited opportunities and feedback to students. Moreover, students’ incentive to practice is also reduced by their anxiety and shyness in speaking the new language. A mobile app was developed to use artificial intelligence (AI) to provide immediate feedback to students’ speaking performance as an attempt to solve the above-mentioned problems. Firstly, it was thought that online exercises would greatly increase the learning opportunities of students as they can now practice more without the needs of teachers’ presence. Secondly, the automatic feedback provided by the AI would enhance students’ motivation to practice as there is an instant evaluation of their performance. Lastly, students should feel less anxious and shy compared to directly practicing oral in front of teachers. Technically, the program made use of speech-to-text functions to generate feedback to students. To be specific, the software analyzes students’ oral input through certain speech-to-text AI engine and then cleans up the results further to the point that can be compared with the targeted text. The mobile app has invited English teachers for the pilot use and asked for their feedback. Preliminary trials indicated that the approach has limitations. Many of the users’ pronunciation were automatically corrected by the speech recognition function as wise guessing is already integrated into many of such systems. Nevertheless, teachers have confidence that the app can be further improved for accuracy. It has the potential to significantly improve oral drilling by giving students more chances to practice. Moreover, they believe that the success of this mobile app confirms the potential to extend the AI-assisted assessment to other language skills, such as writing, reading, and listening.

Keywords: artificial Intelligence, mobile learning, oral assessment, oral practice, speech-to-text function

Procedia PDF Downloads 80
1805 Light Emission Enhancement of Silicon Nanocrystals by Gold Layer

Authors: R. Karmouch

Abstract:

A thin gold metal layer was deposited on the top of silicon oxide films containing embedded Si nanocrystals (Si-nc). The sample was annealed in gas containing nitrogen, and subsequently characterized by photoluminescence. We obtained 3-fold enhancement of photon emission from the Si-nc embedded in silicon dioxide covered with a Gold layer as compared with an uncovered sample. We attribute this enhancement to the increase of the spontaneous emission rate caused by the coupling of the Si-nc emitters with the surface plasmons (SP). The evolution of PL emission with laser irradiated time was also collected from covered samples, and compared to that from uncovered samples. In an uncovered sample, the PL intensity decreases with time, approximately with two decay constants. Although the decrease of the initial PL intensity associated with the increase of sample temperature under CW pumping is still observed in samples covered with a gold layer, this film significantly contributes to reduce the permanent deterioration of the PL intensity. The resistance to degradation of light-emitting silicon nanocrystals can be increased by SP coupling to suppress the permanent deterioration. Controlling the permanent photodeterioration can allow to perform a reliable optical gain measurement.

Keywords: photodeterioration, silicon nanocrystals, ion implantation, photoluminescence, surface plasmons

Procedia PDF Downloads 394
1804 Effect of Tilt Angle of Herringbone Microstructures on Enhancement of Heat and Mass Transfer

Authors: Nathan Estrada, Fangjun Shu, Yanxing Wang

Abstract:

The heat and mass transfer characteristics of a simple shear flow over a surface covered with staggered herringbone structures are numerically investigated using the lattice Boltzmann method. The focus is on the effect of ridge angle of the structures on the enhancement of heat and mass transfer. In the simulation, the temperature and mass concentration are modeled as a passive scalar released from the moving top wall and absorbed at the structured bottom wall. Reynolds number is fixed at 100. Two Prandtl or Schmidt numbers, 1 and 10, are considered. The results show that the advective scalar transport plays a more important role at larger Schmidt numbers. The fluid travels downward with higher scalar concentration into the grooves at the backward grove tips and travel upward with lower scalar concentration at the forward grove tips. Different tile angles result in different flow advection in wall-normal direction and thus different heat and mass transport efficiencies. The maximum enhancement is achieved at an angle between 15o and 30o. The mechanism of heat and mass transfer is analyzed in detail.

Keywords: fluid mechanics, heat and mass transfer, microfluidics, staggered herringbone mixer

Procedia PDF Downloads 83
1803 Auditory and Language Skills Development after Cochlear Implantation in Children with Multiple Disabilities

Authors: Tamer Mesallam, Medhat Yousef, Ayna Almasaad

Abstract:

BACKGROUND: Cochlear implantation (CI) in children with additional disabilities can be a fundamental and supportive intervention. Although, there may be some positive impacts of CI on children with multiple disabilities such as better outcomes of communication skills, development, and quality of life, the families of those children complain from the post-implant habilitation efforts that considered as a burden. OBJECTIVE: To investigate the outcomes of CI children with different co-disabilities through using the Meaningful Auditory Integration Scale (MAIS) and the Meaningful Use of Speech Scale (MUSS) as outcome measurement tools. METHODS: The study sample comprised 25 hearing-impaired children with co-disability who received cochlear implantation. Age and gender-matched control group of 25 cochlear-implanted children without any other disability has been also included. The participants' auditory skills and speech outcomes were assessed using MAIS and MUSS tests. RESULTS: There was a statistically significant difference in the different outcomes measure between the two groups. However, the outcomes of some multiple disabilities subgroups were comparable to the control group. Around 40% of the participants with co-disabilities experienced advancement in their methods of communication from behavior to oral mode. CONCLUSION: Cochlear-implanted children with multiple disabilities showed variable degrees of auditory and speech outcomes. The degree of benefits depends on the type of the co-disability. Long-term follow-up is recommended for those children.

Keywords: children with disabilities, Cochlear implants, hearing impairment, language development

Procedia PDF Downloads 89
1802 Detecting Hate Speech And Cyberbullying Using Natural Language Processing

Authors: Nádia Pereira, Paula Ferreira, Sofia Francisco, Sofia Oliveira, Sidclay Souza, Paula Paulino, Ana Margarida Veiga Simão

Abstract:

Social media has progressed into a platform for hate speech among its users, and thus, there is an increasing need to develop automatic detection classifiers of offense and conflicts to help decrease the prevalence of such incidents. Online communication can be used to intentionally harm someone, which is why such classifiers could be essential in social networks. A possible application of these classifiers is the automatic detection of cyberbullying. Even though identifying the aggressive language used in online interactions could be important to build cyberbullying datasets, there are other criteria that must be considered. Being able to capture the language, which is indicative of the intent to harm others in a specific context of online interaction is fundamental. Offense and hate speech may be the foundation of online conflicts, which have become commonly used in social media and are an emergent research focus in machine learning and natural language processing. This study presents two Portuguese language offense-related datasets which serve as examples for future research and extend the study of the topic. The first is similar to other offense detection related datasets and is entitled Aggressiveness dataset. The second is a novelty because of the use of the history of the interaction between users and is entitled the Conflicts/Attacks dataset. Both datasets were developed in different phases. Firstly, we performed a content analysis of verbal aggression witnessed by adolescents in situations of cyberbullying. Secondly, we computed frequency analyses from the previous phase to gather lexical and linguistic cues used to identify potentially aggressive conflicts and attacks which were posted on Twitter. Thirdly, thorough annotation of real tweets was performed byindependent postgraduate educational psychologists with experience in cyberbullying research. Lastly, we benchmarked these datasets with other machine learning classifiers.

Keywords: aggression, classifiers, cyberbullying, datasets, hate speech, machine learning

Procedia PDF Downloads 195
1801 Haiti and Power Symbolic: An Analysis Understanding of the Impact of the Presidential Political Speeches

Authors: Marc Arthur Bien Aimé, Julio da Silveira Moreira

Abstract:

This study examines the political speech in Haiti over the course of the decade 2011-2021, focusing on the speeches of the presidents Michel J. Martelly and Jovenel Moïse and their impacts on their awareness collective. In using a qualitative approach, we have analyzed the speech of the president pronounced in response to the political instability of countries, as well as interviews with a group of 20 Haitians living in Port- Au-Prince. Our results put in evidence their complex relationship between politics, awareness collective, and the influence of the powers imperialists. We show that the situation in Haiti's disastrous social and political situation is driven by personal political interests and the absence of a state political project. Moreover, the speeches of the president’s analysis are meaningless, transforming concepts such as social progress and justice in simple words. This political rhetoric contributes to the domination symbolic of the population of Haitian. This study is also linked to the theme “Constitutions, processes democratic and critical of the state in Latin America,” emphasizing the importance of analysis of political speech to understand the complexities of the democratic process and criticism of the State in their Latin American region. We suggest future research to deepen our understanding of these political dynamics and their impact on public policies and developments of the constitutions throughout Latin America.

Keywords: political discourse, conscience collective, inequality social, democratic processes, constitutions, Haiti

Procedia PDF Downloads 21
1800 Phonological Variation in the Speech of Grade 1 Teachers in Select Public Elementary Schools in the Philippines

Authors: M. Leonora D. Guerrero

Abstract:

The study attempted to uncover the most and least frequent phonological variation evident in the speech patterns of grade 1 teachers in select public elementary schools in the Philippines. It also determined the lectal description of the participants based on Tayao’s consonant charts for American and Philippine English. Descriptive method was utilized. A total of 24 grade 1 teachers participated in the study. The instrument used was word list. Each column in the word list is represented by words with the target consonant phonemes: labiodental fricatives f/ and /v/ and lingua-alveolar fricative /z/. These phonemes were in the initial, medial, and final positions, respectively. Findings of the study revealed that the most frequent variation happened when the participants read words with /z/ in the final position while the least frequent variation happened when the participants read words with /z/ in the initial position. The study likewise proved that the grade 1 teachers exhibited the segmental features of both the mesolect and basilect. Based on these results, it is suggested that teachers of English in the Philippines must aspire to manifest the features of the mesolect, if not, the acrolect since it is expected of the academicians not to be displaying the phonological features of the acrolects since this variety is only used by the 'uneducated.' This is especially so with grade 1 teachers who are often mimicked by their students who classify their speech as the 'standard.'

Keywords: consonant phonemes, lectal description, Philippine English, phonological variation

Procedia PDF Downloads 181
1799 Speech Emotion Recognition: A DNN and LSTM Comparison in Single and Multiple Feature Application

Authors: Thiago Spilborghs Bueno Meyer, Plinio Thomaz Aquino Junior

Abstract:

Through speech, which privileges the functional and interactive nature of the text, it is possible to ascertain the spatiotemporal circumstances, the conditions of production and reception of the discourse, the explicit purposes such as informing, explaining, convincing, etc. These conditions allow bringing the interaction between humans closer to the human-robot interaction, making it natural and sensitive to information. However, it is not enough to understand what is said; it is necessary to recognize emotions for the desired interaction. The validity of the use of neural networks for feature selection and emotion recognition was verified. For this purpose, it is proposed the use of neural networks and comparison of models, such as recurrent neural networks and deep neural networks, in order to carry out the classification of emotions through speech signals to verify the quality of recognition. It is expected to enable the implementation of robots in a domestic environment, such as the HERA robot from the RoboFEI@Home team, which focuses on autonomous service robots for the domestic environment. Tests were performed using only the Mel-Frequency Cepstral Coefficients, as well as tests with several characteristics of Delta-MFCC, spectral contrast, and the Mel spectrogram. To carry out the training, validation and testing of the neural networks, the eNTERFACE’05 database was used, which has 42 speakers from 14 different nationalities speaking the English language. The data from the chosen database are videos that, for use in neural networks, were converted into audios. It was found as a result, a classification of 51,969% of correct answers when using the deep neural network, when the use of the recurrent neural network was verified, with the classification with accuracy equal to 44.09%. The results are more accurate when only the Mel-Frequency Cepstral Coefficients are used for the classification, using the classifier with the deep neural network, and in only one case, it is possible to observe a greater accuracy by the recurrent neural network, which occurs in the use of various features and setting 73 for batch size and 100 training epochs.

Keywords: emotion recognition, speech, deep learning, human-robot interaction, neural networks

Procedia PDF Downloads 130
1798 Pragmatic Competence of Jordanian EFL Learners

Authors: Dina Mahmoud Hammouri

Abstract:

The study investigates the Jordanian EFL learners’ pragmatic competence through their production of the speech acts of responding to requests, making suggestions, making threats and expressing farewells. The sample of the study consists of 130 Jordanian EFL learners and native speakers. 2600 responses were collected through a Discourse Completion Test (DCT). The findings of the study revealed that the tested students showed similarities and differences in performing the strategies of four speech acts. Differences in the students’ performances led to pragmatic failure instances. The pragmatic failure committed by students refers to a lack of linguistic competence (i.e., pragmalinguistic failure), sociocultural differences and pragmatic transfer (i.e., sociopragmatic failure). EFL learners employed many mechanisms to maintain their communicative competence; the analysis of the test on speech acts showed learners’ tendency towards using particular strategies, resorting to modify strategies and relating them to their grammatical competence, prefabrication, performing long forms, buffing and transfer. The results were also suggestive of the learners’ lack of pragmalinguistic and sociopragmatic knowledge. The implications of this study are for language teachers to teach interlanguage pragmatics explicitly in EFL contexts to draw learners’ attention to both pragmalinguistic and sociopragmatic features, pay more attention to these areas and allocate more time and practice to solve learners’ problems in these areas. The implication of this study is also for pedagogical material designers to provide sufficient and well-organized pragmatic input.

Keywords: pragmatic failure, Jordanian EFL learner, sociopragmatic competence, pragmalinguistic competence

Procedia PDF Downloads 40
1797 Efficiency Improvement for Conventional Rectangular Horn Antenna by Using EBG Technique

Authors: S. Kampeephat, P. Krachodnok, R. Wongsan

Abstract:

The conventional rectangular horn has been used for microwave antenna a long time. Its gain can be increased by enlarging the construction of horn to flare exponentially. This paper presents a study of the shaped woodpile Electromagnetic Band Gap (EBG) to improve its gain for conventional horn without construction enlargement. The gain enhancement synthesis method for shaped woodpile EBG that has to transfer the electromagnetic fields from aperture of a horn antenna through woodpile EBG is presented by using the variety of shaped woodpile EBGs such as planar, triangular, quadratic, circular, gaussian, cosine, and squared cosine structures. The proposed technique has the advantages of low profile, low cost for fabrication and light weight. The antenna characteristics such as reflection coefficient (S11), radiation patterns and gain are simulated by utilized A Computer Simulation Technology (CST) software. With the proposed concept, an antenna prototype was fabricated and experimented. The S11 and radiation patterns obtained from measurements show a good impedance matching and a gain enhancement of the proposed antenna. The gain at dominant frequency of 10 GHz is 25.6 dB, application for X- and Ku-Band Radar, that higher than the gain of the basic rectangular horn antenna around 8 dB with adding only one appropriated EBG structures.

Keywords: conventional rectangular horn antenna, electromagnetic band gap, gain enhancement, X- and Ku-band radar

Procedia PDF Downloads 245
1796 Problems in English into Thai Translation Normally Found in Thai University Students

Authors: Anochao Phetcharat

Abstract:

This research aims to study problems of translation basic knowledge, particularly from English into Thai. The researcher used 38 2nd-year non-English speaking students of Suratthani Rajabhat University as samples. The samples were required to translate an A4-sized article from English into Thai assigned as a part of BEN0202 Translation for Business, a requirement subject for Business English Department, which was also taught by the researcher. After completion of the translation, numerous problems were found and the research grouped them into 4 major types. The normally occurred problems in English-Thai translation works are the lack of knowledge in terms of parts of speech, word-by-word translation employment, misspellings as well as the poor knowledge in English language structure. However, this research is currently under the process of data analysis and shall be completed by the beginning of August. The researcher, nevertheless, predicts that all the above-mentioned problems, will support the researcher’s hypothesizes, that are; 1) the lack of knowledge in terms of parts of speech causes the mistranslation problem; 2) employing word-by-word translation technique hugely results in the mistranslation problem; 3) misspellings yields the mistranslation problem; and 4) the poor knowledge in English language structure also brings about translation errors. The research also predicts that, of all the aforementioned problems, the following ones are found the most, respectively: the poor knowledge in English language structure, word-by-word translation employment, the lack of knowledge in terms of parts of speech, and misspellings.

Keywords: problem, student, Thai, translation

Procedia PDF Downloads 404
1795 Harmony Search-Based K-Coverage Enhancement in Wireless Sensor Networks

Authors: Shaimaa M. Mohamed, Haitham S. Hamza, Imane A. Saroit

Abstract:

Many wireless sensor network applications require K-coverage of the monitored area. In this paper, we propose a scalable harmony search based algorithm in terms of execution time, K-Coverage Enhancement Algorithm (KCEA), it attempts to enhance initial coverage, and achieve the required K-coverage degree for a specific application efficiently. Simulation results show that the proposed algorithm achieves coverage improvement of 5.34% compared to K-Coverage Rate Deployment (K-CRD), which achieves 1.31% when deploying one additional sensor. Moreover, the proposed algorithm is more time efficient.

Keywords: Wireless Sensor Networks (WSN), harmony search algorithms, K-Coverage, Mobile WSN

Procedia PDF Downloads 494
1794 USE-Net: SE-Block Enhanced U-Net Architecture for Robust Speaker Identification

Authors: Kilari Nikhil, Ankur Tibrewal, Srinivas Kruthiventi S. S.

Abstract:

Conventional speaker identification systems often fall short of capturing the diverse variations present in speech data due to fixed-scale architectures. In this research, we propose a CNN-based architecture, USENet, designed to overcome these limitations. Leveraging two key techniques, our approach achieves superior performance on the VoxCeleb 1 Dataset without any pre-training. Firstly, we adopt a U-net-inspired design to extract features at multiple scales, empowering our model to capture speech characteristics effectively. Secondly, we introduce the squeeze and excitation block to enhance spatial feature learning. The proposed architecture showcases significant advancements in speaker identification, outperforming existing methods, and holds promise for future research in this domain.

Keywords: multi-scale feature extraction, squeeze and excitation, VoxCeleb1 speaker identification, mel-spectrograms, USENet

Procedia PDF Downloads 41
1793 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification

Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro

Abstract:

Voice recognition algorithms such as automatic speech recognition and text-to-speech systems with African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify the user responses as inputs for an interactive voice response system. A dataset with Wolof language words ‘yes’ and ‘no’ is collected as audio recordings. A two stage Data Augmentation approach is adopted for enhancing the dataset size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. For performing voice response classification, the recordings are transformed into sound frequency feature spectra and then applied image classification methodology using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications associated with both web and mobile platforms.

Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification

Procedia PDF Downloads 85
1792 Extracting Actions with Improved Part of Speech Tagging for Social Networking Texts

Authors: Yassine Jamoussi, Ameni Youssfi, Henda Ben Ghezala

Abstract:

With the growing interest in social networking, the interaction of social actors evolved to a source of knowledge in which it becomes possible to perform context aware-reasoning. The information extraction from social networking especially Twitter and Facebook is one of the problems in this area. To extract text from social networking, we need several lexical features and large scale word clustering. We attempt to expand existing tokenizer and to develop our own tagger in order to support the incorrect words currently in existence in Facebook and Twitter. Our goal in this work is to benefit from the lexical features developed for Twitter and online conversational text in previous works, and to develop an extraction model for constructing a huge knowledge based on actions

Keywords: social networking, information extraction, part-of-speech tagging, natural language processing

Procedia PDF Downloads 275
1791 Formation of an Artificial Cultural and Language Environment When Teaching a Foreign Language in the Material of Original Films

Authors: Konysbek Aksaule

Abstract:

The purpose of this work is to explore new and effective ways of teaching English to students who are studying a foreign language since the timeliness of the problem disclosed in this article is due to the high level of English proficiency that potential specialists must have due to high competition in the context of global globalization. The article presents an analysis of the feasibility and effectiveness of using an authentic feature film in teaching English to students. The methodological basis of the study includes an assessment of the level of students' proficiency in a foreign language, the stage of evaluating the film, and the method of selecting the film for certain categories of students. The study also contains a list of practical tasks that can be applied in the process of viewing and perception of an original feature film in a foreign language, and which are aimed at developing language skills such as speaking and listening. The results of this study proved that teaching English to students through watching an original film is one of the most effective methods because it improves speech perception, speech reproduction ability, and also expands the vocabulary of students and makes their speech fluent. In addition, learning English through watching foreign films has a huge impact on the cultural views and knowledge of students about the country of the language being studied and the world in general. Thus, this study demonstrates the high potential of using authentic feature film in English lessons for pedagogical science and methods of teaching English in general.

Keywords: university, education, students, foreign language, feature film

Procedia PDF Downloads 121
1790 Preservice EFL Teachers in a Blended Professional Development Program: Learning to Teach Speech Acts

Authors: Mei-Hui Liu

Abstract:

This study examines the effectiveness of a blended professional development program on preservice EFL (English as a foreign language) teachers’ learning to teach speech acts with the advent of Information and Communication Technology, researchers and scholars underscore the significance of integrating online and face-to-face learning opportunities in the teacher education field. Yet, a paucity of evidence has been documented to investigate the extent to which such a blended professional learning model may impact real classroom practice and student learning outcome. This yearlong project involves various stakeholders, including 25 preservice teachers, 5 English professionals, and 45 secondary school students. Multiple data sources collected are surveys, interviews, reflection journals, online discussion messages, artifacts, and discourse completion tests. Relying on the theoretical lenses of Community of Inquiry, data analysis depicts the nature and process of preservice teachers’ professional development in this blended learning community, which triggers and fosters both face-to-face and synchronous/asynchronous online interactions among preservice teachers and English professionals (i.e., university faculty and in-service teachers). Also included is the student learning outcome after preservice teachers put what they learn from the support community into instructional practice. Pedagogical implications and research suggestions are further provided based on the research findings and limitations.

Keywords: blended professional development, preservice EFL teachers, speech act instruction, student learning outcome

Procedia PDF Downloads 195
1789 FLIME - Fast Low Light Image Enhancement for Real-Time Video

Authors: Vinay P., Srinivas K. S.

Abstract:

Low Light Image Enhancement is of utmost impor- tance in computer vision based tasks. Applications include vision systems for autonomous driving, night vision devices for defence systems, low light object detection tasks. Many of the existing deep learning methods are resource intensive during the inference step and take considerable time for processing. The algorithm should take considerably less than 41 milliseconds in order to process a real-time video feed with 24 frames per second and should be even less for a video with 30 or 60 frames per second. The paper presents a fast and efficient solution which has two main advantages, it has the potential to be used for a real-time video feed, and it can be used in low compute environments because of the lightweight nature. The proposed solution is a pipeline of three steps, the first one is the use of a simple function to map input RGB values to output RGB values, the second is to balance the colors and the final step is to adjust the contrast of the image. Hence a custom dataset is carefully prepared using images taken in low and bright lighting conditions. The preparation of the dataset, the proposed model, the processing time are discussed in detail and the quality of the enhanced images using different methods is shown.

Keywords: low light image enhancement, real-time video, computer vision, machine learning

Procedia PDF Downloads 165
1788 Analysis of Speaking Skills in Turkish Language Acquisition as a Foreign Language

Authors: Lokman Gozcu, Sule Deniz Gozcu

Abstract:

This study aims to analyze the skills of speaking in the acquisition of Turkish as a foreign language. One of the most important things for the individual who learns a foreign language is to be successful in the oral communication (speaking) skills and to interact in an understandable way. Speech skill requires much more time and effort than other language skills. In this direction, it is necessary to make an analysis of these oral communication skills, which is important in Turkish language acquisition as a foreign language and to draw out a road map according to the result. The aim of this study is to determine the competence and attitudes of speaking competence according to the individuals who learn Turkish as a foreign language and to be considered as speaking skill elements; Grammar, emphasis, intonation, body language, speed, ranking, accuracy, fluency, pronunciation, etc. and the results and suggestions based on these determinations. A mixed method has been chosen for data collection and analysis. A Likert scale (for competence and attitude) was applied to 190 individuals who were interviewed face-to-face (for speech skills) with a semi-structured interview form about 22 participants randomly selected. In addition, the observation form related to the 22 participants interviewed were completed by the researcher during the interview, and after the completion of the collection of all the voice recordings, analyses of voice recordings with the speech skills evaluation scale was made. The results of the research revealed that the speech skills of the individuals who learned Turkish as a foreign language have various perspectives. According to the results, the most inadequate aspects of the participants' ability to speak in Turkish include vocabulary, using humorous elements while speaking Turkish, being able to include items such as idioms and proverbs while speaking Turkish, Turkish fluency respectively. In addition, the participants were found not to feel comfortable while speaking Turkish, to feel ridiculous and to be nervous while speaking in formal settings. There are conclusions and suggestions for the situations that arise after the have been analyses made.

Keywords: learning Turkish as a foreign language, proficiency criteria, phonetic (modalities), speaking skills

Procedia PDF Downloads 215
1787 Investigating the Effect of Metaphor Awareness-Raising Approach on the Right-Hemisphere Involvement in Developing Japanese Learners’ Knowledge of Different Degrees of Politeness

Authors: Masahiro Takimoto

Abstract:

The present study explored how the metaphor awareness-raising approach affects the involvement of the right hemisphere in developing EFL learners’ knowledge regarding the different degrees of politeness embedded within different request expressions. The present study was motivated by theoretical considerations regarding the conceptual projection and the metaphorical idea of politeness is distance, as proposed; this study applied these considerations to develop Japanese learners’ knowledge regarding the different politeness degrees and to explore the connection between the metaphorical concept projection and right-hemisphere dominance. Japanese EFL learners do not know certain language strategies (e.g., English requests can be mitigated with biclausal downgraders, including the if-clause with past-tense modal verbs) and have difficulty adjusting the politeness degrees attached to request expressions according to situations. The present study used a pre/post-test design to reaffirm the efficacy of the cognitive technique and its connection to right-hemisphere involvement by mouth asymmetry technique. Mouth asymmetry measurement has been utilized because speech articulation, normally controlled mainly by one side of the brain, causes muscles on the opposite side of the mouth to move more during speech production. The present research did not administer the delayed post-test because it emphasized determining whether metaphor awareness-raising approaches for developing EFL learners’ pragmatic proficiency entailed right-hemisphere activation. Each test contained an acceptability judgment test (AJT) along with a speaking test in the post-test. The study results show that the metaphor awareness-raising group performed significantly better than the control group with regard to acceptability judgment and speaking tests post-test. These data revealed that the metaphor awareness-raising approach could promote L2 learning because it aided input enhancement and concept projection; through these aspects, the participants were able to comprehend an abstract concept: the degree of politeness in terms of the spatial concept of distance. Accordingly, the proximal-distal metaphor enabled the study participants to connect the newly spatio-visualized concept of distance to the different politeness degrees attached to different request expressions; furthermore, they could recall them with the left side of the mouth being wider than the right. This supported certain findings from previous studies that indicated the possible involvement of the brain's right hemisphere in metaphor processing.

Keywords: metaphor awareness-raising, right hemisphere, L2 politeness, mouth asymmetry

Procedia PDF Downloads 119
1786 Improving Second Language Speaking Skills via Video Exchange

Authors: Nami Takase

Abstract:

Computer-mediated-communication allows people to connect and interact with each other as if they were sharing the same space. The current study examined the effects of using video letters (VLs) on the development of second language speaking skills of Common European Framework of Reference for Languages (CEFR) A1 and CEFR B2 level learners of English as a foreign language. Two groups were formed to measure the impact of VLs. The experimental and control groups were given the same topic, and both groups worked with a native English-speaking university student from the United States of America. Students in the experimental group exchanged VLs, and students in the control group used video conferencing. Pre- and post-tests were conducted to examine the effects of each practice mode. The transcribed speech-text data showed that the VL group had improved speech accuracy scores, while the video conferencing group had increased sentence complexity scores. The use of VLs may be more effective for beginner-level learners because they are able to notice their own errors and replay videos to better understand the native speaker’s speech at their own pace. Both the VL and video conferencing groups provided positive feedback regarding their interactions with native speakers. The results showed how different types of computer-mediated communication impacts different areas of language learning and speaking practice and how each of these types of online communication tool is suited to different teaching objectives.

Keywords: computer-assisted-language-learning, computer-mediated-communication, english as a foreign language, speaking

Procedia PDF Downloads 71
1785 Optimized Brain Computer Interface System for Unspoken Speech Recognition: Role of Wernicke Area

Authors: Nassib Abdallah, Pierre Chauvet, Abd El Salam Hajjar, Bassam Daya

Abstract:

In this paper, we propose an optimized brain computer interface (BCI) system for unspoken speech recognition, based on the fact that the constructions of unspoken words rely strongly on the Wernicke area, situated in the temporal lobe. Our BCI system has four modules: (i) the EEG Acquisition module based on a non-invasive headset with 14 electrodes; (ii) the Preprocessing module to remove noise and artifacts, using the Common Average Reference method; (iii) the Features Extraction module, using Wavelet Packet Transform (WPT); (iv) the Classification module based on a one-hidden layer artificial neural network. The present study consists of comparing the recognition accuracy of 5 Arabic words, when using all the headset electrodes or only the 4 electrodes situated near the Wernicke area, as well as the selection effect of the subbands produced by the WPT module. After applying the articial neural network on the produced database, we obtain, on the test dataset, an accuracy of 83.4% with all the electrodes and all the subbands of 8 levels of the WPT decomposition. However, by using only the 4 electrodes near Wernicke Area and the 6 middle subbands of the WPT, we obtain a high reduction of the dataset size, equal to approximately 19% of the total dataset, with 67.5% of accuracy rate. This reduction appears particularly important to improve the design of a low cost and simple to use BCI, trained for several words.

Keywords: brain-computer interface, speech recognition, artificial neural network, electroencephalography, EEG, wernicke area

Procedia PDF Downloads 246
1784 The Code-Mixing of Japanese, English, and Thai in Line Chat

Authors: Premvadee Na Nakornpanom

Abstract:

Language mixing in spontaneous speech has been widely discussed, but not in virtual situations; especially in context of the third language learning students. Thus, this study was an attempt to explore the characteristics of the mixing of Japanese, English and Thai in a mobile chat room by students with their background of Japanese, English, and Thai. The result found that Insertion of Thai and English content words was a very common linguistic phenomenon embedded in the utterances. As chatting is to be ‘relational’ or ‘interactional’, it affected the style of lexical choices to be speech-like, more personal and emotional-related. A Japanese sentence-final question particle“か”(ka) was added to the end of the sentence based on Thai grammar rule. Moreover, some unique characteristics were created. The non-verbal cues were represented in personal, Thai styles by inserting textual representations of images or feelings available on the websites into streams of conversations.

Keywords: code-mixing, Japanese, English, Thai, line chat

Procedia PDF Downloads 616