Search results for: speech recognition library
2784 Morpheme Based Parts of Speech Tagger for Kannada Language
Authors: M. C. Padma, R. J. Prathibha
Abstract:
Parts of speech tagging is the process of assigning appropriate parts of speech tags to the words in a given text. The crucial information needed for tagging a word comes from its internal structure rather than from its neighboring words. The internal structure of a word comprises its morphological features and grammatical information. This paper presents a morpheme-based parts of speech tagger for the Kannada language. The proposed work uses a hierarchical tag set for assigning tags. The system is tested on some Kannada words taken from the EMILLE corpus. Experimental results show that the performance of the proposed system is above 90%.
Keywords: hierarchical tag set, morphological analyzer, natural language processing, paradigms, parts of speech
Procedia PDF Downloads 296
2783 The Convolution Recurrent Network of Using Residual LSTM to Process the Output of the Downsampling for Monaural Speech Enhancement
Authors: Shibo Wei, Ting Jiang
Abstract:
Convolutional-recurrent neural networks (CRN) have recently achieved much success in the speech enhancement field. The common processing method is to use convolution layers to compress the feature space through repeated downsampling and then model the compressed features with an LSTM layer. Finally, the enhanced speech is obtained by deconvolution operations that integrate the global information of the speech sequence. However, the feature space compression process may cause a loss of information, so we propose to model the downsampling result of each step with a residual LSTM layer, then join it with the output of the deconvolution layer and feed both to the next deconvolution layer. In this way, we aim to better integrate the global information of the speech sequence. The experimental results show that the network model we introduce (RES-CRN) achieves better performance than LSTM without residual connections, and than simply stacking LSTM layers in the original CRN, in terms of scale-invariant signal-to-noise ratio (SI-SNR), speech quality (PESQ), and intelligibility (STOI).
Keywords: convolutional-recurrent neural networks, speech enhancement, residual LSTM, SI-SNR
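For readers unfamiliar with the SI-SNR metric named in this abstract, the following is a minimal sketch of how it is commonly computed. It is not taken from the paper: the function name `si_snr`, the `eps` stabilizer, and the mean-removal step are assumptions made for illustration.

```python
import math

def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between an estimated and a clean signal.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]

    # Project the estimate onto the target; the remainder counts as noise.
    scale = sum(e * t for e, t in zip(est, tgt)) / (sum(t * t for t in tgt) + eps)
    projection = [scale * t for t in tgt]
    noise = [e - p for e, p in zip(est, projection)]

    signal_energy = sum(p * p for p in projection)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
```

Because the estimate is projected onto the target before the ratio is taken, multiplying the estimate by a constant gain leaves the score essentially unchanged, which is the property the "scale-invariant" name refers to.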
Procedia PDF Downloads 200
2782 EduEasy: Smart Learning Assistant System
Authors: A. Karunasena, P. Bandara, J. A. T. P. Jayasuriya, P. D. Gallage, J. M. S. D. Jayasundara, L. A. P. Y. P. Nuwanjaya
Abstract:
The use of smart learning concepts as better teaching and learning methods has increased rapidly all over the world in recent years. Most educational institutes, such as universities, are experimenting with those concepts with their students. Smart learning concepts are especially useful for helping students learn better in large classes. In large classes, the lecture method is the most popular method of teaching. In the lecture method, the lecturer presents the content mostly using lecture slides, and the students make their own notes based on the content presented. However, some students may have difficulties with this method due to various issues, such as the speed of delivery. The purpose of this research is to assist students in large classes in following the content. The research proposes a solution with four components, namely note-taker, slide matcher, reference finder, and question presenter, which help students obtain a summarized version of the lecture notes, easily navigate to the content and find resources, and revise the content using questions.
Keywords: automatic summarization, extractive text summarization, speech recognition library, sentence extraction, automatic web search, automatic question generator, sentence scoring, the term weight
Procedia PDF Downloads 146
2781 The Library as a Metaphor: Perceptions, Evolution, and the Shifting Role in Society Through a Librarian's Lens
Authors: Nihar Kanta Patra, Akhtar Hussain
Abstract:
This comprehensive study, through the perspective of librarians, explores the library as a metaphor and its profound significance in representing knowledge and learning. It delves into how librarians perceive the library as a metaphor and the ways in which it symbolizes the acquisition, preservation, and dissemination of knowledge. The research investigates the most common metaphors used to describe libraries, as witnessed by librarians, and analyzes how these metaphors reflect the evolving role of libraries in society. Furthermore, the study examines how the library metaphor influences the perception of librarians regarding academic libraries as physical places and academic library websites as virtual spaces, exploring their potential for learning and exploration. It investigates the evolving nature of the library as a metaphor over time, as seen by librarians, considering the changing landscape of information and technology. The research explores the ways in which the library metaphor has expanded beyond its traditional representation, encompassing digital resources, online connectivity, and virtual realms, and provides insights into its potential evolution in the future. Drawing on the experiences of librarians in their interactions with library users, the study uncovers any specific cultural or generational differences in how people interpret or relate to the library as a metaphor. It sheds light on the diverse perspectives and interpretations of the metaphor based on cultural backgrounds, educational experiences, and technological familiarity. Lastly, the study investigates the evolving roles of libraries as observed by librarians and explores how these changing roles can influence the metaphors we use to represent them. It examines the dynamic nature of libraries as they adapt to societal needs, technological advancements, and new modes of information dissemination. 
By analyzing these various dimensions, this research provides a comprehensive understanding of the library as a metaphor through the lens of librarians, illuminating its significance, evolution, and its transformative impact on knowledge, learning, and the changing role of libraries in society.
Keywords: library, librarians, metaphor, perception
Procedia PDF Downloads 95
2780 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian
Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak
Abstract:
The Persian language is an inflectional subject-object-verb language. This fact makes Persian a more uncertain language. However, using techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction leads to more understandable and precise text. In most works on Persian, these techniques are addressed individually. Despite that, we believe that all of these tasks are necessary for text refinement in Persian. In this work, we propose the ViraPart framework, which uses embedded ParsBERT at its core for text clarification. First, we use the BERT variant for Persian, followed by a classifier layer for the classification procedures. Next, we combine the models' outputs to produce clear text. In the end, the proposed models for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction achieve averaged macro F1 scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.
Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers
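The macro F1 score reported in this abstract can be illustrated with a minimal sketch. The function `macro_f1` and the toy labels below are hypothetical and are not drawn from the ViraPart code or data; the sketch only shows the general definition, in which F1 is computed per class and then averaged with equal weight.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)

# Toy token labels: "O" = no ZWNJ after the token, "ZWNJ" = insert one.
gold = ["O", "O", "ZWNJ", "ZWNJ"]
predicted = ["O", "ZWNJ", "ZWNJ", "ZWNJ"]
score = macro_f1(gold, predicted)
```

Macro averaging gives rare classes the same weight as frequent ones, which matters for tasks such as ZWNJ recognition where most tokens carry the "no change" label.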
Procedia PDF Downloads 214
2779 Face Tracking and Recognition Using Deep Learning Approach
Authors: Degale Desta, Cheng Jian
Abstract:
The most important factor in identifying a person is their face. Even identical twins have their own distinct faces. As a result, identification and face recognition are needed to tell one person from another. A face recognition system is a verification tool used to establish a person's identity using biometrics. Nowadays, face recognition is a common technique used in a variety of applications, including home security systems, criminal identification, and phone unlock systems. This system is more secure because it only requires a facial image instead of other dependencies like a key or card. Face detection and face identification are the two phases that typically make up a human recognition system. This paper explains the idea behind designing and creating a face recognition system using deep learning with Azure ML and Python's OpenCV. Face recognition is a task that can be accomplished using deep learning, and given the accuracy of this method, it appears to be a suitable approach. To show how accurate the suggested face recognition system is, experimental results are given: the system reaches 98.46% accuracy using Fast-RCNN, along with the performance of the algorithms under different training conditions.
Keywords: deep learning, face recognition, identification, fast-RCNN
Procedia PDF Downloads 140
2778 Detection of Clipped Fragments in Speech Signals
Authors: Sergei Aleinik, Yuri Matveev
Abstract:
In this paper, a novel method for the detection of clipping in speech signals is described. It is shown that the new method has better performance than known clipping detection methods, is easy to implement, and is robust to changes in signal amplitude, size of data, etc. Statistical simulation results are presented.
Keywords: clipping, clipped signal, speech signal processing, digital signal processing
Procedia PDF Downloads 392
2777 The Agile Management and Its Relationship to Administrative Ambidexterity: An Applied Study in Alexandria Library
Authors: Samar Sheikhelsouk, Dina Abdel Qader, Nada Rizk
Abstract:
The plan of the organization may impede its progress and creativity, especially when it works in independent environments and fast-shifting markets, unless the leaders and minds of the organization use a set of practices, tools, and techniques encapsulated in so-called “agile methods” or “lightweight” methods. Thus, this research paper examines the agile management approach as a flexible and dynamic approach and its relationship to administrative ambidexterity at the Alexandria library. The sample of the study is the employees of the Alexandria library. The study is expected to provide both theoretical and practical implications. The current study will bridge the gap between agile management and administrative approaches in the literature. The study will help managers comprehend the role of agile management in establishing administrative ambidexterity in the organization.
Keywords: agile management, administrative innovation, Alexandria library, Egypt
Procedia PDF Downloads 85
2776 Detection of Phoneme [S] Mispronounciation for Sigmatism Diagnosis in Adults
Authors: Michal Krecichwost, Zauzanna Miodonska, Pawel Badura
Abstract:
The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular in the oral cavity of the patient. Speech processing can make the therapy more objective and simplify the verification of its progress. In the described study, a methodology for the classification of the incorrectly pronounced phoneme [s] is proposed. The recordings come from adults. They were registered with a speech recorder at a sampling rate of 44.1 kHz and a resolution of 16 bits. A database of pathological and normative speech has been collected for the study, including reference assessments provided by speech therapy experts. Ten adult subjects were asked to simulate a certain type of sigmatism under the supervision of a speech therapy expert. In the recordings, the analyzed phone [s] was surrounded by vowels, viz: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame that is part of the analyzed phoneme. Additionally, 3 fricative formants along with the corresponding amplitudes are determined for the entire segment. In order to aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed method of feature aggregation reduces the size of the feature vector used in the classification. A binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. The first group consists of pathological phones, while the other consists of the normative ones. The proposed feature vector yields classification sensitivity and specificity measures above the 90% level in the case of individual logo phones. The use of fricative formant-based information improves the MFCC-only classification results by an average of 5 percentage points. 
The study shows that the use of specific parameters for the selected phones improves the efficiency of pathology detection compared with traditional methods of speech signal parameterization.
Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing
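The aggregation step described in this abstract (mean of each MFCC coefficient, 75th percentile for frame-level features of other types) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `aggregate_segment`, the input layout, and the choice to append the segment-level formant values unchanged are all hypothetical, and the MFCC, RMS, and formant values are assumed to have been extracted elsewhere.

```python
from statistics import mean, quantiles

def aggregate_segment(mfcc_frames, rms_frames, formant_features):
    """Collapse frame-level features of one [s] segment into a fixed vector.

    mfcc_frames: list of frames, each a list of 13 MFCC values
    rms_frames: list of per-frame RMS values (at least two frames)
    formant_features: segment-level formant frequencies and amplitudes
    """
    n_coeffs = len(mfcc_frames[0])
    # Average each MFCC coefficient across all frames of the segment.
    mfcc_means = [mean(frame[i] for frame in mfcc_frames) for i in range(n_coeffs)]
    # 75th percentile of the frame-level RMS values (third quartile cut point).
    rms_p75 = quantiles(rms_frames, n=4, method="inclusive")[2]
    return mfcc_means + [rms_p75] + list(formant_features)

# Toy segment: two frames, 13 MFCCs each, plus 3 formants and 3 amplitudes.
vector = aggregate_segment(
    mfcc_frames=[[1.0] * 13, [3.0] * 13],
    rms_frames=[1.0, 2.0, 3.0, 4.0, 5.0],
    formant_features=[0.0] * 6,
)
```

Whatever the number of frames in a segment, the result has a fixed length (here 13 + 1 + 6 = 20), which is what lets a classifier such as a binary SVM consume segments of different durations.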
Procedia PDF Downloads 283
2775 Designing an Online Case-Based Library for Technology Integration in Teacher Education
Authors: Mustafa Tevfik Hebebci, Sirin Kucuk, Ismail Celik, A. Oguz Akturk, Ismail Sahin, Fetah Eren
Abstract:
The purpose of this paper is to introduce an interactive online case-study library website developed in a national project. The design goal of the website is to provide an interactive, enhanced, case-based online educational resource for educators, within the scope and purpose of a national project. The ADDIE instructional design model was used in the development of the website for the interactive case-based library. This library is developed on a web-based platform, which is important in terms of manageability, accessibility, and updateability of data. Users are able to sort the displayed case studies by their titles, dates, ratings, view counts, etc. A usability test and expert opinion are used to evaluate the website. This website is a tool to integrate technology into education. It is believed that this website will be beneficial for pre-service and in-service teachers in terms of their professional development.
Keywords: ADDIE, case-based library, design, technology integration
Procedia PDF Downloads 445
2774 Providing Open Access for Scholarly Information in Libya
Authors: Mohamed Abolgasem Arteimi, Ahlam Al-Tajori
Abstract:
This paper describes an ongoing project at the Libyan Academy. The project aims to build a digital library for electronic theses and dissertations (ETD). The researchers developed a system based on the Greenstone open-source software for building the ETD digital library. Metadata for theses and dissertations was developed. The paper addresses issues related to project design, development, and user satisfaction. The conclusions highlight some important lessons learned to date.
Keywords: digital library, electronic theses and dissertations, open access, ETD, metadata
Procedia PDF Downloads 316
2773 Developing an Intonation Labeled Dataset for Hindi
Authors: Esha Banerjee, Atul Kumar Ojha, Girish Nath Jha
Abstract:
This study aims to develop an intonation-labeled database for Hindi. Although no single standard for prosody labeling exists in Hindi, researchers in the past have employed perceptual and statistical methods in the literature to draw inferences about the behavior of prosody patterns in Hindi. Based on such existing research and largely agreed-upon intonational theories in Hindi, this study attempts to develop a manually annotated prosodic corpus of Hindi speech data, which can be used for training speech models for natural-sounding speech in the future. 100 sentences (500 words) each for declarative and interrogative types have been labeled using Praat.
Keywords: speech dataset, Hindi, intonation, labeled corpus
Procedia PDF Downloads 197
2772 The Philippines’ War on Drugs: a Pragmatic Analysis on Duterte's Commemorative Speeches
Authors: Ericson O. Alieto, Aprillete C. Devanadera
Abstract:
The main objective of the study is to determine the dominant speech acts in five commemorative speeches of President Duterte. This study employed Speech Act Theory and discourse analysis to determine how the speech act features connote the pragmatic meaning of Duterte’s speeches. Identifying the speech acts is significant in elucidating the underlying message or the pragmatic meaning of the speeches. Of the 713 sentences or utterances in the speeches, assertives are the dominant speech act, with 208 occurrences or 29% of the corpus. They are followed by expressives with 177 occurrences or 25%, and directives with 152 occurrences or 21%. Commissives account for 104 occurrences or 15%, and declaratives have the lowest share with only 72 occurrences or 10%. These sentences, when uttered by Duterte, carry a certain power of language to move or influence people. Thus, the present study shows the fundamental message perceived by the listeners. Moreover, the frequent use of assertives and expressives not only explains the pragmatic message of the speeches but also reflects the personality of President Duterte.
Keywords: commemorative speech, discourse analysis, duterte, pragmatics
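As a quick check of the shares reported in this abstract, the percentages can be recomputed from the occurrence counts it gives. The snippet below is only an arithmetic sketch; the variable names are illustrative, and the counts are those stated in the abstract.

```python
# Occurrence counts per speech act category, as given in the abstract.
counts = {
    "assertive": 208,
    "expressive": 177,
    "directive": 152,
    "commissive": 104,
    "declarative": 72,
}

total = sum(counts.values())

# Share of each category, rounded to the nearest whole percent.
shares = {act: round(100 * n / total) for act, n in counts.items()}
```

The five counts add up to the 713 utterances stated in the abstract, and on that basis the 152 directive utterances come to about 21% of the corpus.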
Procedia PDF Downloads 286
2771 Excitation Modeling for Hidden Markov Model-Based Speech Synthesis Based on Wavelet Analysis
Authors: M. Kiran Reddy, K. Sreenivasa Rao
Abstract:
The conventional Hidden Markov Model (HMM)-based speech synthesis system (HTS) uses only a pulse excitation model, which differs significantly from the natural excitation signal. Hence, buzziness can be perceived in the speech generated using HTS. This paper proposes an efficient excitation modeling method that can significantly reduce the buzziness and improve the quality of HMM-based speech synthesis. The proposed approach models the pitch-synchronous residual frames extracted from the residual excitation signal. Each pitch-synchronous residual frame is parameterized using 30 wavelet coefficients. These 30 wavelet coefficients are found to accurately capture the perceptually important information present in the residual waveform. In the synthesis phase, the residual frames are reconstructed from the generated wavelet coefficients and are pitch-synchronously overlap-added to generate the excitation signal. The proposed excitation modeling method is integrated into an HMM-based speech synthesis system. Evaluation results indicate that the speech synthesized by the proposed excitation model is significantly better than the speech generated using state-of-the-art excitation modeling methods.
Keywords: excitation modeling, hidden Markov models, pitch-synchronous frames, speech synthesis, wavelet coefficients
Procedia PDF Downloads 248
2770 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features
Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova
Abstract:
Emotion recognition is a challenging problem. It is still an open problem from the perspective of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. Support Vector Machine classifiers are built by using raw data from video recordings. The results obtained for emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison is made between the classifiers built from facial data only, from voice data only, and from the combination of both. The need for a better combination of the information from facial expression and voice data is argued.
Keywords: emotion recognition, facial recognition, signal processing, machine learning
Procedia PDF Downloads 315
2769 Text-to-Speech in Azerbaijani Language via Transfer Learning in a Low Resource Environment
Authors: Dzhavidan Zeinalov, Bugra Sen, Firangiz Aslanova
Abstract:
Most text-to-speech models cannot operate well in low-resource languages and require a great amount of high-quality training data to be considered good enough. Yet, with the improvements made in ASR systems, it is now much easier than ever to collect data for the design of custom text-to-speech models. This work outlines our use of an ASR model to collect data to build a viable text-to-speech system for one of the leading financial institutions of Azerbaijan. NVIDIA’s implementation of the Tacotron 2 model was utilized along with the HiFiGAN vocoder. As for the training, the model was first trained with high-quality audio data collected from the Internet, then fine-tuned on the bank’s single-speaker call center data. The results were then evaluated by 50 different listeners and received a mean opinion score of 4.17, showing that our method is indeed viable. With this, we have successfully designed the first text-to-speech model in Azerbaijani and publicly shared 12 hours of audiobook data for everyone to use.
Keywords: Azerbaijani language, HiFiGAN, Tacotron 2, text-to-speech, transfer learning, whisper
Procedia PDF Downloads 44
2768 Hate Speech Detection Using Machine Learning: A Survey
Authors: Edemealem Desalegn Kingawa, Kafte Tasew Timkete, Mekashaw Girmaw Abebe, Terefe Feyisa, Abiyot Bitew Mihretie, Senait Teklemarkos Haile
Abstract:
Currently, hate speech is a growing challenge for society, individuals, policymakers, and researchers, as social media platforms make it easy to anonymously create and grow online friends and followers and provide an online forum for debate about specific issues of community life, culture, politics, and others. Despite this, research on identifying and detecting hate speech has not yet achieved satisfactory performance, which is why future research on this issue is constantly called for. This paper provides a systematic review of the literature in this field, with a focus on approaches like word embedding techniques, machine learning, deep learning technologies, hate speech terminology, and other state-of-the-art technologies, along with their challenges. In this paper, we have made a systematic review of the last six years of literature from ResearchGate and Google Scholar. Furthermore, limitations, along with algorithm selection and use challenges, data collection and cleaning challenges, and future research directions, are discussed in detail.
Keywords: Amharic hate speech, deep learning approach, hate speech detection review, Afaan Oromo hate speech detection
Procedia PDF Downloads 177
2767 Students’ Willingness to Use Public Computing Facilities at a Library
Authors: Norbayah Mohd Suki, Norazah Mohd Suki
Abstract:
This study aims to examine the relationships between attitude, self-efficacy, and subjective norm and students’ behavioural intention to use public computing facilities at a library. Data was collected from 200 undergraduate students enrolled at a higher learning institution in the Federal Territory of Labuan, Malaysia, via a structured questionnaire comprising closed-ended questions. Data was analyzed using multiple regression analysis. The results show that students’ behavioural intention to use public computing facilities at the library is largely affected by the subjective norm factor, i.e., the influence of the support of family members, friends, and neighbours. The findings of this study provide a better understanding of factors likely to influence students’ behavioural intention to use public computing facilities at a library. It also offers valuable insights into factors which university librarians need to focus on to improve students’ behavioural intention to actively use public computing facilities at a library for quality information retrieval. Directions for future research are also presented.
Keywords: attitude, self-efficacy, subjective norm, behavioural intention
Procedia PDF Downloads 446
2766 A Contribution to Human Activities Recognition Using Expert System Techniques
Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui
Abstract:
This paper deals with human activity recognition from sensor data. It is an active research area, and the main objective is to obtain a high recognition rate. In this work, a recognition system based on expert systems is proposed; the recognition is performed using the objects, object states, and gestures and taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions and the activity). The system recognizes complex activities after decomposing them into simple, easy-to-recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.
Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system
Procedia PDF Downloads 118
2765 Switching to the Latin Alphabet in Kazakhstan: A Brief Overview of Character Recognition Methods
Authors: Ainagul Yermekova, Liudmila Goncharenko, Ali Baghirzade, Sergey Sybachin
Abstract:
In this article, we address the problem of Kazakhstan's transition to the Latin alphabet. The transition process started in 2017 and is scheduled to be completed in 2025. In connection with these events, the problem of recognizing the characters of the new alphabet is raised. Well-known character recognition programs such as ABBYY FineReader, FormReader, and MyScript Stylus did not recognize specific Kazakh letters that were used in Cyrillic. The authors assess well-known methods of character recognition that could be in demand as part of the country's transition to the Latin alphabet. Three methods of character recognition (template, structured, and feature-based) are considered through their algorithms of operation. At the end of the article, a general conclusion is made about the possibility of applying a certain method to a particular recognition process: for example, in the process of a population census, recognition of typographic text in Latin, or recognition of photos of car number plates, store signs, etc.
Keywords: text detection, template method, recognition algorithm, structured method, feature method
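To make the template method mentioned in this abstract concrete, here is a minimal sketch of template-based character recognition on tiny binary glyphs. It is a toy illustration, not the article's algorithm: the 3x3 glyph grids, the `TEMPLATES` table, and the pixel-agreement score are all assumptions chosen for brevity.

```python
# Hypothetical 3x3 binary templates ("1" = ink, "0" = background).
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}

def match_score(glyph, template):
    """Count the pixels on which the glyph and the template agree."""
    return sum(
        1
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
        if g == t
    )

def recognize(glyph, templates=TEMPLATES):
    """Return the label of the template that best matches the glyph."""
    return max(templates, key=lambda name: match_score(glyph, templates[name]))

# An "L" with one noisy extra pixel in the middle row.
noisy_l = ("100", "110", "111")
label = recognize(noisy_l)
```

The template method compares the whole image against stored exemplars, so it tolerates a little pixel noise but depends on the input having the same size and font as the templates; structured and feature-based methods relax that requirement by describing characters through strokes or computed descriptors instead.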
Procedia PDF Downloads 186
2764 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System
Authors: Tadesse Anberbir, Felix Bankole, Tomio Takara, Girma Mamo
Abstract:
In the development of a text-to-speech synthesizer, automatic derivation of the correct pronunciation from the grapheme form of a text is a central problem. Deriving phonological features that are not shown in the orthography is particularly challenging. In the Amharic language, geminates and epenthetic vowels are crucial for proper pronunciation, but neither is shown in the orthography. In this paper, we propose and integrate a morphological analyzer into an Amharic Text-to-Speech system, mainly to predict geminates and epenthetic vowel positions, and prepare a duration modeling method. The Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source-filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated by a sentence listening test, and we achieved an average Mean Opinion Score (MOS) of 3.4 (68%), which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowels, we are able to synthesize good-quality speech. Our system is also suitable for customization to other Ethiopian languages with limited resources.
Keywords: Amharic, gemination, speech synthesis, morphology, epenthesis
Procedia PDF Downloads 87
2763 Systemic Functional Grammar Analysis of Barack Obama's Second Term Inaugural Speech
Authors: Sadiq Aminu, Ahmed Lamido
Abstract:
This research studies Barack Obama’s second inaugural speech using Halliday’s Systemic Functional Grammar (SFG). SFG is a text grammar which describes how language is used, so that the meaning of the text can be better understood. The primary source of data in this research work is Barack Obama’s second inaugural speech, which was obtained from the internet. The analysis of the speech was based on the ideational and textual metafunctions of Systemic Functional Grammar. Specifically, the researcher analyses the Process Types and Participants (ideational) and the Theme/Rheme (textual). It was found that the material process (process of doing) was the most frequently used ‘Process type’, and ‘We’, which refers to the people of America, was the most frequently used ‘Theme’. Application of the SFG theory, therefore, gives a better meaning to Barack Obama’s speech.
Keywords: ideational, metafunction, rheme, textual, theme
Procedia PDF Downloads 159
2762 Patient-Friendly Hand Gesture Recognition Using AI
Authors: K. Prabhu, K. Dinesh, M. Ranjani, M. Suhitha
Abstract:
During the tough times of COVID, people who were hospitalized found it difficult to always convey what they wanted or needed to the attendant. Sometimes the attendants might also not be there. In that case, the patients can use simple hand gestures to control electrical appliances (in this project, a zero-watt bulb) and three other gestures for voice-note notification. In this AI-based hand recognition project, a NodeMCU is used for the control action of the relay; it is connected to Firebase for storing the value in the cloud and is interfaced with the Python code via a Raspberry Pi. For three hand gestures, a voice clip is added to notify the attendant. This is done with the help of Google’s text-to-speech and the built-in audio file option in the Raspberry Pi 4. All five gestures are detected when patients show them with their hands to the webcam, which is placed for gesture detection. The personal computer is used for displaying the gestures and for running the code in the Raspberry Pi Imager.
Keywords: nodeMCU, AI technology, gesture, patient
Procedia PDF Downloads 166
2761 Awareness, Use and Searching Behavior of 'Virtua' Online Public Access Catalog Users
Authors: Saira Soroya, Khalid Mahmood
Abstract:
Library catalogs open the door to the library collection. OPACs (Online Public Access Catalogs) are one of the services offered by automated libraries. The present study aims to explore users’ awareness, level of use, and searching behavior with respect to the OPAC, with the purpose of suggesting ways to improve the user-friendly features of the library OPAC. The population consisted of OPAC users of Lahore University of Management Sciences (LUMS). A convenience sampling technique was used. The total sample size was 100 OPAC users. A quantitative research design based on the survey method was used to carry out the study. The data collection instrument was adopted. Data was analyzed using SPSS. Results revealed that a considerable number of users (30%) were not aware of the OPAC; however, those who were aware were using basic features of the OPAC. It was found that lack of knowledge was the most frequent reason for not using all features of the OPAC. In this regard, it is strongly recommended that a compulsory information literacy programme be established.
Keywords: catalog, OPAC, library automation, usability study, university library
Procedia PDF Downloads 336
2760 Status of Communication and Swallowing Therapy in Patient with a Tracheostomy
Authors: Ya-Hui Wang
Abstract:
A lower speech therapy rate among tracheostomized patients was noted in comparison with previous research. This study aims to shed light on the referral status of speech therapy for those patients in Taiwan. This study developed an analysis of the size and key characteristics of the population of tracheostomized in-patients in Taiwan. Method: We analyzed National Healthcare Insurance data (The Collaboration Center of Health Information Application, CCHIA) from Jan 1 2010 to Dec 31 2010. Results: Above age 3, the number of tracheostomized in-patients is directly proportional to age. A high service load was observed in the North region in comparison with other regions. Only 4.87% of the tracheostomized in-patients were referred for speech therapy, 1.9% for swallow examination, and 2.5% for communication evaluation.
Keywords: refer, speech therapy, training, rehabilitation
Procedia PDF Downloads 440
2759 Human Activities Recognition Based on Expert System
Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui
Abstract:
Recognition of human activities from sensor data is an active research area, and the main objective is to obtain a high recognition rate. In this work, we propose a recognition system based on expert systems. The proposed system performs recognition based on the objects, object states, and gestures, taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions, and the activity). This work focuses on complex activities, which are decomposed into simple, easy-to-recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.
Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system
Procedia PDF Downloads 139
2758 Developed Text-Independent Speaker Verification System
Authors: Mohammed Arif, Abdessalam Kifouche
Abstract:
Speech is a very convenient way of communication between people and machines. It conveys information about the identity of the talker. Since speaker recognition technology is increasingly securing our everyday lives, the objective of this paper is to develop two automatic text-independent speaker verification (TI SV) systems using low-level spectral features and machine learning methods. (i) The first system is based on a support vector machine (SVM), which has been widely used in voice signal processing for speaker recognition, that is, verifying the identity of a speaker based on their voice characteristics, and (ii) the second is based on a Gaussian Mixture Model (GMM) and Universal Background Model (UBM) to combine different functions from different resources to implement the SVM-based system.
Keywords: speaker verification, text-independent, support vector machine, Gaussian mixture model, cepstral analysis
Procedia PDF Downloads 58
2757 Speech Perception by Monolingual and Bilingual Dravidian Speakers under Adverse Listening Conditions
Authors: S. B. Rathna Kumar, Sale Kranthi, Sandya K. Varudhini
Abstract:
The precise perception of spoken language is influenced by several variables, including the listeners’ native language, the distance between speaker and listener, reverberation, and background noise. When noise is present in an acoustic environment, it masks the speech signal, reducing the redundancy of the acoustic and linguistic cues of speech. There is strong evidence that bilinguals face more difficulty in speech perception in their second language than monolingual speakers do under adverse listening conditions such as the presence of background noise. This difficulty persists even for speakers who are highly proficient in their second language and is greater in those who learned the second language later in life. The present study aimed to assess the performance of monolingual (Telugu-speaking) and bilingual (Tamil as first language and Telugu as second language) speakers on a Telugu speech perception task in quiet and noisy environments. The results indicated that both groups performed similarly in both quiet and noisy environments. The findings of the present study are not in accordance with those of previous studies, which strongly report poorer speech perception in adverse listening conditions such as noise for bilingual speakers in their second language compared with monolinguals.
Keywords: monolingual, bilingual, second language, speech perception, quiet, noise
Procedia PDF Downloads 389
2756 An Event-Related Potential Investigation of Speech-in-Noise Recognition in Native and Nonnative Speakers of English
Authors: Zahra Fotovatnia, Jeffery A. Jones, Alexandra Gottardo
Abstract:
Speech communication often occurs in environments where noise conceals part of a message. Listeners should compensate for the lack of auditory information by picking up distinct acoustic cues and using semantic and sentential context to recreate the speaker’s intended message. This situation seems to be more challenging in a nonnative than in a native language. On the other hand, early bilinguals are expected to show an advantage over late bilingual and monolingual speakers of a language due to their better executive functioning components. In this study, English monolingual speakers were compared with early and late nonnative speakers of English to understand speech-in-noise (SIN) processing and the underlying neurobiological features of this phenomenon. Auditory mismatch negativities (MMNs) were recorded using a double-oddball paradigm in response to a minimal pair that differed in its middle vowel (beat/bit) at Wilfrid Laurier University in Ontario, Canada. The results did not show any significant structural and electroneural differences across groups. However, vocabulary knowledge correlated positively with performance on tests that measured SIN processing in participants who learned English after age 6. Moreover, their performance on the test correlated negatively with the integral area amplitudes in the left superior temporal gyrus (STG). In addition, the STG was engaged before the inferior frontal gyrus (IFG) in noise-free and low-noise test conditions in all groups. We infer that the pre-attentive processing of words engages the temporal lobes earlier than the fronto-central areas and that vocabulary knowledge helps the nonnative perception of degraded speech.
Keywords: degraded speech perception, event-related brain potentials, mismatch negativities, brain regions
Procedia PDF Downloads 107
2755 Dual-Channel Multi-Band Spectral Subtraction Algorithm Dedicated to a Bilateral Cochlear Implant
Authors: Fathi Kallel, Ahmed Ben Hamida, Christian Berger-Vachon
Abstract:
In this paper, a Speech Enhancement Algorithm based on the Multi-Band Spectral Subtraction (MBSS) principle is evaluated for Bilateral Cochlear Implant (BCI) users. Specifically, a dual-channel noise power spectral estimation algorithm using the Power Spectral Densities (PSD) and Cross Power Spectral Densities (CPSD) of the observed signals is studied. The enhanced speech signal is obtained using the Dual-Channel Multi-Band Spectral Subtraction (DC-MBSS) algorithm. For performance evaluation, an objective speech assessment test relying on the Perceptual Evaluation of Speech Quality (PESQ) score is performed to fix the optimal number of frequency bands needed in the DC-MBSS algorithm. To evaluate speech intelligibility, subjective listening tests are conducted with 3 deafened BCI patients. Experimental results obtained using the French Lafon database corrupted by additive babble noise at different Signal-to-Noise Ratios (SNR) showed that the DC-MBSS algorithm improves speech understanding for single and multiple interfering noise sources.
Keywords: speech enhancement, spectral subtraction, noise estimation, cochlear implant
Procedia PDF Downloads 549