Search results for: speech to text
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1983

Search results for: speech to text

1833 Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation

Authors: Mario Kubek, Herwig Unger

Abstract:

Centroid terms are single words that semantically and topically characterise text documents and so may serve as their very compact representation in automatic text processing. In the present paper, centroids are used to measure the relevance of text documents with respect to a given search query. Thus, a new graphbased paradigm for searching texts in large corpora is proposed and evaluated against keyword-based methods. The first, promising experimental results demonstrate the usefulness of the centroid-based search procedure. It is shown that especially the routing of search queries in interactive and decentralised search systems can be greatly improved by applying this approach. A detailed discussion on further fields of its application completes this contribution.

Keywords: search algorithm, centroid, query, keyword, co-occurrence, categorisation

Procedia PDF Downloads 282
1832 Comics Scanlation and Publishing Houses Translation

Authors: Sharifa Alshahrani

Abstract:

Comics is a multimodal text wherein meaning is created by taking in all modes of expression at once. It uses two different semiotic modes, the verbal and the visual modes, together to make meaning and these different semiotic modes can be socially and culturally shaped to give meaning. Therefore, comics translation cannot treat comics as a monomodal text by translating only the verbal mode inside or outside the speech balloons as the cultural differences are encoded in the visual mode as well. Due to the development of the internet and editing software, comics translation is not anymore confined to the publishing houses and official translation as scanlation, or the fan translation took the initiative in translating comics for being emotionally attracted to the culture and genre. Scanlation is carried out by volunteering fans who translate out of passion. However, quality is one of the debatable issues relating to scanlation and fan translation. This study will investigate how the dynamic multimodal relationship in comics is exploited and interpreted in the translation by exploring the translation strategies and procedures adopted by the publishing houses and scanlation in interpreting comics into Arabic using three analytical frameworks; cultural references model, multimodal relation model and translation strategies and procedures models.

Keywords: comics, multimodality, translation, scanlation

Procedia PDF Downloads 212
1831 Binarization and Recognition of Characters from Historical Degraded Documents

Authors: Bency Jacob, S.B. Waykar

Abstract:

Degradations in historical document images appear due to aging of the documents. It is very difficult to understand and retrieve text from badly degraded documents as there is variation between the document foreground and background. Thresholding of such document images either result in broken characters or detection of false texts. Numerous algorithms exist that can separate text and background efficiently in the textual regions of the document; but portions of background are mistaken as text in areas that hardly contain any text. This paper presents a way to overcome these problems by a robust binarization technique that recovers the text from a severely degraded document images and thereby increases the accuracy of optical character recognition systems. The proposed document recovery algorithm efficiently removes degradations from document images. Here we are using the ostus method ,local thresholding and global thresholding and after the binarization training and recognizing the characters in the degraded documents.

Keywords: binarization, denoising, global thresholding, local thresholding, thresholding

Procedia PDF Downloads 344
1830 Virtual Reality Based 3D Video Games and Speech-Lip Synchronization Superseding Algebraic Code Excited Linear Prediction

Authors: P. S. Jagadeesh Kumar, S. Meenakshi Sundaram, Wenli Hu, Yang Yung

Abstract:

In 3D video games, the dominance of production is unceasingly growing with a protruding level of affordability in terms of budget. Afterward, the automation of speech-lip synchronization technique is customarily onerous and has advanced a critical research subject in virtual reality based 3D video games. This paper presents one of these automatic tools, precisely riveted on the synchronization of the speech and the lip movement of the game characters. A robust and precise speech recognition segment that systematized with Algebraic Code Excited Linear Prediction method is developed which unconventionally delivers lip sync results. The Algebraic Code Excited Linear Prediction algorithm is constructed on that used in code-excited linear prediction, but Algebraic Code Excited Linear Prediction codebooks have an explicit algebraic structure levied upon them. This affords a quicker substitute to the software enactments of lip sync algorithms and thus advances the superiority of service factors abridged production cost.

Keywords: algebraic code excited linear prediction, speech-lip synchronization, video games, virtual reality

Procedia PDF Downloads 474
1829 Adaptation of Projection Profile Algorithm for Skewed Handwritten Text Line Detection

Authors: Kayode A. Olaniyi, Tola. M. Osifeko, Adeola A. Ogunleye

Abstract:

Text line segmentation is an important step in document image processing. It represents a labeling process that assigns the same label using distance metric probability to spatially aligned units. Text line detection techniques have successfully been implemented mainly in printed documents. However, processing of the handwritten texts especially unconstrained documents has remained a key problem. This is because the unconstrained hand-written text lines are often not uniformly skewed. The spaces between text lines may not be obvious, complicated by the nature of handwriting and, overlapping ascenders and/or descenders of some characters. Hence, text lines detection and segmentation represents a leading challenge in handwritten document image processing. Text line detection methods that rely on the traditional global projection profile of the text document cannot efficiently confront with the problem of variable skew angles between different text lines. Hence, the formulation of a horizontal line as a separator is often not efficient. This paper presents a technique to segment a handwritten document into distinct lines of text. The proposed algorithm starts, by partitioning the initial text image into columns, across its width into chunks of about 5% each. At each vertical strip of 5%, the histogram of horizontal runs is projected. We have worked with the assumption that text appearing in a single strip is almost parallel to each other. The algorithm developed provides a sliding window through the first vertical strip on the left side of the page. It runs through to identify the new minimum corresponding to a valley in the projection profile. Each valley would represent the starting point of the orientation line and the ending point is the minimum point on the projection profile of the next vertical strip. The derived text-lines traverse around any obstructing handwritten vertical strips of connected component by associating it to either the line above or below. A decision of associating such connected component is made by the probability obtained from a distance metric decision. The technique outperforms the global projection profile for text line segmentation and it is robust to handle skewed documents and those with lines running into each other.

Keywords: connected-component, projection-profile, segmentation, text-line

Procedia PDF Downloads 124
1828 A Cross-Dialect Statistical Analysis of Final Declarative Intonation in Tuvinian

Authors: D. Beziakina, E. Bulgakova

Abstract:

This study continues the research on Tuvinian intonation and presents a general cross-dialect analysis of intonation of Tuvinian declarative utterances, specifically the character of the tone movement in order to test the hypothesis about the prevalence of level tone in some Tuvinian dialects. The results of the analysis of basic pitch characteristics of Tuvinian speech (in general and in comparison with two other Turkic languages - Uzbek and Azerbaijani) are also given in this paper. The goal of our work was to obtain the ranges of pitch parameter values typical for Tuvinian speech. Such language-specific values can be used in speaker identification systems in order to get more accurate results of ethnic speech analysis. We also present the results of a cross-dialect analysis of declarative intonation in the poorly studied Tuvinian language.

Keywords: speech analysis, statistical analysis, speaker recognition, identification of person

Procedia PDF Downloads 470
1827 A Profile of the Patients at the Hearing and Speech Clinic at the University of Jordan: A Retrospective Study

Authors: Maisa Haj-Tas, Jehad Alaraifi

Abstract:

The significance of the study: This retrospective study examined the speech and language profiles of patients who received clinical services at the University of Jordan Hearing and Speech Clinic (UJ-HSC) from 2009 to 2014. The UJ-HSC clinic is located in the capital Amman and was established in the late 1990s. It is the first hearing and speech clinic in Jordan and one of first speech and hearing clinics in the Middle East. This clinic provides services to an annual average of 2000 patients who are diagnosed with different communication disorders. Examining the speech and language profiles of patients in this clinic could provide an insight about the most common disorders seen in patients who attend similar clinics in Jordan. It could also provide information about community awareness of the role of speech therapists in the management of speech and language disorders. Methodology: The researchers examined the clinical records of 1140 patients (797 males and 343 females) who received clinical services at the UJ-HSC between the years 2009 and 2014 for the purpose of data analysis for this study. The main variables examined in the study were disorder type and gender. Participants were divided into four age groups: children, adolescents, adults, and older adults. The examined disorders were classified as either speech disorders, language disorders, or dysphagia (i.e., swallowing problems). The disorders were further classified as childhood language impairments, articulation disorders, stuttering, cluttering, voice disorders, aphasia, and dysphagia. Results: The results indicated that the prevalence for language disorders was the highest (50.7%) followed by speech disorders (48.3%), and dysphagia (0.9%). The majority of patients who were seen at the JU-HSC were diagnosed with childhood language impairments (47.3%) followed consecutively by articulation disorders (21.1%), stuttering (16.3%), voice disorders (12.1%), aphasia (2.2%), dysphagia (0.9%), and cluttering (0.2%). As for gender, the majority of patients seen at the clinic were males in all disorders except for voice disorders and cluttering. Discussion: The results of the present study indicate that the majority of examined patients were diagnosed with childhood language impairments. Based on this result, the researchers suggest that there seems to be a high prevalence of childhood language impairments among children in Jordan compared to other types of speech and language disorders. The researchers also suggest that there is a need for further examination of the actual prevalence data on speech and language disorders in Jordan. The fact that many of the children seen at the UJ-HSC were brought to the clinic either as a result of parental concern or teacher referral indicates that there seems to an increased awareness among parents and teachers about the services speech pathologists can provide about assessment and treatment of childhood speech and language disorders. The small percentage of other disorders (i.e., stuttering, cluttering, dysphasia, aphasia, and voice disorders) seen at the UJ-HSC may indicate a little awareness by the local community about the role of speech pathologists in the assessment and treatment of these disorders.

Keywords: clinic, disorders, language, profile, speech

Procedia PDF Downloads 313
1826 Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition

Authors: Jong Han Joo, Jung Hoon Lee, Young Sun Kim, Jae Young Kang, Seung Ho Choi

Abstract:

In this study, we propose a novel technique for acoustic echo suppression (AES) during speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the estimated echo path transfer function (EPTF) at the current signal segment and to the EPTF estimated until the previous time interval. We propose a new approach that adaptively updates weight parameters in response to abrupt changes in the acoustic environment due to background noises or double-talk. Furthermore, we devised a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using log-spectral distance measure, as well as cross-correlation coefficients. The experimental results show that the developed techniques can be successfully applied in barge-in speech recognition systems.

Keywords: acoustic echo suppression, barge-in, speech recognition, echo path transfer function, initial delay estimator, voice activity detector

Procedia PDF Downloads 372
1825 Role of Speech Articulation in English Language Learning

Authors: Khadija Rafi, Neha Jamil, Laiba Khalid, Meerub Nawaz, Mahwish Farooq

Abstract:

Speech articulation is a complex process to produce intelligible sounds with the help of precise movements of various structures within the vocal tract. All these structures in the vocal tract are named as articulators, which comprise lips, teeth, tongue, and palate. These articulators work together to produce a range of distinct phonemes, which happen to be the basis of language. It starts with the airstream from the lungs passing through the trachea and into oral and nasal cavities. When the air passes through the mouth, the tongue and the muscles around it form such coordination it creates certain sounds. It can be seen when the tongue is placed in different positions- sometimes near the alveolar ridge, soft palate, roof of the mouth or the back of the teeth which end up creating unique qualities of each phoneme. We can articulate vowels with open vocal tracts, but the height and position of the tongue is different every time depending upon each vowel, while consonants can be pronounced when we create obstructions in the airflow. For instance, the alphabet ‘b’ is a plosive and can be produced only by briefly closing the lips. Articulation disorders can not only affect communication but can also be a hurdle in speech production. To improve articulation skills for such individuals, doctors often recommend speech therapy, which involves various kinds of exercises like jaw exercises and tongue twisters. However, this disorder is more common in children who are going through developmental articulation issues right after birth, but in adults, it can be caused by injury, neurological conditions, or other speech-related disorders. In short, speech articulation is an essential aspect of productive communication, which also includes coordination of the specific articulators to produce different intelligible sounds, which are a vital part of spoken language.

Keywords: linguistics, speech articulation, speech therapy, language learning

Procedia PDF Downloads 62
1824 Hate Speech in Selected Nigerian Newspapers

Authors: Laurel Chikwado Madumere, Kevin O. Ugorji

Abstract:

A speech is said to be full of hate when it appropriates disparaging and vituperative locutions and/or appellations, which are riddled with prejudices and misconceptions about an antagonizing party on the grounds of gender, race, political orientation, religious affiliations, tribe, etc. Due largely to the dichotomies and polarities that exist in Nigeria across political ideological spectrum, tribal affiliations, and gender contradistinctions, there are possibilities for the existence of socioeconomic, religious and political conditions that would induce, provoke and catalyze hate speeches in Nigeria’s mainstream media. Therefore the aim of this paper is to investigate, using select daily newspapers in Nigeria, the extent and complexity of those likely hate speeches that emanate from the pluralism in Nigeria and to set in to relief, the discrepancies and contrariety in the interpretation of those hate words. To achieve the above, the paper shall be qualitative in orientation as it shall be using the Speech Act Theory of J. L. Austin and J. R. Searle to interpret and evaluate the hate speeches in the select Nigerian daily newspapers. Also this paper shall help to elucidate the conditions that generate hate, and inform the government and NGOs how best to approach those conditions and put an end to the possible violence and extremism that emanate from extreme cases of hate.

Keywords: extremism, gender, hate speech, pluralism, prejudice, speech act theory

Procedia PDF Downloads 146
1823 Glossematics and Textual Structure

Authors: Abdelhadi Nadjer

Abstract:

The structure of the text to the systemic school -(glossématique-Helmslev). At the beginning of the note we have a cursory look around the concepts of general linguistics The science that studies scientific study of human language based on the description and preview the facts away from the trend of education than we gave a detailed overview the founder of systemic school and most important customers and more methods and curriculum theory and analysis they extend to all humanities, practical action each offset by a theoretical and the procedure can be analyzed through the elements that pose as another method we talked to its links with other language schools where they are based on the sharp criticism of the language before and deflected into consideration for the field of language and its erection has outside or language network and its participation in the actions (non-linguistic) and after that we started our Valglosamatik analytical structure of the text is ejected text terminal or all of the words to was put for expression. This text Negotiable divided into types in turn are divided into classes and class should not be carrying a contradiction and be inclusive. It is on the same materials as described relationships that combine language and seeks to describe their relations and identified.

Keywords: text, language schools, linguistics, human language

Procedia PDF Downloads 459
1822 We Wonder If They Mind: An Empirical Inquiry into the Narratological Function of Mind Wandering in Readers of Literary Texts

Authors: Tina Ternes, Florian Kleinau

Abstract:

The study investigates the content and triggers of mind wandering (MW) in readers of fictional texts. It asks whether readers’ MW is productive (text-related) or unproductive (text-unrelated). Methodologically, it bridges the gap between narratological and data-driven approaches by utilizing a sentence-by-sentence self-paced reading paradigm combined with thought probes in the reading of an excerpt of A. L. Kennedy’s “Baby Blue”. Results show that the contents of MW can be linked to text properties. We validated the role of self-reference in MW and found prediction errors to be triggers of MW. Results also indicate that the content of MW often travels along the lines of the text at hand and can thus be viewed as productive and integral to interpretation.

Keywords: narratology, mind wandering, reading fiction, meta cognition

Procedia PDF Downloads 82
1821 Incorporating Information Gain in Regular Expressions Based Classifiers

Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler

Abstract:

A regular expression consists of sequence characters which allow describing a text path. Usually, in clinical research, regular expressions are manually created by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm named REX was designed, and then, implemented as a simplified method to create regexes to classify Spanish text automatically. In order to classify ambiguous cases, such as, when multiple labels are assigned to a testing example, REX includes an information gain method Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statically better regarding accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.

Keywords: information gain, regular expressions, smith-waterman algorithm, text classification

Procedia PDF Downloads 320
1820 Development and Clinical Application of a Cochlear Implant Mapping Assistance System

Authors: Hong Mengdi, Li Jianan, Ji Fei, Chen Aiting, Wang Qian

Abstract:

Objective: To overcome the communication barriers that audiologists encounter during cochlear implant mapping, particularly the challenge of eliciting subjective feedback from recipients regarding electrical stimulation, and to enhance the capabilities of existing technologies, we teamed up with software engineers to design an interactive approach for patient-audiologist communication. This approach employs a tablet (PAD) as the interface for a communication and feedback system between patients and audiologists during the mapping process, known as the Cochlear Implant Mapping Assistance System. Methods: Capitalizing on the touchscreen functionality of the PAD, the recipients' subjective feedback during cochlear implant mapping is instantly transmitted to the audiologist's mapping computer. The system acts as a platform for auditory assessment instruments, facilitating immediate evaluation of recipients' post-mapping hearing and speech discrimination capabilities. Furthermore, the system is designed to augment the visual reinforcement audiometry (VRA) process. The system consists of six modules, including three testing projects: loudness testing, hearing threshold testing, and loudness balance testing; two assessment projects: warble tone testing and digit speech testing; and one VRA animation project. It also incorporates speech-to-text and text input display functions tailored to accommodate speech communication difficulties in hearing-impaired individuals, with pre-installed common exchange content between audiologists and recipients. Audiologists can input sentences by selecting options. The system supports switching between Chinese and English versions, suitable for audiologists and recipients who use English, facilitating international application of the system. Results: The Cochlear Implant Mapping Assistance System has been in use for over a year in the Auditory Implant Center of the Department of Otology and Neurotology, Medical Center of Otology and Head & Neck Surgery, Chinese PLA General Hospital, with more than 300 recipients using this mapping system. Currently, the system operates stably, with both audiologists and recipients providing positive feedback, indicating a significant improvement over previous methods. It is particularly well-received by pediatric recipients, significantly enhancing the work efficiency of audiologists and improving the feedback efficiency and accuracy of recipients. The system enhances the comprehensibility for cochlear implant recipients, improves wearing comfort and user experience, facilitates cochlear implant auditory mapping, and increases the collection of previously challenging-to-obtain data during the existing assisted mapping process, such as loudness testing data, electrical stimulation testing data, warble tone testing data, loudness balance testing data, digit speech testing data, and visual reinforcement audiometry testing data. Real-time data recording improves the accuracy of assisted mapping. The interface design is meticulously crafted to accommodate patients of varying ages and cognitive abilities, featuring an intuitive design that allows for effortless, guidance-free use by patients.

Keywords: audiologist, subjective feedback, mapping, cochlear implant

Procedia PDF Downloads 20
1819 Absence of Developmental Change in Epenthetic Vowel Duration in Japanese Speakers’ English

Authors: Takayuki Konishi, Kakeru Yazawa, Mariko Kondo

Abstract:

This study examines developmental change in the production of epenthetic vowels by Japanese learners of English in relation to acquisition of L2 English speech rhythm. Seventy-two Japanese learners of English in the J-AESOP corpus were divided into lower- and higher-level learners according to their proficiency score and the frequency of vowel epenthesis. Three learners were excluded because no vowel epenthesis was observed in their utterances. The analysis of their read English speech data showed no statistical difference between lower- and higher-level learners, implying the absence of any developmental change in durations of epenthetic vowels. This result, together with the findings of previous studies, will be discussed in relation to the transfer of L1 phonology and manifestation of L2 English rhythm.

Keywords: vowel epenthesis, Japanese learners of English, L2 speech corpus, speech rhythm

Procedia PDF Downloads 268
1818 Speech and Swallowing Function after Tonsillo-Lingual Sulcus Resection with PMMC Flap Reconstruction: A Case Study

Authors: K. Rhea Devaiah, B. S. Premalatha

Abstract:

Background: Tonsillar Lingual sulcus is the area between the tonsils and the base of the tongue. The surgical resection of the lesions in the head and neck results in changes in speech and swallowing functions. The severity of the speech and swallowing problem depends upon the site and extent of the lesion, types and extent of surgery and also the flexibility of the remaining structures. Need of the study: This paper focuses on the importance of speech and swallowing rehabilitation in an individual with the lesion in the Tonsillar Lingual Sulcus and post-operative functions. Aim: Evaluating the speech and swallow functions post-intensive speech and swallowing rehabilitation. The objectives are to evaluate the speech intelligibility and swallowing functions after intensive therapy and assess the quality of life. Method: The present study describes a report of an individual aged 47years male, with the diagnosis of basaloid squamous cell carcinoma, left tonsillar lingual sulcus (pT2n2M0) and underwent wide local excision with left radical neck dissection with PMMC flap reconstruction. Post-surgery the patient came with a complaint of reduced speech intelligibility, and difficulty in opening the mouth and swallowing. Detailed evaluation of the speech and swallowing functions were carried out such as OPME, articulation test, speech intelligibility, different phases of swallowing and trismus evaluation. Self-reported questionnaires such as SHI-E(Speech handicap Index- Indian English), DHI (Dysphagia handicap Index) and SESEQ -K (Self Evaluation of Swallowing Efficiency in Kannada) were also administered to know what the patient feels about his problem. Based on the evaluation, the patient was diagnosed with pharyngeal phase dysphagia associated with trismus and reduced speech intelligibility. Intensive speech and swallowing therapy was advised weekly twice for the duration of 1 hour. Results: Totally the patient attended 10 intensive speech and swallowing therapy sessions. Results indicated misarticulation of speech sounds such as lingua-palatal sounds. Mouth opening was restricted to one finger width with difficulty chewing, masticating, and swallowing the bolus. Intervention strategies included Oro motor exercise, Indirect swallowing therapy, usage of a trismus device to facilitate mouth opening, and change in the food consistency to help to swallow. A practice session was held with articulation drills to improve the production of speech sounds and also improve speech intelligibility. Significant changes in articulatory production and speech intelligibility and swallowing abilities were observed. The self-rated quality of life measures such as DHI, SHI and SESE Q-K revealed no speech handicap and near-normal swallowing ability indicating the improved QOL after the intensive speech and swallowing therapy. Conclusion: Speech and swallowing therapy post carcinoma in the tonsillar lingual sulcus is crucial as the tongue plays an important role in both speech and swallowing. The role of Speech-language and swallowing therapists in oral cancer should be highlighted in treating these patients and improving the overall quality of life. With intensive speech-language and swallowing therapy post-surgery for oral cancer, there can be a significant change in the speech outcome and swallowing functions depending on the site and extent of lesions which will thereby improve the individual’s QOL.

Keywords: oral cancer, speech and swallowing therapy, speech intelligibility, trismus, quality of life

Procedia PDF Downloads 112
1817 The Communicative Nature of Linguistic Interference in Learning and Teaching of Slavic Languages

Authors: Kseniia Fedorova

Abstract:

The article is devoted to interlinguistic homonymy and enantiosemy analysis. These phenomena belong to the process of linguistic interference, which leads to violation of the communicative utterances integrity and causes misunderstanding between foreign interlocutors - native speakers of different Slavic languages. More attention is paid to investigation of non-typical speech situations, which occurred spontaneously or created by somebody intentionally being based on described phenomenon mechanism. The classification of typical students' mistakes connected with the paradox of interference is being represented in the article. The survey contributes to speech act theory, contemporary linguodidactics, translation science and comparative lexicology of Slavonic languages.

Keywords: adherent enantiosemy, interference, interslavonic homonymy, speech act

Procedia PDF Downloads 244
1816 Speech Emotion Recognition with Bi-GRU and Self-Attention based Feature Representation

Authors: Bubai Maji, Monorama Swain

Abstract:

Speech is considered an essential and most natural medium for the interaction between machines and humans. However, extracting effective features for speech emotion recognition (SER) is remains challenging. The present studies show that the temporal information captured but high-level temporal-feature learning is yet to be investigated. In this paper, we present an efficient novel method using the Self-attention (SA) mechanism in a combination of Convolutional Neural Network (CNN) and Bi-directional Gated Recurrent Unit (Bi-GRU) network to learn high-level temporal-feature. In order to further enhance the representation of the high-level temporal-feature, we integrate a Bi-GRU output with learnable weights features by SA, and improve the performance. We evaluate our proposed method on our created SITB-OSED and IEMOCAP databases. We report that the experimental results of our proposed method achieve state-of-the-art performance on both databases.

Keywords: Bi-GRU, 1D-CNNs, self-attention, speech emotion recognition

Procedia PDF Downloads 113
1815 Multi-Class Text Classification Using Ensembles of Classifiers

Authors: Syed Basit Ali Shah Bukhari, Yan Qiang, Saad Abdul Rauf, Syed Saqlaina Bukhari

Abstract:

Text Classification is the methodology to classify any given text into the respective category from a given set of categories. It is highly important and vital to use proper set of pre-processing , feature selection and classification techniques to achieve this purpose. In this paper we have used different ensemble techniques along with variance in feature selection parameters to see the change in overall accuracy of the result and also on some other individual class based features which include precision value of each individual category of the text. After subjecting our data through pre-processing and feature selection techniques , different individual classifiers were tested first and after that classifiers were combined to form ensembles to increase their accuracy. Later we also studied the impact of decreasing the classification categories on over all accuracy of data. Text classification is highly used in sentiment analysis on social media sites such as twitter for realizing people’s opinions about any cause or it is also used to analyze customer’s reviews about certain products or services. Opinion mining is a vital task in data mining and text categorization is a back-bone to opinion mining.

Keywords: Natural Language Processing, Ensemble Classifier, Bagging Classifier, AdaBoost

Procedia PDF Downloads 232
1814 An Examination of the Effectiveness of iPad-Based Augmentative and Alternative Intervention on Acquisition, Generalization and Maintenance of the Requesting Information Skills of Children with Autism

Authors: Amaal Almigal

Abstract:

Technology has been argued to offer distinct advantages and benefits for teaching children with autism spectrum disorder (ASD) to communicate. One aspect of this technology is augmentative and alternative communication (AAC) systems such as picture exchange or speech generation devices. Whilst there has been significant progress in teaching these children to request their wants and needs with AAC, there remains a need for developing technologies that can really make a difference in teaching them to ask questions. iPad-based AAC can be effective for communication. However, the effectiveness of this type of AAC in teaching children to ask questions needs to be examined. Thus, in order to examine the effectiveness of iPad-based AAC in teaching children with ASD to ask questions, This research will test whether iPad leads to more learning than a traditional approach picture and text cards does. Two groups of children who use AAC will be taught to ask ‘What is it?’ questions. With the first group, low-tech AAC picture and text cards will be used, while an iPad-based AAC application called Proloquo2Go will be used with the second group. Interviews with teachers and parents will be conducted before and after the experiment. The children’s perspectives will also be considered. The initial outcomes of this research indicate that iPad can be an effective tool to help children with autism to ask questions.

Keywords: autism, communication, information, iPad, pictures, requesting

Procedia PDF Downloads 264
1813 Resource Creation Using Natural Language Processing Techniques for Malay Translated Qur'an

Authors: Nor Diana Ahmad, Eric Atwell, Brandon Bennett

Abstract:

Text processing techniques for English have been developed for several decades. But for the Malay language, text processing methods are still far behind. Moreover, there are limited resources, tools for computational linguistic analysis available for the Malay language. Therefore, this research presents the use of natural language processing (NLP) in processing Malay translated Qur’an text. As the result, a new language resource for Malay translated Qur’an was created. This resource will help other researchers to build the necessary processing tools for the Malay language. This research also develops a simple question-answer prototype to demonstrate the use of the Malay Qur’an resource for text processing. This prototype has been developed using Python. The prototype pre-processes the Malay Qur’an and an input query using a stemming algorithm and then searches for occurrences of the query word stem. The result produced shows improved matching likelihood between user query and its answer. A POS-tagging algorithm has also been produced. The stemming and tagging algorithms can be used as tools for research related to other Malay texts and can be used to support applications such as information retrieval, question answering systems, ontology-based search and other text analysis tasks.

Keywords: language resource, Malay translated Qur'an, natural language processing (NLP), text processing

Procedia PDF Downloads 318
1812 Method of Complex Estimation of Text Perusal and Indicators of Reading Quality in Different Types of Commercials

Authors: Victor N. Anisimov, Lyubov A. Boyko, Yazgul R. Almukhametova, Natalia V. Galkina, Alexander V. Latanov

Abstract:

Modern commercials presented on billboards, TV and on the Internet contain a lot of information about the product or service in text form. However, this information cannot always be perceived and understood by consumers. Typical sociological focus group studies often cannot reveal important features of the interpretation and understanding information that has been read in text messages. In addition, there is no reliable method to determine the degree of understanding of the information contained in a text. Only the fact of viewing a text does not mean that consumer has perceived and understood the meaning of this text. At the same time, the tools based on marketing analysis allow only to indirectly estimate the process of reading and understanding a text. Therefore, the aim of this work is to develop a valid method of recording objective indicators in real time for assessing the fact of reading and the degree of text comprehension. Psychophysiological parameters recorded during text reading can form the basis for this objective method. We studied the relationship between multimodal psychophysiological parameters and the process of text comprehension during reading using the method of correlation analysis. We used eye-tracking technology to record eye movements parameters to estimate visual attention, electroencephalography (EEG) to assess cognitive load and polygraphic indicators (skin-galvanic reaction, SGR) that reflect the emotional state of the respondent during text reading. We revealed reliable interrelations between perceiving the information and the dynamics of psychophysiological parameters during reading the text in commercials. Eye movement parameters reflected the difficulties arising in respondents during perceiving ambiguous parts of text. EEG dynamics in rate of alpha band were related with cumulative effect of cognitive load. SGR dynamics were related with emotional state of the respondent and with the meaning of text and type of commercial. EEG and polygraph parameters together also reflected the mental difficulties of respondents in understanding text and showed significant differences in cases of low and high text comprehension. We also revealed differences in psychophysiological parameters for different type of commercials (static vs. video, financial vs. cinema vs. pharmaceutics vs. mobile communication, etc.). Conclusions: Our methodology allows to perform multimodal evaluation of text perusal and the quality of text reading in commercials. In general, our results indicate the possibility of designing an integral model to estimate the comprehension of reading the commercial text in percent scale based on all noticed markers.

Keywords: reading, commercials, eye movements, EEG, polygraphic indicators

Procedia PDF Downloads 166
1811 Text Based Shuffling Algorithm on Graphics Processing Unit for Digital Watermarking

Authors: Zayar Phyo, Ei Chaw Htoon

Abstract:

In a New-LSB based Steganography method, the Fisher-Yates algorithm is used to permute an existing array randomly. However, that algorithm performance became slower and occurred memory overflow problem while processing the large dimension of images. Therefore, the Text-Based Shuffling algorithm aimed to select only necessary pixels as hiding characters at the specific position of an image according to the length of the input text. In this paper, the enhanced text-based shuffling algorithm is presented with the powered of GPU to improve more excellent performance. The proposed algorithm employs the OpenCL Aparapi framework, along with XORShift Kernel including the Pseudo-Random Number Generator (PRNG) Kernel. PRNG is applied to produce random numbers inside the kernel of OpenCL. The experiment of the proposed algorithm is carried out by practicing GPU that it can perform faster-processing speed and better efficiency without getting the disruption of unnecessary operating system tasks.

Keywords: LSB based steganography, Fisher-Yates algorithm, text-based shuffling algorithm, OpenCL, XORShiftKernel

Procedia PDF Downloads 150
1810 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 179
1809 Text Localization in Fixed-Layout Documents Using Convolutional Networks in a Coarse-to-Fine Manner

Authors: Beier Zhu, Rui Zhang, Qi Song

Abstract:

Text contained within fixed-layout documents can be of great semantic value and so requires a high localization accuracy, such as ID cards, invoices, cheques, and passports. Recently, algorithms based on deep convolutional networks achieve high performance on text detection tasks. However, for text localization in fixed-layout documents, such algorithms detect word bounding boxes individually, which ignores the layout information. This paper presents a novel architecture built on convolutional neural networks (CNNs). A global text localization network and a regional bounding-box regression network are introduced to tackle the problem in a coarse-to-fine manner. The text localization network simultaneously locates word bounding points, which takes the layout information into account. The bounding-box regression network inputs the features pooled from arbitrarily sized RoIs and refine the localizations. These two networks share their convolutional features and are trained jointly. A typical type of fixed-layout documents: ID cards, is selected to evaluate the effectiveness of the proposed system. These networks are trained on data cropped from nature scene images, and synthetic data produced by a synthetic text generation engine. Experiments show that our approach locates high accuracy word bounding boxes and achieves state-of-the-art performance.

Keywords: bounding box regression, convolutional networks, fixed-layout documents, text localization

Procedia PDF Downloads 194
1808 Investigating the Online Effect of Language on Gesture in Advanced Bilinguals of Two Structurally Different Languages in Comparison to L1 Native Speakers of L2 and Explores Whether Bilinguals Will Follow Target L2 Patterns in Speech and Co-speech

Authors: Armita Ghobadi, Samantha Emerson, Seyda Ozcaliskan

Abstract:

Being a bilingual involves mastery of both speech and gesture patterns in a second language (L2). We know from earlier work in first language (L1) production contexts that speech and co-speech gesture form a tightly integrated system: co-speech gesture mirrors the patterns observed in speech, suggesting an online effect of language on nonverbal representation of events in gesture during the act of speaking (i.e., “thinking for speaking”). Relatively less is known about the online effect of language on gesture in bilinguals speaking structurally different languages. The few existing studies—mostly with small sample sizes—suggests inconclusive findings: some show greater achievement of L2 patterns in gesture with more advanced L2 speech production, while others show preferences for L1 gesture patterns even in advanced bilinguals. In this study, we focus on advanced bilingual speakers of two structurally different languages (Spanish L1 with English L2) in comparison to L1 English speakers. We ask whether bilingual speakers will follow target L2 patterns not only in speech but also in gesture, or alternatively, follow L2 patterns in speech but resort to L1 patterns in gesture. We examined this question by studying speech and gestures produced by 23 advanced adult Spanish (L1)-English (L2) bilinguals (Mage=22; SD=7) and 23 monolingual English speakers (Mage=20; SD=2). Participants were shown 16 animated motion event scenes that included distinct manner and path components (e.g., "run over the bridge"). We recorded and transcribed all participant responses for speech and segmented it into sentence units that included at least one motion verb and its associated arguments. We also coded all gestures that accompanied each sentence unit. We focused on motion event descriptions as it shows strong crosslinguistic differences in the packaging of motion elements in speech and co-speech gesture in first language production contexts. English speakers synthesize manner and path into a single clause or gesture (he runs over the bridge; running fingers forward), while Spanish speakers express each component separately (manner-only: el corre=he is running; circle arms next to body conveying running; path-only: el cruza el puente=he crosses the bridge; trace finger forward conveying trajectory). We tallied all responses by group and packaging type, separately for speech and co-speech gesture. Our preliminary results (n=4/group) showed that productions in English L1 and Spanish L1 differed, with greater preference for conflated packaging in L1 English and separated packaging in L1 Spanish—a pattern that was also largely evident in co-speech gesture. Bilinguals’ production in L2 English, however, followed the patterns of the target language in speech—with greater preference for conflated packaging—but not in gesture. Bilinguals used separated and conflated strategies in gesture in roughly similar rates in their L2 English, showing an effect of both L1 and L2 on co-speech gesture. Our results suggest that online production of L2 language has more limited effects on L2 gestures and that mastery of native-like patterns in L2 gesture might take longer than native-like L2 speech patterns.

Keywords: bilingualism, cross-linguistic variation, gesture, second language acquisition, thinking for speaking hypothesis

Procedia PDF Downloads 76
1807 Cognitive Semantics Study of Conceptual and Metonymical Expressions in Johnson's Speeches about COVID-19

Authors: Hussain Hameed Mayuuf

Abstract:

The study is an attempt to investigate the conceptual metonymies is used in political discourse about COVID-19. Thus, this study tries to analyze and investigate how the conceptual metonymies in Johnson's speech about coronavirus are constructed. This study aims at: Identifying how are metonymies relevant to understand the messages in Boris Johnson speeches and to find out how can conceptual blending theory help people to understand the messages in the political speech about COVID-19. Lastly, it tries to Point out the kinds of integration networks are common in political speech. The study is based on the hypotheses that conceptual blending theory is a powerful tool for investigating the intended messages in Johnson's speech and there are different processes of blending networks and conceptual mapping that enable the listeners to identify the messages in political speech. This study presents a qualitative and quantitative analysis of four speeches about COVID-19; they are said by Boris Johnson. The selected data have been tackled from the cognitive-semantic perspective by adopting Conceptual Blending Theory as a model for the analysis. It concludes that CBT is applicable to the analysis of metonymies in political discourse. Its mechanisms enable listeners to analyze and understand these speeches. Also the listener can identify and understand the hidden messages in Biden and Johnson's discourse about COVID-19 by using different conceptual networks. Finally, it is concluded that the double scope networks are the most common types of blending of metonymies in the political speech.

Keywords: cognitive, semantics, conceptual, metonymical, Covid-19

Procedia PDF Downloads 129
1806 Bidirectional Dynamic Time Warping Algorithm for the Recognition of Isolated Words Impacted by Transient Noise Pulses

Authors: G. Tamulevičius, A. Serackis, T. Sledevič, D. Navakauskas

Abstract:

We consider the biggest challenge in speech recognition – noise reduction. Traditionally detected transient noise pulses are removed with the corrupted speech using pulse models. In this paper we propose to cope with the problem directly in Dynamic Time Warping domain. Bidirectional Dynamic Time Warping algorithm for the recognition of isolated words impacted by transient noise pulses is proposed. It uses simple transient noise pulse detector, employs bidirectional computation of dynamic time warping and directly manipulates with warping results. Experimental investigation with several alternative solutions confirms effectiveness of the proposed algorithm in the reduction of impact of noise on recognition process – 3.9% increase of the noisy speech recognition is achieved.

Keywords: transient noise pulses, noise reduction, dynamic time warping, speech recognition

Procedia PDF Downloads 559
1805 The Combination of the Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), JITTER and SHIMMER Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech

Authors: Brahim-Fares Zaidi, Malika Boudraa, Sid-Ahmed Selouani

Abstract:

Our work aims to improve our Automatic Recognition System for Dysarthria Speech (ARSDS) based on the Hidden Models of Markov (HMM) and the Hidden Markov Model Toolkit (HTK) to help people who are sick. With pronunciation problems, we applied two techniques of speech parameterization based on Mel Frequency Cepstral Coefficients (MFCC's) and Perceptual Linear Prediction (PLP's) and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate of a dysarthria speech. For our tests, we used the NEMOURS database that represents speakers with dysarthria and normal speakers.

Keywords: hidden Markov model toolkit (HTK), hidden models of Markov (HMM), Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP’s)

Procedia PDF Downloads 161
1804 Recognition of Cursive Arabic Handwritten Text Using Embedded Training Based on Hidden Markov Models (HMMs)

Authors: Rabi Mouhcine, Amrouch Mustapha, Mahani Zouhir, Mammass Driss

Abstract:

In this paper, we present a system for offline recognition cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical without explicit segmentation used embedded training to perform and enhance the character models. Extraction features preceded by baseline estimation are statistical and geometric to integrate both the peculiarities of the text and the pixel distribution characteristics in the word image. These features are modelled using hidden Markov models and trained by embedded training. The experiments on images of the benchmark IFN/ENIT database show that the proposed system improves recognition.

Keywords: recognition, handwriting, Arabic text, HMMs, embedded training

Procedia PDF Downloads 354