Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1660

Search results for: text comprehension EIAH

1540 Text Localization in Fixed-Layout Documents Using Convolutional Networks in a Coarse-to-Fine Manner

Authors: Beier Zhu, Rui Zhang, Qi Song

Abstract:

Text in fixed-layout documents such as ID cards, invoices, cheques, and passports can be of great semantic value and so requires high localization accuracy. Recently, algorithms based on deep convolutional networks have achieved high performance on text detection tasks. However, for text localization in fixed-layout documents, such algorithms detect word bounding boxes individually, which ignores the layout information. This paper presents a novel architecture built on convolutional neural networks (CNNs). A global text localization network and a regional bounding-box regression network are introduced to tackle the problem in a coarse-to-fine manner. The text localization network locates word bounding points simultaneously, which takes the layout information into account. The bounding-box regression network takes as input the features pooled from arbitrarily sized regions of interest (RoIs) and refines the localizations. The two networks share their convolutional features and are trained jointly. ID cards, a typical type of fixed-layout document, are selected to evaluate the effectiveness of the proposed system. The networks are trained on data cropped from natural scene images and on synthetic data produced by a synthetic text generation engine. Experiments show that our approach locates word bounding boxes with high accuracy and achieves state-of-the-art performance.

Keywords: bounding box regression, convolutional networks, fixed-layout documents, text localization

1539 Recognition of Cursive Arabic Handwritten Text Using Embedded Training Based on Hidden Markov Models (HMMs)

Authors: Rabi Mouhcine, Amrouch Mustapha, Mahani Zouhir, Mammass Driss

Abstract:

In this paper, we present a system for offline recognition of cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical, requires no explicit segmentation, and uses embedded training to build and enhance the character models. Feature extraction, preceded by baseline estimation, yields statistical and geometric features that capture both the peculiarities of the text and the pixel distribution characteristics of the word image. These features are modelled using hidden Markov models and trained by embedded training. Experiments on images from the benchmark IFN/ENIT database show that the proposed system improves recognition.

Keywords: recognition, handwriting, Arabic text, HMMs, embedded training

1538 Poetics of the Connecting ha’: A Textual Study in the Poetry of Al-Husari Al-Qayrawani

Authors: Mahmoud al-Ashiriy

Abstract:

This paper begins from the idea that the real history of literature is the history of its style. Rhyme, as is well known, is not merely the final letter, which has received much analysis and investigation, but a collection of other values in addition to its different markings. This paper explores how the connecting ha’ works and how effective it is in shaping the poetic text: it establishes vocal rhythms and, through the pronoun, indicates references, both vertically across the poem through the sequence of its verses and horizontally through the sentences surrounding a single verse. If the science of prosody stopped at what is permitted and what is prohibited, literary criticism and poetry studies should explore what lies beyond the rule, namely the aesthetic horizon of poetic effectiveness, which varies from one text to another, one poet to another, one literary period to another, and one poetic taste to another. The paper then explores this poetic essence in the texts of the famous Andalusian poet Al-Husari Al-Qayrawani through his well-known Daliyya (a poem whose verses end with the letter D), examining the role of the connecting ha’ in completing the text and accomplishing its poetics. From there it turns to the diwan (the large collection of poems) as a higher text that surpasses the individual poem, and to how the phenomenon contributes to the poetics of Al-Husari Al-Qayrawani, who is one of the pillars of Arabic poetics in Andalusia.

Keywords: Al-Husari Al-Qayrawani, poetics, rhyme, stylistics, science of the text

1537 A Clustering Algorithm for Massive Texts

Authors: Ming Liu, Chong Wu, Bingquan Liu, Lei Chen

Abstract:

Internet users face a massive amount of textual data every day. Organizing texts into categories can help users extract useful information from large-scale text collections. Clustering is one of the most promising tools for categorizing texts because it is unsupervised. Unfortunately, most traditional clustering algorithms lose quality on large-scale text collections, mainly because of the high-dimensional vectors generated from texts. To cluster large-scale text collections effectively and efficiently, this paper proposes a clustering algorithm based on vector reconstruction. Only the features that can represent a cluster are preserved in the cluster's representative vector. The algorithm alternates between two sub-processes until it converges. The first is the partial tuning sub-process, in which each feature's weight is fine-tuned iteratively. To accelerate clustering, an intersection-based similarity measure and its corresponding neuron adjustment function are proposed and implemented in this sub-process. The second is the overall tuning sub-process, in which features are reallocated among clusters and features that are useless for representing a cluster are removed from its representative vector. Experimental results on three text collections (two small-scale and one large-scale) demonstrate that our algorithm obtains high quality on both small-scale and large-scale text collections.

Keywords: vector reconstruction, large-scale text clustering, partial tuning sub-process, overall tuning sub-process

1536 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification mostly applies natural language processing (NLP) and other AI-guided techniques to classify text faster and more accurately. This paper discusses the use of predictive maintenance to manage incident tickets within a company. It proposes a tool that processes and analyses the comments and notes written by administrators after resolving an incident ticket. The goal is to increase the quality of these comments. The tool relies on NLP and machine learning techniques to perform textual analytics on the extracted data. The approach was tested on real data from the French National Railways (SNCF) company and gave high-quality results.

Keywords: machine learning, text classification, NLP techniques, semantic representation

1535 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of entity relation discovery in structured data, a topic well covered in the literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary or a specific collection of named items. In many cases, machine learning and/or text mining techniques are used for this goal. These approaches might be infeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations, called a cooccurrence graph. Each cooccurrence indicates some degree of semantic correlation between the words, because related words are more likely to appear close to each other than on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the occurrences within a sliding window running over the whole text. In this paper, we generalise this technique into a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is counted with a weight that depends on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment to support this intuition by applying the technique to a data set consisting of the text of the Bible, split into verses.
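The weighted-distance idea in the abstract above can be illustrated with a minimal Python sketch. The abstract does not give the weighting function, so the 1/distance weight, the function name `weighted_cooccurrences`, and the `window` parameter are all assumptions made for illustration, not the authors' method. The sketch also simplifies by scoring each pair of entity occurrences once rather than once per window position.

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, entities, window=5):
    """Build a cooccurrence graph whose edge weights favour close pairs.

    tokens:   list of words in reading order
    entities: set of named items to track
    window:   two occurrences count only if fewer than `window` tokens apart
    The 1/distance weight is an assumed choice, not taken from the paper.
    """
    graph = defaultdict(float)
    # Keep only the positions where a tracked entity occurs.
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in entities]
    for a, (i, u) in enumerate(positions):
        for j, v in positions[a + 1:]:
            distance = j - i
            if distance >= window:
                break  # later occurrences are even farther away
            if u != v:
                # Undirected edge: store the pair in sorted order.
                graph[tuple(sorted((u, v)))] += 1.0 / distance
    return dict(graph)


verse = "cain rose up against abel his brother".split()
edges = weighted_cooccurrences(verse, {"cain", "abel"}, window=5)
print(edges)
```

With a plain (unweighted) sliding window, every pair inside the window would add 1 to the edge. Here a pair four tokens apart adds only 0.25, so adjacent mentions dominate the graph.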

Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance

1534 Symmetric Key Encryption Algorithm Using Indian Traditional Musical Scale for Information Security

Authors: Aishwarya Talapuru, Sri Silpa Padmanabhuni, B. Jyoshna

Abstract:

Cryptography helps prevent threats to information security by providing various algorithms. This study introduces a new symmetric key encryption algorithm for information security that is linked to "raagas", the traditional Indian scales and patterns of musical notes. The algorithm takes the plain text as input and starts its encryption process. It then randomly selects a raaga from a list of raagas that is assumed to be available to both the sender and the receiver. The plain text is associated with the selected raaga, and an intermediate cipher text is formed as the algorithm converts the plain text characters into other characters according to its rules. This intermediate cipher text is arranged in various patterns over three different rounds of encryption. The total number of rounds in the algorithm is a multiple of 3. More specifically, the output of the first sequence of three rounds is passed back as the input to the same sequence of rounds, recursively, until the total number of rounds of encryption has been performed. The raaga selected by the algorithm and the number of rounds performed are specified at an arbitrary location in the key, along with important information about the rounds of encryption. This information is embedded in the key, which is known by the sender and interpreted only by the receiver, thereby making the algorithm hack proof. The key can be constructed of any number of bits, with no restriction on its size. A software application was also developed to demonstrate this encryption process; it dynamically takes the plain text as input and readily generates the cipher text as output. Therefore, this algorithm stands as one of the strongest tools for information security.

Keywords: cipher text, cryptography, plaintext, raaga

1533 Instruction High-Leverage Practices in Reading Instruction for Adolescents

Authors: Nicole Pyle, Daniel Pyle, Christa Haring, Marty Hougen

Abstract:

Effective special education teachers utilize evidence-based practices for adolescent reading instruction and target the skills needed to improve the reading of older struggling readers. High-Leverage Practices (HLPs) are critical to helping students with disabilities learn important content. Therefore, special education teachers are encouraged to implement HLPs to maximize the learning of students with disabilities, including students with reading difficulties. Teachers’ implementation of HLPs in reading comprehension instruction should aim to develop adolescents’ understanding of grade-level narrative texts and informational texts, including content area texts. Instruction High-Leverage Practices (11-22) that ensure effective implementation of evidence-based practice in reading comprehension instruction for adolescents are presented. Effective reading comprehension activities within the 12 Instruction HLPs are illustrated.

Keywords: high-leverage practices, adolescent, instructional activities, students with disabilities

1532 The Effects of Watching Text-Relevant Video Segments with/without Subtitles on Vocabulary Development of Arabic as a Foreign Language Learners

Authors: Amirreza Karami, Hawraa Nafea Hameed Alzouwain, Freddie A. Bowles

Abstract:

This study investigates the effects of watching text-relevant video segments with/without subtitles on vocabulary development of Arabic as a Foreign Language (AFL) learners. The participants of the study were assigned to two groups: one control group and one experimental group. The control group received no video-based instruction while the experimental group watched a text-relevant video segment in three stages: pre, while, and post-instruction. The preliminary results of the pre-test and post-test show that watching text-relevant video segments through following a pre-while-post procedure can help the vocabulary development of AFL learners more than non-video-based instruction.

Keywords: text-relevant video segments, vocabulary development, Arabic as a Foreign Language, AFL, pre-while-post instruction

1531 An Investigation into Slow ESL Reading Speed in Pakistani Students

Authors: Hina Javed

Abstract:

This study investigated the different strategies used by Pakistani students learning English as a second language at the secondary school level. The basic premise of the study is that ESL students face tremendous difficulty while reading a text in English. The study also sets out to examine the different causes of their slow reading. These include word reading accuracy, mental translation, lexical density, cultural gaps, complex syntactic constructions, and back skipping. Sixty Grade 7 students, thirty boys and thirty girls, were selected from two mainstream secondary schools in Lahore. They were administered reading-related and reading speed pre- and post-tests. The purpose of the tests was to gauge their performance on different reading tasks, to see how they used strategies, if any, and to ascertain the causes hampering their performance on those tests. In the pretests, they were given simple texts with considerable lexical density and a moderately complex sentential layout. In the post-tests, the reading tasks contained comic strips, texts with visuals, texts with controlled vocabulary, and an evenly distributed range of simple, compound, and complex sentences. Both tests were timed. The data corroborated the researcher's basic hunch that students would perform significantly better on the post-tests than on the pretests. The findings suggest that the morphological structure of words and lexical density are the main sources of reading comprehension difficulties in poor ESL readers. They also confirm that accompanying texts with pictorial visuals greatly facilitates students' reading speed and comprehension. There is no substantial evidence that ESL readers adopt any specific strategy while reading in English.

Keywords: slow ESL reading speed, mental translation, complex syntactic constructions, back skipping

1530 A Study of Various Ontology Learning Systems from Text and a Look into Future

Authors: Fatima Al-Aswadi, Chan Yong

Abstract:

As the volume of unstructured data on the web grows day by day, so does the motivation to represent the knowledge in this data in machine-processable form. Ontology is one of the major cornerstones of representing information in a more meaningful way on the Semantic Web. The goal of ontology learning from text is to elicit and represent domain knowledge in machine-readable form. This paper gives a follow-up review of systems for ontology learning from text and some of their defects. Furthermore, it discusses how far the ontology learning process may improve in the future.

Keywords: concept discovery, deep learning, ontology learning, semantic relation, semantic web

1529 Principle Components Updates via Matrix Perturbations

Authors: Aiman Elragig, Hanan Dreiwi, Dung Ly, Idriss Elmabrook

Abstract:

This paper highlights a new approach to online principal components analysis (OPCA). Given a data matrix X ∈ R^(m×n), we characterise the online updates of its covariance as a matrix perturbation problem. Up to the principal components, it turns out that online updates of the batch PCA can be captured by a symmetric matrix perturbation of the batch covariance matrix. We show that as n → n0 >> 1, the batch covariance and its update become almost similar. Finally, we use our new setup of online updates to find a bound on the angle distance between the principal components of X and those of its update.
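One way to see why a covariance update is a symmetric perturbation is the short derivation below. It is a sketch under assumptions the abstract does not state: the n columns x_1, …, x_n of X are samples, the data are already centred (zero mean), and the update appends a single new sample x_{n+1}. The symbols C_n and E_n are introduced here for illustration only.

```latex
% Assumed setup: centred samples as columns, X = [x_1, \dots, x_n] \in \mathbb{R}^{m \times n}.
\begin{align*}
C_n     &= \frac{1}{n}\, X X^{\top}, \\
C_{n+1} &= \frac{1}{n+1}\left( X X^{\top} + x_{n+1} x_{n+1}^{\top} \right)
         = \frac{n}{n+1}\, C_n + \frac{1}{n+1}\, x_{n+1} x_{n+1}^{\top} \\
        &= C_n + E_n,
\qquad E_n = \frac{1}{n+1}\left( x_{n+1} x_{n+1}^{\top} - C_n \right).
\end{align*}
% E_n is symmetric because both x_{n+1} x_{n+1}^{\top} and C_n are symmetric, and
\[
\| E_n \|_2 \;\le\; \frac{1}{n+1}\left( \| x_{n+1} \|_2^{2} + \| C_n \|_2 \right).
\]
```

Under these assumptions, if the samples stay bounded the perturbation shrinks on the order of 1/n, which is consistent with the abstract's remark that the batch covariance and its update become almost similar for large n.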

Keywords: online data updates, covariance matrix, online principle component analysis, matrix perturbation

1528 The Role of Metacognitive Strategy Intervention through Dialogic Interaction on Listeners’ Level of Cognitive Load

Authors: Ali Babajanzade, Hossein Bozorgian

Abstract:

Cognitive load plays an important role in learning in general and in L2 listening comprehension in particular. This study investigates the effect of metacognitive strategy intervention through dialogic interaction (MSIDI) on L2 listeners’ cognitive load. A mixed-method design was used with 50 male and female Iranian lower-intermediate learners between 20 and 25 years of age. An experimental group (n=25) received weekly interventions based on metacognitive strategy intervention through dialogic interaction for ten sessions. The control group (n=25) worked with the same listening samples in each session, following the regular procedure without a metacognitive intervention program. The study used three instruments to investigate listeners’ level of cognitive load throughout the process: a) a modified version of the cognitive load questionnaire, b) digit span tests, and c) focus group interviews. Results showed not only improvements in listening comprehension in the MSIDI group but also a radical shift in the cognitive load rate within this group. In other words, listeners in the MSIDI group experienced a lower level of cognitive load than their peers in the control group.

Keywords: cognitive load theory, human mental functioning, metacognitive theory, listening comprehension, sociocultural theory

1527 The Impacts of an Adapted Literature Circle Model on Reading Comprehension, Engagement, and Cooperation in an EFL Reading Course

Authors: Tiantian Feng

Abstract:

There is a dearth of research on the literature circle as a teaching strategy in English as a Foreign Language (EFL) classes in Chinese colleges and universities, and there are even fewer empirical studies on its impacts. In this one-quarter, design-based project, the researcher aims to increase students’ engagement, cooperation, and, on top of that, reading comprehension performance by using a researcher-developed, adapted literature circle model in an EFL reading course at a Chinese college. The model also integrated team-based learning and portfolio assessment, with an emphasis on the specialization of individual responsibilities, contributions, and outcomes in reading projects. Its goal was to address current issues in EFL classes at Chinese colleges, such as passive learning, test orientation, ineffective and uncooperative teamwork, and lack of dynamics. In this quasi-experimental research, two groups of students enrolled in the course were invited to participate in four in-class team projects, with the intervention class following the adapted literature circle model and team members rotating as Leader, Coordinator, Brain trust, and Reporter. The researcher/instructor used a sequential explanatory mixed-methods approach to quantitatively analyze the final grades for the pre- and post-tests, as well as individual scores for team projects, and will code students' artifacts in the next step, with the results to be reported in a subsequent paper(s). Initial analysis showed that both groups saw an increase in final grades, but the intervention group enjoyed a more significant boost, suggesting that the adapted literature circle model is effective in improving students’ reading comprehension performance. This research not only helps close the empirical research gap on literature circles in college EFL classes in China but also adds to the pool of effective ways to optimize reading comprehension performance and class performance in college EFL classes.

Keywords: literature circle, EFL teaching, college english reading, reading comprehension

1526 A Similarity Measure for Classification and Clustering in Image Based Medical and Text Based Banking Applications

Authors: K. P. Sandesh, M. H. Suman

Abstract:

Text processing plays an important role in information retrieval, data mining, and web search. Measuring the similarity between documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature, the proposed measure takes the following three cases into account: (1) the feature appears in both documents; (2) the feature appears in only one document; and (3) the feature appears in neither document. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of our measure is evaluated on several real-world data sets for text classification and clustering problems, especially in the banking and health sectors. The results show that the performance obtained by the proposed measure is better than that achieved by the other measures.
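The three cases listed in the abstract above can be sketched in Python as follows. The abstract does not give the actual formula, so everything here is a hypothetical illustration rather than the proposed measure: the ratio score for shared features, the fixed `penalty` for features present in only one document, the averaging step, and the name `three_case_similarity` are all assumptions.

```python
def three_case_similarity(doc_a, doc_b, penalty=1.0):
    """Toy similarity between two documents given as {feature: weight} dicts.

    Case 1: feature in both documents -> reward in (0, 1], higher when the
            two weights are closer (assumed scoring rule).
    Case 2: feature in only one document -> subtract a fixed penalty.
    Case 3: feature in neither document -> contributes nothing.
    """
    score = 0.0
    counted = 0
    for feature in set(doc_a) | set(doc_b):
        a = doc_a.get(feature, 0.0)
        b = doc_b.get(feature, 0.0)
        if a > 0 and b > 0:        # case 1
            score += min(a, b) / max(a, b)
            counted += 1
        elif a > 0 or b > 0:       # case 2
            score -= penalty
            counted += 1
        # case 3: both weights are zero, so the feature is skipped
    return score / counted if counted else 0.0


bank_note = {"loan": 2.0, "rate": 1.0}
clinic_note = {"loan": 2.0, "scan": 3.0}
print(three_case_similarity(bank_note, bank_note))
print(three_case_similarity(bank_note, clinic_note))
```

Handling absence explicitly means two documents that share no features score below zero here, rather than merely scoring zero as they would under a plain dot product.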

Keywords: document classification, document clustering, entropy, accuracy, classifiers, clustering algorithms

1525 Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

Authors: Siti Azrina B. A. Aziz, Siti Hafizah A. Hamid

Abstract:

New approaches to analyzing and visualizing data streams in real time are important for helping decision makers reach prompt decisions. Financial market trading and surveillance, large-scale emergency response, and crowd control are examples of scenarios that require real-time analytics and data visualization. This need has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required to process streaming data. Today, a range of tools that implement some of these functionalities is available. In this paper, we present a chronological evaluation of the evolution of technologies that support real-time analytics and visualization of data streams. Based on research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges, and open issues. The techniques for streaming text visualization are identified in chronological order based on the Text Visualization Browser. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges of each of the identified tools.

Keywords: information visualization, visual analytics, text mining, visual text analytics tools, big data visualization

1524 Visual Aid and Imagery Ramification on Decision Making: An Exploratory Study Applicable in Emergency Situations

Authors: Priyanka Bharti

Abstract:

Decades ago, designs were based on common sense and tradition, but with advances in visualization technology and research, we can now understand the cognitive abilities involved in decoding visual information. However, many areas of visual research still need intensive study before they can explain these events effectively. Visuals are a mode of representing information through images, symbols, and graphics. They play an influential role in decision making by facilitating quick recognition, comprehension, and analysis of a situation. They enhance problem-solving capabilities by enabling the processing of more data without overloading the decision maker. According to research, visuals offer a learning environment improved by a factor of 400 compared to textual information. Visual information engages learners at a cognitive level and triggers the imagination, which enables the user to process the information faster (visuals are processed 60,000 times faster in the brain than text). Appropriate information, its visualization, and its presentation are known to aid and intensify the decision-making process for users. However, most of the literature discusses the role of visual aids in comprehension and decision making under normal conditions alone. In a normal situation (e.g., day-to-day life), unlike in an emergency, users are neither exposed to stringent time constraints nor faced with the anxiety of survival, and they have sufficient time to evaluate various alternatives before making any decision. An emergency is an unexpected, potentially fatal real-life situation that may inflict serious ramifications on both human life and material possessions unless corrective measures are taken instantly. The situation demands that the exposed user cope with a dynamic and unstable scenario with little or no preparation, yet still take swift and appropriate decisions to save lives or possessions.
The resulting stress and anxiety, however, restrict cue sampling, decrease vigilance, reduce the capacity of working memory, cause premature closure in evaluating alternative options, and result in task shedding. Limited time, uncertainty, high stakes, and vague goals impair the cognitive ability to take appropriate decisions. Moreover, natural decision making by experts is understood in far more depth than that of ordinary users. Therefore, in this study, the author aims to understand the role of visual aids in supporting rapid comprehension to take appropriate decisions during an emergency situation.

Keywords: cognition, visual, decision making, graphics, recognition

1523 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System

Authors: Tadesse Anberbir, Bankole Felix, Tomio Takara

Abstract:

In the development of a text-to-speech synthesizer, automatic derivation of correct pronunciation from the grapheme form of a text is a central problem. Deriving phonological features that are not shown in the orthography is particularly challenging. In the Amharic language, geminates and epenthetic vowels are crucial for proper pronunciation, but neither is shown in the orthography. In this paper, we proposed and integrated a morphological analyzer into an Amharic Text-to-Speech system, mainly to predict geminates and epenthetic vowel positions, and prepared a duration modeling method. The Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated by a sentence listening test, and we achieved an average Mean Opinion Score (MOS) of 3.4 (68%), which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowels, we are able to synthesize good-quality speech. Our system is mainly suitable to be customized for other Ethiopian languages with limited resources.

Keywords: Amharic, gemination, speech synthesis, morphology, epenthesis

1522 Assessment of the Validity of Sentiment Analysis as a Tool to Analyze the Emotional Content of Text

Authors: Trisha Malhotra

Abstract:

Sentiment analysis is a recent field of study that computationally assesses the emotional nature of a body of text. To assess its test-validity, sentiment analysis was carried out on the emotional corpus of text from a personal 15-day mood diary. Self-reported mood scores varied more or less accurately with daily mood evaluation score given by the software. On further assessment, it was found that while sentiment analysis was good at assessing ‘global’ mood, it was not able to ‘locally’ identify and differentially score synonyms of various emotional words. It is further critiqued for treating the intensity of an emotion as universal across cultures. Finally, the software is shown not to account for emotional complexity in sentences by treating emotions as strictly positive or negative. Hence, it is posited that a better output could be two (positive and negative) affect scores for the same body of text.
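The closing suggestion in the abstract above, reporting two affect scores instead of a single polarity, can be sketched in Python as follows. The tiny lexicons, their weights, and the function name `affect_scores` are hypothetical placeholders chosen for illustration; they do not come from the study or from any real sentiment tool.

```python
import re

# Hypothetical mini-lexicons; a real system would use a validated resource.
POSITIVE = {"happy": 1.0, "calm": 0.5, "hopeful": 0.8}
NEGATIVE = {"sad": 1.0, "anxious": 0.8, "tired": 0.4}


def affect_scores(text):
    """Return (positive_score, negative_score) for a body of text.

    The two totals are kept separate, so a diary entry expressing mixed
    feelings is not flattened into a single net value.
    """
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(POSITIVE.get(word, 0.0) for word in words)
    negative = sum(NEGATIVE.get(word, 0.0) for word in words)
    return positive, negative


entry = "I felt happy about the result but also anxious and sad."
print(affect_scores(entry))
```

A single net score for this entry would be negative overall and would hide the fact that a positive emotion was reported at all. The pair of scores preserves that.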

Keywords: analysis, data, diary, emotions, mood, sentiment

1521 Motion Effects of Arabic Typography on Screen-Based Media

Authors: Ibrahim Hassan

Abstract:

Motion typography is one of the most important types of display-based visual communication. Through digital display media, we can control the properties of text (size, direction, thickness, color, etc.). Its use in visual communication has given motion typography several forms. We need to settle the terminology and clarify the differences between these forms, because the term motion typography, which is a general one, is not enough to distinguish the different communicative functions of moving text. In this paper, we discuss the different effects of motion typography on Arabic writing and how harmony can be achieved between the movement and the letterform, and, through our experiments, we present a new type of text movement.

Keywords: Arabic typography, motion typography, kinetic typography, fluid typography, temporal typography

1520 Recognition of Grocery Products in Images Captured by Cellular Phones

Authors: Farshideh Einsele, Hassan Foroosh

Abstract:

In this paper, we present a robust algorithm to recognize text extracted from grocery product images captured by mobile phone cameras. Recognition of such text is challenging, since text in grocery product images varies in size, orientation, style, and illumination and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations cannot be appropriately defined using well-known geometric transformations such as translation, rotation, affine transformation, and shearing, we use all of the character's black pixels as our feature vector. Classification is performed with a minimum distance classifier using the maximum likelihood criterion, which delivers a very promising Character Recognition Rate (CRR) of 89%. We achieve a considerably higher Word Recognition Rate (WRR) of 99% when using lower-level linguistic knowledge about product words during the recognition process.
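A minimum distance classifier over black-pixel feature vectors, as mentioned in the abstract above, can be illustrated with the Python sketch below. It swaps in a plain Hamming distance for the maximum likelihood criterion the authors report, and the 3×3 templates and the names `hamming` and `classify` are hypothetical, so this shows the general technique rather than the paper's classifier.

```python
def hamming(a, b):
    """Count the pixel positions where two equal-sized binary glyphs differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(glyph, templates):
    """Return the label of the template closest to the glyph."""
    return min(templates, key=lambda label: hamming(glyph, templates[label]))


# Hypothetical 3x3 templates, flattened row by row (1 = black pixel).
TEMPLATES = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
}

# An "I" with one spurious black pixel in the bottom-right corner.
noisy_glyph = [0, 1, 0,
               0, 1, 0,
               0, 1, 1]
print(classify(noisy_glyph, TEMPLATES))
```

Because the whole pixel grid is the feature vector, the glyph must already be normalized for scale and rotation. That is the role of the pre-processing step described in the abstract.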

Keywords: camera-based OCR, feature extraction, document, image processing, grocery products

1519 Pragmatic Survey of Precedence as Linguistic 'Déjà Vu' in Political Text and Talk

Authors: Zarine Avetisyan

Abstract:

In both language and literature, there exists a theory of the recurrence of chunks of text and talk, which brings us to the notion of precedence. Precedence as a pragma-linguistic phenomenon is as yet little known, and the main objective of the present research is to revisit it and reveal it thoroughly. In line with this objective, the analysis of political text and talk provides abundant relevant data for illustrating the phenomenon of precedence. The analysis focuses on certain pragmatic universals (e.g., intention) and categories (e.g., speech techniques) that lead to the disclosure of the present object of study.

Keywords: intention, precedence, political discourse, pragmatic universals

Procedia PDF Downloads 411

1518 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System

Authors: Tadesse Anberbir, Felix Bankole, Tomio Takara, Girma Mamo

Abstract:

In the development of a text-to-speech synthesizer, automatic derivation of correct pronunciation from the grapheme form of a text is a central problem. Deriving phonological features that are not shown in the orthography is particularly challenging. In the Amharic language, geminates and epenthetic vowels are crucial for proper pronunciation, but neither is shown in the orthography. In this paper, we propose a morphological analyzer and integrate it into an Amharic Text-to-Speech system, mainly to predict the positions of geminates and epenthetic vowels, and we prepare a duration modeling method. The Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated by a sentence listening test, and we achieved an average Mean Opinion Score (MOS) of 3.4 (68%), which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowels, we are able to synthesize good quality speech. Our system is suitable for customization to other Ethiopian languages with limited resources.

Keywords: Amharic, gemination, speech synthesis, morphology, epenthesis

Procedia PDF Downloads 69

1517 E-Portfolios as a Means of Perceiving Students’ Listening and Speaking Progress

Authors: Heba Salem

Abstract:

This paper shares the researcher’s experience of using e-Portfolios as an assessment tool to follow up on students’ learning experiences and performance throughout the semester. It also highlights the importance of students’ self-reflection in the process of language learning. The paper begins by introducing the advanced media course, with its focus on listening and speaking skills, and the students’ profiles. It then explains the students’ role in the e-portfolio process: they are given the option to choose a listening text they studied during the semester and a recorded oral production from their collection of artifacts. Students showcase and reflect on their progress in both listening comprehension and speaking. According to the research, re-listening to work given to them and to their own production is a means of reflecting on both their progress and their achievement, and choosing the work they want to showcase is a means of promoting independent learning as well as self-expression. Students are encouraged to go back to the class learning outcomes when choosing the work. In their reflections, students express how they met the specific learning outcome. While giving their presentations, students described how useful it was to return to what they had covered in order to select one text, and to go over their own production as well. They also expressed how beneficial it was to listen to themselves and literally see their progress in both listening comprehension and speaking. Students also reported that they grasped more details from the texts than they did when they first had them as an assignment, which coincided with one of the class learning outcomes. They also reported more confidence in speaking and the ability to use a variety of vocabulary and idiomatic expressions that they had accumulated.
For illustration, this paper includes practical samples of students’ tasks and instructions as well as samples of their reflections. The results of students’ reflections coincide with what the research confirms about the effectiveness of e-portfolios as a means of assessment. The use of e-Portfolios has a twofold benefit: students are able to measure their achievement of the targeted learning outcomes, and teachers receive constructive feedback on their teaching methods.

Keywords: e-portfolios, assessment, self-assessment, listening and speaking progress, foreign language, reflection, learning outcomes, sharing experience

Procedia PDF Downloads 86

1516 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part-of-speech (POS) tagging has always been a challenging task in Natural Language Processing. This article presents POS tagging for Nepali text using a Hidden Markov Model (HMM) and the Viterbi algorithm. Training and testing data sets are randomly separated from an annotated Nepali text corpus. Both methods are applied to the data sets. The Viterbi algorithm is found to be computationally faster and more accurate than the HMM. An accuracy of 95.43% is achieved using the Viterbi algorithm. An error analysis of where the mismatches occurred is discussed in detail.
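As background to this abstract, Viterbi decoding is the standard dynamic-programming method for finding the most probable tag sequence under an HMM. The sketch below is a generic illustration, not the authors' implementation: the two-tag model, the probabilities and the English placeholder words are invented for the example, and the `floor` parameter is an assumed crude smoothing for unseen events.

```python
import math
from typing import Dict, List


def viterbi(
    words: List[str],
    tags: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
    floor: float = 1e-6,
) -> List[str]:
    """Return the most probable tag sequence for `words` under the given HMM."""
    if not words:
        return []

    def logp(p: float) -> float:
        # Work in log space; `floor` stands in for smoothing of unseen events.
        return math.log(max(p, floor))

    # best[i][t]: log-probability of the best tag path ending in tag t at word i.
    best = [{t: logp(start_p.get(t, 0.0)) + logp(emit_p[t].get(words[0], 0.0))
             for t in tags}]
    back: List[Dict[str, str]] = [{}]

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Pick the predecessor tag that maximises the path score into t.
            prev = max(tags, key=lambda s: best[i - 1][s] + logp(trans_p[s].get(t, 0.0)))
            best[i][t] = (best[i - 1][prev]
                          + logp(trans_p[prev].get(t, 0.0))
                          + logp(emit_p[t].get(words[i], 0.0)))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model with made-up probabilities.
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.8, "VERB": 0.2}
TRANS = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.6, "VERB": 0.4}}
EMIT = {"NOUN": {"dogs": 0.6, "bark": 0.1}, "VERB": {"dogs": 0.05, "bark": 0.7}}

if __name__ == "__main__":
    print(viterbi(["dogs", "bark"], TAGS, START, TRANS, EMIT))
```

In a real tagger the start, transition and emission tables would be estimated from counts in the annotated training corpus rather than written by hand.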

Keywords: hidden Markov model, natural language processing, POS tagging, Viterbi algorithm

Procedia PDF Downloads 317

1515 Association of Sensory Processing and Cognitive Deficits in Children with Autism Spectrum Disorders – Pioneer Study in Saudi Arabia

Authors: Rana Zeina

Abstract:

Objective: The association between sensory problems and cognitive abilities has been studied in individuals with Autism Spectrum Disorders (ASDs). In this study, we used a neuropsychological test to evaluate memory and attention in children with ASDs who have sensory problems compared to children with ASDs who do not. Methods: Four visual memory tests of the Cambridge Neuropsychological Test Automated Battery (CANTAB), including Big/Little Circle (BLC), Simple Reaction Time (SRT), Intra/Extra Dimensional Set Shift (IED), and Spatial Recognition Memory (SRM), were administered to 14 children with ASDs and sensory problems and to 13 children with ASDs without sensory problems, aged 3 to 12 with an IQ above 70. Results: Individuals with ASDs and sensory problems performed worse than the group without sensory problems on comprehension, learning, reversal and simple reaction time tasks, and no significant difference between the two groups was recorded in terms of the visual memory and visual comprehension tasks. Conclusion: The findings of this study suggest that children with ASDs and sensory problems face deficits in learning, comprehension, reversal, and speed of response to stimuli.

Keywords: visual memory, attention, autism spectrum disorders, CANTAB eclipse

Procedia PDF Downloads 442

1514 Deep Learning Based-Object-classes Semantic Classification of Arabic Texts

Authors: Imen Elleuch, Wael Ouarda, Gargouri Bilel

Abstract:

In this paper, we propose a deep learning based approach to classify text in order to enrich an Arabic ontology based on the object classes of Gaston Gross. These object classes are defined by taking into account the syntactic and semantic features of the treated language. Our proposed approach is thus a hybrid one: on the one hand, it is based on the object classes, which represent a knowledge-based approach to text classification, and on the other hand, it uses a deep learning approach that relies on word embeddings to classify text. We have applied our proposed approach to a corpus constructed from an Arabic dictionary. The resulting semantic classification of text will enrich the Arabic object classes ontology: new classes can be added to the ontology, or the features that characterize each object class can be expanded. The obtained results are compared to a similar work that treats the same object with a classical linguistic approach to the semantic classification of text. This comparison highlights our proposed hybrid approach, which can be improved by broadening the dataset used in the deep learning process.

Keywords: deep-learning approach, object-classes, semantic classification, Arabic

Procedia PDF Downloads 61

1513 Towards a Deconstructive Text: Beyond Language and the Politics of Absences in Samuel Beckett’s Waiting for Godot

Authors: Afia Shahid

Abstract:

The writing of Samuel Beckett is associated with meaning in meaninglessness and with the production of what he calls ‘literature of unword’. The casual escape from the world of words in the form of silences and pauses in his play Waiting for Godot urges us to question their existence and ultimately leads us to investigate the theory behind their use in the play. This paper proposes that these absences (silence and pause) in Beckett’s play force us to think ‘beyond’ language. It asks how silence and pause in Beckett’s text speak for the emergence of a poststructuralist text. It aims to identify the significant features of the philosophy of deconstruction in Beckett’s play in order to demystify the hostile complicity between literature and philosophy. Working within the interpretive paradigm of poststructuralism, this research focuses on the text as research data. It attempts to delineate the relationship between poststructuralist theoretical concerns and Beckett’s text. Keeping in view the theoretical concerns of the poststructuralist theorist Jacques Derrida, the discussion is directed towards the notion of going ‘beyond’ language into the absences that are aimed at silencing the existing discourse with the ‘radical irony’ of this anti-formal art, which contains its own denial and thus represents the idea of ceaseless questioning and radical contradiction in art and in any text. This article asks how Beckett’s text vibrates with loud silence and disrupts language to demonstrate the emptiness of words, thus exploring the limitless void of absences. Beckett’s text resonates with silence and pause that are neither negation nor affirmation but rather a poststructuralist suspension of reality that is ever changing with the undecidability of all meanings. Within the theoretical notion of Derrida’s Différance, this study interprets silence and pause in Beckett’s art.
The silence and pause behave like Derrida’s Différance and question their own existence in the text, deconstructing any definiteness and finality of reality and extending an undecidable poststructuralist threshold that aims to evade the ‘labyrinth of language’.

Keywords: Différance, language, pause, poststructuralism, silence, text

Procedia PDF Downloads 198

1512 The Platform for Digitization of Georgian Documents

Authors: Erekle Magradze, Davit Soselia, Levan Shughliashvili, Irakli Koberidze, Shota Tsiskaridze, Victor Kakhniashvili, Tamar Chaghiashvili

Abstract:

Since the beginning of active publishing in Georgia, a large volume of printed material has accumulated, and digitizing it is an important task. Digitized materials will be available to the public, and it will be possible to search their text and conduct various kinds of factual research. Digitization here means scanning documents, extracting text from the scans, and processing the text into a corresponding language model to detect inaccuracies and grammatical errors. Implementing these stages requires a unified, scalable, and automated platform, where the digital service developed for each stage performs the task assigned to it; at the same time, it must be possible to develop these services dynamically so that the work of the platform is not interrupted.

Keywords: NLP, OCR, BERT, Kubernetes, transformers

Procedia PDF Downloads 129

1511 In-Context Meta Learning for Automatic Designing Pretext Tasks for Self-Supervised Image Analysis

Authors: Toktam Khatibi

Abstract:

Self-supervised learning (SSL) includes machine learning models that are trained on one aspect and/or one part of the input to learn other aspects and/or parts of it. SSL models fall into two categories: pretext task-based models and contrastive learning models. Pretext tasks are auxiliary tasks that learn pseudo-labels, and the trained models are further fine-tuned for downstream tasks. However, one important disadvantage of SSL based on pretext tasks is the need to define an appropriate pretext task for each image dataset across a variety of image modalities. It is therefore necessary to design an appropriate pretext task automatically for each dataset and each downstream task. To the best of our knowledge, the automatic design of pretext tasks for image analysis has not been considered yet. In this paper, we present a framework based on in-context learning that describes each task based on its input and output data using a pre-trained image transformer. Our proposed method combines the input image and its learned description to optimize the pretext task design and its hyper-parameters using meta-learning models. The representations learned from the pretext tasks are fine-tuned for solving the downstream tasks. We demonstrate that our proposed framework outperforms the compared ones on unseen tasks and image modalities, in addition to its superior performance on previously known tasks and datasets.

Keywords: in-context learning (ICL), meta learning, self-supervised learning (SSL), vision-language domain, transformers

Procedia PDF Downloads 67