Search results for: word processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4214

Search results for: word processing

4124 Students' Errors in Translating Algebra Word Problems to Mathematical Structure

Authors: Ledeza Jordan Babiano

Abstract:

Translating statements into mathematical notations is one of the processes in word problem-solving. However, based on the literature, students still have difficulties with this skill. The purpose of this study was to investigate the translation errors of the students when they translate algebraic word problems into mathematical structures and locate the errors via the lens of the Translation-Verification Model. Moreover, this qualitative research study employed content analysis. During the data-gathering process, the students were asked to answer a six-item algebra word problem questionnaire, and their answers were analyzed by experts through blind coding using the Translation-Verification Model to determine their translation errors. After this, a focus group discussion was conducted, and the data gathered was analyzed through thematic analysis to determine the causes of the students’ translation errors. It was found out that students’ prevalent error in translation was the interpretation error, which was situated in the Attribute construct. The emerging themes during the FGD were: (1) The procedure of translation is strategically incorrect; (2) Lack of comprehension; (3) Algebra concepts related to difficulty; (4) Lack of spatial skills; (5) Unprepared for independent learning; and (6) The content of the problem is developmentally inappropriate. These themes boiled down to the major concept of independent learning preparedness in solving mathematical problems. This concept has subcomponents, which include contextual and conceptual factors in translation. Consequently, the results provided implications for instructors and professors in Mathematics to innovate their teaching pedagogies and strategies to address translation gaps among students.

Keywords: mathematical structure, algebra word problems, translation, errors

Procedia PDF Downloads 20
4123 Electronic-Word of Mouth(e-WoM): Preliminary Study of Malaysian Undergrad Students Smartphone Online Review

Authors: Norshakirah Ab.Aziz, Nurul Atiqah Jamaluddin

Abstract:

Consequently, electronic word-of-mouth (e-WoM) becomes one of the resources in the decision making process and considered a valuable marketing channel for consumers and organizations. Admittedly, there is increasing concern on the accuracy and genuine of e-WoM content because consumers prefer to look out product or service information available online. Thus, the focus of this study is to propose a model and guidelines how to select trusted online review content according to domain chosen –undergrad students smartphone online review. Undeniable, mobile devices like smartphone has now become a necessity in today are daily life to complete our daily chores. The model and guideline focused on product competency review and the message integrity. In other words, this study aims to enable consumers to identify trusted online review content, which helps them in buying decisions.

Keywords: electronic word of mouth, e-WoM, WoM, online review

Procedia PDF Downloads 305
4122 Reduplication in Dhiyan: An Indo-Aryan Language of Assam

Authors: S. Sulochana Singha

Abstract:

Dhiyan or Dehan is the name of the community and language spoken by the Koch-Rajbangshi people of Barak Valley of Assam. Ethnically, they are Mongoloids, and their language belongs to the Indo-Aryan language family. However, Dhiyan is absent in any classification of Indo-Aryan languages. So the classification of Dhiyan language under the Indo-Aryan language family is completely based on the shared typological features of the other Indo-Aryan languages. Typologically, Dhiyan is an agglutinating language, and it shares many features of Indo-Aryan languages like presence of aspirated voiced stops, non-tonal, verb-person agreement, adjectives as different word class, prominent tense and subject object verb word order. Reduplication is a productive word-formation process in Dhiyan. Besides it also expresses plurality, intensification, and distributive. Generally, reduplication in Dhiyan can be at the morphological or lexical level. Morphological reduplication in Dhiyan involves expressives which includes onomatopoeias, sound symbolism, idiophones, and imitatives. Lexical reduplication in the language can be formed by echo formations and word reduplication. Echo formation in Dhiyan is formed by partial repetition from the base word which can be either consonant alternation or vowel alternation. The consonant alternation is basically found in onset position while the alternation of vowel is basically found in open syllable particularly in final syllable. Word reduplication involves reduplication of nouns, interrogatives, adjectives, and numerals which further can be class changing or class maintaining reduplication. The process of reduplication can be partial or complete whether it is lexical or morphological. The present paper is an attempt to describe some aspects of the formation, function, and usage of reduplications in Dhiyan which is mainly spoken in ten villages in the Eastern part of Barak River in the Cachar District of Assam.

Keywords: Barak-Valley, Dhiyan, Indo-Aryan, reduplication

Procedia PDF Downloads 188
4121 Use of Pragmatic Cues for Word Learning in Bilingual and Monolingual Children

Authors: Isabelle Lorge, Napoleon Katsos

Abstract:

BACKGROUND: Children growing up in a multilingual environment face challenges related to the need to monitor the speaker’s linguistic abilities, more frequent communication failures, and having to acquire a large number of words in a limited amount of time compared to monolinguals. As a result, bilingual learners may develop different word learning strategies, rely more on some strategies than others, and engage cognitive resources such as theory of mind and attention skills in different ways. HYPOTHESIS: The goal of our study is to investigate whether multilingual exposure leads to improvements in the ability to use pragmatic inference for word learning, i.e., to use speaker cues to derive their referring intentions, often by overcoming lower level salience effects. The speaker cues we identified as relevant are (a) use of a modifier with or without stress (‘the WET dax’ prompting the choice of the referent which has a dry counterpart), (b) referent extension (‘this is a kitten with a fep’ prompting the choice of the unique rather than shared object), (c) referent novelty (choosing novel action rather than novel object which has been manipulated already), (d) teacher versus random sampling (assuming the choice of specific examples for a novel word to be relevant to the extension of that new category), and finally (e) emotional affect (‘look at the figoo’ uttered in a sad or happy voice) . METHOD: To this end, we implemented on a touchscreen computer a task corresponding to each of the cues above, where the child had to pick the referent of a novel word. These word learning tasks (a), (b), (c), (d) and (e) were adapted from previous word learning studies. 113 children have been tested (54 reception and 59 year 1, ranging from 4 to 6 years old) in a London primary school. Bilingual or monolingual status and other relevant information (age of onset, proficiency, literacy for bilinguals) is ascertained through language questionnaires from parents (34 out of 113 received to date). While we do not yet have the data that will allow us to test for effect of bilingualism, we can already see that performances are far from approaching ceiling in any of the tasks. In some cases the children’s performances radically differ from adults’ in a qualitative way, which means that there is scope for quantitative and qualitative effects to arise between language groups. The findings should contribute to explain the puzzling speed and efficiency that bilinguals demonstrate in acquiring competence in two languages.

Keywords: bilingualism, pragmatics, word learning, attention

Procedia PDF Downloads 109
4120 The Impact of Purpose as a Principal Leadership Skill on the Performance Select Township Schools in South Africa

Authors: Pepe Marais, Krishna Govender

Abstract:

This study aimed to investigate the impact of “purpose” as a principal leadership skill on the performance of two township schools using a quantitative research design and collecting data from the school principals, teachers and matric learners, using the 28-scale Servant Leadership Test as well as Gallup’s Q12 Employee Engagement survey. The questionnaires addressed the key objectives, namely, the extent to which the principals of the participating schools exhibited servant leadership and their understanding of “purpose” as one word in leadership and how teachers and learners perceived the impact of a “one-word” purpose-driven leader on the performance of the selected schools. Although no relationship could be demonstrated between ‘’purpose’’ and the performance of the two township schools, it became evident that a significant increase in Servant Leadership leads to a significant increase in engagement and performance, as measured by the matric pass rate. It is recommended that workshops be facilitated with principals and teachers in order to entrench ‘’purpose’’ deeper throughout the schools. In addition, Servant Leadership training has to be conduced to increase the leadership ability of the school principals. Future research in the area of ‘’purpose as one word’’, as well as Servant Leadership as a principal skillset within South Africa’s public school leadership, is recommended.

Keywords: school leadership, servant leadership, one-word purpose, engagement, leadership

Procedia PDF Downloads 90
4119 Recognition of Grocery Products in Images Captured by Cellular Phones

Authors: Farshideh Einsele, Hassan Foroosh

Abstract:

In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in its size, orientation, style, illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations can not be appropriately defined using wellknown geometric transformations such as translation, rotation, affine transformation and shearing, we use the whole character black pixels as our feature vector. Classification is performed with minimum distance classifier using the maximum likelihood criterion, which delivers very promising Character Recognition Rate (CRR) of 89%. We achieve considerably higher Word Recognition Rate (WRR) of 99% when using lower level linguistic knowledge about product words during the recognition process.

Keywords: camera-based OCR, feature extraction, document, image processing, grocery products

Procedia PDF Downloads 375
4118 A Review on Artificial Neural Networks in Image Processing

Authors: B. Afsharipoor, E. Nazemi

Abstract:

Artificial neural networks (ANNs) are powerful tool for prediction which can be trained based on a set of examples and thus, it would be useful for nonlinear image processing. The present paper reviews several paper regarding applications of ANN in image processing to shed the light on advantage and disadvantage of ANNs in this field. Different steps in the image processing chain including pre-processing, enhancement, segmentation, object recognition, image understanding and optimization by using ANN are summarized. Furthermore, results on using multi artificial neural networks are presented.

Keywords: neural networks, image processing, segmentation, object recognition, image understanding, optimization, MANN

Procedia PDF Downloads 361
4117 Occasional Word-Formation in Postfeminist Fiction: Cognitive Approach

Authors: Kateryna Nykytchenko

Abstract:

Modern fiction and non-fiction writers commonly use their own lexical and stylistic devices to capture a reader’s attention and bring certain thoughts and feelings to his reader. Among such devices is the appearance of one of the neologic notions – individual author’s formations: occasionalisms or nonce words. To a significant extent, the host of examples of new words occurs in chick lit genre which has experienced exponential growth in recent years. Chick Lit is a new-millennial postfeminist fiction which focuses primarily on twenty- to thirtysomething middle-class women. It brings into focus the image of 'a new woman' of the 21st century who is always fallible, funny. This paper aims to investigate different types of occasional word-formation which reflect cognitive mechanisms of conveying women’s perception of the world. Chick lit novels of Irish author Marian Keyes present genuinely innovative mixture of forms, both literary and nonliterary which is displayed in different types of occasional word-formation processes such as blending, compounding, creative respelling, etc. Crossing existing mental and linguistic boundaries, adopting herself to new and overlapping linguistic spaces, chick lit author creates new words which demonstrate the result of development and progress of language and the relationship between language, thought and new reality, ultimately resulting in hybrid word-formation (e.g. affixation or pseudoborrowing). Moreover, this article attempts to present the main characteristics of chick-lit fiction genre with the help of the Marian Keyes’s novels and their influence on occasionalisms. There has been a lack of research concerning cognitive nature of occasionalisms. The current paper intends to account for occasional word-formation as a set of interconnected cognitive mechanisms, operations and procedures meld together to create a new word. The results of the generalized analysis solidify arguments that the kind of new knowledge an occasionalism manifests is inextricably linked with cognitive procedure underlying it, which results in corresponding type of word-formation processes. In addition, the findings of the study reveal that the necessity of creating occasionalisms in postmodern fiction novels arises from the need to write in a new way keeping up with a perpetually developing world, and thus the evolution of the speaker herself and her perception of the world.

Keywords: Chick Lit, occasionalism, occasional word-formation, cognitive linguistics

Procedia PDF Downloads 156
4116 Multi-Sensory Coding as Intervention Therapy for ESL Spellers with Auditory Processing Delays: A South African Case-Study

Authors: A. Van Staden, N. Purcell

Abstract:

Spelling development is complex and multifaceted and relies on several cognitive-linguistic processes. This paper explored the spelling difficulties of English second language learners with auditory processing delays. This empirical study aims to address these issues by means of an intervention design. Specifically, the objectives are: (a) to develop and implement a multi-sensory spelling program for second language learners with auditory processing difficulties (APD) for a period of 6 months; (b) to assess the efficacy of the multi-sensory spelling program and whether this intervention could significantly improve experimental learners' spelling, phonological awareness, and processing (PA), rapid automatized naming (RAN), working memory (WM), word reading and reading comprehension; and (c) to determine the relationship (or interplay) between these cognitive and linguistic skills (mentioned above), and how they influence spelling development. Forty-four English, second language learners with APD were sampled from one primary school in the Free State province. The learners were randomly assigned to either an experimental (n=22) or control group (n=22). During the implementation of the spelling program, several visual, tactile and kinesthetic exercises, including the utilization of fingerspelling were introduced to support the experimental learners’ (N = 22) spelling development. Post-test results showed the efficacy of the multi-sensory spelling program, with the experimental group who were trained in utilising multi-sensory coding and fingerspelling outperforming learners from the control group on the cognitive-linguistic, spelling and reading measures. The results and efficacy of this multi-sensory spelling program and the utilisation of fingerspelling for hearing second language learners with APD open up innovative perspectives for the prevention and targeted remediation of spelling difficulties.

Keywords: English second language spellers, auditory processing delays, spelling difficulties, multi-sensory intervention program

Procedia PDF Downloads 102
4115 An Empirical Study on the Integration of Listening and Speaking Activities with Writing Instruction for Middles School English Language Learners

Authors: Xueyan Hu, Liwen Chen, Weilin He, Sujie Peng

Abstract:

Writing is an important but challenging skill For English language learners. Due to the small amount of time allocated for writing classes at schools, students have relatively few opportunities to practice writing in the classroom. While the practice of integrating listening and speaking activates with writing instruction has been used for adult English language learners, its application for young English learners has seldom been examined due to the challenge of listening and speaking activities for young English language learners. The study attempted to integrating integrating listening and speaking activities with writing instruction for middle school English language learners so as to improving their writing achievements and writing abilities in terms of the word use, coherence, and complexity in their writings. Guided by Gagne's information processing learning theory and memetics, this study conducted a 8-week writing instruction with an experimental class (n=44) and a control class (n=48) . Students in the experimental class participated in a series of listening and retelling activities about a writing sample the teacher used for writing instruction during each period of writing class. Students in the control class were taught traditionally with teachers’ direction instruction using the writing sample. Using the ANCOVA analysis of the scores of students’ writing, word-use, Chinese-English translation and the text structure, this study showed that the experimental writing instruction can significantly improve students’ writing performance. Compared with the students in the control class, the students in experimental class had significant better performance in word use and complexity in their essays. This study provides useful enlightenment for the teaching of English writing for middle school English language learners. Teachers can skillfully use information technology to integrate listening, speaking, and writing teaching, considering students’ language input and output. Teachers need to select suitable and excellent composition templates for students to ensure their high-quality language input.

Keywords: wring instruction, retelling, English language learners, listening and speaking

Procedia PDF Downloads 48
4114 The Lexical Eidos as an Invariant of a Polysemantic Word

Authors: S. Pesina, T. Solonchak

Abstract:

Phenomenological analysis is not based on natural language, but ideal language which is able to be a carrier of ideal meanings – eidos representing typical structures or essences. For this purpose, it’s necessary to release from the spatio-temporal definiteness of a subject and then state its noetic essence (eidos) by means of free fantasy generation. Herewith, as if a totally new objectness is created - the universal, confirming the thesis that thinking process takes place in generalizations passing by numerous means through the specific to the general and from the general through the specific to the singular.

Keywords: lexical eidos, phenomenology, noema, polysemantic word, semantic core

Procedia PDF Downloads 246
4113 Formation of Clipped Forms in Hausa Language

Authors: Maryam Maimota Shehu

Abstract:

Words are the basic building blocks of a language. In everyday usage of a language, words are used, and new words are formed and reformed in order to contain and accommodate all entities, phenomena, qualities and every aspect of the entire life. Despite the fact that many studies have been conducted on morphological processes in Hausa language. Most of the works concentrated on borrowing, affixation, reduplication and derivation, but clipping has been neglected to the extent that only a few scholars sited some examples in the language. Therefore, the current study investigates and examines clipping as one of the word formation processes fully found in the language. The study focuses its main attention on clipping as a word-formation process and how this process is used adequately in the formation of words and their occurrence in Hausa sentences. In order to achieve the aims, the research answered these questions: 1) is clipping used as process of word formation in Hausa? 2) What are the words formed using this process? This study utilizes the Natural Morphology Theory proposed by Dressler, (1985) which was adopted by belly (2007). The data of this study have been collected from newspaper articles, novels, and written literature of Hausa language. Based on the findings, this study found out that, there exist many kinds of words formed in Hausa language using clipping in sentence and discuss, which previous findings did not either reveals, or explain in detail. Other part of the finding shows that clipping in Hausa language occurs on nouns, verbs, adjectives, reduplicated words and compounds while retains their meanings and grammatical classes.

Keywords: clipping, Hausa language, morphology, word formation processes

Procedia PDF Downloads 433
4112 Preparation on Sentimental Analysis on Social Media Comments with Bidirectional Long Short-Term Memory Gated Recurrent Unit and Model Glove in Portuguese

Authors: Leonardo Alfredo Mendoza, Cristian Munoz, Marco Aurelio Pacheco, Manoela Kohler, Evelyn Batista, Rodrigo Moura

Abstract:

Natural Language Processing (NLP) techniques are increasingly more powerful to be able to interpret the feelings and reactions of a person to a product or service. Sentiment analysis has become a fundamental tool for this interpretation but has few applications in languages other than English. This paper presents a classification of sentiment analysis in Portuguese with a base of comments from social networks in Portuguese. A word embedding's representation was used with a 50-Dimension GloVe pre-trained model, generated through a corpus completely in Portuguese. To generate this classification, the bidirectional long short-term memory and bidirectional Gated Recurrent Unit (GRU) models are used, reaching results of 99.1%.

Keywords: natural processing language, sentiment analysis, bidirectional long short-term memory, BI-LSTM, gated recurrent unit, GRU

Procedia PDF Downloads 129
4111 Interpretation of the Russia-Ukraine 2022 War via N-Gram Analysis

Authors: Elcin Timur Cakmak, Ayse Oguzlar

Abstract:

This study presents the results of the tweets sent by Twitter users on social media about the Russia-Ukraine war by bigram and trigram methods. On February 24, 2022, Russian President Vladimir Putin declared a military operation against Ukraine, and all eyes were turned to this war. Many people living in Russia and Ukraine reacted to this war and protested and also expressed their deep concern about this war as they felt the safety of their families and their futures were at stake. Most people, especially those living in Russia and Ukraine, express their views on the war in different ways. The most popular way to do this is through social media. Many people prefer to convey their feelings using Twitter, one of the most frequently used social media tools. Since the beginning of the war, it is seen that there have been thousands of tweets about the war from many countries of the world on Twitter. These tweets accumulated in data sources are extracted using various codes for analysis through Twitter API and analysed by Python programming language. The aim of the study is to find the word sequences in these tweets by the n-gram method, which is known for its widespread use in computational linguistics and natural language processing. The tweet language used in the study is English. The data set consists of the data obtained from Twitter between February 24, 2022, and April 24, 2022. The tweets obtained from Twitter using the #ukraine, #russia, #war, #putin, #zelensky hashtags together were captured as raw data, and the remaining tweets were included in the analysis stage after they were cleaned through the preprocessing stage. In the data analysis part, the sentiments are found to present what people send as a message about the war on Twitter. Regarding this, negative messages make up the majority of all the tweets as a ratio of %63,6. Furthermore, the most frequently used bigram and trigram word groups are found. Regarding the results, the most frequently used word groups are “he, is”, “I, do”, “I, am” for bigrams. Also, the most frequently used word groups are “I, do, not”, “I, am, not”, “I, can, not” for trigrams. In the machine learning phase, the accuracy of classifications is measured by Classification and Regression Trees (CART) and Naïve Bayes (NB) algorithms. The algorithms are used separately for bigrams and trigrams. We gained the highest accuracy and F-measure values by the NB algorithm and the highest precision and recall values by the CART algorithm for bigrams. On the other hand, the highest values for accuracy, precision, and F-measure values are achieved by the CART algorithm, and the highest value for the recall is gained by NB for trigrams.

Keywords: classification algorithms, machine learning, sentiment analysis, Twitter

Procedia PDF Downloads 52
4110 An Event-Related Potential Study of Individual Differences in Word Recognition: The Evidence from Morphological Knowledge of Sino-Korean Prefixes

Authors: Jinwon Kang, Seonghak Jo, Joohee Ahn, Junghye Choi, Sun-Young Lee

Abstract:

A morphological priming has proved its importance by showing that segmentation occurs in morphemes when visual words are recognized within a noticeably short time. Regarding Sino-Korean prefixes, this study conducted an experiment on visual masked priming tasks with 57 ms stimulus-onset asynchrony (SOA) to see how individual differences in the amount of morphological knowledge affect morphological priming. The relationship between the prime and target words were classified as morphological (e.g., 미개척 migaecheog [unexplored] – 미해결 mihaegyel [unresolved]), semantical (e.g., 친환경 chinhwangyeong [eco-friendly]) – 무공해 mugonghae [no-pollution]), and orthographical (e.g., 미용실 miyongsil [beauty shop] – 미확보 mihwagbo [uncertainty]) conditions. We then compared the priming by configuring irrelevant paired stimuli for each condition’s control group. As a result, in the behavioral data, we observed facilitatory priming from a group with high morphological knowledge only under the morphological condition. In contrast, a group with low morphological knowledge showed the priming only under the orthographic condition. In the event-related potential (ERP) data, the group with high morphological knowledge presented the N250 only under the morphological condition. The findings of this study imply that individual differences in morphological knowledge in Korean may have a significant influence on the segmental processing of Korean word recognition.

Keywords: ERP, individual differences, morphological priming, sino-Korean prefixes

Procedia PDF Downloads 181
4109 Cloud Shield: Model to Secure User Data While Using Content Delivery Network Services

Authors: Rachna Jain, Sushila Madan, Bindu Garg

Abstract:

Cloud computing is the key powerhouse in numerous organizations due to shifting of their data to the cloud environment. In recent years it has been observed that cloud-based-services are being used on large scale for content storage, distribution and processing. Various issues have been observed in cloud computing environment that need to be addressed. Security and privacy are found topmost concern area. In this paper, a novel security model is proposed to secure data by utilizing CDN services like image to icon conversion. CDN Service is a content delivery service which converts an image to icon, word to pdf & Latex to pdf etc. Presented model is used to convert an image into icon by keeping image secret. Here security of image is imparted so that image should be encrypted and decrypted by data owners only. It is also discussed in the paper that how server performs multiplication and selection on encrypted data without decryption. The data can be image file, word file, audio or video file. Moreover, the proposed model is capable enough to multiply images, encrypt them and send to a server application for conversion. Eventually, the prime objective is to encrypt an image and convert the encrypted image to image Icon by utilizing homomorphic encryption.

Keywords: cloud computing, user data security, homomorphic encryption, image multiplication, CDN service

Procedia PDF Downloads 314
4108 Using Synonymy in Translation of Hemingway’s 'A Farewell to Arms' from English into Albanian

Authors: Miranda Enesi, Helena Grillo Mukli

Abstract:

The English word-stock is extremely rich in synonyms which can be largely accounted for by the abundant borrowing. Translation problems encountered by translators in general are usually ‘transfer problems’. They face more difficulties in the interpretation of meaning from the source language text than lexical differences between languages. The aim of the study is to inspect the various strategies used in translating from English into Albanian specific words in the ‘A Farwell to arms’ novel. For this purpose, examples translated from English into Albanian were examined. The Albanian equivalents have shown that various strategies were used in order to overcome the problem of rendering words and expressions into the target language. Employed strategies were synonymy, modulation, transposition, calque and word for word translation. In addition, this paper shows that the strategy of translating using synonymy is mostly used. In this paper, an attempt is made to examine the nature of contextual synonymy in order to investigate its problematic nature regarding translation. Types of synonymy are analyzed and then examples from English and Albanian versions are provided to examine the overlap between them.

Keywords: equivalence, literal translation, paraphrasing, transfer problems, synonymy

Procedia PDF Downloads 149
4107 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing is one of the important tasks in natural language processing; i.e. alternative ways to express the same concept by using different words or phrases. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases we create a system that automatically extracts paraphrases from a corpus, which is built from different sources of news article since these are likely to contain paraphrases when they report the same event on the same day. There are existing simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment technique (e.g. Dynamic Time Warping (DTW)) for extracting paraphrase which have been applied to the English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic language, due to the presence of phenomena which are not present in English, such as Free Word Order, Zero copula, and Pro-dropping. These phenomena will affect the performance of these algorithms. Thus, if we can analysis how the existing algorithms for English fail for Arabic then we can find a solution for Arabic. The results are promising.

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 354
4106 A Hebbian Neural Network Model of the Stroop Effect

Authors: Vadim Kulikov

Abstract:

The classical Stroop effect is the phenomenon that it takes more time to name the ink color of a printed word if the word denotes a conflicting color than if it denotes the same color. Over the last 80 years, there have been many variations of the experiment revealing various mechanisms behind semantic, attentional, behavioral and perceptual processing. The Stroop task is known to exhibit asymmetry. Reading the words out loud is hardly dependent on the ink color, but naming the ink color is significantly influenced by the incongruent words. This asymmetry is reversed, if instead of naming the color, one has to point at a corresponding color patch. Another debated aspects are the notions of automaticity and how much of the effect is due to semantic and how much due to response stage interference. Is automaticity a continuous or an all-or-none phenomenon? There are many models and theories in the literature tackling these questions which will be discussed in the presentation. None of them, however, seems to capture all the findings at once. A computational model is proposed which is based on the philosophical idea developed by the author that the mind operates as a collection of different information processing modalities such as different sensory and descriptive modalities, which produce emergent phenomena through mutual interaction and coherence. This is the framework theory where ‘framework’ attempts to generalize the concepts of modality, perspective and ‘point of view’. The architecture of this computational model consists of blocks of neurons, each block corresponding to one framework. In the simplest case there are four: visual color processing, text reading, speech production and attention selection modalities. In experiments where button pressing or pointing is required, a corresponding block is added. In the beginning, the weights of the neural connections are mostly set to zero. The network is trained using Hebbian learning to establish connections (corresponding to ‘coherence’ in framework theory) between these different modalities. The amount of data fed into the network is supposed to mimic the amount of practice a human encounters, in particular it is assumed that converting written text into spoken words is a more practiced skill than converting visually perceived colors to spoken color-names. After the training, the network performs the Stroop task. The RT’s are measured in a canonical way, as these are continuous time recurrent neural networks (CTRNN). The above-described aspects of the Stroop phenomenon along with many others are replicated. The model is similar to some existing connectionist models but as will be discussed in the presentation, has many advantages: it predicts more data, the architecture is simpler and biologically more plausible.

Keywords: connectionism, Hebbian learning, artificial neural networks, philosophy of mind, Stroop

Procedia PDF Downloads 239
4105 Online Topic Model for Broadcasting Contents Using Semantic Correlation Information

Authors: Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park, Sang-Jo Lee

Abstract:

This paper proposes a method of learning topics for broadcasting contents. There are two kinds of texts related to broadcasting contents. One is a broadcasting script which is a series of texts including directions and dialogues. The other is blogposts which possesses relatively abstracted contents, stories and diverse information of broadcasting contents. Although two texts range over similar broadcasting contents, words in blogposts and broadcasting script are different. In order to improve the quality of topics, it needs a method to consider the word difference. In this paper, we introduce a semantic vocabulary expansion method to solve the word difference. We expand topics of the broadcasting script by incorporating the words in blogposts. Each word in blogposts is added to the most semantically correlated topics. We use word2vec to get the semantic correlation between words in blogposts and topics of scripts. The vocabularies of topics are updated and then posterior inference is performed to rearrange the topics. In experiments, we verified that the proposed method can learn more salient topics for broadcasting contents.

Keywords: broadcasting script analysis, topic expansion, semantic correlation analysis, word2vec

Procedia PDF Downloads 230
4104 Complex Event Processing System Based on the Extended ECA Rule

Authors: Kwan Hee Han, Jun Woo Lee, Sung Moon Bae, Twae Kyung Park

Abstract:

ECA (Event-Condition-Action) languages are largely adopted for event processing since they are an intuitive and powerful paradigm for programming reactive systems. However, there are some limitations about ECA rules for processing of complex events such as coupling of event producer and consumer. The objective of this paper is to propose an ECA rule pattern to improve the current limitations of ECA rule, and to develop a prototype system. In this paper, conventional ECA rule is separated into 3 parts and each part is extended to meet the requirements of CEP. Finally, event processing logic is established by combining the relevant elements of 3 parts. The usability of proposed extended ECA rule is validated by a test scenario in this study.

Keywords: complex event processing, ECA rule, Event processing system, event-driven architecture, internet of things

Procedia PDF Downloads 506
4103 Using Maximization Entropy in Developing a Filipino Phonetically Balanced Wordlist for a Phoneme-Level Speech Recognition System

Authors: John Lorenzo Bautista, Yoon-Joong Kim

Abstract:

In this paper, a set of Filipino Phonetically Balanced Word list consisting of 250 words (PBW250) were constructed for a phoneme-level ASR system for the Filipino language. The Entropy Maximization is used to obtain phonological balance in the list. Entropy of phonemes in a word is maximized, providing an optimal balance in each word’s phonological distribution using the Add-Delete Method (PBW algorithm) and is compared to the modified PBW algorithm implemented in a dynamic algorithm approach to obtain optimization. The gained entropy score of 4.2791 and 4.2902 for the PBW and modified algorithm respectively. The PBW250 was recorded by 40 respondents, each with 2 sets data. Recordings from 30 respondents were trained to produce an acoustic model that were tested using recordings from 10 respondents using the HMM Toolkit (HTK). The results of test gave the maximum accuracy rate of 97.77% for a speaker dependent test and 89.36% for a speaker independent test.

Keywords: entropy maximization, Filipino language, Hidden Markov Model, phonetically balanced words, speech recognition

Procedia PDF Downloads 430
4102 Framework for Detecting External Plagiarism from Monolingual Documents: Use of Shallow NLP and N-Gram Frequency Comparison

Authors: Saugata Bose, Ritambhra Korpal

Abstract:

The internet has increased the copy-paste scenarios amongst students as well as amongst researchers leading to different levels of plagiarized documents. For this reason, much of research is focused on for detecting plagiarism automatically. In this paper, an initiative is discussed where Natural Language Processing (NLP) techniques as well as supervised machine learning algorithms have been combined to detect plagiarized texts. Here, the major emphasis is on to construct a framework which detects external plagiarism from monolingual texts successfully. For successfully detecting the plagiarism, n-gram frequency comparison approach has been implemented to construct the model framework. The framework is based on 120 characteristics which have been extracted during pre-processing the documents using NLP approach. Afterwards, filter metrics has been applied to select most relevant characteristics and then supervised classification learning algorithm has been used to classify the documents in four levels of plagiarism. Confusion matrix was built to estimate the false positives and false negatives. Our plagiarism framework achieved a very high the accuracy score.

Keywords: lexical matching, shallow NLP, supervised machine learning algorithm, word n-gram

Procedia PDF Downloads 328
4101 Phonological Encoding and Working Memory in Kannada Speaking Adults Who Stutter

Authors: Nirmal Sugathan, Santosh Maruthy

Abstract:

Background: A considerable number of studies have evidenced that phonological encoding (PE) and working memory (WM) skills operate differently in adults who stutter (AWS). In order to tap these skills, several paradigms have been employed such as phonological priming, phoneme monitoring, and nonword repetition tasks. This study, however, utilizes a word jumble paradigm to assess both PE and WM using different modalities and this may give a better understanding of phonological processing deficits in AWS. Aim: The present study investigated PE and WM abilities in conjunction with lexical access in AWS using jumbled words. The study also aimed at investigating the effect of increase in cognitive load on phonological processing in AWS by comparing the speech reaction time (SRT) and accuracy scores across various syllable lengths. Method: Participants were 11 AWS (Age range=19-26) and 11 adults who do not stutter (AWNS) (Age range=19-26) matched for age, gender and handedness. Stimuli: Ninety 3-, 4-, and 5-syllable jumbled words (JWs) (n=30 per syllable length category) constructed from Kannada words served as stimuli for jumbled word paradigm. In order to generate jumbled words (JWs), the syllables in the real words were randomly transpositioned. Procedures: To assess PE, the JWs were presently visually using DMDX software and for WM task, JWs were presented through auditory mode through headphones. The participants were asked to silently manipulate the jumbled words to form a Kannada real word and verbally respond once. The responses for both tasks were audio recorded using record function in DMDX software and the recorded responses were analyzed using PRAAT software to calculate the SRT. Results: SRT: Mann-Whitney test results demonstrated that AWS performed significantly slower on both tasks (p < 0.001) as indicated by increased SRT. Also, AWS presented with increased SRT on both the tasks in all syllable length conditions (p < 0.001). Effect of syllable length: Wilcoxon signed rank test was carried out revealed that, on task assessing PE, the SRT of 4syllable JWs were significantly higher in both AWS (Z= -2.93, p=.003) and AWNS (Z= -2.41, p=.003) when compared to 3-syllable words. However, the findings for 4- and 5-syllable words were not significant. Task Accuracy: The accuracy scores were calculated for three syllable length conditions for both PE and PM tasks and were compared across the groups using Mann-Whitney test. The results indicated that the accuracy scores of AWS were significantly below that of AWNS in all the three syllable conditions for both the tasks (p < 0.001). Conclusion: The above findings suggest that PE and WM skills are compromised in AWS as indicated by increased SRT. Also, AWS were progressively less accurate in descrambling JWs of increasing syllable length and this may be interpreted as, rather than existing as a uniform deficiency, PE and WM deficits emerge when the cognitive load is increased. AWNS exhibited increased SRT and increased accuracy for JWs of longer syllable length whereas AWS was not benefited from increasing the reaction time, thus AWS had to compromise for both SRT and accuracy while solving JWs of longer syllable length.

Keywords: adults who stutter, phonological ability, working memory, encoding, jumbled words

Procedia PDF Downloads 208
4100 The Usage of Negative Emotive Words in Twitter

Authors: Martina Katalin Szabó, István Üveges

Abstract:

In this paper, the usage of negative emotive words is examined on the basis of a large Hungarian twitter-database via NLP methods. The data is analysed from a gender point of view, as well as changes in language usage over time. The term negative emotive word refers to those words that, on their own, without context, have semantic content that can be associated with negative emotion, but in particular cases, they may function as intensifiers (e.g. rohadt jó ’damn good’) or a sentiment expression with positive polarity despite their negative prior polarity (e.g. brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’. Based on the findings of several authors, the same phenomenon can be found in other languages, so it is probably a language-independent feature. For the recent analysis, 67783 tweets were collected: 37818 tweets (19580 tweets written by females and 18238 tweets written by males) in 2016 and 48344 (18379 tweets written by females and 29965 tweets written by males) in 2021. The goal of the research was to make up two datasets comparable from the viewpoint of semantic changes, as well as from gender specificities. An exhaustive lexicon of Hungarian negative emotive intensifiers was also compiled (containing 214 words). After basic preprocessing steps, tweets were processed by ‘magyarlanc’, a toolkit is written in JAVA for the linguistic processing of Hungarian texts. Then, the frequency and collocation features of all these words in our corpus were automatically analyzed (via the analysis of parts-of-speech and sentiment values of the co-occurring words). Finally, the results of all four subcorpora were compared. Here some of the main outcomes of our analyses are provided: There are almost four times fewer cases in the male corpus compared to the female corpus when the negative emotive intensifier modified a negative polarity word in the tweet (e.g., damn bad). At the same time, male authors used these intensifiers more frequently, modifying a positive polarity or a neutral word (e.g., damn good and damn big). Results also pointed out that, in contrast to female authors, male authors used these words much more frequently as a positive polarity word as well (e.g., brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’). We also observed that male authors use significantly fewer types of emotive intensifiers than female authors, and the frequency proportion of the words is more balanced in the female corpus. As for changes in language usage over time, some notable differences in the frequency and collocation features of the words examined were identified: some of the words collocate with more positive words in the 2nd subcorpora than in the 1st, which points to the semantic change of these words over time.

Keywords: gender differences, negative emotive words, semantic changes over time, twitter

Procedia PDF Downloads 172
4099 A Corpus-Assisted Discourse Analysis of Adjectival Collocation of the Word 'Education' in the American Context

Authors: Ngan Nguyen

Abstract:

The study analyses adjectives collocating with the word ‘education’ in the American language of the Corpus of Global Web-based English using a combination of corpus linguistic and discourse analytical methods to examine not only language patterns but also social political ideologies around the topic. Significant conclusions are deduced: (1) there are a large number of adjectival collocates of the word education which have been identified and classified into four categories representing four different aspects of education: level, quality, forms and types of education; (2) education, as in combination with three first categories, carries the meaning as the act and process of teaching and learning while with the last category having the meaning of a particular kind of teaching or training; (3) higher education is the topic that gains most concerns from the American public; (4) five most significant ideologies are discovered from the corpus: higher education associates with financial affairs, higher education is an industry, monetary policy of the government on higher education, people require greater accessibility to higher education and people value higher education. The study contributes to the field of developing meanings of words through corpus analysis and the field of discourse analysis.

Keywords: adjectival collocation, American context, corpus linguistics, discourse analysis, education

Procedia PDF Downloads 304
4098 Speech Identification Test for Individuals with High-Frequency Sloping Hearing Loss in Telugu

Authors: S. B. Rathna Kumar, Sandya K. Varudhini, Aparna Ravichandran

Abstract:

Telugu is a south central Dravidian language spoken in Andhra Pradesh, a southern state of India. The available speech identification tests in Telugu have been developed to determine the communication problems of individuals having a flat frequency hearing loss. These conventional speech audiometric tests would provide redundant information when used on individuals with high-frequency sloping hearing loss because of better hearing sensitivity in the low- and mid-frequency regions. Hence, conventional speech identification tests do not indicate the true nature of the communication problem of individuals with high-frequency sloping hearing loss. It is highly possible that a person with a high-frequency sloping hearing loss may get maximum scores if conventional speech identification tests are used. Hence, there is a need to develop speech identification test materials that are specifically designed to assess the speech identification performance of individuals with high-frequency sloping hearing loss. The present study aimed to develop speech identification test for individuals with high-frequency sloping hearing loss in Telugu. Individuals with high-frequency sloping hearing loss have difficulty in perception of voiceless consonants whose spectral energy is above 1000 Hz. Hence, the word lists constructed with phonemes having mid- and high-frequency spectral energy will estimate speech identification performance better for such individuals. The phonemes /k/, /g/, /c/, /ṭ/ /t/, /p/, /s/, /ś/, /ṣ/ and /h/are preferred for the construction of words as these phonemes have spectral energy distributed in the frequencies above 1000 KHz predominantly. The present study developed two word lists in Telugu (each word list contained 25 words) for evaluating speech identification performance of individuals with high-frequency sloping hearing loss. The performance of individuals with high-frequency sloping hearing loss was evaluated using both conventional and high-frequency word lists under recorded voice condition. The results revealed that the developed word lists were found to be more sensitive in identifying the true nature of the communication problem of individuals with high-frequency sloping hearing loss.

Keywords: speech identification test, high-frequency sloping hearing loss, recorded voice condition, Telugu

Procedia PDF Downloads 393
4097 Verb Bias in Mandarin: The Corpus Based Study of Children

Authors: Jou-An Chung

Abstract:

The purpose of this study is to investigate the verb bias of the Mandarin verbs in children’s reading materials and provide the criteria for categorization. Verb bias varies cross-linguistically. As Mandarin and English are typological different, this study hopes to shed light on Mandarin verb bias with the use of corpus and provide thorough and detailed criteria for analysis. Moreover, this study focuses on children’s reading materials since it is a significant issue in understanding children’s sentence processing. Therefore, investigating verb bias of Mandarin verbs in children’s reading materials is also an important issue and can provide further insights into children’s sentence processing. The small corpus is built up for this study. The corpus consists of the collection of school textbooks and Mandarin Daily News for children. The files are then segmented and POS tagged by JiebaR (Chinese segmentation with R). For the ease of analysis, the one-word character verbs and intransitive verbs are excluded beforehand. The total of 20 high frequency verbs are hand-coded and are further categorized into one of the three types, namely DO type, SC type and other category. If the frequency of taking Other Type exceeds the threshold of 25%, the verb is excluded from the study. The results show that 10 verbs are direct object bias verbs, and six verbs are sentential complement bias verbs. The paired T-test was done to assure the statistical significance (p = 0.0001062 for DO bias verb, p=0.001149 for SC bias verb). The result has shown that in children’s reading materials, the DO biased verbs are used more than the SC bias verbs since the simplest structure of sentences is easier for children’s sentence comprehension or processing. In sum, this study not only discussed verb bias in child's reading materials but also provided basic coding criteria for verb bias analysis in Mandarin and underscored the role of context. Sentences are easier for children’s sentence comprehension or processing. In sum, this study not only discussed verb bias in child corpus, but also provided basic coding criteria for verb bias analysis in Mandarin and underscored the role of context.

Keywords: corpus linguistics, verb bias, child language, psycholinguistics

Procedia PDF Downloads 248
4096 IoT Based Information Processing and Computing

Authors: Mannan Ahmad Rasheed, Sawera Kanwal, Mansoor Ahmad Rasheed

Abstract:

The Internet of Things (IoT) has revolutionized the way we collect and process information, making it possible to gather data from a wide range of connected devices and sensors. This has led to the development of IoT-based information processing and computing systems that are capable of handling large amounts of data in real time. This paper provides a comprehensive overview of the current state of IoT-based information processing and computing, as well as the key challenges and gaps that need to be addressed. This paper discusses the potential benefits of IoT-based information processing and computing, such as improved efficiency, enhanced decision-making, and cost savings. Despite the numerous benefits of IoT-based information processing and computing, several challenges need to be addressed to realize the full potential of these systems. These challenges include security and privacy concerns, interoperability issues, scalability and reliability of IoT devices, and the need for standardization and regulation of IoT technologies. Moreover, this paper identifies several gaps in the current research related to IoT-based information processing and computing. One major gap is the lack of a comprehensive framework for designing and implementing IoT-based information processing and computing systems.

Keywords: IoT, computing, information processing, Iot computing

Procedia PDF Downloads 144
4095 The Algorithm of Semi-Automatic Thai Spoonerism Words for Bi-Syllable

Authors: Nutthapat Kaewrattanapat, Wannarat Bunchongkien

Abstract:

The purposes of this research are to study and develop the algorithm of Thai spoonerism words by semi-automatic computer programs, that is to say, in part of data input, syllables are already separated and in part of spoonerism, the developed algorithm is utilized, which can establish rules and mechanisms in Thai spoonerism words for bi-syllables by utilizing analysis in elements of the syllables, namely cluster consonant, vowel, intonation mark and final consonant. From the study, it is found that bi-syllable Thai spoonerism has 1 case of spoonerism mechanism, namely transposition in value of vowel, intonation mark and consonant of both 2 syllables but keeping consonant value and cluster word (if any). From the study, the rules and mechanisms in Thai spoonerism word were applied to develop as Thai spoonerism word software, utilizing PHP program. the software was brought to conduct a performance test on software execution; it is found that the program performs bi-syllable Thai spoonerism correctly or 99% of all words used in the test and found faults on the program at 1% as the words obtained from spoonerism may not be spelling in conformity with Thai grammar and the answer in Thai spoonerism could be more than 1 answer.

Keywords: algorithm, spoonerism, computational linguistics, Thai spoonerism

Procedia PDF Downloads 196