Search results for: lemma_pos (a token where lemma and pos of word are joined by underscore)
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1152

Search results for: lemma_pos (a token where lemma and pos of word are joined by underscore)

1032 Contextual Senses of Ambiguous Words Based on Cognitive Semantics

Authors: Madhavi

Abstract:

All linguistic units are context-dependent. They occur in particular settings, from which they derive much of their import, and are recognized by speakers as distinct entities only through a process of abstraction. Most of the words have several concepts associated with them and convey a number of meanings in different contexts in any language. For instance, there are different uses of the word good as an adjective from English. The adjective good expresses many senses like (1) ‘high quality of someone or something’ (2) ‘efficient’ (3) ‘virtuous’ (4) ‘reliable’ etc. These senses will be analyzed by using cognitive semantics framework. The context has the power to insulate one meaning from all the other meanings in communication. This paper will provide a cognitive semantic analysis. The basic tenet of cognitive semantics is the sense of a word is the way we conceptualize it. Our conceptualization is based on the physical experience we go through. Cognitive semantics tries to capture this conceptualization in terms of some categories like schema, frame, and domain. Cognitive semantics is a subfield of cognitive linguistics. Cognitive linguistics studies the language creation, learning, and usage by the reference to human cognition. The semantic structure is conceptual structure which is related to the concepts which are the elements of reason and constitute the meanings of words and linguistic expressions. Cognitive semantics studies how our mind works for the meaning of any word and how it perceives meaning from the environment through senses and works to map with the knowledge which already exists in our mind through experience. In the present paper, the senses are further classified into some categories.

Keywords: cognitive, contexts, semantics, senses

Procedia PDF Downloads 221
1031 The Facilitatory Effect of Phonological Priming on Visual Word Recognition in Arabic as a Function of Lexicality and Overlap Positions

Authors: Ali Al Moussaoui

Abstract:

An experiment was designed to assess the performance of 24 Lebanese adults (mean age 29:5 years) in a lexical decision making (LDM) task to find out how the facilitatory effect of phonological priming (PP) affects the speed of visual word recognition in Arabic as lexicality (wordhood) and phonological overlap positions (POP) vary. The experiment falls in line with previous research on phonological priming in the light of the cohort theory and in relation to visual word recognition. The experiment also departs from the research on the Arabic language in which the importance of the consonantal root as a distinct morphological unit is confirmed. Based on previous research, it is hypothesized that (1) PP has a facilitating effect in LDM with words but not with nonwords and (2) final phonological overlap between the prime and the target is more facilitatory than initial overlap. An LDM task was programmed on PsychoPy application. Participants had to decide if a target (e.g., bayn ‘between’) preceded by a prime (e.g., bayt ‘house’) is a word or not. There were 4 conditions: no PP (NP), nonwords priming nonwords (NN), nonwords priming words (NW), and words priming words (WW). The conditions were simultaneously controlled for word length, wordhood, and POP. The interstimulus interval was 700 ms. Within the PP conditions, POP was controlled for in which there were 3 overlap positions between the primes and the targets: initial (e.g., asad ‘lion’ and asaf ‘sorrow’), final (e.g., kattab ‘cause to write’ 2sg-mas and rattab ‘organize’ 2sg-mas), or two-segmented (e.g., namle ‘ant’ and naħle ‘bee’). There were 96 trials, 24 in each condition, using a within-subject design. The results show that concerning (1), the highest average reaction time (RT) is that in NN, followed firstly by NW and finally by WW. There is statistical significance only between the pairs NN-NW and NN-WW. Regarding (2), the shortest RT is that in the two-segmented overlap condition, followed by the final POP in the first place and the initial POP in the last place. The difference between the two-segmented and the initial overlap is significant, while other pairwise comparisons are not. Based on these results, PP emerges as a facilitatory phenomenon that is highly sensitive to lexicality and POP. While PP can have a facilitating effect under lexicality, it shows no facilitation in its absence, which intersects with several previous findings. Participants are found to be more sensitive to the final phonological overlap than the initial overlap, which also coincides with a body of earlier literature. The results contradict the cohort theory’s stress on the onset overlap position and, instead, give more weight to final overlap, and even heavier weight to the two-segmented one. In conclusion, this study confirms the facilitating effect of PP with words but not when stimuli (at least the primes and at most both the primes and targets) are nonwords. It also shows that the two-segmented priming is the most influential in LDM in Arabic.

Keywords: lexicality, phonological overlap positions, phonological priming, visual word recognition

Procedia PDF Downloads 186
1030 The Conflict of Grammaticality and Meaningfulness of the Corrupt Words: A Cross-lingual Sociolinguistic Study

Authors: Jayashree Aanand, Gajjam

Abstract:

The grammatical tradition in Sanskrit literature emphasizes the importance of the correct use of Sanskrit words or linguistic units (sādhu śabda) that brings the meritorious values, denying the attribution of the same religious merit to the incorrect use of Sanskrit words (asādhu śabda) or the vernacular or corrupt forms (apa-śabda or apabhraṁśa), even though they may help in communication. The current research, the culmination of the doctoral research on sentence definition, studies the difference among the comprehension of both correct and incorrect word forms in Sanskrit and Marathi languages in India. Based on the total of 19 experiments (both web-based and classroom-controlled) on approximately 900 Indian readers, it is found that while the incorrect forms in Sanskrit are comprehended with lesser accuracy than the correct word forms, no such difference can be seen for the Marathi language. It is interpreted that the incorrect word forms in the native language or in the language which is spoken daily (such as Marathi) will pose a lesser cognitive load as compared to the language that is not spoken on a daily basis but only used for reading (such as Sanskrit). The theoretical base for the research problem is as follows: among the three main schools of Language Science in ancient India, the Vaiyākaraṇas (Grammarians) hold that the corrupt word forms do have their own expressive power since they convey meaning, while as the Mimāṁsakas (the Exegesists) and the Naiyāyikas (the Logicians) believe that the corrupt forms can only convey the meaning indirectly, by recalling their association and similarity with the correct forms. The grammarians argue that the vernaculars that are born of the speaker’s inability to speak proper Sanskrit are regarded as degenerate versions or fallen forms of the ‘divine’ Sanskrit language and speakers who could not use proper Sanskrit or the standard language were considered as Śiṣṭa (‘elite’). The different ideas of different schools strictly adhere to their textual dispositions. For the last few years, sociolinguists have agreed that no variety of language is inherently better than any other; they are all the same as long as they serve the need of people that use them. Although the standard form of a language may offer the speakers some advantages, the non-standard variety is considered the most natural style of speaking. This is visible in the results. If the incorrect word forms incur the recall of the correct word forms in the reader as the theory suggests, it would have added one extra step in the process of sentential cognition leading to more cognitive load and less accuracy. This has not been the case for the Marathi language. Although speaking and listening to the vernaculars is the common practice and reading the vernacular is not, Marathi readers have readily and accurately comprehended the incorrect word forms in the sentences, as against the Sanskrit readers. The primary reason being Sanskrit is spoken and also read in the standard form only and the vernacular forms in Sanskrit are not found in the conversational data.

Keywords: experimental sociolinguistics, grammaticality and meaningfulness, Marathi, Sanskrit

Procedia PDF Downloads 127
1029 Morphological Rules of Bangla Repetition Words for UNL Based Machine Translation

Authors: Nawab Yousuf Ali, S. Golam, A. Ameer, Ashok Toru Roy

Abstract:

This paper develops new morphological rules suitable for Bangla repetition words to be incorporated into an inter lingua representation called Universal Networking Language (UNL). The proposed rules are to be used to combine verb roots and their inflexions to produce words which are then combined with other similar types of words to generate repetition words. This paper outlines the format of morphological rules for different types of repetition words that come from verb roots based on the framework of UNL provided by the UNL centre of the Universal Networking Digital Language (UNDL) foundation.

Keywords: Universal Networking Language (UNL), universal word (UW), head word (HW), Bangla-UNL Dictionary, morphological rule, enconverter (EnCo)

Procedia PDF Downloads 311
1028 The Development of Chinese-English Homophonic Word Pairs Databases for English Teaching and Learning

Authors: Yuh-Jen Wu, Chun-Min Lin

Abstract:

Homophonic words are common in Mandarin Chinese which belongs to the tonal language family. Using homophonic cues to study foreign languages is one of the learning techniques of mnemonics that can aid the retention and retrieval of information in the human memory. When learning difficult foreign words, some learners transpose them with words in a language they are familiar with to build an association and strengthen working memory. These phonological clues are beneficial means for novice language learners. In the classroom, if mnemonic skills are used at the appropriate time in the instructional sequence, it may achieve their maximum effectiveness. For Chinese-speaking students, proper use of Chinese-English homophonic word pairs may help them learn difficult vocabulary. In this study, a database program is developed by employing Visual Basic. The database contains two corpora, one with Chinese lexical items and the other with English ones. The Chinese corpus contains 59,053 Chinese words that were collected by a web crawler. The pronunciations of this group of words are compared with words in an English corpus based on WordNet, a lexical database for the English language. Words in both databases with similar pronunciation chunks and batches are detected. A total of approximately 1,000 Chinese lexical items are located in the preliminary comparison. These homophonic word pairs can serve as a valuable tool to assist Chinese-speaking students in learning and memorizing new English vocabulary.

Keywords: Chinese, corpus, English, homophonic words, vocabulary

Procedia PDF Downloads 185
1027 Morpheme Based Parts of Speech Tagger for Kannada Language

Authors: M. C. Padma, R. J. Prathibha

Abstract:

Parts of speech tagging is the process of assigning appropriate parts of speech tags to the words in a given text. The critical or crucial information needed for tagging a word come from its internal structure rather from its neighboring words. The internal structure of a word comprises of its morphological features and grammatical information. This paper presents a morpheme based parts of speech tagger for Kannada language. This proposed work uses hierarchical tag set for assigning tags. The system is tested on some Kannada words taken from EMILLE corpus. Experimental result shows that the performance of the proposed system is above 90%.

Keywords: hierarchical tag set, morphological analyzer, natural language processing, paradigms, parts of speech

Procedia PDF Downloads 296
1026 Polish Catholic Discourse on Gender Equality in the Face of Social and Cultural Changes in Poland

Authors: Anna Jagielska

Abstract:

Five years ago, the word ‘gender’ was discussed in Poland exclusively in academic contexts. One year later, it was chosen as the word of the year and omnipresent in the Polish media. The rapid career of this word is due to the involvement of the Polish church hierarchy who strategically brought this term into relation with abortion, pornography and paedophilia. ‘Gender’ is more than a political slogan. It is a symbol of social anxiety and moral panic in Poland which need to be historically considered. The aim of this paper is to present selected rhetorical strategies used by the Polish Catholic clergy who strive to have an impact on the current gender discourse in Poland. In particular, the gender debate, culminated in the pastoral letter of the Bishops' Conference of Poland, will be discussed. The church’s protest against the Council of Europe’s Convention on Preventing and Combating Violence against Women and Domestic Violence will be analyzed and the recent heated debates in Poland on contraception, abortion, in vitro fertilization, and sex education will be mentioned. To provide explanations on the specificity of Polish gender debates the role of the Catholic Church in the fall of communism in Poland as well as the charismatisation of Polish society by Pope John Paul II will be explained. The social constructions of communism and feminism which are manifested in both written and symbolic contracts on gender equality between the Church and the State will be demonstrated. At the end of the paper, theories about the changing role of religion in society will be applied.

Keywords: gender, Poland, religion, catholicism, feminism

Procedia PDF Downloads 292
1025 Bilingual Gaming Kit to Teach English Language through Collaborative Learning

Authors: Sarayu Agarwal

Abstract:

This paper aims to teach English (secondary language) by bridging the understanding between the Regional language (primary language) and the English Language (secondary language). Here primary language is the one a person has learned from birth or within the critical period, while secondary language would be any other language one learns or speaks. The paper also focuses on evolving old teaching methods to a contemporary participatory model of learning and teaching. Pilot studies were conducted to gauge an understanding of student’s knowledge of the English language. Teachers and students were interviewed and their academic curriculum was assessed as a part of the initial study. Extensive literature study and design thinking principles were used to devise a solution to the problem. The objective is met using a holistic learning kit/card game to teach children word recognition, word pronunciation, word spelling and writing words. Implication of the paper is a noticeable improvement in the understanding and grasping of English language. With increasing usage and applicability of English as a second language (ESL) world over, the paper becomes relevant due to its easy replicability to any other primary or secondary language. Future scope of this paper would be transforming the idea of participatory learning into self-regulated learning methods. With the upcoming govt. learning centres in rural areas and provision of smart devices such as tablets, the development of the card games into digital applications seems very feasible.

Keywords: English as a second language, vocabulary-building card games, learning through gamification, rural education

Procedia PDF Downloads 247
1024 Comparing the Contribution of General Vocabulary Knowledge and Academic Vocabulary Knowledge to Learners' Academic Achievement

Authors: Reem Alsager, James Milton

Abstract:

Coxhead’s (2000) Academic Word List (AWL) believed to be essential for students pursuing higher education and helps differentiate English for Academic Purposes (EAP) from General English as a course of study, and it is thought to be important for comprehending English academic texts. It has been described that AWL is an infrequent, discrete set of vocabulary items unreachable from general language. On the other hand, it has been known for a period of time that general vocabulary knowledge is a good predictor of academic achievement. This study, however, is an attempt to measure and compare the contribution of academic knowledge and general vocabulary knowledge to learners’ GPA and examine what knowledge is a better predictor of academic achievement and investigate whether AWL as a specialised list of infrequent words relates to the frequency effect. The participants were comprised of 44 international postgraduate students in Swansea University, all from the School of Management, following the taught MSc (Master of Science). The study employed the Academic Vocabulary Size Test (AVST) and the XK_Lex vocabulary size test. The findings indicate that AWL is a list based on word frequency rather than a discrete and unique word list and that the AWL performs the same function as general vocabulary, with tests of each found to measure largely the same quality of knowledge. The findings also suggest that the contribution that AWL knowledge provides for academic success is not sufficient and that general vocabulary knowledge is better in predicting academic achievement. Furthermore, the contribution that academic knowledge added above the contribution of general vocabulary knowledge when combined is really small and noteworthy. This study’s results are in line with the argument and suggest that it is the development of general vocabulary size is an essential quality for academic success and acquiring the words of the AWL will form part of this process. The AWL by itself does not provide sufficient coverage, and is probably not specialised enough, for knowledge of this list to influence this general process. It can be concluded that AWL as an academic word list epitomizes only a fraction of words that are actually needed for academic success in English and that knowledge of academic vocabulary combined with general vocabulary knowledge above the most frequent 3000 words is what matters most to ultimate academic success.

Keywords: academic achievement, academic vocabulary, general vocabulary, vocabulary size

Procedia PDF Downloads 220
1023 Method to Create Signed Word - Application in Teaching and Learning Vietnamese Sign Language

Authors: Nguyen Thi Kim Thoa

Abstract:

Vietnam currently has about two million five hundred deaf/hard of hearing people. Although the issue of Vietnamese Sign Language (VSL) education has received attention from the State, there are still many issues that need to be resolved, such as policies, teacher training in both knowledge and teaching methods, education programs, and textbook compilation. Furthermore, the issue of research on VSL has not yet attracted the attention of linguists. Using the quantitative description method, the article will analyze, synthesize, and compare to find methods to create signed words in VSL, such as based on external shape characteristics, operational characteristics, operating methods, and basic meanings, from which we can see the special nature of signed words, the division of word types and the morphological meaning of creating new words through sign methods. From the results of this research, the aspect of ‘visual culture’ will be clarified in Vietnamese Deaf Culture. Through that, we also develop a number of vocabulary teaching methods (such as teaching vocabulary through a group of methods of forming signed words, teaching vocabulary using mind maps, and teaching vocabulary through culture...), with the aim of further improving the effectiveness of teaching and learning VSL in Vietnam. The research results also provide deaf people in Vietnam with a scientific and effective method of learning vocabulary, helping them quickly integrate into the community. The article will be a useful reference for linguists who want to research VSL.

Keywords: Vietnamese sign language (VSL), signed word, teaching, method

Procedia PDF Downloads 40
1022 Signed Language Phonological Awareness: Building Deaf Children's Vocabulary in Signed and Written Language

Authors: Lynn Mcquarrie, Charlotte Enns

Abstract:

The goal of this project was to develop a visually-based, signed language phonological awareness training program and to pilot the intervention with signing deaf children (ages 6 -10 years/ grades 1 - 4) who were beginning readers to assess the effects of systematic explicit American Sign Language (ASL) phonological instruction on both ASL vocabulary and English print vocabulary learning. Growing evidence that signing learners utilize visually-based signed language phonological knowledge (homologous to the sound-based phonological level of spoken language processing) when reading underscore the critical need for further research on the innovation of reading instructional practices for visual language learners. Multiple single-case studies using a multiple probe design across content (i.e., sign and print targets incorporating specific ASL phonological parameters – handshapes) was implemented to examine if a functional relationship existed between instruction and acquisition of these skills. The results indicated that for all cases, representing a variety of language abilities, the visually-based phonological teaching approach was exceptionally powerful in helping children to build their sign and print vocabularies. Although intervention/teaching studies have been essential in testing hypotheses about spoken language phonological processes supporting non-deaf children’s reading development, there are no parallel intervention/teaching studies exploring hypotheses about signed language phonological processes in supporting deaf children’s reading development. This study begins to provide the needed evidence to pursue innovative teaching strategies that incorporate the strengths of visual learners.

Keywords: American sign language phonological awareness, dual language strategies, vocabulary learning, word reading

Procedia PDF Downloads 334
1021 The Image of Cultural Tourism in the Tourists’ Point of View

Authors: Wanida Suwunniponth

Abstract:

The purposes of this research were to investigate the perceived of a cultural image and loyalty of tourists toward the attraction at Banglumphu neighborhood in Bangkok and to study the relationship of the cultural image of Banglumphu community and loyalty to visit this area of the tourists. This study employed both quantitative approach and qualitative approach. In a quantitative research, a questionnaire was used to collect data from 300 systematic sampled tourists who visited Banglumphu area and the correlation analysis were used to analyze data. The results revealed that the overall tourists’ point of view toward Banglumphu cultural image was at a good level which lifestyle had the best image, followed by value and belief, physical dimension, community identity, tradition, and local wisdom. In addition, the overall aspect of tourists’ loyalty including satisfaction, word of mouths, and revisiting were at good levels which word of mouths received the highest value, followed by revisiting, and satisfaction, respectively. In addition, the relationship between cultural image in aspect on lifestyle, tradition, local wisdom, belief, community identity and loyalty to visit Banglumphu in each aspect on satisfaction, word of mouths, and revisiting were moderately correlated at the significant level of 0.05, except physical dimension was not correlated with each aspect of tourists’ loyalty.

Keywords: cultural tourism, image, loyalty, revisit

Procedia PDF Downloads 252
1020 Contribution of Word Decoding and Reading Fluency on Reading Comprehension in Young Typical Readers of Kannada Language

Authors: Vangmayee V. Subban, Suzan Deelan. Pinto, Somashekara Haralakatta Shivananjappa, Shwetha Prabhu, Jayashree S. Bhat

Abstract:

Introduction and Need: During early years of schooling, the instruction in the schools mainly focus on children’s word decoding abilities. However, the skilled readers should master all the components of reading such as word decoding, reading fluency and comprehension. Nevertheless, the relationship between each component during the process of learning to read is less clear. The studies conducted in alphabetical languages have mixed opinion on relative contribution of word decoding and reading fluency on reading comprehension. However, the scenarios in alphasyllabary languages are unexplored. Aim and Objectives: The aim of the study was to explore the role of word decoding, reading fluency on reading comprehension abilities in children learning to read Kannada between the age ranges of 5.6 to 8.6 years. Method: In this cross sectional study, a total of 60 typically developing children, 20 each from Grade I, Grade II, Grade III maintaining equal gender ratio between the age range of 5.6 to 6.6 years, 6.7 to 7.6 years and 7.7 to 8.6 years respectively were selected from Kannada medium schools. The reading fluency and reading comprehension abilities of the children were assessed using Grade level passages selected from the Kannada text book of children core curriculum. All the passages consist of five questions to assess reading comprehension. The pseudoword decoding skills were assessed using 40 pseudowords with varying syllable length and their Akshara composition. Pseudowords are formed by interchanging the syllables within the meaningful word while maintaining the phonotactic constraints of Kannada language. The assessment material was subjected to content validation and reliability measures before collecting the data on the study samples. The data were collected individually, and reading fluency was assessed for words correctly read per minute. Pseudoword decoding was scored for the accuracy of reading. Results: The descriptive statistics indicated that the mean pseudoword reading, reading comprehension, words accurately read per minute increased with the Grades. The performance of Grade III children found to be higher, Grade I lower and Grade II remained intermediate of Grade III and Grade I. The trend indicated that reading skills gradually improve with the Grades. Pearson’s correlation co-efficient showed moderate and highly significant (p=0.00) positive co-relation between the variables, indicating the interdependency of all the three components required for reading. The hierarchical regression analysis revealed 37% variance in reading comprehension was explained by pseudoword decoding and was highly significant. Subsequent entry of reading fluency measure, there was no significant change in R-square and was only change 3%. Therefore, pseudoword-decoding evolved as a single most significant predictor of reading comprehension during early Grades of reading acquisition. Conclusion: The present study concludes that the pseudoword decoding skills contribute significantly to reading comprehension than reading fluency during initial years of schooling in children learning to read Kannada language.

Keywords: alphasyllabary, pseudo-word decoding, reading comprehension, reading fluency

Procedia PDF Downloads 263
1019 Numerical Board Game for Low-Income Preschoolers

Authors: Gozde Inal Kiziltepe, Ozgun Uyanik

Abstract:

There is growing evidence that socioeconomic (SES)-related differences in mathematical knowledge primarily start in early childhood period. Preschoolers from low-income families are likely to perform substantially worse in mathematical knowledge than their counterparts from middle and higher income families. The differences are seen on a wide range of recognizing written numerals, counting, adding and subtracting, and comparing numerical magnitudes. Early differences in numerical knowledge have a permanent effect childrens’ mathematical knowledge in other grades. In this respect, analyzing the effect of number board game on the number knowledge of 48-60 month-old children from disadvantaged low-income families constitutes the main objective of the study. Participants were the 71 preschoolers from a childcare center which served low-income urban families. Children were randomly assigned to the number board condition or to the color board condition. The number board condition included 35 children and the color board game condition included 36 children. Both board games were 50 cm long and 30 cm high; had ‘The Great Race’ written across the top; and included 11 horizontally arranged, different colored squares of equal sizes with the leftmost square labeled ‘Start’. The numerical board had the numbers 1–10 in the rightmost 10 squares; the color board had different colors in those squares. A rabbit or a bear token were presented to children for selecting, and on each trial spun a spinner to determine whether the token would move one or two spaces. The number condition spinner had a ‘1’ half and a ‘2’ half; the color condition spinner had colors that matched the colors of the squares on the board. Children met one-on-one with an experimenter for four 15- to 20-min sessions within a 2-week period. In the first and fourth sessions, children were administered identical pretest and posttest measures of numerical knowledge. All children were presented three numerical tasks and one subtest presented in the following order: counting, numerical magnitude comparison, numerical identification and Count Objects – Circle Number Probe subtest of Early Numeracy Assessment. In addition, same numerical tasks and subtest were given as a follow-up test four weeks after the post-test administration. Findings obtained from the study; showed that there was a meaningful difference between scores of children who played a color board game in favor of children who played number board game.

Keywords: low income, numerical board game, numerical knowledge, preschool education

Procedia PDF Downloads 354
1018 Words Spotting in the Images Handwritten Historical Documents

Authors: Issam Ben Jami

Abstract:

Information retrieval in digital libraries is very important because most famous historical documents occupy a significant value. The word spotting in historical documents is a very difficult notion, because automatic recognition of such documents is naturally cursive, it represents a wide variability in the level scale and translation words in the same documents. We first present a system for the automatic recognition, based on the extraction of interest points words from the image model. The extraction phase of the key points is chosen from the representation of the image as a synthetic description of the shape recognition in a multidimensional space. As a result, we use advanced methods that can find and describe interesting points invariant to scale, rotation and lighting which are linked to local configurations of pixels. We test this approach on documents of the 15th century. Our experiments give important results.

Keywords: feature matching, historical documents, pattern recognition, word spotting

Procedia PDF Downloads 275
1017 Performance Analysis with the Combination of Visualization and Classification Technique for Medical Chatbot

Authors: Shajida M., Sakthiyadharshini N. P., Kamalesh S., Aswitha B.

Abstract:

Natural Language Processing (NLP) continues to play a strategic part in complaint discovery and medicine discovery during the current epidemic. This abstract provides an overview of performance analysis with a combination of visualization and classification techniques of NLP for a medical chatbot. Sentiment analysis is an important aspect of NLP that is used to determine the emotional tone behind a piece of text. This technique has been applied to various domains, including medical chatbots. In this, we have compared the combination of the decision tree with heatmap and Naïve Bayes with Word Cloud. The performance of the chatbot was evaluated using accuracy, and the results indicate that the combination of visualization and classification techniques significantly improves the chatbot's performance.

Keywords: sentimental analysis, NLP, medical chatbot, decision tree, heatmap, naïve bayes, word cloud

Procedia PDF Downloads 77
1016 Fat-Tail Test of Regulatory DNA Sequences

Authors: Jian-Jun Shu

Abstract:

The statistical properties of CRMs are explored by estimating similar-word set occurrence distribution. It is observed that CRMs tend to have a fat-tail distribution for similar-word set occurrence. Thus, the fat-tail test with two fatness coefficients is proposed to distinguish CRMs from non-CRMs, especially from exons. For the first fatness coefficient, the separation accuracy between CRMs and exons is increased as compared with the existing content-based CRM prediction method – fluffy-tail test. For the second fatness coefficient, the computing time is reduced as compared with fluffy-tail test, making it very suitable for long sequences and large data-base analysis in the post-genome time. Moreover, these indexes may be used to predict the CRMs which have not yet been observed experimentally. This can serve as a valuable filtering process for experiment.

Keywords: statistical approach, transcription factor binding sites, cis-regulatory modules, DNA sequences

Procedia PDF Downloads 293
1015 On the Interactive Search with Web Documents

Authors: Mario Kubek, Herwig Unger

Abstract:

Due to the large amount of information in the World Wide Web (WWW, web) and the lengthy and usually linearly ordered result lists of web search engines that do not indicate semantic relationships between their entries, the search for topically similar and related documents can become a tedious task. Especially, the process of formulating queries with proper terms representing specific information needs requires much effort from the user. This problem gets even bigger when the user's knowledge on a subject and its technical terms is not sufficient enough to do so. This article presents the new and interactive search application DocAnalyser that addresses this problem by enabling users to find similar and related web documents based on automatic query formulation and state-of-the-art search word extraction. Additionally, this tool can be used to track topics across semantically connected web documents

Keywords: DocAnalyser, interactive web search, search word extraction, query formulation, source topic detection, topic tracking

Procedia PDF Downloads 394
1014 Information Retrieval for Kafficho Language

Authors: Mareye Zeleke Mekonen

Abstract:

The Kafficho language has distinct issues in information retrieval because of its restricted resources and dearth of standardized methods. In this endeavor, with the cooperation and support of linguists and native speakers, we investigate the creation of information retrieval systems specifically designed for the Kafficho language. The Kafficho information retrieval system allows Kafficho speakers to access information easily in an efficient and effective way. Our objective is to conduct an information retrieval experiment using 220 Kafficho text files, including fifteen sample questions. Tokenization, normalization, stop word removal, stemming, and other data pre-processing chores, together with additional tasks like term weighting, were prerequisites for the vector space model to represent each page and a particular query. The three well-known measurement metrics we used for our word were Precision, Recall, and and F-measure, with values of 87%, 28%, and 35%, respectively. This demonstrates how well the Kaffiho information retrieval system performed well while utilizing the vector space paradigm.

Keywords: Kafficho, information retrieval, stemming, vector space

Procedia PDF Downloads 58
1013 IMPERTIO: An Efficient Communication Interface for Cerebral Palsy Patients

Authors: M. Zaïgouche, A. Kouvahe, F. Stefanelli

Abstract:

IMPERTIO is a high technology based project aiming at offering efficient assistance help in communication for persons affected by Cerebral Palsy. The systems currently available are hardly used by these patients who are not satisfied by ergonomics and response time. The project rests upon the concept that, opposite to usual master-slave communication giving power to the entity with larger range of possibilities, providing conversely the mastery to the entity with smaller range of possibilities will allow a better understanding ground for both parties. Entirely customizable, the application developed from this idea gives full freedom to the user. Through pictograms (one button linked to a word or a sentence) and adapted keyboard, noticeable improvements are brought to the response time and ease to use ergonomics.

Keywords: cerebral palsy, master-slave relation, communication interface, virtual keyboard, word construction algorithm

Procedia PDF Downloads 401
1012 Reading Comprehension in Profound Deaf Readers

Authors: S. Raghibdoust, E. Kamari

Abstract:

Research show that reduced functional hearing has a detrimental influence on the ability of an individual to establish proper phonological representations of words, since the phonological representations are claimed to mediate the conceptual processing of written words. Word processing efficiency is expected to decrease with a decrease in functional hearing. In other words, it is predicted that hearing individuals would be more capable of word processing than individuals with hearing loss, as their functional hearing works normally. Studies also demonstrate that the quality of the functional hearing affects reading comprehension via its effect on their word processing skills. In other words, better hearing facilitates the development of phonological knowledge, and can promote enhanced strategies for the recognition of written words, which in turn positively affect higher-order processes underlying reading comprehension. The aims of this study were to investigate and compare the effect of deafness on the participants’ abilities to process written words at the lexical and sentence levels through using two online and one offline reading comprehension tests. The performance of a group of 8 deaf male students (ages 8-12) was compared with that of a control group of normal hearing male students. All the participants had normal IQ and visual status, and came from an average socioeconomic background. None were diagnosed with a particular learning or motor disability. The language spoken in the homes of all participants was Persian. Two tests of word processing were developed and presented to the participants using OpenSesame software, in order to measure the speed and accuracy of their performance at the two perceptual and conceptual levels. In the third offline test of reading comprehension which comprised of semantically plausible and semantically implausible subject relative clauses, the participants had to select the correct answer out of two choices. The data derived from the statistical analysis using SPSS software indicated that hearing and deaf participants had a similar word processing performance both in terms of speed and accuracy of their responses. The results also showed that there was no significant difference between the performance of the deaf and hearing participants in comprehending semantically plausible sentences (p > 0/05). However, a significant difference between the performances of the two groups was observed with respect to their comprehension of semantically implausible sentences (p < 0/05). In sum, the findings revealed that the seriously impoverished sentence reading ability characterizing the profound deaf subjects of the present research, exhibited their reliance on reading strategies that are based on insufficient or deviant structural knowledge, in particular in processing semantically implausible sentences, rather than a failure to efficiently process written words at the lexical level. This conclusion, of course, does not mean to say that deaf individuals may never experience deficits at the word processing level, deficits that impede their understanding of written texts. However, as stated in previous researches, it sounds reasonable to assume that the more deaf individuals get familiar with written words, the better they can recognize them, despite having a profound phonological weakness.

Keywords: deafness, reading comprehension, reading strategy, word processing, subject and object relative sentences

Procedia PDF Downloads 339
1011 The Sinful Pig: Social Construction of Hogs through Corpus Analysis in Czech

Authors: Zdeněk Joukl

Abstract:

The word for pig in Czech (prase) seems to be one of the most negatively connotated words denoting animals. This paper represents an analysis of the largest Czech corpora, including a diachronic corpus. Besides corpus-analytical tools, sentiment analysis methods and tools such as LIWC and word clouds are used to better capture the usage of the words for pigs in Czech. The most frequent collocations across domains are identified and extracted with context to be used for sentiment analysis, which reveals an almost exclusive negative sentiment or culinary context. The animal is burdened with a disproportionately high number of meanings representing negatively viewed human characteristics or behaviors (dirtiness, fatness, sweating, inebriation, aggressive driving, greediness or chauvinism are among the most frequent ones). The diachronic view helps us understand how this extreme bias came to existence both through institutional construction and human-animal relations.

Keywords: corpus analysis, pig, sentiment, social construction

Procedia PDF Downloads 15
1010 Linguistic Insights Improve Semantic Technology in Medical Research and Patient Self-Management Contexts

Authors: William Michael Short

Abstract:

Semantic Web’ technologies such as the Unified Medical Language System Metathesaurus, SNOMED-CT, and MeSH have been touted as transformational for the way users access online medical and health information, enabling both the automated analysis of natural-language data and the integration of heterogeneous healthrelated resources distributed across the Internet through the use of standardized terminologies that capture concepts and relationships between concepts that are expressed differently across datasets. However, the approaches that have so far characterized ‘semantic bioinformatics’ have not yet fulfilled the promise of the Semantic Web for medical and health information retrieval applications. This paper argues within the perspective of cognitive linguistics and cognitive anthropology that four features of human meaning-making must be taken into account before the potential of semantic technologies can be realized for this domain. First, many semantic technologies operate exclusively at the level of the word. However, texts convey meanings in ways beyond lexical semantics. For example, transitivity patterns (distributions of active or passive voice) and modality patterns (configurations of modal constituents like may, might, could, would, should) convey experiential and epistemic meanings that are not captured by single words. Language users also naturally associate stretches of text with discrete meanings, so that whole sentences can be ascribed senses similar to the senses of words (so-called ‘discourse topics’). Second, natural language processing systems tend to operate according to the principle of ‘one token, one tag’. For instance, occurrences of the word sound must be disambiguated for part of speech: in context, is sound a noun or a verb or an adjective? In syntactic analysis, deterministic annotation methods may be acceptable. But because natural language utterances are typically characterized by polyvalency and ambiguities of all kinds (including intentional ambiguities), such methods leave the meanings of texts highly impoverished. Third, ontologies tend to be disconnected from everyday language use and so struggle in cases where single concepts are captured through complex lexicalizations that involve profile shifts or other embodied representations. More problematically, concept graphs tend to capture ‘expert’ technical models rather than ‘folk’ models of knowledge and so may not match users’ common-sense intuitions about the organization of concepts in prototypical structures rather than Aristotelian categories. Fourth, and finally, most ontologies do not recognize the pervasively figurative character of human language. However, since the time of Galen the widespread use of metaphor in the linguistic usage of both medical professionals and lay persons has been recognized. In particular, metaphor is a well-documented linguistic tool for communicating experiences of pain. Because semantic medical knowledge-bases are designed to help capture variations within technical vocabularies – rather than the kinds of conventionalized figurative semantics that practitioners as well as patients actually utilize in clinical description and diagnosis – they fail to capture this dimension of linguistic usage. The failure of semantic technologies in these respects degrades the efficiency and efficacy not only of medical research, where information retrieval inefficiencies can lead to direct financial costs to organizations, but also of care provision, especially in contexts of patients’ self-management of complex medical conditions.

Keywords: ambiguity, bioinformatics, language, meaning, metaphor, ontology, semantic web, semantics

Procedia PDF Downloads 133
1009 Application of Vector Representation for Revealing the Richness of Meaning of Facial Expressions

Authors: Carmel Sofer, Dan Vilenchik, Ron Dotsch, Galia Avidan

Abstract:

Studies investigating emotional facial expressions typically reveal consensus among observes regarding the meaning of basic expressions, whose number ranges between 6 to 15 emotional states. Given this limited number of discrete expressions, how is it that the human vocabulary of emotional states is so rich? The present study argues that perceivers use sequences of these discrete expressions as the basis for a much richer vocabulary of emotional states. Such mechanisms, in which a relatively small number of basic components is expanded to a much larger number of possible combinations of meanings, exist in other human communications modalities, such as spoken language and music. In these modalities, letters and notes, which serve as basic components of spoken language and music respectively, are temporally linked, resulting in the richness of expressions. In the current study, in each trial participants were presented with sequences of two images containing facial expression in different combinations sampled out of the eight static basic expressions (total 64; 8X8). In each trial, using single word participants were required to judge the 'state of mind' portrayed by the person whose face was presented. Utilizing word embedding methods (Global Vectors for Word Representation), employed in the field of Natural Language Processing, and relying on machine learning computational methods, it was found that the perceived meanings of the sequences of facial expressions were a weighted average of the single expressions comprising them, resulting in 22 new emotional states, in addition to the eight, classic basic expressions. An interaction between the first and the second expression in each sequence indicated that every single facial expression modulated the effect of the other facial expression thus leading to a different interpretation ascribed to the sequence as a whole. These findings suggest that the vocabulary of emotional states conveyed by facial expressions is not restricted to the (small) number of discrete facial expressions. Rather, the vocabulary is rich, as it results from combinations of these expressions. In addition, present research suggests that using word embedding in social perception studies, can be a powerful, accurate and efficient tool, to capture explicit and implicit perceptions and intentions. Acknowledgment: The study was supported by a grant from the Ministry of Defense in Israel to GA and CS. CS is also supported by the ABC initiative in Ben-Gurion University of the Negev.

Keywords: Glove, face perception, facial expression perception. , facial expression production, machine learning, word embedding, word2vec

Procedia PDF Downloads 176
1008 Text Localization in Fixed-Layout Documents Using Convolutional Networks in a Coarse-to-Fine Manner

Authors: Beier Zhu, Rui Zhang, Qi Song

Abstract:

Text contained within fixed-layout documents can be of great semantic value and so requires a high localization accuracy, such as ID cards, invoices, cheques, and passports. Recently, algorithms based on deep convolutional networks achieve high performance on text detection tasks. However, for text localization in fixed-layout documents, such algorithms detect word bounding boxes individually, which ignores the layout information. This paper presents a novel architecture built on convolutional neural networks (CNNs). A global text localization network and a regional bounding-box regression network are introduced to tackle the problem in a coarse-to-fine manner. The text localization network simultaneously locates word bounding points, which takes the layout information into account. The bounding-box regression network inputs the features pooled from arbitrarily sized RoIs and refine the localizations. These two networks share their convolutional features and are trained jointly. A typical type of fixed-layout documents: ID cards, is selected to evaluate the effectiveness of the proposed system. These networks are trained on data cropped from nature scene images, and synthetic data produced by a synthetic text generation engine. Experiments show that our approach locates high accuracy word bounding boxes and achieves state-of-the-art performance.

Keywords: bounding box regression, convolutional networks, fixed-layout documents, text localization

Procedia PDF Downloads 198
1007 A Grey-Box Text Attack Framework Using Explainable AI

Authors: Esther Chiramal, Kelvin Soh Boon Kai

Abstract:

Explainable AI is a strong strategy implemented to understand complex black-box model predictions in a human-interpretable language. It provides the evidence required to execute the use of trustworthy and reliable AI systems. On the other hand, however, it also opens the door to locating possible vulnerabilities in an AI model. Traditional adversarial text attack uses word substitution, data augmentation techniques, and gradient-based attacks on powerful pre-trained Bidirectional Encoder Representations from Transformers (BERT) variants to generate adversarial sentences. These attacks are generally white-box in nature and not practical as they can be easily detected by humans e.g., Changing the word from “Poor” to “Rich”. We proposed a simple yet effective Grey-box cum Black-box approach that does not require the knowledge of the model while using a set of surrogate Transformer/BERT models to perform the attack using Explainable AI techniques. As Transformers are the current state-of-the-art models for almost all Natural Language Processing (NLP) tasks, an attack generated from BERT1 is transferable to BERT2. This transferability is made possible due to the attention mechanism in the transformer that allows the model to capture long-range dependencies in a sequence. Using the power of BERT generalisation via attention, we attempt to exploit how transformers learn by attacking a few surrogate transformer variants which are all based on a different architecture. We demonstrate that this approach is highly effective to generate semantically good sentences by changing as little as one word that is not detectable by humans while still fooling other BERT models.

Keywords: BERT, explainable AI, Grey-box text attack, transformer

Procedia PDF Downloads 138
1006 The Power of Words: A Corpus Analysis of Campaign Speeches of President Donald J. Trump

Authors: Aiza Dalman

Abstract:

Words are powerful when these are used wisely and strategically. In this study, twelve (12) campaign speeches of President Donald J. Trump were analyzed as to frequently used words and ethos, pathos and logos being employed. The speeches were read thoroughly, analyzed and interpreted. With the use of Word Counter Tool and Text Analyzer software accessible online, it was found out that the word ‘will’ has the highest frequency of 121, followed by Hillary (58), American (38), going (35), plan and Clinton (32), illegal (30), government (28), corruption (26) and criminal (24). When the speeches were analyzed as to ethos, pathos and logos, on the other hand, it revealed that these were all employed in his speeches. The statements under these pointed out against Hillary or in his favor. The unique strategy of President Donald J. Trump as to frequently used words and ethos, pathos and logos in persuading people perhaps lead the way to his victory.

Keywords: campaign speeches, corpus analysis, ethos, logos and pathos, power of words

Procedia PDF Downloads 281
1005 Memory Retrieval and Implicit Prosody during Reading: Anaphora Resolution by L1 and L2 Speakers of English

Authors: Duong Thuy Nguyen, Giulia Bencini

Abstract:

The present study examined structural and prosodic factors on the computation of antecedent-reflexive relationships and sentence comprehension in native English (L1) and Vietnamese-English bilinguals (L2). Participants read sentences presented on the computer screen in one of three presentation formats aimed at manipulating prosodic parsing: word-by-word (RSVP), phrase-segment (self-paced), or whole-sentence (self-paced), then completed a grammaticality rating and a comprehension task (following Pratt & Fernandez, 2016). The design crossed three factors: syntactic structure (simple; complex), grammaticality (target-match; target-mismatch) and presentation format. An example item is provided in (1): (1) The actress that (Mary/John) interviewed at the awards ceremony (about two years ago/organized outside the theater) described (herself/himself) as an extreme workaholic). Results showed that overall, both L1 and L2 speakers made use of a good-enough processing strategy at the expense of more detailed syntactic analyses. L1 and L2 speakers’ comprehension and grammaticality judgements were negatively affected by the most prosodically disrupting condition (word-by-word). However, the two groups demonstrated differences in their performance in the other two reading conditions. For L1 speakers, the whole-sentence and the phrase-segment formats were both facilitative in the grammaticality rating and comprehension tasks; for L2, compared with the whole-sentence condition, the phrase-segment paradigm did not significantly improve accuracy or comprehension. These findings are consistent with the findings of Pratt & Fernandez (2016), who found a similar pattern of results in the processing of subject-verb agreement relations using the same experimental paradigm and prosodic manipulation with English L1 and L2 English-Spanish speakers. The results provide further support for a Good-Enough cue model of sentence processing that integrates cue-based retrieval and implicit prosodic parsing (Pratt & Fernandez, 2016) and highlights similarities and differences between L1 and L2 sentence processing and comprehension.

Keywords: anaphora resolution, bilingualism, implicit prosody, sentence processing

Procedia PDF Downloads 152
1004 Unsupervised Assistive and Adaptative Intelligent Agent in Smart Enviroment

Authors: Sebastião Pais, João Casal, Ricardo Ponciano, Sérgio Lorenço

Abstract:

The adaptation paradigm is a basic defining feature for pervasive computing systems. Adaptation systems must work efficiently in a smart environment while providing suitable information relevant to the user system interaction. The key objective is to deduce the information needed information changes. Therefore relying on fixed operational models would be inappropriate. This paper presents a study on developing an Intelligent Personal Assistant to assist the user in interacting with their Smart Environment. We propose an Unsupervised and Language-Independent Adaptation through Intelligent Speech Interface and a set of methods of Acquiring Knowledge, namely Semantic Similarity and Unsupervised Learning.

Keywords: intelligent personal assistants, intelligent speech interface, unsupervised learning, language-independent, knowledge acquisition, association measures, symmetric word similarities, attributional word similarities

Procedia PDF Downloads 564
1003 Unsupervised Assistive and Adaptive Intelligent Agent in Smart Environment

Authors: Sebastião Pais, João Casal, Ricardo Ponciano, Sérgio Lourenço

Abstract:

The adaptation paradigm is a basic defining feature for pervasive computing systems. Adaptation systems must work efficiently in smart environment while providing suitable information relevant to the user system interaction. The key objective is to deduce the information needed information changes. Therefore, relying on fixed operational models would be inappropriate. This paper presents a study on developing a Intelligent Personal Assistant to assist the user in interacting with their Smart Environment. We propose a Unsupervised and Language-Independent Adaptation through Intelligent Speech Interface and a set of methods of Acquiring Knowledge, namely Semantic Similarity and Unsupervised Learning.

Keywords: intelligent personal assistants, intelligent speech interface, unsupervised learning, language-independent, knowledge acquisition, association measures, symmetric word similarities, attributional word similarities

Procedia PDF Downloads 645