Search results for: polysemantic word
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 757

Search results for: polysemantic word

667 Information Disclosure And Financial Sentiment Index Using a Machine Learning Approach

Authors: Alev Atak

Abstract:

In this paper, we aim to create a financial sentiment index by investigating the company’s voluntary information disclosures. We retrieve structured content from BIST 100 companies’ financial reports for the period 1998-2018 and extract relevant financial information for sentiment analysis through Natural Language Processing. We measure strategy-related disclosures and their cross-sectional variation and classify report content into generic sections using synonym lists divided into four main categories according to their liquidity risk profile, risk positions, intra-annual information, and exposure to risk. We use Word Error Rate and Cosin Similarity for comparing and measuring text similarity and derivation in sets of texts. In addition to performing text extraction, we will provide a range of text analysis options, such as the readability metrics, word counts using pre-determined lists (e.g., forward-looking, uncertainty, tone, etc.), and comparison with reference corpus (word, parts of speech and semantic level). Therefore, we create an adequate analytical tool and a financial dictionary to depict the importance of granular financial disclosure for investors to identify correctly the risk-taking behavior and hence make the aggregated effects traceable.

Keywords: financial sentiment, machine learning, information disclosure, risk

Procedia PDF Downloads 63
666 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded presentations of natural language that in text form. However, it is unknown how much linguistic information is encoded and how. In this paper, we construct several corresponding probing tasks for multiple linguistic information to clarify the encoding capabilities of different language models and performed a visual display. We firstly obtain word presentations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small scale of parameters and unsupervised tasks are then applied on these word vectors to discriminate their capability to encode corresponding linguistic information. The constructed probe tasks contain both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the grammatical aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.

Keywords: language models, probing task, text presentation, linguistic information

Procedia PDF Downloads 54
665 Contextual Senses of Ambiguous Words Based on Cognitive Semantics

Authors: Madhavi

Abstract:

All linguistic units are context-dependent. They occur in particular settings, from which they derive much of their import, and are recognized by speakers as distinct entities only through a process of abstraction. Most of the words have several concepts associated with them and convey a number of meanings in different contexts in any language. For instance, there are different uses of the word good as an adjective from English. The adjective good expresses many senses like (1) ‘high quality of someone or something’ (2) ‘efficient’ (3) ‘virtuous’ (4) ‘reliable’ etc. These senses will be analyzed by using cognitive semantics framework. The context has the power to insulate one meaning from all the other meanings in communication. This paper will provide a cognitive semantic analysis. The basic tenet of cognitive semantics is the sense of a word is the way we conceptualize it. Our conceptualization is based on the physical experience we go through. Cognitive semantics tries to capture this conceptualization in terms of some categories like schema, frame, and domain. Cognitive semantics is a subfield of cognitive linguistics. Cognitive linguistics studies the language creation, learning, and usage by the reference to human cognition. The semantic structure is conceptual structure which is related to the concepts which are the elements of reason and constitute the meanings of words and linguistic expressions. Cognitive semantics studies how our mind works for the meaning of any word and how it perceives meaning from the environment through senses and works to map with the knowledge which already exists in our mind through experience. In the present paper, the senses are further classified into some categories.

Keywords: cognitive, contexts, semantics, senses

Procedia PDF Downloads 178
664 The Facilitatory Effect of Phonological Priming on Visual Word Recognition in Arabic as a Function of Lexicality and Overlap Positions

Authors: Ali Al Moussaoui

Abstract:

An experiment was designed to assess the performance of 24 Lebanese adults (mean age 29:5 years) in a lexical decision making (LDM) task to find out how the facilitatory effect of phonological priming (PP) affects the speed of visual word recognition in Arabic as lexicality (wordhood) and phonological overlap positions (POP) vary. The experiment falls in line with previous research on phonological priming in the light of the cohort theory and in relation to visual word recognition. The experiment also departs from the research on the Arabic language in which the importance of the consonantal root as a distinct morphological unit is confirmed. Based on previous research, it is hypothesized that (1) PP has a facilitating effect in LDM with words but not with nonwords and (2) final phonological overlap between the prime and the target is more facilitatory than initial overlap. An LDM task was programmed on PsychoPy application. Participants had to decide if a target (e.g., bayn ‘between’) preceded by a prime (e.g., bayt ‘house’) is a word or not. There were 4 conditions: no PP (NP), nonwords priming nonwords (NN), nonwords priming words (NW), and words priming words (WW). The conditions were simultaneously controlled for word length, wordhood, and POP. The interstimulus interval was 700 ms. Within the PP conditions, POP was controlled for in which there were 3 overlap positions between the primes and the targets: initial (e.g., asad ‘lion’ and asaf ‘sorrow’), final (e.g., kattab ‘cause to write’ 2sg-mas and rattab ‘organize’ 2sg-mas), or two-segmented (e.g., namle ‘ant’ and naħle ‘bee’). There were 96 trials, 24 in each condition, using a within-subject design. The results show that concerning (1), the highest average reaction time (RT) is that in NN, followed firstly by NW and finally by WW. There is statistical significance only between the pairs NN-NW and NN-WW. Regarding (2), the shortest RT is that in the two-segmented overlap condition, followed by the final POP in the first place and the initial POP in the last place. The difference between the two-segmented and the initial overlap is significant, while other pairwise comparisons are not. Based on these results, PP emerges as a facilitatory phenomenon that is highly sensitive to lexicality and POP. While PP can have a facilitating effect under lexicality, it shows no facilitation in its absence, which intersects with several previous findings. Participants are found to be more sensitive to the final phonological overlap than the initial overlap, which also coincides with a body of earlier literature. The results contradict the cohort theory’s stress on the onset overlap position and, instead, give more weight to final overlap, and even heavier weight to the two-segmented one. In conclusion, this study confirms the facilitating effect of PP with words but not when stimuli (at least the primes and at most both the primes and targets) are nonwords. It also shows that the two-segmented priming is the most influential in LDM in Arabic.

Keywords: lexicality, phonological overlap positions, phonological priming, visual word recognition

Procedia PDF Downloads 142
663 The Conflict of Grammaticality and Meaningfulness of the Corrupt Words: A Cross-lingual Sociolinguistic Study

Authors: Jayashree Aanand, Gajjam

Abstract:

The grammatical tradition in Sanskrit literature emphasizes the importance of the correct use of Sanskrit words or linguistic units (sādhu śabda) that brings the meritorious values, denying the attribution of the same religious merit to the incorrect use of Sanskrit words (asādhu śabda) or the vernacular or corrupt forms (apa-śabda or apabhraṁśa), even though they may help in communication. The current research, the culmination of the doctoral research on sentence definition, studies the difference among the comprehension of both correct and incorrect word forms in Sanskrit and Marathi languages in India. Based on the total of 19 experiments (both web-based and classroom-controlled) on approximately 900 Indian readers, it is found that while the incorrect forms in Sanskrit are comprehended with lesser accuracy than the correct word forms, no such difference can be seen for the Marathi language. It is interpreted that the incorrect word forms in the native language or in the language which is spoken daily (such as Marathi) will pose a lesser cognitive load as compared to the language that is not spoken on a daily basis but only used for reading (such as Sanskrit). The theoretical base for the research problem is as follows: among the three main schools of Language Science in ancient India, the Vaiyākaraṇas (Grammarians) hold that the corrupt word forms do have their own expressive power since they convey meaning, while as the Mimāṁsakas (the Exegesists) and the Naiyāyikas (the Logicians) believe that the corrupt forms can only convey the meaning indirectly, by recalling their association and similarity with the correct forms. The grammarians argue that the vernaculars that are born of the speaker’s inability to speak proper Sanskrit are regarded as degenerate versions or fallen forms of the ‘divine’ Sanskrit language and speakers who could not use proper Sanskrit or the standard language were considered as Śiṣṭa (‘elite’). The different ideas of different schools strictly adhere to their textual dispositions. For the last few years, sociolinguists have agreed that no variety of language is inherently better than any other; they are all the same as long as they serve the need of people that use them. Although the standard form of a language may offer the speakers some advantages, the non-standard variety is considered the most natural style of speaking. This is visible in the results. If the incorrect word forms incur the recall of the correct word forms in the reader as the theory suggests, it would have added one extra step in the process of sentential cognition leading to more cognitive load and less accuracy. This has not been the case for the Marathi language. Although speaking and listening to the vernaculars is the common practice and reading the vernacular is not, Marathi readers have readily and accurately comprehended the incorrect word forms in the sentences, as against the Sanskrit readers. The primary reason being Sanskrit is spoken and also read in the standard form only and the vernacular forms in Sanskrit are not found in the conversational data.

Keywords: experimental sociolinguistics, grammaticality and meaningfulness, Marathi, Sanskrit

Procedia PDF Downloads 88
662 Morphological Rules of Bangla Repetition Words for UNL Based Machine Translation

Authors: Nawab Yousuf Ali, S. Golam, A. Ameer, Ashok Toru Roy

Abstract:

This paper develops new morphological rules suitable for Bangla repetition words to be incorporated into an inter lingua representation called Universal Networking Language (UNL). The proposed rules are to be used to combine verb roots and their inflexions to produce words which are then combined with other similar types of words to generate repetition words. This paper outlines the format of morphological rules for different types of repetition words that come from verb roots based on the framework of UNL provided by the UNL centre of the Universal Networking Digital Language (UNDL) foundation.

Keywords: Universal Networking Language (UNL), universal word (UW), head word (HW), Bangla-UNL Dictionary, morphological rule, enconverter (EnCo)

Procedia PDF Downloads 274
661 The Development of Chinese-English Homophonic Word Pairs Databases for English Teaching and Learning

Authors: Yuh-Jen Wu, Chun-Min Lin

Abstract:

Homophonic words are common in Mandarin Chinese which belongs to the tonal language family. Using homophonic cues to study foreign languages is one of the learning techniques of mnemonics that can aid the retention and retrieval of information in the human memory. When learning difficult foreign words, some learners transpose them with words in a language they are familiar with to build an association and strengthen working memory. These phonological clues are beneficial means for novice language learners. In the classroom, if mnemonic skills are used at the appropriate time in the instructional sequence, it may achieve their maximum effectiveness. For Chinese-speaking students, proper use of Chinese-English homophonic word pairs may help them learn difficult vocabulary. In this study, a database program is developed by employing Visual Basic. The database contains two corpora, one with Chinese lexical items and the other with English ones. The Chinese corpus contains 59,053 Chinese words that were collected by a web crawler. The pronunciations of this group of words are compared with words in an English corpus based on WordNet, a lexical database for the English language. Words in both databases with similar pronunciation chunks and batches are detected. A total of approximately 1,000 Chinese lexical items are located in the preliminary comparison. These homophonic word pairs can serve as a valuable tool to assist Chinese-speaking students in learning and memorizing new English vocabulary.

Keywords: Chinese, corpus, English, homophonic words, vocabulary

Procedia PDF Downloads 137
660 Morpheme Based Parts of Speech Tagger for Kannada Language

Authors: M. C. Padma, R. J. Prathibha

Abstract:

Parts of speech tagging is the process of assigning appropriate parts of speech tags to the words in a given text. The critical or crucial information needed for tagging a word come from its internal structure rather from its neighboring words. The internal structure of a word comprises of its morphological features and grammatical information. This paper presents a morpheme based parts of speech tagger for Kannada language. This proposed work uses hierarchical tag set for assigning tags. The system is tested on some Kannada words taken from EMILLE corpus. Experimental result shows that the performance of the proposed system is above 90%.

Keywords: hierarchical tag set, morphological analyzer, natural language processing, paradigms, parts of speech

Procedia PDF Downloads 254
659 Polish Catholic Discourse on Gender Equality in the Face of Social and Cultural Changes in Poland

Authors: Anna Jagielska

Abstract:

Five years ago, the word ‘gender’ was discussed in Poland exclusively in academic contexts. One year later, it was chosen as the word of the year and omnipresent in the Polish media. The rapid career of this word is due to the involvement of the Polish church hierarchy who strategically brought this term into relation with abortion, pornography and paedophilia. ‘Gender’ is more than a political slogan. It is a symbol of social anxiety and moral panic in Poland which need to be historically considered. The aim of this paper is to present selected rhetorical strategies used by the Polish Catholic clergy who strive to have an impact on the current gender discourse in Poland. In particular, the gender debate, culminated in the pastoral letter of the Bishops' Conference of Poland, will be discussed. The church’s protest against the Council of Europe’s Convention on Preventing and Combating Violence against Women and Domestic Violence will be analyzed and the recent heated debates in Poland on contraception, abortion, in vitro fertilization, and sex education will be mentioned. To provide explanations on the specificity of Polish gender debates the role of the Catholic Church in the fall of communism in Poland as well as the charismatisation of Polish society by Pope John Paul II will be explained. The social constructions of communism and feminism which are manifested in both written and symbolic contracts on gender equality between the Church and the State will be demonstrated. At the end of the paper, theories about the changing role of religion in society will be applied.

Keywords: gender, Poland, religion, catholicism, feminism

Procedia PDF Downloads 254
658 Bilingual Gaming Kit to Teach English Language through Collaborative Learning

Authors: Sarayu Agarwal

Abstract:

This paper aims to teach English (secondary language) by bridging the understanding between the Regional language (primary language) and the English Language (secondary language). Here primary language is the one a person has learned from birth or within the critical period, while secondary language would be any other language one learns or speaks. The paper also focuses on evolving old teaching methods to a contemporary participatory model of learning and teaching. Pilot studies were conducted to gauge an understanding of student’s knowledge of the English language. Teachers and students were interviewed and their academic curriculum was assessed as a part of the initial study. Extensive literature study and design thinking principles were used to devise a solution to the problem. The objective is met using a holistic learning kit/card game to teach children word recognition, word pronunciation, word spelling and writing words. Implication of the paper is a noticeable improvement in the understanding and grasping of English language. With increasing usage and applicability of English as a second language (ESL) world over, the paper becomes relevant due to its easy replicability to any other primary or secondary language. Future scope of this paper would be transforming the idea of participatory learning into self-regulated learning methods. With the upcoming govt. learning centres in rural areas and provision of smart devices such as tablets, the development of the card games into digital applications seems very feasible.

Keywords: English as a second language, vocabulary-building card games, learning through gamification, rural education

Procedia PDF Downloads 213
657 Comparing the Contribution of General Vocabulary Knowledge and Academic Vocabulary Knowledge to Learners' Academic Achievement

Authors: Reem Alsager, James Milton

Abstract:

Coxhead’s (2000) Academic Word List (AWL) believed to be essential for students pursuing higher education and helps differentiate English for Academic Purposes (EAP) from General English as a course of study, and it is thought to be important for comprehending English academic texts. It has been described that AWL is an infrequent, discrete set of vocabulary items unreachable from general language. On the other hand, it has been known for a period of time that general vocabulary knowledge is a good predictor of academic achievement. This study, however, is an attempt to measure and compare the contribution of academic knowledge and general vocabulary knowledge to learners’ GPA and examine what knowledge is a better predictor of academic achievement and investigate whether AWL as a specialised list of infrequent words relates to the frequency effect. The participants were comprised of 44 international postgraduate students in Swansea University, all from the School of Management, following the taught MSc (Master of Science). The study employed the Academic Vocabulary Size Test (AVST) and the XK_Lex vocabulary size test. The findings indicate that AWL is a list based on word frequency rather than a discrete and unique word list and that the AWL performs the same function as general vocabulary, with tests of each found to measure largely the same quality of knowledge. The findings also suggest that the contribution that AWL knowledge provides for academic success is not sufficient and that general vocabulary knowledge is better in predicting academic achievement. Furthermore, the contribution that academic knowledge added above the contribution of general vocabulary knowledge when combined is really small and noteworthy. This study’s results are in line with the argument and suggest that it is the development of general vocabulary size is an essential quality for academic success and acquiring the words of the AWL will form part of this process. The AWL by itself does not provide sufficient coverage, and is probably not specialised enough, for knowledge of this list to influence this general process. It can be concluded that AWL as an academic word list epitomizes only a fraction of words that are actually needed for academic success in English and that knowledge of academic vocabulary combined with general vocabulary knowledge above the most frequent 3000 words is what matters most to ultimate academic success.

Keywords: academic achievement, academic vocabulary, general vocabulary, vocabulary size

Procedia PDF Downloads 185
656 The Image of Cultural Tourism in the Tourists’ Point of View

Authors: Wanida Suwunniponth

Abstract:

The purposes of this research were to investigate the perceived of a cultural image and loyalty of tourists toward the attraction at Banglumphu neighborhood in Bangkok and to study the relationship of the cultural image of Banglumphu community and loyalty to visit this area of the tourists. This study employed both quantitative approach and qualitative approach. In a quantitative research, a questionnaire was used to collect data from 300 systematic sampled tourists who visited Banglumphu area and the correlation analysis were used to analyze data. The results revealed that the overall tourists’ point of view toward Banglumphu cultural image was at a good level which lifestyle had the best image, followed by value and belief, physical dimension, community identity, tradition, and local wisdom. In addition, the overall aspect of tourists’ loyalty including satisfaction, word of mouths, and revisiting were at good levels which word of mouths received the highest value, followed by revisiting, and satisfaction, respectively. In addition, the relationship between cultural image in aspect on lifestyle, tradition, local wisdom, belief, community identity and loyalty to visit Banglumphu in each aspect on satisfaction, word of mouths, and revisiting were moderately correlated at the significant level of 0.05, except physical dimension was not correlated with each aspect of tourists’ loyalty.

Keywords: cultural tourism, image, loyalty, revisit

Procedia PDF Downloads 224
655 Contribution of Word Decoding and Reading Fluency on Reading Comprehension in Young Typical Readers of Kannada Language

Authors: Vangmayee V. Subban, Suzan Deelan. Pinto, Somashekara Haralakatta Shivananjappa, Shwetha Prabhu, Jayashree S. Bhat

Abstract:

Introduction and Need: During early years of schooling, the instruction in the schools mainly focus on children’s word decoding abilities. However, the skilled readers should master all the components of reading such as word decoding, reading fluency and comprehension. Nevertheless, the relationship between each component during the process of learning to read is less clear. The studies conducted in alphabetical languages have mixed opinion on relative contribution of word decoding and reading fluency on reading comprehension. However, the scenarios in alphasyllabary languages are unexplored. Aim and Objectives: The aim of the study was to explore the role of word decoding, reading fluency on reading comprehension abilities in children learning to read Kannada between the age ranges of 5.6 to 8.6 years. Method: In this cross sectional study, a total of 60 typically developing children, 20 each from Grade I, Grade II, Grade III maintaining equal gender ratio between the age range of 5.6 to 6.6 years, 6.7 to 7.6 years and 7.7 to 8.6 years respectively were selected from Kannada medium schools. The reading fluency and reading comprehension abilities of the children were assessed using Grade level passages selected from the Kannada text book of children core curriculum. All the passages consist of five questions to assess reading comprehension. The pseudoword decoding skills were assessed using 40 pseudowords with varying syllable length and their Akshara composition. Pseudowords are formed by interchanging the syllables within the meaningful word while maintaining the phonotactic constraints of Kannada language. The assessment material was subjected to content validation and reliability measures before collecting the data on the study samples. The data were collected individually, and reading fluency was assessed for words correctly read per minute. Pseudoword decoding was scored for the accuracy of reading. Results: The descriptive statistics indicated that the mean pseudoword reading, reading comprehension, words accurately read per minute increased with the Grades. The performance of Grade III children found to be higher, Grade I lower and Grade II remained intermediate of Grade III and Grade I. The trend indicated that reading skills gradually improve with the Grades. Pearson’s correlation co-efficient showed moderate and highly significant (p=0.00) positive co-relation between the variables, indicating the interdependency of all the three components required for reading. The hierarchical regression analysis revealed 37% variance in reading comprehension was explained by pseudoword decoding and was highly significant. Subsequent entry of reading fluency measure, there was no significant change in R-square and was only change 3%. Therefore, pseudoword-decoding evolved as a single most significant predictor of reading comprehension during early Grades of reading acquisition. Conclusion: The present study concludes that the pseudoword decoding skills contribute significantly to reading comprehension than reading fluency during initial years of schooling in children learning to read Kannada language.

Keywords: alphasyllabary, pseudo-word decoding, reading comprehension, reading fluency

Procedia PDF Downloads 219
654 Words Spotting in the Images Handwritten Historical Documents

Authors: Issam Ben Jami

Abstract:

Information retrieval in digital libraries is very important because most famous historical documents occupy a significant value. The word spotting in historical documents is a very difficult notion, because automatic recognition of such documents is naturally cursive, it represents a wide variability in the level scale and translation words in the same documents. We first present a system for the automatic recognition, based on the extraction of interest points words from the image model. The extraction phase of the key points is chosen from the representation of the image as a synthetic description of the shape recognition in a multidimensional space. As a result, we use advanced methods that can find and describe interesting points invariant to scale, rotation and lighting which are linked to local configurations of pixels. We test this approach on documents of the 15th century. Our experiments give important results.

Keywords: feature matching, historical documents, pattern recognition, word spotting

Procedia PDF Downloads 232
653 Performance Analysis with the Combination of Visualization and Classification Technique for Medical Chatbot

Authors: Shajida M., Sakthiyadharshini N. P., Kamalesh S., Aswitha B.

Abstract:

Natural Language Processing (NLP) continues to play a strategic part in complaint discovery and medicine discovery during the current epidemic. This abstract provides an overview of performance analysis with a combination of visualization and classification techniques of NLP for a medical chatbot. Sentiment analysis is an important aspect of NLP that is used to determine the emotional tone behind a piece of text. This technique has been applied to various domains, including medical chatbots. In this, we have compared the combination of the decision tree with heatmap and Naïve Bayes with Word Cloud. The performance of the chatbot was evaluated using accuracy, and the results indicate that the combination of visualization and classification techniques significantly improves the chatbot's performance.

Keywords: sentimental analysis, NLP, medical chatbot, decision tree, heatmap, naïve bayes, word cloud

Procedia PDF Downloads 36
652 Fat-Tail Test of Regulatory DNA Sequences

Authors: Jian-Jun Shu

Abstract:

The statistical properties of CRMs are explored by estimating similar-word set occurrence distribution. It is observed that CRMs tend to have a fat-tail distribution for similar-word set occurrence. Thus, the fat-tail test with two fatness coefficients is proposed to distinguish CRMs from non-CRMs, especially from exons. For the first fatness coefficient, the separation accuracy between CRMs and exons is increased as compared with the existing content-based CRM prediction method – fluffy-tail test. For the second fatness coefficient, the computing time is reduced as compared with fluffy-tail test, making it very suitable for long sequences and large data-base analysis in the post-genome time. Moreover, these indexes may be used to predict the CRMs which have not yet been observed experimentally. This can serve as a valuable filtering process for experiment.

Keywords: statistical approach, transcription factor binding sites, cis-regulatory modules, DNA sequences

Procedia PDF Downloads 253
651 On the Interactive Search with Web Documents

Authors: Mario Kubek, Herwig Unger

Abstract:

Due to the large amount of information in the World Wide Web (WWW, web) and the lengthy and usually linearly ordered result lists of web search engines that do not indicate semantic relationships between their entries, the search for topically similar and related documents can become a tedious task. Especially, the process of formulating queries with proper terms representing specific information needs requires much effort from the user. This problem gets even bigger when the user's knowledge on a subject and its technical terms is not sufficient enough to do so. This article presents the new and interactive search application DocAnalyser that addresses this problem by enabling users to find similar and related web documents based on automatic query formulation and state-of-the-art search word extraction. Additionally, this tool can be used to track topics across semantically connected web documents

Keywords: DocAnalyser, interactive web search, search word extraction, query formulation, source topic detection, topic tracking

Procedia PDF Downloads 359
650 Information Retrieval for Kafficho Language

Authors: Mareye Zeleke Mekonen

Abstract:

The Kafficho language has distinct issues in information retrieval because of its restricted resources and dearth of standardized methods. In this endeavor, with the cooperation and support of linguists and native speakers, we investigate the creation of information retrieval systems specifically designed for the Kafficho language. The Kafficho information retrieval system allows Kafficho speakers to access information easily in an efficient and effective way. Our objective is to conduct an information retrieval experiment using 220 Kafficho text files, including fifteen sample questions. Tokenization, normalization, stop word removal, stemming, and other data pre-processing chores, together with additional tasks like term weighting, were prerequisites for the vector space model to represent each page and a particular query. The three well-known measurement metrics we used for our word were Precision, Recall, and and F-measure, with values of 87%, 28%, and 35%, respectively. This demonstrates how well the Kaffiho information retrieval system performed well while utilizing the vector space paradigm.

Keywords: Kafficho, information retrieval, stemming, vector space

Procedia PDF Downloads 2
649 IMPERTIO: An Efficient Communication Interface for Cerebral Palsy Patients

Authors: M. Zaïgouche, A. Kouvahe, F. Stefanelli

Abstract:

IMPERTIO is a high technology based project aiming at offering efficient assistance help in communication for persons affected by Cerebral Palsy. The systems currently available are hardly used by these patients who are not satisfied by ergonomics and response time. The project rests upon the concept that, opposite to usual master-slave communication giving power to the entity with larger range of possibilities, providing conversely the mastery to the entity with smaller range of possibilities will allow a better understanding ground for both parties. Entirely customizable, the application developed from this idea gives full freedom to the user. Through pictograms (one button linked to a word or a sentence) and adapted keyboard, noticeable improvements are brought to the response time and ease to use ergonomics.

Keywords: cerebral palsy, master-slave relation, communication interface, virtual keyboard, word construction algorithm

Procedia PDF Downloads 362
648 Reading Comprehension in Profound Deaf Readers

Authors: S. Raghibdoust, E. Kamari

Abstract:

Research show that reduced functional hearing has a detrimental influence on the ability of an individual to establish proper phonological representations of words, since the phonological representations are claimed to mediate the conceptual processing of written words. Word processing efficiency is expected to decrease with a decrease in functional hearing. In other words, it is predicted that hearing individuals would be more capable of word processing than individuals with hearing loss, as their functional hearing works normally. Studies also demonstrate that the quality of the functional hearing affects reading comprehension via its effect on their word processing skills. In other words, better hearing facilitates the development of phonological knowledge, and can promote enhanced strategies for the recognition of written words, which in turn positively affect higher-order processes underlying reading comprehension. The aims of this study were to investigate and compare the effect of deafness on the participants’ abilities to process written words at the lexical and sentence levels through using two online and one offline reading comprehension tests. The performance of a group of 8 deaf male students (ages 8-12) was compared with that of a control group of normal hearing male students. All the participants had normal IQ and visual status, and came from an average socioeconomic background. None were diagnosed with a particular learning or motor disability. The language spoken in the homes of all participants was Persian. Two tests of word processing were developed and presented to the participants using OpenSesame software, in order to measure the speed and accuracy of their performance at the two perceptual and conceptual levels. In the third offline test of reading comprehension which comprised of semantically plausible and semantically implausible subject relative clauses, the participants had to select the correct answer out of two choices. The data derived from the statistical analysis using SPSS software indicated that hearing and deaf participants had a similar word processing performance both in terms of speed and accuracy of their responses. The results also showed that there was no significant difference between the performance of the deaf and hearing participants in comprehending semantically plausible sentences (p > 0/05). However, a significant difference between the performances of the two groups was observed with respect to their comprehension of semantically implausible sentences (p < 0/05). In sum, the findings revealed that the seriously impoverished sentence reading ability characterizing the profound deaf subjects of the present research, exhibited their reliance on reading strategies that are based on insufficient or deviant structural knowledge, in particular in processing semantically implausible sentences, rather than a failure to efficiently process written words at the lexical level. This conclusion, of course, does not mean to say that deaf individuals may never experience deficits at the word processing level, deficits that impede their understanding of written texts. However, as stated in previous researches, it sounds reasonable to assume that the more deaf individuals get familiar with written words, the better they can recognize them, despite having a profound phonological weakness.

Keywords: deafness, reading comprehension, reading strategy, word processing, subject and object relative sentences

Procedia PDF Downloads 295
647 Application of Vector Representation for Revealing the Richness of Meaning of Facial Expressions

Authors: Carmel Sofer, Dan Vilenchik, Ron Dotsch, Galia Avidan

Abstract:

Studies investigating emotional facial expressions typically reveal consensus among observes regarding the meaning of basic expressions, whose number ranges between 6 to 15 emotional states. Given this limited number of discrete expressions, how is it that the human vocabulary of emotional states is so rich? The present study argues that perceivers use sequences of these discrete expressions as the basis for a much richer vocabulary of emotional states. Such mechanisms, in which a relatively small number of basic components is expanded to a much larger number of possible combinations of meanings, exist in other human communications modalities, such as spoken language and music. In these modalities, letters and notes, which serve as basic components of spoken language and music respectively, are temporally linked, resulting in the richness of expressions. In the current study, in each trial participants were presented with sequences of two images containing facial expression in different combinations sampled out of the eight static basic expressions (total 64; 8X8). In each trial, using single word participants were required to judge the 'state of mind' portrayed by the person whose face was presented. Utilizing word embedding methods (Global Vectors for Word Representation), employed in the field of Natural Language Processing, and relying on machine learning computational methods, it was found that the perceived meanings of the sequences of facial expressions were a weighted average of the single expressions comprising them, resulting in 22 new emotional states, in addition to the eight, classic basic expressions. An interaction between the first and the second expression in each sequence indicated that every single facial expression modulated the effect of the other facial expression thus leading to a different interpretation ascribed to the sequence as a whole. These findings suggest that the vocabulary of emotional states conveyed by facial expressions is not restricted to the (small) number of discrete facial expressions. Rather, the vocabulary is rich, as it results from combinations of these expressions. In addition, present research suggests that using word embedding in social perception studies, can be a powerful, accurate and efficient tool, to capture explicit and implicit perceptions and intentions. Acknowledgment: The study was supported by a grant from the Ministry of Defense in Israel to GA and CS. CS is also supported by the ABC initiative in Ben-Gurion University of the Negev.

Keywords: Glove, face perception, facial expression perception. , facial expression production, machine learning, word embedding, word2vec

Procedia PDF Downloads 144
646 Text Localization in Fixed-Layout Documents Using Convolutional Networks in a Coarse-to-Fine Manner

Authors: Beier Zhu, Rui Zhang, Qi Song

Abstract:

Text contained within fixed-layout documents can be of great semantic value and so requires a high localization accuracy, such as ID cards, invoices, cheques, and passports. Recently, algorithms based on deep convolutional networks achieve high performance on text detection tasks. However, for text localization in fixed-layout documents, such algorithms detect word bounding boxes individually, which ignores the layout information. This paper presents a novel architecture built on convolutional neural networks (CNNs). A global text localization network and a regional bounding-box regression network are introduced to tackle the problem in a coarse-to-fine manner. The text localization network simultaneously locates word bounding points, which takes the layout information into account. The bounding-box regression network inputs the features pooled from arbitrarily sized RoIs and refine the localizations. These two networks share their convolutional features and are trained jointly. A typical type of fixed-layout documents: ID cards, is selected to evaluate the effectiveness of the proposed system. These networks are trained on data cropped from nature scene images, and synthetic data produced by a synthetic text generation engine. Experiments show that our approach locates high accuracy word bounding boxes and achieves state-of-the-art performance.

Keywords: bounding box regression, convolutional networks, fixed-layout documents, text localization

Procedia PDF Downloads 154
645 A Grey-Box Text Attack Framework Using Explainable AI

Authors: Esther Chiramal, Kelvin Soh Boon Kai

Abstract:

Explainable AI is a strong strategy implemented to understand complex black-box model predictions in a human-interpretable language. It provides the evidence required to execute the use of trustworthy and reliable AI systems. On the other hand, however, it also opens the door to locating possible vulnerabilities in an AI model. Traditional adversarial text attack uses word substitution, data augmentation techniques, and gradient-based attacks on powerful pre-trained Bidirectional Encoder Representations from Transformers (BERT) variants to generate adversarial sentences. These attacks are generally white-box in nature and not practical as they can be easily detected by humans e.g., Changing the word from “Poor” to “Rich”. We proposed a simple yet effective Grey-box cum Black-box approach that does not require the knowledge of the model while using a set of surrogate Transformer/BERT models to perform the attack using Explainable AI techniques. As Transformers are the current state-of-the-art models for almost all Natural Language Processing (NLP) tasks, an attack generated from BERT1 is transferable to BERT2. This transferability is made possible due to the attention mechanism in the transformer that allows the model to capture long-range dependencies in a sequence. Using the power of BERT generalisation via attention, we attempt to exploit how transformers learn by attacking a few surrogate transformer variants which are all based on a different architecture. We demonstrate that this approach is highly effective to generate semantically good sentences by changing as little as one word that is not detectable by humans while still fooling other BERT models.

Keywords: BERT, explainable AI, Grey-box text attack, transformer

Procedia PDF Downloads 99
644 The Power of Words: A Corpus Analysis of Campaign Speeches of President Donald J. Trump

Authors: Aiza Dalman

Abstract:

Words are powerful when these are used wisely and strategically. In this study, twelve (12) campaign speeches of President Donald J. Trump were analyzed as to frequently used words and ethos, pathos and logos being employed. The speeches were read thoroughly, analyzed and interpreted. With the use of Word Counter Tool and Text Analyzer software accessible online, it was found out that the word ‘will’ has the highest frequency of 121, followed by Hillary (58), American (38), going (35), plan and Clinton (32), illegal (30), government (28), corruption (26) and criminal (24). When the speeches were analyzed as to ethos, pathos and logos, on the other hand, it revealed that these were all employed in his speeches. The statements under these pointed out against Hillary or in his favor. The unique strategy of President Donald J. Trump as to frequently used words and ethos, pathos and logos in persuading people perhaps lead the way to his victory.

Keywords: campaign speeches, corpus analysis, ethos, logos and pathos, power of words

Procedia PDF Downloads 233
643 Memory Retrieval and Implicit Prosody during Reading: Anaphora Resolution by L1 and L2 Speakers of English

Authors: Duong Thuy Nguyen, Giulia Bencini

Abstract:

The present study examined structural and prosodic factors on the computation of antecedent-reflexive relationships and sentence comprehension in native English (L1) and Vietnamese-English bilinguals (L2). Participants read sentences presented on the computer screen in one of three presentation formats aimed at manipulating prosodic parsing: word-by-word (RSVP), phrase-segment (self-paced), or whole-sentence (self-paced), then completed a grammaticality rating and a comprehension task (following Pratt & Fernandez, 2016). The design crossed three factors: syntactic structure (simple; complex), grammaticality (target-match; target-mismatch) and presentation format. An example item is provided in (1): (1) The actress that (Mary/John) interviewed at the awards ceremony (about two years ago/organized outside the theater) described (herself/himself) as an extreme workaholic). Results showed that overall, both L1 and L2 speakers made use of a good-enough processing strategy at the expense of more detailed syntactic analyses. L1 and L2 speakers’ comprehension and grammaticality judgements were negatively affected by the most prosodically disrupting condition (word-by-word). However, the two groups demonstrated differences in their performance in the other two reading conditions. For L1 speakers, the whole-sentence and the phrase-segment formats were both facilitative in the grammaticality rating and comprehension tasks; for L2, compared with the whole-sentence condition, the phrase-segment paradigm did not significantly improve accuracy or comprehension. These findings are consistent with the findings of Pratt & Fernandez (2016), who found a similar pattern of results in the processing of subject-verb agreement relations using the same experimental paradigm and prosodic manipulation with English L1 and L2 English-Spanish speakers. The results provide further support for a Good-Enough cue model of sentence processing that integrates cue-based retrieval and implicit prosodic parsing (Pratt & Fernandez, 2016) and highlights similarities and differences between L1 and L2 sentence processing and comprehension.

Keywords: anaphora resolution, bilingualism, implicit prosody, sentence processing

Procedia PDF Downloads 107
642 Unsupervised Assistive and Adaptative Intelligent Agent in Smart Enviroment

Authors: Sebastião Pais, João Casal, Ricardo Ponciano, Sérgio Lorenço

Abstract:

The adaptation paradigm is a basic defining feature for pervasive computing systems. Adaptation systems must work efficiently in a smart environment while providing suitable information relevant to the user system interaction. The key objective is to deduce the information needed information changes. Therefore relying on fixed operational models would be inappropriate. This paper presents a study on developing an Intelligent Personal Assistant to assist the user in interacting with their Smart Environment. We propose an Unsupervised and Language-Independent Adaptation through Intelligent Speech Interface and a set of methods of Acquiring Knowledge, namely Semantic Similarity and Unsupervised Learning.

Keywords: intelligent personal assistants, intelligent speech interface, unsupervised learning, language-independent, knowledge acquisition, association measures, symmetric word similarities, attributional word similarities

Procedia PDF Downloads 515
641 Unsupervised Assistive and Adaptive Intelligent Agent in Smart Environment

Authors: Sebastião Pais, João Casal, Ricardo Ponciano, Sérgio Lourenço

Abstract:

The adaptation paradigm is a basic defining feature for pervasive computing systems. Adaptation systems must work efficiently in smart environment while providing suitable information relevant to the user system interaction. The key objective is to deduce the information needed information changes. Therefore, relying on fixed operational models would be inappropriate. This paper presents a study on developing a Intelligent Personal Assistant to assist the user in interacting with their Smart Environment. We propose a Unsupervised and Language-Independent Adaptation through Intelligent Speech Interface and a set of methods of Acquiring Knowledge, namely Semantic Similarity and Unsupervised Learning.

Keywords: intelligent personal assistants, intelligent speech interface, unsupervised learning, language-independent, knowledge acquisition, association measures, symmetric word similarities, attributional word similarities

Procedia PDF Downloads 587
640 The Meaning of Happiness and Unhappiness among Female Teenagers in Urban Finland: A Social Representations Approach

Authors: Jennifer De Paola

Abstract:

Objectives: The literature is saturated with figures and hard data on happiness and its rates, causes and effects at a large scale, whereas very little is known about the way specific groups of people within societies understand and talk about happiness in their everyday life. The present study contributes to fill this gap in the happiness research by analyzing social representations of happiness among young women through the theoretical frame provided by Moscovici’s Social Representation Theory. Methods: Participants were (N= 351) female students (16-18 year olds) from Finnish, Swedish and English speaking high schools in the Helsinki region, Finland. Main source of data collection were word associations using the stimulus word ‘happiness’ and word associations using as stimulus the term that in the participants’ opinion represents the opposite of happiness. The allowed number of associations was five per stimulus word (10 associations per participant). In total, the 351 participants produced 6973 associations with the two stimulus words given: 3500 (50,19%) associations with ‘happiness’ and 3473 (49,81%) associations with ‘opposite of happiness’. The associations produced were analyzed qualitatively to identify associations with similar meaning and then coded combining similar associations in larger categories. Results: In total, 33 categories were identified respectively for the stimulus word ‘happiness’ and for the stimulus word ‘opposite of happiness’. In general terms, the 33 categories identified for ‘happiness’ included associations regarding relationships with key people considered important, such as ‘family’, abstract concepts such as meaningful life, success and moral values as well as more mundane and hedonic elements like food, pleasure and fun. Similarly, the 33 categories emerged for ‘opposite of happiness’ included relationship problems and arguments, negative feelings such as sadness, depression, stress as well as more concrete issues such as financial problems. Participants were also asked to rate their own level of happiness on a scale from 1 to 10. Results indicated the mean of the self-rated level of happiness was 7,93 (the range varied from 1 to 10; SD = 1, 50). Participants’ responses were further divided into three different groups according to the self-rated level of happiness: group 1 (level 10-9), group 2 (level 8-6), and group 3 (level 5 and lower) in order to investigate the way the categories mentioned above were distributed among the different groups. Preliminary results show that the category ‘family’ is associated with higher level of happiness, whereas its presence gradually decreases among the participants with a lower level of happiness. Moreover, the category ‘depression’ seems to be mainly present among participants in group 3, whereas the category ‘sadness’ is mainly present among participants with higher level of happiness. Conclusion: In conclusion, this study indicates the prevalent ways of thinking about happiness and its opposite among young female students, suggesting that representations varied to some extent depending on the happiness level of the participants. This study contributes to bringing new knowledge as it considers happiness as a holistic state, thus going beyond the literature that so far has too often viewed happiness as a mere unidimensional spectrum.

Keywords: female, happiness, social representations, unhappiness

Procedia PDF Downloads 189
639 Pharyngealization Spread in Ibbi Dialect of Yemeni Arabic: An Acoustic Study

Authors: Fadhl Qutaish

Abstract:

This paper examines the pharyngealization spread in one of the Yemeni Arabic dialects, namely, Ibbi Arabic (IA). It investigates how pharyngealized sounds spread their acoustic features onto the neighboring vowels and change their default features. This feature has been investigated quietly well in MSA but still has to be deeply studied in the different dialect of Arabic which will bring about a clearer picture of the similarities and the differences among these dialects and help in mapping them based on the way this feature is utilized. Though the studies are numerous, no one of them has illustrated how far in the multi-syllabic word the spread can be and whether it takes a steady or gradient manner. This study tries to fill this gap and give a satisfactory explanation of the pharyngealization spread in Ibbi Dialect. This study is the first step towards a larger investigation of the different dialects of Yemeni Arabic in the future. The data recorded are represented in minimal pairs in which the trigger (pharyngealized or the non-pharyngealized sound) is in the initial or final position of monosyllabic and multisyllabic words. A group of 24 words were divided into four groups and repeated three times by three subjects which will yield 216 tokens that are tested and analyzed. The subjects are three male speakers aged between 28 and 31 with no history of neurological, speaking or hearing problems. All of them are bilingual speakers of Arabic and English and native speakers of Ibbi-Dialect. Recordings were done in a sound-proof room and praat software was used for the analysis and coding of the trajectories of F1 and F2 for the low vowel /a/ to see the effect of pharyngealization on the formant trajectory within the same syllable and in other syllables of the same word by comparing the F1 and F2 formants to the non-pharyngealized environment. The results show that pharyngealization spread is gradient (progressively and regressively). The spread is reflected in the gradual raising of F1 as we move closer towards the trigger and the gradual lowering of F2 as well. The results of the F1 mean values in tri-syllabic words when the trigger is word initially show that there is a raise of 37.9 HZ in the first syllable, 26.8HZ in the second syllable and 14.2HZ in the third syllable. F2 mean values undergo a lowering of 239 HZ in the first syllable, 211.7 HZ in the second syllable and 176.5 in the third syllable. This gradual decrease in the difference of F2 values in the non-pharyngealized and pharyngealized context illustrates that the spread is gradient. A similar result was found when the trigger is word-final which proves that the spread is gradient (progressively and regressively.

Keywords: pharyngealization, Yemeni Arabic, Ibbi dialect, pharyngealization spread

Procedia PDF Downloads 183
638 Towards a Dialogical Approach between Christianity and Hinduism: A Comparative Theological Analysis of the Concept of Logos, and Shabd

Authors: Abraham Kuruvilla

Abstract:

Since the inception of Christianity, one of the most important precepts has been that of the ‘word becoming flesh.’ Incarnation, as we understand it, is that the ‘word became flesh.’ As we know, it is a commonly held understanding that the concept of Logos was borrowed from the Greek religion. Such understanding has dominated our thought process. This is problematic as it does not draw out the deep roots of Logos. The understanding of Logos also existed in religion such as Hinduism. For the Hindu faith, the understanding of Shabd is pivotal. It could be arguably equated with the understanding of the Logos. The paper looks into the connection of the primal Christian doctrine of the Logos with that of the Hindu understanding of Shabd. The methodology of the paper would be a comparative theological analysis with the New Testament understanding of the Logos with that of the understanding of Shabd as perceived in the different Vedas of the Hindu faith. The paper would come to the conclusion that there is a conceptual connectivity between Logos and the Shabd. As such the understanding of Logos cannot just be attributed to the Greek understanding of Logos, but rather it predates the Greek understanding of Logos by being connected to the Hindu understanding of Shabd. Accordingly, such comparison brings out the implication for a constructive dialogue between Christianity and the Hindu faith.

Keywords: Christianity, Hinudism, Logos, Shabd

Procedia PDF Downloads 168