Search results for: everyday-life words
1277 Bag of Words Representation Based on Weighting Useful Visual Words
Authors: Fatma Abdedayem
Abstract:
The most effective and efficient methods in image categorization are almost based on bag-of-words (BOW) which presents image by a histogram of occurrence of visual words. In this paper, we propose a novel extension to this method. Firstly, we extract features in multi-scales by applying a color local descriptor named opponent-SIFT. Secondly, in order to represent image we use Spatial Pyramid Representation (SPR) and an extension to the BOW method which based on weighting visual words. Typically, the visual words are weighted during histogram assignment by computing the ratio of their occurrences in the image to the occurrences in the background. Finally, according to classical BOW retrieval framework, only a few words of the vocabulary is useful for image representation. Therefore, we select the useful weighted visual words that respect the threshold value. Experimentally, the algorithm is tested by using different image classes of PASCAL VOC 2007 and is compared against the classical bag-of-visual-words algorithm.Keywords: BOW, useful visual words, weighted visual words, bag of visual words
Procedia PDF Downloads 4361276 The Repetition of New Words and Information in Mandarin-Speaking Children: A Corpus-Based Study
Authors: Jian-Jun Gao
Abstract:
Repetition is used for a variety of functions in conversation. When young children first learn to speak, they often repeat words from the adult’s recent utterance with the learning and social function. The objective of this study was to ascertain whether the repetitions are equivalent in indicating attention to new words and the initial repeat of information in conversation. Based on the observation of naturally occurring language use in Taiwan Corpus of Child Mandarin (TCCM), the results in this study provided empirical support to the previous findings that children are more likely to repeat new words they are offered than to repeat new information. When children get older, there would be a drop in the repetition of both new words and new information.Keywords: acquisition, corpus, mandarin, new words, new information, repetition
Procedia PDF Downloads 1491275 A Word-to-Vector Formulation for Word Representation
Authors: Sandra Rizkallah, Amir F. Atiya
Abstract:
This work presents a novel word to vector representation that is based on embedding the words into a sphere, whereby the dot product of the corresponding vectors represents the similarity between any two words. Embedding the vectors into a sphere enabled us to take into consideration the antonymity between words, not only the synonymity, because of the suitability to handle the polarity nature of words. For example, a word and its antonym can be represented as a vector and its negative. Moreover, we have managed to extract an adequate vocabulary. The obtained results show that the proposed approach can capture the essence of the language, and can be generalized to estimate a correct similarity of any new pair of words.Keywords: natural language processing, word to vector, text similarity, text mining
Procedia PDF Downloads 2751274 Morphological Rules of Bangla Repetition Words for UNL Based Machine Translation
Authors: Nawab Yousuf Ali, S. Golam, A. Ameer, Ashok Toru Roy
Abstract:
This paper develops new morphological rules suitable for Bangla repetition words to be incorporated into an inter lingua representation called Universal Networking Language (UNL). The proposed rules are to be used to combine verb roots and their inflexions to produce words which are then combined with other similar types of words to generate repetition words. This paper outlines the format of morphological rules for different types of repetition words that come from verb roots based on the framework of UNL provided by the UNL centre of the Universal Networking Digital Language (UNDL) foundation.Keywords: Universal Networking Language (UNL), universal word (UW), head word (HW), Bangla-UNL Dictionary, morphological rule, enconverter (EnCo)
Procedia PDF Downloads 3101273 Determining the Number of Words Required to Fulfil the Writing Task in an English Proficiency Exam with the Raters’ Scores
Authors: Defne Akinci Midas
Abstract:
The aim of this study was to determine the minimum, and maximum number of words that would be sufficient to fulfill the writing task in the local English Proficiency Exam (EPE) produced and administered at the Middle East Technical University, Ankara, Turkey. The relationship between the number of words and the scores of the written products that had been awarded by two raters in three online EPEs administered in 2020 was examined. The means, standard deviations, percentages, range, minimum and maximum scores as well as correlations of the scores awarded to written products with the words that amount to 0-50, 51-100, 101-150, 151-200, 201-250, 251-300, and so on were computed. The results showed that the raters did not award a full score to texts that had fewer than 100 words. Moreover, the texts that had around 200 words were awarded the highest scores. The highest number of words that earned the highest scores was about 225, and from then onwards, the scores were either stable or lower. A positive low to moderate correlation was found between the number of words and scores awarded to the texts. We understand that the idea of ‘the longer, the better’ did not apply here. The results also showed that words between 101 to about 225 were sufficient to fulfill the writing task to fully display writing skills and language ability in the specific case of this exam.Keywords: English proficiency exam, number of words, scoring, writing task
Procedia PDF Downloads 1751272 Pudhaiyal: A Maze-Based Treasure Hunt Game for Tamil Words
Authors: Aarthy Anandan, Anitha Narasimhan, Madhan Karky
Abstract:
Word-based games are popular in helping people to improve their vocabulary skills. Games like ‘word search’ and crosswords provide a smart way of increasing vocabulary skills. Word search games are fun to play, but also educational which actually helps to learn a language. Finding the words from word search puzzle helps the player to remember words in an easier way, and it also helps to learn the spellings of words. In this paper, we present a tile distribution algorithm for a Maze-Based Treasure Hunt Game 'Pudhaiyal’ for Tamil words, which describes how words can be distributed horizontally, vertically or diagonally in a 10 x 10 grid. Along with the tile distribution algorithm, we also present an algorithm for the scoring model of the game. The proposed game has been tested with 20,000 Tamil words.Keywords: Pudhaiyal, Tamil word game, word search, scoring, maze, algorithm
Procedia PDF Downloads 4401271 Towards Kurdish Internet Linguistics: A Case Study on the Impact of Social Media on Kurdish Language
Authors: Karwan K. Abdalrahman
Abstract:
Due to the impacts of the internet and social media, new words and expressions enter the Kurdish language, and a number of familiarized words get new meanings. The case is especially true when the technique of transliteration is taken into consideration. Through transliteration, a number of selected words widely used on social media are entering the Kurdish media discourse. In addition, a selected number of Kurdish words get new cultural and psychological meanings. The significance of this study is to delve into the process of word formation in the Kurdish language and explore how new words and expressions are formed by social media users and got public recognition. First, the study investigates the English words that enter the Kurdish language through different social media platforms. All of these words are transliterated and are used in spoken and written discourses. Second, there are a specific number of Kurdish words that got new meanings in social media. As for these words, there are psychological and cultural factors that make people use these expressions for specific political reasons. It can be argued that they have an indirect political message along with their new linguistic usages. This is a qualitative study analyzing video content that was published in the last two years on social media platforms, including Facebook and YouTube. The collected data was analyzed based on the themes discussed above. The findings of the research can be summarized as follows: the widely used transliterated words have entered both the spoken and written discourses. Authors in online and offline newspapers, TV presenters, literary writers, columnists are using these new expressions in their writings. As for the Kurdish words with new meanings, they are also widely used for psychological, cultural, and political reasons.Keywords: Kurdish language, social media, new meanings, transliteration, vocabulary
Procedia PDF Downloads 1801270 The Cultural and Semantic Danger of English Transparent Words Translated from English into Arabic
Authors: Abdullah Khuwaileh
Abstract:
While teaching and translating vocabulary is no longer a neglected area in ELT in general and in translation in particular, the psychology of its acquisition has been a neglected area. Our paper aims at exploring some of the learning and translating conditions under which vocabulary is acquired and translated properly. To achieve this objective, two teaching methods (experiments) were applied on 4 translators to measure their acquisition of a number of transparent vocabulary items. Some of these items were knowingly chosen from 'deceptively transparent words'. All the data, sample, etc., were taken from Jordan University of Science and Technology (JUST) and Yarmouk University, where the researcher is employed. The study showed that translators might translate transparent words inaccurately, particularly if these words are uncontextualised. It was also shown that the morphological structures of words may lead translators or even EFL learners to misinterpretations of meaning.Keywords: english, transparent, word, processing, translation
Procedia PDF Downloads 711269 Intensifier as Changed from the Impolite Word in Thai
Authors: Methawee Yuttapongtada
Abstract:
Intensifier is the linguistic term and device that is generally found in different languages in order to enhance and give additional quantity, quality or emotion to the words of each language. In fact, each language in the world has both of the similar and dissimilar intensifying device. More specially, the wide variety of intensifying device is used for Thai language and one of those is usage of the impolite word or the word that used to mean something negative as intensifier. The data collection in this study was done throughout the spoken language style by collecting from intensifiers regarded as impolite words because these words as employed in the other contexts will be held as the rude, swear words or the words with negative meaning. Then, backward study to the past was done in order to consider the historical change. Explanation of the original meaning and the contexts of words use from the past till the present time were done by use of both textual documents and dictionaries available in different periods. It was found that regarding the semantics and pragmatic aspects, subjectification also is the significant motivation that changed the impolite words to intensifiers. At last, it can explain pathway of the semantic change of these very words undoubtedly. Moreover, it is found that use tendency in the impolite word or the word that used to mean something negative will more be increased and this phenomenon is commonly found in many languages in the world and results of this research may support to the belief that human language in the world is universal and the same still reflected that human has the fundamental thought as the same to each other basically.Keywords: impolite word, intensifier, Thai, semantic change
Procedia PDF Downloads 1811268 Hybrid SVM/DBN Model for Arabic Isolated Words Recognition
Authors: Elyes Zarrouk, Yassine Benayed, Faiez Gargouri
Abstract:
This paper presents a new hybrid model for isolated Arabic words recognition. To do this, we apply Support Vectors Machine (SVM) as an estimator of posterior probabilities within the Dynamic Bayesian networks (DBN). This paper deals a comparative study between DBN and SVM/DBN systems for multi-dialect isolated Arabic words. Performance using SVM/DBN is found to exceed that of DBNs trained on an identical task, giving higher recognition accuracy for four different Arabic dialects. In fact, the average of recognition rates for the four dialects with SVM/DBN was 87.67% while 83.01% with DBN.Keywords: dynamic Bayesian networks, hybrid models, supports vectors machine, Arabic isolated words
Procedia PDF Downloads 5601267 Formation of Blends in Hausa Language
Authors: Maryam Maimota Shehu
Abstract:
Words are the basic building blocks of a language. In everyday usage of a language, words are used, and new words are formed and reformed to contain and accommodate all entities, phenomena, qualities and every aspect of the entire life. Despite the fact that many studies have been conducted on morphological processes in The Hausa language. Most of the works concentrated on borrowing, affixation, reduplication and derivation, but blending has been neglected to the extent that some of the Hausa linguists claim that, blending does not exist in the language. Therefore, the current study investigates and examines blending as one of the word formation processes' in the language. The study focuses its main attention on blending as a word-formation process and how this process is used adequately in the formation of words in The Hausa language. To achieve the aims, the research answered these questions: 1) is blending used as a process of word formation in Hausa? 2) What are the words formed using this process? This study utilizes the Natural Morphology Theory proposed by Dressler, (1985) which was adopted by Belly (2007). The data of this study have been collected from newspaper articles, novels, and written literature of Hausa language. Based on the findings, this study found out that, there exist new kind of words formed in The Hausa language under blending, which previous findings did not either reveal or explain in detail. Another part of the finding shows that some of the words change their grammatical classes and meaning while blended.Keywords: morphology, word formation, blending in hausa language, language
Procedia PDF Downloads 4191266 The Power of Words: A Corpus Analysis of Campaign Speeches of President Donald J. Trump
Authors: Aiza Dalman
Abstract:
Words are powerful when these are used wisely and strategically. In this study, twelve (12) campaign speeches of President Donald J. Trump were analyzed as to frequently used words and ethos, pathos and logos being employed. The speeches were read thoroughly, analyzed and interpreted. With the use of Word Counter Tool and Text Analyzer software accessible online, it was found out that the word ‘will’ has the highest frequency of 121, followed by Hillary (58), American (38), going (35), plan and Clinton (32), illegal (30), government (28), corruption (26) and criminal (24). When the speeches were analyzed as to ethos, pathos and logos, on the other hand, it revealed that these were all employed in his speeches. The statements under these pointed out against Hillary or in his favor. The unique strategy of President Donald J. Trump as to frequently used words and ethos, pathos and logos in persuading people perhaps lead the way to his victory.Keywords: campaign speeches, corpus analysis, ethos, logos and pathos, power of words
Procedia PDF Downloads 2791265 The Development of Chinese-English Homophonic Word Pairs Databases for English Teaching and Learning
Authors: Yuh-Jen Wu, Chun-Min Lin
Abstract:
Homophonic words are common in Mandarin Chinese which belongs to the tonal language family. Using homophonic cues to study foreign languages is one of the learning techniques of mnemonics that can aid the retention and retrieval of information in the human memory. When learning difficult foreign words, some learners transpose them with words in a language they are familiar with to build an association and strengthen working memory. These phonological clues are beneficial means for novice language learners. In the classroom, if mnemonic skills are used at the appropriate time in the instructional sequence, it may achieve their maximum effectiveness. For Chinese-speaking students, proper use of Chinese-English homophonic word pairs may help them learn difficult vocabulary. In this study, a database program is developed by employing Visual Basic. The database contains two corpora, one with Chinese lexical items and the other with English ones. The Chinese corpus contains 59,053 Chinese words that were collected by a web crawler. The pronunciations of this group of words are compared with words in an English corpus based on WordNet, a lexical database for the English language. Words in both databases with similar pronunciation chunks and batches are detected. A total of approximately 1,000 Chinese lexical items are located in the preliminary comparison. These homophonic word pairs can serve as a valuable tool to assist Chinese-speaking students in learning and memorizing new English vocabulary.Keywords: Chinese, corpus, English, homophonic words, vocabulary
Procedia PDF Downloads 1821264 Network Word Discovery Framework Based on Sentence Semantic Vector Similarity
Authors: Ganfeng Yu, Yuefeng Ma, Shanliang Yang
Abstract:
The word discovery is a key problem in text information retrieval technology. Methods in new word discovery tend to be closely related to words because they generally obtain new word results by analyzing words. With the popularity of social networks, individual netizens and online self-media have generated various network texts for the convenience of online life, including network words that are far from standard Chinese expression. How detect network words is one of the important goals in the field of text information retrieval today. In this paper, we integrate the word embedding model and clustering methods to propose a network word discovery framework based on sentence semantic similarity (S³-NWD) to detect network words effectively from the corpus. This framework constructs sentence semantic vectors through a distributed representation model, uses the similarity of sentence semantic vectors to determine the semantic relationship between sentences, and finally realizes network word discovery by the meaning of semantic replacement between sentences. The experiment verifies that the framework not only completes the rapid discovery of network words but also realizes the standard word meaning of the discovery of network words, which reflects the effectiveness of our work.Keywords: text information retrieval, natural language processing, new word discovery, information extraction
Procedia PDF Downloads 951263 Optimized Text Summarization Model on Mobile Screens for Sight-Interpreters: An Empirical Study
Authors: Jianhua Wang
Abstract:
To obtain key information quickly from long texts on small screens of mobile devices, sight-interpreters need to establish optimized summarization model for fast information retrieval. Four summarization models based on previous studies were studied including title+key words (TKW), title+topic sentences (TTS), key words+topic sentences (KWTS) and title+key words+topic sentences (TKWTS). Psychological experiments were conducted on the four models for three different genres of interpreting texts to establish the optimized summarization model for sight-interpreters. This empirical study shows that the optimized summarization model for sight-interpreters to quickly grasp the key information of the texts they interpret is title+key words (TKW) for cultural texts, title+key words+topic sentences (TKWTS) for economic texts and topic sentences+key words (TSKW) for political texts.Keywords: different genres, mobile screens, optimized summarization models, sight-interpreters
Procedia PDF Downloads 3141262 English Pashto Contact: Morphological Adaptation of Bilingual Compound Words in Pashto
Authors: Imran Ullah Imran
Abstract:
Language contact is a familiar concept in the present global world. Across the globe, languages get mixed up at different levels. Borrowing, code-switching are some of the means through which languages interact. This study examines Pashto-English contact at word and syllable levels. By recording the speech of 30 Pashto native speakers, selected via 'social network' sampling, the study located a number of Pashto-English compound words, which is a unique contact of its kind. In data analysis, tokens were categorized on the basis of their pattern and morphological structure. The study shows that Pashto-English Bilingual Compound words (BCWs) are very prevalent in the Pashto language. The study also found that the BCWs in Pashto are completely productive and have their own meanings. It also shows that the dominant pattern of hybrid words in Pashto is the conjugation of an independent English root word followed by a Pashto inflectional morpheme, which contributes to the core semantic content of the construction. The BCWs construction shows that how both the languages are closer to each other. Pashto-English contact results into bilingual compound and hybrid words, which forms a considerable number of tokens in the present-day spoken Pashto. On the basis of these findings, the study assumes that the same phenomenon may increase with the passage of time that would, in turn, result in the formation of more bilingual compound or hybrid words.Keywords: code-mixing, bilingual compound words, pashto-english contact, hybrid words, inflectional lexical morpheme
Procedia PDF Downloads 2491261 Reasons for Language Words in the Quran and Literary Approaches That Are Persian
Authors: Fateme Mazbanpoor, Sayed Mohammad Amiri
Abstract:
In this article, we will examine the Persian words in Quran and study the reasons of their presence in this holy book. Writers of this paper extracted about 70 Persian words of Quran by referring to resources. (Alalfaz ol Moarab ol Farsieh Edishir, Almoarabol Javalighi, Almahzab va Etghan Seuti; Vocabulary involved in Quran Arthur Jeffry;, and etc…), some of these words are: ‘Abarigh, ‘Estabragh’,’Barzakh’, ‘Din’,’Zamharir, ‘Sondos’ ‘Sejil’,’ Namaregh’, ‘Fil’ etc. These Persian words have entered Arabic and finally entered Quran in two ways: 1) directly from Persian language, 2) via other languages. The first way: because of the Iranian dominance on Hira, Yemen, whole Oman and Bahrein land in Sasanian period, there were political, religious, linguistic, literary, and trade ties between these Arab territories causing the impact of Persian on Arabic; giving way to many Persian-loan words into Arabic in this period of time. The second way: Since the geographical and business conditions of the areas were dominated by Iran, Hejaz had lots of deals and trades with Mesopotamia and Yemen. On the other hand, Arabic language which was relatively a young language at that time, used to be impressed by Semitic languages in order to expand its vocabulary (Syrian and Aramaic were influenced by the languages of Iran). Consequently, due to the long relationship between Iranian and Arabs, some of the Persian words have taken longer ways through Aramaic and Syrian to find their way into Quran.Keywords: Quran, Persian word, Arabic language, Persian
Procedia PDF Downloads 4621260 Literary Words of Foreign Origin as Social Markers in Jeffrey Archer's Novels Speech Portrayals
Authors: Tatiana Ivushkina
Abstract:
The paper is aimed at studying the use of literary words of foreign origin in modern fiction from a sociolinguistic point of view, which presupposes establishing correlation between this category of words in a speech portrayal or narrative and a social status of the speaker, verifying that it bears social implications and serves as a social marker or index of socially privileged identity in the British literature of the 21-st century. To this end, there were selected literary words of foreign origin in context (60 contexts) and subjected to careful examination. The study is carried out on two novels by Jeffrey Archer – Not a Penny More, Not a Penny Less and A Prisoner of Birth – who, being a graduate from Oxford, represents socially privileged classes himself and gives a wide depiction of characters with different social backgrounds and statuses. The analysis of the novels enabled us to categorize the selected words into four relevant groups. The first represented by terms (commodity, debenture, recuperation, syringe, luminescence, umpire, etc.) serves to unambiguously indicate education, occupation, a field of knowledge in which a character is involved or a situation of communication. The second group is formed of words used in conjunction with their Germanic counterparts (perspiration – sweat, padre – priest, convivial – friendly) to contrast social position of the characters: literary words serving as social indices of upper class speakers whereas their synonyms of Germanic origin characterize middle or lower class speech portrayals. The third class of words comprises socially marked words (verbs, nouns, and adjectives), or U-words (the term first coined by Allan Ross and Nancy Mitford), the status acquired in the course of social history development (elegant, excellent, sophistication, authoritative, preposterous, etc.). The fourth includes words used in a humorous or ironic meaning to convey the narrator’s attitude to the characters or situation itself (ministrations, histrionic, etc.). Words of this group are perceived as 'alien', stylistically distant as they create incongruity between style and subject matter. Social implication of the selected words is enhanced by French words and phrases often accompanying them.Keywords: British literature of the XXI century, literary words of foreign origin, social context, social meaning
Procedia PDF Downloads 1341259 Morphological Analysis of Manipuri Language: Wahei-Neinarol
Authors: Y. Bablu Singh, B. S. Purkayashtha, Chungkham Yashawanta Singh
Abstract:
Morphological analysis forms the basic foundation in NLP applications including syntax parsing Machine Translation (MT), Information Retrieval (IR) and automatic indexing in all languages. It is the field of the linguistics; it can provide valuable information for computer based linguistics task such as lemmatization and studies of internal structure of the words. Computational Morphology is the application of morphological rules in the field of computational linguistics, and it is the emerging area in AI, which studies the structure of words, which are formed by combining smaller units of linguistics information, called morphemes: the building blocks of words. Morphological analysis provides about semantic and syntactic role in a sentence. It analyzes the Manipuri word forms and produces several grammatical information associated with the words. The Morphological Analyzer for Manipuri has been tested on 3500 Manipuri words in Shakti Standard format (SSF) using Meitei Mayek as source; thereby an accuracy of 80% has been obtained on a manual check.Keywords: morphological analysis, machine translation, computational morphology, information retrieval, SSF
Procedia PDF Downloads 3261258 Formation of Clipped Forms in Hausa Language
Authors: Maryam Maimota Shehu
Abstract:
Words are the basic building blocks of a language. In everyday usage of a language, words are used, and new words are formed and reformed in order to contain and accommodate all entities, phenomena, qualities and every aspect of the entire life. Despite the fact that many studies have been conducted on morphological processes in Hausa language. Most of the works concentrated on borrowing, affixation, reduplication and derivation, but clipping has been neglected to the extent that only a few scholars sited some examples in the language. Therefore, the current study investigates and examines clipping as one of the word formation processes fully found in the language. The study focuses its main attention on clipping as a word-formation process and how this process is used adequately in the formation of words and their occurrence in Hausa sentences. In order to achieve the aims, the research answered these questions: 1) is clipping used as process of word formation in Hausa? 2) What are the words formed using this process? This study utilizes the Natural Morphology Theory proposed by Dressler, (1985) which was adopted by belly (2007). The data of this study have been collected from newspaper articles, novels, and written literature of Hausa language. Based on the findings, this study found out that, there exist many kinds of words formed in Hausa language using clipping in sentence and discuss, which previous findings did not either reveals, or explain in detail. Other part of the finding shows that clipping in Hausa language occurs on nouns, verbs, adjectives, reduplicated words and compounds while retains their meanings and grammatical classes.Keywords: clipping, Hausa language, morphology, word formation processes
Procedia PDF Downloads 4701257 The Grammatical Dictionary Compiler: A System for Kartvelian Languages
Authors: Liana Lortkipanidze, Nino Amirezashvili, Nino Javashvili
Abstract:
The purpose of the grammatical dictionary is to provide information on the morphological and syntactic characteristics of the basic word in the dictionary entry. The electronic grammatical dictionaries are used as a tool of automated morphological analysis for texts processing. The Georgian Grammatical Dictionary should contain grammatical information for each word: part of speech, type of declension/conjugation, grammatical forms of the word (paradigm), alternative variants of basic word/lemma. In this paper, we present the system for compiling the Georgian Grammatical Dictionary automatically. We propose dictionary-based methods for extending grammatical lexicons. The input lexicon contains only a few number of words with identical grammatical features. The extension is based on similarity measures between features of words; more precisely, we add words to the extended lexicons, which are similar to those, which are already in the grammatical dictionary. Our dictionaries are corpora-based, and for the compiling, we introduce the method for lemmatization of unknown words, i.e., words of which neither full form nor lemma is in the grammatical dictionary.Keywords: acquisition of lexicon, Georgian grammatical dictionary, lemmatization rules, morphological processor
Procedia PDF Downloads 1451256 Compounding and Blending in English and Hausa Languages
Authors: Maryam Maimota
Abstract:
Words are the basic building blocks of a language. In everyday usage of a language, words are used and new words are formed and reformed in order to contain and accommodate all entities, phenomena, qualities and every aspect of the entire human life. This research study seeks to examine and compare some of the word formation processes and how they are used in forming new words in English and Hausa languages. The study focuses its main attention on blending and compounding as word formation processes and how the processes are used adequately in the formation of words in both English and Hausa languages. The research aims to find out, how compounding and blending are used, as processes of word formation in these two languages. And also, to investigate the word formation processes involved in compounding and blending in these languages, and the nature of words that are formed. Therefore, the research tries to find the answers to the following research questions; What types of compound and blended forms are found and how they are formed in the English and Hausa languages? How these compounded and blended forms functioned in both English and Hausa languages in different context such as in phrases and sentences structures? Findings of the study reveal that, there exist new kind of words formed in Hausa and English language under blending, which previous findings did not either reveal or explain in detail. Similarly, there are a lot of similarities found in the way these blends and compounds forms in the two languages, however, the data available shows that, blends in the Hausa language are more, when compared to the blends in English. The data of this study will be gathered based on discourse found in newspaper, articles, novels, and written literature of the Hausa and English languages.Keywords: blending, compounding, morphology, word formation
Procedia PDF Downloads 3811255 Web Search Engine Based Naming Procedure for Independent Topic
Authors: Takahiro Nishigaki, Takashi Onoda
Abstract:
In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.Keywords: independent topic analysis, topic extraction, topic naming, web search engine
Procedia PDF Downloads 1191254 Structural Analysis of Username Segment in E-Mail Addresses of Engineering Institutes of Gujarat State of India
Authors: Jatinderkumar R. Saini
Abstract:
E-mail has become a key mechanism of electronic communication. This is truer for professional organizations that like to communicate with their subjects online and are slowly shifting to paper-less office. The current paper focuses specifically on academic institutions offering Engineering course in Gujarat state and attempts for textual analysis of the usernames of the institutional e-mail addresses. We found that the institutions tend to design the username segment of their e-mail addresses by choosing words or combination of words from specific categories. The paper also highlights the use of special characters, digits and random words in designing the usernames. On the sidelines, the paper lists the style of employing department names and designations for the design process. To the best of our knowledge, this is the first formal attempt to analyze the selection of words employed for designing username segment of e-mail addresses of Engineering institutions.Keywords: e-mail address, institute, engineering, username
Procedia PDF Downloads 3361253 A Method for the Extraction of the Character's Tendency from Korean Novels
Authors: Min-Ha Hong, Kee-Won Kim, Seung-Hoon Kim
Abstract:
The character in the story-based content, such as novels and movies, is one of the core elements to understand the story. In particular, the character’s tendency is an important factor to analyze the story-based content, because it has a significant influence on the storyline. If readers have the knowledge of the tendency of characters before reading a novel, it will be helpful to understand the structure of conflict, episode and relationship between characters in the novel. It may therefore help readers to select novel that the reader wants to read. In this paper, we propose a method of extracting the tendency of the characters from a novel written in Korean. In advance, we build the dictionary with pairs of the emotional words in Korean and English since the emotion words in the novel’s sentences express character’s feelings. We rate the degree of polarity (positive or negative) of words in our emotional words dictionary based on SenticNet. Then we extract characters and emotion words from sentences in a novel. Since the polarity of a word grows strong or weak due to sentence features such as quotations and modifiers, our proposed method consider them to calculate the polarity of characters. The information of the extracted character’s polarity can be used in the book search service or book recommendation service.Keywords: character tendency, data mining, emotion word, Korean novel
Procedia PDF Downloads 3341252 N400 Investigation of Semantic Priming Effect to Symbolic Pictures in Text
Authors: Thomas Ousterhout
Abstract:
The purpose of this study was to investigate if incorporating meaningful pictures of gestures and facial expressions in short sentences of text could supplement the text with enough semantic information to produce and N400 effect when probe words incongruent to the picture were subsequently presented. Event-related potentials (ERPs) were recorded from a 14-channel commercial grade EEG headset while subjects performed congruent/incongruent reaction time discrimination tasks. Since pictures of meaningful gestures have been shown to be semantically processed in the brain in a similar manner as words are, it is believed that pictures will add supplementary information to text just as the inclusion of their equivalent synonymous word would. The hypothesis is that when subjects read the text/picture mixed sentences, they will process the images and words just like in face-to-face communication and therefore probe words incongruent to the image will produce an N400.Keywords: EEG, ERP, N400, semantics, congruency, facilitation, Emotiv
Procedia PDF Downloads 2581251 Therapeutic Power of Words through Reading Writing and Storytelling
Authors: Sakshi Kaul, Sundeep Verma
Abstract:
The focus of the current paper is to evaluate the therapeutic power of words. This will be done by critically evaluating the impact reading, writing and storytelling have on individuals. When we read, tell or listen to a story we are exercising our imagination. Imagination becomes the source of activation of thoughts and actions. This enables and helps the reader, writer or the listener to express the suppressed emotions or desires. The stories told, untold may bring various human emotions and attributes to forth such as hope, optimism, fear, happiness. Each story narrated evokes different emotions, at times they help us unravel ourselves in the world of the teller thereby bringing solace. Stories heard or told add to individual’s life by creating a community around, giving wings of thoughts that enable individual to be more imaginative and creative thereby fostering positively and happiness. Reading if looked at from the reader’s point of view can broaden the horizon of information and ideas about facts and life laws giving more meaning to life. From ‘once upon a time’ to ‘to happily ever after’, all that stories talk about is life’s learning. The power of words sometimes may be negated, this paper would reiterate the power of words by critically evaluating how words can become powerful and therapeutic in various structures and forms in the society. There is a story behind every situation, action and reaction. Hence it is of prime importance to understand each story, to enable a person to deal with whatever he or she may be going through. For example, if a client is going through some trauma in his or her life, the counsellor needs to know exactly what is the turmoil that is being faced so that the client can be assisted accordingly. Counselling is considered a process of healing through words or as Talk therapy, where merely through words we try to heal the client. In a counselling session, the counsellor focuses on working with the clients to bring a positive change. The counsellor allows the client to express themselves which is referred to as catharsis. The words spoken, written or heard transcend to heal and can be therapeutic. The therapeutic power of words has been seen in various cultural practices and belief systems. The underlining belief that words have the power to heal, save and bring change has existed from ages. Many religious and spiritual practices also acclaim the power of the words. Through this empirical paper, we have tried to bring to light how reading, writing, and storytelling have been used as mediums of healing and have been therapeutic in nature.Keywords: reading, storytelling, therapeutic, words
Procedia PDF Downloads 2691250 Computable Difference Matrix for Synonyms in the Holy Quran
Authors: Mohamed Ali Al Shaari, Khalid M. El Fitori
Abstract:
In the field of Quran Studies known as Ghareeb A Quran (the study of the meanings of strange words and structures in Holy Quran), it is difficult to distinguish some pragmatic meanings from conceptual meanings. One who wants to study this subject may need to look for a common usage between any two words or more; to understand general meaning, and sometimes may need to look for common differences between them, even if there are synonyms (word sisters). Some of the distinguished scholars of Arabic linguistics believe that there are no synonym words, they believe in varieties of meaning and multi-context usage. Based on this viewpoint, our method was designed to look for synonyms of a word, then the differences that distinct the word and their synonyms. There are many available books that use such a method e.g. synonyms books, dictionaries, glossaries, and some books on the interpretations of strange vocabulary of the Holy Quran, but it is difficult to look up words in these written works. For that reason, we proposed a logical entity, which we called Differences Matrix (DM). DM groups the synonyms words to extract the relations between them and to know the general meaning, which defines the skeleton of all word synonyms; this meaning is expressed by a word of its sisters. In Differences Matrix, we used the sisters(words) as titles for rows and columns, and in the obtained cells we tried to define the row title (word) by using column title (her sister), so the relations between sisters appear, the expected result is well defined groups of sisters for each word. We represented the obtained results formally, and used the defined groups as a base for building the ontology of the Holy Quran synonyms.Keywords: Quran, synonyms, differences matrix, ontology
Procedia PDF Downloads 4191249 Investigating the Influences of Long-Term, as Compared to Short-Term, Phonological Memory on the Word Recognition Abilities of Arabic Readers vs. Arabic Native Speakers: A Word-Recognition Study
Authors: Insiya Bhalloo
Abstract:
It is quite common in the Muslim faith for non-Arabic speakers to be able to convert written Arabic, especially Quranic Arabic, into a phonological code without significant semantic or syntactic knowledge. This is due to prior experience learning to read the Quran (a religious text written in Classical Arabic), from a very young age such as via enrolment in Quranic Arabic classes. As compared to native speakers of Arabic, these Arabic readers do not have a comprehensive morpho-syntactic knowledge of the Arabic language, nor can understand, or engage in Arabic conversation. The study seeks to investigate whether mere phonological experience (as indicated by the Arabic readers’ experience with Arabic phonology and the sound-system) is sufficient to cause phonological-interference during word recognition of previously-heard words, despite the participants’ non-native status. Both native speakers of Arabic and non-native speakers of Arabic, i.e., those individuals that learned to read the Quran from a young age, will be recruited. Each experimental session will include two phases: An exposure phase and a test phase. During the exposure phase, participants will be presented with Arabic words (n=40) on a computer screen. Half of these words will be common words found in the Quran while the other half will be words commonly found in Modern Standard Arabic (MSA) but either non-existent or prevalent at a significantly lower frequency within the Quran. During the test phase, participants will then be presented with both familiar (n = 20; i.e., those words presented during the exposure phase) and novel Arabic words (n = 20; i.e., words not presented during the exposure phase. ½ of these presented words will be common Quranic Arabic words and the other ½ will be common MSA words but not Quranic words. Moreover, ½ the Quranic Arabic and MSA words presented will be comprised of nouns, while ½ the Quranic Arabic and MSA will be comprised of verbs, thereby eliminating word-processing issues affected by lexical category. Participants will then determine if they had seen that word during the exposure phase. This study seeks to investigate whether long-term phonological memory, such as via childhood exposure to Quranic Arabic orthography, has a differential effect on the word-recognition capacities of native Arabic speakers and Arabic readers; we seek to compare the effects of long-term phonological memory in comparison to short-term phonological exposure (as indicated by the presentation of familiar words from the exposure phase). The researcher’s hypothesis is that, despite the lack of lexical knowledge, early experience with converting written Quranic Arabic text into a phonological code will help participants recall the familiar Quranic words that appeared during the exposure phase more accurately than those that were not presented during the exposure phase. Moreover, it is anticipated that the non-native Arabic readers will also report more false alarms to the unfamiliar Quranic words, due to early childhood phonological exposure to Quranic Arabic script - thereby causing false phonological facilitatory effects.Keywords: modern standard arabic, phonological facilitation, phonological memory, Quranic arabic, word recognition
Procedia PDF Downloads 3571248 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks
Authors: Jiajun Wang, Xiaoge Li
Abstract:
The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods mainly focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence of words in different ranges in the sentence, resulting in deviation when assigning relationship weight to other words other than aspect words. To solve these problems, we propose a new aspect-level sentiment analysis model that combines a multi-channel convolutional network and graph convolutional network (GCN). Firstly, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and self-attention mechanism. Besides, a multi-channel convolutional network is used to extract the features of words in different ranges. Finally, a convolutional graph network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models, which shows that our model is better and more effective.Keywords: aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree
Procedia PDF Downloads 216