Search results for: number of words
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 10850

10850 Determining the Number of Words Required to Fulfil the Writing Task in an English Proficiency Exam with the Raters’ Scores

Authors: Defne Akinci Midas

Abstract:

The aim of this study was to determine the minimum and maximum number of words sufficient to fulfill the writing task in the local English Proficiency Exam (EPE) produced and administered at the Middle East Technical University, Ankara, Turkey. The relationship between the number of words and the scores awarded to the written products by two raters in three online EPEs administered in 2020 was examined. The means, standard deviations, percentages, ranges, minimum and maximum scores, and correlations were computed for the scores awarded to written products in word-count bands of 0-50, 51-100, 101-150, 151-200, 201-250, 251-300, and so on. The results showed that the raters did not award a full score to texts that had fewer than 100 words. Moreover, the texts that had around 200 words were awarded the highest scores. The highest number of words that earned the highest scores was about 225; beyond that point, the scores were either stable or lower. A low to moderate positive correlation was found between the number of words and the scores awarded to the texts. The idea of ‘the longer, the better’ therefore did not apply here. The results also showed that between 101 and about 225 words were sufficient to fulfill the writing task and to fully display writing skills and language ability in the specific case of this exam.
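
The banding and correlation analysis described in this abstract can be illustrated with a minimal sketch. It assumes each written product is a (word count, score) pair; the function names `band_label`, `mean_score_by_band`, and `pearson` are hypothetical, and the abstract does not say which correlation coefficient or software the author used, so Pearson's r is an assumption here.

```python
from statistics import mean


def band_label(n_words, width=50):
    """Return the word-count band (0-50, 51-100, 101-150, ...) for a text."""
    if n_words <= width:
        return f"0-{width}"
    lower = ((n_words - 1) // width) * width + 1
    return f"{lower}-{lower + width - 1}"


def mean_score_by_band(records, width=50):
    """Group (word_count, score) pairs by band and average the scores."""
    bands = {}
    for n_words, score in records:
        bands.setdefault(band_label(n_words, width), []).append(score)
    return {band: mean(scores) for band, scores in bands.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient (assumed; the abstract names none)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

Calling `mean_score_by_band` on the rated texts gives one mean score per band, which is the kind of table from which a plateau around 200 to 225 words could be read off.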

Keywords: English proficiency exam, number of words, scoring, writing task

Procedia PDF Downloads 136
10849 Towards Kurdish Internet Linguistics: A Case Study on the Impact of Social Media on Kurdish Language

Authors: Karwan K. Abdalrahman

Abstract:

Due to the impact of the internet and social media, new words and expressions are entering the Kurdish language, and a number of familiar words are acquiring new meanings. This is especially true when the technique of transliteration is taken into consideration. Through transliteration, a number of words widely used on social media are entering Kurdish media discourse. In addition, a number of Kurdish words are acquiring new cultural and psychological meanings. The significance of this study is that it delves into the process of word formation in the Kurdish language and explores how new words and expressions are formed by social media users and gain public recognition. First, the study investigates the English words that enter the Kurdish language through different social media platforms. All of these words are transliterated and are used in spoken and written discourse. Second, a specific number of Kurdish words have acquired new meanings on social media. For these words, psychological and cultural factors lead people to use the expressions for specific political reasons. It can be argued that they carry an indirect political message along with their new linguistic usages. This is a qualitative study analyzing video content published in the last two years on social media platforms, including Facebook and YouTube. The collected data were analyzed based on the themes discussed above. The findings of the research can be summarized as follows: the widely used transliterated words have entered both spoken and written discourse. Authors in online and offline newspapers, TV presenters, literary writers, and columnists are using these new expressions in their writing. The Kurdish words with new meanings are also widely used, for psychological, cultural, and political reasons.

Keywords: Kurdish language, social media, new meanings, transliteration, vocabulary

Procedia PDF Downloads 153
10848 Bag of Words Representation Based on Weighting Useful Visual Words

Authors: Fatma Abdedayem

Abstract:

The most effective and efficient methods in image categorization are mostly based on bag-of-words (BOW), which represents an image by a histogram of occurrences of visual words. In this paper, we propose a novel extension to this method. Firstly, we extract features at multiple scales by applying a color local descriptor named opponent-SIFT. Secondly, in order to represent an image, we use Spatial Pyramid Representation (SPR) and an extension to the BOW method that is based on weighting visual words. Typically, the visual words are weighted during histogram assignment by computing the ratio of their occurrences in the image to their occurrences in the background. Finally, according to the classical BOW retrieval framework, only a few words of the vocabulary are useful for image representation. Therefore, we select the useful weighted visual words that satisfy the threshold value. Experimentally, the algorithm is tested on different image classes of PASCAL VOC 2007 and is compared against the classical bag-of-visual-words algorithm.
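
The weighting and selection step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: visual words are given as integer IDs already assigned to image and background regions, the function name `weighted_histogram` is hypothetical, and both the add-one smoothing in the denominator and the threshold value are choices made here, not details taken from the paper.

```python
from collections import Counter


def weighted_histogram(image_words, background_words, threshold=1.0):
    """Weight each visual word by image/background occurrence ratio.

    Words whose weight falls below `threshold` are dropped, leaving only
    the "useful" weighted visual words.
    """
    image_counts = Counter(image_words)
    background_counts = Counter(background_words)
    weights = {}
    for word, count in image_counts.items():
        # Add-one smoothing (an assumption) avoids division by zero for
        # words that never occur in the background.
        ratio = count / (background_counts[word] + 1)
        if ratio >= threshold:
            weights[word] = ratio
    return weights
```

For instance, a word that occurs three times in the image and never in the background keeps weight 3.0, while a word seen once in the image and three times in the background gets 0.25 and is filtered out at the default threshold.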

Keywords: BOW, useful visual words, weighted visual words, bag of visual words

Procedia PDF Downloads 403
10847 English Pashto Contact: Morphological Adaptation of Bilingual Compound Words in Pashto

Authors: Imran Ullah Imran

Abstract:

Language contact is a familiar concept in the present global world. Across the globe, languages get mixed up at different levels. Borrowing and code-switching are some of the means through which languages interact. This study examines Pashto-English contact at the word and syllable levels. By recording the speech of 30 Pashto native speakers, selected via 'social network' sampling, the study located a number of Pashto-English compound words, which represent a unique kind of contact. In the data analysis, tokens were categorized on the basis of their pattern and morphological structure. The study shows that Pashto-English Bilingual Compound Words (BCWs) are very prevalent in the Pashto language. The study also found that the BCWs in Pashto are completely productive and have their own meanings. It also shows that the dominant pattern of hybrid words in Pashto is the combination of an independent English root word followed by a Pashto inflectional morpheme, which contributes to the core semantic content of the construction. The construction of BCWs shows how close the two languages are to each other. Pashto-English contact results in bilingual compound and hybrid words, which form a considerable number of tokens in present-day spoken Pashto. On the basis of these findings, the study assumes that the same phenomenon may increase with the passage of time, which would, in turn, result in the formation of more bilingual compound or hybrid words.

Keywords: code-mixing, bilingual compound words, pashto-english contact, hybrid words, inflectional lexical morpheme

Procedia PDF Downloads 216
10846 The Cultural and Semantic Danger of English Transparent Words Translated from English into Arabic

Authors: Abdullah Khuwaileh

Abstract:

While teaching and translating vocabulary is no longer a neglected area in ELT in general and in translation in particular, the psychology of its acquisition has been a neglected area. Our paper aims at exploring some of the learning and translating conditions under which vocabulary is acquired and translated properly. To achieve this objective, two teaching methods (experiments) were applied to 4 translators to measure their acquisition of a number of transparent vocabulary items. Some of these items were deliberately chosen from 'deceptively transparent words'. All the data, the sample, etc., were taken from Jordan University of Science and Technology (JUST) and Yarmouk University, where the researcher is employed. The study showed that translators might translate transparent words inaccurately, particularly if these words are uncontextualised. It was also shown that the morphological structures of words may lead translators or even EFL learners to misinterpret meaning.

Keywords: english, transparent, word, processing, translation

Procedia PDF Downloads 43
10845 The Repetition of New Words and Information in Mandarin-Speaking Children: A Corpus-Based Study

Authors: Jian-Jun Gao

Abstract:

Repetition is used for a variety of functions in conversation. When young children first learn to speak, they often repeat words from the adult’s recent utterance, and this repetition serves both learning and social functions. The objective of this study was to ascertain whether repetitions are equivalent in indicating attention to new words and the initial repeat of information in conversation. Based on the observation of naturally occurring language use in the Taiwan Corpus of Child Mandarin (TCCM), the results of this study provided empirical support for the previous finding that children are more likely to repeat new words they are offered than to repeat new information. As children get older, the repetition of both new words and new information drops.

Keywords: acquisition, corpus, mandarin, new words, new information, repetition

Procedia PDF Downloads 116
10844 Exploring the Power of Words: Domesticating the Competence/Competency Concept in Ugandan Organisations

Authors: John C. Munene, Florence Nansubuga

Abstract:

The study set out to examine a number of theories that have directly or indirectly implied that words are potent but that the potency depends on the context or practice in which they are utilised. The theories include the Freudian theory of Cathexis, which directly suggests that ambiguous events become potent when named, as does the word that is used to name them. We briefly examine Psychological Differentiation, which submits that ambiguity is often a result of failure to distinguish figure from ground. We investigate Prospecting Theory, which suggests that in a situation when people have to make decisions, they have the option to utilise intuition or reasoned judgment. It suggests that, more often than not, the tendency is to utilise intuition, especially when generic heuristics such as representativeness and similarity are available. The usage of these heuristics may depend on a lack of salience or accessibility of the situation due to ambiguity. We also examine Activity Theory, which proposes that the meaning of words emerges directly and dialectically from the activities in which they are used. The paper argues that the power of words will depend on any or all of the theories mentioned above. To examine this general proposition, we test the utilisation of a generic competence framework in a local setting. The assumption is that generic frameworks are inherently ambiguous and lack the potency normally associated with the competence concept in the management of human resources. A number of case studies provide initial supporting evidence for the general proposition.

Keywords: competence, meaning, operationalisation, power of words

Procedia PDF Downloads 364
10843 The Grammatical Dictionary Compiler: A System for Kartvelian Languages

Authors: Liana Lortkipanidze, Nino Amirezashvili, Nino Javashvili

Abstract:

The purpose of the grammatical dictionary is to provide information on the morphological and syntactic characteristics of the basic word in the dictionary entry. Electronic grammatical dictionaries are used as a tool for automated morphological analysis in text processing. The Georgian Grammatical Dictionary should contain grammatical information for each word: part of speech, type of declension/conjugation, grammatical forms of the word (paradigm), and alternative variants of the basic word/lemma. In this paper, we present a system for compiling the Georgian Grammatical Dictionary automatically. We propose dictionary-based methods for extending grammatical lexicons. The input lexicon contains only a small number of words with identical grammatical features. The extension is based on similarity measures between features of words; more precisely, we add to the extended lexicons words that are similar to those already in the grammatical dictionary. Our dictionaries are corpus-based, and for the compilation we introduce a method for the lemmatization of unknown words, i.e., words of which neither the full form nor the lemma is in the grammatical dictionary.

Keywords: acquisition of lexicon, Georgian grammatical dictionary, lemmatization rules, morphological processor

Procedia PDF Downloads 112
10842 A Word-to-Vector Formulation for Word Representation

Authors: Sandra Rizkallah, Amir F. Atiya

Abstract:

This work presents a novel word to vector representation that is based on embedding the words into a sphere, whereby the dot product of the corresponding vectors represents the similarity between any two words. Embedding the vectors into a sphere enabled us to take into consideration the antonymity between words, not only the synonymity, because of the suitability to handle the polarity nature of words. For example, a word and its antonym can be represented as a vector and its negative. Moreover, we have managed to extract an adequate vocabulary. The obtained results show that the proposed approach can capture the essence of the language, and can be generalized to estimate a correct similarity of any new pair of words.
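
The geometric idea in this abstract (unit vectors on a sphere, dot product as similarity, an antonym as the negated vector) can be shown with a small sketch. The two-dimensional vectors and the names `normalize`, `similarity`, `hot`, and `cold` are hypothetical illustrations; the abstract does not describe how the authors actually learn their embeddings.

```python
import math


def normalize(vector):
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def similarity(u, v):
    """Dot product of two unit vectors; ranges from -1 to 1."""
    return sum(a * b for a, b in zip(u, v))


# Toy example: a word and its antonym as a vector and its negative.
hot = normalize([3.0, 4.0])
cold = [-x for x in hot]
```

With this construction a word has similarity 1 with itself and -1 with its antonym, while unrelated words would sit near 0, which is how a sphere can encode antonymy as well as synonymy.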

Keywords: natural language processing, word to vector, text similarity, text mining

Procedia PDF Downloads 237
10841 Morphological Rules of Bangla Repetition Words for UNL Based Machine Translation

Authors: Nawab Yousuf Ali, S. Golam, A. Ameer, Ashok Toru Roy

Abstract:

This paper develops new morphological rules suitable for Bangla repetition words to be incorporated into an inter lingua representation called Universal Networking Language (UNL). The proposed rules are to be used to combine verb roots and their inflexions to produce words which are then combined with other similar types of words to generate repetition words. This paper outlines the format of morphological rules for different types of repetition words that come from verb roots based on the framework of UNL provided by the UNL centre of the Universal Networking Digital Language (UNDL) foundation.

Keywords: Universal Networking Language (UNL), universal word (UW), head word (HW), Bangla-UNL Dictionary, morphological rule, enconverter (EnCo)

Procedia PDF Downloads 280
10840 Learning on the Go: Practicing Vocabulary with Mobile Apps

Authors: Shoba Bandi-Rao

Abstract:

The lack of college readiness is one of the major contributors to low graduation rates at community colleges, especially among educationally and financially disadvantaged students. About 45% of underprepared high school graduates are required to complete ‘remedial’ reading/writing courses before they can begin taking college-level courses. Mobile apps present ‘bite-size’ learning materials that can be useful for practicing certain literacy skills, such as vocabulary learning. The convenience of mobile phones is ideal for a majority of students at community colleges who hold full or part-time jobs. Mobile apps allow students to learn during small ‘chunks’ of time available to them outside of the class—during subway commute, between classes, etc. Learning with mobile apps is a relatively new area in research, and their effectiveness for learning new words has been inconclusive. Using Mishra & Koehler’s TPCK theoretical framework, this study explored the effectiveness of the mobile app (Quizlet) for learning one hundred common college-level words in ‘remedial’ writing class over one semester. Each week, before coming to class, students studied a list of 10-15 words presented in context within sentences. Students came across these words in the article they read in class making their learning more meaningful. A pre and post-test measured the number of words students knew, learned and remembered. Statistical analysis shows that students performed better by 41% on the post-test indicating that the mobile app was helpful for learning words. Students also completed a short survey each week that sought to determine the amount of time students spent on the vocabulary app. A positive correlation was found between the amount of time spent on the mobile app and the number of words learned. 
The goal of this research is to capitalize on the convenience of smartphones to (1) better prepare students for college-level course work, and (2) contribute to the current literature on mobile learning.

Keywords: mobile learning, vocabulary learning, literacy skills, Quizlet

Procedia PDF Downloads 195
10839 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the volume of document data has been increasing with the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract mutually independent topics from large document data such as newspaper data. ITA extracts the independent topics from the document data by using Independent Component Analysis. A topic extracted by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence for one topic are as follows: Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user, because ITA cannot name topics. Therefore, in this research, we propose a method that uses a web search engine to obtain easily understood topic names for the sets of words produced by ITA. In particular, we search for a set of topical words, and the title of the homepage in the search result is taken as the topic name. We also apply the proposed method to some data and verify its effectiveness.

Keywords: independent topic analysis, topic extraction, topic naming, web search engine

Procedia PDF Downloads 95
10838 Pudhaiyal: A Maze-Based Treasure Hunt Game for Tamil Words

Authors: Aarthy Anandan, Anitha Narasimhan, Madhan Karky

Abstract:

Word-based games are popular for helping people improve their vocabulary skills. Games like ‘word search’ and crosswords provide a smart way of increasing vocabulary skills. Word search games are fun to play and also educational, which helps in learning a language. Finding the words in a word search puzzle helps the player to remember words more easily, and it also helps the player learn their spellings. In this paper, we present a tile distribution algorithm for a Maze-Based Treasure Hunt Game, ‘Pudhaiyal’, for Tamil words, which describes how words can be distributed horizontally, vertically or diagonally in a 10 x 10 grid. Along with the tile distribution algorithm, we also present an algorithm for the scoring model of the game. The proposed game has been tested with 20,000 Tamil words.
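
One way to picture distributing words horizontally, vertically or diagonally in a 10 x 10 grid is the sketch below. It is not the authors' tile distribution algorithm, which the abstract does not specify: the random-retry strategy, the `attempts` limit, and the names `can_place`, `place`, and `distribute` are all assumptions. Each word is passed as a list of tiles, on the assumption that a Tamil letter may span several code points and so should be split into tiles beforehand.

```python
import random

SIZE = 10
# (row step, column step): horizontal, vertical, diagonal.
DIRECTIONS = [(0, 1), (1, 0), (1, 1)]


def can_place(grid, tiles, row, col, dr, dc):
    """True if the word fits in bounds and only overlaps identical tiles."""
    for i, tile in enumerate(tiles):
        r, c = row + i * dr, col + i * dc
        if not (0 <= r < SIZE and 0 <= c < SIZE):
            return False
        if grid[r][c] not in (None, tile):
            return False
    return True


def place(grid, tiles, row, col, dr, dc):
    """Write the word's tiles into the grid along the given direction."""
    for i, tile in enumerate(tiles):
        grid[row + i * dr][col + i * dc] = tile


def distribute(words, rng, attempts=200):
    """Try random positions and directions for each word (assumed strategy)."""
    grid = [[None] * SIZE for _ in range(SIZE)]
    placed = []
    for tiles in words:
        for _ in range(attempts):
            dr, dc = rng.choice(DIRECTIONS)
            row, col = rng.randrange(SIZE), rng.randrange(SIZE)
            if can_place(grid, tiles, row, col, dr, dc):
                place(grid, tiles, row, col, dr, dc)
                placed.append(tiles)
                break
    return grid, placed
```

A call such as `distribute(word_tiles, random.Random(0))` returns the filled grid and the words that found a slot; empty cells would still need filler tiles before the puzzle is shown to a player.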

Keywords: Pudhaiyal, Tamil word game, word search, scoring, maze, algorithm

Procedia PDF Downloads 409
10837 The Development of Space-Time and Space-Number Associations: The Role of Non-Symbolic vs. Symbolic Representations

Authors: Letizia Maria Drammis, Maria Antonella Brandimonte

Abstract:

The idea that people use space representations to think about time and number received support from several lines of research. However, how these representations develop in children and then shape space-time and space-number mappings is still a debated issue. In the present study, 40 children (20 pre-schoolers and 20 elementary-school children) performed 4 main tasks, which required the use of more concrete (non-symbolic) or more abstract (symbolic) space-time and space-number associations. In the non-symbolic conditions, children were required to order pictures of everyday-life events occurring in a specific temporal order (Temporal sequences) and of quantities varying in numerosity (Numerical sequences). In the symbolic conditions, they were asked to perform the typical time-to-position and number-to-position tasks by mapping time-related words and numbers onto lines. Results showed that children performed reliably better in the non-symbolic Time conditions than the symbolic Time conditions, independently of age, whereas only pre-schoolers performed worse in the Number-to-position task (symbolic) as compared to the Numerical sequence (non-symbolic) task. In addition, only older children mapped time-related words onto space following the typical left-right orientation, pre-schoolers’ performance being somewhat mixed. In contrast, mapping numbers onto space showed a clear left-right orientation, independently of age. Overall, these results indicate a cross-domain difference in the way younger and older children process time and number, with time-related tasks being more difficult than number-related tasks only when space-time tasks require symbolic representations.

Keywords: space-time associations, space-number associations, orientation, children

Procedia PDF Downloads 308
10836 A Linguistic Analysis of the Inconsistencies in the Meaning of Some -er Suffix Morphemes

Authors: Amina Abubakar

Abstract:

English, like any other language, is rich in arbitrary, conventional symbols, which give rise to many inconsistencies in spelling, phonology, syntax, and morphology. The research examines the irregularities prevalent in the structure and meaning of some ‘er’ lexical items in English and their implications for vocabulary acquisition. It centers its investigation on the derivational suffix ‘er’, which changes the grammatical category of a word. The English language poses many challenges to Second Language Learners because of its irregularities, exceptions, and rules. One of the meanings of the –er derivational suffix is someone or somebody who does something. This rule often confuses learners when they meet exceptions in normal discourse. The need to investigate instances of such inconsistencies in the formation of –er words and the meanings given to such words by students motivated this study. For this purpose, some senior secondary two (SS2) students in six randomly selected schools in the metropolis were given a large number of alphabetically selected words ending in the ‘er’ suffix. The researcher opted for a test technique, which required the students to provide the meaning of the selected –er words. The test was scored on a scale of 1-0, where a correct formation of an –er word and its meaning scored one, while a wrong formation and meaning scored zero. The numbers of wrong and correct formations of –er word meanings were calculated as percentages. The result of this research shows that a large number of students made wrong generalizations about the meaning of the selected –er ending words. This shows how enormous the inconsistencies in the English language are and how they affect the learning of English. Findings from the study revealed that, although students have mastered the basic morphological rules, errors are generally committed on vocabulary items that are not frequently in use.
The study arrives at this conclusion from a survey of the students' textbooks and their spoken activities. Therefore, the researcher recommends an effective reappraisal of language teaching through implementation of the designed curriculum to reflect modern strategies of teaching language; identification and incorporation of the exceptions in rigorous communicative activities in language teaching, language course books, and tutorials; and training and retraining of teachers in strategies that conform to the new pedagogy.

Keywords: ESL(English as a second language), derivational morpheme, inflectional morpheme, suffixes

Procedia PDF Downloads 341
10835 Evaluation of Persian Medical Terms Compatibility with International Naming Criteria Based on the Applied Translation Procedures

Authors: Ali Akbar Zeinali

Abstract:

Lack of appropriate equivalences for the terms or technical words is the result of ineffective translation guidelines adopted in the translation processes. The increasing number of foreign words and specific terms incorporated into the native language are due to the ongoing development of technology and science. Many problems appear in medical translation when the Persian translators try to employ non-Persian or imported words in medical texts, in which multiple equivalents may be created for one particular word based on the individual preferences of authors and translators in the target language due to lack of standardization. The study attempted to discuss the findings based on the compatibility of the international naming criteria, considering the translation procedures. About 67% of 339 equivalents under this study were grouped as incompatible words while about 33% of them were compatible terms. The similarities and differences were investigated and discussed according to the compatibility status of the equivalents with Sager’s criteria. Such equivalents have been classified into several groups through bi-dimensional descriptions that were different features of translation procedures related to the international naming criteria. In review of the frequency distribution of compatibilities, the equivalents were divided into two categories of compatibles and incompatibles, indicating the effectiveness of the applied translation procedures.

Keywords: linguistics, medical translation, naming, terminology

Procedia PDF Downloads 96
10834 Intensifier as Changed from the Impolite Word in Thai

Authors: Methawee Yuttapongtada

Abstract:

An intensifier is a linguistic device found in many languages that strengthens the quantity, quality, or emotion expressed by a word. Languages have both similar and dissimilar intensifying devices. Thai in particular uses a wide variety of them, one of which is the use of an impolite word, or a word that once meant something negative, as an intensifier. The data for this study were collected from spoken-style language and consist of intensifiers regarded as impolite words, because in other contexts these words are treated as rude words, swear words, or words with negative meaning. The words were then traced back through time in order to examine their historical change. The original meanings and the contexts of use from the past to the present were described using textual documents and dictionaries available from different periods. In terms of semantics and pragmatics, subjectification was found to be a significant motivation for the change of impolite words into intensifiers, and it accounts for the pathway of semantic change of these words. Moreover, the use of impolite words, or words that once meant something negative, as intensifiers tends to be increasing. This phenomenon is commonly found in many languages, and the results of this research may support the belief that human language is universal and reflects that humans share the same fundamental ways of thinking.

Keywords: impolite word, intensifier, Thai, semantic change

Procedia PDF Downloads 148
10833 Determining Optimal Number of Trees in Random Forests

Authors: Songul Cinaroglu

Abstract:

Background: Random Forest is an efficient, multi-class machine learning method used for classification, regression and other tasks. The method operates by constructing each tree from a different bootstrap sample of the data. Determining the number of trees in random forests is an open question in the literature on improving the classification performance of random forests. Aim: The aim of this study is to analyze whether there is an optimal number of trees in Random Forests and how the performance of Random Forests differs as the number of trees increases, using sample health data sets in the R programme. Method: In this study we analyzed the performance of Random Forests as the number of trees grows, doubling the number of trees at every iteration, using the “random forest” package in the R programme. To determine the minimum and optimal number of trees we used the McNemar test and the Area Under the ROC Curve, respectively. Results: The analysis found that a growing number of trees does not always mean that the forest performs better than forests with fewer trees. In other words, a larger number of trees increases computational costs without necessarily increasing performance. Conclusion: Although the general practice with random forests is to generate a large number of trees to obtain high performance, this study shows that increasing the number of trees doesn’t always improve performance. Future studies can compare different kinds of data sets and different performance measures to test whether Random Forest performance changes as the number of trees increases.
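
The doubling procedure and the McNemar comparison mentioned in this abstract can be sketched in plain Python (the study itself used R). The names `doubling_schedule` and `mcnemar_statistic` are hypothetical, and the continuity-corrected form of the statistic is an assumption, since the abstract does not say which variant was applied. The inputs to the test are per-case correctness flags from two forests of different sizes evaluated on the same cases.

```python
def doubling_schedule(start, limit):
    """Forest sizes obtained by doubling the tree count at every iteration."""
    sizes = []
    n_trees = start
    while n_trees <= limit:
        sizes.append(n_trees)
        n_trees *= 2
    return sizes


def mcnemar_statistic(correct_a, correct_b):
    """Continuity-corrected McNemar chi-square for two paired classifiers.

    correct_a[i] and correct_b[i] say whether forest A and forest B
    classified case i correctly. Only discordant cases matter.
    """
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    if only_a + only_b == 0:
        return 0.0  # the two forests never disagree
    return (abs(only_a - only_b) - 1) ** 2 / (only_a + only_b)
```

A statistic above 3.841 (the chi-square critical value for one degree of freedom at the 0.05 level) would suggest that the two forest sizes differ in accuracy; once consecutive sizes in the schedule stop differing, adding trees mostly adds computation.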

Keywords: classification methods, decision trees, number of trees, random forest

Procedia PDF Downloads 370
10832 Hybrid SVM/DBN Model for Arabic Isolated Words Recognition

Authors: Elyes Zarrouk, Yassine Benayed, Faiez Gargouri

Abstract:

This paper presents a new hybrid model for isolated Arabic word recognition. To do this, we apply a Support Vector Machine (SVM) as an estimator of posterior probabilities within Dynamic Bayesian Networks (DBN). The paper presents a comparative study of DBN and SVM/DBN systems for multi-dialect isolated Arabic words. Performance using SVM/DBN is found to exceed that of DBNs trained on an identical task, giving higher recognition accuracy for four different Arabic dialects. The average recognition rate for the four dialects was 87.67% with SVM/DBN, compared with 83.01% with DBN.

Keywords: dynamic Bayesian networks, hybrid models, supports vectors machine, Arabic isolated words

Procedia PDF Downloads 529
10831 Formation of Blends in Hausa Language

Authors: Maryam Maimota Shehu

Abstract:

Words are the basic building blocks of a language. In everyday usage of a language, words are used, and new words are formed and reformed to accommodate all entities, phenomena, qualities and every aspect of life. Although many studies have been conducted on morphological processes in the Hausa language, most of them concentrated on borrowing, affixation, reduplication and derivation, and blending has been neglected to the extent that some Hausa linguists claim that blending does not exist in the language. Therefore, the current study investigates blending as one of the word formation processes in the language. The study focuses on blending as a word-formation process and on how this process is used in the formation of words in the Hausa language. To achieve these aims, the research answered the following questions: 1) Is blending used as a process of word formation in Hausa? 2) What are the words formed using this process? This study utilizes the Natural Morphology Theory proposed by Dressler (1985), which was adopted by Belly (2007). The data for this study were collected from newspaper articles, novels, and written literature in the Hausa language. The study found that there are new kinds of words formed by blending in the Hausa language, which previous studies either did not reveal or did not explain in detail. The findings also show that some of the words change their grammatical class and meaning when blended.

Keywords: morphology, word formation, blending in hausa language, language

Procedia PDF Downloads 373
10830 The Power of Words: A Corpus Analysis of Campaign Speeches of President Donald J. Trump

Authors: Aiza Dalman

Abstract:

Words are powerful when they are used wisely and strategically. In this study, twelve (12) campaign speeches of President Donald J. Trump were analyzed for frequently used words and for the use of ethos, pathos, and logos. The speeches were read thoroughly, analyzed, and interpreted. Using the Word Counter Tool and Text Analyzer software accessible online, it was found that the word ‘will’ has the highest frequency at 121, followed by Hillary (58), American (38), going (35), plan and Clinton (32), illegal (30), government (28), corruption (26), and criminal (24). When the speeches were analyzed for ethos, pathos, and logos, it was found that all three were employed. The statements in these categories were directed either against Hillary or in the speaker's own favor. The distinctive strategy of President Donald J. Trump in using frequent words together with ethos, pathos, and logos to persuade people may have led the way to his victory.
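As a rough illustration of the kind of frequency count reported in this abstract, the sketch below tallies word tokens with Python's standard library. It is not the Word Counter Tool or Text Analyzer software the study used; the function name `top_words`, the tokenization rule, and the sample sentence are all hypothetical choices made for this example.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent lowercase word tokens in text.

    Tokenization is an assumption of this sketch: a token is any run of
    ASCII letters or apostrophes, after lowercasing.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(n)


# Hypothetical sample text, not taken from the analyzed speeches.
sample = "We will win. We will build. They will not."
print(top_words(sample, 2))
```

A real corpus study would usually also decide how to treat stop words and proper names (for example, whether 'Hillary' and 'Clinton' count as one referent), which this sketch leaves out.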

Keywords: campaign speeches, corpus analysis, ethos, logos and pathos, power of words

Procedia PDF Downloads 240
10829 The Development of Chinese-English Homophonic Word Pairs Databases for English Teaching and Learning

Authors: Yuh-Jen Wu, Chun-Min Lin

Abstract:

Homophonic words are common in Mandarin Chinese, which is a tonal language. Using homophonic cues to study foreign languages is a mnemonic learning technique that can aid the retention and retrieval of information in human memory. When learning difficult foreign words, some learners map them onto words in a language they are familiar with to build an association and strengthen working memory. These phonological clues are a beneficial means for novice language learners. In the classroom, if mnemonic skills are used at the appropriate time in the instructional sequence, they may achieve their maximum effectiveness. For Chinese-speaking students, proper use of Chinese-English homophonic word pairs may help them learn difficult vocabulary. In this study, a database program was developed using Visual Basic. The database contains two corpora, one with Chinese lexical items and the other with English ones. The Chinese corpus contains 59,053 Chinese words that were collected by a web crawler. The pronunciations of these words are compared with those of words in an English corpus based on WordNet, a lexical database for the English language. Words in both databases with similar pronunciation chunks and batches are detected. A total of approximately 1,000 Chinese lexical items were located in the preliminary comparison. These homophonic word pairs can serve as a valuable tool to assist Chinese-speaking students in learning and memorizing new English vocabulary.

Keywords: Chinese, corpus, English, homophonic words, vocabulary

Procedia PDF Downloads 145
10828 Network Word Discovery Framework Based on Sentence Semantic Vector Similarity

Authors: Ganfeng Yu, Yuefeng Ma, Shanliang Yang

Abstract:

New word discovery is a key problem in text information retrieval technology. Methods for new word discovery tend to be closely tied to words, because they generally obtain new word results by analyzing words. With the popularity of social networks, individual netizens and online self-media have generated various network texts for the convenience of online life, including network words that are far from standard Chinese expression. How to detect network words is one of the important goals in the field of text information retrieval today. In this paper, we integrate a word embedding model and clustering methods to propose a network word discovery framework based on sentence semantic similarity (S³-NWD) that detects network words effectively from a corpus. The framework constructs sentence semantic vectors through a distributed representation model, uses the similarity of sentence semantic vectors to determine the semantic relationship between sentences, and finally realizes network word discovery by means of semantic replacement between sentences. The experiment verifies that the framework not only discovers network words rapidly but also recovers the standard-word meaning of the discovered network words, which reflects the effectiveness of our work.
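To make the idea of comparing sentence semantic vectors concrete, here is a minimal sketch under stated assumptions. The abstract does not say how S³-NWD builds or compares its vectors; this example assumes a sentence vector is the average of its word vectors and that similarity is cosine similarity. The function names and the toy two-dimensional embeddings are hypothetical.

```python
import math


def sentence_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding.

    Returns None when no token is covered by the embedding table.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy embeddings, invented for illustration only.
toy = {"good": [1.0, 0.0], "great": [0.9, 0.1], "bad": [-1.0, 0.0]}
s1 = sentence_vector(["good", "great"], toy)
s2 = sentence_vector(["bad"], toy)
print(cosine_similarity(s1, s2))
```

Under this assumption, two sentences with high similarity that differ in a single token would be candidates for the semantic replacement step the abstract mentions, with the unfamiliar token treated as a possible network word.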

Keywords: text information retrieval, natural language processing, new word discovery, information extraction

Procedia PDF Downloads 60
10827 Optimized Text Summarization Model on Mobile Screens for Sight-Interpreters: An Empirical Study

Authors: Jianhua Wang

Abstract:

To obtain key information quickly from long texts on the small screens of mobile devices, sight-interpreters need an optimized summarization model for fast information retrieval. Four summarization models based on previous studies were examined: title+key words (TKW), title+topic sentences (TTS), key words+topic sentences (KWTS), and title+key words+topic sentences (TKWTS). Psychological experiments were conducted on the four models for three different genres of interpreting texts to establish the optimized summarization model for sight-interpreters. This empirical study shows that the optimized summarization model for sight-interpreters to quickly grasp the key information of the texts they interpret is title+key words (TKW) for cultural texts, title+key words+topic sentences (TKWTS) for economic texts, and topic sentences+key words (TSKW) for political texts.

Keywords: different genres, mobile screens, optimized summarization models, sight-interpreters

Procedia PDF Downloads 281
10826 Made-in-Japan English and the Negative Impact on English Language Learning

Authors: Anne Crescini

Abstract:

The number of loanwords borrowed into the Japanese language has been increasing rapidly in recent years, and many linguists argue that loanwords make up more than 10% of the Japanese lexicon. While these loanwords come from various Western languages, 80%-90% are borrowed from English. There is also a separate group of words and phrases categorized as ‘Japanese English’. These made-in-Japan linguistic creations may look and sound like English, but in fact they are not used by native speakers and are often incomprehensible to them. Linguistically, the important thing to remember is that these terms are not English but 100% Japanese words. A problem arises in language teaching, however, when Japanese learners of English are unable to distinguish authentic loans from Japanese English terms. This confusion could greatly impede language acquisition and communication. The goal of this paper is to determine to what degree this potential misunderstanding may interfere with communication. Native English speakers living in the United States were interviewed and shown a list of romanized Japanese English terms that are both commonly used and often mistaken for authentic loans. Then, the words were put into the context of a sentence in order to ascertain whether context aided comprehension in any way. The results showed that while some terms are understood on their own, and others are understood better in context, a large number of the terms are entirely incomprehensible to native English speakers. If that is the case, and a Japanese learner mistakes a Japanese English term for an authentic loan, a communication breakdown may occur during interaction in English. With the ever-increasing presence of both groups of terms in the Japanese language, it is more important than ever that teaching professionals address this topic in the language classroom.

Keywords: Japanese, Japanese English, language acquisition, loanwords

Procedia PDF Downloads 190
10825 Reasons for Language Words in the Quran and Literary Approaches That Are Persian

Authors: Fateme Mazbanpoor, Sayed Mohammad Amiri

Abstract:

In this article, we examine the Persian words in the Quran and study the reasons for their presence in this holy book. The writers of this paper extracted about 70 Persian words from the Quran by referring to resources (Alalfaz ol Moarab ol Farsieh Edishir, Almoarabol Javalighi, Almahzab va Etghan Seuti; Vocabulary involved in Quran Arthur Jeffry; etc.). Some of these words are: ‘Abarigh’, ‘Estabragh’, ‘Barzakh’, ‘Din’, ‘Zamharir’, ‘Sondos’, ‘Sejil’, ‘Namaregh’, ‘Fil’, etc. These Persian words entered Arabic, and finally the Quran, in two ways: 1) directly from the Persian language, 2) via other languages. The first way: because of Iranian dominance over Hira, Yemen, the whole of Oman, and Bahrein in the Sasanian period, there were political, religious, linguistic, literary, and trade ties with these Arab territories, which caused Persian to influence Arabic and gave way to many Persian loanwords in Arabic in this period. The second way: since the geographical and business conditions of the areas were dominated by Iran, Hejaz had many dealings and much trade with Mesopotamia and Yemen. In addition, Arabic, which was a relatively young language at that time, was influenced by Semitic languages in expanding its vocabulary (Syrian and Aramaic were themselves influenced by the languages of Iran). Consequently, due to the long relationship between Iranians and Arabs, some of the Persian words took longer routes, through Aramaic and Syrian, to find their way into the Quran.

Keywords: Quran, Persian word, Arabic language, Persian

Procedia PDF Downloads 434
10824 Everyday-Life Vocabulary: A Missing Component in Iranian EFL Context

Authors: Yasser Aminifard, Hamdollah Askari

Abstract:

This study investigated whether Iranian senior high school students perform differently on Academic Words (AWs) and Everyday-Life Words (ELWs). To this end, in the first phase, 120 male senior high school students were randomly selected from twelve high schools in Gachsaran to serve as the participants of the study. In the second phase, using purposive sampling, six high school teachers holding an MA in TEFL and with over twenty years of teaching experience were interviewed. Two multiple-choice tests, each comprising 40 items, were given to the participants to determine their performance on AWs and ELWs, and follow-up semi-structured interviews were conducted to explore the teachers' opinions about the participants' performance on the two tests. To analyze the data, a paired-samples t-test was carried out to compare the results of the two tests, and the interviews were transcribed to pinpoint important themes. The results of the t-test indicated that the participants performed significantly better on AWs than on ELWs. Additionally, the interviews indicated that the English textbooks designed for Iranian high school students are fundamentally flawed, on the grounds that there is a mismatch between students' real language learning needs and what is presented to them as "teaching-to-the-test" materials via these books. Finally, the implications and suggestions for further research are discussed.
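For readers unfamiliar with the paired-samples t-test mentioned in this abstract, the following minimal sketch computes the test statistic from two score lists taken from the same participants. The function name `paired_t_statistic` and the example scores are hypothetical and are not the study's data; the statistic is the mean of the per-participant differences divided by its standard error.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return the paired-samples t statistic for two equal-length score lists.

    t = mean(d) / (stdev(d) / sqrt(n)), where d are the pairwise differences.
    Raises ZeroDivisionError if all differences are identical (stdev is 0).
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))


# Hypothetical scores for three participants on two tests.
academic = [5, 7, 9]
everyday = [4, 5, 6]
print(paired_t_statistic(academic, everyday))
```

The statistic is then compared against a t distribution with n - 1 degrees of freedom to obtain a p-value, a step usually delegated to a statistics package.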

Keywords: everyday-life words, academic words, textbooks, washback

Procedia PDF Downloads 429
10823 Literary Words of Foreign Origin as Social Markers in Jeffrey Archer's Novels Speech Portrayals

Authors: Tatiana Ivushkina

Abstract:

The paper studies the use of literary words of foreign origin in modern fiction from a sociolinguistic point of view. This presupposes establishing a correlation between this category of words in a speech portrayal or narrative and the social status of the speaker, and verifying that the category bears social implications and serves as a social marker or index of socially privileged identity in British literature of the 21st century. To this end, literary words of foreign origin were selected in context (60 contexts) and subjected to careful examination. The study is carried out on two novels by Jeffrey Archer, Not a Penny More, Not a Penny Less and A Prisoner of Birth. Archer, an Oxford graduate, represents the socially privileged classes himself and gives a wide depiction of characters with different social backgrounds and statuses. The analysis of the novels enabled us to categorize the selected words into four relevant groups. The first, represented by terms (commodity, debenture, recuperation, syringe, luminescence, umpire, etc.), serves to unambiguously indicate the education, occupation, or field of knowledge in which a character is involved, or a situation of communication. The second group is formed of words used in conjunction with their Germanic counterparts (perspiration – sweat, padre – priest, convivial – friendly) to contrast the social positions of the characters: literary words serve as social indices of upper-class speakers, whereas their synonyms of Germanic origin characterize middle- or lower-class speech portrayals. The third group comprises socially marked words (verbs, nouns, and adjectives), or U-words (the term first coined by Allan Ross and Nancy Mitford), which acquired their status in the course of social history (elegant, excellent, sophistication, authoritative, preposterous, etc.). The fourth includes words used in a humorous or ironic sense to convey the narrator’s attitude to the characters or to the situation itself (ministrations, histrionic, etc.). Words of this group are perceived as 'alien' and stylistically distant, as they create incongruity between style and subject matter. The social implication of the selected words is enhanced by the French words and phrases that often accompany them.

Keywords: British literature of the XXI century, literary words of foreign origin, social context, social meaning

Procedia PDF Downloads 105
10822 Morphological Analysis of Manipuri Language: Wahei-Neinarol

Authors: Y. Bablu Singh, B. S. Purkayashtha, Chungkham Yashawanta Singh

Abstract:

Morphological analysis forms the basic foundation of NLP applications, including syntax parsing, Machine Translation (MT), Information Retrieval (IR), and automatic indexing, in all languages. It is a field of linguistics, and it can provide valuable information for computer-based linguistic tasks such as lemmatization and the study of the internal structure of words. Computational morphology is the application of morphological rules in the field of computational linguistics and is an emerging area in AI; it studies the structure of words, which are formed by combining smaller units of linguistic information called morphemes, the building blocks of words. Morphological analysis provides information about the semantic and syntactic roles of words in a sentence. The analyzer described here analyzes Manipuri word forms and produces grammatical information associated with the words. The Morphological Analyzer for Manipuri has been tested on 3500 Manipuri words in Shakti Standard Format (SSF) using Meitei Mayek as the source; an accuracy of 80% was obtained on a manual check.

Keywords: morphological analysis, machine translation, computational morphology, information retrieval, SSF

Procedia PDF Downloads 299
10821 Preferred Character Size for Oblique Angles

Authors: Photjanat Phimnom, Haruetai Lohasiriwat

Abstract:

In today’s world, LED displays are used to present visual information under various circumstances. Such information is an important intermediary in human information processing. Researchers have investigated diverse factors that influence the effectiveness of this process. Letter size is undoubtedly one major factor that has been tested and recommended by many standards and guidelines. However, these typically assume that the display is viewed from a direct perpendicular position, whereas many actual situations require viewing from an angle. The current research studies the effect of oblique viewing angle and viewing distance on the ability to recognize single letters of the alphabet, numbers, and English words. A total of ten participants volunteered for our 3 x 4 x 4 within-subject study. The independent variables were three distance levels (2, 6, and 12 m), four oblique angles (0, 45, 60, and 75 degrees), and four target types (alphabet, number, short words, and long words). Following the method of constant stimuli, we found that a larger oblique angle, ranging from 0 to 75 degrees from the line of sight, results in a significantly higher legibility threshold, that is, a larger required font size (p-value < 0.05). Viewing distance also has a significant effect on the threshold (p-value < 0.05). However, the effect of distance is expected to be confounded by the quality of the screen used in our experiment. Lastly, our results show that single letters and single numbers are recognized at a significantly lower threshold (smaller font size) than both short and long words (p-value < 0.05). Therefore, it is recommended that, when designing information to be presented on an LED display, all possible ranges of oblique angle be taken into account in order to specify the preferred letter size. Additionally, the recommended letter size for 100% readability in our tested conditions is provided in the paper.

Keywords: letter size, oblique angle, viewing distance, legibility threshold

Procedia PDF Downloads 359