Search results for: natural language grammar models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 14576

Search results for: natural language grammar models

14456 Working Memory and Phonological Short-Term Memory in the Acquisition of Academic Formulaic Language

Authors: Zhicheng Han

Abstract:

This study examines the correlation between knowledge of formulaic language, working memory (WM), and phonological short-term memory (PSTM) in Chinese L2 learners of English. This study investigates if WM and PSTM correlate differently to the acquisition of formulaic language, which may be relevant for the discourse around the conceptualization of formulas. Connectionist approaches have lead scholars to argue that formulas are form-meaning connections stored whole, making PSTM significant in the acquisitional process as it pertains to the storage and retrieval of chunk information. Generativist scholars, on the other hand, argued for active participation of interlanguage grammar in the acquisition and use of formulaic language, where formulas are represented in the mind but retain the internal structure built around a lexical core. This would make WM, especially the processing component of WM an important cognitive factor since it plays a role in processing and holding information for further analysis and manipulation. The current study asked L1 Chinese learners of English enrolled in graduate programs in China to complete a preference raking task where they rank their preference for formulas, grammatical non-formulaic expressions, and ungrammatical phrases with and without the lexical core in academic contexts. Participants were asked to rank the options in order of the likeliness of them encountering these phrases in the test sentences within academic contexts. Participants’ syntactic proficiency is controlled with a cloze test and grammar test. Regression analysis found a significant relationship between the processing component of WM and preference of formulaic expressions in the preference ranking task while no significant correlation is found for PSTM or syntactic proficiency. The correlational analysis found that WM, PSTM, and the two proficiency test scores have significant covariates. However, WM and PSTM have different predictor values for participants’ preference for formulaic language. Both storage and processing components of WM are significantly correlated with the preference for formulaic expressions while PSTM is not. These findings are in favor of the role of interlanguage grammar and syntactic knowledge in the acquisition of formulaic expressions. The differing effects of WM and PSTM suggest that selective attention to and processing of the input beyond simple retention play a key role in successfully acquiring formulaic language. Similar correlational patterns were found for preferring the ungrammatical phrase with the lexical core of the formula over the ones without the lexical core, attesting to learners’ awareness of the lexical core around which formulas are constructed. These findings support the view that formulaic phrases retain internal syntactic structures that are recognized and processed by the learners.

Keywords: formulaic language, working memory, phonological short-term memory, academic language

Procedia PDF Downloads 26
14455 L2 Acquisition of Tense and Aspect by Cantonese and Mandarin ESL Learners of Different Proficiency Levels

Authors: Mable Chan

Abstract:

The present study about the acquisition of tense and aspect by Cantonese and Mandarin ESL learners aims to investigate the relationship between knowledge, the role that classroom input plays in the development of that knowledge, and learners' use of the L2 knowledge they acquire (i.e. their performance). Chinese has been argued as a tenseless language and Chinese ESL learners have to acquire the property from scratch. The study of acquisition of tense and aspect is a very fruitful research area in second language acquisition for a number of reasons. First, tense and aspect are notorious for being difficult for Chinese ESL learners. Second, to our knowledge, no studies have been done to compare Cantonese and Mandarin ESL learners and age effects in one single study. Data are now being collected and the findings from this comparison study of tense-aspect acquisition will shed light on both theoretical and pedagogical issues in second language acquisition, and contribute to a better understanding of both theoretical aspect concerning L2 acquisition of tense and aspect, and pedagogy of tense for L2 Chinese ESL learners.

Keywords: aspect, second language acquisition, tense, universal grammar

Procedia PDF Downloads 318
14454 Are Some Languages Harder to Learn and Teach Than Others?

Authors: David S. Rosenstein

Abstract:

The author believes that modern spoken languages should be equally difficult (or easy) to learn, since all normal children learning their native languages do so at approximately the same rate and with the same competence, progressing from easy to more complex grammar and syntax in the same way. Why then, do some languages seem more difficult than others? Perhaps people are referring to the written language, where it may be true that mastering Chinese requires more time than French, which in turn requires more time than Spanish. But this may be marginal, since Chinese and French children quickly catch up to their Spanish peers in reading comprehension. Rather, the real differences in difficulty derive from two sources: hardened L1 language habits trying to cope with contrasting L2 habits; and unfamiliarity with unique L2 characteristics causing faulty expectations. It would seem that effective L2 teaching and learning must take these two sources of difficulty into consideration. The author feels that the latter (faulty expectations) causes the greatest difficulty, making effective teaching and learning somewhat different for each given foreign language. Examples from Chinese and other languages are presented.

Keywords: learning different languages, language learning difficulties, faulty language expectations

Procedia PDF Downloads 508
14453 Generativism in Language Design and Their Effects on String of Constructions

Authors: Christian Uchechukwu Gilbert

Abstract:

Generativism in language design investigates the framework on which varying sentence structures are built in the English language. Propounded by Noam Chomsky in 1965, the theory transforms sentences from an active structure to a passive one by the application of established rules of the theory. Resident in the body of syntax, the rules include movement, insertion, substitution, and deletion rules. Using the movement rule, the analysis is armed with the qualitative research method, on which the works of scholars were duly consulted for more insight and in line with the academic practice in research activities. The investigation showed that the rules of competent grammar explain the formulation of sentences in a language and how transformation takes place among sentences from a deep structure to a surface structure with accurate results. The structural differences that could be got through dative movement and the deletion of the preposition; passivisation got from an active sentence by the insertion of the preposition “by” a “be verb” and the aspect tense marker “–en”, held as the creative aspect of language vocabulary and the subject-auxiliary inversion that exchanges the auxiliary of a sentence with the subject of the same sentence thereby transforming a kennel sentence to a polar question, viewed as an external argument under θ-theory. Generativism in language design, therefore, changes available types of sentences and relates one form of linguistic category with others in language design.

Keywords: language, generate, transformation, structure, design

Procedia PDF Downloads 24
14452 British English vs. American English: A Comparative Study

Authors: Halima Benazzouz

Abstract:

It is often believed that British English and American English are the foremost varieties of the English Language serving as reference norms for other varieties;that is the reason why they have obviously been compared and contrasted.Meanwhile,the terms “British English” and “American English” are used differently by different people to refer to: 1) Two national varieties each subsuming regional and other sub-varieties standard and non-standard. 2) Two national standard varieties in which each one is only part of the range of English within its own state, but the most prestigious part. 3) Two international varieties, that is each is more than a national variety of the English Language. 4) Two international standard varieties that may or may not each subsume other standard varieties.Furthermore,each variety serves as a reference norm for users of the language elsewhere. Moreover, without a clear identification, as primarily belonging to one variety or the other, British English(Br.Eng) and American English (Am.Eng) are understood as national or international varieties. British English and American English are both “variants” and “varieties” of the English Language, more similar than different.In brief, the following may justify general categories of difference between Standard American English (S.Am.E) and Standard British English (S.Br.e) each having their own sociolectic value: A difference in pronunciation exists between the two foremost varieties, although it is the same spelling, by contrast, a divergence in spelling may be recognized, eventhough the same pronunciation. In such case, the same term is different but there is a similarity in spelling and pronunciation. Otherwise, grammar, syntax, and punctuation are distinctively used to distinguish the two varieties of the English Language. Beyond these differences, spelling is noted as one of the chief sources of variation.

Keywords: Greek, Latin, French pronunciation expert, varieties of English language

Procedia PDF Downloads 469
14451 The Use of AI to Measure Gross National Happiness

Authors: Riona Dighe

Abstract:

This research attempts to identify an alternative approach to the measurement of Gross National Happiness (GNH). It uses artificial intelligence (AI), incorporating natural language processing (NLP) and sentiment analysis to measure GNH. We use ‘off the shelf’ NLP models responsible for the sentiment analysis of a sentence as a building block for this research. We constructed an algorithm using NLP models to derive a sentiment analysis score against sentences. This was then tested against a sample of 20 respondents to derive a sentiment analysis score. The scores generated resembled human responses. By utilising the MLP classifier, decision tree, linear model, and K-nearest neighbors, we were able to obtain a test accuracy of 89.97%, 54.63%, 52.13%, and 47.9%, respectively. This gave us the confidence to use the NLP models against sentences in websites to measure the GNH of a country.

Keywords: artificial intelligence, NLP, sentiment analysis, gross national happiness

Procedia PDF Downloads 70
14450 User Guidance for Effective Query Interpretation in Natural Language Interfaces to Ontologies

Authors: Aliyu Isah Agaie, Masrah Azrifah Azmi Murad, Nurfadhlina Mohd Sharef, Aida Mustapha

Abstract:

Natural Language Interfaces typically support a restricted language and also have scopes and limitations that naïve users are unaware of, resulting in errors when the users attempt to retrieve information from ontologies. To overcome this challenge, an auto-suggest feature is introduced into the querying process where users are guided through the querying process using interactive query construction system. Guiding users to formulate their queries, while providing them with an unconstrained (or almost unconstrained) way to query the ontology results in better interpretation of the query and ultimately lead to an effective search. The approach described in this paper is unobtrusive and subtly guides the users, so that they have a choice of either selecting from the suggestion list or typing in full. The user is not coerced into accepting system suggestions and can express himself using fragments or full sentences.

Keywords: auto-suggest, expressiveness, habitability, natural language interface, query interpretation, user guidance

Procedia PDF Downloads 452
14449 Creating Complementary Bi-Modal Learning Environments: An Exploratory Study Combining Online and Classroom Techniques

Authors: Justin P. Pool, Haruyo Yoshida

Abstract:

This research focuses on the effects of creating an English as a foreign language curriculum that combines online learning and classroom teaching in a complementary manner. Through pre- and post-test results, teacher observation, and learner reflection, it will be shown that learners can benefit from online programs focusing on receptive skills if combined with a communicative classroom environment that encourages learners to develop their productive skills. Much research has lamented the fact that many modern mobile assisted language learning apps do not take advantage of the affordances of modern technology by focusing only on receptive skills rather than inviting learners to interact with one another and develop communities of practice. This research takes into account the realities of the state of such apps and focuses on how to best create a curriculum that complements apps which focus on receptive skills. The research involved 15 adult learners working for a business in Japan simultaneously engaging in 1) a commercial online English language learning application that focused on reading, listening, grammar, and vocabulary and 2) a 15-week class focused on communicative language teaching, presentation skills, and mitigation of error aversion tendencies. Participants of the study experienced large gains on a standardized test, increased motivation and willingness to communicate, and asserted that they felt more confident regarding English communication. Moreover, learners continued to study independently at higher rates after the study than they had before the onset of the program. This paper will include the details of the program, reveal the improvement in test scores, share learner reflections, and critically view current evaluation models for mobile assisted language learning applications.

Keywords: adult learners, communicative language teaching, mobile assisted language learning, motivation

Procedia PDF Downloads 107
14448 Short Answer Grading Using Multi-Context Features

Authors: S. Sharan Sundar, Nithish B. Moudhgalya, Nidhi Bhandari, Vineeth Vijayaraghavan

Abstract:

Automatic Short Answer Grading is one of the prime applications of artificial intelligence in education. Several approaches involving the utilization of selective handcrafted features, graphical matching techniques, concept identification and mapping, complex deep frameworks, sentence embeddings, etc. have been explored over the years. However, keeping in mind the real-world application of the task, these solutions present a slight overhead in terms of computations and resources in achieving high performances. In this work, a simple and effective solution making use of elemental features based on statistical, linguistic properties, and word-based similarity measures in conjunction with tree-based classifiers and regressors is proposed. The results for classification tasks show improvements ranging from 1%-30%, while the regression task shows a stark improvement of 35%. The authors attribute these improvements to the addition of multiple similarity scores to provide ensemble of scoring criteria to the models. The authors also believe the work could reinstate that classical natural language processing techniques and simple machine learning models can be used to achieve high results for short answer grading.

Keywords: artificial intelligence, intelligent systems, natural language processing, text mining

Procedia PDF Downloads 113
14447 Transportation Language Register as One of Language Community

Authors: Diyah Atiek Mustikawati

Abstract:

Language register refers to a variety of a language used for particular purpose or in a particular social setting. Language register also means as a concept of adapting one’s use of language to conform to standards or tradition in a given professional or social situation. This descriptive study tends to discuss about the form of language register in transportation aspect, factors, also the function of use it. Mostly, language register in transportation aspect uses short sentences in form of informal register. The factor caused language register used are speaker, word choice, background of language. The functions of language register in transportations aspect are to make communication between crew easily, also to keep safety when they were in bad condition. Transportation language register developed naturally as one of variety of language used.

Keywords: language register, language variety, communication, transportation

Procedia PDF Downloads 443
14446 Articles, Delimitation of Speech and Perception

Authors: Nataliya L. Ogurechnikova

Abstract:

The paper aims to clarify the function of articles in the English speech and specify their place and role in the English language, taking into account the use of articles for delimitation of speech. A focus of the paper is the use of the definite and the indefinite articles with different types of noun phrases which comprise either one noun with or without attributes, such as the King, the Queen, the Lion, the Unicorn, a dimple, a smile, a new language, an unknown dialect, or several nouns with or without attributes, such as the King and Queen of Hearts, the Lion and Unicorn, a dimple or smile, a completely isolated language or dialect. It is stated that the function of delimitation is related to perception: the number of speech units in a text correlates with the way the speaker perceives and segments the denotation. The two following combinations of words the house and garden and the house and the garden contain different numbers of speech units, one and two respectively, and reveal two different perception modes which correspond to the use of the definite article in the examples given. Thus, the function of delimitation is twofold, it is related to perception and cognition, on the one hand, and, on the other hand, to grammar, if the subject of grammar is the structure of speech. Analysis of speech units in the paper is not limited by noun phrases and is amplified by discussion of peripheral phenomena which are nevertheless important because they enable to qualify articles as a syntactic phenomenon whereas they are not infrequently described in terms of noun morphology. With this regard attention is given to the history of linguistic studies, specifically to the description of English articles by Niels Haislund, a disciple of Otto Jespersen. A discrepancy is noted between the initial plan of Jespersen who intended to describe articles as a syntactic phenomenon in ‘A Modern English Grammar on Historical Principles’ and the interpretation of articles in terms of noun morphology, finally given by Haislund. Another issue of the paper is correlation between description and denotation, being a traditional aspect of linguistic studies focused on articles. An overview of relevant studies, given in the paper, goes back to the works of G. Frege, which gave rise to a series of scientific works where the meaning of articles was described within the scope of logical semantics. Correlation between denotation and description is treated in the paper as the meaning of article, i.e. a component in its semantic structure, which differs from the function of delimitation and is similar to the meaning of other quantifiers. The paper further explains why the relation between description and denotation, i.e. the meaning of English article, is irrelevant for noun morphology and has nothing to do with nominal categories of the English language.

Keywords: delimitation of speech, denotation, description, perception, speech units, syntax

Procedia PDF Downloads 217
14445 Text Similarity in Vector Space Models: A Comparative Study

Authors: Omid Shahmirzadi, Adam Lugowski, Kenneth Younge

Abstract:

Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to-patent similarity and compare TFIDF (and related extensions), topic models (e.g., latent semantic indexing), and neural models (e.g., paragraph vectors). Contrary to expectations, the added computational cost of text embedding methods is justified only when: 1) the target text is condensed; and 2) the similarity comparison is trivial. Otherwise, TFIDF performs surprisingly well in other cases: in particular for longer and more technical texts or for making finer-grained distinctions between nearest neighbors. Unexpectedly, extensions to the TFIDF method, such as adding noun phrases or calculating term weights incrementally, were not helpful in our context.

Keywords: big data, patent, text embedding, text similarity, vector space model

Procedia PDF Downloads 140
14444 A Comparative Analysis of Hyper-Parameters Using Neural Networks for E-Mail Spam Detection

Authors: Syed Mahbubuz Zaman, A. B. M. Abrar Haque, Mehedi Hassan Nayeem, Misbah Uddin Sagor

Abstract:

Everyday e-mails are being used by millions of people as an effective form of communication over the Internet. Although e-mails allow high-speed communication, there is a constant threat known as spam. Spam e-mail is often called junk e-mails which are unsolicited and sent in bulk. These unsolicited emails cause security concerns among internet users because they are being exposed to inappropriate content. There is no guaranteed way to stop spammers who use static filters as they are bypassed very easily. In this paper, a smart system is proposed that will be using neural networks to approach spam in a different way, and meanwhile, this will also detect the most relevant features that will help to design the spam filter. Also, a comparison of different parameters for different neural network models has been shown to determine which model works best within suitable parameters.

Keywords: long short-term memory, bidirectional long short-term memory, gated recurrent unit, natural language processing, natural language processing

Procedia PDF Downloads 178
14443 Parametric Models of Facade Designs of High-Rise Residential Buildings

Authors: Yuchen Sharon Sung, Yingjui Tseng

Abstract:

High-rise residential buildings have become the most mainstream housing pattern in the world’s metropolises under the current trend of urbanization. The facades of high-rise buildings are essential elements of the urban landscape. The skins of these facades are important media between the interior and exterior of high- rise buildings. It not only connects between users and environments, but also plays an important functional and aesthetic role. This research involves a study of skins of high-rise residential buildings using the methodology of shape grammar to find out the rules which determine the combinations of the facade patterns and analyze the patterns’ parameters using software Grasshopper. We chose a number of facades of high-rise residential buildings as source to discover the underlying rules and concepts of the generation of facade skins. This research also provides the rules that influence the composition of facade skins. The items of the facade skins, such as windows, balconies, walls, sun visors and metal grilles are treated as elements in the system of facade skins. The compositions of these elements will be categorized and described by logical rules; and the types of high-rise building facade skins will be modelled by Grasshopper. Then a variety of analyzed patterns can also be applied on other facade skins through this parametric mechanism. Using these patterns established in the models, researchers can analyze each single item to do more detail tests and architects can apply each of these items to construct their facades for other buildings through various combinations and permutations. The goal of these models is to develop a mechanism to generate prototypes in order to facilitate generation of various facade skins.

Keywords: facade skin, grasshopper, high-rise residential building, shape grammar

Procedia PDF Downloads 480
14442 Learner's Difficulties Acquiring English: The Case of Native Speakers of Rio de La Plata Spanish Towards Justifying the Need for Corpora

Authors: Maria Zinnia Bardas Hoffmann

Abstract:

Contrastive Analysis (CA) is the systematic comparison between two languages. It stems from the notion that errors are caused by interference of the L1 system in the acquisition process of an L2. CA represents a useful tool to understand the nature of learning and acquisition. Also, this particular method promises a path to un-derstand the nature of underlying cognitive processes, even when other factors such as intrinsic motivation and teaching strategies were found to best explain student’s problems in acquisition. CA study is justified not only from the need to get a deeper understanding of the nature of SLA, but as an invaluable source to provide clues, at a cognitive level, for those general processes involved in rule formation and abstract thought. It is relevant for cross disciplinary studies and the fields of Computational Thought, Natural Language processing, Applied Linguistics, Cognitive Linguistics and Math Theory. That being said, this paper intends to address here as well its own set of constraints and limitations. Finally, this paper: (a) aims at identifying some of the difficulties students may find in their learning process due to the nature of their specific variety of L1, Rio de la Plata Spanish (RPS), (b) represents an attempt to discuss the necessity for specific models to approach CA.

Keywords: second language acquisition, applied linguistics, contrastive analysis, applied contrastive analysis English language department, meta-linguistic rules, cross-linguistics studies, computational thought, natural language processing

Procedia PDF Downloads 113
14441 A Comparative Study of Approaches in User-Centred Health Information Retrieval

Authors: Harsh Thakkar, Ganesh Iyer

Abstract:

In this paper, we survey various user-centered or context-based biomedical health information retrieval systems. We present and discuss the performance of systems submitted in CLEF eHealth 2014 Task 3 for this purpose. We classify and focus on comparing the two most prevalent retrieval models in biomedical information retrieval namely: Language Model (LM) and Vector Space Model (VSM). We also report on the effectiveness of using external medical resources and ontologies like MeSH, Metamap, UMLS, etc. We observed that the LM based retrieval systems outperform VSM based systems on various fronts. From the results we conclude that the state-of-art system scores for MAP was 0.4146, P@10 was 0.7560 and NDCG@10 was 0.7445, respectively. All of these score were reported by systems built on language modeling approaches.

Keywords: clinical document retrieval, concept-based information retrieval, query expansion, language models, vector space models

Procedia PDF Downloads 290
14440 American Slang: Perception and Connotations – Issues of Translation

Authors: Lison Carlier

Abstract:

The English language that is taught in school or used in media nowadays is defined as 'standard English,' although unstandardized Englishes, or 'parallel' Englishes, are practiced throughout the world. The existence of these 'parallel' Englishes has challenged standardization by imposing its own specific vocabulary or grammar. These non-standard languages tend to be regarded as inferior and, therefore, pose a problem regarding their translation. In the USA, 'slanguage', or slang, is a good example of a 'parallel' language. It consists of a particular set of vocabulary, used mostly in speech, and rarely in writing. Qualified as vulgar, often reduced to an urban language spoken by young people from lower classes, slanguage – or the language that is often first spoken between youths – is still the most common language used in the English-speaking world. Moreover, it appears that the prime meaning of 'informal' (as in an informal language) – a language that is spoken with persons the speaker knows – has been put aside and replaced in the general mind by the idea of vulgarity and non-appropriateness, when in fact informality is a sign of intimacy, not of vulgarity. When it comes to translating American slang, the main problem a translator encounters is the image and the cultural background usually associated with this 'parallel' language. Indeed, one will have, unwillingly, a predisposition to categorize a speaker of a 'parallel' language as being part of a particular group of people. The way one sees a speaker using it is paramount, and needs to be transposed into the target language. This paper will conduct an analysis of American slang – its use, perception and the image it gives of its speakers – and its translation into French, using the novel Is Everyone Hanging Out Without Me? (and other concerns) by way of example. In her autobiography/personal essay book, comedy writer, actress and author Mindy Kaling speaks with a very familiar English, including slang, which participates in the construction of her own voice and style, and enables a deeper connection with her readers.

Keywords: translation, English, slang, French

Procedia PDF Downloads 293
14439 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception

Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu

Abstract:

Opinion mining (OM) is one of the natural language processing (NLP) problems to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it’s worth understanding social media and taking actions accordingly. OM comes to the fore here as the scale of the discussion about companies increases, and it becomes unfeasible to gauge opinion on individual levels. Thus, the companies opt to automize this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to a bag of n-gram representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. Transfer learning paradigm that has been commonly used in computer vision (CV) problems started to shape NLP approaches and language models (LM) lately. This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations that are obtained by training it on the large datasets using self-supervised learning objectives. The PTMs are further fine-tuned by a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, NER (Named-Entity Recognition), Question Answering (QA), and so forth. In this study, the traditional and modern NLP approaches have been evaluated for OM by using a sizable corpus belonging to a large private company containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT). The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. The BERT is a monolingual model in our case and transformers-based neural networks. It uses a masked language model and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction was not used since the experiments showed that their contribution to the model performance was found insignificant even though Turkish is a highly agglutinative and inflective language. The results show that usage of deep learning methods with pre-trained models and fine-tuning achieve about 11% improvement over SVM for OM. The BERT model achieved around 94% prediction accuracy while the MUSE model achieved around 88% and SVM did around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.

Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish

Procedia PDF Downloads 113
14438 Emotional Analysis for Text Search Queries on Internet

Authors: Gemma García López

Abstract:

The goal of this study is to analyze if search queries carried out in search engines such as Google, can offer emotional information about the user that performs them. Knowing the emotional state in which the Internet user is located can be a key to achieve the maximum personalization of content and the detection of worrying behaviors. For this, two studies were carried out using tools with advanced natural language processing techniques. The first study determines if a query can be classified as positive, negative or neutral, while the second study extracts emotional content from words and applies the categorical and dimensional models for the representation of emotions. In addition, we use search queries in Spanish and English to establish similarities and differences between two languages. The results revealed that text search queries performed by users on the Internet can be classified emotionally. This allows us to better understand the emotional state of the user at the time of the search, which could involve adapting the technology and personalizing the responses to different emotional states.

Keywords: emotion classification, text search queries, emotional analysis, sentiment analysis in text, natural language processing

Procedia PDF Downloads 116
14437 The Advancements of Transformer Models in Part-of-Speech Tagging System for Low-Resource Tigrinya Language

Authors: Shamm Kidane, Ibrahim Abdella, Fitsum Gaim, Simon Mulugeta, Sirak Asmerom, Natnael Ambasager, Yoel Ghebrihiwot

Abstract:

The call for natural language processing (NLP) systems for low-resource languages has become more apparent than ever in the past few years, with the arduous challenges still present in preparing such systems. This paper presents an improved dataset version of the Nagaoka Tigrinya Corpus for Parts-of-Speech (POS) classification system in the Tigrinya language. The size of the initial Nagaoka dataset was incremented, totaling the new tagged corpus to 118K tokens, which comprised the 12 basic POS annotations used previously. The additional content was also annotated manually in a stringent manner, followed similar rules to the former dataset and was formatted in CONLL format. The system made use of the novel approach in NLP tasks and use of the monolingually pre-trained TiELECTRA, TiBERT and TiRoBERTa transformer models. The highest achieved score is an impressive weighted F1-score of 94.2%, which surpassed the previous systems by a significant measure. The system will prove useful in the progress of NLP-related tasks for Tigrinya and similarly related low-resource languages with room for cross-referencing higher-resource languages.

Keywords: Tigrinya POS corpus, TiBERT, TiRoBERTa, conditional random fields

Procedia PDF Downloads 60
14436 Examining the Usefulness of an ESP Textbook for Information Technology: Learner Perspectives

Authors: Yun-Husan Huang

Abstract:

Many English for Specific Purposes (ESP) textbooks are distributed globally as the content development is often obliged to compromises between commercial and pedagogical demands. Therefore, the issue of regional application and usefulness of globally published ESP textbooks has received much debate. For ESP instructors, textbook selection is definitely a priority consideration for curriculum design. An appropriate ESP textbook can facilitate teaching and learning, while an inappropriate one may cause a disaster for both teachers and students. This study aims to investigate the regional application and usefulness of an ESP textbook for information technology (IT). Participants were 51 sophomores majoring in Applied Informatics and Multimedia at a university in Taiwan. As they were non-English majors, their English proficiency was mostly at elementary and elementary-to-intermediate levels. This course was offered for two semesters. The textbook selected was Oxford English for Information Technology. At class end, the students were required to complete a survey comprising five choices of Very Easy, Easy, Neutral, Difficult, and Very Difficult for each item. Based on the content design of the textbook, the survey investigated how the students viewed the difficulty of grammar, listening, speaking, reading, and writing materials of the textbook. In terms of difficulty, results reveal that only 22% of them found the grammar section difficult and very difficult. For listening, 71% responded difficult and very difficult. For general reading, 55% responded difficult and very difficult. For speaking, 56% responded difficult and very difficult. For writing, 78% responded difficult and very difficult. For advanced reading, 90% reported difficult and very difficult. These results indicate that, except the grammar section, more than half of the students found the textbook contents difficult in terms of listening, speaking, reading, and writing materials. Such contradictory results between the easy grammar section and the difficult four language skills sections imply that the textbook designers do not well understand the English learning background of regional ESP learners. For the participants, the learning contents of the grammar section were the general grammar level of junior high school, while the learning contents of the four language skills sections were more of the levels of college English majors. Implications from the findings are obtained for instructors and textbook designers. First of all, existing ESP textbooks for IT are few and thus textbook selections for instructors are insufficient. Second, existing globally published textbooks for IT cannot be applied to learners of all English proficiency levels, especially the low level. With limited textbook selections, third, instructors should modify the selected textbook contents or supplement extra ESP materials to meet the proficiency level of target learners. Fourth, local ESP publishers should collaborate with local ESP instructors who understand best the learning background of their students in order to develop appropriate ESP textbooks for local learners. Even though the instructor reduced learning contents and simplified tests in curriculum design, in conclusion, the students still found difficult. This implies that in addition to the instructor’s professional experience, there is a need to understand the usefulness of the textbook from learner perspectives.

Keywords: ESP textbooks, ESP materials, ESP textbook design, learner perspectives on ESP textbooks

Procedia PDF Downloads 311
14435 Exploring SL Writing and SL Sensitivity during Writing Tasks: Poor and Advanced Writing in a Context of Second Language other than English

Authors: Sandra Figueiredo, Margarida Alves Martins, Carlos Silva, Cristina Simões

Abstract:

This study integrates a larger research empirical project that examines second language (SL) learners’ profiles and valid procedures to perform complete and diagnostic assessment in schools. 102 learners of Portuguese as a SL aged 7 and 17 years speakers of distinct home languages were assessed in several linguistic tasks. In this article, we focused on writing performance in the specific task of narrative essay composition. The written outputs were measured using the score in six components adapted from an English SL assessment context (Alberta Education): linguistic vocabulary, grammar, syntax, strategy, socio-linguistic, and discourse. The writing processes and strategies in Portuguese language used by different immigrant students were analysed to determine features and diversity of deficits on authentic texts performed by SL writers. Differentiated performance was based on the diversity of the following variables: grades, previous schooling, home language, instruction in first language, and exposure to Portuguese as Second Language. Indo-Aryan languages speakers showed low writing scores compared to their peers and the type of language and respective cognitive mapping (such as Mandarin and Arabic) was the predictor, not linguistic distance. Home language instruction should also be prominently considered in further research to understand specificities of cognitive academic profile in a Romance languages learning context. Additionally, this study also examined the teachers representations that will be here addressed to understand educational implications of second language teaching in psychological distress of different minorities in schools of specific host countries.

Keywords: home language, immigrant students, Portuguese language, second language, writing assessment

Procedia PDF Downloads 436
14434 Syntactic, Semantic, and Pragmatic Rationalization of Modal Auxiliary Verbs in Akan

Authors: Joana Portia Sakyi

Abstract:

The uniqueness of auxiliary verbs and their contribution to grammar as constituents, which act as preverbs to supply additional grammatical or functional meanings to clauses, are well established. Functionally, they relate clauses to tense, aspect, mood, voice, emphasis, and modality, along with the main verbs conveying the appropriate lexical content. There has been an issue in Akan grammar vis-à-vis the status of auxiliary verbs, in terms of whether Akan has auxiliaries or not and even which forms are to be regarded as auxiliaries. We investigate the syntactic, semantic, and pragmatic components of expressions and claim that Akan has auxiliary verbs that contribute the functional or grammatical meaning of modality, tense/aspect, etc., to clauses they occur in. Essentially, we use a self-created corpus data to consider the affix bέ- ‘may’, ‘must’, ‘should’; the form tùmí ‘can’, ‘be able to’; mà ‘to let’, ‘to allow’, ‘to permit’, ‘to make’, or ‘to cause’ someone to do something; the multi-word forms ὲsὲ sέ ‘must’, ‘should’ or ‘have to’ and ètwà sέ ‘must’, ‘should’ or ‘have to’, and assert that they are legitimate modal auxiliaries conveying epistemic, deontic, and dynamic modalities, as well as other meanings in the language.

Keywords: Akan, modality, modal auxiliaries, semantics

Procedia PDF Downloads 38
14433 Preparation on Sentimental Analysis on Social Media Comments with Bidirectional Long Short-Term Memory Gated Recurrent Unit and Model Glove in Portuguese

Authors: Leonardo Alfredo Mendoza, Cristian Munoz, Marco Aurelio Pacheco, Manoela Kohler, Evelyn Batista, Rodrigo Moura

Abstract:

Natural Language Processing (NLP) techniques are increasingly more powerful to be able to interpret the feelings and reactions of a person to a product or service. Sentiment analysis has become a fundamental tool for this interpretation but has few applications in languages other than English. This paper presents a classification of sentiment analysis in Portuguese with a base of comments from social networks in Portuguese. A word embedding's representation was used with a 50-Dimension GloVe pre-trained model, generated through a corpus completely in Portuguese. To generate this classification, the bidirectional long short-term memory and bidirectional Gated Recurrent Unit (GRU) models are used, reaching results of 99.1%.

Keywords: natural processing language, sentiment analysis, bidirectional long short-term memory, BI-LSTM, gated recurrent unit, GRU

Procedia PDF Downloads 129
14432 Translation as a Foreign Language Teaching Tool: Results of an Experiment with University Level Students in Spain

Authors: Nune Ayvazyan

Abstract:

Since the proclamation of monolingual foreign-language learning methods (the Berlitz Method in the early 20ᵗʰ century and the like), the dilemma has been to allow or not to allow learners’ mother tongue in the foreign-language learning process. The reason for not allowing learners’ mother tongue is reported to create a situation of immersion where students will only use the target language. It could be argued that this artificial monolingual situation is defective, mainly because there are very few real monolingual situations in the society. This is mainly due to the fact that societies are nowadays increasingly multilingual as plurilingual speakers are the norm rather than an exception. More recently, the use of learners’ mother tongue and translation has been put under the spotlight as valid foreign-language teaching tools. The logic dictates that if learners were permitted to use their mother tongue in the foreign-language learning process, that would not only be natural, but also would give them additional means of participation in class, which could eventually lead to learning. For example, when learners’ metalinguistic skills are poor in the target language, a question they might have could be asked in their mother tongue. Otherwise, that question might be left unasked. Attempts at empirically testing the role of translation as a didactic tool in foreign-language teaching are still very scant. In order to fill this void, this study looks into the interaction patterns between students in two kinds of English-learning classes: one with translation and the other in English only (immersion). The experiment was carried out with 61 students enrolled in a second-year university subject in English grammar in Spain. All the students underwent the two treatments, classes with translation and in English only, in order to see how they interacted under the different conditions. The analysis centered on four categories of interaction: teacher talk, teacher-initiated student interaction, student-initiated student-to-teacher interaction, and student-to-student interaction. Also, pre-experiment and post-experiment questionnaires and individual interviews gathered information about the students’ attitudes to translation. The findings show that translation elicited more student-initiated interaction than did the English-only classes, while the difference in teacher-initiated interactional turns was not statistically significant. Also, student-initiated participation was higher in comprehension-based activities (into L1) as opposed to production-based activities (into L2). As evidenced by the questionnaires, the students’ attitudes to translation were initially positive and mainly did not vary as a result of the experiment.

Keywords: foreign language, learning, mother tongue, translation

Procedia PDF Downloads 131
14431 Statistical Analysis of Natural Images after Applying ICA and ISA

Authors: Peyman Sheikholharam Mashhadi

Abstract:

Difficulties in analyzing real world images in classical image processing and machine vision framework have motivated researchers towards considering the biology-based vision. It is a common belief that mammalian visual cortex has been adapted to the statistics of the real world images through the evolution process. There are two well-known successful models of mammalian visual cortical cells: Independent Component Analysis (ICA) and Independent Subspace Analysis (ISA). In this paper, we statistically analyze the dependencies which remain in the components after applying these models to the natural images. Also, we investigate the response of feature detectors to gratings with various parameters in order to find optimal parameters of the feature detectors. Finally, the selectiveness of feature detectors to phase, in both models is considered.

Keywords: statistics, independent component analysis, independent subspace analysis, phase, natural images

Procedia PDF Downloads 318
14430 Topological Language for Classifying Linear Chord Diagrams via Intersection Graphs

Authors: Michela Quadrini

Abstract:

Chord diagrams occur in mathematics, from the study of RNA to knot theory. They are widely used in theory of knots and links for studying the finite type invariants, whereas in molecular biology one important motivation to study chord diagrams is to deal with the problem of RNA structure prediction. An RNA molecule is a linear polymer, referred to as the backbone, that consists of four types of nucleotides. Each nucleotide is represented by a point, whereas each chord of the diagram stands for one interaction for Watson-Crick base pairs between two nonconsecutive nucleotides. A chord diagram is an oriented circle with a set of n pairs of distinct points, considered up to orientation preserving diffeomorphisms of the circle. A linear chord diagram (LCD) is a special kind of graph obtained cutting the oriented circle of a chord diagram. It consists of a line segment, called its backbone, to which are attached a number of chords with distinct endpoints. There is a natural fattening on any linear chord diagram; the backbone lies on the real axis, while all the chords are in the upper half-plane. Each linear chord diagram has a natural genus of its associated surface. To each chord diagram and linear chord diagram, it is possible to associate the intersection graph. It consists of a graph whose vertices correspond to the chords of the diagram, whereas the chord intersections are represented by a connection between the vertices. Such intersection graph carries a lot of information about the diagram. Our goal is to define an LCD equivalence class in terms of identity of intersection graphs, from which many chord diagram invariants depend. For studying these invariants, we introduce a new representation of Linear Chord Diagrams based on a set of appropriate topological operators that permits to model LCD in terms of the relations among chords. Such set is composed of: crossing, nesting, and concatenations. The crossing operator is able to generate the whole space of linear chord diagrams, and a multiple context free grammar able to uniquely generate each LDC starting from a linear chord diagram adding a chord for each production of the grammar is defined. In other words, it allows to associate a unique algebraic term to each linear chord diagram, while the remaining operators allow to rewrite the term throughout a set of appropriate rewriting rules. Such rules define an LCD equivalence class in terms of the identity of intersection graphs. Starting from a modelled RNA molecule and the linear chord, some authors proposed a topological classification and folding. Our LCD equivalence class could contribute to the RNA folding problem leading to the definition of an algorithm that calculates the free energy of the molecule more accurately respect to the existing ones. Such LCD equivalence class could be useful to obtain a more accurate estimate of link between the crossing number and the topological genus and to study the relation among other invariants.

Keywords: chord diagrams, linear chord diagram, equivalence class, topological language

Procedia PDF Downloads 174
14429 Validating the Micro-Dynamic Rule in Opinion Dynamics Models

Authors: Dino Carpentras, Paul Maher, Caoimhe O'Reilly, Michael Quayle

Abstract:

Opinion dynamics is dedicated to modeling the dynamic evolution of people's opinions. Models in this field are based on a micro-dynamic rule, which determines how people update their opinion when interacting. Despite the high number of new models (many of them based on new rules), little research has been dedicated to experimentally validate the rule. A few studies started bridging this literature gap by experimentally testing the rule. However, in these studies, participants are forced to express their opinion as a number instead of using natural language. Furthermore, some of these studies average data from experimental questions, without testing if differences existed between them. Indeed, it is possible that different topics could show different dynamics. For example, people may be more prone to accepting someone's else opinion regarding less polarized topics. In this work, we collected data from 200 participants on 5 unpolarized topics. Participants expressed their opinions using natural language ('agree' or 'disagree') and the certainty of their answer, expressed as a number between 1 and 10. To keep the interaction based on natural language, certainty was not shown to other participants. We then showed to the participant someone else's opinion on the same topic and, after a distraction task, we repeated the measurement. To produce data compatible with standard opinion dynamics models, we multiplied the opinion (encoded as agree=1 and disagree=-1) with the certainty to obtain a single 'continuous opinion' ranging from -10 to 10. By analyzing the topics independently, we observed that each one shows a different initial distribution. However, the dynamics (i.e., the properties of the opinion change) appear to be similar between all topics. This suggested that the same micro-dynamic rule could be applied to unpolarized topics. Another important result is that participants that change opinion tend to maintain similar levels of certainty. This is in contrast with typical micro-dynamics rules, where agents move to an average point instead of directly jumping to the opposite continuous opinion. As expected, in the data, we also observed the effect of social influence. This means that exposing someone with 'agree' or 'disagree' influenced participants to respectively higher or lower values of the continuous opinion. However, we also observed random variations whose effect was stronger than the social influence’s one. We even observed cases of people that changed from 'agree' to 'disagree,' even if they were exposed to 'agree.' This phenomenon is surprising, as, in the standard literature, the strength of the noise is usually smaller than the strength of social influence. Finally, we also built an opinion dynamics model from the data. The model was able to explain more than 80% of the data variance. Furthermore, by iterating the model, we were able to produce polarized states even starting from an unpolarized population. This experimental approach offers a way to test the micro-dynamic rule. This also allows us to build models which are directly grounded on experimental results.

Keywords: experimental validation, micro-dynamic rule, opinion dynamics, update rule

Procedia PDF Downloads 129
14428 A CFD Analysis of Flow through a High-Pressure Natural Gas Pipeline with an Undeformed and Deformed Orifice Plate

Authors: R. Kiš, M. Malcho, M. Janovcová

Abstract:

This work aims to present a numerical analysis of the natural gas which flows through a high-pressure pipeline and an orifice plate, through the use of CFD methods. The paper contains CFD calculations for the flow of natural gas in a pipe with different geometry used for the orifice plates. One of them has a standard geometry and a shape without any deformation and the other is deformed by the action of the pressure differential. It shows the behaviour of natural gas in a pipeline using the velocity profiles and pressure fields of the gas in both models with their differences. The entire research is based on the elimination of any inaccuracy which should appear in the flow of the natural gas measured in the high-pressure pipelines of the gas industry and which is currently not given in the relevant standard.

Keywords: orifice plate, high-pressure pipeline, natural gas, CFD analysis

Procedia PDF Downloads 358
14427 From the “Movement Language” to Communication Language

Authors: Mahmudjon Kuchkarov, Marufjon Kuchkarov

Abstract:

The origin of ‘Human Language’ is still a secret and the most interesting subject of historical linguistics. The core element is the nature of labeling or coding the things or processes with symbols and sounds. In this paper, we investigate human’s involuntary Paired Sounds and Shape Production (PSSP) and its contribution to the development of early human communication. Aimed at twenty-six volunteers who provided many physical movements with various difficulties, the research team investigated the natural, repeatable, and paired sounds and shape productions during human activities. The paper claims the involvement of Paired Sounds and Shape Production (PSSP) in the phonetic origin of some modern words and the existence of similarities between elements of PSSP with characters of the classic Latin alphabet. The results may be used not only as a supporting idea for existing theories but to create a closer look at some fundamental nature of the origin of the languages as well.

Keywords: body shape, body language, coding, Latin alphabet, merging method, movement language, movement sound, natural sound, origin of language, pairing, phonetics, sound and shape production, word origin, word semantic

Procedia PDF Downloads 160