Search results for: custodial sentence
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 234

54 Gender Agreement in Italian Compounds with Capo-

Authors: Irene Lami, Silvia Micheli, Jan Radimský, Joost van de Weijer

Abstract:

The present study examines gender agreement in Italian compounds with "capo-". Compounds containing "capo-" as the first element are highly productive in Italian and are attested from the earliest stages of the language, with "capo" indicating a prominent role in a group. This type of compound has become progressively more productive over time, establishing itself in the language to indicate human referents with a leadership role over someone or something, and it belongs to both the subordinate and the coordinate compound categories. In light of the debates on the use of inclusive language, especially with regard to female professional titles in Italian, we investigate the gender agreement of the word "capo", which encounters etymological resistance in addition to social resistance. When "capo-" is the first element of a compound, morphological constraints must also be considered alongside the social and etymological resistance. In our experiment, 190 native informants were asked to match the gender of a given word in a sentence while thinking of female referents. The results confirm a scalar hypothesis of gender agreement (i.e., titles traditionally attributed to women > titles traditionally attributed to men > the word "capo" in isolation > the word "capo-" as an element of a subordinate compound > the word "capo-" as an element of a coordinate compound). A significant interplay with number marking is also shown, as words are inflected in gender when the trait +plural is present. Moreover, the results show that, contrary to what is prescriptively established, speakers do inflect the word "capo" according to gender in limited instances, even when it occurs as a compound element, although to a lesser extent than words that face only social obstacles and not etymological or morphological ones.
The results appear to show that, although a morphological obstacle is visible, sociolinguistic claims seem to be able to divert these obstacles. This study appears particularly suitable for replication tests over the next few decades, which, if society opens up further to claims of inclusiveness, could further corroborate this trend.

Keywords: compounds, gender inflection, Italian, morphology

Procedia PDF Downloads 43
53 Logical Thinking: A Surprising and Promising Insight for Creative and Critical Thinkers

Authors: Luc de Brabandere

Abstract:

Researchers in various disciplines have long tried to understand how a human being thinks. Most of them seem to agree that the brain works in two very different modes. For us, the first phase of thought imagines, diverges, and unlocks the field of possibilities. The second phase judges, converges, and chooses. But if we were to stop there, that would give the impression that thought is essentially an individual effort that seldom depends on context. This is, however, not the case. Whether we are a champion in creativity, and so primarily in induction, or a master in logic, where we are confronted with reality, the ideas we lay out are destined to be presented to third parties. They must therefore be exposed, defended, communicated, negotiated, or even sold. Regardless of the quality of the concepts we craft (creative thinking) and the inferences we build (logical thinking), we will one day or another be confronted by people whose beliefs, opinions, and ideas differ from ours (critical thinking). Logic and critique: The shared characteristics of logical and critical thought include a three-level structure of reasoning invented by the Greeks. For the first time in history, Aristotle tried to model thought as deployed in three stages: the concept, the statement, and the reasoning. The three levels can be assessed according to different criteria. A concept is more or less useful, a statement is true or false, and reasoning is right or wrong. This three-level structure allows us to differentiate logic and critique, where the intention and the words used are not the same. Logic only deals with the structure of reasoning and exhausts the problem. It regards premises as acquired and excludes debate. Logic deals in certainty and pursues the truth. Critique deals in probability and searches for the plausible. Logic and creativity: Many known models present the brain as a two-stroke engine (divergence vs. convergence, fast vs. slow, left-brain vs. right-brain, Yin vs. Yang, etc.). But that’s not the only thing. “Why didn’t we think of that before?” How often have we heard that sentence? A creative idea is the outcome of logic, but you can only understand it afterward! Through the use of exercises, we will witness how logic and creativity work together. A third theme is hidden behind the two main themes of the conference: logical thought, which the author can shed some light on.

Keywords: creativity, logic, critique, digital

Procedia PDF Downloads 70
52 Gender-Based Differences in the Social Judgment of Hungarian Politicians' Sex Scandals

Authors: Sara Dalma Galgoczi, Judith Gabriella Kengyel

Abstract:

Sex scandals, and especially the way society judges them, are an engaging research topic. Most people are interested in other people's lives, particularly those of public figures such as celebrities or politicians, because ordinary people feel they have the right to know more about the famous and the notorious than those figures would probably be willing to share. Intimacy and sexual acts are no exception; indeed, sexuality has always been one of the central interests of humans. Moreover, knowing about and having an opinion on any kind of scandal can change the esteem in which a person is held by whole social groups or classes. This study examines the social judgment of several Hungarian politicians' sex scandals and addresses questions such as how public opinion differs by gender and how it views delegates’ abuse of power. Because the study collects and evaluates opinions from the public, and no one has previously researched and published on this exact topic and these cases, an online survey was created. The survey had several sections. We collected data on party preference and on a conservatism-liberalism scale; then we used the following questionnaires: the zero-sum perspective with regard to gender equality (Ruthig, Kehn, Gamblin, Vanderzanden & Jones, 2017), the Ambivalent Sexism Inventory (ASI; Glick & Fiske, 1996), and the Ambivalence Toward Men Inventory (AMI; Glick & Fiske, 1999). Finally, short summaries of five Hungarian politicians' sex scandal cases (3 males, 2 females) from the recent past were presented. These stories were followed by questions about respondents' opinion of the party and their attitudes towards the parties' reactions to the cases. We conclude that people are more permissive with the scandals of men, and that benevolent sexism and ambivalence towards men mediate this relation. Men tend to see these cases as part of politicians' private lives more than women do.
Party preference had a significant effect: people tend to condemn the delegates of opposing parties and to excuse the delegates of their preferred party.

Keywords: sex scandal, sexism, social judgement, politician

Procedia PDF Downloads 98
51 Legal Judgment Prediction through Indictments via Data Visualization in Chinese

Authors: Kuo-Chun Chien, Chia-Hui Chang, Ren-Der Sun

Abstract:

Legal Judgment Prediction (LJP) is a subtask of legal AI. Its main purpose is to use the facts of a case to predict the judgment result. In Taiwan's criminal procedure, when prosecutors complete the investigation of a case, they decide whether to prosecute the suspect and which article of criminal law to apply based on the facts and evidence of the case. In this study, we collected 305,240 indictments from the public inquiry system of the procuratorate of the Ministry of Justice, which included 169 charges and 317 articles from 21 laws. We take the crime facts in the indictments as the main input and jointly learn a prediction model for law source, article, and charge based on the pre-trained BERT model. For single-article cases where the frequency of both the charge and the article is greater than 50, the prediction performance for law sources, articles, and charges reaches 97.66, 92.22, and 60.52 macro-F1, respectively. To understand the large performance gap between articles and charges, we used a bipartite graph to visualize the relationship between articles and charges, and found that the poor charge prediction performance was due to differences in wording precision. Some charges use the simplest words, while others include the perpetrator or the result to make the charge more specific. For example, Article 284 of the Criminal Law may be indicted as "negligent injury", "negligent death", "business injury", "driving business injury", or "non-driving business injury". As another example, Article 10 of the Drug Hazard Control Regulations can be charged as "Drug Control Regulations" or "Drug Hazard Control Regulations". To solve these problems and predict the article and charge more accurately, we plan to include the article content or charge names in the input and to use the sentence-pair classification method for question-answering problems in the BERT model to improve performance.
We will also consider a sequence-to-sequence approach to charge prediction.
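As an illustration of the two ideas this abstract relies on, the following minimal sketch shows how macro-F1 is computed and how a bipartite article-to-charge mapping exposes the one-to-many wording problem. It is not the authors' code: the function names `macro_f1` and `build_bipartite` are hypothetical, and the article and charge strings are simply the examples quoted in the abstract.

```python
from collections import defaultdict


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 scores over all labels seen."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def build_bipartite(pairs):
    """Map each article to the set of charge names it appears with."""
    article_to_charges = defaultdict(set)
    for article, charge in pairs:
        article_to_charges[article].add(charge)
    return article_to_charges


# Article/charge pairs taken from the examples in the abstract.
pairs = [
    ("Criminal Law Art. 284", "negligent injury"),
    ("Criminal Law Art. 284", "negligent death"),
    ("Criminal Law Art. 284", "business injury"),
    ("Criminal Law Art. 284", "driving business injury"),
    ("Criminal Law Art. 284", "non-driving business injury"),
    ("Drug Hazard Control Regulations Art. 10", "Drug Control Regulations"),
    ("Drug Hazard Control Regulations Art. 10", "Drug Hazard Control Regulations"),
]
graph = build_bipartite(pairs)
for article, charges in graph.items():
    print(article, "->", len(charges), "charge names")
```

An article node with many charge neighbours, as with Article 284 here, is a case where the article can be predicted correctly while the exact charge string is still easy to get wrong.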

Keywords: legal judgment prediction, deep learning, natural language processing, BERT, data visualization

Procedia PDF Downloads 99
50 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change

Authors: Ermias A. Tegegn, Million Meshesha

Abstract:

Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death in most African countries. Ethiopia is one of the seriously affected countries in sub-Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become a chronic but manageable disease. The study focused on a data mining technique to predict the future living status of HIV/AIDS patients at the time of drug regimen change, when patients develop toxicity to the ART drug combination they are currently taking. The data were taken from the University of Gondar Hospital ART program database. A hybrid methodology was followed to explore the application of data mining to the ART program dataset. Data cleaning, handling of missing values, and data transformation were used to preprocess the data. The WEKA 3.7.9 data mining tool, classification algorithms, and expertise were used to address the research problem. Using four classification algorithms (J48 classifier, PART rule induction, Naïve Bayes, and neural network) and adjusting their parameters, thirty-two models were built on the pre-processed University of Gondar ART program dataset. The performance of the models was evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model for predicting the status of HIV patients with drug regimen substitution was a pruned J48 decision tree, with a classification accuracy of 98.01%. This study extracts interesting attributes such as Ever taking Cotrim, Ever taking TbRx, CD4 count, Age, Weight, and Gender to predict the status of drug regimen substitution. The outcome of this study can be used as an assistive tool to help clinicians make more appropriate drug regimen substitutions. Future research directions are proposed for developing an applicable system in the area of the study.
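The four evaluation metrics named in this abstract can be written out for the binary case as below. This is a generic sketch rather than the study's WEKA setup; the function name `classification_metrics` and the confusion-matrix counts in the usage line are invented for illustration and are not results from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts, chosen only to show the call.
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

Reporting precision and recall alongside accuracy matters when one outcome class is much rarer than the other, since accuracy alone can look high for a model that rarely predicts the rare class.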

Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model

Procedia PDF Downloads 115
49 Specific Language Impairment: Assessing Bilingual Children for Identifying Children with Specific Language Impairment (SLI)

Authors: Manish Madappa, Madhavi Gayathri Raman

Abstract:

The primary vehicle of human communication is language. A breakdown in any aspect of communication may lead to frustration and isolation among learners and teachers. Over seven percent of the world's population currently experience limitations, and children who exhibit a deviant or deficient language acquisition curve even in the same language-rich environment as their peers may be at risk of having a language disorder or language impairment. The difficulty may be at the word level (vocabulary/word knowledge) and/or the sentence level (syntax/morphology). Children with SLI appear to be developing normally in all aspects except for their receptive and/or expressive language skills. Thus, it is of utmost importance to identify children with or at risk of SLI so that early intervention can foster language and social growth, provide the best possible learning environment with special support for language to be explicitly taught, and serve as a step towards continuous, ongoing support. The present study looks at Kannada-English bilingual children and works towards identifying children at risk of “specific language impairment”. It was conducted as an exploratory study which systematically enquired into the narratives of young Kannada-English bilinguals and investigated the data for story structure in their narrative formulations. Oral narrative offers a rich source of data about a child’s language use in a relatively natural context. The fundamental objective is to ensure comparability and to be more universal, thus allowing for the evaluation of narrative text competence. The data were collected from 10 class three students at a primary school in Mysore, Karnataka, and analyzed for the macrostructure component, which reflects the goal-directed behavior of a protagonist who is motivated to carry out some kind of action with the intention of attaining a goal.
The results show that children whose scores deviate by -1.25 SD are at risk of SLI. Two learners were identified as being at risk of Specific Language Impairment, with scores more than 1.25 standard deviations below the mean.
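The -1.25 SD criterion can be sketched as a z-score cutoff. The sketch below assumes the mean and standard deviation are computed from the sample itself using the sample standard deviation, which the abstract does not state; the function name `at_risk` and the ten scores are invented for illustration and are not the study's data.

```python
from statistics import mean, stdev


def at_risk(scores, cutoff=-1.25):
    """Return indices of scores lying more than |cutoff| SDs below the mean."""
    mu = mean(scores)
    sd = stdev(scores)  # sample standard deviation (an assumption)
    return [i for i, s in enumerate(scores) if (s - mu) / sd < cutoff]


# Hypothetical macrostructure scores for ten children.
scores = [10, 12, 11, 13, 12, 11, 12, 13, 5, 6]
print(at_risk(scores))
```

With a group this small, the sample mean and SD are themselves pulled down by the low scorers, so a cutoff based on external norms, where available, would be a more conservative choice.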

Keywords: bilingual, oral narratives, SLI, macrostructure

Procedia PDF Downloads 261
48 Syntax and Words as Evolutionary Characters in Comparative Linguistics

Authors: Nancy Retzlaff, Sarah J. Berkemer, Trudie Strauss

Abstract:

In the last couple of decades, the advent of digitalization of any kind of data was probably one of the major advances in all fields of study. This paves the way for also analysing these data even though they might come from disciplines where there was no initial computational necessity to do so. Especially in linguistics, one can find a rather manual tradition. Still when considering studies that involve the history of language families it is hard to overlook the striking similarities to bioinformatics (phylogenetic) approaches. Alignments of words are such a fairly well studied example of an application of bioinformatics methods to historical linguistics. In this paper we will not only consider alignments of strings, i.e., words in this case, but also alignments of syntax trees of selected Indo-European languages. Based on initial, crude alignments, a sophisticated scoring model is trained on both letters and syntactic features. The aim is to gain a better understanding on which features in two languages are related, i.e., most likely to have the same root. Initially, all words in two languages are pre-aligned with a basic scoring model that primarily selects consonants and adjusts them before fitting in the vowels. Mixture models are subsequently used to filter ‘good’ alignments depending on the alignment length and the number of inserted gaps. Using these selected word alignments it is possible to perform tree alignments of the given syntax trees and consequently find sentences that correspond rather well to each other across languages. The syntax alignments are then filtered for meaningful scores—’good’ scores contain evolutionary information and are therefore used to train the sophisticated scoring model. Further iterations of alignments and training steps are performed until the scoring model saturates, i.e., barely changes anymore. 
A better evaluation of the trained scoring model and its function in containing evolutionary meaningful information will be given. An assessment of sentence alignment compared to possible phrase structure will also be provided. The method described here may have its flaws because of limited prior information. This, however, may offer a good starting point to study languages where only little prior knowledge is available and a detailed, unbiased study is needed.
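For readers unfamiliar with sequence alignment as borrowed from bioinformatics, the following is a minimal Needleman-Wunsch global alignment of two words. It uses a flat match/mismatch/gap scheme chosen purely for illustration; it is not the consonant-first pre-alignment or the trained scoring model described in the abstract, and the function name and parameter values are assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# English "night" vs. German "Nacht" (lower-cased), a common cognate pair.
total, top, bottom = needleman_wunsch("night", "nacht")
print(total)
print(top)
print(bottom)
```

In a trained model of the kind the abstract describes, the flat `match` and `mismatch` constants would be replaced by learned scores for each pair of letters, and the same recurrence would still apply.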

Keywords: alignments, bioinformatics, comparative linguistics, historical linguistics, statistical methods

Procedia PDF Downloads 130
47 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This is because this kind of bias, when present in the text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is of importance. The goal of this work is to identify the subjectively biased language in text on a sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network incorporated with attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), which was compiled from Wikipedia edits that removed various biased instances from sentences as a benchmark dataset, with which we also compare our model to existing approaches. Experimental analysis indicates an improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing

Procedia PDF Downloads 100
46 Discovering Word-Class Deficits in Persons with Aphasia

Authors: Yashaswini Channabasavegowda, Hema Nagaraj

Abstract:

Aim: The current study aims at discovering word-class deficits concerning the noun-verb ratio in confrontation naming, picture description, and picture-word matching tasks. A total of ten persons with aphasia (PWA) and ten age-matched neurotypical individuals (NTI) were recruited for the study. The research includes both behavioural and objective measures to assess word-class deficits in PWA. Objective: The main objective of the research is to identify word-class deficits seen in persons with aphasia, using various speech-eliciting tasks. Method: The study was conducted in the participants' L1, which was taken to be Kannada. The Action Naming Test and the Boston Naming Test, adapted to Kannada, were administered to the participants, and a picture description task was also carried out. The picture-word matching task was carried out using E-Prime software (version 2) to measure accuracy and reaction time in the identification of verbs and nouns. The stimuli were presented in auditory and visual modes. Data were analysed to identify errors in the naming of nouns versus verbs in the Boston Naming Test and the Action Naming Test, as well as the usage of nouns and verbs in the picture description task. Reaction time and accuracy for picture-word matching were extracted from the software. Results: PWA showed a significant difference in sentence structure compared to age-matched NTI. PWA also showed impairment in syntactic measures in the picture description task, with fewer grammatically correct sentences and fewer correct uses of verbs and nouns, and they produced a greater proportion of nouns than verbs. PWA had poorer accuracy and lesser reaction time in the picture-word matching task compared to NTI, and accuracy was higher for nouns than for verbs in PWA. The deficits were noticed irrespective of the cause of the aphasia.

Keywords: nouns, verbs, aphasia, naming, description

Procedia PDF Downloads 80
45 Climate Change and Health in Policies

Authors: Corinne Kowalski, Lea de Jong, Rainer Sauerborn, Niamh Herlihy, Anneliese Depoux, Jale Tosun

Abstract:

Climate change is considered one of the biggest threats to human health of the 21st century. The link between climate change and health has received relatively little attention in the media, in research and in policy-making. A long term and broad overview of how health is represented in the legislation on climate change is missing in the legislative literature. It is unknown if or how the argument for health is referred in legal clauses addressing climate change, in national and European legislation. Integrating scientific based evidence into policies regarding the impacts of climate change on health could be a key step to inciting the political and societal changes necessary to decelerate global warming. This may also drive the implementation of new strategies to mitigate the consequences on health systems. To provide an overview of this issue, we are analyzing the Global Climate Legislation Database provided by the Grantham Research Institute on Climate Change and the Environment. This institution was established in 2008 at the London School of Economics and Political Science. The database consists of (updated as of 1st January 2015) legislations on climate change in 99 countries around the world. This tool offers relevant information about the state of climate related policies. We will use the database to systematically analyze the 829 identified legislations to identify how health is represented as a relevant aspect of climate change legislation. We are conducting explorative research of national and supranational legislations and anticipate health to be addressed in various forms. The goal is to highlight how often, in what specific terms, which aspects of health or health risks of climate change are mentioned in various legislations. The position and recurrence of the mention of health is also of importance. 
Data will be extracted with complete quotation of the sentence which mentions health, which will allow for second qualitative stage to analyze which aspects of health are represented and in what context. This study is part of an interdisciplinary project called 4CHealth that confronts results of the research done on scientific, political and press literature to better understand how the knowledge on climate change and health circulates within those different fields and whether and how it is translated to real world change.

Keywords: climate change, explorative research, health, policies

Procedia PDF Downloads 337
44 Everyday Interactions among Imprisoned Sex Offenders: A Qualitative Study within the 'Due Palazzi' Prison in Padua

Authors: Matteo Mazzucato, Elena Faccio, Antonio Iudici

Abstract:

Prison is a social reality constructed by everyday interactions between an inmate, other social actors (cellmates, prison officers, educators, psychologists, or other detainees), and the external world, which participates in this complex construction through the social discourses on prison reality and its problems. Being a detainee means performing a self while dealing with processes of stereotyping, the attribution of a social role, and prejudices assigned by various interlocutors, depending on the kind of crime one has been convicted of. Among all inmates, sex offenders are the ones most at risk of being socially condemned beyond their legal sentence, since they have committed one of the most hated and disapproved crimes. In this respect, prison has to be considered a critical context in which all community expectations and beliefs converge: for common sense, rapists and child molesters are dangerous people who have to be stigmatized, punished, and isolated. Furthermore, other detainees share a code of conduct by which the ‘sex offender’ is placed at the lowest level of the social hierarchy of the prison. The penitentiary administration, too, defines this kind of detainee as a ‘vulnerable person to protect’, while prison staff consider him a particular inmate who has to be treated and definitively changed. Considering all the complexities connected with being imprisoned as a sex offender, our research aimed at exploring how people convicted of sex crimes are called upon to manage all these hetero-narrations about their selves. With this goal in mind, textual data retrieved from this qualitative research show that sex offenders tend not to face the stigma assigned to them. Rather, they tend to minimize the storytelling about their selves and construct alternative biographies to be shared with other inmates.
Managing narrations about their selves in this way allows them to distance themselves from all the threats perceived in living together with other detainees, but it blocks sex offenders’ re-signification of their offences during prison treatment. Given these results, prison administration should develop activities that create fields of interaction between detainees, where they can experience new versions of their selves that can also be used in external social situations. In this regard, it is important to reconsider prison as part of the community and sex offenders as members of it.

Keywords: interactions, qualitative research, prison reality, sex offender

Procedia PDF Downloads 194
43 Contextual SenSe Model: Word Sense Disambiguation using Sense and Sense Value of Context Surrounding the Target

Authors: Vishal Raj, Noorhan Abbas

Abstract:

Ambiguity in NLP (natural language processing) refers to the ability of a word, phrase, sentence, or text to have multiple meanings. This results in various kinds of ambiguities, such as lexical, syntactic, semantic, anaphoric, and referential ambiguities. This study focuses mainly on solving the issue of lexical ambiguity. Word Sense Disambiguation (WSD) is an NLP technique that aims to resolve lexical ambiguity by determining the correct meaning of a word within a given context. Most WSD solutions rely on words for training and testing, but we have used lemma and Part of Speech (POS) tokens of words for training and testing. The lemma adds generality, and the POS adds properties of the word to the token. We have designed a novel method to create an affinity matrix to calculate the affinity between any pair of lemma_POS tokens (a token where the lemma and POS of a word are joined by an underscore) in a given training set. Additionally, we have devised an algorithm to create the sense clusters of tokens using the affinity matrix under the hierarchy of the POS of the lemma. Furthermore, three different mechanisms to predict the sense of the target word using the affinity/similarity value are devised. Each contextual token contributes some value to the sense of the target word, and whichever sense gets the highest value becomes the sense of the target word. Contextual tokens thus play a key role in creating sense clusters and predicting the sense of the target word; hence, the model is named the Contextual SenSe Model (CSM). CSM is notably simple and easy to explain in contrast to contemporary deep learning models, which are intricate, time-intensive, and difficult to explain. CSM is trained on SemCor training data and evaluated on the SemEval test dataset. The results indicate that, despite the naivety of the method, it achieves promising results when compared to the Most Frequent Sense (MFS) model.

Keywords: word sense disambiguation (wsd), contextual sense model (csm), most frequent sense (mfs), part of speech (pos), natural language processing (nlp), oov (out of vocabulary), lemma_pos (a token where lemma and pos of word are joined by underscore), information retrieval (ir), machine translation (mt)

Procedia PDF Downloads 72
42 An Exploration of Gender Differences in Academic Writing in Science

Authors: Gayani Ranawake, Kate Wilson

Abstract:

Underrepresentation of women in academia, particularly in science, has been discussed by many scholars for decades. The causes of this underrepresentation are debated to this day. Publication is an important aspect of success in academia, and publication and citation rates are significant metrics in performance review, promotion, and employment. It has been established that men’s and women’s language use in general, both spoken and written, is different. However, no one, to our knowledge, has looked at whether men’s and women’s writing in science is different. If there are significant differences in the writing of men and women, then these differences may affect women’s ability to succeed in science. This study is part of a larger project to explore whether differences can be recognized in the academic science writing of men and women. Mono authored articles from high ranking physics, biology and psychology journals by men and women authors were compared in terms of readability statistics. In particular, the abstract and introduction sections were compared, as these are the first sections encountered by a reviewer, and so may have an important effect on their impression of the work. The Flesch Reading Ease, the percentage of passive sentences and the Flesch-Kincaid Reading Grade Level were calculated for each section of each article, along with counts of numbers of sentences, words per sentence and sentences per paragraph. Significance of differences was tested using the Behrens statistic. It was found that for both physics and biology papers there were no significant differences in the complexity or verbosity of the writing of men and women authors. However, there was a significant difference between the two disciplines, with physics articles being generally more readable (higher readability score) while also more passive (higher number of passive sentences). 
In contrast, the psychology articles showed a difference between men and women authors which may be significant. The average readability for introductions in women’s articles was 28 which was higher than for men’s articles, which was 19 (higher values indicate more readable). Women’s articles in psychology also had a greater proportion of passive sentences. It can be concluded that, at least in the more traditional sciences, men and women have adopted similar ways of writing, and that disciplinary differences are greater than gender differences. This may not be the case in psychology, which many consider to be more closely aligned with the humanities. Whether the lack of differences is because women have adapted to a masculine way of writing, or whether the genre itself is gender neutral needs further investigation.
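The two readability statistics named in this abstract are standard closed-form formulas over word, sentence, and syllable counts. The sketch below takes those counts as inputs, since syllable counting depends on the tool used and is not described in the abstract; the function names and the counts in the usage line are illustrative only.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher values indicate more readable text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade needed."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage: 100 words, 5 sentences, 150 syllables.
print(round(flesch_reading_ease(100, 5, 150), 3))
print(round(flesch_kincaid_grade(100, 5, 150), 2))
```

Both formulas depend only on average sentence length and average syllables per word, which is why scores such as the 28 and 19 reported above reflect long sentences and polysyllabic vocabulary rather than content.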

Keywords: academic writing, gender differences, readability, science

Procedia PDF Downloads 167
41 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. It is the most fundamental task in almost all natural language processing. In natural language processing, providing a large amount of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the availability of a tagged corpus is the bottleneck for natural language processing, especially for POS tagging. A promising direction to tackle this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for Amharic using K-means clustering, motivated by the large amount of data produced in day-to-day activities. The development of the tagger follows four phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this level, the raw text is split into sentences and then into words. The second phase is feature extraction, which covers word frequency and the syntactic and morphological features of a word. The third phase is clustering: among the different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is examined carefully and the most common tag is assigned to the group. The study identifies two features that are capable of distinguishing one part-of-speech from others, namely the morphological feature and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging. 
To increase the performance of the unsupervised part-of-speech tagger, other features not included in this study, such as semantic information, need to be incorporated. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
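The clustering and mapping phases described in this abstract can be sketched as follows. This is a minimal pure-Python illustration, not the author's implementation: the two-dimensional feature vectors, the fixed initial centroids and the small set of hand-supplied reference tags are all hypothetical, chosen only to show how K-means groups words and how each cluster then receives its most common tag.

```python
from collections import Counter

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, initial_centroids, max_iter=100):
    """Plain K-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [list(c) for c in initial_centroids]
    assignments = None
    for _ in range(max_iter):
        new_assignments = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing, so the clustering has converged
        assignments = new_assignments
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

def map_clusters_to_tags(assignments, reference_tags):
    """Mapping phase: give each cluster the most common tag among its tagged members."""
    votes = {}
    for cluster, tag in zip(assignments, reference_tags):
        if tag is not None:
            votes.setdefault(cluster, Counter())[tag] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}

# Hypothetical features per word: (morphological score, positional score).
features = [(1.0, 0.9), (0.9, 1.0), (0.0, 0.1), (0.1, 0.0), (1.0, 0.8), (0.0, 0.2)]
# Hypothetical reference tags for a few words; None marks an untagged word.
reference = ["V", "V", "N", None, None, "N"]

assignments, _ = kmeans(features, initial_centroids=[features[0], features[2]])
cluster_tags = map_clusters_to_tags(assignments, reference)
predicted = [cluster_tags[a] for a in assignments]
```

Untagged words inherit the tag of the cluster they fall into, which is how a small amount of inspection per cluster can label a large amount of raw text.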

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 416
40 Arabicization and Terminology with Reference to Social Media Terms

Authors: Ahmed Al-Awthan

Abstract:

This study addresses the prevalence of English terminology in published Arabic documentation on social media. Although the problem of using English terms in translation instead of existing native ones has been addressed in general by researchers around the world, to the best of the author’s knowledge the attitude of translators as professionals to this phenomenon in Qatar and Yemen has not received a detailed study. This study examines the impact of the use of English social media terms in the Arab world on aspiring and professional translators; it explores the benefits and drawbacks of linguistic borrowing as identified by the translators and investigates whether translators consider any means of resisting linguistic borrowing and prioritizing Arabic. It also aims to answer the following questions: i. Is there any prevalence of English social media terms in Arabic translation? Why or why not? ii. Do Arabic translators prefer using English social media terms to their equivalents in Arabic? If so, why? iii. Which measures could be adopted to help reduce the frequently observed borrowing of English terms? In particular, how do translators see the role of the Arabic Language Academies in preserving Arabic? This research is descriptive, comparative and analytical in nature. It is both qualitative and quantitative. To validate the problem, the researcher will analyze articles published by Al-Jazeera in 2016-2018 that refer to the use of social media in diplomacy. The analysis will examine whether the increased international discussion of political events on social media increased the amount of transliterated English terminology referring to this mode of communication. To investigate whether the translators recognize the phenomenon of borrowing, the researcher proposes to use a survey. This survey will use multiple choice questions. It will target 20 aspiring translators from Yemen and 20 participants from Qatar. 
It will offer 15 English social media terms used in discourse in 15 sentences. For each sentence, the researcher will provide three different translations and will ask the translators to rate them and offer their own rendition. After collecting all the answers online, the researcher will analyze the data. The results are expected to confirm whether there is a prevalence of English terms in translation into Arabic. They are also expected to show what measures the translators used to render the English social media terms and to raise awareness of the borrowing of English terms. The study is intended to guide translators toward using Arabicization methods in order to contribute to preserving Arabic.

Keywords: Arabicization, translingual borrowing, social media terms, terminology

Procedia PDF Downloads 128
39 Simplifying Writing Composition to Assist Students in Rural Areas: An Experimental Study for the Comparison of Guided and Unguided Instruction

Authors: Neha Toppo

Abstract:

Method and strategies of teaching instruction highly influence learning of students. In second language teaching, number of ways and methods has been suggested by different scholars and researchers through times. The present article deals with the role of teaching instruction in developing compositional ability of students in writing. It focuses on the secondary level students of rural areas, whose exposure to English language is limited and they face challenges even in simple compositions. The students till high school suffer with their disability in writing formal letter, application, essay, paragraph etc. They face problem in note making, writing answers in examination using their own words and depend fully on rote learning. It becomes difficult for them to give language to their own ideas. Teaching writing composition deserves special attention as writing is an integral part of language learning and students at this level are expected to have sound compositional ability for it is useful in numerous domains. Effective method of instruction could help students to learn expression of self, correct selection of vocabulary and grammar, contextual writing, composition of formal and informal writing. It is not limited to school but continues to be important in various other fields outside the school such as in newspaper and magazine, official work, legislative work, material writing, academic writing, personal writing, etc. The study is based on the experimental method, which hypothesize that guided instruction will be more effective in teaching writing compositions than usual instruction in which students are left to compose by their own without any help. In the test, students of one section are asked to write an essay on the given topic without guidance and another section are asked to write the same but with the assistance of guided instruction in which students have been provided with a few vocabulary and sentence structure. 
This process is repeated in a few more schools to obtain generalizable data. The study shows the difference in students’ performance under the two kinds of instruction, guided and unguided. The study finds that the students’ writing skill is quite poor but that they perform better with the help of guided instruction. The students need better teaching instruction to develop their writing skills.

Keywords: composition, essay, guided instruction, writing skill

Procedia PDF Downloads 253
38 Enhanced Tensor Tomographic Reconstruction: Integrating Absorption, Refraction and Temporal Effects

Authors: Lukas Vierus, Thomas Schuster

Abstract:

A general framework is examined for dynamic tensor field tomography within an inhomogeneous medium characterized by refraction and absorption, treated as an inverse source problem concerning the associated transport equation. Guided by Fermat’s principle, the Riemannian metric within the specified domain is determined by the medium's refractive index. While considerable literature exists on the inverse problem of reconstructing a tensor field from its longitudinal ray transform within a static Euclidean environment, limited inversion formulas and algorithms are available for general Riemannian metrics and time-varying tensor fields. It is established that tensor field tomography, akin to an inverse source problem for a transport equation, persists in dynamic scenarios. Framing dynamic tensor tomography as an inverse source problem embodies a comprehensive perspective within this domain. Ensuring well-defined forward mappings necessitates establishing existence and uniqueness for the underlying transport equations. However, the bilinear forms of the associated weak formulations fail to meet the coercivity condition. Consequently, recourse to viscosity solutions is taken, demonstrating their unique existence within suitable Sobolev spaces (in the static case) and Sobolev-Bochner spaces (in the dynamic case), under a specific assumption restricting variations in the refractive index. Notably, the adjoint problem can also be reformulated as a transport equation, with analogous results regarding uniqueness. Analytical solutions are expressed as integrals over geodesics, facilitating more efficient evaluation of forward and adjoint operators compared to solving partial differential equations. 
Numerical experiments are conducted using a Nesterov-accelerated Landweber method, encompassing various fields, absorption coefficients, and refractive indices, thereby illustrating the enhanced reconstruction achieved through this holistic modeling approach.
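For readers unfamiliar with the iteration named in this abstract, the sketch below shows a Nesterov-accelerated Landweber method on a small dense linear system A x = b. It is only an illustration under simplifying assumptions: the forward operator here is a toy matrix rather than the attenuated refractive dynamic ray transform, and the function names, step size and momentum schedule (k - 1) / (k + 2) are choices made for this example, not details taken from the paper.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nesterov_landweber(A, b, step, iterations):
    """Landweber update x <- y + step * A^T (b - A y), applied at an extrapolated point y."""
    At = transpose(A)  # plays the role of the adjoint operator
    x_prev = [0.0] * len(A[0])
    x = list(x_prev)
    for k in range(1, iterations + 1):
        beta = (k - 1) / (k + 2)  # Nesterov momentum weight
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        residual = [bi - ri for bi, ri in zip(b, matvec(A, y))]
        correction = matvec(At, residual)
        x_prev = x
        x = [yi + step * ci for yi, ci in zip(y, correction)]
    return x

# Toy problem with exact solution x = [1, 3]; the step must not exceed 1 / ||A||^2 = 0.25.
A = [[2.0, 0.0], [0.0, 1.0]]
b = [2.0, 3.0]
solution = nesterov_landweber(A, b, step=0.2, iterations=500)
```

Each iteration needs one forward and one adjoint evaluation, which is why efficient evaluation of both operators matters for this kind of scheme.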

Keywords: attenuated refractive dynamic ray transform of tensor fields, geodesics, transport equation, viscosity solutions

Procedia PDF Downloads 19
37 Interlanguage Acquisition of a Postposition ‘e’ in Korean: Analysis of the Korean Novice Learners’ Output

Authors: Eunjung Lee

Abstract:

This study aims to analyze the sentences generated by the beginners who learn ‘e,’ a postposition in Korean and to find out the regularity of learners’ interlanguage upon investigating the usages of ‘e’ that appears by meanings and functions in their interlanguage, and conditions that ‘e’ is used. This study was conducted with mainly two assumptions; first, the learner’s language has the specific type of interlanguage; and second, there is the regularity of interlanguage when students produce ‘e’ under the specific conditions. Learners’ output has various values and can be used as the useful data to understand interlanguage. Therefore, all the sentences containing a postposition ‘e’ by English speaking learners were searched in ‘Learners’ corpus sharing center in The National Institute of Korean Language’ in Korea, and the data were collected upon limiting the levels of learners with Level 1 and 2. 789 sentences that were used with ‘e’ were selected as the final subjects of the analysis. First, to understand the environmental characteristics to be used with a postposition, ‘e’ after summarizing 13 meaning and functions of ‘e’ appeared in three books of Korean dictionary that summarized the Korean grammar, 1) meaning function of ‘e’ that were used in each sentence was classified; 2) the nouns that were combined with ‘e,’ keywords of the sentences, and the characteristics of modifiers, linkers, and predicates appeared in front of ‘e’ were analyzed; 3) the regularity by the novice learners’ meaning and functions were reviewed; and 4) the differences of the regularity by level 1 and 2 learners’ meaning and functions were found. 
Upon the study results, the novice learners showed 1) they used the nouns related to ‘time(시간), before(전), after(후), next(다음), the next(그다음), then(때), day of the week(요일), and season(계절)’ mainly in front of ‘e’ when they used ‘e’ as the meaning function of time; 2) they used mainly the verbs of ‘go(가다),’ ‘come(오다),’ and ‘go round(다니다)’ as the predicate to match with ‘e’ that was the meaning function of direction and destination; and 3) they used mainly the nouns related to ‘locations or countries’ in front of ‘e,’ a meaning function postposition of ‘place,’ used mainly the verbs ‘be(있다), not be(없다), live(살다), be many(많다)’ after ‘e,’ and ‘i(이) or ka(가)’ was combined mainly in the subject words in case of ‘be(있다), not be(없다)’ or ‘be many(많다),’ and ‘eun(은) or nun(는)’ was combined mainly in the subject words in front of ‘live at’ In addition, 4) they used ‘e’ which indicates ‘cause or reason’ in the form of ‘because( 때문에),’ and 5) used ‘e’ of the subjects as the predicates to match with the predicates such as ‘treat(대하다), like(들다), and catch(걸리다).’ From these results, ‘e’ usage patterns of the Korean novice learners demonstrated very differently by the meaning functions and the learners’ interlanguage regularity could be deducted. However, little difference was found in interlanguage regularity between level 1 and 2. This study has the meaning to try to understand the interlanguage system and regularity in the learners’ acquisition process of postposition ‘e’ and this can be utilized to lessen their errors.

Keywords: interlanguage, interlanguage analysis, postposition ‘e’, Korean acquisition

Procedia PDF Downloads 106
36 Structuring Paraphrases: The Impact Sentence Complexity Has on Key Leader Engagements

Authors: Meaghan Bowman

Abstract:

Soldiers are taught about the importance of effective communication with repetition of the phrase, “Communication is key.” They receive training in preparing for, and carrying out, interactions between foreign and domestic leaders to gain crucial information about a mission. These interactions are known as Key Leader Engagements (KLEs). For the training of KLEs, doctrine mandates the skills needed to conduct these “engagements” such as how to: behave appropriately, identify key leaders, and employ effective strategies. Army officers in training learn how to confront leaders, what information to gain, and how to ask questions respectfully. Unfortunately, soldiers rarely learn how to formulate questions optimally. Since less complex questions are easier to understand, we hypothesize that semantic complexity affects content understanding, and that age and education levels may have an effect on one’s ability to form paraphrases and judge their quality. In this study, we looked at paraphrases of queries as well as judgments of both the paraphrases’ naturalness and their semantic similarity to the query. Queries were divided into three complexity categories based on the number of relations (the first number) and the number of knowledge graph edges (the second number). Two crowd-sourced tasks were completed by Amazon volunteer participants, also known as turkers, to answer the research questions: (i) Are more complex queries harder to paraphrase and judge and (ii) Do age and education level affect the ability to understand complex queries. We ran statistical tests as follows: MANOVA for query understanding and two-way ANOVA to understand the relationship between query complexity and education and age. A probe of the number of given-level queries selected for paraphrasing by crowd-sourced workers in seven age ranges yielded promising results. We found significant evidence that age plays a role and marginally significant evidence that education level plays a role. 
These preliminary tests, with output p-values of 0.0002 and 0.068, respectively, suggest the importance of content understanding in a communication skill set. This basic ability to communicate, which may differ by age and education, permits reproduction and quality assessment and is crucial in training soldiers for effective participation in KLEs.

Keywords: engagement, key leader, paraphrasing, query complexity, understanding

Procedia PDF Downloads 134
35 Attention Deficit Hyperactivity Disorder and Criminality: A Psychological Profile of Convicts Serving Prison Sentences

Authors: Agnieszka Nowogrodzka

Abstract:

Objectives: ADHD is a neurodevelopmental disorder in which symptoms are most prominent throughout childhood. In the longer term, these symptoms, as well as the behaviour of the child, the experiences arising from the response of the community to the child's symptoms, as well as the functioning of the community itself, all contribute to the onset of secondary symptoms and subsequent outcomes of the disorder, such as crime or mental disorders. The purpose of this study is to estimate the prevalence of ADHD among Polish convicts serving a prison sentence. To that end, the study will focus on the relationship between the severity of ADHD and early childhood trauma, family relations, maladaptive cognitive schemas, as well as mental disorders. It is an attempt to assess the interdependence between ADHD, childhood experiences, and secondary outcomes. Methods: The study enrolled two groups of first-time convicts and repeat offenders aged between 21 and 65 –each of the study groups comprised 120 participants; 240 participants in total took part in the study. Participants were recruited in semi-open penal institutions in Poland (Poznań Custody Suite, Wronki Penal Institution, Iława Penal Institution). The control group comprised 110 men without criminal records aged 21 to 65. The DIVA 5.0 questionnaire was employed to identify the severity of ADHD symptoms. Other questionnaires employed in the course of the study included the Childhood Trauma Questionnaire (CTQ), The Family Adaptability and Cohesion Scale IV (FACES-IV), Young Schema Questionnaire (YSQ), and the General Health Questionnaire (GHQ-30). Results: The findings of the study in question are currently still being compiled and will be shared during the conference. The findings of a pilot study involving two cohorts of convicts (each numbering 20 men) and a control group (20 men with no criminal records) indicate a significant correlation between ADHD and the experience of early childhood trauma. 
The severity of ADHD also shows a correlation with the assessment of the functioning of the family, with the subjects assessing the relationships in their families more negatively than the control group. Furthermore, the severity of ADHD is also correlated with maladaptive emotional schemas manifesting in the participants. The findings also show a correlation between selected dimensions and the severity of offenses.

Keywords: ADHD, social impairments, mental disorders, early childhood traumas, criminality

Procedia PDF Downloads 57
34 From Shallow Semantic Representation to Deeper One: Verb Decomposition Approach

Authors: Aliaksandr Huminski

Abstract:

Semantic Role Labeling (SRL), as a shallow semantic parsing approach, involves recognizing and labeling the arguments of a verb in a sentence. Verb participants are linked with specific semantic roles (Agent, Patient, Instrument, Location, etc.). Thus, SRL can answer key questions such as ‘Who’, ‘When’, ‘What’, ‘Where’ in a text, and it is widely applied in dialog systems, question-answering, named entity recognition, information retrieval, and other fields of NLP. However, SRL has the following flaw: two sentences with identical (or almost identical) meaning can have different semantic role structures. Consider two sentences: (1) John put butter on the bread. (2) John buttered the bread. SRL for (1) and (2) will be significantly different. For the verb put in (1) it is [Agent + Patient + Goal], but for the verb butter in (2) it is [Agent + Goal]. This happens because of one of the most interesting and intriguing features of a verb: its ability to capture participants, as in the case of the verb butter, or their features, as in the case of the verb drink, where the participant’s feature of being liquid is shared with the verb. This capture looks like a total fusion of meaning and cannot be decomposed in a direct way (in contrast to compound verbs like babysit or breastfeed). From this perspective, SRL looks too shallow to represent semantic structure. If the key point of a semantic representation is the opportunity to use it for making inferences and finding hidden reasons, it assumes by default that two different but semantically identical sentences must have the same semantic structure. Otherwise, we will have different inferences from the same meaning. To overcome the above-mentioned flaw, the following approach is suggested. 
Assume that: P is a participant of a relation; F is a feature of a participant; Vcp is a verb that captures a participant; Vcf is a verb that captures a feature of a participant; Vpr is a primitive verb, that is, a verb that does not capture any participant and represents only a relation. In other words, a primitive verb is a verb whose meaning does not include meanings from its surroundings. Then Vcp and Vcf can be decomposed as: Vcp = Vpr + P; Vcf = Vpr + F. If all Vcp and Vcf are represented this way, then primitive verbs Vpr can be considered a canonical form for SRL. As a result, there will be no hidden participants captured by a verb, since all participants will be explicitly unfolded. An obvious example of Vpr is the verb go, which represents pure movement. In this case, the verb drink can be represented as man-made movement of liquid in a specific direction. Extracting and using primitive verbs for SRL creates a canonical representation that is unique for semantically identical sentences. This leads to the unification of semantic representation and resolves the critical flaw related to SRL.
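The decomposition Vcp = Vpr + P can be illustrated with the abstract's own butter example. The sketch below is a toy, assuming a hand-written lookup table in which the verb butter unfolds into the verb put plus a Patient participant "butter"; the table, the function name and the choice of put as the underlying verb are assumptions made for this example and are not given in the abstract.

```python
# Hypothetical decomposition table: capturing verb -> (underlying verb, role, captured participant).
DECOMPOSITION = {
    "butter": ("put", "Patient", "butter"),
}

def canonicalize(verb, roles):
    """Unfold a participant captured by the verb so that it appears as an explicit role."""
    if verb in DECOMPOSITION:
        primitive, role, filler = DECOMPOSITION[verb]
        unfolded = dict(roles)
        unfolded[role] = filler  # the hidden participant becomes explicit
        return primitive, unfolded
    return verb, dict(roles)

# (1) John put butter on the bread. -> [Agent + Patient + Goal]
frame_1 = canonicalize("put", {"Agent": "John", "Patient": "butter", "Goal": "bread"})
# (2) John buttered the bread. -> [Agent + Goal] before decomposition
frame_2 = canonicalize("butter", {"Agent": "John", "Goal": "bread"})
```

After canonicalization both sentences yield the same verb and the same role structure, which is the property the abstract asks of a canonical representation.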

Keywords: decomposition, labeling, primitive verbs, semantic roles

Procedia PDF Downloads 343
33 An Experimental Exploration of the Interaction between Consumer Ethics Perceptions, Legality Evaluations, and Mind-Sets

Authors: Daphne Sobolev, Niklas Voege

Abstract:

During the last three decades, consumer ethics perceptions have attracted the attention of a large number of researchers. Nevertheless, little is known about the effect of the cognitive and situational contexts of the decision on ethics judgments. In this paper, the interrelationship between consumers’ ethics perceptions, legality evaluations and mind-sets is explored. Legality evaluations represent the cognitive context of the ethical judgments, whereas mind-sets represent their situational context. Drawing on moral development theories and priming theories, it is hypothesized that both factors are significantly related to consumer ethics perceptions. To test this hypothesis, 289 participants were allocated to three mind-set experimental conditions and a control group. Participants in the mind-set conditions were primed for aggressiveness, politeness or awareness of the negative legal consequences of breaking the law. Mind-sets were induced using a sentence-unscrambling task, in which target words were included. Ethics and legality judgments were assessed using consumer ethics and internet ethics questionnaires. All participants were asked to rate the ethicality and legality of consumer actions described in the questionnaires. The results showed that consumer ethics and legality perceptions were significantly correlated. Moreover, including legality evaluations as a variable in ethics judgment models increased the predictive power of the models. In addition, inducing aggressiveness in participants reduced their sensitivity to ethical issues; priming awareness of negative legal consequences increased their sensitivity to ethics when uncertainty about the legality of the judged scenario was high. Furthermore, the correlation between ethics and legality judgments was significant across all mind-set conditions. 
However, the results revealed conflicts between ethics and legality perceptions: consumers considered 10%-14% of the presented behaviors unethical and legal, or ethical and illegal. In 10-23% of the questions, participants indicated that they did not know whether the described action was legal or not. In addition, an asymmetry between the effects of aggressiveness and politeness priming was found. The results show that the legality judgments and mind-sets interact with consumer ethics perceptions. Thus, they portray consumer ethical judgments as dynamical processes which are inseparable from other cognitive processes and situational variables. They highlight that legal and ethical education, as well as adequate situational cues at the service place, could have a positive effect on consumer ethics perceptions. Theoretical contribution is discussed.

Keywords: consumer ethics, legality judgments, mind-set, priming, aggressiveness

Procedia PDF Downloads 271
32 The Current Importance of the Rules of Civil Procedure in the Portuguese Legal Order: Between Legalism and Adequation

Authors: Guilherme Gomes, Jose Lebre de Freitas

Abstract:

The rules of Civil Procedure that are defined in the Portuguese Civil Procedure Code of 2013 particularly their articles 552 to 626- represent the model that the legislator thought that would be more suitable for national civil litigation, from the moment the action is brought by the plaintiff to the moment when the sentence is issued. However, procedural legalism is no longer a reality in the Portuguese Civil Procedural Law. According to the article 547 of the code of 2013, the civil judge has a duty to adopt the procedure that better suits the circumstances of the case, whether or not it is the one defined by law. The main goal of our paper is to answer the question whether the formal adequation imposed by this article diminishes the importance of the Portuguese rules of Civil Procedure and their daily application by national civil judges. We will start by explaining the appearance of the abovementioned rules in the Civil Procedure Code of 2013. Then we will analyse, using specific examples that were obtained by the books we read, how the legal procedure defined in the abovementioned code does not suit the circumstances of some specific cases and is totally inefficient in some situations. After that, we will, by using the data obtained in the practical research that we are conducting in the Portuguese civil courts within the scope of our Ph.D. thesis (until now, we have been able to consult 150 civil lawsuits), verify whether and how judges and parties make the procedure more efficient and effective in the case sub judice. 
In the scope of our research, we have already reached some preliminary findings: 1) despite the fact that the legal procedure does not suit the circumstances of some civil lawsuits, there are only two situations of frequent use of formal adequation (the judge allowing the plaintiff to respond to the procedural exceptions deduced in the written defense and the exemption from prior hearing for the judges who never summon it), 2) the other aspects of procedural adequation (anticipation of the production of expert evidence, waiving of oral argument at the final hearing, written allegations, dismissal of the dispatch on the controversial facts and the examination of witnesses at the domicile of one of the lawyers) are still little used and 3) formal adequation tends to happen by initiative of the judge, as plaintiffs and defendants are afraid of celebrating procedural agreements in most situations. In short, we can say that, in the Portuguese legal order of the 21st century, the flexibility of the legal procedure, as it is defined in the law and applied by procedural subjects, does not affect the importance of the rules of Civil Procedure of the code of 2013.

Keywords: casuistic adequation, civil procedure code of 2013, procedural subjects, rules of civil procedure

Procedia PDF Downloads 105
31 The Effects of English Contractions on the Application of Syntactic Theories

Authors: Wakkai Hosanna Hussaini

Abstract:

A formal structure of the English clause is composed of at least two elements – subject and verb, in structural grammar and at least one element – predicate, in systemic (functional) and generative grammars. Each of the elements can be represented by a word or group (of words). In modern English structure, very often speakers merge two words as one with the use of an apostrophe. Each of the two words can come from different elements or belong to the same element. In either case, result of the merger is called contraction. Although contractions constitute a part of modern English structure, they are considered informal in nature (more frequently used in spoken than written English) that is why they were initially viewed as constituting an evidence of language deterioration. To our knowledge, no formal syntactic theory yet has been particular on the contractions because of its deviation from the formal rules of syntax that seek to identify the elements that form a clause in English. The inconsistency between the formal rules and a contraction is established when two words representing two elements in a non-contraction are merged as one element to form a contraction. Thus the paper presents the various syntactic issues as effects arising from converting non-contracted to contracted forms. It categorizes English contractions and describes each category according to its syntactic relations (position and relationship) and morphological formation (form and content) as integral part of modern structure of English. This is a position paper as such the methodology is observational, descriptive and explanatory/analytical based on existing related literature. The inventory of English contractions contained in books on syntax forms the data from where specific examples are drawn. It is noted as conclusion that the existing syntactic theories were not originally established to account for English contractions. 
The paper, when published, will further expose the inadequacies of the existing syntactic theories by giving more reasons for establishing a more comprehensive syntactic theory for analyzing English clause/sentence structure involving contractions. The method used reveals the extent of the inadequacies in applying the three major syntactic theories (structural, systemic (functional) and generative) to English contractions. Although no theory is without limits of scope, the reluctance of the three major theories to recognize English contractions needs to be overcome because of the increasing popularity of their use in modern English structure. The paper therefore recommends that, as the use of contractions becomes more popular even in formal speeches today, a syntactic theory be established to handle their patterns of syntactic relations and morphological formation.

Keywords: application, effects, English contractions, syntactic theories

Procedia PDF Downloads 230
30 The Role of Specificity in Mastering the English Article System

Authors: Sugene Kim

Abstract:

The English articles are taught as a binary system based on nominal countability and definiteness. Despite the detailed rules of prescriptive grammar, it has been consistently reported in the literature that their correct usage is extremely difficult to master even for advanced learners of English as a second language (ESL) or a foreign language (EFL). Given that an English sentence (except for an imperative) cannot be constructed without a noun, which is always paired with one of the indefinite, definite, and zero articles; it is essential to understand specifically what causes ESL/EFL learners to misuse them. To that end, this study examined EFL learners’ article use employing a one-group pre–post-test design. Forty-three Korean college students received instruction on correct English article usage for two 75-minute classes employing the binary schema set up for the study. They also practiced in class how to apply the rules as instructed. Then, the participants were assigned a forced-choice elicitation task, which was also used as a pre-test administered three months prior to the instruction. Unlike the pre-test on which they only chose the correct article for each of the 40 items, the post-instruction task additionally asked them to give written accounts of their decision-making procedure to choose the article as they did. The participants’ performance was scored manually by checking whether the answer given is correct or incorrect, and their written comments were first categorized using thematic analysis and then ranked by frequency. The analyses of the performance on the two tasks and the written think-aloud data suggested that EFL learners exhibit fluctuation between specificity and definiteness, overgeneralizing the use of the definite article for almost all cataphoric references. 
It was apparent that they have trouble distinguishing between the two concepts, possibly because the former is almost never introduced in the grammar books or classes designed for ESL/EFL learners. In particular, most participants were found to be unaware of the possibility of using nouns as [+specific, –definite]. Not surprisingly, the correct answer rates for such nouns averaged 33% and 46% on the pre- and post-tests, respectively, which barely reach half the overall mean correct answer rates of 65% on the pre-test and 81% on the post-test. In addition, correct article use for specific indefinites was the most impermeable to instruction when compared with nouns used as [–specific, –definite] or [± specific, +definite]. Such findings underline the necessity of expanding the binary schema to a ternary form that incorporates the specificity feature, even though it is not morphologically marked in the English language.

Keywords: countability, definiteness, English articles, specificity, ternary system

Procedia PDF Downloads 106
29 Towards End-To-End Disease Prediction from Raw Metagenomic Data

Authors: Maxence Queyrel, Edi Prifti, Alexandre Templier, Jean-Daniel Zucker

Abstract:

Analysis of the human microbiome using metagenomic sequencing data has demonstrated a high ability to discriminate between various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and stored as fastq files. Conventional processing pipelines consist of multiple steps, including quality control, filtering, and alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time-consuming, and rely on a large number of parameters that often introduce variability and impact the estimation of the microbiome elements. Training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings, which create a meaningful numerical representation of DNA sequences while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps: (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come; and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning a weight to the influence of each genome on the prediction. 
Using two public real-life datasets as well as a simulated one, we demonstrated that this original approach reaches high performance, comparable with the state-of-the-art methods applied directly on data processed through mainstream bioinformatics workflows. These results are encouraging for this proof-of-concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
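
As an illustration of the first half of step (i) only, a k-mer vocabulary could be generated roughly as sketched below. This is a minimal sketch under stated assumptions and not the authors' implementation: the function names `kmers` and `build_vocab`, the stride of 1, the value k=3, and the toy reads are all hypothetical, and learning the embeddings themselves is not shown.

```python
from collections import Counter


def kmers(read: str, k: int) -> list[str]:
    """Split a DNA read into overlapping k-mers (stride 1, an assumption)."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_vocab(reads: list[str], k: int) -> dict[str, int]:
    """Map every k-mer seen in the reads to an integer index.

    Indices are ordered by descending frequency, with ties broken
    alphabetically so that the result is deterministic.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    ordered = sorted(counts, key=lambda km: (-counts[km], km))
    return {km: idx for idx, km in enumerate(ordered)}


# Toy reads (hypothetical); real input would come from fastq files.
reads = ["ACGTAC", "CGTACG"]
vocab = build_vocab(reads, k=3)

# Each read becomes a sequence of vocabulary indices, the usual input
# format for an embedding model trained on "words" (here, k-mers).
encoded = [[vocab[km] for km in kmers(read, 3)] for read in reads]
```

The integer sequences in `encoded` play the role that tokenized sentences play in word-embedding models, which is the analogy the abstract draws.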

Keywords: deep learning, disease prediction, end-to-end machine learning, metagenomics, multiple instance learning, precision medicine

Procedia PDF Downloads 99
28 Linguistic Analysis of Argumentation Structures in Georgian Political Speeches

Authors: Mariam Matiashvili

Abstract:

Argumentation is an integral part of our daily communications - formal or informal. Argumentative reasoning, techniques, and language tools are used both in personal conversations and in the business environment. Verbalization of opinions requires the use of extraordinary syntactic-pragmatic structural quantities - arguments that add credibility to the statement. The study of argumentative structures allows us to identify the linguistic features that make the text argumentative. Knowing what elements make up an argumentative text in a particular language helps the users of that language improve their skills. Also, natural language processing (NLP) has become especially relevant recently. In this context, one of the main emphases is on the computational processing of argumentative texts, which will enable the automatic recognition and analysis of large volumes of textual data. The research deals with the linguistic analysis of the argumentative structures of Georgian political speeches - particularly the linguistic structure, characteristics, and functions of the parts of the argumentative text - claims, support, and attack statements. The research aims to describe the linguistic cues that give the sentence a judgmental/controversial character and help to identify the reasoning parts of the argumentative text. The empirical data come from the Georgian Political Corpus, particularly TV debates. Consequently, the texts are of a dialogical nature, representing a discussion between two or more people (most often between a journalist and a politician). 
The research uses the following approaches to identify and analyze the argumentative structures: (1) Lexical Classification and Analysis, which identifies the lexical items that are relevant to creating argumentative texts and builds the lexicon of argumentation (groups of words gathered from a semantic point of view); (2) Grammatical Analysis and Classification, that is, the grammatical analysis of the words and phrases identified on the basis of the argumentation lexicon; and (3) Argumentation Schemas, which describes and identifies the argumentation schemes that are most likely used in Georgian political speeches. As a final step, we analyzed the relations between the above-mentioned components. For example, if an identified argument scheme is “Argument from Analogy”, the identified lexical items semantically express analogy too, and they are most likely adverbs in Georgian. As a result, we created the lexicon with the words that play a significant role in creating Georgian argumentative structures. Linguistic analysis has shown that verbs play a crucial role in creating argumentative structures.

Keywords: Georgian, argumentation schemas, argumentation structures, argumentation lexicon

Procedia PDF Downloads 50
27 The Influence of Cognitive Load in the Acquisition of Words through Sentence or Essay Writing

Authors: Breno Barreto Silva, Agnieszka Otwinowska, Katarzyna Kutylowska

Abstract:

Research comparing lexical learning following the writing of sentences and longer texts with keywords is limited and contradictory. One possibility is that the recursivity of writing may enhance processing and increase lexical learning; another possibility is that the higher cognitive load of complex-text writing (e.g., essays), at least when timed, may hinder the learning of words. In our study, we selected 2 sets of 10 academic keywords matched for part of speech, length (number of characters), frequency (SUBTLEXus), and concreteness, and we asked 90 L1-Polish advanced-level English majors to use the keywords when writing sentences, timed essays (60 minutes), or untimed essays. First, all participants wrote a timed Control essay (60 minutes) without keywords. Then different groups produced Timed essays (60 minutes; n=33), Untimed essays (n=24), or Sentences (n=33) using the two sets of glossed keywords (counterbalanced). The comparability of the participants in the three groups was ensured by matching them for proficiency in English (LexTALE) and for a few measures derived from the Control essay: VocD (assessing productive lexical diversity), normed errors (assessing productive accuracy), words per minute (assessing productive written fluency), and holistic scores (assessing overall quality of production). We measured lexical learning (depth and breadth) via an adapted Vocabulary Knowledge Scale (VKS) and a free association test. Cognitive load was measured in the three essays (Control, Timed, Untimed) using the normed number of errors and holistic scores (TOEFL criteria). The number of errors and essay scores were obtained from two raters (interrater reliability Pearson’s r=.78-.91). Generalized linear mixed models showed no difference in the breadth and depth of keyword knowledge after writing Sentences, Timed essays, and Untimed essays. 
The task-based measurements found that Control and Timed essays had similar holistic scores, but that Untimed essays were of better quality than Timed essays. Also, Untimed essays were the most accurate, and Timed essays the most error-prone. In conclusion, using keywords in Timed, but not Untimed, essays increased cognitive load, leading to more errors and lower quality. Still, writing sentences and essays yielded similar lexical learning, and differences in the cognitive load between Timed and Untimed essays did not affect lexical acquisition.

Keywords: learning academic words, writing essays, cognitive load, English as an L2

Procedia PDF Downloads 50
26 Code Switching and Code Mixing among Adolescents in Kashmir

Authors: Sarwat un Nisa

Abstract:

One of the remarkable gifts that a human being is blessed with is the ability to speak using a combination of sounds. Different sounds combine to form words, which in turn make up sentences and therefore give birth to a language. A person can be either monolingual, i.e., able to speak one language, or bilingual, i.e., able to speak more than one language. Whether a person speaks one language or several, and whichever language a person speaks, the main aim is to communicate and to express ideas, feelings, or thoughts. Sometimes the choice of a language is deliberate, and sometimes it is a habitual act. The language which is used to put our ideas across says many things about our cultural, linguistic, and ethnic identities. Although it can never be claimed that bilinguals are better than monolinguals in terms of linguistic skills, bilinguals or multilinguals have more than one language at their disposal. Therefore, how effectively two languages are used by the same person always keeps linguists intrigued. The most prominent and common features found in the speech of bilingual speakers are code switching and code mixing. The aim of the present paper is to explore these features among the adolescent speakers of Kashmir. The reason for studying the linguistic behavior of adolescents is that adolescence is the age when a person is neither an adult nor a child. Adolescents want to drift away from the norms and make a new norm for themselves. Therefore, how their linguistic skills are influenced by their age is of great interest because it can set the trend for the future generation. Kashmir is a multilingual society where three languages, i.e., Kashmiri, Urdu, and English, are regularly used by the speakers, especially the educated ones. Kashmiri is widely used at home or mostly among adults. Urdu is the official language, and English is used in schools and for most written official correspondence. 
Thus, it is not uncommon to find these three languages coming in contact with each other quite frequently. This language contact results in code switching and code mixing. In this paper, different aspects of code switching and code mixing are discussed. Research Method: The data were collected from the different districts of Kashmir. The informants did not have prior knowledge of the survey. The situation was spontaneous and natural. The topics were introduced by the interviewer to a group of informants comprising three participants. They were asked to discuss the topic, most of the time without any intervention by the interviewer. Along with the conversations, the informants also filled in written questionnaires comprising sociolinguistic questions. The questionnaires were analysed to get an idea of the sociolinguistic attitudes of the informants. Percentage, frequency, and average were used as statistical tools to analyse the data. Conclusions were drawn taking into consideration the interpretations of both speech samples and questionnaires.

Keywords: code mixing, code switching, Kashmir, bilingualism

Procedia PDF Downloads 115
25 The Processing of Implicit Stereotypes in Contexts of Reading, Using Eye-Tracking and Self-Paced Reading Tasks

Authors: Magali Mari, Misha Muller

Abstract:

The present study’s objectives were to determine how diverse implicit stereotypes affect the processing of written information and linguistic inferential processes, such as presupposition accommodation. When reading a text, one constructs a representation of the described situation, which is then updated, according to new outputs and based on stereotypes inscribed within society. If the new output contradicts stereotypical expectations, the representation must be corrected, resulting in longer reading times. A similar process occurs in cases of linguistic inferential processes like presupposition accommodation. Presupposition accommodation is traditionally regarded as fast, automatic processing of background information (e.g., ‘Mary stopped eating meat’ is quickly processed as Mary used to eat meat). However, very few accounts have investigated whether this process is likely to be influenced by domains of social cognition, such as implicit stereotypes. To study the effects of implicit stereotypes on presupposition accommodation, adults were recorded while they read sentences in French, combining two methods, an eye-tracking task and a classic self-paced reading task (where participants read sentence segments at their own pace by pressing a computer key). In one condition, presuppositions were activated with the French definite articles ‘le/la/les,’ whereas in the other condition, the French indefinite articles ‘un/une/des’ were used, triggering no presupposition. Using a definite article presupposes that the object has already been uttered and is thus part of background information, whereas using an indefinite article is understood as the introduction of new information. Two types of stereotypes were under examination in order to enlarge the scope of stereotypes traditionally analyzed. Study 1 investigated gender stereotypes linked to professional occupations to replicate previous findings. Study 2 focused on nationality-related stereotypes (e.g., ‘the French are seducers’ versus ‘the Japanese are seducers’) to determine whether the effects of implicit stereotypes on reading are generalizable to other types of implicit stereotypes. The results show that reading is influenced by the two types of implicit stereotypes; in the two studies, the reading pace slowed down when a counter-stereotype was presented. However, presupposition accommodation did not affect participants’ processing of information. Altogether, these results show that (a) implicit stereotypes affect the processing of written information, regardless of the type of stereotypes presented, and (b) implicit stereotypes prevail over the superficial linguistic treatment of presuppositions, which suggests faster processing for treating social information compared to linguistic information.

Keywords: eye-tracking, implicit stereotypes, reading, social cognition

Procedia PDF Downloads 172