Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 251

Search results for: sentence parsing

71 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra I. Cristea

Abstract:

The more an educational system knows about a learner, the more personalised the interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and such requests are often ignored by learners. Especially in the booming realm of Massive Open Online Course (MOOC) platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics by proposing an approach that uses linguistically motivated Deep Learning architectures for learner profiling, particularly targeting gender prediction on the FutureLearn MOOC platform. Additionally, we tackle the difficult problem of predicting the gender of learners based on their comments only, which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structure. In this research, rather than considering sentences as plain sequences, we hypothesise that higher-level semantic and syntactic sentence processing based on linguistics will render a richer representation. We thus evaluate the traditional LSTM against other state-of-the-art models that take syntactic structure into account, such as the tree-structured LSTM, the Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on our MOOC dataset, on which the models perform best when compared with a public sentiment analysis dataset that is further used to cross-examine the models' results.

Keywords: deep learning, data mining, gender prediction, MOOCs

Procedia PDF Downloads 152

70 Validity and Reliability of a Questionnaire for Measuring Behaviour Change of Low Performance Employee

Authors: Hazaila Binti Hassan, Abu Yazid Bin Abu Bakar, Salleh Amat

Abstract:

This study establishes the validity and reliability of a questionnaire on behaviour change among low-performing officers. It aimed to develop an instrument to evaluate the behaviour of low-performing officers. The questionnaire contains 75 items across 5 subscales, corresponding to the 5 dimensions studied: (1) emotional stability, (2) psycho-spiritual enhancement, (3) social skills development, (4) cognitive and rationality improvement and (5) behavioural alignment and adjustment. The research involved two processes, one to check validity and one to check reliability; both used quantitative methods. Content validity testing was conducted to validate the behavioural change questionnaire. For face validity, 4 people were involved: two psychologists who carried out the program and two officers of the same rank, i.e. supporting officers. They corrected sentences, language, grammar and sentence structure so that the items match the purpose of the study. The questionnaire then underwent content validation by experts. Five experts were appointed for this session: 3 were directly involved in the construction of the questionnaire and 2 were university experts with a background in questionnaire development. The results show that content validity reached a high coefficient of 0.745, with minimum and maximum values both above 0.60, which satisfies the Content Validity Ratio criterion. The Cronbach’s alpha was 0.867. The behavioural alignment and adjustment subscale recorded the highest value, followed by the social skills development, cognitive and rationality improvement, psycho-spiritual enhancement and, lastly, emotional stability subscales. Therefore, the validity and reliability results were both accepted: the questionnaire is valid and reliable and can be used to study behaviour change among low-performing officers in the civil service.
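As an illustration of the reliability statistic reported in this abstract, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from item scores, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy response matrix are assumptions made for illustration only; they are not the authors' data or code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses the sample variance (n - 1 denominator) for both items and totals.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer every item")

    # Variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical 5-point responses from four respondents on three items.
    toy_responses = [
        [4, 5, 4],
        [2, 3, 2],
        [5, 5, 4],
        [3, 3, 3],
    ]
    print(round(cronbach_alpha(toy_responses), 3))
```

Values closer to 1 indicate that the items vary together, which is how a figure such as the reported 0.867 is usually read.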

Keywords: content validity, reliability, five dimension, low-performing officers, questionnaire

Procedia PDF Downloads 290

69 Creating Systems Change: Implementing Cross-Sector Initiatives within the Justice System to Support Ontarians with Mental Health and Addictions Needs

Authors: Tania Breton, Dorina Simeonov, Shauna MacEachern

Abstract:

Ontario’s 10-Year Mental Health and Addictions Strategy has included the establishment of 18 Service Collaboratives across the province: cross-sector tables in a specific region that come together to explore mental health and addiction system needs and adopt an intervention to address those needs. The process is community led and supported by implementation teams from the Centre for Addiction and Mental Health (CAMH), using the framework of implementation science (IS) to enable evidence-based and sustained change. These justice initiatives are focused on the intersection of the justice system and the mental health and addiction systems. In this presentation, we will share the learnings, achievements and challenges of implementing innovative practices that address the mental health and addictions needs of Ontarians within the justice system. Specifically, we will focus on the key points across the justice system, from early intervention and trauma-informed, culturally appropriate services to post-sentence support and community reintegration. Our approach to this work involves external implementation support from the CAMH team, including coaching, knowledge exchange, evaluation, Aboriginal engagement and health equity expertise. Agencies supported the implementation of tools and processes which changed practice at the local level. These practices are being scaled up across Ontario, community agencies have come together in an unprecedented collaboration, and there is a shared vision of the issues overlapping between the mental health, addictions and justice systems. Working with ministry partners has allowed space for innovation and created an environment where better approaches can be nurtured and spread.

Keywords: implementation, innovation, early identification, mental health and addictions, prevention, systems

Procedia PDF Downloads 367

68 Characteristics of an Impact on Reading Comprehension of Elementary School Students

Authors: Judith Hanke

Abstract:

Due to the rise in the number of students with reading difficulties, a digital reading support was developed. The digital reading support focuses on the reading comprehension of elementary school students. It consists of literary texts and reading exercises with diagnostics. To analyze the use of the reading packages, an intervention study took place in 2023. An ABA design was selected for the intervention study to examine the reading packages. The study was conducted from April 2023 until July 2023 and collected quantitative data on individuals, groups, and classes. It consisted of a survey group (N = 58) and a control group (N = 53). The pretest was conducted before the reading support intervention. The students of the survey group received reading support at their ability level to address each student’s individual needs. At the beginning of the study, characteristics of the students were collected. The characteristics included gender, age, repetition of a class, spoken language at home, German as a second language, and special support needs such as dyslexia. Right after the intervention, the posttest was administered. At least three weeks after the intervention, the follow-up test was administered. A standardized reading comprehension test was used for the three test times. The test consists of three subtests: word comprehension, sentence comprehension, and text comprehension. The focus of this paper is to determine which characteristics have an impact on the reading comprehension of elementary school students. The students’ characteristics were correlated with the results of the three test times using a Pearson correlation. The main findings are that age, repetition of a class, spoken language at home, and German as a second language have an effect on reading comprehension. Interestingly, gender and special support needs did not have a significant effect on the reading comprehension of the students. The significance of the study is to determine which characteristics have an impact on reading comprehension and then to assess how reading support can be modified to support diverse students.
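The Pearson correlation mentioned in this abstract can be sketched as follows. This is a generic illustration of the coefficient r = cov(x, y) / (sd(x) * sd(y)); the function name, the binary coding of a characteristic, and the toy scores are all hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centred cross-products and squares; the 1/n factors cancel in r.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return cross / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical coding: 1 = German as a second language, 0 = otherwise,
    # paired with made-up pretest scores for six students.
    german_second_language = [1, 0, 1, 0, 0, 1]
    pretest_score = [12, 18, 10, 17, 20, 13]
    print(round(pearson_r(german_second_language, pretest_score), 3))
```

With a binary characteristic such as class repetition, this coefficient is the point-biserial correlation; a significance test would be a separate step that is not shown here.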

Keywords: class repetition, reading comprehension, reading support, second language, spoken language at home

Procedia PDF Downloads 37

67 Online Multilingual Dictionary Using Hamburg Notation for Avatar-Based Indian Sign Language Generation System

Authors: Sugandhi, Parteek Kumar, Sanmeet Kaur

Abstract:

Sign Language (SL) is used by deaf people and by others who can hear but cannot speak, or who have difficulty with spoken languages due to a disability. It is a visual gesture language that uses one or both hands, the arms, the face and the body to convey meanings and thoughts. An SL automation system is an effective way to provide an interface for communicating with hearing people using a computer. In this paper, an avatar-based dictionary is proposed for a text to Indian Sign Language (ISL) generation system. This research work also presents a literature review of the SL corpora available for various SLs over the years. An ISL generation system requires a written form of SL, and there are certain techniques available for writing SL. The system uses the Hamburg sign language Notation System (HamNoSys) and Signing Gesture Mark-up Language (SiGML) for ISL generation. It is developed in PHP using Web Graphics Library (WebGL) technology for 3D avatar animation. A multilingual ISL dictionary is developed using HamNoSys for both English and Hindi. This dictionary will be used as a database to associate signs with words or phrases of a spoken language. It provides an admin panel interface to manage the dictionary, i.e., modification, addition, or deletion of a word. Through this interface, HamNoSys notations can be developed and stored in a database, and these notations can be converted manually into their corresponding SiGML files. The system takes a natural language input sentence in English or Hindi and generates a 3D sign animation using an avatar. SL generation systems have potential applications in many domains, such as healthcare, media, educational institutes, commercial sectors and transportation services. This research work will help researchers understand the various techniques used for writing SL and for building Sign Language generation systems.

Keywords: avatar, dictionary, HamNoSys, hearing impaired, Indian sign language (ISL), sign language

Procedia PDF Downloads 236

66 The Recording of Personal Data in the Spanish Criminal Justice System and Its Impact on the Right to Privacy

Authors: Deborah García-Magna

Abstract:

When a person goes through the criminal justice system, whether as a suspect or as someone arrested, prosecuted or convicted, certain personal data are recorded, and a wide range of persons and organizations may have access to them. The recording of data can have a great impact on the daily life of the person concerned during the period of time determined by the legislation. In addition, this registered information can refer to various aspects not directly related to the alleged or actually committed infraction. In some areas, Spanish legislation does not clearly determine the cancellation period of the registers, nor what happens when they are cancelled, since some of the files are not really erased and remain recorded, even if their consultation is no longer allowed or it is stated that they should not be taken into account. Thus, access to the recorded data of arrested or convicted persons may reduce their possibilities of reintegration into society. In this research, some of the areas in which data recording has a special impact on the lives of affected persons are analyzed critically, taking into account Spanish legislation and jurisprudence, and the influence of the European Court of Human Rights, the Council of Europe and other supranational instruments. In particular, the analysis covers the scope of video-surveillance in public spaces, the police record, the recording of personal data for the purposes of police investigation (especially DNA and psychological profiles), the registry of administrative and minor offenses (especially as they are taken into account to impose aggravating circumstances), criminal records (of adults, minors and legal entities), and the registration of special circumstances that occurred during the execution of the sentence (files of inmates under special surveillance, known as FIES; disciplinary sanctions; special therapies in prison; etc.).

Keywords: ECHR jurisprudence, formal and informal criminal control, privacy, disciplinary sanctions, social reintegration

Procedia PDF Downloads 149

65 Analyzing the Effect of Multilingualism, Language 1, and Language 2 on Reading Comprehension

Authors: Judith Hanke

Abstract:

Due to the increase in the number of students with reading difficulties, digital reading support with diagnostics was developed to foster the individual student's reading comprehension. The digital reading support focused on the reading comprehension of elementary school students. The digital reading packages consist of literary texts with aligned reading exercises. The number of students with German as a second language is growing in Germany. Students with multilingualism, language 1, and language 2 learn German together in school. The research focuses on determining whether and to what extent multilingualism, language 1, and language 2 affect reading comprehension. An ABA design was selected for the intervention study to examine the reading support. The study was conducted from April 2023 until July 2023 and collected quantitative data on individuals, groups, and classes. It comprised a survey group (N = 58) and a control group (N = 53). The quantitative data were collected from 3 classes of 3 teachers and 47 students for all three test times. To show differences between the groups, a standardized reading comprehension test was used for the three test times: pretest, posttest, and follow-up. The standardized test consists of three subtests covering word comprehension, sentence comprehension, and text comprehension. The main findings include that students who spoke German as their first language had the best test scores. Interestingly, students who spoke a different language had better test scores than students who spoke German as the first language along with one or more other languages. Also, the students with another language outperformed the native speakers in one of the subtests of the posttest. The variables of spoken language at home and German as a second language were also examined and correlated with the test results. One significant correlation was found between spoken language at home and the text comprehension subtest of the pretest. Additionally, the variable German as a second language had multiple significant correlations in the pretest, posttest and follow-up. The study's significance lies in understanding the influence of several languages, language 1, and language 2 on reading comprehension.

Keywords: multilingualism, language 1, language 2, reading comprehension, second language

Procedia PDF Downloads 33

64 Translating Discourse Organization Structures Used in Chinese and English Scientific and Engineering Writings

Authors: Ming Qian, Davis Qian

Abstract:

This study compares the different organization structures of Chinese and English written discourse in the engineering and scientific fields, and recommends approaches for translators to convert the organization structures properly. According to the existing intercultural communication literature, English authors tend to state their main points deductively at the beginning, followed by detailed explanations or arguments, while Chinese authors tend to place their main points inductively towards the end. In this study, this hypothesis has been verified by the authors’ Chinese-to-English translation experience in the fields of science and engineering (e.g. journal papers, conference papers and monographs). The basic methodology is to compare, in terms of organization structure, writings by Chinese authors with writings on the same or similar topics by English authors. Translators should be aware of this nuance, so that instead of limiting themselves to translating the contents of an article in its original structure, they can convert the structures to bridge the cross-cultural gap. This approach can be controversial because if a translator changes the organization structure of a paragraph (e.g. from a 'because-therefore' inductive structure by a Chinese author to a deductive structure in English), this change of sentence order could be questioned by the original authors. For this reason, translators need to properly inform the original authors about the intercultural differences between English and Chinese writing (e.g. inductive structure versus deductive structure), and work with the original authors to maintain accuracy while converting from the structure used in the source language to another structure in the target language. The authors have incorporated these methodologies into their translation practices and work closely with the original authors on the intercultural organization structure mapping. Translating discourse organization structure should become a standard practice in the translation process.

Keywords: discourse structure, information structure, intercultural communication, translation practice

Procedia PDF Downloads 443

63 Cognitive and Functional Analysis of Experiencer Subject and Experiencer Object Psychological Predicate Constructions in French

Authors: Carine Kawakami

Abstract:

In French, as well as in English, there are two types of psychological predicate constructions depending on where the experiencer argument is realized: the first type is in the subject position (e.g. Je regrette d’être venu ici. ‘I regret coming here.'), hereinafter called the ES construction, and the second type is in the object position (e.g. Cette nouvelle m’a surpris. ‘This news surprised me.'), referred to as the EO construction. In previous studies of psychological predicates, the syntactic position of the experiencer argument has been treated only as a matter of its connection with the syntactic or semantic structure of the predicate. As a result, little attention has been paid to how the two types of experiencer realization relate to the conceptualization of the psychological event and to the function of the sentence describing the psychological event, in the sense of speech act theory. In this research, focusing on French data limited to the first-person pronoun and the present tense, the ES constructions and the EO constructions will be analyzed from a cognitive and functional approach. It will be shown that, because they can be used in soliloquy and co-occur frequently with ça (‘it’), the EO constructions may have an expressive function that reveals what the speaker feels hic et nunc, like an interjection. In the expressive case, the experiencer is construed as a locus where a feeling appears spontaneously and is construed subjectively (e.g. Ah, ça m’énerve! ‘Oh, it irritates me!'). On the other hand, the ES constructions describe the speaker’s mental state in an assertive manner rather than in an expressive and spontaneous way. In other words, they describe what the speaker feels to the interlocutor (e.g. Je suis énervé. ‘I am irritated.'). As a consequence, when the experiencer argument is realized in the subject position, it is construed objectively and has a participant feature in the sense of cognitive grammar. Finally, it will be concluded that the choice of construction type, at least in French, is correlated with the conceptualization of the psychological event and the discourse feature of its expression.

Keywords: French psychological verb, conceptualization, expressive function, assertive function, experiencer realization

Procedia PDF Downloads 139

62 Sociological Analysis on Prisoners; with Special Reference to Prisoners of Death Penalty and Life Imprisonment in Sri Lanka

Authors: Wasantha Subasinghe

Abstract:

Crime is one of the major social problems in Sri Lanka. A crime can be defined simply as an activity that goes against society or public law. Offences are divided into minor crimes and grave crimes, the latter including murder, rape, trafficking, robbery, excise and narcotics offences, kidnapping and so on. There are various forms of punishment, ranging from bail, fines and imprisonment to the death penalty. The death penalty involves the killing of an offender for an offence. There are 23 prison institutions in Sri Lanka, including 03 closed prisons and 20 remand prisons. There are 10 work camps, 02 open prison camps, 01 training school for youthful offenders and 02 correctional centers for youthful offenders. Capital punishment is legal in Sri Lanka, as in many other countries such as India, Japan, Bangladesh, Iran and Iraq. The number of unconvicted prisoners increased between 2006 and 2010, from 89190 in 2006 to 100191 in 2010. The number of convicted prisoners rose from 28732 to 32128 in 2010. There were 165 death sentences in 2006 and 96 in 2010. In total, 540 individuals had been sentenced to death. The death penalty has not been implemented in Sri Lanka since 1976. Research problem: What are the feelings of prisoners as they wait for death? The objectives of the study were to identify prisoners’ points of view on their punishment and the root causes of their offences. Case studies were conducted to investigate the research problem, and data were collected using formal interviews. The research area was Welikada prison. Stratified sampling, a probability sampling method, was used. The sample comprised 20 cases from prisoners sentenced to death or life imprisonment and 20 from other convicted prisoners. The findings revealed the causes of their offences and their feelings as offenders. They want either the death penalty to be carried out or their freedom. Some of them want their death sentence converted to life imprisonment. They are physically and mentally damaged after their imprisonment. They suffer from a lack of hope as well as a lack of welfare and rehabilitation programs.

Keywords: death penalty, expectations, life imprisonment, rehabilitation

Procedia PDF Downloads 285

61 Narrative Therapy as a Way of Terrorist Rehabilitation at Mohammad Bin Naif Counselling and Care Center: A Case Study

Authors: Yasser Almazrua

Abstract:

Terrorism is a multidimensional phenomenon that has increased recently. Many countries started combating terrorism through security forces; however, relatively little attention has been given to rehabilitation programs for people involved in acts of terrorism. Saudi Arabia, after facing many terrorist attacks, began to understand and counter terrorism differently by establishing the Mohammad bin Naif Counselling and Care Center in 2006. The center is now considered one of the most experienced centers in the world for terrorist rehabilitation and ideology correction. The center offers different programs, such as training, educational, social, art and psychological programs. One of the approaches used by psychological experts at the center is Narrative Therapy. It is a therapeutic approach that focuses on the ability of clients to identify their personal life story. During therapy, the client works as a storyteller and gains insight, meaning and a better understanding of his or her own life. Because each client at the center has a story, it can be a better-fitting method for rehabilitation, leading towards healing and personal development. The case describes a 34-year-old man who was involved in terrorism activities locally by technically and financially supporting a terrorist group related to Al-Qaida. The beneficiary joined the Mohammad bin Naif Counselling and Care Center after serving his sentence. Informed consent was obtained from the beneficiary before starting the therapeutic program. Both qualitative and quantitative data on the beneficiary were collected by self-reporting during the initial session and by using a psychological measurement. The results showed that the beneficiary had little insight into himself and a high level of repression, which in turn made him a target for recruitment by the terrorist group. With rehabilitation and the use of the therapeutic approach, the beneficiary improved in his level of insight, specifically about himself and also about the experience. This case illustrates the importance of considering the effect of Narrative Therapy in terrorist rehabilitation programs.

Keywords: narrative therapy, rehabilitation, Saudi Arabia, terrorism

Procedia PDF Downloads 299

60 Compilation and Statistical Analysis of an Arabic-English Legal Corpus in Sketch Engine

Authors: C. Brierley, H. El-Farahaty, A. Farhan

Abstract:

The Leeds Parallel Corpus of Arabic-English Constitutions is a parallel corpus for the Arabic legal domain. Analysis of legal language via Corpus Linguistics techniques is an important development. In legal proceedings, a corpus-based approach to disambiguating meaning is set to replace the dictionary as an interpretative tool, and legal scholarship in the States is now attuned to the potential for Text Analytics over vast quantities of text-based legal material, following the business and medical industries. This trend is reflected in Europe: the interdisciplinary research group in Computer Assisted Legal Linguistics mines big data collections of legal and non-legal texts to analyse: legal interpretations; legal discourse; the comprehensibility of legal texts; conflict resolution; and linguistic human rights. This paper focuses on ‘dignity’ as an important aspect of the overarching concept of human rights in current constitutions across the Arab world. We have compiled a parallel, Arabic-English raw text corpus (169,861 Arabic words and 205,893 English words) from reputable websites such as the World Intellectual Property Organisation and CONSTITUTE, and uploaded and queried our corpus in Sketch Engine. Our most challenging task was sentence-level alignment of Arabic-English data. This entailed manual intervention to ensure correspondence on a one-to-many basis since Arabic sentences differ from English in length and punctuation. We have searched for morphological variants of ‘dignity’ (كرامة, karāma) in the Arabic data and inspected their English translation equivalents. The term occurs most frequently in the Sudanese constitution (10 instances), and not at all in the constitution of Palestine. Its most frequent collocate, determined via the logDice statistic in Sketch Engine, is ‘human’ as in ‘human dignity’.
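For readers unfamiliar with the logDice statistic mentioned in this abstract, the following is a minimal sketch of its usual definition, logDice = 14 + log2(2 * f_xy / (f_x + f_y)), where f_xy is the co-occurrence frequency of the node and collocate and f_x, f_y are their individual frequencies. The function name and the frequency counts below are hypothetical; they are not figures from the Leeds corpus or calls to any Sketch Engine API.

```python
from math import log2


def log_dice(f_xy, f_x, f_y):
    """logDice association score for a node word and a candidate collocate.

    f_xy: number of co-occurrences of the two words
    f_x:  frequency of the node word
    f_y:  frequency of the collocate
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    if f_xy > min(f_x, f_y):
        raise ValueError("co-occurrences cannot exceed either word frequency")
    return 14 + log2(2 * f_xy / (f_x + f_y))


if __name__ == "__main__":
    # Hypothetical counts for 'dignity' and two candidate collocates.
    candidates = {
        "human": (40, 60, 300),      # (f_xy, f_x, f_y)
        "citizen": (5, 60, 450),
    }
    ranked = sorted(candidates, key=lambda w: log_dice(*candidates[w]), reverse=True)
    for word in ranked:
        print(word, round(log_dice(*candidates[word]), 2))
```

The score has a theoretical maximum of 14, reached when the two words only ever occur together, and it does not depend on corpus size, which makes it convenient for comparing collocates across corpora of different sizes.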

Keywords: Arabic constitution, corpus-based legal linguistics, human rights, parallel Arabic-English legal corpora

Procedia PDF Downloads 185

59 Another Justice: Litigation Masters in Chinese Legal Story

Authors: Lung-Lung Hu

Abstract:

Ronald Dworkin offered a legal theory of the ‘chain enterprise’, in which all the judges in legal history together create a ‘law’ aimed at a specific purpose. Those judges are like co-writers of a chain-story who not only create freely but are also constrained by the story made by the judges before them. The law created by traditional Chinese judges is a different case: compared with the judges described by Ronald Dworkin, they have relatively little room to pass a legal sentence at their own discretion, because the statutes in traditional Chinese law were designed from the very beginning as a penal code that leaves little room for a judge’s discretion. Furthermore, because law represents the authority of the government, i.e. the emperor, any misjudgments and misuses deviating from the law will be considered a challenge to the supreme power. However, unlike judges, who are the defenders of law, Chinese litigation masters who want to win legal cases have to be offenders who challenge a verdict that does not favor their own or their client’s interest. Besides, the litigation master, as an illegal or non-authorized profession, does not belong to any legal system; therefore, litigation masters are relatively freer to ‘create’ the law. In articles that question Ronald Dworkin’s and Owen Fiss’ ideas about law, Stanley Fish contends that, since law is made of language, law is open to interpretations that cannot be constrained by any rules or any particular legal purposes. Stanley Fish’s idea can also be applied to the analysis of the stories of Chinese litigation masters in traditional Chinese literature. These Chinese litigation masters’ legal opinions in the so-called chain enterprise are like an unexpected episode that tries to revise the fixed story told by law. Although they are not welcomed by officials or by society, their existence is still a phenomenon representing another version of justice, different from the official one, and can be seen as a de-structural power against the government. Hence, in the present paper, the language and strategy applied by Chinese litigation masters in Chinese legal stories will be analysed to see how they refute legal judgments already made and challenge the official standard of justice.

Keywords: Chinese legal stories, interdisciplinary, litigation master, post-structuralism

Procedia PDF Downloads 393

58 English Language Teaching Graduate Students' Use of Discussion Moves in Research Articles

Authors: Gamzegul Koca, Evrim Eveyik-Aydin

Abstract:

Genre and discipline-specific knowledge of academic discourse in writing has long been acknowledged as a core skill for achieving the formidable tasks that are expected of graduate students in academic settings. Genre analysis approaches can be adopted to unveil the challenges encountered in these tasks, so that instructional actions can be taken to address the aspects of graduate writing that need improvement. In an attempt to find the genre-specific academic writing needs of Turkish students enrolled in a graduate program in ELT, this study examines the rhetorical structure of the discussion sections of research articles written during the course load stage of their graduate studies. The 35,437-word specialized corpus of graduate papers compiled for the purpose of the study includes discussions of 58 unpublished reports of empirical studies, 31 written in MA courses and 27 in Ph.D. courses by a total of 44 graduate students. The study conducts a sentence-based move structure analysis using the framework developed by Eveyik-Aydın, Karabacak and Akyel in a corpus-based study that analyzed the discussion moves of expert writers in published articles in ELT journals indexed by Social Sciences Citation. The coding of 1577 sentences by three graders using this framework revealed that while the graduate papers included the same moves used in published articles, the rhetorical structure of MA and Ph.D. papers showed considerable differences in terms of the frequency of occurrence of main discussion moves, including interpretation of the results and drawing implications. The implications of these findings will be discussed with respect to the needs of graduate writers and the expectations of the discourse community.

Keywords: discussion moves, genre-specific rhetorical structure, move analysis, research articles, the specialized corpus of graduate papers

Procedia PDF Downloads 168

57 Deep Learning Based Text to Image Synthesis for Accurate Facial Composites in Criminal Investigations

Authors: Zhao Gao, Eran Edirisinghe

Abstract:

The production of an accurate sketch of a suspect based on a verbal description obtained from a witness is an essential task in most criminal investigations. The criminal investigation system employs specifically trained professional artists to manually draw a facial image of the suspect according to the descriptions of an eyewitness for subsequent identification. With the advancement of Deep Learning, Recurrent Neural Networks (RNN) have shown great promise in Natural Language Processing (NLP) tasks. Additionally, Generative Adversarial Networks (GAN) have proven to be very effective in image generation. In this study, a trained GAN, conditioned on textual features such as keywords automatically encoded from a verbal description of a human face using an RNN, is used to generate photo-realistic facial images for criminal investigations. The intention of the proposed system is to map corresponding features into text generated from verbal descriptions. With this, it becomes possible to generate many reasonably accurate alternatives that the witness can use to identify a suspect. This reduces subjectivity in decision making by both the eyewitness and the artist while giving the witness an opportunity to evaluate and reconsider decisions. Furthermore, the proposed approach benefits law enforcement agencies by reducing the time taken to physically draw each potential sketch, thus shortening response times and mitigating potentially malicious human intervention. Using the publicly available 'CelebFaces Attributes Dataset' (CelebA), supplemented with verbal descriptions as training data, the proposed architecture is able to effectively produce facial structures from given text. Word embeddings are learnt by applying the RNN architecture in order to perform semantic parsing, the output of which is fed into the GAN for synthesizing photo-realistic images.
Rather than the grid search method, a metaheuristic search based on genetic algorithms is applied to evolve the network, with the intent of achieving optimal hyperparameters in a fraction of the time of a typical brute-force approach. Apart from the ‘CelebA’ training database, further novel test cases are supplied to the network for evaluation. Witness reports detailing criminals from Interpol or other law enforcement agencies are sampled on the network. Using the descriptions provided, samples are generated and compared with the ground-truth images of a criminal in order to calculate the similarities. Two metrics are used for performance evaluation: the Structural Similarity Index (SSIM) and the Peak Signal-to-Noise Ratio (PSNR). High scores on these metrics would demonstrate the accuracy of the generated images, in the hope of showing that the proposed approach can be an effective tool for law enforcement agencies. The proposed approach to criminal facial image generation has the potential to increase the proportion of criminal cases that can ultimately be resolved using eyewitness information.
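
As an illustration of one of the two evaluation metrics named above, the following is a minimal sketch of PSNR for two equal-sized grayscale images. The function name `psnr`, the nested-list image representation, and the 8-bit peak value of 255 are assumptions made for this example and are not taken from the paper; SSIM is omitted because it needs windowed local statistics.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak Signal-to-Noise Ratio (in dB) between two grayscale images.

    Images are nested lists of pixel intensities (an assumed, simplified
    representation). Higher values mean the generated image is closer
    to the reference.
    """
    ref = [p for row in reference for p in row]
    gen = [p for row in generated for p in row]
    if not ref or len(ref) != len(gen):
        raise ValueError("images must be non-empty and of equal size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    truth = [[10, 10], [200, 200]]
    sample = [[20, 20], [190, 190]]
    print(f"PSNR: {psnr(truth, sample):.2f} dB")
```

In practice an image library would normally supply both metrics; the pure-Python version only makes the definition explicit.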

Keywords: RNN, GAN, NLP, facial composition, criminal investigation

Procedia PDF Downloads 167

56 Gender Agreement in Italian Compounds with Capo-

Authors: Irene Lami, Silvia Micheli, Jan Radimský, Joost van de Weijer

Abstract:

The present study examines gender agreement in Italian compounds with "capo-". Compounds containing "capo-" as the first element are highly productive in Italian and are attested from the earliest stages of the language, with "capo" indicating a prominent role in a group. This type of compound has become progressively more productive over time, establishing itself in the language to indicate human referents with a leadership role over someone or something, in both subordinate and coordinate compound categories. In light of the debates on the use of inclusive language, especially with regard to female professional titles in Italian, the gender agreement of the word "capo" is investigated, which, in addition to social resistance, also encounters etymological resistance. Regarding the gender agreement of "capo-" as the first element of compounds, morphological constraints must also be considered in addition to social and etymological resistance. In our experiment, 190 native informants were asked to match the gender of the given word in a sentence, thinking of female referents. The results confirm a scalar hypothesis of gender agreement (i.e., titles traditionally attributed to women > titles traditionally attributed to men > the word "capo" in isolation > the word "capo-" as an element of a subordinate compound > the word "capo-" as an element of a coordinate compound). A significant interplay with number marking is also shown, as words are inflected in gender when the trait +plural is present. Moreover, the results show that, contrary to what is prescriptively established, speakers do inflect the word "capo" according to gender in limited instances, even when it occurs as a compound element, although to a lesser extent than words that face only social hindrances and not etymological or morphological ones.
The results appear to show that, although a morphological obstacle is visible, sociolinguistic claims seem to be able to divert these obstacles. This study appears particularly suitable for replication tests over the next few decades, which, if society opens up further to claims of inclusiveness, could further corroborate this trend.

Keywords: compounds, gender inflection, Italian, morphology

Procedia PDF Downloads 62

55 Logical Thinking: A Surprising and Promising Insight for Creative and Critical Thinkers

Authors: Luc de Brabandere

Abstract:

Researchers in various disciplines have long tried to understand how a human being thinks. Most of them seem to agree that the brain works in two very different modes. For us, the first phase of thought imagines, diverges, and unlocks the field of possibilities. The second phase judges, converges, and chooses. But if we were to stop there, that would give the impression that thought is essentially an individual effort that seldom depends on context. This is, however, not the case. Whether we are a champion in creativity, so primarily in induction, or a master in logic, where we are confronted with reality, the ideas we lay out are indeed destined to be presented to third parties. They should therefore be exposed, defended, communicated, negotiated, or even sold. Regardless of the quality of the concepts we craft (creative thinking) and the inferences we build (logical thinking), we will one day or another be confronted by people whose beliefs, opinions and ideas differ from ours (critical thinking). Logic and critique: The shared characteristics of logical and critical thought include a three-level structure of reasoning invented by the Greeks. For the first time in history, Aristotle tried to model thought as deployable in three stages: the concept, the statement, and the reasoning. The three levels can be assessed according to different criteria. A concept is more or less useful, a statement is true or false, and reasoning is right or wrong. This three-level structure allows us to differentiate logic and critique, where the intention and words used are not the same. Logic only deals with the structure of reasoning and exhausts the problem. It regards premises as acquired and excludes debate. Logic operates in certainty and pursues the truth. Critique is most probably searching for the plausible. Logic and creativity: Many known models present the brain as a two-stroke engine (divergence vs. convergence, fast vs. slow, left-brain vs. right-brain, Yin vs. Yang, etc.). But that’s not the only thing. “Why didn’t we think of that before?” How often have we heard that sentence? A creative idea is the outcome of logic, but you can only understand it afterward! Through the use of exercises, we will witness how logic and creativity work together. A third theme is hidden behind the two main themes of the conference: logical thought, which the author can shed some light on.

Keywords: creativity, logic, critique, digital

Procedia PDF Downloads 93

54 Gender-Based Differences in the Social Judgment of Hungarian Politicians' Sex Scandals

Authors: Sara Dalma Galgoczi, Judith Gabriella Kengyel

Abstract:

Sex scandals are quite an engaging topic to work with, especially regarding how society judges them. Most people are interested in other people's lives, specifically those of public figures such as celebrities or politicians, because ordinary people feel they have the right to know more about the famous and notorious than those figures would probably be willing to share. Intimacy and sexual acts are no exception; moreover, sexuality has always been one of the central interests of humans. Besides, knowing and having an opinion about any kind of scandal can change the esteem in which even whole social groups or classes hold someone. This study aims to research the social judgment of some Hungarian politicians' sex scandals and asks important questions about, for example, how public opinion differs by gender and about delegates’ abuse of power. Considering that this study is about collecting and evaluating opinions from the public, and that no one has previously researched and published on this exact topic and these cases, an online survey was created. The survey had several sections. We collected data about party preference and position on a conservatism-liberalism scale; then we used the following questionnaires: the zero-sum perspective with regard to gender equality (Ruthig, Kehn, Gamblin, Vanderzanden & Jones, 2017), the Ambivalent Sexism Inventory (ASI; Glick & Fiske, 1996), and the Ambivalence Toward Men Inventory (AMI; Glick & Fiske, 1999). Finally, five short summaries were presented of five Hungarian politicians' sex scandal cases (3 males, 2 females) from the recent past. These stories were followed by questions about respondents' opinion of the party and attitudes towards the parties' reactions to the cases. We came to the conclusion that people are more permissive towards the scandals of men, and that benevolent sexism and ambivalence towards men mediate this relation. Men tend to see these cases as part of politicians' private lives more than women do.
Party preference had a significant effect: people tend to condemn the delegates of opposing parties and to excuse the delegates of their preferred party.

Keywords: sex scandal, sexism, social judgement, politician

Procedia PDF Downloads 127

53 Legal Judgment Prediction through Indictments via Data Visualization in Chinese

Authors: Kuo-Chun Chien, Chia-Hui Chang, Ren-Der Sun

Abstract:

Legal Judgment Prediction (LJP) is a subtask of legal AI. Its main purpose is to use the facts of a case to predict the judgment result. In Taiwan's criminal procedure, when prosecutors complete the investigation of a case, they decide whether to prosecute the suspect and which article of criminal law should be applied, based on the facts and evidence of the case. In this study, we collected 305,240 indictments from the public inquiry system of the procuratorate of the Ministry of Justice, which included 169 charges and 317 articles from 21 laws. We take the crime facts in the indictments as the main input to jointly learn a prediction model for law source, article, and charge, based on the pre-trained BERT model. For single-article cases where the frequencies of the charge and the article are greater than 50, the prediction performance for law sources, articles, and charges reaches 97.66, 92.22, and 60.52 macro-F1, respectively. To understand the large performance gap between articles and charges, we used a bipartite graph to visualize the relationship between the articles and charges, and found that the poor prediction performance was due to the varying precision of the charge wording. Some charges use the simplest words, while others may include the perpetrator or the result to make the charges more specific. For example, Article 284 of the Criminal Law may be indicted as "negligent injury", "negligent death", "business injury", "driving business injury", or "non-driving business injury". As another example, Article 10 of the Drug Hazard Control Regulations can be charged as "Drug Control Regulations" or "Drug Hazard Control Regulations". To solve the above problems and predict the article and charge more accurately, we plan to include the article content or charge names in the input and use the sentence-pair classification method that BERT applies to question-answer problems to improve performance.
We will also consider a sequence-to-sequence approach to charge prediction.
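
Because the results above are reported as macro-F1, a short sketch of that metric may help explain why many near-synonymous charge labels can drag the score down. The function name `macro_f1` and the toy labels are assumptions for illustration only; they are not the authors' code or data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Every class counts equally, so a rarely used label that is always
    mispredicted lowers the score as much as a frequent one would.
    """
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Hypothetical labels: the two strings name the same offence with
    # different wording, and the model always predicts the common variant.
    gold = ["negligent injury"] * 3 + ["driving business injury"]
    pred = ["negligent injury"] * 4
    print(f"accuracy = {sum(g == p for g, p in zip(gold, pred)) / len(gold):.2f}")
    print(f"macro-F1 = {macro_f1(gold, pred):.4f}")
```

In this toy case accuracy is 0.75 while macro-F1 is far lower, which mirrors the kind of gap the abstract attributes to inconsistent charge wording.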

Keywords: legal judgment prediction, deep learning, natural language processing, BERT, data visualization

Procedia PDF Downloads 125

52 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change

Authors: Ermias A. Tegegn, Million Meshesha

Abstract:

Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death in most African countries. Ethiopia is one of the seriously affected countries in sub-Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become a chronic but manageable disease. The study focused on a data mining technique to predict the future living status of HIV/AIDS patients at the time of drug regimen change, when patients develop toxicity to the ART drug combination they are currently taking. The data were taken from the University of Gondar Hospital ART program database. A hybrid methodology was followed to explore the application of data mining to the ART program dataset. Data cleaning, handling of missing values, and data transformation were used for preprocessing the data. WEKA 3.7.9 data mining tools, classification algorithms, and expertise were utilized to address the research problem. By using four different classification algorithms (J48 classifier, PART rule induction, Naïve Bayes, and neural network) and adjusting their parameters, thirty-two models were built on the pre-processed University of Gondar ART program dataset. The performance of the models was evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model for predicting the status of HIV patients with drug regimen substitution is the pruned J48 decision tree, with a classification accuracy of 98.01%. This study extracts interesting attributes such as Ever taking Cotrim, Ever taking TbRx, CD4 count, Age, Weight, and Gender to predict the status of drug regimen substitution. The outcome of this study can be used as an assistive tool to help clinicians make more appropriate drug regimen substitutions. Future research directions are proposed for developing an applicable system in the area of the study.

Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model

Procedia PDF Downloads 143

51 Specific Language Impairment: Assessing Bilingual Children for Identifying Children with Specific Language Impairment (SLI)

Authors: Manish Madappa, Madhavi Gayathri Raman

Abstract:

The primary vehicle of human communication is language. A breakdown occurring in any aspect of communication may lead to frustration and isolation among learners and teachers. Over seven percent of the world's population currently experiences limitations, and children who exhibit a deviant or deficient language acquisition curve, even when they are in the same language-rich environment as their peers, may be at risk of having a language disorder or language impairment. The difficulty may be at the word level [vocabulary/word knowledge] and/or the sentence level [syntax/morphology]. Children with SLI appear to be developing normally in all aspects except for their receptive and/or expressive language skills. Thus, it is of utmost importance to identify children with or at risk of SLI so that early intervention can foster language and social growth, provide the best possible learning environment with special support for language to be explicitly taught, and serve as a step towards providing continuous and ongoing support. The present study looks at Kannada-English bilingual children and works towards identifying children at risk of “specific language impairment”. It was conducted as an exploratory study that systematically enquired into the narratives of young Kannada-English bilinguals and investigated the data for story structure in their narrative formulations. Oral narrative offers a rich source of data about a child’s language use in a relatively natural context. The fundamental objective is to ensure comparability and to be more universal, thus allowing for the evaluation of narrative text competence. The data were collected from 10 class three students at a primary school in Mysore, Karnataka, and analyzed for the macrostructure component reflecting the goal-directed behavior of a protagonist who is motivated to carry out some kind of action with the intention of attaining a goal.
The results show that children exhibiting a deviation of -1.25 SD are at risk of SLI. Two learners were identified as being at risk of Specific Language Impairment, with scores more than 1.25 standard deviations below the mean.
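
The cutoff described above can be sketched as a z-score check. The scores below are invented for illustration and are not the study's data; the function name `flag_at_risk`, the use of the sample standard deviation, and the choice to include the boundary value are all assumptions.

```python
from statistics import mean, stdev


def flag_at_risk(scores, cutoff=-1.25):
    """Return the ids whose score lies at least 1.25 SD below the group mean.

    `scores` maps a learner id to a narrative macrostructure score.
    The sample standard deviation is used here (an assumption; the
    abstract does not say which estimator was applied).
    """
    values = list(scores.values())
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return []  # no spread, so nobody deviates
    return [who for who, s in scores.items() if (s - mu) / sd <= cutoff]


if __name__ == "__main__":
    # Hypothetical scores for ten learners.
    demo = {"c1": 20, "c2": 21, "c3": 19, "c4": 22, "c5": 20,
            "c6": 21, "c7": 20, "c8": 9, "c9": 8, "c10": 21}
    print(flag_at_risk(demo))
```

With so small a group the mean and SD are themselves sensitive to the low scorers, so a flag of this kind is better read as a screening signal than as a diagnosis.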

Keywords: bilingual, oral narratives, SLI, macrostructure

Procedia PDF Downloads 291

50 Syntax and Words as Evolutionary Characters in Comparative Linguistics

Authors: Nancy Retzlaff, Sarah J. Berkemer, Trudie Strauss

Abstract:

In the last couple of decades, the digitalization of all kinds of data has probably been one of the major advances in all fields of study. This paves the way for analysing these data even though they might come from disciplines where there was no initial computational necessity to do so. Linguistics in particular has a rather manual tradition. Still, when considering studies that involve the history of language families, it is hard to overlook the striking similarities to bioinformatics (phylogenetic) approaches. Alignments of words are one fairly well-studied example of an application of bioinformatics methods to historical linguistics. In this paper we will consider not only alignments of strings, i.e., words in this case, but also alignments of syntax trees of selected Indo-European languages. Based on initial, crude alignments, a sophisticated scoring model is trained on both letters and syntactic features. The aim is to gain a better understanding of which features in two languages are related, i.e., most likely to have the same root. Initially, all words in two languages are pre-aligned with a basic scoring model that primarily selects consonants and adjusts them before fitting in the vowels. Mixture models are subsequently used to filter ‘good’ alignments depending on the alignment length and the number of inserted gaps. Using these selected word alignments, it is possible to perform tree alignments of the given syntax trees and consequently find sentences that correspond rather well to each other across languages. The syntax alignments are then filtered for meaningful scores: ‘good’ scores contain evolutionary information and are therefore used to train the sophisticated scoring model. Further iterations of alignments and training steps are performed until the scoring model saturates, i.e., barely changes anymore.
A better evaluation of the trained scoring model and its function in capturing evolutionarily meaningful information will be given. An assessment of sentence alignment compared to possible phrase structure will also be provided. The method described here may have its flaws because of limited prior information. This, however, may offer a good starting point to study languages where only little prior knowledge is available and a detailed, unbiased study is needed.
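
To make the word-alignment step concrete, here is a minimal sketch of global pairwise alignment in the Needleman-Wunsch style that bioinformatics uses for sequences. It is a stand-in for the paper's basic scoring model, not a reproduction of it: the flat match, mismatch, and gap scores, the function name `align`, and the example cognates are assumptions, and the consonant-first heuristic and trained scoring model described above are not implemented.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two words; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # English/German cognates, chosen only as an illustration.
    total, top, bottom = align("night", "nacht")
    print(total)
    print(top)
    print(bottom)
```

The alignment length and the number of inserted gaps, which the abstract uses to filter 'good' alignments, can be read directly from the two returned strings.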

Keywords: alignments, bioinformatics, comparative linguistics, historical linguistics, statistical methods

Procedia PDF Downloads 159

49 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This is because this kind of bias, when present in text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is important. The goal of this work is to identify subjectively biased language in text at the sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as the upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as a data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network with an attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), a benchmark dataset compiled from Wikipedia edits that removed various biased instances from sentences, with which we also compare our model to existing approaches. Experimental analysis indicates an improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing

Procedia PDF Downloads 134

48 Discovering Word-Class Deficits in Persons with Aphasia

Authors: Yashaswini Channabasavegowda, Hema Nagaraj

Abstract:

Aim: The current study aims at discovering word-class deficits concerning the noun-verb ratio in confrontation naming, picture description, and picture-word matching tasks. A total of ten persons with aphasia (PWA) and ten age-matched neurotypical individuals (NTI) were recruited for the study. The research includes both behavioural and objective measures to assess the word-class deficits in PWA. Objective: The main objective of the research is to identify word-class deficits seen in persons with aphasia, using various speech-eliciting tasks. Method: The study was conducted in the L1 of the participants, considered to be Kannada. The Action Naming Test and the Boston Naming Test, adapted to Kannada, were administered to the participants; a picture description task was also carried out. The picture-word matching task was carried out using E-Prime software (version 2) to measure accuracy and reaction time in the identification of verbs and nouns. The stimulus was presented through auditory and visual modes. Data were analysed to identify errors in the naming of nouns versus verbs in the Boston Naming Test and the Action Naming Test, and the usage of nouns and verbs in the picture description task. Reaction time and accuracy for picture-word matching were extracted from the software. Results: PWA showed a significant difference in sentence structure compared to age-matched NTI. PWA also showed impairment in syntactic measures in the picture description task, with fewer grammatically correct sentences and fewer correct uses of verbs and nouns, and they produced a greater proportion of nouns compared to verbs. PWA had poorer accuracy and lesser reaction time in the picture-word matching task compared to NTI, and accuracy was higher for nouns than for verbs in PWA. The deficits were noticed irrespective of the cause leading to aphasia.

Keywords: nouns, verbs, aphasia, naming, description

Procedia PDF Downloads 106

47 Climate Change and Health in Policies

Authors: Corinne Kowalski, Lea de Jong, Rainer Sauerborn, Niamh Herlihy, Anneliese Depoux, Jale Tosun

Abstract:

Climate change is considered one of the biggest threats to human health of the 21st century. The link between climate change and health has received relatively little attention in the media, in research and in policy-making. A long-term and broad overview of how health is represented in the legislation on climate change is missing in the legislative literature. It is unknown if or how the argument for health is referred to in legal clauses addressing climate change, in national and European legislation. Integrating science-based evidence on the impacts of climate change on health into policies could be a key step towards inciting the political and societal changes necessary to decelerate global warming. This may also drive the implementation of new strategies to mitigate the consequences for health systems. To provide an overview of this issue, we are analyzing the Global Climate Legislation Database provided by the Grantham Research Institute on Climate Change and the Environment. This institution was established in 2008 at the London School of Economics and Political Science. The database consists of (updated as of 1st January 2015) legislations on climate change in 99 countries around the world. This tool offers relevant information about the state of climate-related policies. We will use the database to systematically analyze the 829 identified legislations to determine how health is represented as a relevant aspect of climate change legislation. We are conducting explorative research on national and supranational legislations and anticipate health to be addressed in various forms. The goal is to highlight how often, in what specific terms, and which aspects of health or health risks of climate change are mentioned in various legislations. The position and recurrence of the mention of health are also of importance.
Data will be extracted with complete quotation of the sentence that mentions health, which will allow for a second, qualitative stage to analyze which aspects of health are represented and in what context. This study is part of an interdisciplinary project called 4CHealth that compares the results of research on scientific, political and press literature to better understand how knowledge on climate change and health circulates within those different fields and whether and how it is translated into real-world change.
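
The extraction step described above, quoting in full each sentence that mentions health, could be sketched as follows. The naive punctuation-based sentence splitter, the function name `sentences_mentioning`, and the sample text are assumptions for illustration; legal texts with abbreviations and numbered clauses would need a more careful segmenter.

```python
import re


def sentences_mentioning(text, keyword="health"):
    """Return every sentence of `text` that contains `keyword` as a whole word."""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


if __name__ == "__main__":
    # Invented sample text, not taken from any real legislation.
    sample = ("This Act reduces emissions. It also protects public health. "
              "Healthy soils matter.")
    for hit in sentences_mentioning(sample):
        print(hit)
```

Matching on the whole word keeps the count focused on explicit mentions; a second pass with related terms (for example disease names) would be a separate design decision.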

Keywords: climate change, explorative research, health, policies

Procedia PDF Downloads 369

46 Everyday Interactions among Imprisoned Sex Offenders: A Qualitative Study within the 'Due Palazzi' Prison in Padua

Authors: Matteo Mazzucato, Elena Faccio, Antonio Iudici

Abstract:

Prison is a social reality constructed by everyday interactions between an inmate, other social actors (cellmates, prison officers, educationalists and psychologists or other detainees) and the external world, which participates in this complex construction through the social discourses on prison reality and its problems. Being a detainee means performing a self that deals with processes of stereotyping, the attribution of a social role, and prejudices assigned by various interlocutors depending on what kind of crime one has been convicted of. Among all inmates, sex offenders are the ones most at risk of being socially condemned beyond their legal sentence, since they have committed one of the most hated and disapproved crimes. In this respect, prison has to be considered a critical context in which all community expectations and beliefs converge: for common sense, rapists and child molesters are dangerous people who have to be stigmatized, punished and isolated. Furthermore, other detainees share a code of conduct by which the ‘sex offender’ is placed at the lowest level of the social hierarchy of the prison. The penitentiary administration, too, defines this kind of detainee as a ‘vulnerable person to protect’, while prison staff consider him a particular inmate who has to be treated and definitively changed. Considering all the complexities connected with being imprisoned as a sex offender, our research aimed at exploring how people convicted of sex crimes are called upon to manage all these hetero-narrations about their selves. Given this goal, textual data retrieved from this qualitative research show that sex offenders tend not to face the stigma assigned to them. Rather, they tend to minimize the storytelling about their selves and construct alternative biographies to be shared with other inmates.
Managing narrations about their selves in this way allows them to distance themselves from all the threats perceived in living together with other detainees, but it blocks sex offenders’ re-signification of their offences during prison treatment. Given these results, prison administration should develop activities that create fields of interaction between detainees where they can experience new versions of their selves that are usable even in external social situations. In this regard, it is important to reconsider prison as part of the community and the sex offender as a member of it.

Keywords: interactions, qualitative research, prison reality, sex offender

Procedia PDF Downloads 224

45 Contextual SenSe Model: Word Sense Disambiguation using Sense and Sense Value of Context Surrounding the Target

Authors: Vishal Raj, Noorhan Abbas

Abstract:

Ambiguity in NLP (Natural Language Processing) refers to the ability of a word, phrase, sentence, or text to have multiple meanings. This results in various kinds of ambiguities such as lexical, syntactic, semantic, anaphoric and referential ambiguities. This study is focused mainly on solving the issue of lexical ambiguity. Word Sense Disambiguation (WSD) is an NLP technique that aims to resolve lexical ambiguity by determining the correct meaning of a word within a given context. Most WSD solutions rely on words for training and testing, but we have used lemma and Part of Speech (POS) tokens of words for training and testing. Lemma adds generality and POS adds properties of word into token. We have designed a novel method to create an affinity matrix to calculate the affinity between any pair of lemma_POS (a token where lemma and POS of word are joined by underscore) of given training set. Additionally, we have devised an algorithm to create the sense clusters of tokens using affinity matrix under hierarchy of POS of lemma. Furthermore, three different mechanisms to predict the sense of target word using the affinity/similarity value are devised. Each contextual token contributes to the sense of target word with some value and whichever sense gets higher value becomes the sense of target word. So, contextual tokens play a key role in creating sense clusters and predicting the sense of target word, hence, the model is named Contextual SenSe Model (CSM). CSM exhibits a noteworthy simplicity and explication lucidity in contrast to contemporary deep learning models characterized by intricacy, time-intensive processes, and challenging explication. CSM is trained on SemCor training data and evaluated on SemEval test dataset. The results indicate that despite the naivety of the method, it achieves promising results when compared to the Most Frequent Sense (MFS) model.
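
The idea that each contextual lemma_POS token contributes some value to a candidate sense, with the highest total winning, can be sketched as below. This is a deliberately simplified illustration and not the CSM algorithm: plain co-occurrence counts stand in for the affinity matrix, sense clustering is omitted, and all names (`make_token`, `train_affinity`, `predict_sense`) and the toy sense labels are hypothetical.

```python
from collections import defaultdict


def make_token(lemma, pos):
    """Join lemma and POS with an underscore, e.g. 'deposit_VERB'."""
    return f"{lemma.lower()}_{pos.upper()}"


def train_affinity(examples):
    """Count how often each context token co-occurs with each sense.

    `examples` is an iterable of (sense, context_tokens) pairs. The
    counts are a stand-in (assumption) for the paper's affinity values.
    """
    affinity = defaultdict(lambda: defaultdict(float))
    for sense, context in examples:
        for token in context:
            affinity[token][sense] += 1.0
    return affinity


def predict_sense(affinity, context, senses):
    """Sum each context token's contribution per sense; highest total wins."""
    totals = {sense: 0.0 for sense in senses}
    for token in context:
        for sense in senses:
            totals[sense] += affinity.get(token, {}).get(sense, 0.0)
    # On a tie, the sense listed first is returned.
    return max(senses, key=lambda sense: totals[sense])


if __name__ == "__main__":
    training = [
        ("bank%finance", [make_token("money", "NOUN"), make_token("deposit", "VERB")]),
        ("bank%river", [make_token("river", "NOUN"), make_token("water", "NOUN")]),
    ]
    model = train_affinity(training)
    context = [make_token("deposit", "VERB"), make_token("cash", "NOUN")]
    print(predict_sense(model, context, ["bank%finance", "bank%river"]))
```

Because unseen tokens such as the hypothetical "cash_NOUN" contribute nothing, a real system would need a fallback for out-of-vocabulary context, for instance the most frequent sense.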

Keywords: word sense disambiguation (wsd), contextual sense model (csm), most frequent sense (mfs), part of speech (pos), natural language processing (nlp), oov (out of vocabulary), lemma_pos (a token where lemma and pos of word are joined by underscore), information retrieval (ir), machine translation (mt)

Procedia PDF Downloads 112

44 An Exploration of Gender Differences in Academic Writing in Science

Authors: Gayani Ranawake, Kate Wilson

Abstract:

Underrepresentation of women in academia, particularly in science, has been discussed by many scholars for decades. The causes of this underrepresentation are debated to this day. Publication is an important aspect of success in academia, and publication and citation rates are significant metrics in performance review, promotion, and employment. It has been established that men’s and women’s language use in general, both spoken and written, is different. However, no one, to our knowledge, has looked at whether men’s and women’s writing in science is different. If there are significant differences in the writing of men and women, then these differences may affect women’s ability to succeed in science. This study is part of a larger project to explore whether differences can be recognized in the academic science writing of men and women. Single-authored articles by men and women from high-ranking physics, biology and psychology journals were compared in terms of readability statistics. In particular, the abstract and introduction sections were compared, as these are the first sections encountered by a reviewer, and so may have an important effect on their impression of the work. The Flesch Reading Ease, the percentage of passive sentences and the Flesch-Kincaid Reading Grade Level were calculated for each section of each article, along with the number of sentences, words per sentence and sentences per paragraph. Significance of differences was tested using the Behrens statistic. It was found that for both physics and biology papers there were no significant differences in the complexity or verbosity of the writing of men and women authors. However, there was a significant difference between the two disciplines, with physics articles being generally more readable (higher readability score) while also more passive (higher number of passive sentences).
In contrast, the psychology articles showed a difference between men and women authors which may be significant. The average readability score for introductions was 28 in women’s articles, compared with 19 in men’s articles (higher values indicate more readable text). Women’s articles in psychology also had a greater proportion of passive sentences. It can be concluded that, at least in the more traditional sciences, men and women have adopted similar ways of writing, and that disciplinary differences are greater than gender differences. This may not be the case in psychology, which many consider to be more closely aligned with the humanities. Whether the lack of differences is because women have adapted to a masculine way of writing, or whether the genre itself is gender neutral, needs further investigation.
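For readers unfamiliar with the two readability measures named above, the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas can be sketched as follows. The formulas themselves are the published ones; the sentence splitting and the vowel-group syllable counter are rough assumptions for illustration and are not the tooling used in the study.

```python
import re

def count_syllables(word):
    """Rough heuristic: count runs of consecutive vowels (an assumption;
    real readability tools use more careful rules or dictionaries)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text):
    """Return Flesch Reading Ease and Flesch-Kincaid Grade Level for text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # average words per sentence
    spw = syllables / len(words)        # average syllables per word
    return {
        # Higher reading-ease values indicate more readable text.
        "flesch_reading_ease": 206.835 - 1.015 * wps - 84.6 * spw,
        # Approximate school grade needed to read the text.
        "flesch_kincaid_grade": 0.39 * wps + 11.8 * spw - 15.59,
        "words_per_sentence": wps,
    }

scores = readability("The cat sat on the mat. It was a sunny day.")
```

Both formulas depend only on average sentence length and average syllables per word, which is why dense academic prose tends to score low on reading ease.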

Keywords: academic writing, gender differences, readability, science

Procedia PDF Downloads 199

43 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. It is the most fundamental and basic task in almost all natural language processing. In natural language processing, providing large amounts of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the limited availability of tagged corpora is a bottleneck for natural language processing, especially for POS tagging. A promising direction to tackle this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for Amharic using K-Means clustering, motivated by the large amount of data produced in day-to-day activities. The tagger is developed in four phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this level, the raw text is split into sentences and then into words. The second phase is feature extraction, which includes word frequency and the syntactic and morphological features of a word. The third phase is clustering: among different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is examined carefully and the most common tag is assigned to the group. This study finds two features that are capable of distinguishing one part-of-speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging.
In order to increase the performance of the unsupervised part-of-speech tagger, there is a need to incorporate other features that are not included in this study, such as semantic information. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
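The feature extraction, clustering and mapping phases described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the English placeholder tokens, the single "ed" suffix feature, the farthest-point initialisation and all function names are hypothetical stand-ins, not the Amharic features or implementation used in the study.

```python
from collections import Counter, defaultdict

def extract_features(sentences, suffix):
    """Per word type: (mean relative position in sentence, has-suffix flag)."""
    positions = defaultdict(list)
    for sent in sentences:
        for i, word in enumerate(sent):
            positions[word].append(i / max(1, len(sent) - 1))
    return {
        w: (sum(p) / len(p), 1.0 if w.endswith(suffix) else 0.0)
        for w, p in positions.items()
    }

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=10):
    """Plain k-means; returns the cluster index of each point."""
    # Deterministic farthest-point initialisation (an assumption made
    # here so that the example is reproducible).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return assignment

def map_clusters(words, assignment, sample_tags):
    """Give each cluster the most common tag among its hand-checked members."""
    votes = defaultdict(Counter)
    for w, c in zip(words, assignment):
        if w in sample_tags:
            votes[c][sample_tags[w]] += 1
    return {c: counter.most_common(1)[0][0] for c, counter in votes.items()}

# Placeholder corpus, already tokenized into sentences and words.
sentences = [["dogs", "barked"], ["cats", "jumped"], ["birds", "chirped"]]
features = extract_features(sentences, suffix="ed")
words = list(features)
assignment = kmeans([features[w] for w in words], k=2)
cluster_tag = map_clusters(words, assignment, {"dogs": "NOUN", "barked": "VERB"})
tagged = {w: cluster_tag[c] for w, c in zip(words, assignment)}
```

Only two words are inspected by hand in the mapping step here; the remaining words inherit the tag of the cluster they fall into, which mirrors the idea of assigning the most common tag to each group.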

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 455

42 Arabicization and Terminology with Reference to Social Media Terms

Authors: Ahmed Al-Awthan

Abstract:

This study addresses the prevalence of English terminology in published Arabic documentation on social media. Although the problem of using English terms in translation instead of existing native ones has been addressed in general by researchers around the world, to the best of the author’s knowledge the attitude of translators as professionals to this phenomenon in Qatar and Yemen has not been studied in detail. This study examines the impact of the use of English social media terms in the Arab world on aspiring and professional translators; it explores the benefits and drawbacks of linguistic borrowing as identified by the translators and investigates whether translators consider any means of resisting linguistic borrowing and prioritizing Arabic. It also aims to answer the following questions: i. Is there any prevalence of English social media terms in Arabic translation? Why or why not? ii. Do Arabic translators prefer using English social media terms to their equivalents in Arabic? If so, why? iii. Which measures could be adopted to help reduce the frequently observed borrowing of English terms? In particular, how do translators see the role of the Arabic Language Academies in preserving Arabic? This research is descriptive, comparative and analytical in nature. It is both qualitative and quantitative. To validate the problem, the researcher will analyze articles published by Al-Jazeera in 2016-2018 that refer to the use of social media in diplomacy. The analysis will examine whether the increased international discussion of political events on social media increased the amount of transliterated English terminology referring to this mode of communication. To investigate whether the translators recognize the phenomenon of borrowing, the researcher proposes to use a survey. This survey will use multiple choice questions. It will target 20 aspiring translators from Yemen and 20 participants from Qatar.
It will offer 15 English social media terms used in discourse in 15 sentences. For each sentence, the researcher will provide three different translations and will ask the translators to rate them and offer their own rendition. After collecting all the answers online, the researcher will analyze the data. The results are expected to confirm whether there is a prevalence of English terms in translation into Arabic. They are also expected to show what measures the translators use to render the English social media terms, and to raise awareness of the borrowing of English terms. The study is intended to guide translators toward using Arabicization methods in order to contribute to preserving Arabic.

Keywords: Arabicization, translingual borrowing, social media terms, terminology

Procedia PDF Downloads 155