Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 378

Search results for: corpus stylistics

78 Space Debris: An Environmental Hazard

Abstract:

Space law refers to all legal provisions that may regulate or apply to space travel and to space-related activity. Although there is undoubtedly a core corpus of "space law," the phrase does not designate a conceptually distinct single kind of law; rather, it can be seen as a label applied to a bucket that includes a variety of different laws and regulations. Like "family law" or "environmental law," "space law" refers to a variety of laws that are identified by the subject matter they address rather than by the logical extension of a single legal concept. The term "space law" can therefore cover anything from the specifics of an insurance agreement for a particular space launch to the most general guidelines that direct state behaviour in space. Space debris, often referred to as space junk, space pollution, space waste, space trash, or space garbage, is a term used to describe abandoned human-made objects in space, primarily in Earth orbit. These include disused spacecraft, discarded launch vehicle stages, mission-related detritus, and fragmentation material from the destruction of disused rocket bodies and spacecraft, which is particularly prevalent in Earth orbit. Other types of space debris include pieces left over from collisions, erosion, and disintegration, as well as paint specks, solidified liquids ejected from spacecraft, and unburned components from solid rocket engines. Launching or using a spacecraft in near-Earth orbit imposes an external cost on others that the launcher or payload owner typically does not take into account, or does not fully account for, in its own costs.

Keywords: space, outer space treaty, geostationary orbit, satellites, spacecrafts

Procedia PDF Downloads 72

77 Methodological Proposal, Archival Thesaurus in Colombian Sign Language

Authors: Pedro A. Medina-Rios, Marly Yolie Quintana-Daza

Abstract:

Being able to communicate in social, academic, and work contexts is important for any individual, and even more so for a deaf person, for whom oral language is not the natural language and written language is a second language. To the best of our knowledge, there is currently no specialized sign language dictionary for archival science in Colombia. Archival work is one of the fields in which the deaf community has the greatest chance of employment. Adding new signs to dictionaries for deaf people increases the likelihood that they have the appropriate signs to communicate and improve their performance. The aim of this work was to illustrate the importance of designing pedagogical and technological knowledge-management strategies for the academic inclusion of deaf people through lexicon proposals in Colombian Sign Language (LSC) in the area of archival science. As a method, an analytical study was used to identify relevant words in the technical area of archival science and their counterparts in LSC; 30 deaf people, apprentices and students of the Servicio Nacional de Aprendizaje (SENA) in Documentary or Archival Management programs, were evaluated through direct interviews in LSC. For the analysis, tools for evaluating correlation patterns were used, together with linguistic methods of visual-gestural and corpus analysis; linear regression methods were also used. Among the results, significant data were found for the variables socioeconomic stratum, academic level, and work location. The results also showed the need to generate new signs for archival topics to improve communication between the deaf person, the hearing person, and the sign language interpreter. It is concluded that the generation of new signs to enrich the LSC dictionary in archival subjects is necessary to improve the labor inclusion of deaf people in Colombia.

Keywords: archival, inclusion, deaf, thesaurus

Procedia PDF Downloads 261

76 The Visual Side of Islamophobia: A Social-Semiotic Analysis

Authors: Carmen Aguilera-Carnerero

Abstract:

Islamophobia, the unfounded hostility towards Muslims and Islam, has been studied in depth in recent decades from perspectives ranging from anthropology and sociology to media studies and linguistics. In the past few years, we have witnessed how the birth of social media has transformed formerly passive audiences into an active group that not only receives and digests information but also creates and comments publicly on any event of interest to them. In this way, average citizens have been given the power to become potential opinion leaders. This rise of social media in recent years gave way to a different form of Islamophobia, the so-called ‘cyberIslamophobia’. Considerably less attention, however, has been given to the study of Islamophobic images that accompany texts in social media. This paper analyses a corpus of 300 images of an Islamophobic nature taken from social media (Twitter and Facebook) from the years 2014-2017 to see: a) how hate speech is visually constructed, b) how cyberIslamophobia is articulated through images and whether there are differences or similarities between the textual and the visual elements, c) the impact of those images on the audience and the audience's reaction to them, and d) whether visual cyberIslamophobia has permeated popular culture (for example, through memes) and what its real impact is. To carry out this task, we used Critical Discourse Analysis as the most suitable theoretical framework for analysing and criticizing the dominant discourses that affect inequality, injustice, and oppression. The images were analysed according to the theoretical framework provided by visual framing theory and visual design grammar, leading to the conclusion that memes are subtle but very powerful tools for spreading Islamophobia and fostering hate speech under the guise of humour within popular culture.

Keywords: cyberIslamophobia, visual grammar, social media, popular culture

Procedia PDF Downloads 147

75 Towards Law Data Labelling Using Topic Modelling

Authors: Daniel Pinheiro Da Silva Junior, Aline Paes, Daniel De Oliveira, Christiano Lacerda Ghuerren, Marcio Duran

Abstract:

The Courts of Accounts are institutions responsible for overseeing Public Administration expenses and pointing out irregularities in them. They face a high volume of processes to be analyzed, and their decisions must be grounded in strict laws. Despite the large number of existing processes, many cases report similar subjects. Thus, previous decisions on already analyzed processes can serve as precedents for current processes that refer to similar topics. Identifying similar topics is an open yet essential task for identifying similarities between processes. Since the actual number of topics is considerably large, identifying topics with a purely manual approach is tedious and error-prone. This paper presents a tool based on Machine Learning and Natural Language Processing to assist in building a labeled dataset. The tool relies on Topic Modelling with Latent Dirichlet Allocation to find the topics underlying a document, followed by the Jensen-Shannon distance metric to generate a probability of similarity between document pairs. Furthermore, in a case study with a corpus of decisions of the Rio de Janeiro State Court of Accounts, it was noted that data pre-processing plays an essential role in modeling relevant topics. Also, the combination of topic modeling and a distance metric calculated over documents represented by the generated topics proved useful in helping to construct a labeled base of similar and non-similar document pairs.
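As an illustration of the similarity step described in this abstract, the following is a minimal sketch of the Jensen-Shannon distance between two document-topic distributions. The topic vectors are hypothetical stand-ins for LDA output, and the mapping "similarity = 1 - distance" is an assumption; the abstract does not say how the distance is turned into a probability of similarity.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the base-2 Jensen-Shannon divergence, bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding


def similarity(p, q):
    """Assumed mapping from distance to a similarity score in [0, 1]."""
    return 1.0 - jensen_shannon_distance(p, q)


# Hypothetical document-topic distributions over three topics.
doc_a = [0.70, 0.20, 0.10]
doc_b = [0.65, 0.25, 0.10]
doc_c = [0.05, 0.15, 0.80]

print(similarity(doc_a, doc_b))  # close topic mixtures: high score
print(similarity(doc_a, doc_c))  # different dominant topic: lower score
```

Pairs whose score exceeds a chosen threshold could then be offered to a human annotator as candidate "similar" pairs; the threshold itself would have to be tuned on the corpus.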

Keywords: courts of accounts, data labelling, document similarity, topic modeling

Procedia PDF Downloads 159

74 Effect of Synchronization Protocols on Serum Concentrations of Estrogen and Progesterone in Holstein Dairy Heifers

Authors: K. Shafiei, A. Pirestani, G. Ghalamkari, S. Safavipour

Abstract:

Use of GnRH or its agonists to increase conception rates should be based on an understanding of GnRH-induced biological effects on the reproductive-endocrine system. This effect may occur through a GnRH-stimulated LH surge stimulating production of progesterone by the corpus luteum. The aim of this study was to compare the effects on reproductive efficiency of a luteolytic dose of a synthetic prostaglandin, Cloprostenol Sodium, versus an injectable progesterone and Luliberin-A on follicle estrogen and progesterone levels. In this study, 45 Holstein dairy heifers were randomly assigned to three treatments, with 15 replicates per treatment. Before the experiment began, all heifers were synchronized in two steps by injection of 3 mL Cloprostenol Sodium with an interval of 11 days, and 10 days later a second injection of prostaglandin was given. The following protocol was then started. Control group: daily injection of 1 cc sodium chloride serum. Group B: day 0, intramuscular injection of 15 mg Luliberin-A + every other day, injection of 3 cc progesterone + day 7, injection of Cloprostenol Sodium + day 9, injection of 15 mg Luliberin-A. Group C: similar to Group B + daily injection of progesterone. Blood samples were then collected and centrifuged, and plasma was analysed by ELISA. Data were analysed with the SPSS software package, and means were compared using the LS Means LSD test at the 5% significance level. The results show that the maximum plasma progesterone levels were in the control group (P ≥ 0.05). Therefore, daily injection of progesterone inhibits the growth of the CL. The highest plasma estrogen levels were in Group C (P ≥ 0.05); thus, it can be concluded that a rise in endogenous estrogen concentrations normally stimulates the preovulatory LH release in heifers.

Keywords: Luliberin-A, Cloprostenol Sodium, estrogen, progesterone, dairy heifers

Procedia PDF Downloads 524

73 The Intonation of Romanian Greetings: A Sociolinguistics Approach

Authors: Anca-Diana Bibiri, Mihaela Mocanu, Adrian Turculeț

Abstract:

In any language, the inventory of greetings is dynamic, with frequent additions and losses, although this is hardly noticed by speakers. In this register, there are a number of constant, conservative elements that survive different language models (among them, the classic formulae bună ziua! (good afternoon!), bună seara! (good evening!), noapte bună! (good night!), la revedere! (goodbye!)) and a number of items that fail to pass the test of time, depending on language use at a given time (ciao!, pa!, bai!). Innovation depends both on internal factors (contraction, conversion, combination of classic greeting formulae) and on external ones (borrowings and calques). As new forms come into use, their frequency rises and they displace others. This paper presents a sociolinguistic approach to contemporary Romanian greetings, based on prosodic surveys in two research projects: AMPRom and SoRoEs. Romanian has a rich inventory of questions (especially partial interrogatives, i.e. wh-questions) which are used as greetings, alone or, more commonly, accompanying a proper greeting. The most typical formula is Ce mai faci? (How are you?), which, unlike its English counterpart How do you do?, has not become a stereotype but retains an obvious emotional impact, while serving as a marker of a sociolinguistic group. The analyzed corpus consists of structures containing greetings recorded in the main Romanian cultural (urban) centers. Methodologically, the acoustic analysis of the recorded data is performed using software tools (GoldWave, Praat), identifying intonation patterns related to three sociolinguistic variables: age, sex and level of education. The intonation patterns of the analyzed statements are at the interface between partial questions and typical greetings.

Keywords: acoustic analysis, greetings, Romanian language, sociolinguistics

Procedia PDF Downloads 323

72 Investigating Complement Clause Choice in Written Educated Nigerian English (ENE)

Authors: Juliet Udoudom

Abstract:

Inappropriate complement selection constitutes one of the major features of non-standard complementation in the sentence construction of Nigerian users of English. This paper investigates complement clause choice in Written Educated Nigerian English (ENE) and offers some results. It aims at determining preferred and dispreferred patterns of complement clause selection in respect of verb heads in English by selected Nigerian users of English. The complementation data analyzed in this investigation were obtained from experimental tasks designed to elicit complement categories of Verb-, Noun-, Adjective- and Prepositional-heads in English. Insights from Government-Binding relations were employed in analyzing the data, which comprised responses obtained from one hundred subjects to a picture elicitation exercise, a grammaticality judgement test, and a free composition task. The findings indicate a general tendency for clausal complements (CPs) introduced by the complementiser that to be preferred by the subjects studied. Of the 235 tokens of clausal complements which occurred in our corpus, 128, representing 54.46%, were CPs headed by that, while whether- and if-clauses recorded 31.07% and 8.94%, respectively. The complement clause type with the lowest incidence of choice was the CP headed by the complementiser for, with a 5.53% incidence of occurrence. Further findings from the study indicate that semantic features of the relevant embedding verb heads were not taken into consideration in the choice of complementisers which introduce the respective complement clauses; hence the that-clause was chosen to complement verbs like prefer. In addition, the dispreferred status of the for-clause is explicable in terms of the fact that the respondents studied regard ‘for’ as a preposition and not as a complementiser.
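The reported distribution can be checked with a few lines of arithmetic. This is only a consistency sketch: the abstract states the total (235), the count for that-clauses (128) and the four percentages; the token counts for whether, if and for are not given and are inferred here by rounding, which is an assumption.

```python
total_tokens = 235
that_tokens = 128  # the only per-type count stated in the abstract

# Percentages as reported in the abstract.
reported_shares = {"that": 54.46, "whether": 31.07, "if": 8.94, "for": 5.53}

# Share recomputed from the stated counts (about 54.47%, so 54.46% looks truncated).
that_share = 100 * that_tokens / total_tokens

# Token counts implied by the reported percentages (inferred, not stated).
implied_counts = {
    clause: round(total_tokens * share / 100)
    for clause, share in reported_shares.items()
}

print(f"that-clause share: {that_share:.2f}%")
print("implied counts:", implied_counts)
print("sum of implied counts:", sum(implied_counts.values()))
```

Under this rounding assumption the implied counts add up to the stated total of 235 tokens, and the four reported shares add up to 100%.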

Keywords: complement, complement clause, complement selection, complementisers, government-binding

Procedia PDF Downloads 172

71 Comparative Assessment of hCG with Estrogen in Increasing Pregnancy Rate in Mixed Parity Buffaloes

Authors: Sanan Raza, Tariq Abbas, Ahmad Yar Qamar, Muhammad Younus, Hamayun Khan, Mujahid Zafar

Abstract:

Water buffaloes contribute significantly to Asian agriculture. The objective of this study was to evaluate the efficacy of two synchronization protocols in enhancing pregnancy rate in 105 mixed parity buffaloes, particularly in the summer season. Buffaloes are seasonal breeders, showing higher fertility from October to January in the subtropical environment of Pakistan. In the current study, 105 lactating buffaloes of mixed parity were used, with a normal estrous cycle, age ranging from 5 to 9 years, weight between 400 and 650 kg, BCS 4 ± 0.5 (scale 1-5), and lactation number varying from the first to the fifth. Experimental animals were divided into three groups based on corpus luteum morphometry. Morphometry of the C.L. was done using rectal palpation and ultrasonography. All animals were injected with 25 mg of PG (Cloprostenol) i.m. In Group-1 (n=35), hCG was administered at a follicular size of 10 mm, determined by scanning after detection of heat. Similarly, Group-2 (n=35) received 25 mg EB (Estradiol Benzoate) i.m. after confirmation of a follicular size of 10 mm with ultrasound. Likewise, buffaloes of Group-3 (n=35) were administered normal saline and served as the control. All buffaloes of the three groups were inseminated 12 h after hCG, EB, and normal saline administration, respectively. Pregnancy was assessed by ultrasound at the 18th and 45th day post insemination. Pregnancy rates at the 18th day were 38.2%, 34.5%, and 27.3% for G1, G2, and G3, respectively, indicating no difference between the hCG- and EB-treated groups, whereas the control group had a lower conception rate than both. Similarly, on the 42nd day, the rates were 40.4% and 32.7% for G1 and G2, which are significantly higher than for G3 (26.6, control group). Also, hCG- and EB-treated buffaloes have a higher probability of pregnancy than the control group. Based on the findings of the current study, it seems reasonable to conclude that the use of hCG and EB is associated with improved pregnancy rates in the non-breeding season of buffaloes.

Keywords: buffalo, hCG, EB, pregnancy rate, follicle, insemination

Procedia PDF Downloads 784

70 The First Japanese-Japanese Dictionary for Non-Japanese Using the Defining Vocabulary

Authors: Minoru Moriguchi

Abstract:

This research introduces the concept of a monolingual Japanese dictionary for non-native speakers of Japanese, whose tentative title is Dictionary of Contemporary Japanese for Advanced Learners (DCJAL). As the language market is very small compared with that of English, a monolingual Japanese dictionary for non-native speakers containing sufficient entries has not yet been published. In such a dictionary environment, Japanese-language learners use bilingual dictionaries or monolingual Japanese dictionaries written for Japanese people. This research started in 2017 with a project team consisting of four Japanese and two non-native speakers, all of whom are linguists of the Japanese language. The team has been trying to propose the concept of a monolingual dictionary for non-native speakers of Japanese and to provide the entry list, the definition samples, the list of defining vocabulary, and the writing manual. As the result of seven years of research, DCJAL has come to have 28,060 head words, 539 entry examples, a 4,598-word defining vocabulary, and the writing manual. First, the number of entries was set at about 30,000, based on an experimental method using six existing dictionaries. To build an entry list of this size, words suitable for DCJAL were extracted from the Tsukuba corpus of the Japanese language, and the entry list was later adjusted according to the team's experience as Japanese instructors. Among the head words of the entry list, 539 words were selected and supplemented with lexicographical information such as proficiency level, pronunciation, writing system (hiragana, katakana, kanji, or alphabet), definition, example sentences, idiomatic expressions, synonyms, antonyms, grammatical information, sociolinguistic information, and etymology. While the definitions of these 539 words were being written, the list of the defining vocabulary was constructed, based on vocabulary frequently used in a Japanese monolingual dictionary. Although the concept of DCJAL has been almost perfected, it may need some more adjustment, and the research continues.

Keywords: monolingual dictionary, the Japanese language, non-native speaker of Japanese, defining vocabulary

Procedia PDF Downloads 25

69 Voice of Customer: Mining Customers' Reviews on On-Line Car Community

Authors: Kim Dongwon, Yu Songjin

Abstract:

This study identifies the business value of VOC (Voice of Customer). Specifically, we intend to demonstrate how much the negative and positive sentiment of VOC influences car sales market share in the United States. We extract seven emotions (sadness, shame, anger, fear, frustration, delight and satisfaction) from the VOC data, 23,204 opinions that were posted on a car-related on-line community from 2007 to 2009 (part of a data collection covering 2007 to 2015), and intend to clarify the correlation between negative and positive sentiment keywords and their contribution to market share. In order to develop a lexicon for each category of negative and positive sentiment, we used the corpus program AntConc 3.4.1w and the on-line sentiment resource SentiWordNet, and identified the part-of-speech (POS) information of words in the customers' opinions by using the part-of-speech tagging function provided by TextAnalysisOnline. For the purpose of the present study, a total of 45,741 customer opinions on 28 car manufacturing companies were collected, including titles and status information. We conducted an experiment to examine whether the inclusion, frequency and intensity of terms with negative and positive emotions in each category affect the adoption of customer opinions for vehicle organizations' market share. In the experiment, we statistically verified that there is a correlation between customer opinions containing negative and positive emotions and variation in market share. In particular, "Anger," one of the negative domains, has a significant influence on car sales market share. The domains "Delight" and "Satisfaction" increased in proportion to growth in market share.
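The general idea of counting lexicon hits per emotion and correlating them with market share can be sketched as follows. Everything concrete here is hypothetical: the mini-lexicon, the sample sentence, and the per-period numbers are invented for illustration and are not the study's lexicon (built with AntConc and SentiWordNet) or its data.

```python
import math
import re
from collections import Counter

# Hypothetical mini-lexicon; the study's real lexicon is not reproduced here.
EMOTION_LEXICON = {
    "anger": {"angry", "furious", "outraged"},
    "delight": {"love", "delighted", "thrilled"},
}


def emotion_counts(text):
    """Count how many tokens of the text fall into each emotion category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for emotion, words in EMOTION_LEXICON.items():
        counts[emotion] = sum(1 for token in tokens if token in words)
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


sample = emotion_counts("I am furious and angry, but I love the seats")

# Invented per-period totals: anger hits and market-share change (percentage points).
anger_hits = [12, 18, 25, 31]
share_change = [0.4, 0.1, -0.3, -0.6]
r = pearson(anger_hits, share_change)

print(dict(sample))
print(round(r, 3))
```

With real data, the same two steps would be run per manufacturer and per period, and the significance of each correlation would need a proper statistical test rather than the raw coefficient alone.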

Keywords: data mining, opinion mining, sentiment analysis, VOC

Procedia PDF Downloads 201

68 Multidimensional Inequality and Deprivation Among Tribal Communities of Andhra Pradesh, India

Authors: Sanjay Sinha, Mohd Umair Khan

Abstract:

The level of income inequality in India has been worrisome, with the World Inequality Report terming it a “poor and unequal country, with an affluent elite”. As important as income is for understanding inequality and deprivation, it is just one dimension. The historical roots and current realities of inequality and deprivation in India lie in many of the non-income dimensions, such as housing, nutrition, education, agency, and sense of inclusion, which are often ignored, especially in solution-oriented research. The level of inequality and deprivation among tribal communities is one such case. There is a corpus of literature establishing that the tribal communities in India are disadvantaged on various grounds. Given their rural geography, issues of access to and quality of basic facilities such as education and healthcare often go unaddressed. COVID-19 has further exacerbated this challenge, and climate change will make it even more worrying. Against this background, a succinct measurement tool at the village level is necessary to design short- to medium-term actions for risk mitigation in tribal communities. This research paper examines the level of inequality and deprivation among the tribal communities in the rural areas of the state of Andhra Pradesh, India, using a Multidimensional Inequality and Deprivation Index based on the Alkire-Foster methodology. The methodology is theoretically grounded in the capability approach propounded by Amartya Sen, which emphasizes achieving the “beings and doings” (functionings) an individual has reason to value. The index has five domains (Livelihood, Food Security, Education, Health and Housing), and these domains are divided into sixteen indicators. This assessment is followed by domain-wise short-term and long-term solutions.
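For readers unfamiliar with the Alkire-Foster counting approach, the sketch below shows its core computation: a weighted deprivation score per household, a poverty cutoff k, the headcount ratio H, the average intensity A among the poor, and the adjusted headcount M0 = H × A. The equal domain weights, the cutoff of 2/5, and the four toy households are assumptions for illustration; the paper's sixteen indicators and their weights are not given in the abstract and are collapsed here into one flag per domain.

```python
from fractions import Fraction

DOMAINS = ["livelihood", "food_security", "education", "health", "housing"]
# Assumption: equal weights across the five domains named in the abstract.
WEIGHTS = {domain: Fraction(1, len(DOMAINS)) for domain in DOMAINS}


def deprivation_score(household):
    """Weighted share of domains in which the household is deprived (True)."""
    return sum(WEIGHTS[d] for d in DOMAINS if household[d])


def alkire_foster(households, k):
    """Return (H, A, M0) for poverty cutoff k, a share between 0 and 1."""
    scores = [deprivation_score(h) for h in households]
    poor_scores = [s for s in scores if s >= k]  # identification step
    if not poor_scores:
        return Fraction(0), Fraction(0), Fraction(0)
    headcount = Fraction(len(poor_scores), len(households))  # H
    intensity = sum(poor_scores) / len(poor_scores)          # A
    return headcount, intensity, headcount * intensity       # M0 = H * A


def household(*deprived_domains):
    """Build a toy household record deprived in the listed domains."""
    return {d: d in deprived_domains for d in DOMAINS}


# Toy village of four hypothetical households.
village = [
    household("livelihood", "education", "housing"),
    household("health"),
    household(*DOMAINS),
    household(),
]

H, A, M0 = alkire_foster(village, k=Fraction(2, 5))
print(H, A, M0)
```

Exact fractions are used so that a household sitting exactly on the cutoff is classified consistently; with floating-point weights the comparison `score >= k` can be sensitive to rounding.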

Keywords: Andhra Pradesh, Alkire-Foster methodology, deprivation, inequality, multidimensionality, poverty, tribal

Procedia PDF Downloads 140

67 Phonology and Syntax of Article Incorporation in Mauritian Creole: Evidence from Bantou Languages

Authors: Emmanuel Nikiema

Abstract:

This paper examines article incorporation in Mauritian Creole, a French Lexifier Creole which exhibits three forms of article incorporation as illustrated in (1-3). While various analyses of article incorporation have been proposed in the literature, fewer studies have explored the motivation of this widespread phenomenon in Mauritian Creole (MC) as opposed to other French Lexifier Creoles spoken in the Caribbean. For example, Mauritian Creole exhibits 4 times more CV incorporation than Haitian Creole, and 40 times more than Reunion Creole. (1) Consonantal type (C): loraz ‘thunder storm’, lete ‘summer’, zwazo ‘bird’, nide ‘idea’. (2) Syllabic type (CV): lapo ‘skin’, liku ‘neck’, ledo ‘back’, leker ‘heart’, diber ‘butter’. (3) Bi-consonantal (CVC): delo ‘water’, dizef ‘egg’, lizye ‘eye’, dilwil ‘oil’. The goal of this study is twofold: 1) uncover the rules governing the three types of article incorporation in MC, and 2) account for its remarkable occurrence in MC as opposed to its quasi-absence in Reunion Creole. We have collected a corpus of over 700 cases and organized it into three categories (C; CV and CVC). For example, there are 471 examples of CV incorporation in MC against 112 in Haitian Creole and only 12 in Reunion Creole. Two questions can be raised: 1) what is the motivation and distribution of the three types of incorporation in MC, and 2) how can one account for the high volume of incorporation in MC as opposed to its quasi-absence in Reunion Creole? We suggest that article incorporation in MC is related to the structure of nouns in Bantou languages. While previous authors have largely used population settlement data in the colonies during the Creole formation period to justify their analyses, we propose an account based on the syntactic structure of Bantou nouns. 
This analysis will shed light on the contribution of African languages to the formation of MC, and on why MC exhibits more article incorporation cases than any other French Lexifier Creole.
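The "4 times" and "40 times" comparisons in this abstract follow from the CV-incorporation counts it reports (471, 112 and 12). A short check, using only those three figures:

```python
# CV-incorporation counts as reported in the abstract.
cv_counts = {"Mauritian Creole": 471, "Haitian Creole": 112, "Reunion Creole": 12}

ratio_vs_haitian = cv_counts["Mauritian Creole"] / cv_counts["Haitian Creole"]
ratio_vs_reunion = cv_counts["Mauritian Creole"] / cv_counts["Reunion Creole"]

print(f"MC vs Haitian Creole: {ratio_vs_haitian:.1f}x")
print(f"MC vs Reunion Creole: {ratio_vs_reunion:.1f}x")
```

The ratios come out at roughly 4.2 and 39.3, consistent with the rounded figures quoted in the abstract.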

Keywords: article incorporation, creole languages, description, phonology

Procedia PDF Downloads 100

66 Oestrous Synchronization: A Technical Note for Nepalese Goat Farmers

Authors: Pravin Mishra, Ajeet K. Jha, Pankaj K. Jha

Abstract:

This technical note aims to provide brief information on goat breeds, their breeding seasonality, and different methods of oestrous synchronization for Nepalese goat farmers. It was observed that these goats are seasonal breeders and show oestrus mainly during two seasons: December-February and March-May. This leads to an irregular supply of goats to market and wide variations in market price. Oestrus synchronization is an alternative reproductive tool to overcome this scarcity by enhancing production and productivity. This technique enables goat producers to breed their animals within a short pre-determined period and permits breeding all year round. The principle of oestrus synchronisation is based on controlling the luteal phase of the oestrous cycle. There are two basic mechanisms: one shortens the luteal life (premature luteolysis) using prostaglandins or their analogues, and the other prolongs the luteal life (simulating the activity of natural progesterone produced by the corpus luteum) using an exogenous progesterone source. The former is easy to apply and only effective during the breeding season, whereas the latter is advantageous when the reproductive status of the goat flock is unknown. The common hormonal products easily available in Nepal include prostaglandins or their analogues (Oviprost®, Dinoprost®, Lutalyse® and Estrumate®) and exogenous progesterone (Fluorogestone acetate® and Controlled Internal Drug Release®, CIDR, devices). However, before an oestrous synchronization protocol is put into practice, it needs to be validated for oestrous response rate, time to onset of oestrus, duration of oestrus and pregnancy rate under farmers' field conditions. In conclusion, application of oestrus synchronisation enhances goat production so that supply can exceed the demand for goat meat in Nepal.

Keywords: goat, Nepal, oestrous, synchronization

Procedia PDF Downloads 136

65 A Sentence-to-Sentence Relation Network for Recognizing Textual Entailment

Authors: Isaac K. E. Ampomah, Seong-Bae Park, Sang-Jo Lee

Abstract:

Over the past decade, there have been promising developments in Natural Language Processing (NLP), with several investigations of approaches focusing on Recognizing Textual Entailment (RTE). These approaches include models based on lexical similarities, models based on formal reasoning, and most recently deep neural models. In this paper, we present a sentence encoding model that exploits sentence-to-sentence relation information for RTE. In terms of sentence modeling, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) adopt different approaches. RNNs are known to be well suited for sequence modeling, whilst CNNs are suited for the extraction of n-gram features through their filters and can learn ranges of relations via the pooling mechanism. We combine these strengths of RNNs and CNNs to present a unified model for the RTE task. Our model combines relation vectors computed from the phrasal representation of each sentence and from the final encoded sentence representations. First, we pass each sentence through a convolutional layer to extract a sequence of higher-level phrase representations, from which the first relation vector is computed. Second, the phrasal representation of each sentence from the convolutional layer is fed into a Bidirectional Long Short-Term Memory (Bi-LSTM) network to obtain the final sentence representations, from which a second relation vector is computed. The relation vectors are combined and then used, in the same fashion as an attention mechanism, over the Bi-LSTM outputs to yield the final sentence representations for classification. Experiments on the Stanford Natural Language Inference (SNLI) corpus suggest that this is a promising technique for RTE.
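A rough sketch of the first stage only (phrase extraction by convolution, pooling, and a relation vector) is given below. It is not the authors' model: the convolution uses fixed identity filters instead of learned ones, the relation vector is assumed to be the concatenation of the element-wise absolute difference and product (a common choice in sentence-encoding models, not stated in the abstract), the Bi-LSTM and attention stages are omitted, and the toy embeddings are invented.

```python
import math


def conv_phrases(embeddings, window=2):
    """Slide a window over token vectors; each phrase vector is tanh of the window sum.

    Stand-in for learned convolution filters (identity filters, no bias).
    The sentence must contain at least `window` tokens.
    """
    dim = len(embeddings[0])
    phrases = []
    for start in range(len(embeddings) - window + 1):
        summed = [
            sum(embeddings[start + offset][d] for offset in range(window))
            for d in range(dim)
        ]
        phrases.append([math.tanh(value) for value in summed])
    return phrases


def max_pool(vectors):
    """Max over time for each dimension, giving one fixed-size vector."""
    return [max(column) for column in zip(*vectors)]


def relation_vector(u, v):
    """Assumed relation features: element-wise |u - v| followed by u * v."""
    difference = [abs(a - b) for a, b in zip(u, v)]
    product = [a * b for a, b in zip(u, v)]
    return difference + product


# Invented 2-dimensional token embeddings for a premise and a hypothesis.
premise = [[0.1, 0.3], [0.5, -0.2], [0.0, 0.4]]
hypothesis = [[0.1, 0.3], [0.4, -0.1]]

u = max_pool(conv_phrases(premise))
v = max_pool(conv_phrases(hypothesis))
rel = relation_vector(u, v)
print(rel)
```

In the model described above, a second relation vector would be computed in the same way from the Bi-LSTM sentence representations, and the two vectors would then be combined before classification.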

Keywords: deep neural models, natural language inference, recognizing textual entailment (RTE), sentence-to-sentence relation

Procedia PDF Downloads 333

64 Alignment and Antagonism in Flux: A Diachronic Sentiment Analysis of Attitudes towards the Chinese Mainland in the Hong Kong Press

Authors: William Feng, Qingyu Gao

Abstract:

Despite the extensive discussions about Hong Kong’s sentiments towards the Chinese Mainland since the sovereignty transfer in 1997, there has been no large-scale empirical analysis of the changing attitudes in the mainstream media, which both reflect and shape sentiments in society. To address this gap, the present study uses an optimised semantic-based automatic sentiment analysis method to examine a corpus of news about China from 1997 to 2020 in three main Chinese-language newspapers in Hong Kong, namely Apple Daily, Ming Pao, and Oriental Daily News. The analysis shows that although the Hong Kong press had a positive emotional tone toward China in general, the overall trend of sentiment was becoming increasingly negative. Meanwhile, alignment with and antagonism toward China have both increased, providing empirical evidence of attitudinal polarisation in Hong Kong society. Specifically, Apple Daily’s depictions of China have become increasingly negative, though with some positive turns before 2008, whilst Oriental Daily News has consistently expressed more favourable sentiments. Ming Pao maintained an impartial stance toward China through an increased but balanced representation of positive and negative sentiments, with its subjectivity and sentiment intensity growing to an industry-standard level. The results provide new insights into the complexity of sentiments towards China in the Hong Kong press, and into media attitudes in general in terms of “us” and “them” positioning, by explicating the cross-newspaper and cross-period variations using an enhanced sentiment analysis method which incorporates sentiment-oriented and semantic role analysis techniques.

Keywords: media attitude, sentiment analysis, Hong Kong press, one country two systems

Procedia PDF Downloads 87

63 Online Multilingual Dictionary Using Hamburg Notation for Avatar-Based Indian Sign Language Generation System

Authors: Sugandhi, Parteek Kumar, Sanmeet Kaur

Abstract:

Sign Language (SL) is used by deaf people and by others who can hear but cannot speak, or who have difficulty with spoken languages due to some disability. It is a visual-gestural language that uses one or both hands, the arms, the face, and the body to convey meanings and thoughts. An SL automation system is an effective way to provide a computer-based interface for communicating with hearing people. In this paper, an avatar-based dictionary is proposed for a text to Indian Sign Language (ISL) generation system. This research work also presents a literature review of the SL corpora made available for various SLs over the years. An ISL generation system requires a written form of SL, and there are certain techniques available for writing SL. The system uses the Hamburg sign language Notation System (HamNoSys) and Signing Gesture Mark-up Language (SiGML) for ISL generation. It is developed in PHP using Web Graphics Library (WebGL) technology for 3D avatar animation. A multilingual ISL dictionary is developed using HamNoSys for both English and Hindi. This dictionary is used as a database to associate signs with words or phrases of a spoken language. It provides an admin panel interface to manage the dictionary, i.e., modification, addition, or deletion of a word. Through this interface, HamNoSys notations can be developed and stored in a database, and these notations can be converted manually into their corresponding SiGML files. The system takes a natural language input sentence in English or Hindi and generates a 3D sign animation using an avatar. SL generation systems have potential applications in many domains, such as the healthcare sector, media, educational institutes, commercial sectors, and transportation services. This research work will help researchers to understand the various techniques used for writing SL and for building Sign Language generation systems.

Keywords: avatar, dictionary, HamNoSys, hearing impaired, Indian sign language (ISL), sign language

Procedia PDF Downloads 209

62 Anaphora and Cataphora on the Selected State of the City Addresses of the Mayor of Dapitan

Authors: Mark Herman Sumagang Potoy

Abstract:

The State of the City Address (SOCA) is a speech modelled after the State of the Nation Address, given not as mandated by law but usually as a matter of practice or tradition, and delivered before the chief executive’s constituents. Through it, the general public is informed of the performance of the local government unit and its agenda for the coming year. Therefore, it is imperative for SOCAs to convey their message clearly and carry out their function of enlightening their readers, which can be achieved through the proper use of reference. Anaphora and cataphora are the two major types of reference; the former refers back to something that has already been mentioned, while the latter points forward to something that is yet to be said. This paper seeks to identify the types of reference employed in the 2014 to 2016 SOCAs of Hon. Rosalina Garcia Jalosjos, Mayor of Dapitan City, and to examine how the references contribute to the clarity of the message of the text. The study uses the qualitative method of research through an in-depth analysis of the corpus. Once copies of the SOCAs were secured from the Office of the City Mayor, they were analyzed using the documentary technique: categorizing the types of reference as anaphora or cataphora, counting each type, and describing the implications of the dominant types used in the addresses. The analysis found that both reference types, anaphora and cataphora, are employed in the three SOCAs, the former being used more frequently than the latter, accounting for 80% and 20% of actual usage, respectively. Moreover, the use of anaphora and cataphora in the three addresses helps convey the message clearly because they primarily serve as aids to avoid repeating the same element in the text, especially when there is no need to emphasize a point. 
Finally, it is recommended that writers of State of the City Addresses should have a vast knowledge on how reference should be used and the functions they take in the text since this is a vital tool to clearly transmit a message. Moreover, English teachers should explicitly teach the proper usage of anaphora and cataphora, as instruments to develop cohesion in written discourse, to enable students to write not only with sense but also with fluidity in tying utterances together.

Keywords: anaphora, cataphora, reference, State of the City Address

Procedia PDF Downloads 176

61 AI Peer Review Challenge: Standard Model of Physics vs 4D GEM EOS

Authors: David A. Harness

Abstract:

Natural evolution of ATP cognitive systems is to meet AI peer review standards. ATP process of axiom selection from Mizar to prove a conjecture would be further refined, as in all human and machine learning, by solving the real world problem of the proposed AI peer review challenge: Determine which conjecture forms the higher confidence level constructive proof between Standard Model of Physics SU(n) lattice gauge group operation vs. present non-standard 4D GEM EOS SU(n) lattice gauge group spatially extended operation in which the photon and electron are the first two trace angular momentum invariants of a gravitoelectromagnetic (GEM) energy momentum density tensor wavetrain integration spin-stress pressure-volume equation of state (EOS), initiated via 32 lines of Mathematica code. Resulting gravitoelectromagnetic spectrum ranges from compressive through rarefactive of the central cosmological constant vacuum energy density in units of pascals. Said self-adjoint group operation exclusively operates on the stress energy momentum tensor of the Einstein field equations, introducing quantization directly on the 4D spacetime level, essentially reformulating the Yang-Mills virtual superpositioned particle compounded lattice gauge groups quantization of the vacuum—into a single hyper-complex multi-valued GEM U(1) × SU(1,3) lattice gauge group Planck spacetime mesh quantization of the vacuum. Thus the Mizar corpus already contains all of the axioms required for relevant DeepMath premise selection and unambiguous formal natural language parsing in context deep learning.

Keywords: automated theorem proving, constructive quantum field theory, information theory, neural networks

Procedia PDF Downloads 158

60 Building a Composite Approach to Employees' Motivational Needs by Combining Cognitive Needs

Authors: Alexis Akinyemi, Laurene Houtin

Abstract:

Measures of employee motivation at work are often based on the theory of self-determined motivation, which implies that human resources departments and managers seek to motivate employees in the most self-determined way possible and use strategies to achieve this goal. In practice, they often tend to assess employee motivation and then adapt management to the most important source of motivation for their employees, for example by financially rewarding an employee who is extrinsically motivated, and by rewarding an intrinsically motivated employee with congratulations and recognition. Thus, the use of motivation measures contradicts theoretical positioning: theory does not provide for the promotion of extrinsically motivated behaviour. In addition, a corpus of social psychology linked to fundamental needs makes it possible to personally address a person’s different sources of motivation (need for cognition, need for uniqueness, need for effects and need for closure). By developing a composite measure of motivation based on these needs, we provide human resources professionals, and in particular occupational psychologists, with a tool that complements the assessment of self-determined motivation, making it possible to precisely address the objective of adapting work not to the self-determination of behaviours, but to the motivational traits of employees. To develop such a model, we gathered the French versions of the cognitive needs scales (need for cognition, need for uniqueness, need for effects, need for closure) and conducted a study with 645 employees of several French companies. On the basis of the data collected, we conducted a confirmatory factor analysis to validate the model, studied the correlations between the various needs, and highlighted the different reference groups that could be used to use these needs as a basis for interviews with employees (career, recruitment, etc.). 
The results showed a coherent model and the expected links between the different needs. Taken together, these results make it possible to propose a valid and theoretically adjusted tool to managers who wish to adapt their management to their employees’ current motivations, whether or not these motivations are self-determined.

Keywords: motivation, personality, work commitment, cognitive needs

Procedia PDF Downloads 103

59 MRI Findings in Children with Intractable Epilepsy Compared to Children with Medical Responsive Epilepsy

Authors: Susan Amirsalari, Azime Khosrinejad, Elham Rahimian

Abstract:

Objective: Epilepsy is a common brain disorder characterized by a persistent tendency to generate seizures and by its neurological, cognitive, and psychological consequences. Magnetic Resonance Imaging (MRI) is a neuroimaging test that facilitates the detection of structural epileptogenic lesions. This study aimed to compare the MRI findings between patients with intractable and drug-responsive epilepsy. Material & methods: This case-control study was conducted from 2007 to 2019. The research population encompassed all 1- to 16-year-old patients with intractable epilepsy referred to the Shafa Neuroscience Center (n=72) (the case group) and drug-responsive patients referred to the pediatric neurology clinic of Baqiyatallah Hospital (the control group). Results: There were 72 (23.5%) patients in the intractable epilepsy group and 200 (76.5%) patients in the drug-responsive group. The participants' mean age was 6.70 ± 4.13 years, and there were 126 males and 106 females in this study. Normal brain MRI was noted in 21 (29.16%) patients in the case group and 184 (92.46%) patients in the control group. Neuronal migration disorder (NMD) was found in 7 (9.72%) patients in the case group and in no patient in the control group. There were hippocampal abnormalities and focal lesions (mass, dysplasia, etc.) in 10 (13.88%) patients in the case group and only 1 (0.05%) patient in the control group. Gliosis and porencephalic cysts were present in 3 (4.16%) patients in the case group and in no patient in the control group. Cerebral and cerebellar atrophy was found in 8 (11.11%) patients in the case group and 4 (2.01%) patients in the control group. Corpus callosum agenesis, hydrocephalus, brain malacia, and developmental cyst were more frequent in the case group; however, the difference between the groups was not significant. 
Conclusion: The MRI findings such as hippocampal abnormalities, focal lesions (mass, dysplasia), NMD, porencephalic cysts, gliosis, and atrophy are significantly more frequent in children with intractable epilepsy than in those with drug-responsive epilepsy.

Keywords: magnetic resonance imaging, intractable epilepsy, drug responsive epilepsy, neuronal migrational disorder

Procedia PDF Downloads 25

58 Translatability of Stylistic Devices in Poetry Across Language-Cultures: An Intercultural Rhetoric Perspective

Authors: Hazel P. Atilano

Abstract:

Contrastive rhetoricians working on L2 writing are often unfamiliar with the theories and research of scholars in translation studies. Publications on translation studies give little or no attention to describing the translation strategies of translators, with a focus on the influence of their L1 on the language they produce. This descriptive qualitative study anchored on Eugene Nida’s Translation Theory employed stylistic, lexico-semantic, and grammatical analyses of the stylistic devices employed by poets across nine language cultures to reveal the translation strategies employed by translators and to establish the type of equivalence manifested in the translated texts. The corpus consists of 27 poems written in Bahasa Indonesia, Hiligaynon, Tagalog (Malayo-Polynesian languages), French, Italian, Spanish (Romance languages), German, Icelandic, and Norwegian (Germanic Languages), translated into English. Stylistic analysis reveals that both original texts and English translations share the same stylistic devices, suggesting that stylistic devices do not get lost in translation. Lexico-semantic and grammatical analyses showed that translators of Malayo-Polynesian languages employed idiomatic translation as a compensatory strategy, producing English translations that manifest Dynamic Equivalence or transparency; translators of Romance languages resorted to synonymous substitution or literal translation, suggesting Formal Equivalence or fidelity; and translators of Germanic languages used a combination of idiomatic and literal translation strategies, with noticeable preference for Dynamic Equivalence, evidenced by the prevalence of metaphorical translations as compensatory strategy. Implications on the intricate relationship between culture and language in the translation process were drawn based on the findings.

Keywords: translation strategy, dynamic equivalence, formal equivalence, translation theory, transparency, fidelity

Procedia PDF Downloads 43

57 Exploring Twitter Data on Human Rights Activism on Olympics Stage through Social Network Analysis and Mining

Authors: Teklu Urgessa, Joong Seek Lee

Abstract:

Social media is becoming the primary choice of activists for making their voices heard. Two main reasons account for this. The first is the emergence of Web 2.0, which gave users the opportunity to become content creators rather than passive recipients. The second is the control of mainstream mass media outlets by governments and by individuals with political and economic interests. This paper explores Twitter data from network actors talking about the Rio 2016 marathon silver medalist, who showed solidarity with the Oromo protesters in Ethiopia at the finish line of the marathon race. The aim is to discover important insights using social network analysis and mining. The hashtag #FeyisaLelisa was used for the Twitter network search. The actors’ network was visualized and analyzed. It showed that the central influencers during the first 10 days in August were international media outlets, whereas in September they were individual activists. The degree distribution of the network is scale-free, with the frequency of degrees decaying according to a power law. Text mining was also used to derive meaningful themes from the tweet corpus about the event selected for analysis. The semantic network indicated 15 important clusters of concepts that provided different insights into the why, who, where, and how of the situation related to the event. The sentiments of the words in the tweets were also analyzed, indicating that 95% of the opinions in the tweets were either positive or neutral. Overall, the findings showed that the marathoner’s Olympic-stage protest brought the issue of the Oromo protest to the global stage. A new research framework for event-based social network analysis and mining is proposed, based on the practical procedures followed in this research for event-based social media sense-making.
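The degree-distribution claim in this abstract can be illustrated with a minimal sketch. The edge list and account names below are hypothetical, and the log-log least-squares slope is only a rough indicator of power-law decay, not necessarily the procedure the authors used.

```python
import math
from collections import Counter

def degree_distribution(edges):
    """Return {degree: number of nodes with that degree} for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())

def loglog_slope(distribution):
    """Least-squares slope of log(frequency) against log(degree).

    A clearly negative slope is consistent with power-law decay; it is a
    crude check, not a formal goodness-of-fit test.
    """
    points = [(math.log(k), math.log(n)) for k, n in distribution.items()]
    if len(points) < 2:
        raise ValueError("need at least two distinct degrees")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx

# Hypothetical toy mention network: one hub account and four ordinary users.
edges = [
    ("hub", "user1"), ("hub", "user2"), ("hub", "user3"), ("hub", "user4"),
    ("user1", "user2"),
]
distribution = degree_distribution(edges)
slope = loglog_slope(distribution)
print(dict(distribution), round(slope, 3))
```

On real data, the edge list would come from retweet or mention pairs collected for the hashtag, and a dedicated power-law fitting method would be preferable to a simple regression.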

Keywords: human rights, Olympics, social media, network analysis, social network mining

Procedia PDF Downloads 239

56 Between the Pen and the Dish Towel: Paradox of Globalization

Authors: Sandra Maria Cerqueira Da Silva

Abstract:

In Brazil, women are the majority of the country's population. They have advanced in terms of years of education and professional training. However, this has not prevented the differences in the labor market from being sustained, particularly the wage gap and inequalities concerning the access to command positions and promotions, i.e., in the gender relations and treatment. One of the conditions which constitute a barrier to career advancement is the necessary support chain to support women when they are in the labor market. Therefore, the purpose of this research is to demonstrate, describe, and criticize some of the current conformations of support chains and how these compete to promote the phenomenon known as glass ceiling in the country. However, this support may come even from inside a woman's own home, with a fairer division of household activities between men and women. Such behavior can free an entire network of women within the same family. In addition, it can serve as pressure to structure better conditions for women as a whole, improving the living conditions of the poor population. This can occur through programs and projects for qualification and retraining of adult women. In answer to the question that guides this study, it is concluded that a family support system is critical to the success of women in management positions. To meet this demand, one of the ways could be the development of specific gender policies by the public authorities, in accordance with the emerging global economic policies, in order to provide and structure the necessary support. This would respond to feminist manifestations - which should go on pointing needs – although the legislative assembly should also propose ideas to change this picture. This is a qualitative research, with a poststructuralist approach, featuring a cutout corpus of three interviews carried out with women holding leadership positions in the academia. Questions related to this very discussion are many. 
New studies could address points such as the promotion of qualification and the expansion of skills of women in subaltern conditions. There is also a need to investigate possible support systems, considering the inequalities and local economic conditions.

Keywords: gender and labor market, glass ceiling, post-structuralism, support chain

Procedia PDF Downloads 220

55 Teaching Translation in Brazilian Universities: A Study about the Possible Impacts of Translators’ Comments on the Cyberspace about Translator Education

Authors: Erica Lima

Abstract:

The objective of this paper is to discuss relevant points about teaching translation in Brazilian universities and the possible impacts of blogs and social networks on translator education today. It analyzes the curricula of Brazilian translation courses, contrasting them with information obtained from two highly visible social networking groups in the area concerning the characteristics essential to becoming a successful professional. The research therefore has, as its main corpus, a few undergraduate translation programs’ syllabuses, as well as a few postings in social network groups that specifically share professional opinions regarding whether a translator needs a degree in translation to practice the profession. To a certain extent, such comments and their corresponding responses lead to the propagation of discourses that influence the ideas aspiring translators and recent graduates end up having about themselves and their undergraduate courses. The postings also show that many professionals do not have a clear position regarding translator education; while refuting it, they also encourage “free” courses. It is thus observed that cyberspace constitutes, on the one hand, a place where people mobilize in defense of similar ideas. On the other hand, it embodies a place of tension and conflict, since there are many participants and, as in any other situation of interlocution, disagreements may arise. From the postings, aspects related to professionalism were analyzed (including discussions about regulation), as well as questions about the classic dichotomies: theory/practice; art/technique; self-education/academic training. As a partial result, a common interest in the valorization of the profession can be noted, although there is no consensus on the characteristics essential to being a good translator. 
It was also possible to observe that the set of socially constructed representations in the group reflects characteristics of the world situation of the translation courses (especially in some European countries and in the United States), which, in the first instance, does not accurately reflect the Brazilian idiosyncrasies of the area.

Keywords: cyberspace, teaching translation, translator education, university

Procedia PDF Downloads 367

54 Language Politics and Identity in Translation: From a Monolingual Text to Multilingual Text in Chinese Translations

Authors: Chu-Ching Hsu

Abstract:

This paper focuses on how government-led language policies and political changes in Taiwan shape the choice of languages in translations, and on what translation strategies translators employ to show their language ideology behind the power struggles and decision-making. Framed by Lefevere’s theoretical concept of translating as rewriting, and carried out as a diachronic and chronological study, this paper specifically sets out to investigate the language ideology and translators’ idiolect in Chinese-language translations of Anglo-American novels. The examples used to explore these issues are taken from different Chinese renditions of Mark Twain’s English-language novel The Adventures of Huckleberry Finn, which contains several dialogues originally written in the colloquial language and dialect used in the American state of Mississippi and reproduced in Mark Twain’s works. Also, adopting corpus methodology, many examples are extracted from the translated texts and the source text to illuminate how translators in Taiwan deal with the dialectal features encoded in Twain’s works, and how Taiwanese translators have used different versions of Chinese translations to confirm the language policies and to express their language identity textually in different periods of the past five decades, from the 1960s onward. The findings of this study suggest that the use of Taiwanese dialect and language patterns in translations does relate to the mother-tongue language movement and the language ideology of the translator, as well as to the issue of language identity raised on the island of Taiwan. 
Furthermore, this study confirms that changes of political power in Taiwan do have a significant impact on language policy (assimilationism, pluralism, or multiculturalism), which has also turned Taiwan from a monolingual into a multilingual society, where language ideology and identity are revealed not only in people’s daily communication but also in written translations.

Keywords: language politics and policies, literary translation, mother-tongue, multiculturalism, translator’s ideology

Procedia PDF Downloads 381

53 A Longitudinal Case Study of Greek as a Second Language

Authors: M. Vassou, A. Karasimos

Abstract:

A primary concern in the field of Second Language Acquisition (SLA) research is to determine the innate mechanisms of second language learning and acquisition through the systematic study of a learner's interlanguage. Errors emerge while a learner attempts to communicate using the target language and can be seen either as the observable linguistic product of the latent cognitive and language process of mental representations or as an indispensable learning mechanism. Therefore, the study of the learner’s erroneous forms may depict the various strategies and mechanisms that take place during the language acquisition process, resulting in deviations from the target-language norms and difficulties in communication. Mapping the erroneous utterances of a late adult learner in the process of acquiring Greek as a second language constitutes one of the main aims of this study. For our research purposes, we created an error-tagged learner corpus composed of the participant’s written texts produced throughout a 4-year period of instructed language acquisition. Error analysis and interlanguage theory constitute the methodological and theoretical framework, respectively. The research questions pertain to the learner's most frequent errors per linguistic category and per year, as well as his choices concerning the Greek Article System. According to the quantitative analysis of the data, the most frequent errors are observed in the categories of the stress system and syntax, whereas a significant fluctuation and/or gradual reduction throughout the 4 years of instructed acquisition indicates the emergence of developmental stages. The findings with regard to article usage bespeak fossilization of erroneous structures in certain contexts. 
In general, our results point towards the existence and further development of an established learner’s (inter-)language system governed not only by mother-tongue and target-language influences but also by the learner’s assumptions and set of rules as the result of a complex cognitive process. It is expected that this study will not only contribute to knowledge in the field of Greek as a second language and SLA generally, but also provide an insight into the cognitive mechanisms and strategies developed by multilingual learners of late adulthood.
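The tallying of errors per linguistic category and per year described above can be sketched as follows. The records, category names, and counts are hypothetical placeholders for illustration only; they are not the study's data or tag set.

```python
from collections import Counter, defaultdict

# Hypothetical error-tagged records: (year of instruction, error category).
tagged_errors = [
    (1, "stress"), (1, "stress"), (1, "syntax"), (1, "article"),
    (2, "stress"), (2, "syntax"), (2, "syntax"),
    (3, "stress"), (3, "article"),
    (4, "article"),
]

def errors_per_year(records):
    """Build {year: Counter({category: count})} from (year, category) pairs."""
    table = defaultdict(Counter)
    for year, category in records:
        table[year][category] += 1
    return table

def most_frequent_category(records):
    """Return the category with the highest overall error count."""
    return Counter(category for _, category in records).most_common(1)[0][0]

table = errors_per_year(tagged_errors)
totals_by_year = {year: sum(counts.values()) for year, counts in sorted(table.items())}
print(totals_by_year)
print(most_frequent_category(tagged_errors))
```

Comparing the yearly totals gives a first view of fluctuation or reduction over time; a category that stays flat across years would be a candidate for closer inspection.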

Keywords: Greek as a second language, error analysis, interlanguage, late adult learner

Procedia PDF Downloads 114

52 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This is because this kind of bias, when present in the text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is of importance. The goal of this work is to identify the subjectively biased language in text on a sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network incorporated with attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), which was compiled from Wikipedia edits that removed various biased instances from sentences as a benchmark dataset, with which we also compare our model to existing approaches. Experimental analysis indicates an improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.
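The attention step that pools per-token vectors into a single sentence vector can be sketched in plain Python. This is an illustration only: the token vectors stand in for Bi-LSTM outputs over BERT embeddings, and dot-product scoring against a query vector is an assumption, since the abstract does not specify the attention variant.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each token vector by its dot product with `query`, then sum.

    hidden_states: list of equal-length token vectors (stand-ins for Bi-LSTM outputs).
    query: vector of the same length (learned in a real model; fixed here).
    Returns (attention weights, pooled sentence vector).
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled

# Hypothetical 2-dimensional token vectors for a three-token sentence.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
weights, sentence_vector = attention_pool(hidden_states, query)
print(weights, sentence_vector)
```

In a full model, the pooled vector would feed a classification layer that predicts whether the sentence is subjectively biased.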

Keywords: subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing

Procedia PDF Downloads 114

51 Academic Literacy: Semantic-Discursive Resource and the Relationship with the Constitution of Genre for the Development of Writing

Authors: Lucia Rottava

Abstract:

The present study focuses on academic literacy and addresses the impact of semantic-discursive resources on the constitution of the genres produced in that context. The research considers the development of writing in the academic context in Portuguese. Research addressing academic literacy and the characteristics of the texts produced in this context is rare, particularly research focused on the development of writing that considers three variables: the constitution of the writer, the perception of the reader/interlocutor, and the organization of the informational flow of the text. The research aims to map the semantic-discursive resources of the written register in texts of several genres produced by students in the first semester of the undergraduate course in Letters. The hypothesis raised is that writing in the academic environment is not a recurrent literacy practice for these learners and that this can be explained by the ontogenetic and phylogenetic nature of language development. Qualitative in nature, the present research takes as empirical data texts produced in a semester-long course in Reading and Textual Production; these data result from four different writing proposals, for a total of 600 texts. The corpus is analyzed on the basis of semantic-discursive resources, seeking to cover relevant aspects of language (grammar, discourse, and social context) that reveal the choices made in the reader/writer interrelationship and the organizational flow of the text. The analysis covers three semantic-discursive resources: (a) appraisal and negotiation, to understand the attitudes negotiated (the roles of the participants in the discourse and their relationship with one another); (b) ideation, to explain the construction of experience (activities performed and participants); and (c) periodicity, to outline the flow of information in the organization of the text according to the genre it instantiates. 
The results indicate the organizational difficulties of the flow of the text information. Cartography contributes to the understanding of the way writers use language in an effort to present themselves, evaluate someone else’s work, and communicate with readers.

Keywords: academic writing, Portuguese mother tongue, semantic-discursive resources, systemic functional linguistics

Procedia PDF Downloads 110

50 Searching for the ‘Why’ of Gendered News: Journalism Practices and Societal Contexts

Authors: R. Simões, M. Silveirinha

Abstract:

Driven by the need to understand the results of previous research that clearly shows deep imbalances in media discourses about women and men, despite the growing number of female journalists, our paper aims to progress from the 'what' to the 'why' of these unbalanced representations. It does so at a time when journalism is undergoing dramatic change in professional practices and in how media organizations are organized and run, affecting women in particular. While some feminist research points to the fact that female and male journalists evaluate the role of the news and production methods in similar ways, feminist theorizing also suggests that thought and knowledge are highly influenced by social identity, which is itself inherently affected by experiences of gender. This is particularly important at a time of deep societal and professional change. While there are persuasive discussions of gender identities at work in newsrooms in various countries, studies on the issue will benefit from cases that focus on the particularities of local contexts. In our paper, we present one such case: Portugal, a country hit hard by austerity measures that have affected all cultural industries, including journalism organizations, which were already feeling the broader impacts of larger societal changes in the media landscape. Can we gender these changes? How are they felt and understood by female and male journalists? And how are these discourses framed by androcentric, feminist and post-feminist sensibilities? Foregrounding questions of gender, our paper seeks to explore some of the interactions of societal and professional forces, identifying their gendered character and outlining how they shape journalism work in general and the production of unbalanced gender representations in particular. 
We do so grounded in feminist studies of journalism as well as feminist organizational and work studies, looking at a corpus of 20 in-depth interviews of female and male Portuguese journalists. The research findings illustrate how gender in journalism practices interacts with broader experiences of the cultural and economic contexts and show the ambivalences of these interactions in news organizations.

Keywords: gender, journalism, newsroom culture, Portuguese journalists

Procedia PDF Downloads 387

49 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. It is one of the most fundamental tasks in almost all natural language processing. In natural language processing, the need for large amounts of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the limited availability of tagged corpora is a bottleneck for natural language processing, especially for POS tagging. A promising way to tackle this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for Amharic using K-Means clustering, motivated by the large amount of data produced in day-to-day activities. The tagger is developed in four phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this stage, the raw text is split into sentences and then into words. The second phase is feature extraction, which covers word frequency and the syntactic and morphological features of a word. The third phase is clustering: among the different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, which involves inspecting each cluster carefully and assigning the most common tag to the group. This study identifies two features that are capable of distinguishing one part of speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging. 
To increase the performance of the unsupervised part-of-speech tagger, other features not included in this study, such as semantic information, need to be incorporated. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
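The clustering and mapping phases can be sketched on toy data. Everything specific here is an assumption for illustration: the placeholder tokens w1 to w6, the two numeric features (a suffix indicator and a relative sentence position), the first-k-points initialization, and the small set of hand-checked tags used for mapping. It is not the authors' implementation.

```python
from collections import Counter

def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

def map_clusters_to_tags(names, assignment, checked_tags):
    """Give each cluster the most common tag among its hand-checked members."""
    votes = {}
    for name, cluster in zip(names, assignment):
        if name in checked_tags:
            votes.setdefault(cluster, Counter())[checked_tags[name]] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}

# Hypothetical tokens with features (suffix indicator, relative position).
tokens = {
    "w1": (0.0, 0.1), "w2": (1.0, 0.9), "w3": (0.1, 0.2),
    "w4": (0.9, 1.0), "w5": (0.0, 0.0), "w6": (1.0, 1.0),
}
names = list(tokens)
assignment = kmeans([tokens[n] for n in names], k=2)

# Hypothetical tags for the few tokens inspected by hand.
checked_tags = {"w1": "NOUN", "w3": "NOUN", "w2": "VERB", "w4": "VERB", "w6": "NOUN"}
cluster_tag = map_clusters_to_tags(names, assignment, checked_tags)
predicted = {n: cluster_tag[a] for n, a in zip(names, assignment)}
print(assignment, cluster_tag, predicted)
```

Because each cluster receives a single majority tag, any minority-tag member (w6 in this toy data) is mislabeled, which is one reason richer features matter for accuracy.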

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 426