Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1711

Search results for: arabic text

1111 Exploring Moroccan Teachers Beliefs About Multilingualism

Abstract:

In this study, author tried to explore the beliefs of some Moroccan teachers working in the delegations of Safi and Youcefia about the usefulness of first and second languages in learning the third language. More specifically, author attempted to see the extent to which these teachers believe that a first and second language can serve students in learning a third one. The first language in this context is Arabic, the second is French, and the third is English. The teachers’ beliefs were gathered through a questionnaire that was addressed via Google Forms. Then, the results were analyzed using the same application. It was found that teachers are positive about the usefulness of the first and second language in learning the third one, but most of them rarely use in a conscious way activities that serve this purpose.

Keywords: Bilinguilism, teachers beliefs, English as ESL, Morocco

Procedia PDF Downloads 56

1110 JaCoText: A Pretrained Model for Java Code-Text Generation

Authors: Jessica Lopez Espejel, Mahaman Sanoussi Yahaya Alassan, Walid Dahhane, El Hassane Ettifouri

Abstract:

Pretrained transformer-based models have shown high performance in natural language generation tasks. However, a new wave of interest has surged: automatic programming language code generation. This task consists of translating natural language instructions to a source code. Despite the fact that well-known pre-trained models on language generation have achieved good performance in learning programming languages, effort is still needed in automatic code generation. In this paper, we introduce JaCoText, a model based on Transformer neural network. It aims to generate java source code from natural language text. JaCoText leverages the advantages of both natural language and code generation models. More specifically, we study some findings from state of the art and use them to (1) initialize our model from powerful pre-trained models, (2) explore additional pretraining on our java dataset, (3) lead experiments combining the unimodal and bimodal data in training, and (4) scale the input and output length during the fine-tuning of the model. Conducted experiments on CONCODE dataset show that JaCoText achieves new state-of-the-art results.

Keywords: java code generation, natural language processing, sequence-to-sequence models, transformer neural networks

Procedia PDF Downloads 287

1109 Disowning of ‘Our Lady of Alice Bhatti’ by Mohammad Hanif Through Gendered and Religious Discourse

Authors: Abrar Ajmal

Abstract:

The language used in literature reveals the culture and social gestalt of any society in which it has been constructed and consumed. This paper carries the same rationale, which aims to track certain socio-religious and cultural-economic disparities and discrepancies towards minorities, particularly Christians, in an Islamic re(public) where there is a clear majority of Muslims with the help of analysis of instances of language used in the narratives “Our Lady of Alice Bhatt” by Mohammad Hanif. It would highlight social inequalities practiced deeply in sociocultural discourse. Moreover, this research would also touch upon the question of gender discrimination and gender construction as a female entity in a male-chauvinistic scenic turnout using language since the novel revolves around communicative forfeits of Alice Bhatti’s life where she is fraying in fisticuffs to befit herself in a miss-fitted society. It would employ using Fairclough's framework for analysis to conduct a critical discourse analysis of the text at three axiom levels namely textual analysis, discursive practices, and socio-cultural analysis. Thus, the results would reveal textual findings in linguistic analysis, a range of embedded discourses in discursive practices, and consumption of the text into socio-cultural explications with the use of language and lexicalization employed in the selected excerpts.

Keywords: gendered discourse, socio-economic disparities minorities, Islamization, analytical framework

Procedia PDF Downloads 60

1108 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded presentations of natural language that in text form. However, it is unknown how much linguistic information is encoded and how. In this paper, we construct several corresponding probing tasks for multiple linguistic information to clarify the encoding capabilities of different language models and performed a visual display. We firstly obtain word presentations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small scale of parameters and unsupervised tasks are then applied on these word vectors to discriminate their capability to encode corresponding linguistic information. The constructed probe tasks contain both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the grammatical aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.

Keywords: language models, probing task, text presentation, linguistic information

Procedia PDF Downloads 114

1107 On ‘Freaks’ and the Feminine in Margaret Atwood’s ‘Lusus Naturae’

Authors: Shahd Alshammari

Abstract:

This paper considers one of Margaret Atwood’s short stories ‘Lusus Naturae'. Through a critical lens that makes use of Julia Kristeva’s work on Powers of Horror and abjection, this paper suggests that the monstrous girl is the disabled woman, the abject in society. The monster is used as a metaphor for the unknown, the misunderstood, and the ‘different’ woman. Culturally Relevant Teaching (CRT) is a pedagogy that calls for making course material accessible and relevant to students. Through the study of literary texts, we are able to help create agency inside and outside the classroom. Stories are a necessary part of establishing connections across borders and boundaries. Stories are meant to raise awareness both inside and outside the classroom. The discussion is equally important, and the text is meant to facilitate relevant questions that the students need to consider when it comes to identity. Questions to consider are: what does it mean to be a ‘girl’ today, and what implications and consequences are at hand when you fail to perform this gendered identity? Gender is sometimes a fatal bond in the Middle East, and even more so, is the disability. In the case of our unnamed protagonist, she undergoes a process of un-becoming, a non-linear process of growing up. In a sense, it is a counter-Bildungsroman. The reading of this text emphasizes that a non-linear narrative is sometimes necessary for the female protagonist’s self-awareness and development. Discussion in class facilitates this sense of agency and questioning of gender and disability.

Keywords: disability, gender, literature, pedagogy

Procedia PDF Downloads 664

1106 The Linguistic Fingerprint in Western and Arab Judicial Applications

Authors: Asem Bani Amer

Abstract:

This study handles the linguistic fingerprint in judicial applications described in a law technicality that is recent and developing. It can be adopted to discover criminals by identifying their way of speaking and their special linguistic expressions. This is achieved by understanding the expression "linguistic fingerprint," its concept, and its extended domain, then revealing some of the linguistic fingerprint tools in Western judicial applications and deducing a technical imagination for a linguistic fingerprint in the Arabic language, which is needy for such judicial applications regarding this field, through dictionaries, language rhythm, and language structure.

Keywords: linguistic fingerprint, judicial, application, dictionary, picture, rhythm, structure

Procedia PDF Downloads 81

1105 The Use of Neuter in Oedipus Lines to Refer to Antigone in Phoenissae of Seneca

Authors: Cíntia Martins Sanches

Abstract:

In the first part of Phoenissae of Seneca, Antigone is a guide to Oedipus, and they leave Thebes: he is blind searching for death (inflicting the punishment himself wished on the killer of Laius, ie exile and death); she is trying to convince him to give up such punishment and bring him back to Thebes. Concerning Oedipus lines, we observed a high frequency of Latin neuter in the treatment the protagonist gave to his daughter Antigone. We considered in this study that such frequency may be related to the sanctification of the daughter, who is seen by him as an enlightened being and without defects, free of the human condition (which takes on the existence of failures by essence). This study, thus, puts forward an analysis of the passages the said feature is present, relating them to the effect of meaning found in each occurrence. As part of a doctorate, this study investigates the stylistic idiom of Seneca in the Oedipus and Phoenissae tragedies, aiming at translating both tragedies expressively. The concept of stylistic idiom concerns the stylistic affinity required for a translation to be equivalent to the source text. In this wise, this study inquires into how the Latin text is organized poetically, pointing out the expressive features frequently appearing in both dramas. The method we used is based on the Semiotics theory — observing how connotation, ie a language use in which prevails the poetic function, naturally polysemous, acts to achieve each expressive effect.

Keywords: antigone, neuter, Oedipus, Phoenissae, Seneca

Procedia PDF Downloads 289

1104 Ranking Priorities for Digital Health in Portugal: Aligning Health Managers’ Perceptions with Official Policy Perspectives

Authors: Pedro G. Rodrigues, Maria J. Bárrios, Sara A. Ambrósio

Abstract:

The digitalisation of health is a profoundly transformative economic, political, and social process. As is often the case, such processes need to be carefully managed if misunderstandings, policy misalignments, or outright conflicts between the government and a wide gamut of stakeholders with competing interests are to be avoided. Thus, ensuring open lines of communication where all parties know what each other’s concerns are is key to good governance, as well as efficient and effective policymaking. This project aims to make a small but still significant contribution in this regard in that we seek to determine the extent to which health managers’ perceptions of what is a priority for digital health in Portugal are aligned with official policy perspectives. By applying state-of-the-art artificial intelligence technology first to the indexed literature on digital health and then to a set of official policy documents on the same topic, followed by a survey directed at health managers working in public and private hospitals in Portugal, we obtain two priority rankings that, when compared, will allow us to produce a synthesis and toolkit on digital health policy in Portugal, with a view to identifying areas of policy convergence and divergence. This project is also particularly peculiar in the sense that sophisticated digital methods related to text analytics are employed to study good governance aspects of digitalisation applied to health care.

Keywords: digital health, health informatics, text analytics, governance, natural language understanding

Procedia PDF Downloads 66

1103 Leveraging Large Language Models to Build a Cutting-Edge French Word Sense Disambiguation Corpus

Authors: Mouheb Mehdoui, Amel Fraisse, Mounir Zrigui

Abstract:

With the increasing amount of data circulating over the Web, there is a growing need to develop and deploy tools aimed at unraveling semantic nuances within text or sentences. The challenges in extracting precise meanings arise from the complexity of natural language, while words usually have multiple interpretations depending on the context. The challenge of precisely interpreting words within a given context is what the task of Word Sense Disambiguation meets. It is a very old domain within the area of Natural Language Processing aimed at determining a word’s meaning that it is going to carry in a particular context, hence increasing the correctness of applications processing the language. Numerous linguistic resources are accessible online, including WordNet, thesauri, and dictionaries, enabling exploration of diverse contextual meanings. However, several limitations persist. These include the scarcity of resources for certain languages, a limited number of examples within corpora, and the challenge of accurately detecting the topic or context covered by text, which significantly impacts word sense disambiguation. This paper will discuss the different approaches to WSD and review corpora available for this task. We will contrast these approaches, highlighting the limitations, which will allow us to build a corpus in French, targeted for WSD.

Keywords: semantic enrichment, disambiguation, context fusion, natural language processing, multilingual applications

Procedia PDF Downloads 15

1102 The Role of Digital Text in School and Vernacular Literacies: Students Digital Practices at Cybercafés in Mexico

Authors: Guadalupe López-Bonilla

Abstract:

Students of all educational levels participate in literacy practices that may involve print or digital media. Scholars from the New Literacy Studies distinguish practices that fulfill institutional purposes such as those established at schools from literate practices aimed at doing other kinds of activities, such as reading instructions in order to play a video game; the first are known as institutional practices while the latter are considered vernacular literacies. When students perform these kinds of activities they engage with print and digital media according to the demands of the task. In this paper, it is aimed to discuss the results of a research project focusing on literacy practices of high school students at 10 urban cybercafés in Mexico. The main objective was to analyze the literacy practices of students performing both school tasks and vernacular literacies. The methodology included a focused ethnography with online and face to face observations of 10 high school students (5 male and 5 female) and interviews after performing each task. In the results, it is presented how students treat texts as open, dynamic and relational artifacts when engaging in vernacular literacies; while texts are conceived as closed, authoritarian and fixed documents when performing school activities. Samples of each type of activity are shown followed by a discussion of the pedagogical implications for improving school literacy.

Keywords: digital literacy, text, school literacy, vernacular practices

Procedia PDF Downloads 272

1101 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features

Authors: Kyi Pyar Zaw, Zin Mar Kyu

Abstract:

Character recognition is the process of converting a text image file into editable and searchable text file. Feature Extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed paper, 25 features for one character are used in character recognition. Basically, there are three steps of character recognition such as character segmentation, feature extraction and classification. In segmentation step, horizontal cropping method is used for line segmentation and vertical cropping method is used for character segmentation. In the Feature extraction step, features are extracted in two ways. The first way is that the 8 features are extracted from the entire input character using eight direction chain code frequency extraction. The second way is that the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a feature for one block. Therefore, 16 features are extracted from that 16 blocks in the second way. We use the number of holes feature to cluster the similar characters. We can recognize the almost Myanmar common characters with various font sizes by using these features. All these 25 features are used in both training part and testing part. In the classification step, the characters are classified by matching the all features of input character with already trained features of characters.

Keywords: chain code frequency, character recognition, feature extraction, features matching, segmentation

Procedia PDF Downloads 321

1100 Teaching Tolerance in the Language Classroom through a Text

Authors: Natalia Kasatkina

Abstract:

In an ever-increasing globalization, one’s grasp of diversity and tolerance has never been more indispensable, and it is a vital duty for all those in the field of foreign language teaching to help children cultivate such values. The present study explores the role of DIVERSITY and TOLERANCE in the language classroom and elementary, middle, and high school students’ perceptions of these two concepts. It draws on several theoretical domains of language acquisition, cultural awareness, and school psychology. Relying on these frameworks, the major findings are synthesized, and a paradigm of teaching tolerance through language-teaching is formulated. Upon analysing how tolerant our children are with ‘others’ in and outside the classroom, we have concluded that intolerance and aggression towards the ‘other’ increase with age, and that a feeling of supremacy over migrants and a sense of fear towards them begin to manifest more apparently when the students are in high school. In addition, we have also found that children in elementary school do not exhibit such prejudiced thoughts and behavior, which leads us to the believe that tolerance as well as intolerance are learned. Therefore, it is within our reach to teach our children to be open-minded and accepting. We have used the novel ‘Uncle Tom’s Cabin’ by Harriet Beecher Stowe as a springboard for lessons which are not only targeted at shedding light on the role of language in the modern world, but also aim to stimulate an awareness of cultural diversity. We equally strive to conduct further cross-cultural research in order to solidify the theory behind this study, and thus devise a language-based curriculum which would encourage tolerance through the examination of various literary texts.

Keywords: literary text, tolerance, EFL classroom, word-association test

Procedia PDF Downloads 293

1099 Multimodal Content: Fostering Students’ Language and Communication Competences

Authors: Victoria L. Malakhova

Abstract:

The research is devoted to multimodal content and its effectiveness in developing students’ linguistic and intercultural communicative competences as an indefeasible constituent of their future professional activity. Description of multimodal content both as a linguistic and didactic phenomenon makes the study relevant. The objective of the article is the analysis of creolized texts and the effect they have on fostering higher education students’ skills and their productivity. The main methods used are linguistic text analysis, qualitative and quantitative methods, deduction, generalization. The author studies texts with full and partial creolization, their features and role in composing multimodal textual space. The main verbal and non-verbal markers and paralinguistic means that enhance the linguo-pragmatic potential of creolized texts are covered. To reveal the efficiency of multimodal content application in English teaching, the author conducts an experiment among both undergraduate students and teachers. This allows specifying main functions of creolized texts in the process of language learning, detecting ways of enhancing students’ competences, and increasing their motivation. The described stages of using creolized texts can serve as an algorithm for work with multimodal content in teaching English as a foreign language. The findings contribute to improving the efficiency of the academic process.

Keywords: creolized text, English language learning, higher education, language and communication competences, multimodal content

Procedia PDF Downloads 113

1098 Evaluation of the Efficacy and Tolerance of Gabapentin in the Treatment of Neuropathic Pain

Authors: A. Ibovi Mouondayi, S. Zaher, R. Assadi, K. Erraoui, S. Sboul, J. Daoudim, S. Bousselham, K. Nassar, S. Janani

Abstract:

INTRODUCTION: Neuropathic pain (NP) caused by damage to the somatosensory nervous system has a significant impact on quality of life and is associated with a high economic burden on the individual and society. The treatment of neuropathic pain consists of the use of a wide range of therapeutic agents, including gabapentin, which is used in the treatment of neuropathic pain. OBJECTIF: The objective of this study was to evaluate the efficacy and tolerance of gabapentin in the treatment of neuropathic pain. MATERIAL AND METHOD: This is a monocentric, cross-sectional, descriptive, retrospective study conducted in our department over a period of 19 months from October 2020 to April 2022. The missing parameters were collected during phone calls of the patients concerned. The diagnostic tool adopted was the DN4 questionnaire in the dialectal Arabic version. The impact of NP was assessed by the visual analog scale (VAS) on pain, sleep, and function. The impact of PN on mood was assessed by the "Hospital anxiety, and depression scale HAD" score in the validated Arabic version. The exclusion criteria were patients followed up for depression and other psychiatric pathologies. RESULTS: A total of 67 patients' data were collected. The average age was 64 years (+/- 15 years), with extremes ranging from 26 years to 94 years. 58 women and 9 men with an M/F sex ratio of 0.15. Cervical radiculopathy was found in 21% of this population, and lumbosacral radiculopathy in 61%. Gabapentin was introduced in doses ranging from 300 to 1800 mg per day with an average dose of 864 mg (+/- 346) per day for an average duration of 12.6 months. Before treatment, 93% of patients had a non-restorative sleep quality (VAS>3). 54% of patients had a pain VAS greater than 5. The function was normal in only 9% of patients. The mean anxiety score was 3.25 (standard deviation: 2.70), and the mean HAD depression score was 3.79 (standard deviation: 1.79). After treatment, all patients had improved the quality of their sleep (p<0.0001). A significant difference was noted in pain VAS, function, as well as anxiety and depression, and HAD score. Gabapentin was stopped for side effects (dizziness and drowsiness) and/or unsatisfactory response. CONCLUSION: Our data demonstrate a favorable effect of gabapentin on the management of neuropathic pain with a significant difference before and after treatment on the quality of life of patients associated with an acceptable tolerance profile.

Keywords: neuropathic pain, chronic pain, treatment, gabapentin

Procedia PDF Downloads 95

1097 A Postmodern Framework for Quranic Hermeneutics

Authors: Christiane Paulus

Abstract:

Post-Islamism assumes that the Quran should not be viewed in terms of what Lyotard identifies as a ‘meta-narrative'. However, its socio-ethical content can be viewed as critical of power discourse (Foucault). Practicing religion seems to be limited to rites and individual spirituality, taqwa. Alternatively, can we build on Muhammad Abduh's classic-modern reform and develop it through a postmodernist frame? This is the main question of this study. Through his general and vague remarks on the context of the Quran, Abduh was the first to refer to the historical and cultural distance of the text as an obstacle for interpretation. His application, however, corresponded to the modern absolute idea of authentic sharia. He was followed by Amin al-Khuli, who hermeneutically linked the content of the Quran to the theory of evolution. Fazlur Rahman and Nasr Hamid abu Zeid remain reluctant to go beyond the general level in terms of context. The hermeneutic circle, therefore, persists in challenging, how to get out to overcome one’s own assumptions. The insight into and the acceptance of the lasting ambivalence of understanding can be grasped as a postmodern approach; it is documented in Derrida's discovery of the shift in text meanings, difference, also in Lyotard's theory of différend. The resulting mixture of meanings (Wolfgang Welsch) can be read together with the classic ambiguity of the premodern interpreters of the Quran (Thomas Bauer). Confronting hermeneutic difficulties in general, Niklas Luhmann proves every description an attribution, tautology, i.e., remaining in the circle. ‘De-tautologization’ is possible, namely by analyzing the distinctions in the sense of objective, temporal and social information that every text contains. This could be expanded with the Kantian aesthetic dimension of reason (critique of pure judgment) corresponding to the iʽgaz of the Coran. Luhmann asks, ‘What distinction does the observer/author make?’ Quran as a speech from God to the first listeners could be seen as a discourse responding to the problems of everyday life of that time, which can be viewed as the general goal of the entire Qoran. Through reconstructing koranic Lifeworlds (Alfred Schütz) in detail, the social structure crystallizes the socio-economic differences, the enormous poverty. The koranic instruction to provide the basic needs for the neglected groups, which often intersect (old, poor, slaves, women, children), can be seen immediately in the text. First, the references to lifeworlds/social problems and discourses in longer koranic passages should be hypothesized. Subsequently, information from the classic commentaries could be extracted, the classical Tafseer, in particular, contains rich narrative material for reconstructing. By selecting and assigning suitable, specific context information, the meaning of the description becomes condensed (Clifford Geertz). In this manner, the text gets necessarily an alienation and is newly accessible. The socio-ethical implications can thus be grasped from the difference of the original problem and the revealed/improved order/procedure; this small step can be materialized as such, not as an absolute solution but as offering plausible patterns for today’s challenges as the Agenda 2030.

Keywords: postmodern hermeneutics, condensed description, sociological approach, small steps of reform

Procedia PDF Downloads 221

1096 Aspectual Verbs in Modern Standard Arabic

Authors: Yasir Alotaibi

Abstract:

The aim of this paper is to discuss the syntactic analysis of aspectual or phasal verbs in Modern Standard Arabic (MSA). Aspectual or phasal verbs refer to a class of verbs that require a verbal complement and denote the inception, duration, termination ...etc. of a state or event. This paper will discuss two groups of aspectual verbs in MSA. The first group includes verbs such as ̆gacala, tafiqa, ?akhatha, ?ansha?a, sharaca and bada?a and these verbs are used to denote the inception of an event. The second group includes verbs such as ?awshaka, kaada and karaba and the meaning of these verbs is equivalent to be near/almost . The following examples illustrate the use of the verb bada?a ‘begin’ which is from the first group: a. saalim-un bada?a yuthaakiru. Salem-NOM begin.PFV.3SGM study.IPFV.3SGM ‘Salem began to study’ b.*saalim-un bada?a ?an yuthaakiru. Salem-NOM begin.PFV.3SGM COMP study.IPFV.3SGM ‘Salem began to study’ The example in (1a) is grammatical because the aspectual verb is used with a verbal complement that is not introduced by a complementizer. In contrast, example (1b) is not grammatical because the verbal complement is introduced by the complementizer ?an ‘that’. In contrast, the following examples illustrate the use of the verb kaada ‘be almost’ which is from the second group. However, the two examples are grammatical and this means that the verbal complement of this verb can be without (as in example (2a)) or with ( as in example (2b)) a complementizer. (2) a. saalim-un kaada yuthaakiru. Salem-NOM be.almost.PFV.3SGM study.IPFV.3SGM ‘Salem was almost to study’ b. saalim-un kaada ?an yuthaakiru. Salem-NOM be.almost.PFV.3SGM COMP study.IPFV.3SGM ‘Salem was almost to study’ The salient properties of this class of verbs are that they require a verbal complement, there is no a complementizer that can introduce the complement with the first group while it is possible with the second and the aspectual verb and the embedded verb share and agree with the same subject. To the best of knowledge, aspectual verbs in MSA are discussed in traditional grammar only and have not been studied in modern syntactic theories. This paper will consider the analysis of aspectual verbs in MSA within the Lexical Functional Grammar (LFG) framework. It will use some evidence such as modifier or negation to find out whether these verbs have PRED values and head their f-structures or they form complex predicates with their complements. If aspectual verbs show the properties of heads, then the paper will explore what kind of heads they are. In particular, they should be raising or control verbs. The paper will use some tests such as agreement, selectional restrictions...etc. to find out what kind of verbs they are.

Keywords: aspectual verbs, biclausal, monoclausal, raising

Procedia PDF Downloads 56

1095 Alphabet Recognition Using Pixel Probability Distribution

Authors: Vaidehi Murarka, Sneha Mehta, Dishant Upadhyay

Abstract:

Our project topic is “Alphabet Recognition using pixel probability distribution”. The project uses techniques of Image Processing and Machine Learning in Computer Vision. Alphabet recognition is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files etc. Alphabet Recognition based OCR application is sometimes used in signature recognition which is used in bank and other high security buildings. One of the popular mobile applications includes reading a visiting card and directly storing it to the contacts. OCR's are known to be used in radar systems for reading speeders license plates and lots of other things. The implementation of our project has been done using Visual Studio and Open CV (Open Source Computer Vision). Our algorithm is based on Neural Networks (machine learning). The project was implemented in three modules: (1) Training: This module aims “Database Generation”. Database was generated using two methods: (a) Run-time generation included database generation at compilation time using inbuilt fonts of OpenCV library. Human intervention is not necessary for generating this database. (b) Contour–detection: ‘jpeg’ template containing different fonts of an alphabet is converted to the weighted matrix using specialized functions (contour detection and blob detection) of OpenCV. The main advantage of this type of database generation is that the algorithm becomes self-learning and the final database requires little memory to be stored (119kb precisely). (2) Preprocessing: Input image is pre-processed using image processing concepts such as adaptive thresholding, binarizing, dilating etc. and is made ready for segmentation. “Segmentation” includes extraction of lines, words, and letters from the processed text image. (3) Testing and prediction: The extracted letters are classified and predicted using the neural networks algorithm. The algorithm recognizes an alphabet based on certain mathematical parameters calculated using the database and weight matrix of the segmented image.

Keywords: contour-detection, neural networks, pre-processing, recognition coefficient, runtime-template generation, segmentation, weight matrix

Procedia PDF Downloads 390

1094 Improving Topic Quality of Scripts by Using Scene Similarity Based Word Co-Occurrence

Authors: Yunseok Noh, Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park

Abstract:

Scripts are one of the basic text resources to understand broadcasting contents. Since broadcast media wields lots of influence over the public, tools for understanding broadcasting contents are more required. Topic modeling is the method to get the summary of the broadcasting contents from its scripts. Generally, scripts represent contents descriptively with directions and speeches. Scripts also provide scene segments that can be seen as semantic units. Therefore, a script can be topic modeled by treating a scene segment as a document. Because scripts consist of speeches mainly, however, relatively small co-occurrences among words in the scene segments are observed. This causes inevitably the bad quality of topics based on statistical learning method. To tackle this problem, we propose a method of learning with additional word co-occurrence information obtained using scene similarities. The main idea of improving topic quality is that the information that two or more texts are topically related can be useful to learn high quality of topics. In addition, by using high quality of topics, we can get information more accurate whether two texts are related or not. In this paper, we regard two scene segments are related if their topical similarity is high enough. We also consider that words are co-occurred if they are in topically related scene segments together. In the experiments, we showed the proposed method generates a higher quality of topics from Korean drama scripts than the baselines.

Keywords: broadcasting contents, scripts, text similarity, topic model

Procedia PDF Downloads 319

1093 Arabic Quran Search Tool Based on Ontology

Authors: Mohammad Alqahtani, Eric Atwell

Abstract:

This paper reviews and classifies most of the important types of search techniques that have been applied on the holy Quran. Then, it addresses the limitations in these techniques. Additionally, this paper surveys most existing Quranic ontologies and what are their deficiencies. Finally, it explains a new search tool called: A semantic search tool for Al Quran based on Qur’anic ontologies. This tool will overcome all limitations in the existing Quranic search applications.

Keywords: holy Quran, natural language processing (NLP), semantic search, information retrieval (IR), ontology

Procedia PDF Downloads 572

1092 Treating Voxels as Words: Word-to-Vector Methods for fMRI Meta-Analyses

Authors: Matthew Baucum

Abstract:

With the increasing popularity of fMRI as an experimental method, psychology and neuroscience can greatly benefit from advanced techniques for summarizing and synthesizing large amounts of data from brain imaging studies. One promising avenue is automated meta-analyses, in which natural language processing methods are used to identify the brain regions consistently associated with certain semantic concepts (e.g. “social”, “reward’) across large corpora of studies. This study builds on this approach by demonstrating how, in fMRI meta-analyses, individual voxels can be treated as vectors in a semantic space and evaluated for their “proximity” to terms of interest. In this technique, a low-dimensional semantic space is built from brain imaging study texts, allowing words in each text to be represented as vectors (where words that frequently appear together are near each other in the semantic space). Consequently, each voxel in a brain mask can be represented as a normalized vector sum of all of the words in the studies that showed activation in that voxel. The entire brain mask can then be visualized in terms of each voxel’s proximity to a given term of interest (e.g., “vision”, “decision making”) or collection of terms (e.g., “theory of mind”, “social”, “agent”), as measured by the cosine similarity between the voxel’s vector and the term vector (or the average of multiple term vectors). Analysis can also proceed in the opposite direction, allowing word cloud visualizations of the nearest semantic neighbors for a given brain region. This approach allows for continuous, fine-grained metrics of voxel-term associations, and relies on state-of-the-art “open vocabulary” methods that go beyond mere word-counts. An analysis of over 11,000 neuroimaging studies from an existing meta-analytic fMRI database demonstrates that this technique can be used to recover known neural bases for multiple psychological functions, suggesting this method’s utility for efficient, high-level meta-analyses of localized brain function. While automated text analytic methods are no replacement for deliberate, manual meta-analyses, they seem to show promise for the efficient aggregation of large bodies of scientific knowledge, at least on a relatively general level.

Keywords: FMRI, machine learning, meta-analysis, text analysis

Procedia PDF Downloads 450

1091 A Systematic Review: Prevalence and Risk Factors of Low Back Pain among Waste Collection Workers

Authors: Benedicta Asante, Brenna Bath, Olugbenga Adebayo, Catherine Trask

Abstract:

Background: Waste Collection Workers’ (WCWs) activities contribute greatly to the recycling sector and are an important component of the waste management industry. As the recycling sector evolves, reports of injuries and fatal accidents in the industry demand notice particularly common and debilitating musculoskeletal disorders such as low back pain (LBP). WCWs are likely exposed to diverse work-related hazards that could contribute to LBP. However, to our knowledge there has never been a systematic review or other synthesis of LBP findings within this workforce. The aim of this systematic review was to determine the prevalence and risk factors of LBP among WCWs. Method: A comprehensive search was conducted in Ovid Medline, EMBASE, and Global Health e-publications with search term categories ‘low back pain’ and ‘waste collection workers’. Articles were screened at title, abstract, and full-text stages by two reviewers. Data were extracted on study design, sampling strategy, socio-demographic, geographical region, and exposure definition, definition of LBP, risk factors, response rate, statistical techniques, and LBP prevalence. Risk of bias (ROB) was assessed based on Hoy Damien’s ROB scale. Results: The search of three databases generated 79 studies. Thirty-two studies met the study inclusion criteria for both title and abstract; thirteen full-text articles met the study criteria at the full-text stage. Seven articles (54%) reported prevalence within 12 months of LBP between 42-82% among WCW. The major risk factors for LBP among WCW included: awkward posture; lifting; pulling; pushing; repetitive motions; work duration; and physical loads. Summary data and syntheses of findings was presented in trend-lines and tables to establish the several prevalence periods based on age and region distribution. Public health implications: LBP is a major occupational hazard among WCWs. In light of these risks and future growth in this industry, further research should focus on more detail ergonomic exposure assessment and LBP prevention efforts.

Keywords: low back pain, scavenger, waste collection workers, waste pickers

Procedia PDF Downloads 330

1090 A BERT-Based Model for Financial Social Media Sentiment Analysis

Authors: Josiel Delgadillo, Johnson Kinyua, Charles Mutigwe

Abstract:

The purpose of sentiment analysis is to determine the sentiment strength (e.g., positive, negative, neutral) from a textual source for good decision-making. Natural language processing in domains such as financial markets requires knowledge of domain ontology, and pre-trained language models, such as BERT, have made significant breakthroughs in various NLP tasks by training on large-scale un-labeled generic corpora such as Wikipedia. However, sentiment analysis is a strong domain-dependent task. The rapid growth of social media has given users a platform to share their experiences and views about products, services, and processes, including financial markets. StockTwits and Twitter are social networks that allow the public to express their sentiments in real time. Hence, leveraging the success of unsupervised pre-training and a large amount of financial text available on social media platforms could potentially benefit a wide range of financial applications. This work is focused on sentiment analysis using social media text on platforms such as StockTwits and Twitter. To meet this need, SkyBERT, a domain-specific language model pre-trained and fine-tuned on financial corpora, has been developed. The results show that SkyBERT outperforms current state-of-the-art models in financial sentiment analysis. Extensive experimental results demonstrate the effectiveness and robustness of SkyBERT.

Keywords: BERT, financial markets, Twitter, sentiment analysis

Procedia PDF Downloads 154

1089 PaSA: A Dataset for Patent Sentiment Analysis to Highlight Patent Paragraphs

Authors: Renukswamy Chikkamath, Vishvapalsinhji Ramsinh Parmar, Christoph Hewel, Markus Endres

Abstract:

Given a patent document, identifying distinct semantic annotations is an interesting research aspect. Text annotation helps the patent practitioners such as examiners and patent attorneys to quickly identify the key arguments of any invention, successively providing a timely marking of a patent text. In the process of manual patent analysis, to attain better readability, recognising the semantic information by marking paragraphs is in practice. This semantic annotation process is laborious and time-consuming. To alleviate such a problem, we proposed a dataset to train machine learning algorithms to automate the highlighting process. The contributions of this work are: i) we developed a multi-class dataset of size 150k samples by traversing USPTO patents over a decade, ii) articulated statistics and distributions of data using imperative exploratory data analysis, iii) baseline Machine Learning models are developed to utilize the dataset to address patent paragraph highlighting task, and iv) future path to extend this work using Deep Learning and domain-specific pre-trained language models to develop a tool to highlight is provided. This work assists patent practitioners in highlighting semantic information automatically and aids in creating a sustainable and efficient patent analysis using the aptitude of machine learning.

Keywords: machine learning, patents, patent sentiment analysis, patent information retrieval

Procedia PDF Downloads 93

1088 Mining User-Generated Contents to Detect Service Failures with Topic Model

Authors: Kyung Bae Park, Sung Ho Ha

Abstract:

Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and a pressing need to handle the overwhelmingly plethora amount of various UGC is one of the paramount issues for management. However, a current approach (e.g., sentiment analysis) is often ineffective for leveraging textual information to detect the problems or issues that a certain management suffers from. In this paper, we employ text mining of Latent Dirichlet Allocation (LDA) on a popular online review site dedicated to complaint from users. We find that the employed LDA efficiently detects customer complaints, and a further inspection with the visualization technique is effective to categorize the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given the limited amount of resources. The findings provide managerial insights into how analytics on social media can help maintain and improve their reputation management. Our interdisciplinary approach also highlights several insights by applying machine learning techniques in marketing research domain. On a broader technical note, this paper illustrates the details of how to implement LDA in R program from a beginning (data collection in R) to an end (LDA analysis in R) since the instruction is still largely undocumented. In this regard, it will help lower the boundary for interdisciplinary researcher to conduct related research.

Keywords: latent dirichlet allocation, R program, text mining, topic model, user generated contents, visualization

Procedia PDF Downloads 187

1087 On the Relationship between the Concepts of "[New] Social Democracy" and "Democratic Socialism"

Authors: Gintaras Mitrulevičius

Abstract:

This text, which is based on the conference report, seeks to briefly examine the relationship between the concepts of social democracy and democratic socialism, drawing attention to the essential aspects of its development and, in particular, discussing the contradictions in the relationship between these concepts in the modern period. In the preparation of this text, such research methods as historical, historical-comparative methods were used, as well as methods of analyzing, synthesizing, and generalizing texts. The history of the use of terms in social democracy and democratic socialism shows that these terms were used alternately and almost synonymously. At the end of the 20th century, traditional social democracy was transformed into the so-called "new social democracy." Many of the new social democrats do not consider themselves democratic socialists and avoid the historically characteristic identification of social democracy with democratic socialism. It has become quite popular to believe that social democracy is a separate ideology from democratic socialism. Or that it has become a variant of the ideology of liberalism. This is a testimony to the crisis of ideological self-awareness of social democracy. Since the beginning of the 21st century, social democracy has also experienced a growing crisis of electoral support. This, among other things, led to her slight shift to the left. In this context, some social democrats are once again talking about democratic socialism. The rise of the ideas of democratic socialism in the United States was catalyzed by Bernie Sanders. But the proponents of democratic socialism in the United States have different concepts of democratic socialism. In modern Europe, democratic socialism is also spoken of by leftists of non-social democratic origin, whose understanding is different from that of democratic socialism inherent in classical social democracy. Some political scientists also single out the concepts in question. Analysis of the problem shows that there are currently several concepts of democratic socialism on the spectrum of the political left, both social-democratic and non-social-democratic.

Keywords: democratic socializm, socializm, social democracy, new social democracy, political ideologies

Procedia PDF Downloads 113

1086 Examining Reading Comprehension Skills Based on Different Reading Comprehension Frameworks and Taxonomies

Authors: Seval Kula-Kartal

Abstract:

Developing students’ reading comprehension skills is an aim that is difficult to accomplish and requires to follow long-term and systematic teaching and assessment processes. In these processes, teachers need tools to provide guidance to them on what reading comprehension is and which comprehension skills they should develop. Due to a lack of clear and evidence-based frameworks defining reading comprehension skills, especially in Turkiye, teachers and students mostly follow various processes in the classrooms without having an idea about what their comprehension goals are and what those goals mean. Since teachers and students do not have a clear view of comprehension targets, strengths, and weaknesses in students’ comprehension skills, the formative feedback processes cannot be managed in an effective way. It is believed that detecting and defining influential comprehension skills may provide guidance both to teachers and students during the feedback process. Therefore, in the current study, some of the reading comprehension frameworks that define comprehension skills operationally were examined. The aim of the study is to develop a simple and clear framework that can be used by teachers and students during their teaching, learning, assessment, and feedback processes. The current study is qualitative research in which documents related to reading comprehension skills were analyzed. Therefore, the study group consisted of recourses and frameworks which made big contributions to theoretical and operational definitions of reading comprehension. A content analysis was conducted on the resources included in the study group. To determine the validity of the themes and sub-categories revealed as the result of content analysis, three educational assessment experts were asked to examine the content analysis results. The Fleiss’ Cappa coefficient revealed that there is consistency among themes and categories defined by three different experts. The content analysis of the reading comprehension frameworks revealed that comprehension skills could be examined under four different themes. The first and second themes focus on understanding information given explicitly or implicitly within a text. The third theme includes skills used by the readers to make connections between their personal knowledge and the information given in the text. Lastly, the fourth theme focus on skills used by readers to examine the text with a critical view. The results suggested that fundamental reading comprehension skills can be examined under four themes. Teachers are recommended to use these themes in their reading comprehension teaching and assessment processes. Acknowledgment: This research is supported by Pamukkale University Scientific Research Unit within the project, whose title is Developing A Reading Comprehension Rubric.

Keywords: reading comprehension, assessing reading comprehension, comprehension taxonomies, educational assessment

Procedia PDF Downloads 82

1085 Revolution and Political Opposition in Contemporary Arabic Poetry: A Thematic Study of Two Poems by Muzaffar Al-Nawwab

Authors: Nasser Y. Athamneh

Abstract:

Muzaffar al-Nawwab (1934--) is a modern Iraqi poet, critic, and painter, well-known to Arab youth of the second half of the 20th century for his revolutionary spirit and political activism. For the greater part of his relatively long life, al-Nawwab was wanted 'dead or alive,' so to speak, by most of the Arab regimes and authorities due to his scathing, and at times unsparingly obscene attacks on them. Hence it is that the Arab masses found in his poetry the rebellious expression of their own anger and frustration, stifled by fear for their physical safety. Thus, al-Nawwab’s contemporary Arab audience loved and embraced him both as an Arab exile and as a poet. They memorized and celebrated his poems and transmitted them secretly by word of mouth and on compact cassette tapes. He himself recited his own poetry and had it recorded on compact cassette tapes for fans to smuggle from one Arab country to the other. The themes of al-Nawwab’s poems are varied, but the most predominant among them is political opposition. In most of his poems, al-Nawwab takes up politics as the major theme. Yet, he often represents it coupled with the leitmotifs of women and wine. Indeed he oscillates almost systematically between political commitment to the revolutionary cause of the masses of his nation and homeland on the one hand and love for women and wine on the other. For the persona in al-Nawwab’s poetry, love-longing for the woman and devotion to the cause of revolution and Pan-Arabism are interrelated; each of them readily evokes the other. In this paper, an attempt is made at investigating the treatment and representation of the theme of revolution and political opposition in some of al-Nawwab’s poems. This investigation will be conducted through close reading and textual analysis of representative sections of the poetic texts under consideration in the paper. The primary texts for the study are selected passages from two representative poems, namely, 'The Night Song of the Bow Strings' (Watariyyaat Layliyyah) and 'In Wine and Sorrow My Heart [Is Immersed]' (bil-khamri wa bil-huzni fu’aady). Other poems and extracts from al-Nawwab’s poetic works will be drawn upon as secondary texts to clarify the arguments in the paper and support its thesis. The discussions and textual analysis of the texts under consideration are meant to show that revolution and undaunted political opposition is a predominant theme in al-Nawwab’s poetry, often represented through the use of the leitmotifs of women and wine.

Keywords: Arabic poetry, Muzaffar al-Nawwab, politics, revolution

Procedia PDF Downloads 138

1084 Translation as a Cultural Medium: Understanding the Mauritian Culture and History through an English Translation

Authors: Pooja Booluck

Abstract:

This project seeks to translate a chapter in Le Silence des Chagos by Shenaz Patel a Mauritian author whose work has never been translated before. The chapter discusses the attempt of the protagonist to return to her home country Diego Garcia after her deportation. The English translation will offer an historical account to the target audience of the deportation of Chagossians to Mauritius during the 1970s. The target audience comprises of English-speaking translation scholars translation students and African literature scholars. In light of making the cultural elements of Mauritian culture accessible the translation will maintain the cultural items such as food and oral discourses in Creole so as to preserve the authenticity of the source culture. In order to better comprehend the cultural elements mentioned the target reader will be provided with detailed footnotes explaining the cultural and historical references. This translation will also address the importance of folkloric songs in Mauritius and its intergenerational function in Mauritian communities which will also remain in Creole. While such an approach will help to preserve the meaning of the source text the borrowing technique and the foreignizing method will be employed which will in turn help the reader in becoming more familiar with the Mauritian community. Translating a text from French to English while maintaining certain words or discourses in a minority language such as Creole bears certain challenges: How does the translator ensure the comprehensibility of the reader? Are there any translation losses? What are the choices of the translator?

Keywords: Chagos archipelagos in Exile, English translation, Le Silence des Chagos, Mauritian culture and history

Procedia PDF Downloads 317

1083 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Su-Hyeon Jeon, ByeoungKug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually provided with a specific category for the convenience of the users. In the past, the categorization was performed manually. However, in the case of manual categorization, not only can the accuracy of the categorization be not guaranteed but the categorization also requires a large amount of time and huge costs. Many studies have been conducted towards the automatic creation of categories to solve the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorizing complex documents with multiple topics because the methods work by assuming that one document can be categorized into one category only. In order to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process involves training using a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To overcome the limitation of the requirement of a multi-categorized training set by traditional multi-categorization algorithms, we previously proposed a new methodology that can extend a category of a single-categorized document to multiple categorizes by analyzing relationships among categories, topics, and documents. In this paper, we design a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: big data analysis, document classification, multi-category, text mining, topic analysis

Procedia PDF Downloads 273

1082 Cognitive Translation and Conceptual Wine Tasting Metaphors: A Corpus-Based Research

Authors: Christine Demaecker

Abstract:

Many researchers have underlined the importance of metaphors in specialised language. Their use of specific domains helps us understand the conceptualisations used to communicate new ideas or difficult topics. Within the wide area of specialised discourse, wine tasting is a very specific example because it is almost exclusively metaphoric. Wine tasting metaphors express various conceptualisations. They are not linguistic but rather conceptual, as defined by Lakoff & Johnson. They correspond to the linguistic expression of a mental projection from a well-known or more concrete source domain onto the target domain, which is the taste of wine. But unlike most specialised terminologies, the vocabulary is never clearly defined. When metaphorical terms are listed in dictionaries, their definitions remain vague, unclear, and circular. They cannot be replaced by literal linguistic expressions. This makes it impossible to transfer them into another language with the traditional linguistic translation methods. Qualitative research investigates whether wine tasting metaphors could rather be translated with the cognitive translation process, as well described by Nili Mandelblit (1995). The research is based on a corpus compiled from two high-profile wine guides; the Parker’s Wine Buyer’s Guide and its translation into French and the Guide Hachette des Vins and its translation into English. In this small corpus with a total of 68,826 words, 170 metaphoric expressions have been identified in the original English text and 180 in the original French text. They have been selected with the MIPVU Metaphor Identification Procedure developed at the Vrije Universiteit Amsterdam. The selection demonstrates that both languages use the same set of conceptualisations, which are often combined in wine tasting notes, creating conceptual integrations or blends. The comparison of expressions in the source and target texts also demonstrates the use of the cognitive translation approach. In accordance with the principle of relevance, the translation always uses target language conceptualisations, but compared to the original, the highlighting of the projection is often different. Also, when original metaphors are complex with a combination of conceptualisations, at least one element of the original metaphor underlies the target expression. This approach perfectly integrates into Lederer’s interpretative model of translation (2006). In this triangular model, the transfer of conceptualisation could be included at the level of ‘deverbalisation/reverbalisation’, the crucial stage of the model, where the extraction of meaning combines with the encyclopedic background to generate the target text.

Keywords: cognitive translation, conceptual integration, conceptual metaphor, interpretative model of translation, wine tasting metaphor

Procedia PDF Downloads 131