Search results for: text
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1272

882 Investigating Malaysian Prereader’s Cognitive Processes when Reading English Picture Storybooks: A Comparative Eye-Tracking Experiment

Authors: Siew Ming Thang, Wong Hoo Keat, Chee Hao Sue, Fung Lan Loo, Ahju Rosalind

Abstract:

There are numerous studies that have explored young learners’ literacy skills in Malaysia, but none that uses an eye-tracking device to track their cognitive processes when reading picture storybooks. This study used this method to investigate two groups of prereaders’ cognitive processes in four conditions: (1) a congruent picture was presented, and a matching narration was read aloud by a recorder; (2) children heard a narration about the same characters as in the picture but involving a different scene; (3) only a picture with matching text was presented; (4) students only heard the text on the screen read aloud. The two main objectives of this project are to test which content of pictures helps the prereaders (i.e., young children who have not received any formal reading instruction) understand the narration and whether children try to create a coherent mental representation from the oral narration and the pictures. The study compares two groups of children from two different kindergartens: Group 1 comprised 15 Chinese children and Group 2 comprised 17 Malay children. The medium of instruction was English. An eye-tracker was used to identify the Areas of Interest (AOIs) of each picture and the five target elements and to calculate the number of fixations and the total fixation time on pictures and written texts. Two mixed factorial ANOVAs were performed, with storytelling performance (good, average, or weak) and vocabulary level (low, medium, high) as between-subject variables and the AOIs and display conditions as within-subject variables.
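The fixation measures described above (number of fixations and total fixation time per AOI) can be illustrated with a minimal sketch. It assumes rectangular AOIs and fixations exported as screen coordinates with a duration; the class names, field names, and sample values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float            # horizontal screen position in pixels (assumed unit)
    y: float            # vertical screen position in pixels (assumed unit)
    duration_ms: float  # fixation duration in milliseconds


@dataclass
class AOI:
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fx: Fixation) -> bool:
        # A fixation belongs to the AOI if it falls inside the rectangle.
        return self.left <= fx.x <= self.right and self.top <= fx.y <= self.bottom


def summarize(fixations, aois):
    """Return the fixation count and total fixation time for each AOI."""
    summary = {a.name: {"count": 0, "total_ms": 0.0} for a in aois}
    for fx in fixations:
        for a in aois:
            if a.contains(fx):
                summary[a.name]["count"] += 1
                summary[a.name]["total_ms"] += fx.duration_ms
    return summary


# Hypothetical layout: a picture area above a text area.
aois = [AOI("picture", 0, 0, 800, 600), AOI("text", 0, 601, 800, 720)]
fixations = [
    Fixation(100, 100, 200),
    Fixation(400, 300, 350),
    Fixation(200, 650, 180),
    Fixation(900, 100, 120),  # outside both AOIs, so it is ignored
]
summary = summarize(fixations, aois)
```

Per-AOI summaries of this kind would then serve as the dependent measures in an analysis such as the mixed factorial ANOVAs mentioned in the abstract.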

Keywords: eye-tracking, cognitive processes, literacy skills, prereaders, visual attention

881 Developing the Skills of Reading Comprehension of Learners of English as a Second Language

Authors: Indu Gamage

Abstract:

Though commonly utilized as a language improvement technique, reading has not been fully employed by either language teachers or learners to develop reading comprehension skills in English as a second language. In the Sri Lankan context, this area has to be delved into deeply, as the learners show more propensity to analyze. Reading comprehension is an area that most language teachers and learners struggle with, though it appears easy. Most ESL learners engage in reading tasks without being properly aware of the objective of doing reading comprehension. It is observed that when doing reading tasks, the language learners’ concern is more with the meanings of individual words than with the overall comprehension of the given text. The passiveness with which ESL learners engage in reading comprehension makes reading a tedious task for the learner, leaving a sense of disappointment at the end. Certain reading tasks take the form of translations. The active cognitive participation of the learner, in the form of using productive strategies such as predicting, employing schemata, and using contextual clues, seems quite limited. It was hypothesized that the learners’ lack of knowledge of the productive strategies of reading was the major obstacle that makes reading comprehension a tedious task for them. This study is based on a group of 30 tertiary students who read English only as a fundamental requirement for their degree. They belonged to the Faculty of Humanities and Social Sciences of the University of Ruhuna, Sri Lanka. Almost all learners hailed from areas where English was hardly used in day-to-day conversation. The study was carried out by means of a questionnaire to check their opinions on reading and a test to check whether the learners were using productive strategies of reading when doing reading comprehension tasks. The test comprised reading questions covering major productive strategies for reading. The results were then analyzed to see the degree of their active engagement in comprehending the text. The findings supported the hypothesis, identifying this lack of knowledge as a reason behind the difficulties related to reading comprehension.

Keywords: reading, comprehension, skills, reading strategies

880 Corpus Linguistics as a Tool for Translation Studies Analysis: A Bilingual Parallel Corpus of Students’ Translations

Authors: Juan-Pedro Rica-Peromingo

Abstract:

Nowadays, corpus linguistics has become a key research methodology for Translation Studies, which broadens the scope of cross-linguistic studies. In the case of the study presented here, the approach used focuses on learners with little or no experience in order to study, at an early stage, general mistakes and errors and the correct or incorrect use of translation strategies, and to improve the translational competence of the students. Led by Sylviane Granger and Marie-Aude Lefer of the Centre for English Corpus Linguistics of the University of Louvain, the MUST corpus (MUltilingual Student Translation Corpus) is an international project which brings together partners from Europe and worldwide universities and connects Learner Corpus Research (LCR) and Translation Studies (TS). It aims to build a corpus of translations carried out by students, including both direct (L2 > L1) and indirect (L1 > L2) translations, from a great variety of text types, genres, and registers in a wide variety of languages: audiovisual translations (including dubbing and subtitling for hearing and for deaf populations) and scientific, humanistic, literary, economic, and legal translation texts. This paper focuses on the work carried out by the Spanish team from the Complutense University (UCMA), which is part of the MUST project, and it describes the specific features of the corpus built by its members. All the texts used by UCMA are either direct or indirect translations between English and Spanish. Students’ profiles comprise translation trainees, foreign language students with a major in English, engineers studying EFL, and MA students, all of them with different English levels (from B1 to C1); for some of the students, this would be their first experience with translation. The MUST corpus is searchable via Hypal4MUST, a web-based interface developed by Adam Obrusnik from Masaryk University (Czech Republic), which includes a translation-oriented annotation system (TAS). A distinctive feature of the interface is that it allows source texts and target texts to be aligned, so that we can observe and compare in detail both language structures and study the translation strategies used by students. The initial data obtained point to the kinds of difficulties encountered by the students and reveal the most frequent strategies implemented by the learners according to their level of English, their translation experience, and the text genres. We have also found common errors in the graduate and postgraduate university students’ translations: transfer errors, lexical errors, grammatical errors, text-specific translation errors, and cultural-related errors have been identified. Analyzing all these parameters will provide more material for finding better ways to improve the quality of teaching and of the translations produced by the students.

Keywords: corpus studies, students’ corpus, the MUST corpus, translation studies

879 Using Visualization Techniques to Support Common Clinical Tasks in Clinical Documentation

Authors: Jonah Kenei, Elisha Opiyo

Abstract:

Electronic health records, as repositories of patient information, are nowadays the most commonly used technology to record, store, and review patient clinical records and perform other clinical tasks. However, the accurate identification and retrieval of relevant information from clinical records is a difficult task because clinical documents are largely unstructured. Medical practice is therefore facing a challenge from the rapid growth of health information in electronic health records (EHRs), mostly in narrative text form. As a result, it is becoming important to effectively manage the growing amount of data for a single patient, and there is currently a requirement to visualize EHRs in a way that aids physicians in clinical tasks and medical decision-making. Applying text visualization techniques to unstructured clinical narrative texts is a new area of research that aims to provide better information extraction and retrieval to support clinical decision-making in scenarios where the data generated continue to grow. Clinical datasets in EHRs offer a lot of potential for training accurate statistical models to classify facets of information, which can then be used to improve patient care and outcomes. However, in many clinical note datasets, the unstructured nature of clinical texts is a common problem. This paper examines the issue of taking raw clinical texts and mapping them into meaningful structures that can support healthcare professionals utilizing narrative texts. Our work is the result of a collaborative design process that was aided by empirical data collected through formal usability testing.

Keywords: classification, electronic health records, narrative texts, visualization

878 A Design for Customer Preferences Model by Cluster Analysis of Geometric Features and Customer Preferences

Authors: Yuan-Jye Tseng, Ching-Yen Chen

Abstract:

In the design cycle, a main design task is to determine the external shape of the product. The external shape of a product is one of the key factors that can affect customers’ preferences and, in turn, their motivation to buy the product, especially in the case of a consumer electronic product such as a mobile phone. The relationship between the external shape and the customer preferences needs to be studied to enhance the customer’s purchase desire and action. In this research, a design for customer preferences model is developed for investigating the relationships between the external shape and the customer preferences of a product. In the first stage, the names of the geometric features are collected and evaluated from the data of the specified internet web pages using the developed text miner. A geometric feature is identified as key if its number of occurrences on the web pages is relatively high. For each key geometric feature, the numerical values are explored using the text miner to collect the internet data from the web pages. In the second stage, a cluster analysis model is developed to evaluate the numerical values of the key geometric features to divide the external shapes into several groups. Several design suggestion cases can be proposed, for example, large model, mid-size model, and mini model, for designing a mobile phone. A customer preference index is developed by evaluating the numerical data of each of the key geometric features of the design suggestion cases. The design suggestion case with the top ranking of the customer preference index can be selected as the final design of the product. In this paper, an example product of a notebook computer is illustrated. It shows that the external shape of a product can be used to drive customer preferences. The presented design for customer preferences model is useful for determining a suitable external shape of the product to increase customer preferences.
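The second-stage grouping of external shapes can be sketched as follows. The abstract does not name the clustering algorithm, so plain k-means is swapped in here as an assumption; the feature tuples (width, height, thickness in millimetres) and the function name `kmeans` are hypothetical and only illustrate how numerical feature values could be divided into large, mid-size, and mini groups.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of floats; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids


# Hypothetical (width, height, thickness) values for six phone shapes.
shapes = [
    (58.0, 115.0, 9.0),
    (71.0, 146.0, 8.0),
    (78.0, 163.0, 8.5),
    (59.0, 118.0, 9.2),
    (70.0, 144.0, 7.8),
    (77.0, 161.0, 8.3),
]
assignment, centroids = kmeans(shapes, k=3)
```

Each resulting centroid can be read as one design suggestion case; a preference index would then be computed per case, which this sketch does not attempt.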

Keywords: cluster analysis, customer preferences, design evaluation, design for customer preferences, product design

877 Graph Neural Network-Based Classification for Disease Prediction in Health Care Heterogeneous Data Structures of Electronic Health Record

Authors: Raghavi C. Janaswamy

Abstract:

In the healthcare sector, heterogeneous data elements such as patients, diagnosis, symptoms, conditions, observation text from physician notes, and prescriptions form the essentials of the Electronic Health Record (EHR). The data in the form of clear text and images are stored or processed in a relational format in most systems. However, the intrinsic structure restrictions and complex joins of relational databases limit the widespread utility. In this regard, the design and development of realistic mapping and deep connections as real-time objects offer unparalleled advantages. Herein, a graph neural network-based classification of EHR data has been developed. The patient conditions have been predicted as a node classification task using graph-based open-source EHR data from the Synthea database, stored in TigerGraph. The Synthea DB dataset is leveraged because it is voluminous and closely represents real-time data. The graph model is built from the EHR heterogeneous data using Python modules, namely, pyTigerGraph to get nodes and edges from the TigerGraph database, PyTorch to tensorize the nodes and edges, and PyTorch-Geometric (PyG) to train the Graph Neural Network (GNN), adopt self-supervised learning techniques with AutoEncoders to generate the node embeddings, and eventually perform the node classifications using the node embeddings. The model predicts patient conditions ranging from common to rare situations. The outcome is deemed to open up opportunities for data querying toward better predictions and accuracy.

Keywords: electronic health record, graph neural network, heterogeneous data, prediction

876 A Cultural Materialistic Approach to Toni Morrison’s Beloved and the Bluest Eye

Authors: Irfan Mehmood

Abstract:

The goal of this paper is to examine Toni Morrison's novels Beloved and The Bluest Eye from a cultural materialistic perspective. The history and society of African Americans provide the inspiration for the stories of Beloved and The Bluest Eye. The cultural materialist elements and characteristics of Morrison's literary texts will be highlighted in this study. The topics covered in this paper include racism, gender discrimination, social class differences, and slavery in the texts. In other words, the study will focus on the underrepresented groups in society, including women, slaves, and Afro-Americans. In this respect, Toni Morrison is a fantastic writer whose works are full of diverse races. Morrison uses her incredibly well-informed language and well-produced stories to attempt to illuminate many facets of American life. She establishes a distinctive style of writing that sharply contrasts the suffering and enslavement of Afro-Americans with the traditional writings of Euro-American authors. Morrison shows a profound understanding of the exploitation of Afro-Americans in terms of race, gender, and class conflict in Beloved and The Bluest Eye. A unique culture and the history of a typically ignored set of people whose minds and societies have been permanently changed by class, racial, and gender discrimination are introduced through the study of Morrison's chosen novels. Toni Morrison places a lot of emphasis on the marginalized members of society, particularly in terms of class, ethnicity, and gender, because the majority of the key characters in her books are black. Therefore, the purpose of this essay is to concentrate on the culturally materialistic elements of Morrison's Beloved and The Bluest Eye and to ascertain the author's position on these minorities.

Keywords: race, slavery, social class, Toni Morrison, African American culture

875 Debating the Ethical Questions of the Super Soldier

Authors: Jean-François Caron

Abstract:

The current attempts to develop what we can call 'super soldiers' are problematic in many regards. This is what this text will try to explore by concentrating primarily on the repercussions of this technology and medical research on the physical and psychological integrity of soldiers. It argues that medicines or technologies may affect soldiers’ psychological and mental features and deprive them of their capacity to reflect upon their actions as autonomous subjects and that such a possibility entails serious moral as well as judicial consequences.

Keywords: military research, super soldiers, involuntary intoxication, criminal responsibility

874 A Randomized, Controlled Trial To Test Behavior Change Techniques (BCTs) To Improve Low Intensity Physical Activity In Older Adults

Authors: Ciaran Friel, Jerry Suls, Patrick Robles, Frank Vicari, Joan Duer-Hefele, Karina W. Davidson

Abstract:

Physical activity guidelines focus on increasing moderate intensity activity for older adults, but adherence to recommendations remains low. This is despite the fact that scientific evidence supports that any increase in physical activity is positively correlated with health benefits. Behavior change techniques (BCTs) have demonstrated effectiveness in reducing sedentary behavior and promoting physical activity. This pilot study uses a Personalized Trials (N-of-1) design to evaluate the efficacy of using four BCTs to promote an increase in low-intensity physical activity (2,000 steps of walking per day) in adults aged 45-75 years old. The 4 BCTs tested were goal setting, action planning, feedback, and self-monitoring. BCTs were tested in random order and delivered by text message prompts requiring participant response. The study recruited health system employees in the target age range, without mobility restrictions and demonstrating interest in increasing their daily activity by a minimum of 2,000 steps per day for a minimum of five days per week. Participants were sent a Fitbit Charge 4 fitness tracker with an established study account and password. Participants were recommended to wear the Fitbit device 24/7, but were required to wear it for a minimum of ten hours per day. Baseline physical activity was measured by the Fitbit for two weeks. Participants then engaged with a clinical research coordinator to review comprehension of the text message content and required actions for each of the BCTs to be tested. Participants then selected a consistent daily time in which they would receive their text message prompt. In the 8 week intervention phase of the study, participants received each of the four BCTs, in random order, for a two week period. Text message prompts were delivered daily at a time selected by the participant. 
All prompts required an interactive response from participants and may have included recording their detailed plan for walking or daily step goal (action planning, goal setting). Additionally, participants may have been directed to a study dashboard to view their step counts or compare themselves with peers (self-monitoring, feedback). At the end of each two week testing interval, participants were asked to complete the Self-Efficacy for Walking Scale (SEW_Dur), a validated measure that assesses the participant’s confidence in walking incremental distances and a survey measuring their satisfaction with the individual BCT that they tested. At the end of their trial, participants received a personalized summary of their step data in response to each individual BCT. Analysis will examine the novel individual-level heterogeneity of treatment effect made possible by N-of-1 design, and pool results across participants to efficiently estimate the overall efficacy of the selected behavioral change techniques in increasing low-intensity walking by 2,000 steps, 5 days per week. Self-efficacy will be explored as the likely mechanism of action prompting behavior change. This study will inform the providers and demonstrate the feasibility of N-of-1 study design to effectively promote physical activity as a component of healthy aging.
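The intervention schedule described above (four BCTs, each delivered for two weeks in random order across an 8-week phase) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the `seed` parameter, and the week numbering (counted from the start of the intervention phase, after the two-week baseline) are hypothetical, and the study's actual randomization procedure is not described in the abstract.

```python
import random

BCTS = ["goal setting", "action planning", "feedback", "self-monitoring"]


def intervention_schedule(seed=None, weeks_per_bct=2):
    """Return (start_week, end_week, bct) blocks in a random order."""
    rng = random.Random(seed)  # a seed makes one participant's order reproducible
    order = BCTS[:]
    rng.shuffle(order)
    schedule = []
    week = 1
    for bct in order:
        schedule.append((week, week + weeks_per_bct - 1, bct))
        week += weeks_per_bct
    return schedule


# One hypothetical participant's schedule.
schedule = intervention_schedule(seed=1)
for start, end, bct in schedule:
    print(f"weeks {start}-{end}: {bct}")
```

Because every participant receives all four techniques, step counts in each two-week block can be compared against that participant's own baseline, which is what allows individual-level effects to be examined in an N-of-1 design.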

Keywords: aging, exercise, habit, walking

873 Facilitating Written Biology Assessment in Large-Enrollment Courses Using Machine Learning

Authors: Luanna B. Prevost, Kelli Carter, Margaurete Romero, Kirsti Martinez

Abstract:

Writing is an essential scientific practice, yet, in several countries, the increasing university science class size limits the use of written assessments. Written assessments allow students to demonstrate their learning in their own words and permit the faculty to evaluate students’ understanding. However, the time and resources required to grade written assessments prohibit their use in large-enrollment science courses. This study examined the use of machine learning algorithms to automatically analyze student writing and provide timely feedback to the faculty about students' writing in biology. Written responses to questions about matter and energy transformation were collected from large-enrollment undergraduate introductory biology classrooms. Responses were analyzed using the LightSide text mining and classification software. Cohen’s Kappa was used to measure agreement between the LightSide models and human raters. Predictive models achieved agreement with human coding of 0.7 Cohen’s Kappa or greater. Models captured that when writing about matter-energy transformation at the ecosystem level, students focused primarily on the concepts of heat loss, recycling of matter, and conservation of matter and energy. Models were also produced to capture writing about processes such as decomposition and biochemical cycling. The models created in this study can be used to provide automatic feedback about students' understanding of these concepts to biology faculty who desire to use formative written assessments in larger enrollment biology classes, but do not have the time or personnel for manual grading.
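For readers unfamiliar with the agreement statistic reported above, Cohen's Kappa corrects the observed agreement between two raters for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it for two lists of categorical codes; the labels and the ten sample responses are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's label proportions, summed over labels.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for ten student responses.
human = ["heat", "heat", "recycle", "recycle", "heat",
         "recycle", "heat", "recycle", "heat", "recycle"]
model = ["heat", "heat", "recycle", "recycle", "heat",
         "recycle", "heat", "recycle", "recycle", "recycle"]
kappa = cohens_kappa(human, model)
```

Here the raters agree on 9 of 10 responses (p_o = 0.9) and chance agreement is 0.5, giving a kappa of 0.8, which would clear the 0.7 threshold mentioned in the abstract.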

Keywords: machine learning, written assessment, biology education, text mining

872 Exploring Reading into Writing: A Corpus-Based Analysis of Postgraduate Students’ Literature Review Essays

Authors: Tanzeela Anbreen, Ammara Maqsood

Abstract:

Reading into writing is one of university students' most required academic skills. The current study explored postgraduate university students’ writing quality using a corpus-based approach. Twelve postgraduate students’ literature review essays were chosen for the corpus-based analysis. These essays were chosen because students had to incorporate multiple reading sources in these essays, which was a new writing exercise for them. The students were provided feedback at least two times, which comprised written comments by the tutor highlighting the areas of improvement as well as edits made using the ‘track changes’ function. This exercise was repeated two times, and students submitted two drafts. This investigation included only the finally submitted work of the students. A corpus-based approach was adopted to analyse the essays because it promotes autonomous discovery and personalised learning. The aim of this analysis was to understand the existing level of students’ writing before the start of their postgraduate thesis. Text Inspector was used to analyse the quality of essays. With the help of the Text Inspector tool, the vocabulary used in the essays was compared to the English Vocabulary Profile (EVP), which describes what learners know and can do at each Common European Framework of Reference (CEFR) level. Writing quality was also measured using the Flesch reading ease score, which is a standard measure of how easy a text is to understand. The results reflected that students found writing essays using multiple sources challenging. In most essays, the vocabulary level achieved was between B1 and B2 on the CEFR scale. The study recommends that students need extensive training in developing academic writing skills, particularly in writing literature review assignments, which require citing multiple sources.
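The Flesch reading ease score mentioned above is computed as 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words), with higher scores indicating easier text. The sketch below is a rough illustration only: the syllable counter is a simple vowel-group heuristic assumed for this example, so its scores will differ somewhat from those of tools such as Text Inspector, whose internal method is not described in the abstract.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text):
    """Flesch reading ease using the standard published coefficients."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


simple_score = flesch_reading_ease("The cat sat on the mat.")
dense_score = flesch_reading_ease(
    "Comprehensive institutional documentation necessitates extraordinary deliberation."
)
```

Short sentences made of one-syllable words score high, while long, polysyllabic academic sentences score much lower, which is why the measure is often reported alongside vocabulary-level profiles.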

Keywords: literature review essays, postgraduate students, corpus-based analysis, vocabulary proficiency

871 Translating Silence: An Analysis of Dhofar University Student Translations of Elliptical Structures from English into Arabic

Authors: Ali Algryani

Abstract:

Ellipsis involves the omission of an item or items that can be recovered from the preceding clause. Ellipsis is used as a cohesion marker; it enhances the cohesiveness of a text/discourse, as a clause is interpretable only through making reference to an antecedent clause. The present study attempts to investigate the linguistic phenomenon of ellipsis from a translation perspective. It is mainly concerned with how ellipsis is translated from English into Arabic. The study covers different forms of ellipsis, such as noun phrase ellipsis, verb phrase ellipsis, gapping, pseudo-gapping, stripping, and sluicing. The primary aim of the study, apart from discussing the use and function of ellipsis, is to find out how such ellipsis phenomena are dealt with in English-Arabic translation and determine the implications of the translations of elliptical structures into Arabic. The study is based on the analysis of Dhofar University (DU) students' translations of sentences containing different forms of ellipsis. The initial findings of the study indicate that due to differences in syntactic structures and stylistic preferences between English and Arabic, Arabic tends to use lexical repetition in the translation of some elliptical structures, thus achieving a higher level of explicitness. This implies that Arabic tends to prefer lexical repetition to create cohesion more than English does. Furthermore, the study also reveals that the improper translation of ellipsis leads to interpretations different from those understood from the source text. Such mistranslations can be attributed to student translators’ lack of awareness of the use and function of ellipsis as well as the stylistic preferences of both languages. This has pedagogical implications for the teaching and training of translation students at DU. Students' linguistic competence needs to be enhanced through teaching linguistics-related issues with reference to translation and to both languages, i.e., the source and target languages, with special emphasis on their use, function, and stylistic preferences.

Keywords: cohesion, ellipsis, explicitness, lexical repetition

870 (Re)Framing the Muslim Subject: Studying the Artistic Representation of Guantanamo and Abu Ghraib Detainees

Authors: Iqra Raza

Abstract:

This paper attempts to conceptualize the (de)humanization of the Muslim subject in Karen J. Greenberg and Janet Hamlin’s transmedia Sketching Guantanamo through a close study of the aesthetics and semiotics of the text. The Muslim experience, the paper shall argue, is mediated through a (de)humanization confined and incarcerated within the chains of artistic representation. Hamlin’s reliance on the distortions offered by stereotypes is reminiscent of the late Victorian epistemology on criminality, as evidenced most starkly in the sketch of Khalid Sheikh Mohammad. The position of the white artist thus becomes suspect in the enterprise of neo-Victorian ethnography. The visual stories of movement from within Guantanamo become potent, the paper shall argue, especially in juxtaposition with the images of stillness that came out from the detention centers, which portrayed the enactment of violence on individual bodies with a deliberate erasure of faces. So, while art becomes a way for reclaiming subjectivity or humanizing these identifiable bodies, the medium predicates itself on their objectification. The paper shall explore various questions about what it means for the (criminal?) subjects to be rendered into art rather than being photographed. Does art entail a necessary departure from the assumed objectivity of the photographic images? What makes art the preferred medium for (de)humanization of the violated Muslim bodies? What happens when art is produced without a recognition of the ‘precariousness’ of the life being portrayed? Rendering the detainees into art becomes a slippery task complicated by Hamlin’s privileged position outside the glass walls of the court. The paper shall adjourn analysis at the many dichotomies that exist in the text, viz. between the White men and the brown, the Muslims and the Christians, and the Occident and the Orient, problematized by Hamlin’s politics, that of a ‘neutral outsider’, which quickly turns on its head and becomes complicity in her deliberate erasure of the violence that shaped and still shapes Guantanamo.

Keywords: Abu Ghraib, Derrida, Guantanamo, graphic journalism, Muslimness, orient, spectrality

869 Identification and Evaluation of Environmental Concepts in Paulo Coelho's "The Alchemist"

Authors: Tooba Sabir, Asima Jaffar, Namra Sabir, Mohammad Amjad Sabir

Abstract:

Ecocriticism is the study of the relationship between humans and the environment, which has been represented in literature since the very beginning in the pastoral tradition. However, the analysis of such representation is new compared to other critical approaches such as Psychoanalysis, Marxism, Post-colonialism, Modernism, and many others. Ecocritics examine topics such as anthropocentrism, ecocentrism, ecofeminism, eco-Marxism, the representation of the environment and environmental concepts, and several others. In the current study, the representation of environmental concepts was ecocritically analyzed in Paulo Coelho’s The Alchemist, one of the most read novels throughout the world, having been translated into many languages. Analysis of the text revealed representations of environmental ideas such as landscapes and tourism, biodiversity, land-sea displacement, environmental disasters and warfare, desert winds, and sand dunes. The line 'This desert was once a sea' throws light on different theories of land-sea displacement, one being the plate-tectonic theory, which proposes that Earth’s lithosphere is divided into large and small plates continuously moving toward, away from, or parallel to each other, resulting in land-sea displacement. Another is the continental drift theory, which holds that one large landmass, Pangea, broke into smaller pieces of land that moved relative to each other and formed the continents of the present time. The cause of desertification may, however, be natural, i.e., climate change, or artificial, i.e., human activities. The imagery of the environmental concepts is detailed at some instances in the novel and less striking at others, but it is still capable of arousing readers’ imagination.
The study suggests that ecocritical justifications of environmental concepts in the text will increase the interactions between literature and environment which should be encouraged in order to induce environmental awareness among the readers.

Keywords: biodiversity, ecocritical analysis, ecocriticism, environmental disasters, landscapes

868 Between Fiction and Reality: Reading the Silences in Partition History

Authors: Shazia Salam

Abstract:

This paper focuses on studying the literary reactions of selected Muslim women writers to the event of the Partition of India in the north-western region. It aims to explore how Muslim women experienced the Partition and how that experience was articulated through their writing. There is a serious dearth of research on the experience of Muslim women who had to witness this momentous event in the subcontinent. Since scholars have often questioned the silence around the historiography related to the experiences of Muslim women, this paper aims to explore whether literature could provide insights that may be less readily available in other modes of narration. Using literature as an archival source, it aims to delve into the arenas of history that have been cloistered and closed. Muslim women have been silent about their experiences of Partition, which, at the cost of essentializing, could be attributed to patriarchal constraints and taboos on speaking of intimate matters. These silences have consigned the question of their experience to a realm of anonymity. The lack of ethnographic research has in a way been compensated for in the realm of literature, mainly poetry and fiction. Besides reportage, literature remains an important source of social history about Partition and how Muslim women lived through it. Where traditional history fails to record moments of rupture and dislocation, literature serves this crucial purpose. The central premise of this paper is that there is a need to revise the history of Partition owing to the gaps in historiography. It examines whether literature can serve as a ground for developing new approaches to history, since the question of representation always confronts us: there is a gap between what a text represents and how it represents it, because the imagination of the writer plays a great role in the construction of any text. With this approach as an entry point, this paper aims to unpack the questions of representation, the coalescing of history and literature, and the gendered nature of Partition history. It concludes that the gaps in the narratives of Partition and the memory of Partition can be addressed by using literature as a source to fill in the cracks and fissures.

Keywords: gender, history, literature, partition

Procedia PDF Downloads 180
867 Gender Bias in Natural Language Processing: Machines Reflect Misogyny in Society

Authors: Irene Yi

Abstract:

Machine learning, natural language processing, and neural network models of language are becoming more and more prevalent in the fields of technology and linguistics today. Training data for machines are, at best, large corpora of human literature and, at worst, a reflection of the ugliness in society. Machines have been trained on millions of human books, only to find that in the course of human history, derogatory and sexist adjectives are used significantly more frequently when describing females in history and literature than when describing males. This is extremely problematic, both as training data and as the outcome of natural language processing. As machines start to handle more responsibilities, it is crucial to ensure that they do not take with them historical sexist and misogynistic notions. This paper gathers data and algorithms from neural network models of language dealing with syntax, semantics, sociolinguistics, and text classification. Results are significant in showing the existing intentional and unintentional misogynistic notions used to train machines, as well as in developing better technologies that take into account the semantics and syntax of text to be more mindful and reflect gender equality. Further, this paper deals with the idea of non-binary gender pronouns and how machines can process these pronouns correctly, given their semantic and syntactic context. This paper also delves into the implications of gendered grammar and its effect, cross-linguistically, on natural language processing. Languages such as French or Spanish not only have rigid gendered grammar rules but also come from historically patriarchal societies. The progression of society comes hand in hand with not only its language but also how machines process those natural languages. These ideas are all vital to the development of natural language models in technology, and they must be taken into account immediately.

Keywords: gendered grammar, misogynistic language, natural language processing, neural networks

Procedia PDF Downloads 93
866 Detect Critical Thinking Skill in Written Text Analysis. The Use of Artificial Intelligence in Text Analysis vs Chat/Gpt

Authors: Lucilla Crosta, Anthony Edwards

Abstract:

Companies and the marketplace nowadays struggle to find employees with adequate skills in relation to the anticipated growth of their businesses. At least half of workers will need to undertake some form of up-skilling in the next five years in order to remain aligned with the demands of the market. In order to meet these challenges, there is a clear need to explore the potential uses of AI (Artificial Intelligence) based tools in assessing transversal skills (critical thinking, communication, and soft skills of different types in general) of workers and adult students while empowering them to develop those same skills in a reliable, trustworthy way. Companies seek workers with key transversal skills that can make a difference between workers now and in the future. However, critical thinking seems to be one of the most important skills, bringing unexplored ideas and company growth in business contexts. What employers have been reporting for years now is that this skill is lacking in the majority of workers and adult students, and this is particularly visible through their writing. This paper investigates how critical thinking and communication skills are currently developed in Higher Education environments through the use of AI tools at postgraduate levels. It analyses the use of a branch of AI, namely Machine Learning and Big Data, and of Neural Network Analysis. It also examines the potential effect of acquiring these skills through AI tools and what kind of effects this has on employability. This paper will draw information from researchers and studies both at national (Italy & UK) and international level in Higher Education. The issues associated with the development and use of one specific AI tool, Edulai, will be examined in detail. Finally, comparisons will also be made between these tools and the more recent phenomenon of Chat GPT, and forthcomings and drawbacks will be analysed.

Keywords: critical thinking, artificial intelligence, higher education, soft skills, chat GPT

Procedia PDF Downloads 75
865 Adapting Tools for Text Monitoring and for Scenario Analysis Related to the Field of Social Disasters

Authors: Svetlana Cojocaru, Mircea Petic, Inga Titchiev

Abstract:

Humanity increasingly faces different social disasters, which in turn can generate new accidents and catastrophes. To mitigate their consequences, it is important to obtain the earliest possible signals about events that are occurring or may occur and to prepare corresponding scenarios that could be applied. Our research is focused on solving two problems in this domain: identifying signals that an accident has occurred or may occur, and mitigating some consequences of disasters. To solve the first problem, methods of selecting and processing texts from the Internet are developed. Information in Romanian is of special interest to us. To obtain these tools, we follow several steps, divided into a preparatory stage and a processing stage. During the first stage, we manually collected over 724 news articles and classified them into 10 categories of social disasters. The collection comprises more than 150 thousand words. Using this information, a controlled vocabulary of more than 300 keywords was elaborated, which will help in the classification and identification of texts related to the field of social disasters. To solve the second problem, the formalism of Petri nets has been used. We deal with the problem of evacuating inhabitants in useful time. Analysis methods such as the reachability or coverability tree and the invariants technique will be used to determine dynamic properties of the modeled systems. To perform a case study of the properties of the evacuation system extended with time, the analysis modules of PIPE, such as Generalized Stochastic Petri Nets (GSPN) Analysis, Simulation, State Space Analysis, and Invariant Analysis, have been used. These modules helped us to obtain the average number of persons situated in the rooms and other quantitative properties and characteristics related to its dynamics.
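The reachability analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain, untimed, bounded Petri net; the place and transition names in the usage note are hypothetical, and this is not the PIPE tool or the authors' evacuation model.

```python
from collections import deque

def enabled(marking, pre):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[place] >= needed for place, needed in pre.items())

def fire(marking, pre, post):
    """Return the marking obtained by consuming input tokens and producing output tokens."""
    new_marking = dict(marking)
    for place, needed in pre.items():
        new_marking[place] -= needed
    for place, produced in post.items():
        new_marking[place] += produced
    return new_marking

def reachable_markings(initial, transitions):
    """Breadth-first exploration of all markings reachable from `initial`.

    `transitions` maps a name to a (pre, post) pair of {place: token count} dicts.
    Terminates only for bounded nets; every place must appear in `initial`.
    """
    places = sorted(initial)

    def key(marking):
        return tuple(marking[p] for p in places)

    seen = {key(initial)}
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for pre, post in transitions.values():
            if enabled(marking, pre):
                successor = fire(marking, pre, post)
                if key(successor) not in seen:
                    seen.add(key(successor))
                    queue.append(successor)
    return seen
```

For instance, with two people starting in a hypothetical `room` place and transitions `room -> corridor` and `corridor -> exit`, the function enumerates every distribution of the two tokens. The token total stays constant across all markings, which is a simple example of a place invariant.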

Keywords: lexicon of disasters, modelling, Petri nets, text annotation, social disasters

Procedia PDF Downloads 181
864 An Unsupervised Domain-Knowledge Discovery Framework for Fake News Detection

Authors: Yulan Wu

Abstract:

With the rapid development of social media, the issue of fake news has gained considerable prominence, drawing the attention of both the public and governments. The widespread dissemination of false information poses a tangible threat across multiple domains of society, including politics, economy, and health. However, much research has concentrated on supervised training models within specific domains, and their effectiveness diminishes when applied to identify fake news across multiple domains. To solve this problem, some approaches based on domain labels have been proposed. By segmenting news into their specific areas in advance, judges in the corresponding field may be more accurate on fake news. However, these approaches disregard the fact that news records can pertain to multiple domains, resulting in a significant loss of valuable information. In addition, the datasets used for training must all be domain-labeled, which creates unnecessary complexity. To solve these problems, an unsupervised domain knowledge discovery framework for fake news detection is proposed. Firstly, to effectively retain the multidomain knowledge of the text, a low-dimensional vector is generated for each news text to capture domain embeddings. Subsequently, a feature extraction module utilizing the domain embeddings discovered without supervision is used to extract the comprehensive features of news. Finally, a classifier is employed to determine the authenticity of the news. To verify the proposed framework, a test is conducted on existing widely used datasets, and the experimental results demonstrate that this method is able to improve the detection performance for fake news across multiple domains. Moreover, even in datasets that lack domain labels, this method can still effectively transfer domain knowledge, which can reduce the time consumed by tagging without sacrificing detection accuracy.

Keywords: fake news, deep learning, natural language processing, multiple domains

Procedia PDF Downloads 55
863 Narrative Constructs and Environmental Engagement: A Textual Analysis of Climate Fiction’s Role in Shaping Sustainability Consciousness

Authors: Dean J. Hill

Abstract:

This paper conducts an in-depth textual analysis of the cli-fi genre. It examines how writing in the genre contributes to expressing and facilitating the articulation of environmental consciousness through the form of narrative. The paper begins by situating cli-fi within the literary continuum of ecological narratives and identifying the unique textual characteristics and thematic preoccupations of this area. The paper shows how cli-fi transforms the esoteric nature of climate science into credible narrative forms by drawing on language use, metaphorical constructs, and narrative framing. It also considers how descriptive and figurative language in the description of nature and disaster makes climate change so vivid and emotionally resonant. The work also points out the dialogic nature of cli-fi, whereby the characters and the narrators experience inner disputes regarding the ethical dilemma of environmental destruction, thus demanding that readers challenge and re-evaluate their standpoints on sustainability and ecological responsibilities. The paper proceeds by analysing narrative voice and its role in eliciting empathy, as well as reader involvement with the ecological material. In looking at how different narratorial perspectives contribute to the emotional and cognitive reaction of the reader to the text, this study demonstrates the profound power of perspective in developing intimacy with the dominating concerns. Finally, the emotional arc of cli-fi narratives, running its course over themes of loss, hope, and resilience, is analysed in relation to how these elements function to marshal public feeling and discourse into action around climate change. The paper therefore argues that the textual complexity of cli-fi not only shows the hard edge of the reality of climate change but also influences public perception and behaviour toward a more sustainable future.

Keywords: cli-fi genre, ecological narratives, emotional arc, narrative voice, public perception

Procedia PDF Downloads 12
862 Construction and Analysis of Tamazight (Berber) Text Corpus

Authors: Zayd Khayi

Abstract:

This paper deals with the construction and analysis of a Tamazight text corpus. The grammatical structure of Tamazight remains poorly understood, and the lack of a comparative grammar leads to linguistic issues. To help fill this gap, if only in a small way, we constructed a diachronic corpus of the Tamazight language and developed a program tool. This tool is designed to analyze different aspects of Tamazight and its dialects used in North Africa, specifically in Morocco. The work focuses on three Moroccan dialects: Tamazight, Tarifiyt, and Tachlhit. The Latin version was a good choice because of the many sources it has. The corpus is based on the grammatical parameters and features of the language. The text collection contains more than 500 texts that cover a long historical period. It is free, and it will be useful for further investigations. The texts were converted into XML format for standardization. The corpus counts more than 200,000 words. Based on linguistic rules and statistical methods, the original user interface and software prototype were developed by combining web design technologies and Python. The corpus allows users to distinguish easily between feminine and masculine nouns and verbs. The interface is available in three languages: TMZ, FR, and EN. Selected texts were not initially categorized; this work was done manually. Within corpus linguistics, there is currently no commonly accepted approach to the classification of texts. Texts are divided into ten categories. To describe and represent the texts in the corpus, we elaborated the XML structure according to the TEI recommendations. The search function can return the types of words sought, such as feminine/masculine nouns and verbs. Nouns are divided into two parts.
The gender in the corpus has two forms. The neutral form of the word corresponds to masculine, while feminine is indicated by a double t-t affix (the prefix t- and the suffix -t), e.g., Tarbat (girl), Tamtut (woman), Taxamt (tent), and Tislit (bride). However, there are some words whose feminine form contains only the prefix t- and the suffix -a, e.g., Tasa (liver), tawja (family), and tarwa (progenitors). Generally, Tamazight masculine words have prefixes that distinguish them from other words, for instance 'a', 'u', and 'i', e.g., Asklu (tree), udi (cheese), ighef (head). Verbs in the corpus are in the first person singular and plural and take the suffixes 'agh', 'ex', and 'egh', e.g., 'ghrex' (I study), 'fegh' (I go out), 'nadagh' (I call). The program tool supports the following operations on the corpus: listing all tokens; listing unique words; computing lexical diversity; and running different grammatical queries. To conclude, this corpus has so far focused on only a small group of parts of speech in the Tamazight language: verbs and nouns. Work is still ongoing on adjectives, pronouns, adverbs, and others.
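The tool functions listed above (tokens, unique words, lexical diversity) and the affix rules for noun gender can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, lexical diversity is taken to be the type/token ratio, and the gender rule is a rough heuristic built from the affixes named in the abstract, not the authors' software.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())

def corpus_stats(text):
    """Return all tokens, the sorted unique words, and the type/token ratio."""
    tokens = tokenize(text)
    unique = sorted(set(tokens))
    diversity = len(unique) / len(tokens) if tokens else 0.0
    return {"tokens": tokens, "unique": unique, "lexical_diversity": diversity}

def guess_noun_gender(noun):
    """Heuristic from the abstract's affix description (assumption: Latin script).

    Feminine: prefix t- with suffix -t or -a. Masculine: prefix a-, u-, or i-.
    Anything else is reported as unknown rather than guessed.
    """
    word = noun.lower()
    if len(word) > 2 and word.startswith("t") and word.endswith(("t", "a")):
        return "feminine"
    if word.startswith(("a", "u", "i")):
        return "masculine"
    return "unknown"
```

Applied to the abstract's own examples, `guess_noun_gender("Tarbat")` and `guess_noun_gender("Tasa")` fall under the feminine rule, while `guess_noun_gender("Asklu")` falls under the masculine rule. A real corpus would need a lexicon to handle exceptions.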

Keywords: Tamazight (Berber) language, corpus linguistic, grammar rules, statistical methods

Procedia PDF Downloads 39
861 Upside Down Words as Initial Clinical Presentation of an Underlying Acute Ischemic Stroke

Authors: Ramuel Spirituel Mattathiah A. San Juan, Neil Ambasing

Abstract:

Background: Reversal of vision metamorphopsia is a transient form of metamorphopsia described as an upside-down alteration of the visual field in the coronal plane. Patients would describe objects, such as cups, as upside down, yet the tea would not spill, and people would appear to walk on their heads. It is extremely rare as a stable finding lasting days or weeks. We report a case wherein this type of metamorphopsia occurred only in written words and lasted for six months. Objective: To the best of our knowledge, we report the first occurrence of reversal of vision metamorphopsia described as inverted words as the sole initial presentation of an underlying stroke. Case Presentation: We report a 59-year-old male with poorly controlled hypertension and diabetes mellitus who presented with a 3-day history of difficulty reading, described as the words being turned upside down, as if inverted horizontally, followed by progression of deficits such as right homonymous hemianopia, achromatopsia, and prosopagnosia. Cranial magnetic resonance imaging (MRI) revealed an acute infarct in the left posterior cerebral artery territory. Follow-up after six months revealed improvement of the visual field cut but persistence of the higher cortical function deficits. Conclusion: We report the first occurrence of metamorphopsia described as purely inverted words as the sole initial presentation of an underlying stroke. The differential diagnoses of a patient presenting with text reversal metamorphopsia should include stroke in the occipitotemporal areas. This case further expands the landscape of metamorphopsias due to its exclusivity to written words and prolonged duration. Knowing these clinical features will help identify the lesion locus and improve subsequent stroke care, especially in time-bound management like intravenous thrombolysis.

Keywords: rare presentation, text reversal metamorphopsia, ischemic stroke, stroke

Procedia PDF Downloads 41
860 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. Countless data are generated on social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data Technology into the tourism industry will allow companies to determine where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, and tourists can get a clear idea of places before visiting. From a technical perspective, natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text of the sentences was first classified with the VADER and RoBERTa models to get the polarity of the reviews. In this paper, we have studied methods for feature extraction, such as Count Vectorization and TF-IDF Vectorization, and implemented a Convolutional Neural Network (CNN) classifier for sentiment analysis to decide whether the tourist’s attitude towards the destinations is positive, negative, or simply neutral, based on the review text that they posted online. The results demonstrated that, with the CNN algorithm, after pre-processing and cleaning the dataset, we achieved an accuracy of 96.12% for the positive and negative sentiment analysis.
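The two feature-extraction methods named in the abstract can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: it uses whitespace tokenization, raw term frequency, and the unsmoothed textbook weight idf = ln(N / df). Libraries commonly apply smoothing and normalization on top of this, and the function names are hypothetical.

```python
import math
from collections import Counter

def count_vectorize(docs):
    """Build a sorted vocabulary and one raw term-count row per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[term] for term in vocab])
    return vocab, rows

def tfidf(docs):
    """Weight each count by term frequency times inverse document frequency."""
    vocab, rows = count_vectorize(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    weights = []
    for row in rows:
        total = sum(row) or 1  # guard against empty documents
        weights.append([(count / total) * idf[j] for j, count in enumerate(row)])
    return vocab, weights
```

A term that occurs in every review (such as "food" in a corpus where every review mentions it) receives a weight of zero, while a term concentrated in a few reviews is weighted up, which is why TF-IDF tends to surface the more discriminative words before classification.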

Keywords: counter vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis

Procedia PDF Downloads 57
859 Emotions Triggered by Children’s Literature Images

Authors: Ana Maria Reis d'Azevedo Breda, Catarina Maria Neto da Cruz

Abstract:

The role of images/illustrations in communicating meanings and triggering emotions assumes an increasingly relevant role in contemporary texts, regardless of the age group for which they are intended or the nature of the texts that host them. It is no coincidence that children's books are full of illustrations and that the image/text ratio decreases as the age group grows. The vast majority of children's books can be considered multimodal texts containing text and images/illustrations interacting with each other to provide the young reader with a broader and more creative understanding of the book's narrative. This interaction is very diverse, ranging from images/illustrations that are not essential for understanding the storytelling to those that contribute significantly to the meaning of the story. Usually, these books are also read by adults, namely by parents, educators, and teachers who act as mediators between the book and the children, explaining aspects that are or seem to be too complex for the child's context. It should be noted that there are books labeled as children's books that are clearly intended for both children and adults. In this work, following a qualitative and interpretative methodology based on written productions, participant observation, and field notes, we will describe the perceptions of future teachers of the 1st cycle of basic education, attending a master's degree at a Portuguese university, about the role of the image in literary and non-literary texts, namely in mathematical texts, and how these can constitute precious resources for emotional regulation and for the design of creative didactic situations. 
The analysis of the collected data allowed us to obtain evidence regarding the evolution of the participants' perception regarding the crucial role of images in children's literature, not only as an emotional regulator for young readers but also as a creative source for the design of meaningful didactical situations, crossing other scientific areas, other than the mother tongue, namely mathematics.

Keywords: children’s literature, emotions, multimodal texts, soft skills

Procedia PDF Downloads 63
858 Multi-source Question Answering Framework Using Transformers for Attribute Extraction

Authors: Prashanth Pillai, Purnaprajna Mangsuli

Abstract:

Oil exploration and production companies invest considerable time and effort to extract essential well attributes (like well status, surface and target coordinates, wellbore depths, event timelines, etc.) from unstructured data sources like technical reports, which are often non-standardized, multimodal, and highly domain-specific by nature. It is also important to consider the context when extracting attribute values from reports that contain information on multiple wells/wellbores. Moreover, semantically similar information may often be depicted in different data syntax representations across multiple pages and document sources. We propose a hierarchical multi-source fact extraction workflow based on a deep learning framework to extract essential well attributes at scale. An information retrieval module based on the transformer architecture was used to rank relevant pages in a document source utilizing the page image embeddings and semantic text embeddings. A question answering framework utilizing the LayoutLM transformer was used to extract attribute-value pairs incorporating the text semantics and layout information from the top relevant pages in a document. To better handle context while dealing with multi-well reports, we incorporate a dynamic query generation module to resolve ambiguities. The extracted attribute information from various pages and documents is standardized to a common representation using a parser module to facilitate information comparison and aggregation. Finally, we use a probabilistic approach to fuse information extracted from multiple sources into a coherent well record. The applicability of the proposed approach and related performance was studied on several real-life well technical reports.

Keywords: natural language processing, deep learning, transformers, information retrieval

Procedia PDF Downloads 169
857 Experiences Using Autoethnography as a Methodology for Research in Education

Authors: Sarah Amodeo

Abstract:

Drawing on the author’s research about the experiences of female immigrant students in academic Adult Education in Montreal, Quebec, this paper deconstructs the benefits of autoethnography as a methodology for educators in Adult Education. Autoethnography is an advantageous methodology for teachers in Adult Education because it allows for deep engagement: educators can reflect on student experiences and their day-to-day realities, which in turn supports professional development, improved andragogy, and changes to classroom practices. Autoethnography is a qualitative research methodology that cultivates strategies for improving adult learning. The paper begins by outlining the context that inspired autoethnography for the author’s work, highlighting the emergence of autoethnography as a method, while examining how it is evolving and drawing on foundational work that continues to inspire research. The basic autoethnographic methodologies that are explored in this paper include the use of memory work in episode formation, the use of personal photographs, and textual readings of artworks. Memory work allows researchers to use their professional experience and the lived/shared experiences of their students in their research, drawing on episodes from their past. Personal photographs and descriptions of artwork allow researchers to explore images of learning environments/realities in ways that complement student experiences. Major findings of the text are examined through the analysis of categories of autoethnography. Specific categories include realism, impressionism, and conceptualism, which aid in orienting the analysis and the emergent themes that develop through self-study. Finally, the text presents a discussion surrounding the limitations of autoethnography, with attention to trustworthiness and ethical issues.
The paper concludes with a consideration of the implications of autoethnography for adult educators in juxtaposition with youth sector work.

Keywords: artwork, autoethnography, conceptualism, episode formation, impressionism, memory work, personal photographs, realism

Procedia PDF Downloads 156
856 The Prevalence of Organized Retail Crime in Riyadh, Saudi Arabia

Authors: Saleh Dabil

Abstract:

This study investigates the level of existence of organized retail crime in supermarkets of Riyadh, Saudi Arabia. The store managers, security managers, and general employees were asked about the types of retail crimes that occur in the stores. Three independent variables were related to the report of organized retail theft. The independent variables are: (1) the supermarket profile (volume, location, standard, and type of the store), (2) the social physical environment of the store (maintenance, cleanliness, and overall organizational cooperation), (3) the security techniques and electronic loss prevention techniques used. The theoretical framework of this study is based on social disorganization theory. This study concluded that organized retail theft is moderately apparent in Riyadh stores. The general result showed that the environment of the stores has an effect on the prevalence of organized retail theft in relation to the gender of thieves, age groups, working shift, type of stolen items, and the number of thieves in one case. Among other reasons, one factor in organized theft is the economic pressure on customers, based on the location of the store. How stores deal with theft was also investigated to obtain a clear picture of their handling of organized retail theft. The result showed that, mostly, thieves are sent away without any action and are sometimes given a written warning. Very few cases are dealt with by the police. Other factors are discussed in the full text. To solve the problem of organized theft, this study suggests, first, ‘distributing duties and responsibilities well among employees, especially for security purposes’; second, ‘installation of a strong security system’ and ‘making a well-designed store layout’; and third, ‘giving training to general employees’ and ‘giving periodic security skills training to employees’. Other suggestions are discussed in the full text.

Keywords: organized crime, retail, theft, loss prevention, store environment

Procedia PDF Downloads 169
855 Archaeological Study of Statues of King Thutmosis III from Luxor

Authors: Mahmoud Abualsoud

Abstract:

The era of Thutmosis III represents a transitional period between Thutmoside art and the Amarna period, so we propose that it serves as the cradle of Amarna art. The study will examine the statues of King Thutmose III that were discovered in Luxor by an Egyptian mission. These statues have been transferred to the Conservation Center of the Grand Egyptian Museum (GEM) to be conserved and made ready to be displayed at the new museum (the project of the century). We focus on three statues chosen because they relate to different years of the king's reign. These statues were all made of granite. The first one is a kneeling statue representing the god Amun showing King Thutmose III offering to the goddess Hathor. The second is decorated with King Thutmose III with the red crown, between the goddess Hathor and the royal wife, Nefertari. The third shows the king offering NW vessels and bread to the god Seker. Each statue is divided into registers containing a description and decorated with scenes of the king presenting offerings to gods. The proposed study will focus on the development that happened sequentially, according to the differences that occur in each statue. We will use comparative research to determine the workshops of these statues, whether one or several, and what the distinguishing features of each one are. We will examine what innovations the artisans added to royal art. The description and the texts will be translated with linguistic comments. This research focuses on text analyses and technology. Paleographic information found on these objects includes the names and titles of the king. The study aims to create a manual that may help in dating the artwork of Thutmosis III. This research will be beneficial and useful for heritage and ancient civilizations, particularly when we talk about opening museums like the Grand Egyptian Museum, which will exhibit a collection of statues.
Indeed, this kind of study will open a new destination in order to know how to identify these collections and how to exhibit them commensurate with the nature of ancient Egyptian history and heritage.

Keywords: archaeological study, Giza, new kingdom, statues, royal art

Procedia PDF Downloads 45
854 Visualization of Taiwan's Religious Social Networking Sites

Authors: Jia-Jane Shuai

Abstract:

This research aims to improve understanding of the nature of online religion by examining religious social websites. It asks what motivates individual users to use online religious social websites and which factors affect those motivations. We survey various online religious social websites provided by different religions, especially the Taiwanese folk religion. Based on content analysis and social network analysis, religious social websites and religious web activities are examined. This research examined the presentation and contents of folk religion websites that promote the religious use of the Internet in Taiwan. The differences among religions and religious websites are also compared. First, this study used keywords to examine what types of messages gained the most “Like” clicks, “Share” clicks, and comments on Facebook. Dividing the messages into four media types, namely text, link, video, and photo, reveals which category receives more likes and comments than the others. Meanwhile, this study analyzed the five dialogic principles of religious websites accessed from mobile phones and also assessed their mobile readiness. Using the five principles of dialogic theory as a basis, we conducted a general survey of the websites with elements of online religion. Second, the project analyzed the characteristics of Taiwanese participants in online religious activities. Grounded in social network analysis and text mining, this study comparatively explores the network structure, interaction pattern, and geographic distribution of users involved in communication networks of the folk religion on social websites and mobile sites. We studied the linkage preference of different religious groups. We examined the reasons for the success of these websites, as well as reasons why young users accept new religious media.
The outcome of the research will be useful for online religious service providers and non-profit organizations to manage social websites and internet marketing.

Keywords: content analysis, online religion, social network analysis, social websites

Procedia PDF Downloads 143
853 Integrating Natural Language Processing (NLP) and Machine Learning in Lung Cancer Diagnosis

Authors: Mehrnaz Mostafavi

Abstract:

The assessment and categorization of incidental lung nodules present a considerable challenge in healthcare, often necessitating resource-intensive multiple computed tomography (CT) scans for growth confirmation. This research addresses this issue by introducing a distinct computational approach leveraging radiomics and deep-learning methods. However, understanding local services is essential before implementing these advancements. With diverse tracking methods in place, there is a need for efficient and accurate identification approaches, especially in the context of managing lung nodules alongside pre-existing cancer scenarios. This study explores the integration of text-based algorithms in medical data curation, indicating their efficacy in conjunction with machine learning and deep-learning models for identifying lung nodules. Combining medical images with text data has demonstrated superior data retrieval compared to using each modality independently. While deep learning and text analysis show potential in detecting previously missed nodules, challenges persist, such as increased false positives. The presented research introduces a Structured-Query-Language (SQL) algorithm designed for identifying pulmonary nodules in a tertiary cancer center, externally validated at another hospital. Leveraging natural language processing (NLP) and machine learning, the algorithm categorizes lung nodule reports based on sentence features, aiming to facilitate research and assess clinical pathways. The hypothesis posits that the algorithm can accurately identify lung nodule CT scans and predict concerning nodule features using machine-learning classifiers. Through a retrospective observational study spanning a decade, CT scan reports were collected, and an algorithm was developed to extract and classify data. Results underscore the complexity of lung nodule cohorts in cancer centers, emphasizing the importance of careful evaluation before assuming a metastatic origin. 
The SQL and NLP algorithms demonstrated high accuracy in identifying lung nodule sentences, indicating potential for local service evaluation and research dataset creation. Machine-learning models exhibited strong accuracy in predicting concerning changes in lung nodule scan reports. While limitations include variability in disease group attribution, the potential for correlation rather than causality in clinical findings, and the need for further external validation, the algorithm's accuracy and potential to support clinical decision-making and healthcare automation represent a significant stride in lung nodule management and research.

Keywords: lung cancer diagnosis, structured-query-language (SQL), natural language processing (NLP), machine learning, CT scans

Procedia PDF Downloads 43