Search results for: semantic repository
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 663

123 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra I. Cristea

Abstract:

The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and such requests are often ignored by learners. Especially in the booming realm of Massive Open Online Course (MOOC) platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only – which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structure. In this research, rather than considering sentences as plain sequences, we hypothesise that higher-level semantic and syntactic sentence processing based on linguistics will render a richer representation. We thus evaluate the traditional LSTM against other bleeding-edge models that take syntactic structure into account, such as the tree-structured LSTM, the Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on our MOOC dataset, on which they perform best, and on a public sentiment analysis dataset that is further used to cross-examine the models' results.

Keywords: deep learning, data mining, gender prediction, MOOCs

Procedia PDF Downloads 140
122 The Phenomena of False Cognates and Deceptive Cognates: Issues to Foreign Language Learning and Teaching Methodology Based on Set Theory

Authors: Marilei Amadeu Sabino

Abstract:

The aim of this study is to establish differences between the terms ‘false cognates’, ‘false friends’ and ‘deceptive cognates’, usually considered to be synonyms. It will be shown they are not synonyms, since they do not designate the same linguistic process or phenomenon. Despite their differences in meaning, many pairs of formally similar words in two (or more) different languages are true cognates, although they are usually known as ‘false’ cognates – such as, for instance, the English and Italian lexical items ‘assist x assistere’; ‘attend x attendere’; ‘argument x argomento’; ‘apology x apologia’; ‘camera x camera’; ‘cucumber x cocomero’; ‘fabric x fabbrica’; ‘factory x fattoria’; ‘firm x firma’; ‘journal x giornale’; ‘library x libreria’; ‘magazine x magazzino’; ‘parent x parente’; ‘preservative x preservativo’; ‘pretend x pretendere’; ‘vacancy x vacanza’, to name but a few examples. Thus, one of the theoretical objectives of this paper is firstly to elaborate definitions establishing a distinction between the words that are definitely ‘false cognates’ (derived from different etyma) and those that are just ‘deceptive cognates’ (derived from the same etymon). Secondly, based on Set Theory and on the concepts of equal sets, subsets, intersection of sets and disjoint sets, this study is intended to elaborate some theoretical and practical questions that will be useful in identifying more precisely similarities and differences between cognate words of different languages, and according to graphic interpretation of sets it will be possible to classify them and provide discernment about the processes of semantic changes. Therefore, these issues might be helpful not only to the Learning of Second and Foreign Languages, but they could also give insights into Foreign and Second Language Teaching Methodology. Acknowledgements: FAPESP – São Paulo State Research Support Foundation – the financial support offered (proc. n° 2017/02064-7).
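The four set relations named in the abstract can be made concrete with a minimal sketch. It assumes, purely for illustration, that each word's meanings are represented as a set of sense labels; the function name `classify_pair` and the simplified sense inventories are hypothetical and are not taken from the paper.

```python
def classify_pair(senses_a, senses_b):
    """Return the set relation between the sense sets of two look-alike words."""
    a, b = set(senses_a), set(senses_b)
    if a == b:
        return "equal sets"
    if a < b or b < a:
        # One word's senses are entirely contained in the other's.
        return "subset"
    if a & b:
        # Some senses are shared, but each word also has senses of its own.
        return "intersection"
    return "disjoint sets"


# Hypothetical, heavily simplified sense inventories (illustrative only).
english_library = {"place where books are kept for lending"}
italian_libreria = {"bookshop", "bookcase"}

print(classify_pair(english_library, italian_libreria))
```

With these toy inventories the pair falls into the disjoint case, which is the situation in which a formally similar word is most likely to mislead a learner.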

Keywords: deceptive cognates, false cognates, foreign language learning, teaching methodology

Procedia PDF Downloads 335
121 Prompt Design for Code Generation in Data Analysis Using Large Language Models

Authors: Lu Song Ma Li Zhi

Abstract:

With the rapid advancement of artificial intelligence technology, large language models (LLMs) have become a milestone in the field of natural language processing, demonstrating remarkable capabilities in semantic understanding, intelligent question answering, and text generation. These models are gradually penetrating various industries, particularly showcasing significant application potential in the data analysis domain. However, retraining or fine-tuning these models requires substantial computational resources and ample downstream task datasets, which poses a significant challenge for many enterprises and research institutions. Without modifying the internal parameters of the large models, prompt engineering techniques can rapidly adapt these models to new domains. This paper proposes a prompt design strategy aimed at leveraging the capabilities of large language models to automate the generation of data analysis code. By carefully designing prompts, data analysis requirements can be described in natural language, which the large language model can then understand and convert into executable data analysis code, thereby greatly enhancing the efficiency and convenience of data analysis. This strategy not only lowers the threshold for using large models but also significantly improves the accuracy and efficiency of data analysis. Our approach includes requirements for the precision of natural language descriptions, coverage of diverse data analysis needs, and mechanisms for immediate feedback and adjustment. Experimental results show that with this prompt design strategy, large language models perform exceptionally well in multiple data analysis tasks, generating high-quality code and significantly shortening the data analysis cycle. This method provides an efficient and convenient tool for the data analysis field and demonstrates the enormous potential of large language models in practical applications.

Keywords: large language models, prompt design, data analysis, code generation

Procedia PDF Downloads 20
120 The Ideology of the Jordanian Media Women’s Discourse: Lana Mamkgh as an Example

Authors: Amani Hassan Abu Atieh

Abstract:

This study aims at examining the patterns of ideology reflected in the written discourse of women writers in the media of Jordan; Lana Mamkgh is taken as an example. This study critically analyzes the discursive, linguistic, and cognitive representations that she employs as an agent in the institutionalized discourse of the media. Grounded in van Dijk’s critical discourse analysis approach to Sociocognitive Discourse Studies, the present study builds a multilayer framework that encompasses van Dijk’s triangle: discourse, society, and cognition. Specifically, the study attempts to analyze, at both micro and macro levels, the underlying cognitive processes and structures, mainly ideology and discursive strategies, which are functional in the production of women’s discourse in terms of meaning, forms, and functions. Cognitive processes that social actors adopt are underlined by experience/context and semantic mental models on the one hand and social cognition on the other. This study is based on qualitative research and adopts purposive sampling, taking as an example an opinion article written by Lana Mamkgh in the Arabic Jordanian Daily, Al Rai. Taking her role as an agent in the public sphere, she stresses national and feminist ideologies, demonstrating the use of assertive, evaluative, and expressive linguistic and rhetorical devices that appeal to the logic, ethics, and emotions of the addressee. Highlighting the agency of Jordanian writers in the media, the study seeks to achieve the macro goal of dispensing political and social justice to the underprivileged. Further, the study seeks to prove that the voice of Jordanian women, viewed as underrepresented and invisible in the public arena, has come through clearly.

Keywords: critical discourse analysis, sociocognitive theory, ideology, women discourse, media

Procedia PDF Downloads 103
119 Analyzing Apposition and the Typology of Specific Reference in Newspaper Discourse in Nigeria

Authors: Monday Agbonica Bello Eje

Abstract:

The language of the print media is characterized by the use of apposition. This linguistic element functions strategically in journalistic discourse, where it is communicatively necessary to name individuals and provide information about them. Linguistic studies of the language of the print media that focus on apposition have largely dwelt on areas other than the typology of appositive reference in newspaper discourse. Yet examining this typology can reveal how writers communicate and provide the information readers need to follow and understand the message. The study, therefore, analyses the patterns of appositional occurrences and the typology of reference in newspaper articles. The data were obtained from The Punch and Daily Trust newspapers. A total of six editions of these newspapers were collected at random over three months. News and feature articles were used in the analysis. Guided by the referential theory of meaning in discourse, the appositions identified were subjected to analysis. The findings show that the semantic relations of coreference and speaker coreference have the highest percentage and frequency of occurrence in the data. This is because the subject matter of news reports and feature articles focuses on humans and the events around them; as a result, readers need to be provided with some form of detail and background information in order to identify the referents as well as follow the discourse. Also, the non-referential relations of absolute synonymy and speaker synonymy have fewer occurrences and lower percentages in the analysis. This is tied to a major feature of the language of the media: simplicity. The paper concludes that apposition is mainly used for the purpose of providing the reader with much detail. In this way, the writer transmits information that not only allows detailed yet concise descriptions but also helps the reader to follow the discourse.

Keywords: apposition, discourse, newspaper, Nigeria, reference

Procedia PDF Downloads 155
118 The Role of Anti-corruption Clauses in the Fight Against Corruption in Petroleum Sector

Authors: Azar Mahmoudi

Abstract:

Despite the rise of global anti-corruption movements and the strong emergence of international and national anti-corruption laws, corrupt practices are still prevalent in most places, and countries still struggle to translate these laws into practice. Moreover, in most countries, political and economic elites oppose anti-corruption reforms. In such a situation, the role of external actors, such as other States, international organizations, and transnational actors, becomes essential. Among them, Transnational Corporations [TNCs] can develop their own regime-like framework to govern their internal activities, and through this, they can contribute to the regimes established by State actors to solve transnational issues. Among various regimes, TNCs may choose to comply with the transnational anti-corruption legal regime to avoid the cost of non-compliance with anti-corruption laws. As a result, they decide to strengthen their anti-corruption compliance as they expand into new overseas markets. Such a decision extends anti-corruption standards among their employees and third-party agents and within their projects across countries. To better address the challenges posed by corruption, TNCs have adopted a comprehensive anti-corruption toolkit. Among the various instruments, anti-corruption clauses have become one of the main anti-corruption means in international commercial agreements. Anti-corruption clauses, acting as a due diligence tool, can protect TNCs against the engagement of third-party agents in corrupt practices and further promote anti-corruption standards among businesses operating across countries. An anti-corruption clause allows parties to create a contractual commitment to exclude corrupt practices during the term of their agreement, including all levels of negotiation and implementation.
Such a clause offers companies a mechanism to reduce the risk of potential corruption in their dealings with third parties while avoiding civil and administrative penalties. There have been few attempts to examine the role of anti-corruption clauses in the fight against corruption; therefore, this paper aims to fill this gap and examine anti-corruption clauses in a specific sector where corrupt practices are widespread and endemic, i.e., the petroleum industry. This paper argues that anti-corruption clauses are a positive step in ensuring that the petroleum industry operates in an ethical and transparent manner, helping to reduce the risk of corruption and promote integrity in this sector. Contractual anti-corruption clauses vary in terms of the types of commitment, so parties have a wide range of options when choosing the clauses to incorporate into their contracts. This paper intends to propose a categorization of anti-corruption clauses in the petroleum sector. In particular, it examines the anti-corruption clauses incorporated in transnational hydrocarbon contracts published by the Resource Contract Portal, an online repository of extractive contracts. Then, this paper offers a quantitative assessment of anti-corruption clauses according to the type of contract, the date of conclusion, and the geographical distribution.

Keywords: anti-corruption, oil and gas, transnational corporations, due diligence, contractual clauses, hydrocarbon, petroleum sector

Procedia PDF Downloads 120
117 The Psychology of Cross-Cultural Communication: A Socio-Linguistics Perspective

Authors: Tangyie Evani, Edmond Biloa, Emmanuel Nforbi, Lem Lilian Atanga, Kom Beatrice

Abstract:

The dynamics of languages in contact necessitate a close study of how their users negotiate meanings from shared values in the process of cross-cultural communication. A transverse analysis of the situation demonstrates the existence of complex efforts to connect cultural knowledge to cross-linguistic competencies within a widening range of communicative exchanges. This paper sets out to examine the psychology of cross-cultural communication in a multilingual setting like Cameroon, where many local and international languages are in close contact. The paper also analyses the pertinence of existing macro-sociological concepts as fundamental knowledge traits in literal and idiomatic cross-semantic mapping. From this point, the article presents a path model connecting sociolinguistics to the increasing adoption of a widening range of communicative genres driven by ongoing globalisation trends and their high-speed information technology machinery. By applying a cross-cultural analysis frame, the paper contributes to a better understanding of the fundamental changes in the nature and goals of cross-cultural knowledge in the pragmatics of communication and cultural acceptability. It emphasises that, in an era of increasing global interchange, a comprehensive, inclusive global culture that bridges gaps in cross-cultural communication would have significant potential to contribute to achieving global social development goals, if inadequacies in language constructs are adjusted to create avenues that intertwine with sociocultural beliefs, ensuring that meaningful and context-bound sociolinguistic values are observed within the global arena of communication.

Keywords: cross-cultural communication, customary language, literalisms, primary meaning, subclasses, transubstantiation

Procedia PDF Downloads 277
116 Estimating Estimators: An Empirical Comparison of Non-Invasive Analysis Methods

Authors: Yan Torres, Fernanda Simoes, Francisco Petrucci-Fonseca, Freddie-Jeanne Richard

Abstract:

Non-invasive sampling is an alternative to collecting genetic samples directly. Non-invasive samples are collected without manipulating the animal (e.g., scats, feathers and hairs). Nevertheless, the use of non-invasive samples has some limitations. The main issue is degraded DNA, which leads to poorer extraction efficiency and genotyping errors. Those errors delayed the widespread use of non-invasive genetic information for some years. Genotyping errors can be limited by using analysis methods that accommodate the errors and singularities of non-invasive samples. Genotype matching and population estimation algorithms can be highlighted as important analysis tools that have been adapted to deal with those errors. Despite this recent development of analysis methods, there is still a lack of empirical comparisons of their performance. A comparison of methods on datasets that differ in size and structure can be useful for future studies, since non-invasive samples are a powerful tool for obtaining information, especially for endangered and rare populations. To compare the analysis methods, four different datasets obtained from the Dryad digital repository were used. Three different matching algorithms (Cervus, Colony and Error Tolerant Likelihood Matching - ETLM) were used for matching genotypes and two different ones for population estimation (Capwire and BayesN). The three matching algorithms showed different patterns of results. ETLM produced fewer unique individuals and recaptures. A similarity in the matched genotypes between Colony and Cervus was observed. That is not surprising, given the similarity between those methods in their pairwise likelihood and clustering algorithms. The matching of ETLM showed almost no similarity with the genotypes that were matched with the other methods.
The different clustering algorithm and error model of ETLM seem to lead to a more stringent selection, although the processing time and interface friendliness of ETLM were the worst among the compared methods. The population estimators performed differently depending on the dataset. There was a consensus between the different estimators for only one dataset. BayesN produced both higher and lower estimates than Capwire. Unlike Capwire, BayesN does not consider the total number of recaptures, only the recapture events. This makes the estimator sensitive to data heterogeneity. Heterogeneity here means different capture rates among individuals. In those examples, homogeneity seems to be crucial for BayesN to work properly. Both methods are user-friendly and have reasonable processing times. An extended analysis with simulated genotype data can clarify the sensitivity of the algorithms. The present comparison of the matching methods indicates that Colony seems to be more appropriate for general use, considering a time/interface/robustness balance. The heterogeneity of the recaptures strongly affected the BayesN estimations, leading to over- and underestimation of population numbers. Capwire is therefore advisable for general use since it performs better in a wide range of situations.
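To illustrate why different matching rules can report different numbers of unique individuals from the same samples, here is a minimal sketch. It is a simple greedy mismatch-tolerance heuristic written for illustration; it is not the likelihood-based procedure of Cervus, Colony, or ETLM, and the function names, the `max_mismatches` parameter, and the toy genotypes are all assumptions.

```python
def count_mismatches(g1, g2):
    """Count the loci at which two multilocus genotypes differ."""
    return sum(1 for locus_a, locus_b in zip(g1, g2) if locus_a != locus_b)


def group_samples(genotypes, max_mismatches=0):
    """Greedily assign each sample to the first group whose reference
    genotype differs at no more than `max_mismatches` loci."""
    references = []  # first genotype seen for each putative individual
    groups = []      # sample indices assigned to each putative individual
    for index, genotype in enumerate(genotypes):
        for group_id, reference in enumerate(references):
            if count_mismatches(genotype, reference) <= max_mismatches:
                groups[group_id].append(index)
                break
        else:
            # No existing group is close enough: treat as a new individual.
            references.append(genotype)
            groups.append([index])
    return groups


# Toy data: each genotype is a tuple of loci, each locus a pair of alleles.
samples = [
    ((101, 103), (200, 204), (150, 150)),
    ((101, 103), (200, 204), (150, 152)),  # differs from sample 0 at one locus
    ((99, 103), (198, 204), (148, 150)),
]

strict = group_samples(samples, max_mismatches=0)
tolerant = group_samples(samples, max_mismatches=1)
print(len(strict), len(tolerant))
```

A stricter rule treats the one-locus difference as a separate individual, while a tolerant rule treats it as a possible genotyping error and records a recapture, so the two settings yield different counts of individuals and recaptures for the same data.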

Keywords: algorithms, genetics, matching, population

Procedia PDF Downloads 137
115 Cognitive and Functional Analysis of Experiencer Subject and Experiencer Object Psychological Predicate Constructions in French

Authors: Carine Kawakami

Abstract:

In French, as well as in English, there are two types of psychological predicate constructions depending on where the experiencer argument is realized; the first type is in the subject position (e.g. Je regrette d’être venu ici. ‘I regret coming here.’), hereinafter called the ES construction, and the second type is in the object position (e.g. Cette nouvelle m’a surpris. ‘This news surprised me.’), referred to as the EO construction. In previous studies of psychological predicates, the syntactic position of the experiencer argument has been treated merely as a matter of its connection with the syntactic or semantic structure of the predicate. Consequently, little attention has been paid to how the two types of experiencer realization are related to the conceptualization of the psychological event and to the function of the sentence describing the psychological event, in the sense of speech act theory. In this research, focusing on French data limited to the first-person pronoun and the present tense, the ES constructions and the EO constructions will be analyzed from a cognitive and functional approach. It will be revealed that, because they can be used in soliloquy and frequently co-occur with ça (‘it’), the EO constructions may have an expressive function, revealing what the speaker feels hic et nunc, like an interjection. In the expressive case, the experiencer is construed as a locus where a feeling appears spontaneously and is construed subjectively (e.g. Ah, ça m’énerve! ‘Oh, it irritates me!’). On the other hand, the ES constructions describe the speaker’s mental state in an assertive manner rather than in an expressive and spontaneous way. In other words, they describe what the speaker feels to the interlocutor (e.g. Je suis énervé. ‘I am irritated.’). As a consequence, when the experiencer argument is realized in the subject position, it is construed objectively and has a participant feature in the sense of cognitive grammar.
Finally, it will be concluded that the choice of construction type, at least in French, is correlated to the conceptualization of the psychological event and the discourse feature of its expression.

Keywords: French psychological verb, conceptualization, expressive function, assertive function, experiencer realization

Procedia PDF Downloads 130
114 Gender Bias in Natural Language Processing: Machines Reflect Misogyny in Society

Authors: Irene Yi

Abstract:

Machine learning, natural language processing, and neural network models of language are becoming more and more prevalent in the fields of technology and linguistics today. Training data for machines are, at best, large corpora of human literature and, at worst, a reflection of the ugliness in society. Machines have been trained on millions of human books, only to find that in the course of human history, derogatory and sexist adjectives are used significantly more frequently when describing females in history and literature than when describing males. This is extremely problematic, both as training data and as the outcome of natural language processing. As machines start to handle more responsibilities, it is crucial to ensure that they do not take with them historical sexist and misogynistic notions. This paper gathers data and algorithms from neural network models of language dealing with syntax, semantics, sociolinguistics, and text classification. Results are significant in showing the existing intentional and unintentional misogynistic notions used to train machines, as well as in developing better technologies that take into account the semantics and syntax of text to be more mindful and reflect gender equality. Further, this paper deals with the idea of non-binary gender pronouns and how machines can process these pronouns correctly, given their semantic and syntactic context. This paper also delves into the implications of gendered grammar and its effect, cross-linguistically, on natural language processing. Languages such as French or Spanish not only have rigid gendered grammar rules, but are also spoken in historically patriarchal societies. The progression of society goes hand in hand not only with its language but also with how machines process that language. These ideas are all extremely vital to the development of natural language models in technology, and they must be taken into account immediately.

Keywords: gendered grammar, misogynistic language, natural language processing, neural networks

Procedia PDF Downloads 113
113 Predictors of Motor and Cognitive Domains of Functional Performance after Rehabilitation of Individuals with Acute Stroke

Authors: A. F. Jaber, E. Dean, M. Liu, J. He, D. Sabata, J. Radel

Abstract:

Background: Stroke is a serious health care concern and a major cause of disability in the United States. This condition impacts the individual’s functional ability to perform daily activities. Predicting functional performance of people with stroke assists health care professionals in optimizing the delivery of health services to the affected individuals. The purpose of this study was to identify significant predictors of Motor FIM and of Cognitive FIM subscores among individuals with stroke after discharge from inpatient rehabilitation (typically 4-6 weeks after stroke onset). A second purpose was to explore the relation among personal characteristics, health status, and functional performance of daily activities within 2 weeks of stroke onset. Methods: This study used a retrospective chart review to conduct a secondary analysis of data obtained from the Healthcare Enterprise Repository for Ontological Narration (HERON) database. The HERON database integrates de-identified clinical data from seven different regional sources, including hospital electronic medical record systems of the University of Kansas Health System. The initial HERON data extract encompassed 1192 records, and the final sample consisted of 207 participants who were mostly white (74%) males (55%) with a diagnosis of ischemic stroke (77%). The outcome measures collected from HERON included performance scores on the National Institutes of Health Stroke Scale (NIHSS), the Glasgow Coma Scale (GCS), and the Functional Independence Measure (FIM). The data analysis plan included descriptive statistics, Pearson correlation analysis, and stepwise regression analysis. Results: Significant predictors of discharge Motor FIM subscores included age, baseline Motor FIM subscores, discharge NIHSS scores, and comorbid electrolyte disorder (R² = 0.57, p < 0.026).
Significant predictors of discharge Cognitive FIM subscores were age, baseline Cognitive FIM subscores, client cooperative behavior, comorbid obesity, and the total number of comorbidities (R² = 0.67, p < 0.020). Functional performance on admission was significantly associated with age (p < 0.01), stroke severity (p < 0.01), and length of hospital stay (p < 0.05). Conclusions: Our findings show that younger age, good motor and cognitive abilities on admission, mild stroke severity, fewer comorbidities, and positive client attitude all predict favorable functional outcomes after inpatient stroke rehabilitation. This study provides health care professionals with evidence to evaluate predictors of favorable functional outcomes early in stroke rehabilitation, to tailor individualized interventions based on their client’s anticipated prognosis, and to educate clients about the benefits of making lifestyle changes to improve their anticipated rate of functional recovery.

Keywords: functional performance, predictors, stroke, recovery

Procedia PDF Downloads 140
112 Interpreting Ecclesiastical Heritage: Meaning Making and Contentious Conversations

Authors: Alexis Thouki

Abstract:

In our post-Christian societies, ecclesiastical heritage has acquired a new extrovert profile, aiming to reach out to an increasingly diverse audience. In this context, the various motivations, interests, personalities and cultural exchanges found in the ‘post-modern pilgrimage’ bequeath a hybrid and multidimensional character to religious tourism education. In consequence, churches have acquired the challenging role of enriching visitors’ cultural and spiritual capital. Despite this promising diversification to relate, reveal and provoke constructive discourses, practitioners, faced with various ‘conflicting interests’, attempt to tame a religious environment rich in symbolism and meanings through ‘neutral interpretations’. This paper aims to present the results of an ongoing, developing strategy related to the presentation of contentious meanings in English churches. The paper will explore some of the underlying issues related to the capacity of ‘neutrality’ to spark, downplay or eliminate contentious conversations relating to the cultural, religious, and social dimensions of Christian cultural heritage thematology. In an effort to understand this issue, the paper examines the concept of neutrality and what it stands for, carrying out a discourse analysis in the semantic context in which the theological lexicon is interwoven with the cultural and social meanings of sacred sites. Following that, the paper examines whether the preferred interpretive strategies meet the post-modern interpretative framework, which is marked by polysemy and critical active engagement. The ultimate aim of the paper is to investigate the hypothesis that the preferred neutral strategies, managing the ‘conflicting’ demands of worshippers and visitors, result in the uneven treatment of both the religious and the historical spirit of the place.

Keywords: contentious dialogue, interpretation, meaning making, religious tourism

Procedia PDF Downloads 153
111 INCIPIT-CRIS: A Research Information System Combining Linked Data Ontologies and Persistent Identifiers

Authors: David Nogueiras Blanco, Amir Alwash, Arnaud Gaudinat, René Schneider

Abstract:

At a time when access to and sharing of information are crucial in the world of research, technologies such as persistent identifiers (PIDs), Current Research Information Systems (CRIS), and ontologies may create platforms for information sharing, provided they meet the need to disambiguate their data by ensuring interoperability within and between systems. INCIPIT-CRIS is a continuation of the former INCIPIT project, whose goal was to set up an infrastructure for the low-cost attribution of PIDs with high granularity based on Archival Resource Keys (ARKs). INCIPIT-CRIS can be interpreted as a logical consequence of that project and proposes a research information management system developed from scratch. The system has been created on and around the Schema.org ontology with a further articulation of the use of ARKs. It is thus built upon the infrastructure previously implemented (i.e., INCIPIT) in order to enhance the persistence of URIs. As a consequence, INCIPIT-CRIS aims to be the hinge between previously separate aspects such as CRIS, ontologies and PIDs, in order to produce a powerful system that resolves disambiguation problems by combining an ontology such as Schema.org with unique persistent identifiers such as ARKs. This allows information to be shared through a dedicated platform and also makes the system interoperable by representing the entirety of the data as RDF triples. This paper aims to present the implemented solution as well as its simulation in real life. We will describe the underlying ideas and inspirations while going through the logic and the different functionalities implemented and their links with ARKs and Schema.org. Finally, we will discuss the tests performed with our project partner, the Swiss Institute of Bioinformatics (SIB), using large, real-world data sets.
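As a rough illustration of representing research information as RDF triples whose subjects are ARK-based URIs and whose vocabulary comes from Schema.org, consider the minimal sketch below. It is not the INCIPIT-CRIS implementation: the helper names, the `example.org` resolver, and the ARK strings are hypothetical placeholders, and only the standard library is used to emit N-Triples-style lines.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "https://schema.org/"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject, predicate, obj):
    """Format one subject-predicate-object statement as a single line."""
    return f"{subject} {predicate} {obj} ."


def describe_person(ark_url, name, affiliation_ark_url):
    """Describe a researcher identified by an ARK-based URI."""
    subject = iri(ark_url)
    return [
        triple(subject, iri(RDF_TYPE), iri(SCHEMA + "Person")),
        triple(subject, iri(SCHEMA + "name"), literal(name)),
        triple(subject, iri(SCHEMA + "affiliation"), iri(affiliation_ark_url)),
    ]


# Placeholder identifiers, invented for this example.
person_ark = "https://example.org/ark:/99999/fk4person1"
org_ark = "https://example.org/ark:/99999/fk4org1"

for line in describe_person(person_ark, "Jane Doe", org_ark):
    print(line)
```

Because both the person and the affiliated organisation are referred to by persistent identifiers rather than by name strings, two records that mention the same entity resolve to the same node, which is the disambiguation benefit the abstract describes.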

Keywords: current research information systems, linked data, ontologies, persistent identifier, schema.org, semantic web

Procedia PDF Downloads 127
110 Electronic Physical Activity Record (EPAR): Key for Data Driven Physical Activity Healthcare Services

Authors: Rishi Kanth Saripalle

Abstract:

Medical experts highly recommend including physical activity in everyone’s daily routine, irrespective of gender or age, as it helps to improve various medical issues or curb potential ones. At the same time, experts are diligently trying to provide various healthcare services (interventions, plans, exercise routines, etc.) to promote healthy living and increase physical activity in people’s increasingly hectic schedules. With the introduction of wearables, individuals are able to track, analyze, and visualize their daily physical activities. However, there seems to be no commonly agreed standard for representing, gathering, aggregating, and analyzing an individual’s physical activity data from multiple disparate sources (exercise plans, multiple wearables, etc.). This makes it highly impractical to develop data-driven physical activity applications and healthcare programs. Further, integrating physical activity data into an individual’s Electronic Health Record to provide a holistic picture of that individual’s health still eludes the experts. This article identifies three primary reasons for this issue. First, there is no agreed standard, either structural or semantic, for representing and sharing physical activity data across disparate systems. Second, various organizations (e.g., LA Fitness, Gold’s Gym, etc.) and research-backed interventions and programs still rely primarily on paper or unstructured formats (such as text or notes) to keep track of the data generated from physical activities. Finally, most wearable devices operate in silos. This article identifies the underlying problem, explores the idea of reusing existing standards, and identifies the essential modules required to move forward.

Keywords: electronic physical activity record, physical activity in EHR EIM, tracking physical activity data, physical activity data standards

Procedia PDF Downloads 278
109 3D Modeling Approach for Cultural Heritage Structures: The Case of Virgin of Loreto Chapel in Cusco, Peru

Authors: Rony Reátegui, Cesar Chácara, Benjamin Castañeda, Rafael Aguilar

Abstract:

Nowadays, heritage building information modeling (HBIM) is considered an efficient tool to represent and manage information of cultural heritage (CH). The basis of this tool relies on a 3D model generally obtained from a cloud-to-BIM procedure. There are different methods to create an HBIM model that goes from manual modeling based on the point cloud to the automatic detection of shapes and the creation of objects. The selection of these methods depends on the desired level of development (LOD), level of information (LOI), grade of generation (GOG), as well as on the availability of commercial software. This paper presents the 3D modeling of a stone masonry chapel using Recap Pro, Revit, and Dynamo interface following a three-step methodology. The first step consists of the manual modeling of simple structural (e.g., regular walls, columns, floors, wall openings, etc.) and architectural (e.g., cornices, moldings, and other minor details) elements using the point cloud as reference. Then, Dynamo is used for generative modeling of complex structural elements such as vaults, infills, and domes. Finally, semantic information (e.g., materials, typology, state of conservation, etc.) and pathologies are added within the HBIM model as text parameters and generic models families, respectively. The application of this methodology allows the documentation of CH following a relatively simple to apply process that ensures adequate LOD, LOI, and GOG levels. In addition, the easy implementation of the method as well as the fact of using only one BIM software with its respective plugin for the scan-to-BIM modeling process means that this methodology can be adopted by a larger number of users with intermediate knowledge and limited resources since the BIM software used has a free student license.

Keywords: cloud-to-BIM, cultural heritage, generative modeling, HBIM, parametric modeling, Revit

Procedia PDF Downloads 139
108 Multi-source Question Answering Framework Using Transformers for Attribute Extraction

Authors: Prashanth Pillai, Purnaprajna Mangsuli

Abstract:

Oil exploration and production companies invest considerable time and effort to extract essential well attributes (such as well status, surface and target coordinates, wellbore depths, event timelines, etc.) from unstructured data sources like technical reports, which are often non-standardized, multimodal, and highly domain-specific by nature. It is also important to consider the context when extracting attribute values from reports that contain information on multiple wells/wellbores. Moreover, semantically similar information may often be depicted in different syntactic representations across multiple pages and document sources. We propose a hierarchical multi-source fact extraction workflow based on a deep learning framework to extract essential well attributes at scale. An information retrieval module based on the transformer architecture was used to rank relevant pages in a document source using page image embeddings and semantic text embeddings. A question answering framework using the LayoutLM transformer was used to extract attribute-value pairs, incorporating the text semantics and layout information from the most relevant pages in a document. To better handle context when dealing with multi-well reports, we incorporate a dynamic query generation module to resolve ambiguities. The attribute information extracted from various pages and documents is standardized to a common representation using a parser module to facilitate information comparison and aggregation. Finally, we use a probabilistic approach to fuse information extracted from multiple sources into a coherent well record. The applicability of the proposed approach and its performance were studied on several real-life well technical reports.

Keywords: natural language processing, deep learning, transformers, information retrieval

Procedia PDF Downloads 189
107 Linguistic Misinterpretation and the Dialogue of Civilizations

Authors: Oleg Redkin, Olga Bernikova

Abstract:

Globalization and migrations have made cross-cultural contacts more frequent and intensive. Sometimes, these contacts may lead to misunderstanding between partners of communication and misinterpretations of the verbal messages that some researchers tend to consider as the 'clash of civilizations'. In most cases, reasons for that may be found in cultural and linguistic differences and hence misinterpretations of intentions and behavior. The current research examines factors of verbal and non-verbal communication that should be taken into consideration in verbal and non-verbal contacts. Language is one of the most important manifestations of the cultural code, and it is often considered as one of the special features of a civilization. The Arabic language, in particular, is commonly associated with Islam and the language and the Arab-Muslim civilization. It is one of the most important markers of self-identification for more than 200 million of native speakers. Arabic is the language of the Quran and hence the symbol of religious affiliation for more than one billion Muslims around the globe. Adequate interpretation of Arabic texts requires profound knowledge of its grammar, semantics of its vocabulary. Communicating sides who belong to different cultural groups are guided by different models of behavior and hierarchy of values, besides that the vocabulary each of them uses in the dialogue may convey different semantic realities and vary in connotations. In this context direct, literal translation in most cases cannot adequately convey the original meaning of the original message. Besides that peculiarities and diversities of the extralinguistic information, such as the body language, communicative etiquette, cultural background and religious affiliations may make the dialogue even more difficult. 
It is very likely that the so called 'clash of civilizations' in most cases is due to misinterpretation of counterpart's means of discourse such as language, cultural codes, and models of behavior rather than lies in basic contradictions between partners of communication. In the process of communication, one has to rely on universal values rather than focus on cultural or religious peculiarities, to take into account current linguistic and extralinguistic context.

Keywords: Arabic, civilization, discourse, language, linguistic

Procedia PDF Downloads 218
106 Variational Explanation Generator: Generating Explanation for Natural Language Inference Using Variational Auto-Encoder

Authors: Zhen Cheng, Xinyu Dai, Shujian Huang, Jiajun Chen

Abstract:

Recently, explanatory natural language inference, also known as explanation generation for Natural Language Inference (NLI), has attracted much attention for making logic relationship prediction interpretable. Existing explanation generators based on a discriminative Encoder-Decoder architecture have achieved noticeable results. However, we find that these discriminative generators usually generate explanations with correct evidence but incorrect logic semantics. This is because logic information is implicitly encoded in the premise-hypothesis pairs and is difficult to model. In fact, the same logic information exists in both the premise-hypothesis pair and the explanation, and it is easy to extract logic information that is explicitly contained in the target explanation. Hence, we assume that there exists a latent space of logic information while generating explanations. Specifically, we propose a generative model called Variational Explanation Generator (VariationalEG) with a latent variable to model this space. Trained with the guidance of explicit logic information in target explanations, the latent variable in VariationalEG can effectively capture the implicit logic information in premise-hypothesis pairs. Additionally, to tackle the problem of posterior collapse while training VariationalEG, we propose a simple yet effective approach called Logic Supervision on the latent variable to force it to encode logic information. Experiments on the explanation generation benchmark, explanation-Stanford Natural Language Inference (e-SNLI), demonstrate that the proposed VariationalEG achieves significant improvement compared to previous studies and yields a state-of-the-art result. Furthermore, we analyze the generated explanations to demonstrate the effect of the latent variable.

Keywords: natural language inference, explanation generation, variational auto-encoder, generative model

Procedia PDF Downloads 143
105 An Event-Related Potential Investigation of Speech-in-Noise Recognition in Native and Nonnative Speakers of English

Authors: Zahra Fotovatnia, Jeffery A. Jones, Alexandra Gottardo

Abstract:

Speech communication often occurs in environments where noise conceals part of a message. Listeners should compensate for the lack of auditory information by picking up distinct acoustic cues and using semantic and sentential context to recreate the speaker’s intended message. This situation seems to be more challenging in a nonnative than native language. On the other hand, early bilinguals are expected to show an advantage over the late bilingual and monolingual speakers of a language due to their better executive functioning components. In this study, English monolingual speakers were compared with early and late nonnative speakers of English to understand speech in noise processing (SIN) and the underlying neurobiological features of this phenomenon. Auditory mismatch negativities (MMNs) were recorded using a double-oddball paradigm in response to a minimal pair that differed in their middle vowel (beat/bit) at Wilfrid Laurier University in Ontario, Canada. The results did not show any significant structural and electroneural differences across groups. However, vocabulary knowledge correlated positively with performance on tests that measured SIN processing in participants who learned English after age 6. Moreover, their performance on the test negatively correlated with the integral area amplitudes in the left superior temporal gyrus (STG). In addition, the STG was engaged before the inferior frontal gyrus (IFG) in noise-free and low-noise test conditions in all groups. We infer that the pre-attentive processing of words engages temporal lobes earlier than the fronto-central areas and that vocabulary knowledge helps the nonnative perception of degraded speech.

Keywords: degraded speech perception, event-related brain potentials, mismatch negativities, brain regions

Procedia PDF Downloads 101
104 Exploring Twitter Data on Human Rights Activism on Olympics Stage through Social Network Analysis and Mining

Authors: Teklu Urgessa, Joong Seek Lee

Abstract:

Social media is becoming the primary choice of activists to make their voices heard. This is driven by two main reasons. The first is the emergence of Web 2.0, which gave users the opportunity to become content creators rather than passive recipients. The second is the control of mainstream mass media outlets by governments and individuals with their own political and economic interests. This paper explores Twitter data of network actors talking about the marathon silver medalist at Rio 2016, who showed solidarity with the Oromo protesters in Ethiopia at the finish line of the marathon race. The aim is to discover important insights using social network analysis and mining. The hashtag #FeyisaLelisa was used for the Twitter network search. The actors’ network was visualized and analyzed. It showed that the central influencers during the first 10 days in August were international media outlets, whereas in September they were individual activists. The degree distribution of the network is scale-free: the frequency of degrees decays according to a power law. Text mining was also used to arrive at meaningful themes from the tweet corpus about the event selected for analysis. The semantic network indicated 15 important clusters of concepts that provided different insights regarding the why, who, where, and how of the situation related to the event. The sentiments of the words in the tweets were also analyzed and indicated that 95% of the opinions in the tweets were either positive or neutral. Overall, the findings showed that the marathoner's protest on the Olympic stage brought the issue of the Oromo protests to the global stage. A new research framework for event-based social network analysis and mining is proposed, based on the practical procedures followed in this research for event-based social media sense making.
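To make the scale-free claim concrete, the following sketch computes a degree distribution from an edge list and estimates a power-law exponent. The toy edge list and function names are illustrative assumptions, and the estimator is the standard continuous maximum-likelihood approximation, not necessarily the procedure the authors used.

```python
import math
from collections import Counter


def degrees(edges):
    """Count how many edges touch each node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_distribution(deg):
    """Map each degree value k to the number of nodes having that degree."""
    return Counter(deg.values())


def powerlaw_exponent(deg, k_min=1):
    """Continuous maximum-likelihood estimate of alpha in P(k) ~ k^(-alpha),
    using only nodes with degree >= k_min."""
    ks = [k for k in deg.values() if k >= k_min]
    log_sum = sum(math.log(k / k_min) for k in ks)
    if log_sum == 0:
        raise ValueError("all degrees equal k_min; exponent is undefined")
    return 1 + len(ks) / log_sum


# Toy mention network (made up): one hub account and a few peripheral users.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
deg = degrees(edges)
dist = degree_distribution(deg)
alpha = powerlaw_exponent(deg)

for k in sorted(dist):
    print(f"degree {k}: {dist[k]} node(s)")
print(f"estimated exponent: {alpha:.2f}")
```

On a real tweet network, the estimate is only meaningful with many nodes and a carefully chosen `k_min`; a handful of nodes, as here, merely shows the mechanics.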

Keywords: human rights, Olympics, social media, network analysis, social network mining

Procedia PDF Downloads 252
103 Web-Based Instructional Program to Improve Professional Development: Recommendations and Standards for Radioactive Facilities in Brazil

Authors: Denise Levy, Gian M. A. A. Sordi

Abstract:

This web based project focuses on continuing corporate education and improving workers' skills in Brazilian radioactive facilities throughout the country. The potential of Information and Communication Technologies (ICTs) shall contribute to improve the global communication in this very large country, where it is a strong challenge to ensure high quality professional information to as many people as possible. The main objective of this system is to provide Brazilian radioactive facilities a complete web-based repository - in Portuguese - for research, consultation and information, offering conditions for learning and improving professional and personal skills. UNIPRORAD is a web based system to offer unified programs and inter-related information about radiological protection programs. The content includes the best practices for radioactive facilities in order to meet both national standards and international recommendations published by different organizations over the past decades: International Commission on Radiological Protection (ICRP), International Atomic Energy Agency (IAEA) and National Nuclear Energy Commission (CNEN). The website counts on concepts, definitions and theory about optimization and ionizing radiation monitoring procedures. Moreover, the content presents further discussions related to some national and international recommendations, such as potential exposure, which is currently one of the most important research fields in radiological protection. Only two publications of ICRP develop expressively the issue and there is still a lack of knowledge of fail probabilities, for there are still uncertainties to find effective paths to quantify probabilistically the occurrence of potential exposures and the probabilities to reach a certain level of dose. To respond to this challenge, this project discusses and introduces potential exposures in a more quantitative way than national and international recommendations. 
Drawing on valid ICRP and IAEA recommendations and official reports, in addition to scientific papers presented at major international congresses, the website discusses and suggests a number of effective safety actions that can be incorporated into working practice. The web platform was created according to the needs of its corporate audience, with the aim of developing a robust but flexible system that can be easily adapted to future demands. ICTs provide a vast array of new communication capabilities and make it possible to spread high-quality information to as many people as possible at low cost. This initiative shall provide opportunities for employees to increase their professional skills, stimulating development in this large country, where it is an enormous challenge to deliver effective and up-to-date information to geographically distant facilities, while minimizing costs and optimizing results.

Keywords: distance learning, information and communication technology, nuclear science, radioactive facilities

Procedia PDF Downloads 194
102 A Geoprocessing Tool for Early Civil Work Notification to Optimize Fiber Optic Cable Installation Cost

Authors: Hussain Adnan Alsalman, Khalid Alhajri, Humoud Alrashidi, Abdulkareem Almakrami, Badie Alguwaisem, Said Alshahrani, Abdullah Alrowaished

Abstract:

Most of the cost of installing a new fiber optic cable (FOC) is attributed to civil work, that is, trenching. In many cases, information technology departments receive project proposals in their eReview system, but not all projects are visible to everyone. Additionally, if there was no IT scope in the proposed project, it is not likely to be visible to IT. Sometimes it is too late to add IT scope after project budgets have been finalized. Finally, the eReview system is a repository of PDF files for each project, which commits the reviewer to manual work and limits automation potential. This paper details a solution to the late notification of the eReview system by integrating IT sites GIS data (site locations) with land use permit (LUP) data (civil work activity). An LUP request is the first step before securing the required land usage authorizations, which means that no detailed designs exist for any relevant project before an LUP request is approved. To address the manual nature of the eReview system, the solution relies on the fact that both the LUP system and the IT data use ArcGIS Desktop, which enables the creation of a geoprocessing tool with either Python or ModelBuilder to automate finding and evaluating potentially usable LUP requests to reduce trenching between two sites in need of a new FOC. To achieve this, a weekly dump was taken from the LUP system production data and loaded manually into ArcMap. Then a custom tool was developed in ModelBuilder, taking as input a two-column table containing all the pairs of sites in need of new fiber connectivity. The tool iterates over all rows of this table, taking one pair of sites at a time and finding potential LUPs between them that satisfy the provided search radius. If a group of LUPs is found, an iterator goes through each LUP to find the civil work required between the two sites and the LUP polyline feature, as well as the distance along the line, which would be counted as cost avoidance if an IT scope had been added.
Finally, the tool will export an Excel file named with sites pair, and it will contain as many rows as the number of LUPs, which met the search radius containing trenching and pulling information and cost. As a result, multiple projects have been identified – historical, missed opportunity, and proposed projects. For the proposed project, the savings were about 75% ($750,000) to install a new fiber with the Euclidean distance between Abqaiq GOSP2 and GOSP3 DCOs. In conclusion, the current tool setup identifies opportunities to bundle civil work on single projects at a time and between two sites. More work is needed to allow the bundling of multiple projects between two sites to achieve even more cost avoidance in both capital cost and carbon footprint.
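The pair-by-pair search described above can be sketched in plain Python. This is a simplified stand-in under stated assumptions, not the authors' ModelBuilder tool: it uses planar coordinates instead of ArcGIS feature classes, treats an LUP as usable when every vertex of its polyline lies within the search radius of the straight line between the two sites, and all site names, LUP identifiers, and coordinates are made up.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def polyline_length(line):
    """Total length of a polyline given as a list of (x, y) vertices."""
    return sum(math.dist(p, q) for p, q in zip(line, line[1:]))


def candidate_lups(site_a, site_b, lups, radius):
    """Return (lup_id, length) for LUP polylines within radius of the direct path."""
    hits = []
    for lup_id, line in lups.items():
        if all(point_segment_distance(p, site_a, site_b) <= radius for p in line):
            hits.append((lup_id, polyline_length(line)))
    return hits


# Hypothetical inputs: two sites needing fiber, and two permitted trenches.
sites = {"S1": (0.0, 0.0), "S2": (10.0, 0.0)}
lups = {
    "LUP-1": [(2.0, 1.0), (6.0, 1.0)],  # runs alongside the direct path
    "LUP-2": [(3.0, 8.0), (7.0, 8.0)],  # too far away to be useful
}
site_pairs = [("S1", "S2")]

for a, b in site_pairs:
    direct = math.dist(sites[a], sites[b])
    for lup_id, length in candidate_lups(sites[a], sites[b], lups, radius=2.0):
        print(f"{a}-{b}: {lup_id} covers {length:.1f} of {direct:.1f} units")
```

In a real deployment, the reusable length would be multiplied by a per-unit trenching cost to estimate cost avoidance, and the rows would be written to one output file per site pair.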

Keywords: GIS, fiber optic cable installation optimization, eliminate redundant civil work, reduce carbon footprint for fiber optic cable installation

Procedia PDF Downloads 214
101 Enhancing Cognitive and Emotional Well-Being in an 85-Year-Old American-Dominican Veteran through Neuropsychological Intervention and Cognitive Stimulation

Authors: Natividad Natalia Angeles Manuel

Abstract:

In the Dominican Republic, American-Dominican veterans face unique challenges due to their dual identities and wartime experiences. This case study examines an 85-year-old veteran with memory impairments and emotional distress linked to military service. A neuropsychological assessment using standardized tools evaluated cognitive domains and functional abilities. Significant deficits in memory, orientation, semantic memory, and executive functions, alongside symptoms of Post-Traumatic Stress Disorder and depression, were identified. A six-month cognitive stimulation program included tailored interventions to enhance memory, attention, and executive skills through weekly sessions and group activities. Medical and physical therapy support aimed to improve overall cognitive, functional, and emotional outcomes. Follow-up evaluations showed improvements in memory retention, attention, task proficiency, and reduced depressive symptoms, highlighting the program's effectiveness in promoting emotional well-being and quality of life. Despite ongoing memory challenges and military-related nightmares, the veteran responded positively to interventions, demonstrating resilience and motivation. This study emphasizes the importance of personalized neuropsychological interventions for American-Dominican veterans in the Dominican Republic. Through assessment tools and focused cognitive stimulation strategies, healthcare providers can successfully alleviate cognitive and emotional challenges stemming from traumatic experiences in elderly veterans. Overall, integrated neuropsychological assessment and stimulation programs are shown to enhance cognitive resilience and emotional well-being, thus contributing to an enhanced quality of life for aging American-Dominican veterans.

Keywords: neuropsychology, cognitive stimulation, American-Dominican veterans, Dominican Republic, PTSD, memory deficits

Procedia PDF Downloads 17
100 Vascular Crossed Aphasia in Dextrals: A Study on Bengali-Speaking Population in Eastern India

Authors: Durjoy Lahiri, Vishal Madhukar Sawale, Ashwani Bhat, Souvik Dubey, Gautam Das, Biman Kanti Roy, Suparna Chatterjee, Goutam Gangopadhyay

Abstract:

Crossed aphasia has been an area of considerable interest for cognitive researchers as it offers a fascinating insight into cerebral lateralization for language function. We conducted an observational study in the stroke unit of a tertiary care neurology teaching hospital in eastern India on subjects with crossed aphasia over a period of four years. During the study period, we detected twelve cases of crossed aphasia in strongly right-handed patients, caused by ischemic stroke. The age, gender, vernacular language and educational status of the patients were noted. Aphasia type and severity were assessed using Bengali version of Western Aphasia Battery (validated). Computed tomography, magnetic resonance imaging and angiography were used to evaluate the location and extent of the ischemic lesion in brain. Our series of 12 cases of crossed aphasia included 7 male and 5 female with mean age being 58.6 years. Eight patients were found to have Broca’s aphasia, 3 had trans-cortical motor aphasia and 1 patient suffered from global aphasia. Nine patients were having very severe aphasia and 3 suffered from mild aphasia. Mirror-image type of crossed aphasia was found in 3 patients, whereas 9 had anomalous variety. In our study crossed aphasia was found to be more frequent in males. Anomalous pattern was more common than mirror-image. Majority of the patients had motor-type aphasia and no patient was found to have pure comprehension deficit. We hypothesize that in Bengali-speaking right-handed population, lexical-semantic system of the language network remains loyal to the left hemisphere even if the phonological output system is anomalously located in the right hemisphere.

Keywords: aphasia, crossed, lateralization, language function, vascular

Procedia PDF Downloads 182
99 Real-Time Big-Data Warehouse a Next-Generation Enterprise Data Warehouse and Analysis Framework

Authors: Abbas Raza Ali

Abstract:

Big Data technology is gradually becoming a dire need of large enterprises. These enterprises are generating massively large amount of off-line and streaming data in both structured and unstructured formats on daily basis. It is a challenging task to effectively extract useful insights from the large scale datasets, even though sometimes it becomes a technology constraint to manage transactional data history of more than a few months. This paper presents a framework to efficiently manage massively large and complex datasets. The framework has been tested on a communication service provider producing massively large complex streaming data in binary format. The communication industry is bound by the regulators to manage history of their subscribers’ call records where every action of a subscriber generates a record. Also, managing and analyzing transactional data allows service providers to better understand their customers’ behavior, for example, deep packet inspection requires transactional internet usage data to explain internet usage behaviour of the subscribers. However, current relational database systems limit service providers to only maintain history at semantic level which is aggregated at subscriber level. The framework addresses these challenges by leveraging Big Data technology which optimally manages and allows deep analysis of complex datasets. The framework has been applied to offload existing Intelligent Network Mediation and relational Data Warehouse of the service provider on Big Data. The service provider has 50+ million subscriber-base with yearly growth of 7-10%. The end-to-end process takes not more than 10 minutes which involves binary to ASCII decoding of call detail records, stitching of all the interrogations against a call (transformations) and aggregations of all the call records of a subscriber.

Keywords: big data, communication service providers, enterprise data warehouse, stream computing, Telco IN Mediation

Procedia PDF Downloads 170
98 Marketing Strategy of Agricultural Products in Remote Districts: A Case Study of Mudan Township, Taiwan

Authors: Ying-Hsiang Ho, Hsiao-Tseng Lin

Abstract:

Mudan Township is a remote mountainous area in Taiwan. In recent years, due to the migration of the population, inconvenient transportation, digital divide, and low production, agricultural products marketing have become a major issue. This research aims to develop the marketing strategy suitable for the agricultural products of the rural areas. The main objective of this work is to conduct in-depth interviews with scholars and experts in the marketing field, combined with the marketing 4P combination, to analyze and summarize the possible marketing strategies for agricultural products for remote districts. The interviews consist of seven experts from industry who have practical experience in producing, marketing, and selling agricultural products and three professors that have experience in teaching marketing management. The in-depth interviews are conducted for about an hour using a pre-drafted interview outline. The results of the interviews are summarized by semantic analysis and presented in a marketing 4P combination. The results indicate that in terms of products, high-quality products with original characteristics can be added through the implementation of production history, organic certification, and cultural packaging. In the place part, we found that the use of emerging communities, the emphasis on cross-industry alliances, the improvement of information application capabilities of rural households, production and marketing group, and contractual farming system are the development priorities. In terms of promotion, it should be an emphasis on the management of internet social media and word-of-mouth marketing. Mudan Township may consider promoting agricultural products through special festivals such as farmer's market, wild ginger flower season and hot spring season. This research also proposes relevant recommendations for the government's public sector and related industry reference for the promotion of agricultural products for remote area.

Keywords: marketing strategy, remote districts, agricultural products, in-depth interviews

Procedia PDF Downloads 120
97 Contextual Variables Affecting Frustration Level in Reading: An Integral Inquiry

Authors: Mae C. Pavilario

Abstract:

This study employs a sequential explanatory mixed method. Quantitatively it investigated the profile of grade VII students. Qualitatively, the prevailing contextual variables that affect their frustration-level were sought based on their perspective and that of their parents and teachers. These students were categorized as frustration-level in reading based on the data on word list of the Philippine Informal Reading Inventory (Phil-IRI). The researcher-made reading factor instrument translated to local dialect (Hiligaynon) was subjected to cross-cultural translation to address content, semantic, technical, criterion, or conceptual equivalence, the open-ended questions, and one unstructured interview was utilized. In the profile of the 26 participants, the 12 males are categorized as grade II and grade III frustration-levels. The prevailing contextual variables are personal-“having no interest in reading”, “being ashamed and fear of having to read in front of others” for extremely high frustration level; social environmental-“having no regular reading schedule at home” for very high frustration level and personal- “having no interest in reading” for high frustration level. Kendall Tau inferential statistical tool was used to test the significant relationship in the prevailing contextual variables that affect frustration-level readers when grouped according to perspective. Result showed that significant relationship exists between students-parents perspectives; however, there is no significant relationship between students’ and teachers’, and parents’ and teachers’ perspectives. The themes in the narratives of the participants on frustration-level readers are existence of speech defects, undesirable attitude, insufficient amount of reading materials, lack of close supervision from parents, and losing time and focus on task. Intervention was designed.
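For readers unfamiliar with the Kendall Tau statistic mentioned above, the sketch below computes the tau-a coefficient for two rankings of the same items. The ranks are invented for illustration and do not come from the study, and the significance test the authors report is not reproduced here.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs.
    Pairs tied in either ranking count as neither concordant nor discordant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two rankings of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up example: how two groups rank the same five contextual variables.
students = [1, 2, 3, 4, 5]
parents = [1, 3, 2, 4, 5]

tau = kendall_tau_a(students, parents)
print(f"tau-a = {tau:.2f}")
```

A value near 1 indicates that the two groups order the variables similarly, while a value near 0 indicates little agreement.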

Keywords: contextual variables, frustration-level readers, perspective, inquiry

Procedia PDF Downloads 161
96 Automated Prediction of HIV-associated Cervical Cancer Patients Using Data Mining Techniques for Survival Analysis

Authors: O. J. Akinsola, Yinan Zheng, Rose Anorlu, F. T. Ogunsola, Lifang Hou, Robert Leo-Murphy

Abstract:

Cervical Cancer (CC) is the 2nd most common cancer among women living in low- and middle-income countries, with no associated symptoms during its formative stages. With advances in medical research, numerous preventive measures are now in use, but the incidence of cervical cancer cannot be reduced by screening tests alone. The mortality associated with invasive cervical cancer can be reduced through early-stage detection. This study selected an array of feature selection techniques with the aim of developing a model that could validly identify the risk factors of cervical cancer. A retrospective clinic-based cohort study was conducted on 178 HIV-associated cervical cancer patients at Lagos University Teaching Hospital, Nigeria (U54 data repository) in April 2022. The outcome measure was the automated prediction of HIV-associated cervical cancer cases, while the predictor variables included demographic information, reproductive history, birth control, sexual history, and cervical cancer screening history for invasive cervical cancer. The proposed technique was assessed in R and Python, using classification algorithms for the detection and diagnosis of cervical cancer. Four machine learning classification algorithms were used: Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbor (KNN). The dataset was split into training and testing sets in an 80:20 ratio. The numerical features were standardized, and hyperparameter tuning was carried out on the machine learning models.
Suitable features for the detection and diagnosis of cervical cancer were selected from the characteristics in the dataset using several selection methods, in order to classify cases as healthy or diseased. The mean age of patients was 49.7±12.1 years, mean age at pregnancy was 23.3±5.5 years, mean age at first sexual experience was 19.4±3.2 years, and the mean BMI was 27.1±5.6 kg/m². Most of the patients were married (62.9%), and most had at least two sexual partners (72.5%). Age of patients (OR=1.065, p<0.001**), marital status (OR=0.375, p=0.011**), number of pregnancy live births (OR=1.317, p=0.007**), and use of birth control pills (OR=0.291, p=0.015**) were found to be significantly associated with HIV-associated cervical cancer. On the top 10 features (variables) considered in the analysis, RF is reported as the best-performing model overall, with an accuracy of 72.0%, precision of 84.6%, recall of 84.6%, and F1-score of 74.0%, while LR has an accuracy of 74.0%, precision of 70.0%, recall of 70.0%, and F1-score of 70.0%. The RF model identified 10 features predictive of developing cervical cancer. The age of patients was considered the most important risk factor, followed by the number of pregnancy live births, marital status, and use of birth control pills. The study shows that data mining techniques could be used to identify women living with HIV at high risk of developing cervical cancer in Nigeria and other sub-Saharan African countries.
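The preprocessing and evaluation steps described in this abstract (an 80:20 split, standardization of numerical features, and reporting accuracy, precision, recall, and F1-score) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' R/Python code: the helper names `split_80_20`, `standardize`, and `classification_metrics` are hypothetical, the labels are made-up toy data, and no classifier is trained here.

```python
import random
from statistics import mean, pstdev


def split_80_20(rows, test_ratio=0.2, seed=0):
    """Shuffle the rows and return (train, test) in roughly an 80:20 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def standardize(train_values, values):
    """Scale values to zero mean and unit variance using training statistics only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # avoid division by zero for constant features
    return [(v - mu) / sigma for v in values]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = diseased, 0 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 178 placeholder record indices, matching the cohort size in the abstract.
train, test = split_80_20(range(178))
print(len(train), len(test))

# Hypothetical true labels and predictions for eight test cases.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

Standardizing with the training set's mean and standard deviation, then applying the same values to the test set, keeps information from the held-out data out of the preprocessing step.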

Keywords: associated cervical cancer, data mining, random forest, logistic regression

Procedia PDF Downloads 79
95 The Contribution of Corpora to the Investigation of Cross-Linguistic Equivalence in Phraseology: A Contrastive Analysis of Russian and Italian Idioms

Authors: Federica Floridi

Abstract:

The long tradition of contrastive idiom research has essentially focused on three domains: the comparison of structural types of idioms (e.g., verbal idioms, idioms with noun-phrase structure, etc.), the description of idioms belonging to the same thematic groups (Sachgruppen), and the identification of different types of cross-linguistic equivalents (i.e., full equivalents, partial equivalents, phraseological parallels, non-equivalents). The diastratic, diachronic and diatopic aspects of the compared idioms, as well as their syntactic, pragmatic and semantic properties, have been largely ignored. Corpora (both monolingual and parallel) give the opportunity to investigate the actual use of correlating idioms in authentic texts of L1 and L2. Adopting a corpus-based approach, it is possible to draw attention to the frequency of occurrence of idioms, their syntactic embedding, their potential syntactic transformations (e.g., nominalization, passivization, relativization, etc.), their combinatorial possibilities, the variations of their lexical structure, and their connotations in terms of stylistic markedness or register. This paper presents the results of a contrastive analysis of Russian and Italian idioms referring to the concepts of ‘beginning’ and ‘end’, which was carried out using the Russian National Corpus and the ‘La Repubblica’ corpus. Beyond the digital corpora, bilingual dictionaries, such as Skvorcova - Majzel’, Dobrovol’skaja, Kovalev, and Čerdanceva, as well as monolingual resources, were consulted. The study has shown that many of the idioms traditionally listed as cross-linguistic equivalents in bilingual dictionaries cannot be considered correspondents. The findings demonstrate that even those idioms that are formally identical in Russian and Italian and are presumably derived from the same source (e.g., conceptual metaphor, the Bible, classical mythology, world literature) exhibit differences in usage.
The ultimate purpose of this article is to highlight the need to review and improve the existing bilingual dictionaries in light of the empirical data collected in corpora. The materials gathered in this research can contribute to this end.

Keywords: corpora, cross-linguistic equivalence, idioms, Italian, Russian

Procedia PDF Downloads 141
94 Enhancing Learners' Metacognitive, Cultural and Linguistic Proficiency through Egyptian Series

Authors: Hanan Eltayeb, Reem Al Refaie

Abstract:

To be able to connect and relate to shows spoken in a foreign language, advanced learners must understand not only the linguistic inferences but also the cultural, metacognitive, and pragmatic connotations in colloquial Egyptian TV series. These connotations are needed to understand the different facets of the dramas, and they are in turn consistently developed and shaped through watching these shows. The inferences have become a staple of Egyptian colloquial culture over the years, making their way into day-to-day conversations as Egyptians use them to speak, relate, joke, and connect with each other, even without having known one another previously. Advanced learners need to understand these inferences not only to watch these shows, but also to be able to converse with Egyptians on a level that surpasses the formal or standard. When faced with some of the relatively recent shows on Egyptian screens, learners had difficulty understanding the pragmatic, cultural, and religious background of the target language and were consequently unable to interact effectively with a native speaker in real-life situations. This study aims to enhance the linguistic and cultural proficiency of learners through studying two genres of colloquial Egyptian TV series. The study samples were derived from two recent comedy and social Egyptian series ('The Seventh Neighbor' سابع جار, and 'Nelly and Sherihan' نيللي و شريهان). When learners watch such series, they usually have trouble understanding inferences that relate to the social, religious, and political events addressed in the series. Using discourse analysis of the semantic, pragmatic, cultural, and linguistic characteristics of the target language, some major deductions were highlighted and repeated, showing a pattern in both.
The research paper concludes that there are many sets of lingual and para-lingual phrases, idioms, and proverbs that can be acquired and used effectively by teaching these series. The strategies adopted in the study can be applied to different types of media, such as movies, TV shows, and even cartoons, to enhance student proficiency.

Keywords: Egyptian series, culture, linguistic competence, pragmatics, semantics, social

Procedia PDF Downloads 136