Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 6213

Search results for: short text classification

5553 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.
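
The k-mer representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so the function names `kmer_counts` and `kmer_vector`, the toy sequence, and the choice to skip windows containing ambiguous bases are all assumptions made for illustration.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in a DNA sequence."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_vector(sequence: str, vocabulary: list[str]) -> list[int]:
    """Map a sequence to a fixed-length count vector over a shared k-mer vocabulary."""
    k = len(vocabulary[0])
    counts = kmer_counts(sequence, k)
    return [counts[kmer] for kmer in vocabulary]


# Hypothetical toy sequence, not data from the study.
toy = "ACGTACGTNA"
counts = kmer_counts(toy, 3)
vector = kmer_vector(toy, ["ACG", "CGT", "GTA", "TAC", "AAA"])
```

Vectors of this kind, built over one vocabulary for all isolates, are the sort of input a classifier could be trained on. With four bases there are up to 4^k distinct k-mers, so the vocabulary grows quickly as k increases.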

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 169

5552 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 160

5551 Stator Short-Circuits Fault Diagnosis in Induction Motors

Authors: K. Yahia, M. Sahraoui, A. Guettaf

Abstract:

This paper deals with the problem of stator fault diagnosis in induction motors. Inter-turn short-circuit faults can be diagnosed by applying the discrete wavelet transform (DWT) to the current Park’s vector modulus (CPVM). The method is based on the decomposition of the CPVM signal, from which the wavelet approximation and detail coefficients are extracted. Evaluating the energy of a detail with a known bandwidth makes it possible to define a fault severity factor (FSF). The method has been tested through the simulation of an induction motor using a mathematical model based on the winding-function approach. Simulation and experimental results show the effectiveness of the method.
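
The processing chain described above (CPVM, wavelet decomposition, energy of one detail band) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the abstract names neither the wavelet nor the FSF formula, so the sketch uses the Haar wavelet, a commonly used form of the Park's vector components, and stops at the detail-band energy. The helper `three_phase` and its amplitude unbalance are a hypothetical stand-in for a faulty stator.

```python
import math


def park_vector_modulus(ia, ib, ic):
    """Current Park's vector modulus from three lists of phase-current samples."""
    modulus = []
    for a, b, c in zip(ia, ib, ic):
        i_d = math.sqrt(2 / 3) * a - (b + c) / math.sqrt(6)
        i_q = (b - c) / math.sqrt(2)
        modulus.append(math.hypot(i_d, i_q))
    return modulus


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(x + y) / math.sqrt(2) for x, y in pairs]
    detail = [(x - y) / math.sqrt(2) for x, y in pairs]
    return approx, detail


def detail_energy(signal, level):
    """Energy (sum of squares) of the detail coefficients at the given level."""
    approx, detail = list(signal), []
    for _ in range(level):
        approx, detail = haar_step(approx)
    return sum(d * d for d in detail)


def three_phase(amp_a, amp_b, amp_c, freq=50.0, fs=3200.0, n=1024):
    """Hypothetical test signal: three phase currents with chosen amplitudes."""
    ia, ib, ic = [], [], []
    for k in range(n):
        theta = 2 * math.pi * freq * k / fs
        ia.append(amp_a * math.cos(theta))
        ib.append(amp_b * math.cos(theta - 2 * math.pi / 3))
        ic.append(amp_c * math.cos(theta + 2 * math.pi / 3))
    return ia, ib, ic


# Balanced currents give a constant modulus; an unbalance makes it oscillate.
healthy = detail_energy(park_vector_modulus(*three_phase(1.0, 1.0, 1.0)), level=4)
unbalanced = detail_energy(park_vector_modulus(*three_phase(1.2, 1.0, 1.0)), level=4)
```

For balanced currents the modulus is constant, so the detail energy is essentially zero. The unbalanced case adds a ripple at twice the supply frequency, which shows up as energy in the detail band. A severity factor could be built by normalising this energy, but that normalisation is not given in the abstract.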

Keywords: induction motors (IMs), inter-turn short-circuits diagnosis, discrete wavelet transform (DWT), Current Park’s Vector Modulus (CPVM)

Procedia PDF Downloads 458

5550 Linguistic Analysis of Holy Scriptures: A Comparative Study of Islamic Jurisprudence and the Western Hermeneutical Tradition

Authors: Sana Ammad

Abstract:

The traditions of linguistic analysis in Islam and Christianity have developed independently of each other in light of the social developments specific to their historical contexts. However, recently an increasing number of Muslim academics educated in the West have tried to apply the Western tradition of linguistic interpretation to the Qur’anic text while completely disregarding the Islamic linguistic tradition used and developed by the traditional scholars over the centuries. The aim of the paper is to outline the linguistic tools and methods used by the traditional Islamic scholars for the purpose of interpreting the Holy Qur’an and shed light on how they contribute towards a better understanding of the text compared to their Western counterparts. This paper carries out a descriptive-comparative study of the linguistic tools developed and perfected by the traditional scholars in Islam for the purpose of textual analysis of the Qur’an as they have been described in the authentic works of Usul Al Fiqh (Jurisprudence) and the principles of textual analysis employed by the Western hermeneutical tradition for the study of the Bible. First, it briefly outlines the independent historical development of the two traditions, emphasizing the final normative shape that they have taken. Then it draws a comparison of the two traditions, highlighting the similarities and the differences existing between them. In the end, the paper demonstrates the level of academic excellence achieved by the traditional linguistic scholars in their efforts to develop appropriate tools of textual interpretation and how these tools are more suitable for interpreting the Qur’an compared to the Western principles. Since the aim of interpreters of both traditions is to try and attain an objective understanding of the Scriptures, the emphasis of the paper shall be to highlight how well the Islamic method of linguistic interpretation contributes to an objective understanding of the Qur’anic text.
The paper concludes with the following findings: The Western hermeneutical tradition of linguistic analysis developed within the Western historical context. However, the Islamic method of linguistic analysis is much more highly developed and complex and serves better the purpose of objective understanding of the Holy text.

Keywords: Islamic jurisprudence, linguistic analysis, textual interpretation, western hermeneutics

Procedia PDF Downloads 330

5549 Surface Hole Defect Detection of Rolled Sheets Based on Pixel Classification Approach

Authors: Samira Taleb, Sakina Aoun, Slimane Ziani, Zoheir Mentouri, Adel Boudiaf

Abstract:

Rolling is a pressure treatment technique that modifies the shape of steel ingots or billets between rotating rollers. During this process, defects may form on the surface of the rolled sheets and are likely to affect the performance and quality of the finished product. In our study, we developed a method for detecting surface hole defects using a pixel classification approach. This work includes several steps. First, we performed image preprocessing to delimit areas with and without hole defects on the sheet image. Then, we developed the histograms of each area to generate the gray level membership intervals of the pixels that characterize each area. As we noticed an intersection between the characteristics of the gray level intervals of the images of the two areas, we finally performed a learning step based on a series of detection tests to refine the membership intervals of each area, and to choose the defect detection criterion in order to optimize the recognition of the surface hole.
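
The interval-based pixel classification described above can be sketched in a few lines. The gray-level intervals, the `ambiguous` label for overlapping or uncovered values, and the function names are hypothetical. The abstract does not give the refined intervals or the final detection criterion.

```python
def label_pixel(gray, hole_interval, sound_interval):
    """Label one gray level using two (low, high) membership intervals."""
    in_hole = hole_interval[0] <= gray <= hole_interval[1]
    in_sound = sound_interval[0] <= gray <= sound_interval[1]
    if in_hole and not in_sound:
        return "hole"
    if in_sound and not in_hole:
        return "sound"
    # Either the intervals overlap at this gray level or neither covers it.
    return "ambiguous"


def hole_fraction(image, hole_interval, sound_interval):
    """Fraction of pixels labelled 'hole' in a 2-D list of gray levels."""
    labels = [label_pixel(g, hole_interval, sound_interval)
              for row in image for g in row]
    return labels.count("hole") / len(labels)


# Hypothetical intervals that intersect on gray levels 50..60.
HOLE = (0, 60)
SOUND = (50, 255)
patch = [[10, 20, 200],
         [55, 180, 30]]
fraction = hole_fraction(patch, HOLE, SOUND)
```

The learning step in the abstract would correspond to adjusting `HOLE` and `SOUND` over a series of detection tests so that fewer pixels fall into the ambiguous overlap.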

Keywords: classification, defect, surface, detection, hole

Procedia PDF Downloads 23

5548 A Survey on Smart Security Mechanism Using Graphical Passwords

Authors: Aboli Dhanavade, Shweta Bhimnath, Rutuja Jumale, Ajay Nadargi

Abstract:

Securing our personal belongings is one of our most basic needs. It is not possible to directly apply standard human-computer interaction approaches to authentication. An important usability goal for authentication systems is to support users in selecting the best passwords. Users often select text passwords that are easy to remember, but these are easier for attackers to guess. The human brain is better at remembering pictures than textual characters, so graphical passwords are being designed as an alternative. However, graphical passwords are still immature. Conventional password schemes are also vulnerable to shoulder-surfing attacks, and many shoulder-surfing-resistant graphical password schemes have been proposed. We then analyze the security and usability of the proposed scheme and show its resistance to shoulder-surfing and accidental logins.

Keywords: shoulder-surfing, security, authentication, text-passwords

Procedia PDF Downloads 364

5547 An Ideational Grammatical Metaphor of Narrative History in Chinua Achebe's 'There Was a Country'

Authors: Muhammed-Badar Salihu Jibrin, Chibabi Makedono Darlington

Abstract:

This paper studied Ideational Grammatical Metaphor (IGM) of Narrative History in Chinua Achebe’s There Was a Country. It started with a narrative historical style as a recent genre out of the conventional historical writings. In order to explore the linguistic phenomenon using a particular lexico-grammatical tool of IGM, the theoretical background was examined based on Hallidayan Systemic Functional Linguistics. Furthermore, the study considered the possibility of applying IGM to Part 4 of Achebe’s historical text with recourse to the concept of congruence in IGM and research questions before formulating a working methodology. The analysis of Achebe’s memoir was, thus, presented in tabular forms to account for the quantitative content analysis with qualitative research technique, as well as the metaphorical and congruent wording through nominalization and process types with samples. The frequencies and percentages were given appropriately with respect to each subheading of the text. To this end, the findings showed that material and relational types indicated dominance. The findings confirmed the earlier suggestion by M.A.K. Halliday and C.M.I.M. Matthiessen that IGM should show dominance of the material process type. The implication is that IGM can be an effective tool for the analysis of a narrative historical text. In conclusion, it was observed that IGM carries not only a grammatical function but also an ideological role in shaping the historical discourse within the narrative mode between writers and readers.

Keywords: ideational grammatical metaphor, nominalization, narrative history, memoir, dominance

Procedia PDF Downloads 221

5546 A Conversational Chatbot for Cricket Analytics

Authors: Kishan Bharadwaj Shridhar

Abstract:

Cricket is a data-rich sport, generating vast amounts of information, much of which is captured as textual commentary. Leading cricket data providers, such as ESPN Cricinfo, include valuable Decision Review System (DRS) statistics within these commentaries, often as footnotes. Despite the significance of this data, accessing and analyzing it efficiently remains a challenge. This paper presents the development of a sophisticated chatbot designed to answer queries specifically about DRS in cricket. It supports up to seven distinct query types, including individual player statistics, umpire performance, player vs umpire dynamics, comparisons between batter and bowler, a player’s record at specific venues and more. Additionally, it enables stateful conversations, allowing a user to seamlessly build upon previous queries for a fluid and interactive experience. Leveraging advanced text-to-SQL methodologies and open-source frameworks such as Langgraph, it ensures low latency and robust performance. A distinct prompt engineering module enables the system to accurately interpret query intent, dynamically transitioning to an assisted text-to-SQL approach or a rule-based engine, as needed. This solution is one of a kind in cricket analytics, offering unparalleled insights into cricket through an intuitive interface. It can be extended to other facets of cricket data and beyond, to other sports that generate textual data.

Keywords: conversational AI, cricket data analytics, text to SQL, large language models, stateful conversations

Procedia PDF Downloads 4

5545 The Impact on the Composition of Survey Refusals' Demographic Profile When Implementing Different Classifications

Authors: Eva Tsouparopoulou, Maria Symeonaki

Abstract:

The internationally documented declining survey response rates of the last two decades are mainly attributed to refusals. In fieldwork, a refusal may be obtained not only from the respondent himself/herself, but also from other sources on the respondent’s behalf, such as other household members, apartment building residents or administrator(s), and neighborhood residents. In this paper, we investigate how the composition of the demographic profile of survey refusals changes when different classifications are implemented, and the classification issues arising from that. The analysis is based on the 2002-2018 European Social Survey (ESS) datasets for Belgium, Germany, and the United Kingdom. For these three countries, the number of selected sample units coded as a type of refusal in all nine rounds under investigation was large enough to meet the purposes of the analysis. The results indicate the existence of four different possible classifications that can be implemented and the significance of choosing the one that strengthens the contrasts of the different types of respondents' demographic profiles. Since the foundation of social quantitative research lies in the triptych of definition, classification, and measurement, this study aims to identify the multiplicity of the definition of survey refusals as a methodological tool for the continually growing research on non-response.

Keywords: non-response, refusals, European social survey, classification

Procedia PDF Downloads 86

5544 Evaluation of the Accuracy of a ‘Two Question Screening Tool’ in the Detection of Intimate Partner Violence in a Primary Healthcare Setting in South Africa

Authors: A. Saimen, E. Armstrong, C. Manitshana

Abstract:

Intimate partner violence (IPV) has been recognised as a global human rights violation. It is universally underdiagnosed, and the institution of timeous multi-faceted interventions has been noted to benefit IPV victims. Currently, the concept of using a screening tool to detect IPV has not been widely explored in a primary healthcare setting in South Africa, and it was for this reason that this study was undertaken. A systematic random sampling of 1 in 8 women over a period of 3 months was conducted prospectively at the OPD of a Level 1 Hospital. Participants were asked about their experience of IPV during the past 12 months. The WAST-short, a two-question tool, was used to screen patients for IPV. To verify the result of the screening, women were also asked the remaining questions from the WAST. Data was collected from 400 participants, with a response rate of 99.3%. The prevalence of IPV in the sample was 32%. The WAST-short was shown to have the following operating characteristics: sensitivity 45.2%, specificity 98%, positive predictive value 98%, negative predictive value 79%. The WAST-short lacks sufficient sensitivity and therefore is not an ideal screening tool for this setting. Improvement in the sensitivity of the WAST-short in this setting may be achieved by lowering the threshold for a positive result for IPV screening, and modification of the screening questions to better reflect IPV as understood by the local population.
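
The operating characteristics quoted above follow the standard definitions for a 2x2 screening table, which can be sketched as below. The counts in the example are illustrative placeholders, not the study's data, and the function name `screening_metrics` is an assumption.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard screening-test measures from a 2x2 table of counts."""
    return {
        # Share of true cases that the tool flags.
        "sensitivity": tp / (tp + fn),
        # Share of non-cases that the tool clears.
        "specificity": tn / (tn + fp),
        # Share of positive screens that are true cases.
        "ppv": tp / (tp + fp),
        # Share of negative screens that are true non-cases.
        "npv": tn / (tn + fn),
        # Share of the whole sample that are true cases.
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
    }


# Illustrative counts only (not taken from the study).
metrics = screening_metrics(tp=40, fp=5, fn=60, tn=195)
```

Sensitivity and specificity depend only on the tool and the threshold for a positive result, while the predictive values also depend on prevalence. Lowering the threshold, as the abstract suggests, generally raises sensitivity and lowers specificity.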

Keywords: domestic violence, intimate partner violence, screening, screening tools

Procedia PDF Downloads 306

5543 Clicking Based Graphical Password Scheme Resistant to Spyware

Authors: Bandar Alahmadi

Abstract:

The fact that people tend to remember pictures better than text motivates researchers to develop graphical passwords as an alternative to textual passwords. Graphical passwords were introduced as a possible alternative to traditional text passwords, in which users prove their identity by clicking on pictures rather than typing alphanumerical text. In this paper, we present a scheme for graphical passwords that is resistant to shoulder-surfing attacks and spyware attacks. The proposed scheme applies a clicking technique to chosen images. First, the users choose a set of images. The images are then placed in a grid where users can click in the cells around each image, and the location and number of the clicks are saved. As a result, the proposed scheme can be safe from shoulder-surfing and spyware attacks.

Keywords: security, password, authentication, attack, applications

Procedia PDF Downloads 167

5542 Readability Facing the Irreducible Otherness: Translation as a Third Dimension toward a Multilingual Higher Education

Authors: Noury Bakrim

Abstract:

From the point of view of language morphodynamics, interpretative Readability of the text-result (the stasis) is not the external hermeneutics of its various potential reading events but the paradigmatic, semantic immanence of its dynamics. In other words, interpretative Readability articulates the potential tension between projection (intentionality of the discursive event) and the result (Readability within the syntagmatic stasis). We then consider that translation represents much more a metalinguistic conversion of neurocognitive bilingual sub-routines and modular relations than a semantic equivalence. Furthermore, the actualizing Readability (the process of rewriting a target text within a target language/genre) builds upon the descriptive level between the generative syntax/semantic form and its paradigmatic potential translatability. Translation corpora reveal a certain focus on the positivist stasis of the source text at the expense of its interpretative Readability. For instance, Fluchere's brilliant translation of Miller's Tropic of Cancer into French unconsciously realizes an inversion of the hierarchical relations between Life Thought and Fable: from Life Thought (fable) into Fable (Life Thought). We could regard Bernard Kreiss's translation of Canetti's Die englischen Jahre (Les années anglaises) as another inversion of the historical scale, from individual history into Hegelian history. In order to describe and test both the translation process and its result, we focus on the pedagogical practice that enables various principles grounded in interpretative/actualizing Readability. Henceforth, establishing the analytical uttering dynamics of the source text could be widened by other practices. The reversibility test (target - source text) or the comparison with a second translation in a third language (tertium comparationis A/B and A/C) points to the evidence of an impossible event.
Therefore, it doesn't imply an uttering idealistic/absolute source but the irreducible/non-reproducible intentionality of its production event within the experience of world/discourse. The aim of this paper is to conceptualize translation as the tension between interpretative and actualizing Readability in a new approach grounded in the morphodynamics of language and Translatability (mainly into French) within literary and non-literary texts articulating theoretical and described pedagogical corpora.

Keywords: readability, translation as deverbalization, translation as conversion, Tertium Comparationis, uttering actualization, translation pedagogy

Procedia PDF Downloads 166

5541 Disease Level Assessment in Wheat Plots Using a Residual Deep Learning Algorithm

Authors: Felipe A. Guth, Shane Ward, Kevin McDonnell

Abstract:

The assessment of disease levels in crop fields is an important and time-consuming task that generally relies on the expert knowledge of trained individuals. Image classification in agricultural problems has historically been based on classical machine learning strategies that use hand-engineered features on top of a classification algorithm. This approach tends not to produce results with high accuracy and generalization to the classes classified by the system when the nature of the elements has a significant variability. The advent of deep convolutional neural networks has revolutionized the field of machine learning, especially in computer vision tasks. These networks have a great capacity for learning and have been applied successfully to image classification and object detection tasks in recent years. The objective of this work was to propose a new method based on deep convolutional neural networks for the task of disease level monitoring. Common RGB images of winter wheat were obtained during a growing season. Five categories of disease level presence were produced, in collaboration with agronomists, for the algorithm classification. Disease level assessments performed by experts provided ground truth data for the disease score of the same winter wheat plots where RGB images were acquired. The system had an overall accuracy of 84% on the discrimination of the disease level classes.

Keywords: crop disease assessment, deep learning, precision agriculture, residual neural networks

Procedia PDF Downloads 334

5540 Designing Cultural-Creative Products with the Six Categories of Hanzi (Chinese Character Classification)

Authors: Pei-Jun Xue, Ming-Yu Hsiao

Abstract:

Chinese characters, or hanzi, represent a process of simplifying three-dimensional signs into plane signifiers. From pictograms at the beginning to logograms today, a Han linguist thus classified them into six categories known as the six categories of Chinese characters. Design is a process of signification, and cultural-creative design is a process translating ideas into design with creativity upon culture. Aiming to investigate the process of cultural-creative design transforming cultural text into cultural signs, this study analyzed existing cultural-creative products with the six categories of Chinese characters by treating such products as representations which accurately communicate the designer’s ideas to users through the categorization, simplification, and interpretation of sign features. This is a two-phase pilot study on designing cultural-creative products with the six categories of Chinese characters. Phase I reviews the related literature on the theory of the six categories of Chinese characters investigated and concludes with the process and principles of character evolution. Phase II analyzes the design of existing cultural-creative products with the six categories of Chinese characters and explores the conceptualization of product design.

Keywords: six categories of Chinese characters, cultural-creative product design, cultural signs, cultural product

Procedia PDF Downloads 344

5539 Effects of the Visual and Auditory Stimuli with Emotional Content on Eyewitness Testimony

Authors: İrem Bulut, Mustafa Z. Söyük, Ertuğrul Yalçın, Simge Şişman-Bal

Abstract:

Eyewitness testimony is one of the most frequently used methods in criminal cases for the determination of crime and perpetrator. In the literature, the number of studies about the reliability of eyewitness testimony is increasing. The study aims to reveal the factors that affect the short-term and long-term visual memory performance of the participants in the event of an accident. In this context, the effect of the emotional content of the accident and the sounds during the accident on visual memory performance was investigated with eye-tracking. According to the results, the presence of visual and auditory stimuli with emotional content during the accident decreases both the short-term and long-term recall performance of the participants. Moreover, the data obtained from the eye-tracking device showed that the participants had difficulty in answering even the questions about what they had focused on at the time of the accident.

Keywords: eye tracking, eyewitness testimony, long-term recall, short-term recall, visual memory

Procedia PDF Downloads 162

5538 Population Dynamics and Land Use/Land Cover Change on the Chilalo-Galama Mountain Range, Ethiopia

Authors: Yusuf Jundi Sado

Abstract:

Changes in land use are mostly credited to human actions that result in negative impacts on biodiversity and ecosystem functions. This study aims to analyze the dynamics of land use and land cover changes for sustainable natural resources planning and management in the Chilalo-Galama Mountain Range, Ethiopia. This study used Landsat 5 Thematic Mapper (TM) data for 1986 and 2001 and Landsat 8 (OLI) data for 2017. Additionally, data from the Central Statistics Agency on human population growth were analyzed. The Semi-Automatic Classification Plugin (SCP) in QGIS 3.2.3 software was used for image classification. Global positioning system, field observations and focus group discussions were used for ground verification. Land Use/Land Cover (LU/LC) change analysis was performed using maximum likelihood supervised classification, and changes were calculated for the 1986–2001, 2001–2017 and 1986–2017 periods. The results show that agricultural land increased from 27.85% (1986) to 44.43% and 51.32% in 2001 and 2017, respectively, with overall accuracies of 92% (1986), 90.36% (2001), and 88% (2017). On the other hand, forests decreased from 8.51% (1986) to 7.64% (2001) and 4.46% (2017), and grassland decreased from 37.47% (1986) to 15.22% and 15.01% in 2001 and 2017, respectively. The change matrix indicates that, for the years 1986–2017, the largest gain in agricultural land came from grassland. The matrix also shows that shrubland gained land from agricultural land, afro-alpine, and forest land. Population dynamics is found to be one of the major driving forces for the LU/LC changes in the study area.
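
The period-wise changes mentioned above reduce to simple differences in percentage points. The sketch below uses only the cover percentages quoted in the abstract. The dictionary layout and the function name `change` are illustrative assumptions.

```python
# Share of the study area (%) per class and year, as quoted in the abstract.
cover = {
    "agricultural land": {1986: 27.85, 2001: 44.43, 2017: 51.32},
    "forest": {1986: 8.51, 2001: 7.64, 2017: 4.46},
    "grassland": {1986: 37.47, 2001: 15.22, 2017: 15.01},
}


def change(by_year, start, end):
    """Change in cover between two years, in percentage points."""
    return round(by_year[end] - by_year[start], 2)


periods = [(1986, 2001), (2001, 2017), (1986, 2017)]
changes = {
    name: {period: change(by_year, *period) for period in periods}
    for name, by_year in cover.items()
}

for name, by_period in changes.items():
    for (start, end), delta in by_period.items():
        print(f"{name}: {start}-{end}: {delta:+.2f} percentage points")
```

These differences show only the net gain or loss per class. Which class the land came from, such as agricultural land gaining from grassland, requires the change matrix produced by comparing the classified images pixel by pixel.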

Keywords: Landsat, LU/LC change, Semi-Automatic classification plugin, population dynamics, Ethiopia

Procedia PDF Downloads 87

5537 Clinical Feature Analysis and Prediction on Recurrence in Cervical Cancer

Authors: Ravinder Bahl, Jamini Sharma

Abstract:

The paper presents an analysis of cervical cancer based on a probabilistic model. It involves a technique for classification and prediction that recognizes the typical and diagnostically most important test features relating to cervical cancer. The main contributions of the research include predicting the probability of recurrence in no-recurrence (first-time detection) cases. A combination of conventional statistical and machine learning tools is applied for the analysis. An experimental study with real data demonstrates the feasibility and potential of the proposed approach for this purpose.

Keywords: cervical cancer, recurrence, no recurrence, probabilistic, classification, prediction, machine learning

Procedia PDF Downloads 360

5536 Pragmatic Interpretation in Translated Texts

Authors: Jamal Alqinai

Abstract:

A pragmatic approach to translation studies the rules and principles governing the use of language over and above the rules of syntax or morphology, and what makes some uses of language more appropriate than others in [communicative] situations. It attempts to explain translation as a procedure and product from the point of view of how, why and what is done by the source text (ST) author and what is to be done in the target text (TT) rendition. The latter will be subject to evaluation not as generated by the linguistic system but as conveyed and manipulated by participants in a communicative situation according to the referential and pragmatic standards employed. The failure of a purely lexical or structural translation stems from ignoring the relation between words as signs and the effect they have on their users. A more refined approach would also consider those processes that are sometimes labeled extra-linguistic or intuitive and which translators strive to reproduce unscathed in the translation process. We need to grasp the kind of actions an ST author performs on his readers by combining linguistic and non-linguistic elements against a backdrop of beliefs and cultural values. In other words, aside from considering the cohesive ties at the textual level, one needs to understand how the whole ST discourse hangs together logically in order to reproduce a coherent TT. The latter can only be achieved by an analysis of the pragmatic elements of presuppositions, implicatures and acts performed in the ST. Establishing cohesive ties within a text may require seeking reference outside the immediate text. The illocutionary functions manifested in one language/culture are relatively autonomous cultural/linguistic categories, but are imaginable by members of other cultures and, to some extent, are translatable though not, of course, without translation loss.
Globalization and the spread of literacy worldwide may have created a universal empathy to comprehend the performative aspect of utterances when explained by approximate glosses or by paraphrase. Yet, it is often the multilayered and the culture-specific nature of illocutionary functions that de-universalize their possible interpretations. This paper addresses the pragmatic interpretation of culturally specific texts with examples adduced from a number of distinct settings to illustrate the influence of the pragmatic factors at stake.

Keywords: pragmatic, presupposition, implicature, cohesion

Procedia PDF Downloads 13

5535 Hate Speech Detection in Tunisian Dialect

Authors: Helmi Baazaoui, Mounir Zrigui

Abstract:

This study addresses the challenge of hate speech detection in Tunisian Arabic text, a critical issue for online safety and moderation. Leveraging the strengths of the AraBERT model, we fine-tuned and evaluated its performance against the Bi-LSTM model across four distinct datasets: T-HSAB, TNHS, TUNIZI-Dataset, and a newly compiled dataset with diverse labels such as Offensive Language, Racism, and Religious Intolerance. Our experimental results demonstrate that AraBERT significantly outperforms Bi-LSTM in terms of Recall, Precision, F1-Score, and Accuracy across all datasets. The findings underline the robustness of AraBERT in capturing the nuanced features of Tunisian Arabic and its superior capability in classification tasks. This research not only advances the technology for hate speech detection but also provides practical implications for social media moderation and policy-making in Tunisia. Future work will focus on expanding the datasets and exploring more sophisticated architectures to further enhance detection accuracy, thus promoting safer online interactions.

Keywords: hate speech detection, Tunisian Arabic, AraBERT, Bi-LSTM, Gemini annotation tool, social media moderation

Procedia PDF Downloads 16

5534 Smartphone Video Source Identification Based on Sensor Pattern Noise

Authors: Raquel Ramos López, Anissa El-Khattabi, Ana Lucila Sandoval Orozco, Luis Javier García Villalba

Abstract:

The growing number of mobile devices with integrated cameras means that most digital video now comes from these devices. These digital videos can be made anytime, anywhere and for different purposes. They can also be shared on the Internet in a short period of time and may sometimes contain recordings of illegal acts. The need to reliably trace their origin becomes evident when these videos are used for forensic purposes. This work proposes an algorithm to identify the brand and model of the mobile device that generated a video. The procedure is as follows: after obtaining the relevant video information, a classification algorithm based on sensor noise and the Wavelet Transform performs the identification. We also present experimental results that support the validity of the techniques used.

Keywords: digital video, forensics analysis, key frame, mobile device, PRNU, sensor noise, source identification

Procedia PDF Downloads 429

5533 Using Short Narrative Film to Drive Healthcare Policy: A Case Study

Authors: T. L. Granzyk, S. Scarborough, J. DeCosmo

Abstract:

The use of health-related or medical narratives has gained increasing anecdotal and research-based support as a successful device for changing health behavior and outcomes. These narratives, in the form of oral storytelling, short films, and educational documentaries, for example, are most effective when they include empathetic characters that transport viewers into the story and command both their attention and emotional response. This case study outlines how and why one large health system created a short narrative film for its internal Sepsis Awareness campaign, which told the dramatic story of a patient recovering from a missed sepsis diagnosis that left her a quad-amputee. Results include positive global anecdotal response to the film from healthcare professionals and patients, as well as use of the film to support legislation, ultimately passed, in favor of the formation of Sepsis Awareness Workgroups in Maryland. The authors conclude that narrative films can be used successfully to initiate healthcare legislation and to increase internal and external awareness of health-related areas in need of greater improvement and support. As such, healthcare leaders and stakeholders would benefit from learning how to intentionally create, cultivate, and curate narratives from within their own health systems that elicit an empathetic response.

Keywords: healthcare policy, healthcare narratives, sepsis awareness, short films

Procedia PDF Downloads 103

5532 Seasonal Short-Term Effect of Air Pollution on Cardiovascular Mortality in Belgium

Authors: Natalia Bustos Sierra, Katrien Tersago

Abstract:

It is well established that both extremes of temperature are associated with increased mortality and that air pollution is associated with temperature. This relationship is complex, and in countries with important seasonal variations in weather, such as Belgium, some effects can appear non-significant when the analysis is done over the entire year. We therefore analyzed the effect of short-term outdoor air pollution exposure on cardiovascular mortality during the warmer and colder months separately. We used daily cardiovascular deaths from acute cardiovascular diagnoses according to the International Classification of Diseases, 10th Revision (ICD-10: I20-I24, I44-I49, I50, I60-I66) during the period 2008-2013. The environmental data were population-weighted concentrations of particulates with an aerodynamic diameter less than 10 µm (PM₁₀) and less than 2.5 µm (PM₂.₅) (daily average), nitrogen dioxide (NO₂) (daily maximum of the hourly average) and ozone (O₃) (daily maximum of the 8-hour running mean). A generalized linear model was applied, adjusting for the confounding effect of season, temperature, dew point temperature, the day of the week, public holidays and the incidence of influenza-like illness (ILI) per 100,000 inhabitants. The relative risks (RR) were calculated for an increase of one interquartile range (IQR) of the air pollutant (µg/m³). These were presented for the four hottest months (June, July, August, September) and the four coldest months (November, December, January, February) in Belgium. We applied both individual lag models and unconstrained distributed lag models. The cumulative effect of a four-day exposure (day of exposure and three consecutive days) was calculated from the unconstrained distributed lag model. The IQRs for PM₁₀, PM₂.₅, NO₂, and O₃ were respectively 8.2, 6.9, 12.9 and 25.5 µg/m³ during warm months and 18.8, 17.6, 18.4 and 27.8 µg/m³ during cold months. 
The association with cardiovascular (CV) mortality was statistically significant for the four pollutants during warm months and only for NO₂ during cold months. During the warm months, the cumulative effect of an IQR increase of ozone for the age groups 25-64, 65-84 and 85+ was 1.066 (95% CI: 1.002-1.135), 1.041 (1.008-1.075) and 1.036 (1.013-1.058) respectively. The cumulative effect of an IQR increase of NO₂ for the age group 65-84 was 1.066 (1.020-1.114) during warm months and 1.096 (1.030-1.166) during cold months. The cumulative effect of an IQR increase of PM₁₀ during warm months reached 1.046 (1.011-1.082) and 1.038 (1.015-1.063) for the age groups 65-84 and 85+ respectively. Similar results were observed for PM₂.₅. The short-term effect of air pollution on cardiovascular mortality is greater during warm months, at lower pollutant concentrations, than during cold months. Spending more time outside during warm months increases population exposure to air pollution and can, therefore, be a confounding factor for this association. Age can also affect the length of time spent outdoors and the type of physical activity exercised. This study supports the deleterious effect of air pollution on CV mortality, which varies according to season and age group in Belgium. Public health measures should, therefore, be adapted to seasonality.
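The relative risks above are expressed per IQR increase of each pollutant. As a minimal sketch of how such figures are commonly obtained, assuming a log-link regression (the abstract only says "generalized linear model") with one coefficient per lag day and per 1 µg/m³: the cumulative RR exponentiates the sum of the lag coefficients scaled by the IQR. The function names and the coefficient values are hypothetical and are not taken from the study.

```python
import math

def rr_per_iqr(beta, iqr):
    """Relative risk for a one-IQR increase, given a log-link
    coefficient `beta` expressed per 1 µg/m³ of pollutant."""
    return math.exp(beta * iqr)

def cumulative_rr(lag_betas, iqr):
    """Cumulative relative risk over several lag days of an unconstrained
    distributed lag model: the lag coefficients are summed before
    scaling by the IQR and exponentiating."""
    return math.exp(sum(lag_betas) * iqr)

# Hypothetical coefficients for lag 0 to lag 3 (illustrative only).
lag_betas = [0.0010, 0.0008, 0.0005, 0.0002]
ozone_iqr_warm = 25.5  # µg/m³, warm-month IQR for O₃ quoted in the abstract

print(round(cumulative_rr(lag_betas, ozone_iqr_warm), 3))
```

Scaling by the IQR rather than by a fixed 10 µg/m³ step makes effects comparable across pollutants whose concentration ranges differ.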

Keywords: air pollution, cardiovascular, mortality, season

Procedia PDF Downloads 166

5531 Comparing Deep Architectures for Selecting Optimal Machine Translation

Authors: Despoina Mouratidis, Katia Lida Kermanidis

Abstract:

Machine translation (MT) is a very important task in Natural Language Processing (NLP). MT evaluation is crucial in MT development, as it constitutes the means to assess the success of an MT system, and also helps improve its performance. Several methods have been proposed for the evaluation of MT systems. Some of the most popular ones in automatic MT evaluation are score-based, such as the BLEU score, and others are based on lexical similarity or syntactic similarity between the MT outputs and the reference, involving higher-level information like part-of-speech (POS) tagging. This paper presents a language-independent machine learning framework for classifying pairwise translations. This framework uses vector representations of two machine-produced translations, one from a statistical machine translation (SMT) model and one from a neural machine translation (NMT) model. The vector representations consist of automatically extracted word embeddings and string-like language-independent features. These vector representations are used as input to a multi-layer neural network (NN) that models the similarity between each MT output and the reference, as well as between the two MT outputs. To evaluate the proposed approach, a professional translation and a "ground-truth" annotation are used. The parallel corpora used are English-Greek (EN-GR) and English-Italian (EN-IT), in the educational domain and of informal genres (video lecture subtitles, course forum text, etc.) that are difficult to translate reliably. We tested three basic deep learning (DL) architectures with this schema: (i) fully-connected dense, (ii) Convolutional Neural Network (CNN), and (iii) Long Short-Term Memory (LSTM). Experiments show that all tested architectures achieved better results than some well-known basic approaches, such as Random Forest (RF) and Support Vector Machine (SVM). 
Better accuracy results are obtained when LSTM layers are used in our schema. In terms of balance between the classes, better results are obtained when dense layers are used, because the model then correctly classifies more sentences of the minority class (SMT). For a more integrated analysis of the accuracy results, a qualitative linguistic analysis is carried out. In this context, problems were identified with some figures of speech, such as metaphors, and with certain linguistic phenomena related to etymology, such as paronyms. It would be interesting to investigate why all the classifiers produced worse accuracy results for Italian than for Greek, given that the linguistic features employed are language independent.

Keywords: machine learning, machine translation evaluation, neural network architecture, pairwise classification

Procedia PDF Downloads 133

5530 A Machine Learning Approach for Classification of Directional Valve Leakage in the Hydraulic Final Test

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Due to increasing cost pressure in global markets, artificial intelligence is becoming a technology that is decisive for competition. Predictive quality enables machinery and plant manufacturers to ensure product quality by using data-driven forecasts via machine learning models as a decision-making basis for test results. The use of cross-process Bosch production data along the value chain of hydraulic valves is a promising approach to classifying the quality characteristics of workpieces.

Keywords: predictive quality, hydraulics, machine learning, classification, supervised learning

Procedia PDF Downloads 232

5529 Time-Frequency Feature Extraction Method Based on Micro-Doppler Signature of Ground Moving Targets

Authors: Ke Ren, Huiruo Shi, Linsen Li, Baoshuai Wang, Yu Zhou

Abstract:

Since discriminative features are required for ground moving target classification, we propose a new feature extraction method based on the micro-Doppler signature. First, time-frequency analysis of measured data indicates that the time-frequency spectrograms of three kinds of ground moving targets, i.e., a single walking person, two walking people and a moving wheeled vehicle, are discriminative. Then, a three-dimensional time-frequency feature vector is extracted from the time-frequency spectrograms to capture these differences. Finally, a Support Vector Machine (SVM) classifier is trained with the proposed three-dimensional feature vector. The classification accuracy for assigning the measured ground moving targets to the three classes is found to be over 96%, which demonstrates the good discriminative ability of the proposed micro-Doppler feature.

Keywords: micro-Doppler, time-frequency analysis, feature extraction, radar target classification

Procedia PDF Downloads 406

5528 Extracting Actions with Improved Part of Speech Tagging for Social Networking Texts

Authors: Yassine Jamoussi, Ameni Youssfi, Henda Ben Ghezala

Abstract:

With the growing interest in social networking, the interaction of social actors has evolved into a source of knowledge in which it becomes possible to perform context-aware reasoning. Information extraction from social networks, especially Twitter and Facebook, is one of the problems in this area. To extract text from social networks, we need several lexical features and large-scale word clustering. We attempt to expand an existing tokenizer and to develop our own tagger in order to handle the incorrect words that currently appear on Facebook and Twitter. Our goal in this work is to benefit from the lexical features developed for Twitter and online conversational text in previous works, and to develop an extraction model for constructing a huge knowledge base built on actions.

Keywords: social networking, information extraction, part-of-speech tagging, natural language processing

Procedia PDF Downloads 305

5527 Clustering the Wheat Seeds Using SOM Artificial Neural Networks

Authors: Salah Ghamari

Abstract:

In this study, the ability of self-organizing map (SOM) artificial neural networks to cluster wheat seed varieties according to their morphological properties was considered. The SOM is a type of unsupervised competitive learning. Experimentally, five morphological features of 300 seeds (from three varieties: gaskozhen, Md and sardari) were obtained using image processing techniques. The results show that the artificial neural network performs well (90.33% accuracy) in classifying the wheat varieties despite the high similarity among them. The highest classification accuracy (100%) was achieved for sardari.
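As a rough illustration of the unsupervised competitive learning that a SOM performs, the sketch below trains a tiny one-dimensional map in pure Python. It is not the authors' implementation: the grid shape, decay schedules, parameter values and the two-feature toy data are all assumptions made for the example.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x
    (squared Euclidean distance)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)),
    )

def train_som(samples, n_units=3, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map and return the unit weight vectors."""
    rng = random.Random(seed)
    # Initialise each unit from a randomly chosen sample.
    weights = [list(rng.choice(samples)) for _ in range(n_units)]
    sigma0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)               # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.01  # neighbourhood width shrinks
        for x in rng.sample(samples, len(samples)):
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood on the 1-D grid of units.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights

# Hypothetical two-feature toy data (not the wheat seed measurements).
toy = [(1.0, 1.1), (0.9, 1.0), (5.0, 5.2), (5.1, 4.9), (9.0, 9.1), (8.8, 9.2)]
units = train_som(toy)
clusters = [best_matching_unit(units, x) for x in toy]
print(clusters)
```

Each sample is assigned to the cluster of its best-matching unit; with labelled seeds, a clustering accuracy such as the one reported above can then be estimated by comparing unit assignments with the known varieties.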

Keywords: artificial neural networks, clustering, self organizing map, wheat variety

Procedia PDF Downloads 658

5526 Enhancing the Recruitment Process through Machine Learning: An Automated CV Screening System

Authors: Kaoutar Ben Azzou, Hanaa Talei

Abstract:

Human resources is an important department in every organization, as it manages the life cycle of employees from recruitment and training to retirement or termination of contracts. The recruitment process starts with a job opening, followed by a selection of the best-fit candidates from all applicants. Matching the best profile to a job position requires manually reviewing many CVs, which takes hours of work and can sometimes lead to a suboptimal choice. The work presented in this paper aims at reducing the workload of HR personnel by automating the preliminary stages of the candidate screening process, thereby fostering a more streamlined recruitment workflow. The proposed tool is an automated system designed to help with the recruitment process by scanning candidates' CVs, extracting pertinent features, and employing machine learning algorithms to decide the most fitting job profile for each candidate. Our work employs natural language processing (NLP) techniques to identify and extract key features, such as education, work experience, and skills, from the unstructured text of a CV. Subsequently, the system uses these features to match candidates with job profiles by means of classification algorithms.

Keywords: automated recruitment, candidate screening, machine learning, human resources management

Procedia PDF Downloads 57

5525 SEM Image Classification Using CNN Architectures

Authors: Güzi̇n Ti̇rkeş, Özge Teki̇n, Kerem Kurtuluş, Y. Yekta Yurtseven, Murat Baran

Abstract:

A scanning electron microscope (SEM) is a type of electron microscope mainly used in nanoscience and nanotechnology. Automatic image recognition and classification are among the general areas of application concerning SEM. In line with these uses, the present paper proposes a deep learning algorithm that classifies SEM images into nine categories, delivered through an online application to simplify the process. The NFFA-EUROPE - 100% SEM data set, containing approximately 21,000 images, was used to train and test the algorithm, with 80% of the images used for training and 20% for testing. Validation was carried out using a separate data set obtained from the Middle East Technical University (METU) in Turkey. To increase accuracy, the Inception ResNet-V2 model was used with a fine-tuning approach. A confusion matrix showed that the coated-surface category has a negative effect on accuracy, since it contains other categories in the data set and thereby confuses the model when detecting category-specific patterns. For this reason, the coated-surface category was removed from the training data set, which increased accuracy up to 96.5%.
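The confusion-matrix check described above can be sketched in a few lines of Python. This is only an illustration of the general technique: the labels and predictions below are invented toy values, the class names other than the coated-surface category are hypothetical, and nothing here reproduces the paper's data or model.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs; this is the confusion
    matrix stored sparsely as a Counter."""
    return Counter(zip(y_true, y_pred))

def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    counts = confusion_counts(y_true, y_pred)
    totals = Counter(y_true)
    return {label: counts[(label, label)] / totals[label] for label in totals}

# Toy labels (illustrative only).
y_true = ["fibres", "fibres", "particles", "particles",
          "coated_surface", "coated_surface"]
y_pred = ["fibres", "coated_surface", "particles", "particles",
          "coated_surface", "fibres"]

matrix = confusion_counts(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
for label, value in sorted(recall.items()):
    print(f"{label}: {value:.2f}")
```

Off-diagonal entries that concentrate on one category, in both directions, are the kind of signal that would motivate removing or redefining that category.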

Keywords: convolutional neural networks, deep learning, image classification, scanning electron microscope

Procedia PDF Downloads 126

5524 Surface Induced Alteration of Nanosized Amorphous Alumina

Authors: A. Katsman, L. Bloch, Y. Etinger, Y. Kauffmann, B. Pokroy

Abstract:

Various nanosized amorphous alumina thin films with thicknesses in the range of 2.4-63.1 nm were deposited onto amorphous carbon and amorphous Si₃N₄ membrane grids. Transmission electron microscopy (TEM), electron energy loss spectroscopy (EELS), X-ray photoelectron spectroscopy (XPS) and differential scanning calorimetry (DSC) techniques were used to probe the size effect on the short-range order and on the amorphous-to-crystalline phase transition temperature. It was found that the short-range order changes as a function of size: the fraction of tetrahedral Al sites is greater in thinner amorphous films. This result correlates with the change of amorphous alumina density with film thickness demonstrated by the reflectivity experiments: thinner amorphous films have lower density. These effects are discussed in terms of surface reconstruction of the amorphous alumina films. The average atomic binding energy in the thin film layer decreases with decreasing thickness, while the average O-Al interatomic distance increases. The reconstruction of amorphous alumina is induced by the surface reconstruction, and the short-range order changes depend on the density. The decrease of the surface energy during reconstruction is the driving force of the alumina reconstruction (density change), which is followed by a relaxation process (short-range order change). The amorphous-to-crystalline phase transition temperature measured by DSC rises as the thickness decreases, from 997.6 °C for a 13.9 nm thick film to 1020.4 °C for a 2.7 nm thick film. This effect was attributed to the different film densities: the formation of nanovoids preceding and accompanying the crystallization process influences the crystallization rate and, by these means, the temperature of the crystallization peak.

Keywords: amorphous alumina, density, short range order, size effect

Procedia PDF Downloads 467