Search results for: text mining analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28671

Search results for: text mining analysis

28131 On Exploring Search Heuristics for improving the efficiency in Web Information Extraction

Authors: Patricia Jiménez, Rafael Corchuelo

Abstract:

Nowadays the World Wide Web is the most popular source of information that relies on billions of on-line documents. Web mining is used to crawl through these documents, collect the information of interest and process it by applying data mining tools in order to use the gathered information in the best interest of a business, what enables companies to promote theirs. Unfortunately, it is not easy to extract the information a web site provides automatically when it lacks an API that allows to transform the user-friendly data provided in web documents into a structured format that is machine-readable. Rule-based information extractors are the tools intended to extract the information of interest automatically and offer it in a structured format that allow mining tools to process it. However, the performance of an information extractor strongly depends on the search heuristic employed since bad choices regarding how to learn a rule may easily result in loss of effectiveness and/or efficiency. Improving search heuristics regarding efficiency is of uttermost importance in the field of Web Information Extraction since typical datasets are very large. In this paper, we employ an information extractor based on a classical top-down algorithm that uses the so-called Information Gain heuristic introduced by Quinlan and Cameron-Jones. Unfortunately, the Information Gain relies on some well-known problems so we analyse an intuitive alternative, Termini, that is clearly more efficient; we also analyse other proposals in the literature and conclude that none of them outperforms the previous alternative.

Keywords: information extraction, search heuristics, semi-structured documents, web mining.

Procedia PDF Downloads 330
28130 Performance Evaluation of an Ontology-Based Arabic Sentiment Analysis

Authors: Salima Behdenna, Fatiha Barigou, Ghalem Belalem

Abstract:

Due to the quick increase in the volume of Arabic opinions posted on various social media, Arabic sentiment analysis has become one of the most important areas of research. Compared to English, there is very little works on Arabic sentiment analysis, in particular aspect-based sentiment analysis (ABSA). In ABSA, aspect extraction is the most important task. In this paper, we propose a semantic aspect-based sentiment analysis approach for standard Arabic reviews to extract explicit aspect terms and identify the polarity of the extracted aspects. The proposed approach was evaluated using HAAD datasets. Experiments showed that the proposed approach achieved a good level of performance compared with baseline results. The F-measure was improved by 19% for the aspect term extraction tasks and 55% aspect term polarity task.

Keywords: sentiment analysis, opinion mining, Arabic, aspect level, opinion, polarity

Procedia PDF Downloads 153
28129 Lexical Semantic Analysis to Support Ontology Modeling of Maintenance Activities– Case Study of Offshore Riser Integrity

Authors: Vahid Ebrahimipour

Abstract:

Word representation and context meaning of text-based documents play an essential role in knowledge modeling. Business procedures written in natural language are meant to store technical and engineering information, management decision and operation experience during the production system life cycle. Context meaning representation is highly dependent upon word sense, lexical relativity, and sematic features of the argument. This paper proposes a method for lexical semantic analysis and context meaning representation of maintenance activity in a mass production system. Our approach constructs a straightforward lexical semantic approach to analyze facilitates semantic and syntactic features of context structure of maintenance report to facilitate translation, interpretation, and conversion of human-readable interpretation into computer-readable representation and understandable with less heterogeneity and ambiguity. The methodology will enable users to obtain a representation format that maximizes shareability and accessibility for multi-purpose usage. It provides a contextualized structure to obtain a generic context model that can be utilized during the system life cycle. At first, it employs a co-occurrence-based clustering framework to recognize a group of highly frequent contextual features that correspond to a maintenance report text. Then the keywords are identified for syntactic and semantic extraction analysis. The analysis exercises causality-driven logic of keywords’ senses to divulge the structural and meaning dependency relationships between the words in a context. The output is a word contextualized representation of maintenance activity accommodating computer-based representation and inference using OWL/RDF.

Keywords: lexical semantic analysis, metadata modeling, contextual meaning extraction, ontology modeling, knowledge representation

Procedia PDF Downloads 102
28128 Strategies to Enhance Compliance of Health and Safety Standards at the Selected Mining Industries in Limpopo Province, South Africa: Occupational Health Nurse’s Perspective

Authors: Livhuwani Muthelo

Abstract:

The health and safety of the miners in the South African mining industry are guided by the regulations and standards which are anticipated to promote a healthy work environment and fatalities. It is of utmost importance for the miners to comply with these regulations/standards to protect themselves from potential occupational health and safety risks, accidents, and fatalities. The purpose of this study was to develop and validate strategies to enhance compliance with the Health and safety standards within the mining industries of Limpopo province in South Africa. A mixed-method exploratory sequential research design was adopted. The population consisted of 5350 miners. Purposive sampling was used to select the participants in the qualitative strand and stratified random sampling in the quantitative strand. Semi-structured interviews were conducted among the occupational health nurse practitioners and the health and safety team. Thematic analysis was used to generate an understanding of the interviews. In the quantitative strand, a survey was conducted using a self-administered questionnaire. Data were analysed using SPSS version 26.0. A descriptive statistical test was used in the analysis of data including frequencies, means, and standard deviation. Cronbach's alpha test was used to measure internal consistency. The integrated results revealed that there are diverse experiences related to health and safety standards compliance among the mineworkers. The main findings were challenges related to leadership compliance and also related to the cost of maintaining safety, Miner's behavior-related challenges; the impact of non-compliance on the overall health of the miners was also described, the conflict between production and safety. Health and safety compliance is not just mere compliance with regulations and standards but a culture that warrants the miners and organization to take responsibility for their behavior and actions towards health and safety. Thus taking responsibility for your well-being and other miners.

Keywords: perceptions, compliance, health and safety, legislation, standards, miners

Procedia PDF Downloads 94
28127 Study on the Focus of Attention of Special Education Students in Primary School

Authors: Tung-Kuang Wu, Hsing-Pei Hsieh, Ying-Ru Meng

Abstract:

Special Education in Taiwan has been facing difficulties including shortage of teachers and lack in resources. Some students need to receive special education are thus not identified or admitted. Fortunately, information technologies can be applied to relieve some of the difficulties. For example, on-line multimedia courseware can be used to assist the learning of special education students and take pretty much workload from special education teachers. However, there may exist cognitive variations between students in special or regular educations, which suggests the design of online courseware requires different considerations. This study aims to investigate the difference in focus of attention (FOA) between special and regular education students of primary school in viewing the computer screen. The study is essential as it helps courseware developers in determining where to put learning elements that matter the most on the right position of screen. It may also assist special education specialists to better understand the subtle differences among various subtypes of learning disabilities. This study involves 76 special education students (among them, 39 are students with mental retardation, MR, and 37 are students with learning disabilities, LDs) and 42 regular education students. The participants were asked to view a computer screen showing a picture partitioned into 3 × 3 areas with each area filled with text or icon. The subjects were then instructed to mark on the prior given paper sheets, which are also partitioned into 3 × 3 grids, the areas corresponding to the pictures on the computer screen that they first set their eyes on. The data are then collected and analyzed. Major findings are listed: 1. In both text and icon scenario, significant differences exist in the first preferred FOA between special and regular education students. The first FOA for the former is mainly on area 1 (upper left area, 53.8% / 51.3% for MR / LDs students in text scenario; and 53.8% / 56.8% for MR / LDs students in icons scenario), while the latter on area 5 (middle area, 50.0% and 57.1% in text and icons scenarios). 2. The second most preferred area in text scenario for students with MR and LDs are area 2 (upper-middle, 20.5%) and 5 (middle area, 24.3%). In icons scenario, the results are similar, but lesser in percentage. 3. Students with LDs that show similar preference (either in text or icons scenarios) in FOA to regular education students tend to be of some specific sub-type of learning disabilities. For instance, students with LDs that chose area 5 (middle area, either in text or icon scenario) as their FOA are mostly ones that have reading or writing disability. Also, three (out of 13) subjects in this category, after going through the rediagnosis process, were excluded from being learning disabilities. In summary, the findings suggest when designing multimedia courseware for students with MR and LDs, the essential learning elements should be placed on area 1, 2 and 5. In addition, FOV preference may also potentially be used as an indicator for diagnosing students with LDs.

Keywords: focus of attention, learning disabilities, mental retardation, on-line multimedia courseware, special education

Procedia PDF Downloads 161
28126 The Affective Motivation of Women Miners in Ghana

Authors: Adesuwa Omorede, Rufai Haruna Kilu

Abstract:

Affective motivation (motivation that is emotionally laden usually related to affect, passion, emotions, moods) in the workplace stimulates individuals to reinforce, persist and commit to their task, which leads to the individual and organizational performance. This leads individuals to reach goals especially in situations where task are highly challenging and hostile. In such situations, individuals are more disposed to be more creative, innovative and see new opportunities from the loopholes in their workplace. However, when individuals feel displaced and less important, an adverse reaction may suffice which may be detrimental to the organization and its performance. One sector where affective motivation is eminently present and relevant, is the mining industry. Due to its intense work environment; mostly dominated by men and masculinity cultures; and deliberate exclusion of women in this environment which, makes the women working in these environments to feel marginalized. In Ghana, the mining industry is mostly seen as a very physical environment especially underground and mostly considerd as 'no place for a woman'. Despite the fact that these women feel less 'needed' or 'appreciated' in such environments, they still have to juggle between intense work shifts; face violence and other health risks with their families, which put a strain on their affective motivational reaction. Beyond these challenges, however, several mining companies in Ghana today are working towards providing a fair and equal working situation for both men and women miners, by recognizing them as key stakeholders, as well as including them in the stages of mining projects from the planning and designing phase to the evaluation and implementation stage. Drawing from the psychology and gender literature, this study takes a narrative approach to identify and understand the shifting gender dynamics within the mine works in Ghana, occasioning a change in background disposition of miners, which leads to more women taking up mine jobs in the country. In doing so, a qualitative study was conducted using semi-structured interviews from Ghana. Several women working within the mining industries in Ghana shared their experiences and how they felt and still feel in their workplace. In addition, archival documents were gathered to support the findings. The results suggest a change in enrolment regimes in a mining and technology university in Ghana, making room for a more gender equal enrolments in the university. A renowned university that train and feed mine work professional into the industry. The results further acknowledge gender equal and diversity recruitment policies and initiatives among the mining companies of Ghana. This study contributes to the psychology and gender literature by highlighting the hindrances women face in the mining industry as well as highlighting several of their affective reactions towards gender inequality. The study also provides several suggestions for decision makers in the mining industry of what can be done in the future to reduce the gender inequality gap within the industry.

Keywords: affective motivation, gender shape shifting, mining industry, women miners

Procedia PDF Downloads 292
28125 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data Technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through Vader and Roberta model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and TFIDF Vectorization, and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: counter vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis

Procedia PDF Downloads 81
28124 Sexting Phenomenon in Educational Settings: A Data Mining Approach

Authors: Koutsopoulou Ioanna, Gkintoni Evgenia, Halkiopoulos Constantinos, Antonopoulou Hera

Abstract:

Recent advances in Internet Computer Technology (ICT) and the ever-increasing use of technological equipment amongst adolescents and young adults along with unattended access to the internet and social media and uncontrolled use of smart phones and PCs have caused social problems like sexting to emerge. The main purpose of the present article is first to present an analytic theoretical framework of sexting as a recent social phenomenon based on studies that have been conducted the last decade or so; and second to investigate Greek students’ and also social network users, sexting perceptions and to record how often social media users exchange sexual messages and to retrace demographic variables predictors. Data from 1,000 students were collected and analyzed and all statistical analysis was done by the software package WEKA. The results indicate among others, that the use of data mining methods is an important tool to draw conclusions that could affect decision and policy making especially in the field and related social topics of educational psychology. To sum up, sexting lurks many risks for adolescents and young adults students in Greece and needs to be better addressed in relevance to the stakeholders as well as society in general. Furthermore, policy makers, legislation makers and authorities will have to take action to protect minors. Prevention strategies based on Greek cultural specificities are being proposed. This social problem has raised concerns in recent years and will most likely escalate concerns in global communities in the future.

Keywords: educational ethics, sexting, Greek sexters, sex education, data mining

Procedia PDF Downloads 180
28123 Application of Reception Theory to Analyze the Translation as a Continuous Reception

Authors: Mina Darabi Amin

Abstract:

In 1972, Hans Robert Jauss introduced the Reception Theory a version of Reader-response criticism, that suggests the literary critics to re-examine the relationship between the author, the work and the reader. The revealing of these relationships has shown that, besides the creation, the reception and the reading of the text have different levels which exempt it from a continuous reference to the meaning intended by the artist and could lead to multiplicity of possible interpretations according to the ‘Horizon of Expectations’. This theory could be associated with another intellectual process called ‘translation’, a process that is always confronted by different levels of readers in the target language and different levels of reception by these readers. By adopting the perspective of Reception theory in translation, we could ignore a particular kind of translation and consider the initiation to a literary text, its translation and its reception as a continuous process. Just like the creation of the text, the translation and its reception, are not made once and for all; they are confronted with different levels of reception and interpretation which are made and remade endlessly. After having known and crossing the first levels, the Horizons of Expectation could be extended and the reader could be initiated to the higher levels. On the other hand, we could say that the faithful and free translation are not opposed to each other, but depending on the type of reception by the readers and in a particular moment, the existence of both is necessary. In fact, it is the level of reception in readers and their Horizon of Expectations that determine the degree of fidelity and freedom of translation.

Keywords: reception theory, reading, literary translation, horizons of expectation, reader

Procedia PDF Downloads 173
28122 Research on the Risks of Railroad Receiving and Dispatching Trains Operators: Natural Language Processing Risk Text Mining

Authors: Yangze Lan, Ruihua Xv, Feng Zhou, Yijia Shan, Longhao Zhang, Qinghui Xv

Abstract:

Receiving and dispatching trains is an important part of railroad organization, and the risky evaluation of operating personnel is still reflected by scores, lacking further excavation of wrong answers and operating accidents. With natural language processing (NLP) technology, this study extracts the keywords and key phrases of 40 relevant risk events about receiving and dispatching trains and reclassifies the risk events into 8 categories, such as train approach and signal risks, dispatching command risks, and so on. Based on the historical risk data of personnel, the K-Means clustering method is used to classify the risk level of personnel. The result indicates that the high-risk operating personnel need to strengthen the training of train receiving and dispatching operations towards essential trains and abnormal situations.

Keywords: receiving and dispatching trains, natural language processing, risk evaluation, K-means clustering

Procedia PDF Downloads 76
28121 Metadiscourse in EFL, ESP and Subject-Teaching Online Courses in Higher Education

Authors: Maria Antonietta Marongiu

Abstract:

Propositional information in discourse is made coherent, intelligible, and persuasive through metadiscourse. The linguistic and rhetorical choices that writers/speakers make to organize and negotiate content matter are intended to help relate a text to its context. Besides, they help the audience to connect to and interpret a text according to the values of a specific discourse community. Based on these assumptions, this work aims to analyse the use of metadiscourse in the spoken performance of teachers in online EFL, ESP, and subject-teacher courses taught in English to non-native learners in higher education. In point of fact, the global spread of Covid 19 has forced universities to transition their in-class courses to online delivery. This has inevitably placed on the instructor a heavier interactional responsibility compared to in-class courses. Accordingly, online delivery needs greater structuring as regards establishing the reader/listener’s resources for text understanding and negotiating. Indeed, in online as well as in in-class courses, lessons are social acts which take place in contexts where interlocutors, as members of a community, affect the ways ideas are presented and understood. Following Hyland’s Interactional Model of Metadiscourse (2005), this study intends to investigate Teacher Talk in online academic courses during the Covid 19 lock-down in Italy. The selected corpus includes the transcripts of online EFL and ESP courses and subject-teachers online courses taught in English. The objective of the investigation is, firstly, to ascertain the presence of metadiscourse in the form of interactive devices (to guide the listener through the text) and interactional features (to involve the listener in the subject). Previous research on metadiscourse in academic discourse, in college students' presentations in EAP (English for Academic Purposes) lessons, as well as in online teaching methodology courses and MOOC (Massive Open Online Courses) has shown that instructors use a vast array of metadiscoursal features intended to express the speakers’ intentions and standing with respect to discourse. Besides, they tend to use directions to orient their listeners and logical connectors referring to the structure of the text. Accordingly, the purpose of the investigation is also to find out whether metadiscourse is used as a rhetorical strategy by instructors to control, evaluate and negotiate the impact of the ongoing talk, and eventually to signal their attitudes towards the content and the audience. Thus, the use of metadiscourse can contribute to the informative and persuasive impact of discourse, and to the effectiveness of online communication, especially in learning contexts.

Keywords: discourse analysis, metadiscourse, online EFL and ESP teaching, rhetoric

Procedia PDF Downloads 124
28120 A Study of Growth Factors on Sustainable Manufacturing in Small and Medium-Sized Enterprises: Case Study of Japan Manufacturing

Authors: Tadayuki Kyoutani, Shigeyuki Haruyama, Ken Kaminishi, Zefry Darmawan

Abstract:

Japan’s semiconductor industries have developed greatly in recent years. Many were started from a Small and Medium-sized Enterprises (SMEs) that found at a good circumstance and now become the prosperous industries in the world. Sustainable growth factors that support the creation of spirit value inside the Japanese company were strongly embedded through performance. Those factors were not clearly defined among each company. A series of literature research conducted to explore quantitative text mining about the definition of sustainable growth factors. Sustainable criteria were developed from previous research to verify the definition of the factors. A typical frame work was proposed as a systematical approach to develop sustainable growth factor in a specific company. Result of approach was review in certain period shows that factors influenced in sustainable growth was importance for the company to achieve the goal.

Keywords: SME, manufacture, sustainable, growth factor

Procedia PDF Downloads 243
28119 Customer Preference in the Textile Market: Fabric-Based Analysis

Authors: Francisca Margarita Ocran

Abstract:

Underwear, and more particularly bras and panties, are defined as intimate clothing. Strictly speaking, they enhance the place of women in the public or private satchel. Therefore, women's lingerie is a complex garment with a high involvement profile, motivating consumers to buy it not only by its functional utility but also by the multisensory experience it provides them. Customer behavior models are generally based on customer data mining, and each model is designed to answer questions at a specific time. Predicting the customer experience is uncertain and difficult. Thus, knowledge of consumers' tastes in lingerie deserves to be treated as an experiential product, where the dimensions of the experience motivating consumers to buy a lingerie product and to remain faithful to it must be analyzed in detail by the manufacturers and retailers to engage and retain consumers, which is why this research aims to identify the variables that push consumers to choose their lingerie product, based on an in-depth analysis of the types of fabrics used to make lingerie. The data used in this study comes from online purchases. Machine learning approach with the use of Python programming language and Pycaret gives us a precision of 86.34%, 85.98%, and 84.55% for the three algorithms to use concerning the preference of a buyer in front of a range of lingerie. Gradient Boosting, random forest, and K Neighbors were used in this study; they are very promising and rich in the classification of preference in the textile industry.

Keywords: consumer behavior, data mining, lingerie, machine learning, preference

Procedia PDF Downloads 83
28118 A Character Detection Method for Ancient Yi Books Based on Connected Components and Regressive Character Segmentation

Authors: Xu Han, Shanxiong Chen, Shiyu Zhu, Xiaoyu Lin, Fujia Zhao, Dingwang Wang

Abstract:

Character detection is an important issue for character recognition of ancient Yi books. The accuracy of detection directly affects the recognition effect of ancient Yi books. Considering the complex layout, the lack of standard typesetting and the mixed arrangement between images and texts, we propose a character detection method for ancient Yi books based on connected components and regressive character segmentation. First, the scanned images of ancient Yi books are preprocessed with nonlocal mean filtering, and then a modified local adaptive threshold binarization algorithm is used to obtain the binary images to segment the foreground and background for the images. Second, the non-text areas are removed by the method based on connected components. Finally, the single character in the ancient Yi books is segmented by our method. The experimental results show that the method can effectively separate the text areas and non-text areas for ancient Yi books and achieve higher accuracy and recall rate in the experiment of character detection, and effectively solve the problem of character detection and segmentation in character recognition of ancient books.

Keywords: CCS concepts, computing methodologies, interest point, salient region detections, image segmentation

Procedia PDF Downloads 124
28117 The Investigation of Enzymatic Activity in the Soils Under the Impact of Metallurgical Industrial Activity in Lori Marz, Armenia

Authors: T. H. Derdzyan, K. A. Ghazaryan, G. A. Gevorgyan

Abstract:

Beta-glucosidase, chitinase, leucine-aminopeptidase, acid phosphomonoestearse and acetate-esterase enzyme activities in the soils under the impact of metallurgical industrial activity in Lori marz (district) were investigated. The results of the study showed that the activities of the investigated enzymes in the soils decreased with increasing distance from the Shamlugh copper mine, the Chochkan tailings storage facility and the ore transportation road. Statistical analysis revealed that the activities of the enzymes were positively correlated (significant) to each other according to the observation sites which indicated that enzyme activities were affected by the same anthropogenic factor. The investigations showed that the soils were polluted with heavy metals (Cu, Pb, As, Co, Ni, Zn) due to copper mining activity in this territory. The results of Pearson correlation analysis revealed a significant negative correlation between heavy metal pollution degree (Nemerow integrated pollution index) and soil enzyme activity. All of this indicated that copper mining activity in this territory causing the heavy metal pollution of the soils resulted in the inhabitation of the activities of the enzymes which are considered as biological catalysts to decompose organic materials and facilitate the cycling of nutrients.

Keywords: Armenia, metallurgical industrial activity, heavy metal pollutionl, soil enzyme activity

Procedia PDF Downloads 291
28116 Reconstruction of Visual Stimuli Using Stable Diffusion with Text Conditioning

Authors: ShyamKrishna Kirithivasan, Shreyas Battula, Aditi Soori, Richa Ramesh, Ramamoorthy Srinath

Abstract:

The human brain, among the most complex and mysterious aspects of the body, harbors vast potential for extensive exploration. Unraveling these enigmas, especially within neural perception and cognition, delves into the realm of neural decoding. Harnessing advancements in generative AI, particularly in Visual Computing, seeks to elucidate how the brain comprehends visual stimuli observed by humans. The paper endeavors to reconstruct human-perceived visual stimuli using Functional Magnetic Resonance Imaging (fMRI). This fMRI data is then processed through pre-trained deep-learning models to recreate the stimuli. Introducing a new architecture named LatentNeuroNet, the aim is to achieve the utmost semantic fidelity in stimuli reconstruction. The approach employs a Latent Diffusion Model (LDM) - Stable Diffusion v1.5, emphasizing semantic accuracy and generating superior quality outputs. This addresses the limitations of prior methods, such as GANs, known for poor semantic performance and inherent instability. Text conditioning within the LDM's denoising process is handled by extracting text from the brain's ventral visual cortex region. This extracted text undergoes processing through a Bootstrapping Language-Image Pre-training (BLIP) encoder before it is injected into the denoising process. In conclusion, a successful architecture is developed that reconstructs the visual stimuli perceived and finally, this research provides us with enough evidence to identify the most influential regions of the brain responsible for cognition and perception.

Keywords: BLIP, fMRI, latent diffusion model, neural perception.

Procedia PDF Downloads 62
28115 Performance Analysis with the Combination of Visualization and Classification Technique for Medical Chatbot

Authors: Shajida M., Sakthiyadharshini N. P., Kamalesh S., Aswitha B.

Abstract:

Natural Language Processing (NLP) continues to play a strategic part in complaint discovery and medicine discovery during the current epidemic. This abstract provides an overview of performance analysis with a combination of visualization and classification techniques of NLP for a medical chatbot. Sentiment analysis is an important aspect of NLP that is used to determine the emotional tone behind a piece of text. This technique has been applied to various domains, including medical chatbots. In this, we have compared the combination of the decision tree with heatmap and Naïve Bayes with Word Cloud. The performance of the chatbot was evaluated using accuracy, and the results indicate that the combination of visualization and classification techniques significantly improves the chatbot's performance.

Keywords: sentimental analysis, NLP, medical chatbot, decision tree, heatmap, naïve bayes, word cloud

Procedia PDF Downloads 67
28114 Translation of Culture-Specific References in the Turkish Translation of Shakespeare's Macbeth

Authors: Feride Sumbul

Abstract:

Drama is a literary genre that mirrors the people and society and transfers the human nature and life to the reader or the audience within its own social-cultural structure. Each play takes on a new reality in the time and culture of the staging, and each performance actually brings a new interpretation to the play. Similarly, each translation adds a new meaning to the source text. In other words, the translated theatrical text transcends the boundaries of its language and culture and finds a new interpretation. Thus the translation of drama takes place as a transfer from one culture to another as a cross cultural communication. In this context, translating culture specific references play a key role in terms of reflecting cultural aspects of a target society. This study aims to explore the use of Venuti's translation principles of domestication and foreignization in the transfer of culture specific references in the Turkish translation of Shakespeare's Macbeth. Macbeth is to be compared with its Turkish version in terms of the transference of culture specific references such as religious, witchcraft, and mythological, which have no equivalent in the target language and culture. To evaluate these principles of Venuti, Davies’s translation strategies are also conducted. As a method, for the most part, he predominantly uses Davies’ method of ‘addition’ through adding extra information in the notes. For instance, rather than finding the Turkish renderings of them, the translator mostly chooses to transfer witchcraft references through retaining them in the target text, but he mainly adds extra information about the references in the notes. Therefore, the translator Nutku mostly uses Venuti’s translation principle of foreignization so that he preserves the foreignness of the theatrical text.

Keywords: drama translation, theatrical texts, culture specific references, Macbeth

Procedia PDF Downloads 150
28113 Data Mining in Healthcare for Predictive Analytics

Authors: Ruzanna Muradyan

Abstract:

Medical data mining is a crucial field in contemporary healthcare that offers cutting-edge tactics with enormous potential to transform patient care. This abstract examines how sophisticated data mining techniques could transform the healthcare industry, with a special focus on how they might improve patient outcomes. Healthcare data repositories have dynamically evolved, producing a rich tapestry of different, multi-dimensional information that includes genetic profiles, lifestyle markers, electronic health records, and more. By utilizing data mining techniques inside this vast library, a variety of prospects for precision medicine, predictive analytics, and insight production become visible. Predictive modeling for illness prediction, risk stratification, and therapy efficacy evaluations are important points of focus. Healthcare providers may use this abundance of data to tailor treatment plans, identify high-risk patient populations, and forecast disease trajectories by applying machine learning algorithms and predictive analytics. Better patient outcomes, more efficient use of resources, and early treatments are made possible by this proactive strategy. Furthermore, data mining techniques act as catalysts to reveal complex relationships between apparently unrelated data pieces, providing enhanced insights into the cause of disease, genetic susceptibilities, and environmental factors. Healthcare practitioners can get practical insights that guide disease prevention, customized patient counseling, and focused therapies by analyzing these associations. The abstract explores the problems and ethical issues that come with using data mining techniques in the healthcare industry. In order to properly use these approaches, it is essential to find a balance between data privacy, security issues, and the interpretability of complex models. Finally, this abstract demonstrates the revolutionary power of modern data mining methodologies in transforming the healthcare sector. Healthcare practitioners and researchers can uncover unique insights, enhance clinical decision-making, and ultimately elevate patient care to unprecedented levels of precision and efficacy by employing cutting-edge methodologies.

Keywords: data mining, healthcare, patient care, predictive analytics, precision medicine, electronic health records, machine learning, predictive modeling, disease prognosis, risk stratification, treatment efficacy, genetic profiles, precision health

Procedia PDF Downloads 50
28112 Lessons from Farmers Performing Agroforestry for Reclamation of Gold Mine Spoils in Colombia

Authors: Bibiana Betancur-Corredor, Juan Carlos Loaiza, Manfred Denich, Christian Borgemeister

Abstract:

Alluvial gold mining generates a vast amount of deposits that cover the natural soil and negatively impacts riverbeds and valleys, causing loss of livelihood opportunities for farmers of these regions. In Colombia, more than 79,000 ha are affected by alluvial gold mining, therefore developing strategies to return this land to productivity is of crucial importance for the country. A novel restoration strategy has been created by a mining company, where the land is restored through the establishment of agroforestry systems, in which agricultural crops and livestock are combined to complement reforestation in the area. The purpose of this study is to capture the knowledge of farmers who perform agroforestry in areas with deposits created by alluvial gold mining activities. Semi structured interviews were conducted with farmers with regard to the following: indicators of soil fertility, management practices, soil heterogeneity, pest outbreaks and weeds. In order to compare the perceptions of soil fertility of farmers with physicochemical properties of soils, the farmers were asked to identify spots within their farms that have exhibited good and poor yields. Soil samples were collected in order to correlate farmer’s perceptions with soil physicochemical properties. The findings suggest that the main challenge that farmers face is the identification of fertile soil for crop establishment. They identify the fertile soil through visually analyzing soil color and compaction as well as the use of spontaneous growth of specific plants as indicator of soil fertility. For less fertile areas, nitrogen fixing plants are used as green manure to restore soil fertility for crop establishment. The findings of this study imply that if gold mining is followed by reclamation practices that involve the successful establishment of productive farmlands, agricultural productivity of these lands might improve, increasing food security of the affected communities.

Keywords: agroforestry, knowledge, mining, restoration

Procedia PDF Downloads 228
28111 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets by themselves and now they can be accepted as a primary intellectual output of a research. The quality and usage of the datasets depend mainly on the context under which they have been collected, processed, analyzed, validated, and interpreted. This paper aims to present a collection of program educational objectives mapped to student’s outcomes collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time consuming process. In addition, it requires experts in the area, which are mostly not available. It has been shown the operational settings under which the collection has been produced. The collection has been cleansed, preprocessed, some features have been selected and preliminary exploratory data analysis has been performed so as to illustrate the properties and usefulness of the collection. At the end, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to other experimented multi-label classification methods. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.

Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining

Procedia PDF Downloads 164
28110 Main Cause of Children's Deaths in Indigenous Wayuu Community from Department of La Guajira: A Research Developed through Data Mining Use

Authors: Isaura Esther Solano Núñez, David Suarez

Abstract:

The main purpose of this research is to discover what causes death in children of the Wayuu community, and deeply analyze those results in order to take corrective measures to properly control infant mortality. We consider important to determine the reasons that are producing early death in this specific type of population, since they are the most vulnerable to high risk environmental conditions. In this way, the government, through competent authorities, may develop prevention policies and the right measures to avoid an increase of this tragic fact. The methodology used to develop this investigation is data mining, which consists in gaining and examining large amounts of data to produce new and valuable information. Through this technique it has been possible to determine that the child population is dying mostly from malnutrition. In short, this technique has been very useful to develop this study; it has allowed us to transform large amounts of information into a conclusive and important statement, which has made it easier to take appropriate steps to resolve a particular situation.

Keywords: malnutrition, data mining, analytical, descriptive, population, Wayuu, indigenous

Procedia PDF Downloads 156
28109 Pregnant Women in Substance Abuse: Transition of Characteristics and Mining of Association from Teds-a 2011 to 2018

Authors: Md Tareq Ferdous Khan, Shrabanti Mazumder, MB Rao

Abstract:

Background: Substance use during pregnancy is a longstanding public health problem that results in severe consequences for pregnant women and fetuses. Methods: Eight (2011-2018) datasets on pregnant women’s admissions are extracted from TEDS-A. Distributions of sociodemographic, substance abuse behaviors, and clinical characteristics are constructed and compared over the years for trends by the Cochran-Armitage test. Market basket analysis is used in mining the association among polysubstance abuse. Results: Over the years, pregnant woman admissions as the percentage of total and female admissions remain stable, where total annual admissions range from 1.54 to about 2 million with the female share of 33.30% to 35.61%. Pregnant women aged 21-29, 12 or more years of education, white race, unemployed, holding independent living status are among the most vulnerable. Concerns prevail on a significant number of polysubstance users, young age at first use, frequency of daily users, and records of prior admissions (60%). Trends of abused primary substances show a significant rise in heroin (66%) and methamphetamine (46%) over the years, although the latest year shows a considerable downturn. On the other hand, significant decreasing patterns are evident for alcohol (43%), marijuana or hashish (24%), cocaine or crack (23%), other opiates or synthetics (36%), and benzodiazepines (29%). Basket analysis reveals some patterns of co-occurrence of substances consistent over the years. Conclusions: This comprehensive study can work as a reference to identify the most vulnerable groups based on their characteristics and deal with the most hazardous substances from their evidence of co-occurrence.

Keywords: basket analysis, pregnant women, substance abuse, trend analysis

Procedia PDF Downloads 192
28108 A Design for Customer Preferences Model by Cluster Analysis of Geometric Features and Customer Preferences

Authors: Yuan-Jye Tseng, Ching-Yen Chen

Abstract:

In the design cycle, a main design task is to determine the external shape of the product. The external shape of a product is one of the key factors that can affect the customers’ preferences linking to the motivation to buy the product, especially in the case of a consumer electronic product such as a mobile phone. The relationship between the external shape and the customer preferences needs to be studied to enhance the customer’s purchase desire and action. In this research, a design for customer preferences model is developed for investigating the relationships between the external shape and the customer preferences of a product. In the first stage, the names of the geometric features are collected and evaluated from the data of the specified internet web pages using the developed text miner. The key geometric features can be determined if the number of occurrence on the web pages is relatively high. For each key geometric feature, the numerical values are explored using the text miner to collect the internet data from the web pages. In the second stage, a cluster analysis model is developed to evaluate the numerical values of the key geometric features to divide the external shapes into several groups. Several design suggestion cases can be proposed, for example, large model, mid-size model, and mini model, for designing a mobile phone. A customer preference index is developed by evaluating the numerical data of each of the key geometric features of the design suggestion cases. The design suggestion case with the top ranking of the customer preference index can be selected as the final design of the product. In this paper, an example product of a notebook computer is illustrated. It shows that the external shape of a product can be used to drive customer preferences. The presented design for customer preferences model is useful for determining a suitable external shape of the product to increase customer preferences.

Keywords: cluster analysis, customer preferences, design evaluation, design for customer preferences, product design

Procedia PDF Downloads 183
28107 An Experience of Translating an Excerpt from Sophie Adonon’s Echos de Femmes from French to English, Using Reverso.

Authors: Michael Ngongeh Mombe

Abstract:

This Paper seeks to investigate an assertion made by some colleagues that there is no need paying a human translator to translate their literary texts, that there are softwares such as Reverso that can be used to do the translation. The main objective of this study is to examine the veracity of this assertion using Reverso to translate a literary text without any post-editing by a human translator. The work is based on two theories: Skopos and Communicative theories of translation. The work is a documentary research where data were collected from published documents in libraries, on the internet and from the translation produced by Reverso. We made a comparative text analyses of both source and target texts in a bid to highlight the weaknesses and strengths of the software. Findings of this work revealed that those who advocate the use of only Machine translation do so in ignorance of the translation mistakes usually made by the software. From the review of all the 268 segments of translation, we found out that the translation produced by Reverso is fraught with errors. We therefore recommend the use of human translators to either do the translation of their literary texts or revise the translation produced by machine to conform to the skopos of the work. This paper is based on Reverso translation. Similar works in the near future will be based on the other translation softwares to determine their weaknesses and strengths.

Keywords: machine translation, human translator, Reverso, literary text

Procedia PDF Downloads 91
28106 Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering

Authors: Thanh Nguyen, Andrei Doncescu, Pierre Siegel

Abstract:

Classification is an important data mining technique and could be used as data filtering in artificial intelligence. The broad application of classification for all kind of data leads to be used in nearly every field of our modern life. Classification helps us to put together different items according to the feature items decided as interesting and useful. In this paper, we compare two classification methods Naïve Bayes and ADTree use to detect spam e-mail. This choice is motivated by the fact that Naive Bayes algorithm is based on probability calculus while ADTree algorithm is based on decision tree. The parameter settings of the above classifiers use the maximization of true positive rate and minimization of false positive rate. The experiment results present classification accuracy and cost analysis in view of optimal classifier choice for Spam Detection. It is point out the number of attributes to obtain a tradeoff between number of them and the classification accuracy.

Keywords: classification, data mining, spam filtering, naive bayes, decision tree

Procedia PDF Downloads 405
28105 Exploring Research Trends and Topics in Intervention on Metabolic Syndrome Using Network Analysis

Authors: Lee Soo-Kyoung, Kim Young-Su

Abstract:

This study established a network related to metabolic syndrome intervention by conducting a social network analysis of titles, keywords, and abstracts, and it identified emerging topics of research. It visualized an interconnection between critical keywords and investigated their frequency of appearance to construe the trends in metabolic syndrome intervention measures used in studies conducted over 38 years (1979–2017). It examined a collection of keywords from 8,285 studies using text rank analyzer, NetMiner 4.0. The analysis revealed 5 groups of newly emerging keywords in the research. By examining the relationship between keywords with reference to their betweenness centrality, the following clusters were identified. Thus if new researchers refer to existing trends to establish the subject of their study and the direction of the development of future research on metabolic syndrome intervention can be predicted.

Keywords: intervention, metabolic syndrome, network analysis, research, the trend

Procedia PDF Downloads 197
28104 Existential Feeling in Contemporary Chinese Novels: The Case of Yan Lianke

Authors: Thuy Hanh Nguyen Thi

Abstract:

Since 1940, existentialism has penetrated into China and continued to profoundly influence contemporary Chinese literature. By the method of deep reading and text analysis, this article analyzes the existential feeling in Yan Lianke’s novels through various aspects: the Sisyphus senses, the narrative rationalization and the viewpoint of the dead. In addition to pointing out the characteristics of the existential sensation in the writer’s novels, the analysis of the article also provides an insight into the nature and depth of contemporary Chinese society.

Keywords: Yan Lianke, existentialism, the existential feeling, contemporary Chinese literature

Procedia PDF Downloads 136
28103 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: early warning system, knowledge management, market prediction, topic modeling.

Procedia PDF Downloads 333
28102 Building an Integrated Relational Database from Swiss Nutrition National Survey and Swiss Health Datasets for Data Mining Purposes

Authors: Ilona Mewes, Helena Jenzer, Farshideh Einsele

Abstract:

Objective: The objective of the study was to integrate two big databases from Swiss nutrition national survey (menuCH) and Swiss health national survey 2012 for data mining purposes. Each database has a demographic base data. An integrated Swiss database is built to later discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied with food consumption. Design: Swiss nutrition national survey (menuCH) with approx. 2000 respondents from two different surveys, one by Phone and the other by questionnaire along with Swiss health national survey 2012 with 21500 respondents were pre-processed, cleaned and finally integrated to a unique relational database. Results: The result of this study is an integrated relational database from the Swiss nutritional and health databases.

Keywords: health informatics, data mining, nutritional and health databases, nutritional and chronical databases

Procedia PDF Downloads 105