Search results for: text mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2218

1108 Generating Product Description with Generative Pre-Trained Transformer 2

Authors: Minh-Thuan Nguyen, Phuong-Thai Nguyen, Van-Vinh Nguyen, Quang-Minh Nguyen

Abstract:

Research on automatically generating descriptions for e-commerce products has gained increasing attention in recent years. However, the descriptions generated by existing systems are often less informative and attractive, either because training datasets are lacking or because of the limitations of these approaches, which often use templates or statistical methods. In this paper, we explore a method to generate product descriptions using the GPT-2 model. In addition, we apply text paraphrasing and task-adaptive pretraining techniques to improve the quality of the descriptions generated by the GPT-2 model. Experimental results show that our models outperform the baseline model in both automatic and human evaluation. In particular, our methods achieve promising results not only on the seen test set but also on the unseen test set.

Keywords: GPT-2, product description, transformer, task-adaptive, language model, pretraining

Downloads: 178
1107 Lecture Video Indexing and Retrieval Using Topic Keywords

Authors: B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa

Abstract:

In this paper, we propose a framework to help users search for and retrieve the portions of a lecture video that interest them. This is achieved by temporally segmenting and indexing the lecture video using topic keywords. For this purpose, we use transcribed text from the video and documents relevant to the video topic extracted from the web. The keywords for indexing are found by applying non-negative matrix factorization (NMF) topic modeling to the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end times of the portions of the video covering a particular topic. This time information is stored in the index table along with the topic keyword, which is used to retrieve the specific portions of the video for the query provided by the user.
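The index table described in this abstract can be pictured with a minimal sketch. It is not the authors' implementation: the function name `build_topic_index`, the segment format, and the toy transcript are all assumptions made for illustration, and the topic keywords are simply given rather than derived with NMF.

```python
from collections import defaultdict

def build_topic_index(segments, topic_keywords):
    """Map each topic keyword to the (start, end) time spans where it occurs.

    segments: list of (start_sec, end_sec, text) transcript pieces in time order.
    topic_keywords: lowercase keywords, assumed to come from a topic model
    (hypothetical input; the topic modeling step is not shown here).
    """
    index = defaultdict(list)
    for start, end, text in segments:
        words = set(text.lower().split())
        for kw in topic_keywords:
            if kw in words:
                spans = index[kw]
                # Extend the previous span when this segment directly follows it.
                if spans and spans[-1][1] == start:
                    spans[-1] = (spans[-1][0], end)
                else:
                    spans.append((start, end))
    return dict(index)

# Invented toy transcript, for illustration only.
segments = [
    (0, 30, "today we introduce gradient descent"),
    (30, 60, "gradient steps depend on the learning rate"),
    (60, 90, "next topic is regularization"),
    (90, 120, "back to the gradient for a moment"),
]
index = build_topic_index(segments, ["gradient", "regularization"])
```

A query for a keyword then reduces to a dictionary lookup that returns the time spans to play back.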

Keywords: video indexing and retrieval, lecture videos, content based video search, multimodal indexing

Downloads: 234
1106 Monitoring the Pollution Status of the Goan Coast Using Genotoxicity Biomarkers in the Bivalve, Meretrix ovum

Authors: Avelyno D'Costa, S. K. Shyama, M. K. Praveen Kumar

Abstract:

The coast of Goa, India receives constant anthropogenic stress through its major rivers which carry mining rejects of iron and manganese ores from upstream mining sites and petroleum hydrocarbons from shipping and harbor-related activities which put the aquatic fauna such as bivalves at risk. The present study reports the pollution status of the Goan coast by the above xenobiotics employing genotoxicity studies. This is further supplemented by the quantification of total petroleum hydrocarbons (TPHs) and various trace metals (iron, manganese, copper, cadmium, and lead) in gills of the estuarine clam, Meretrix ovum as well as from the surrounding water and sediment, over a two-year sampling period, from January 2013 to December 2014. Bivalves were collected from a probable unpolluted site at Palolem and a probable polluted site at Vasco, based upon the anthropogenic activities at these sites. Genotoxicity was assessed in the gill cells using the comet assay and micronucleus test. The quantity of TPHs and trace metals present in gill tissue, water and sediments were analyzed using spectrofluorometry and atomic absorption spectrophotometry (AAS), respectively. The statistical significance of data was analyzed employing Student’s t-test. The relationship between DNA damage and pollutant concentrations was evaluated using multiple regression analysis. Significant DNA damage was observed in the bivalves collected from Vasco which is a region of high industrial activity. Concentrations of TPHs and trace metals (iron, manganese, and cadmium) were also found to be significantly high in gills of the bivalves collected from Vasco compared to those collected from Palolem. Further, the concentrations of these pollutants were also found to be significantly high in the water and sediments at Vasco compared to that of Palolem. This may be due to the lack of industrial activity at Palolem. 
A high positive correlation was observed between the pollutant levels and DNA damage in the bivalves collected from Vasco suggesting the genotoxic nature of these pollutants. Further, M. ovum can be used as a bioindicator species for monitoring the level of pollution of the estuarine/coastal regions by TPHs and trace metals.

Keywords: comet assay, metals, micronucleus test, total petroleum hydrocarbons

Downloads: 219
1105 Madame Bovary in Transit: from Novel to Graphic Novel

Authors: Hania Pasandi

Abstract:

Since its publication in 1856, Madame Bovary has established itself as one of the most adapted texts of French literature. Some eighteen film adaptations and twenty-seven rewritings of Madame Bovary in fiction to date show a great enthusiasm for recreating Flaubert’s masterpiece in a variety of media. Posy Simmonds’ 1999 graphic novel, Gemma Bovery, stands out among these adaptations because the graphic novel, with its visual and narrative structure, offers a new reading experience of Madame Bovary while combining elements of Emma Bovary with contemporary social, cultural, and artistic discourses. This paper studies the transposition of Flaubert’s Madame Bovary (1857) to late twentieth-century Britain in Gemma Bovery by exploring how it borrows the essential Flaubertian themes from its source text and incorporates them into contemporary cultural trends.

Keywords: graphic novel, Gemma Bovery, Madame Bovary, transposition

Downloads: 137
1104 Grammatical Parallelism in the Qurʼān

Authors: Yehudit Dror

Abstract:

Parallelism, or as it is called in Arabic, al-muqābala, occupies a central position in the rhetorical discipline of ʻilm al-bayān. Parallelism is used as a figure of textual ornamentation or embellishment and can be divided into several types based on the semantics of the parallelism and its formative structure. Parallelism in Arabic has received considerable attention from Arab rhetoricians, which makes it possible to understand the essence of parallelism in Arabic: its types, structure and meaning. However, there are some lacunae in their descriptions concerning the function and thematic restrictions of parallelism in the Qurʼān. In my presentation, which focuses on grammatical parallelism, where the two stichoi of the parallelism are the same with respect to syntax and morphology, I will show that parallelism has some important roles in the textual arrangement; it may, for example, conclude a thematic section, indicate a turning point in the text or clarify what has been said previously. In addition, it will be shown that parallelism is not used randomly in the Qurʼān but rather is restricted to repeated themes which carry the most important messages of the Qurʼān, such as God's Might or behavioral patterns of the believers and the non-believers; or it can be used as a stylistic device.

Keywords: grammatical parallelism, half-line, symmetry, Koran

Downloads: 315
1103 Spatial Setting in Translation: A Comparative Evaluation of Translations from Pre-Islamic Poetry

Authors: Raja Lahiani

Abstract:

This study scrutinises English and French translations of references to locations in the desert of pre-Islamic Arabia. These references are used in the Source Text (ST) within a poetic image. Reference is made to the names of three different mountains in Arabia, namely Qatan, Sitar, and Yadhbul. As these mountains are referred to in the context of the poet’s description of the density and expansion of the clouds, it is crucial to know that while Sitar and Yadhbul are close to each other, Qatan is far away from them. This distance allowed the poet to describe the expansion of the clouds. It reflects the spacious place (the desert) he depicted and the fact that he could physically see what he described. The purpose of this image is for the poet to communicate the vastness of the space he managed to see in a moment of contemplation. Thus, knowledge of this characteristic of the setting is essential for the receiver to understand the communicative function of the verse. A corpus of eighteen translations is gathered, varying between verse and prose renderings. The methodology adopted in this research is comparative. Comparison is conducted at both the synchronic and diachronic levels; every translation is compared to the ST and then to previous translations. The comparative work shows that translators who target historical facts do not necessarily succeed in preserving the image of the ST. It also shows that the more recent the translation, the deeper the translator’s awareness of the link between imagery, setting, and point of view. From the late eighteenth century to the present, pre-Islamic poetry has been translated into Western languages. Translators differ as to motives, sources, priorities and intellectual backgrounds. A translator's skopoi undoubtedly affect the way s/he handles aspects of the ST. 
When it comes to culture-specific aspects and details related to setting, the problem is even more complex. Setting is a very important factor that reveals a great deal of the culture of pre-Islamic Arabia as this is remote in place, historical framework and literary tradition from its translators. History is present in pre-Islamic poetry, which justifies the important literature that has been written to extract information and data from it. These are imbedded not only by signalling given facts, events, and meditations but also by means of references to specific locations and landmarks that used to exist at the time. Spatial setting is an integral part of a literary text as it places it within its historical context. The importance of the translator’s awareness of spatial anthropological data before indulging in the process of translation is tested. This is also crucial in measuring the effect of setting loss and setting gain in translation. The findings of this research would ultimately evaluate the extent to which a comparative methodology is reliable in investigating the role of spatial setting awareness in translation.

Keywords: historical context, translation, comparative literature, spatial setting

Downloads: 235
1102 The Paralinguistic Function of Emojis in Twitter Communication

Authors: Yasmin Tantawi, Mary Beth Rosson

Abstract:

In response to the dearth of information about emoji use for different purposes in different settings, this paper investigates the paralinguistic function of emojis within Twitter communication in the United States. To conduct this investigation, the Twitter feeds from 16 population centers spread throughout the United States were collected from the Twitter public API. One hundred tweets were collected from each population center, totaling 1,600 tweets. Tweets containing emojis were then extracted using the “emot” Python package and analyzed via the IBM Watson API Natural Language Understanding module to identify the topics discussed. A manual content analysis was then conducted to ascertain the paralinguistic and emotional features of the emojis used in these tweets. We present our characterization of emoji usage on Twitter and discuss implications for the design of Twitter and other text-based communication tools.

Keywords: computer-mediated communication, content analysis, paralinguistics, sociology

Downloads: 149
1101 Performance Analysis with the Combination of Visualization and Classification Technique for Medical Chatbot

Authors: Shajida M., Sakthiyadharshini N. P., Kamalesh S., Aswitha B.

Abstract:

Natural Language Processing (NLP) continues to play a strategic part in complaint discovery and medicine discovery during the current epidemic. This abstract provides an overview of a performance analysis that combines visualization and classification techniques of NLP for a medical chatbot. Sentiment analysis is an important aspect of NLP that is used to determine the emotional tone behind a piece of text. This technique has been applied to various domains, including medical chatbots. In this work, we compare the combination of a decision tree with a heatmap against the combination of Naïve Bayes with a word cloud. The performance of the chatbot was evaluated using accuracy, and the results indicate that combining visualization and classification techniques significantly improves the chatbot's performance.

Keywords: sentiment analysis, NLP, medical chatbot, decision tree, heatmap, naïve Bayes, word cloud

Downloads: 55
1100 EDM for Prediction of Academic Trends and Patterns

Authors: Trupti Diwan

Abstract:

Predicting student failure at school has become a difficult challenge due to both the large number of factors that can affect poor student performance and the imbalanced nature of these kinds of datasets. This paper surveys the two elements needed to predict students’ academic performance: parameters and methods. It also proposes a framework for predicting the performance of engineering students. Genetic programming can be used to predict student failure or success, and a ranking algorithm is used to rank students according to their credit points. The framework can serve as a basis for system implementation and for predicting students’ academic performance in higher learning institutions.

Keywords: classification, educational data mining, student failure, grammar-based genetic programming

Downloads: 405
1099 Causal Relation Identification Using Convolutional Neural Networks and Knowledge Based Features

Authors: Tharini N. de Silva, Xiao Zhibo, Zhao Rui, Mao Kezhi

Abstract:

Causal relation identification is a crucial task in information extraction and knowledge discovery. In this work, we present two approaches to causal relation identification. The first is a classification model trained on a set of knowledge-based features. The second is a deep learning based approach training a model using convolutional neural networks to classify causal relations. We experiment with several different convolutional neural networks (CNN) models based on previous work on relation extraction as well as our own research. Our models are able to identify both explicit and implicit causal relations as well as the direction of the causal relation. The results of our experiments show a higher accuracy than previously achieved for causal relation identification tasks.

Keywords: causal relation extraction, relation extraction, convolutional neural network, text representation

Downloads: 701
1098 Searching for Health-Related Information on the Internet: A Case Study on Young Adults

Authors: Dana Weimann Saks

Abstract:

This study aimed to examine the use of the internet as a source of health-related information (HRI), as well as the change in attitudes following the online search for HRI. The current study sample included 88 participants, randomly divided into two experimental groups. One was given the name of an unfamiliar disease and told to search for information about it using various search engines, and the second was given a text about the disease from a credible scientific source. The study findings show a large percentage of participants used the internet as a source of HRI. Likewise, no differences were found in the extent to which the internet was used as a source of HRI when demographics were compared. Those who searched for the HRI on the internet had more negative opinions and believed symptoms of the disease were worse than the average opinion among those who obtained the information about the disease from a credible scientific source. The Internet clearly influences the participants’ beliefs, regardless of demographic differences.

Keywords: health-related information, internet, young adults, HRI

Downloads: 105
1097 Active Control Improvement of Smart Cantilever Beam by Piezoelectric Materials and On-Line Differential Artificial Neural Networks

Authors: P. Karimi, A. H. Khedmati Bazkiaei

Abstract:

The main goal of this study is to test a differential neural network as a controller of a smart structure and to enumerate its advantages and disadvantages in comparison with other controllers. The smart structure is modeled as an Euler-Bernoulli cantilever beam, and an attempt is made to control it using a neural network driven by the vibration resulting from its movement. A linear observer is used as a reference controller, and its results are compared with those of the neural network. The vibration charts considered and the controlled state are presented in the final part of this text. The results show that the neural observer performs better than the implemented linear observer.

Keywords: smart material, on-line differential artificial neural network, active control, finite element method

Downloads: 194
1096 Analysis on Thermococcus achaeans with Frequent Pattern Mining

Authors: Jeongyeob Hong, Myeonghoon Park, Taeson Yoon

Abstract:

Since the discovery of archaea, which use different metabolic pathways and have a conspicuously different cellular structure, they have been recognized as possible resources for improving the quality of human life. In this paper, we compare the 16S rRNA sequences of four species of Thermococcus, an archaeal genus specialized in sulfur metabolism. The four species, T. barophilus, T. kodakarensis, T. hydrothermalis, and T. onnurineus, live near hydrothermal vents that emit extreme amounts of sulfur and heat. By comparing the ribosomal sequences of these four species, we found similarities in their sequences and expressed proteins, which lead us to expect that certain ribosomal sequences or proteins are vital for their survival. The Apriori algorithm and decision trees were used for the comparison.

Keywords: archaea, Thermococcus, apriori algorithm, decision tree

Downloads: 277
1095 A Polyphonic Look at Trends

Authors: Turquesa Topper

Abstract:

This reflection records and explains the considerations, conceptualizations, and methodological approach with which the University, that is, the academic field, addresses the study of trends with the intention of training professionals in the area, an area that requires disciplinary boundaries and builds a polyphonic vision. Since the objective of our Laboratory is the detection of aesthetic trends of consumption, we are required to define our object: trends, and more specifically aesthetic trends of consumption. The pages present a conception of trends from a theoretical framework that incorporates contributions from linguistics, semiotics, sociology, cultural studies, and project disciplines in order to consolidate a polyphonic look. The text investigates the pre-discursive aspect of trends, the circulation of the notion of style, and the dynamics of affirmation and denial as the constitutive dynamics of Fashion linked to any process of innovation. From this inquiry, Fashion is presented as a system that operates directly on the construction of socio-individual identities, unfolding through the liquefaction of signs in trends.

Keywords: fashion, methodology, narrative, trends

Downloads: 234
1094 Research on the Rewriting and Adaptation in the English Translation of the Analects

Authors: Jun Xu, Haiyan Xiao

Abstract:

The Analects (Lunyu) is one of the most recognized Confucian classics and one of the earliest Chinese classics that have been translated into English and known to the West. Research on the translation of The Analects has witnessed a transfer from the comparison of the text and language to a wider description of social and cultural contexts. Mainly on the basis of Legge and Waley’s translations of The Analects, this paper integrates Lefevere’s theory of rewriting and Verschueren’s theory of adaptation and explores the influence of ideology and poetics on the translation. It analyses how translators make adaptive decisions in the manipulation of ideology and poetics. It is proved that the English translation of The Analects is the translators’ initiative rewriting of the original work, which is a selective and adaptive process in the multi-layered contexts of the target language. The research on the translation of classics should include both the manipulative factors and translator’s initiative as well.

Keywords: The Analects, ideology, poetics, rewriting, adaptation

Downloads: 258
1093 Scalable Learning of Tree-Based Models on Sparsely Representable Data

Authors: Fares Hedayatit, Arnauld Joly, Panagiotis Papadimitriou

Abstract:

Many machine learning tasks, such as text annotation, require training over very large datasets, e.g., millions of web documents, that can be represented in a sparse input space. State-of-the-art tree-based ensemble algorithms cannot scale to such datasets, since they include operations whose running time is a function of the input space size rather than of the number of non-zero input elements. In this paper, we propose an efficient splitting algorithm to leverage input sparsity within decision tree methods. Our algorithm improves training time over sparse datasets by more than two orders of magnitude, and it has been incorporated in the current version of scikit-learn, the most popular open source Python machine learning library.
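The central idea, doing work proportional to the non-zero entries rather than to the full input size, can be illustrated with a small sketch. This is not the splitting algorithm of the paper or of scikit-learn: `split_counts` is a hypothetical helper that only counts how many samples fall on each side of a threshold for one sparse feature column.

```python
def split_counts(nonzero_values, n_samples, threshold):
    """Count samples with feature value <= threshold and > threshold.

    nonzero_values: the explicitly stored non-zero entries of one feature column.
    n_samples: total number of samples; all other entries are implicit zeros.
    The loop touches only the non-zero entries, never the implicit zeros.
    """
    n_zero = n_samples - len(nonzero_values)
    left = sum(1 for v in nonzero_values if v <= threshold)
    if threshold >= 0:
        left += n_zero  # every implicit zero satisfies 0 <= threshold
    return left, n_samples - left

# One million samples, only three of which have a non-zero value (toy data).
left, right = split_counts([2.5, -1.0, 4.0], n_samples=1_000_000, threshold=1.0)
neg_left, neg_right = split_counts([2.5, -1.0, 4.0], n_samples=1_000_000, threshold=-0.5)
```

The implicit zeros are handled as a single block with one comparison, which is what keeps the cost independent of `n_samples`.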

Keywords: big data, sparsely representable data, tree-based models, scalable learning

Downloads: 244
1092 Information Retrieval for Kafficho Language

Authors: Mareye Zeleke Mekonen

Abstract:

The Kafficho language poses distinct challenges for information retrieval because of its limited resources and the dearth of standardized methods. In this work, with the cooperation and support of linguists and native speakers, we investigate the creation of an information retrieval system specifically designed for the Kafficho language. The system allows Kafficho speakers to access information efficiently and effectively. Our objective is to conduct an information retrieval experiment using 220 Kafficho text files and fifteen sample questions. Tokenization, normalization, stop word removal, stemming, and other pre-processing tasks, together with term weighting, were prerequisites for representing each document and each query in the vector space model. The three well-known evaluation metrics we used for our work were precision, recall, and F-measure, with values of 87%, 28%, and 35%, respectively. These results show how the Kafficho information retrieval system performed when using the vector space model.
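As a rough illustration of the vector space model and the metrics named in this abstract, the following sketch ranks tokenized documents by TF-IDF cosine similarity and scores the result. It is a generic textbook formulation, not the authors' system: the helper names, the English toy documents, and the relevance judgment are assumptions, and stemming and stop word removal are left out.

```python
import math
from collections import Counter

def idf_table(docs):
    """Inverse document frequency for every term in the tokenized collection."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}

def vectorize(tokens, idf):
    """TF-IDF weights as a {term: weight} dict; unknown terms are dropped."""
    return {t: tf * idf[t] for t, tf in Counter(tokens).items() if t in idf}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def precision_recall_f1(retrieved, relevant):
    hits = len(set(retrieved) & set(relevant))
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Invented toy collection (already tokenized), for illustration only.
docs = [
    ["coffee", "farm", "harvest"],
    ["coffee", "market", "price"],
    ["river", "forest", "bird"],
]
idf = idf_table(docs)
doc_vecs = [vectorize(d, idf) for d in docs]
query_vec = vectorize(["coffee", "price"], idf)

scores = [cosine(query_vec, dv) for dv in doc_vecs]
# Retrieve every document with a non-zero score, best match first.
ranked = [i for i in sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
          if scores[i] > 0]
p, r, f = precision_recall_f1(ranked, relevant={1})
```

Here the F-measure is the harmonic mean of precision and recall for a single query; averages reported over many queries need not satisfy that relation exactly.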

Keywords: Kafficho, information retrieval, stemming, vector space

Downloads: 33
1091 Interactive Image Search for Mobile Devices

Authors: Komal V. Aher, Sanjay B. Waykar

Abstract:

Nowadays nearly everyone carries a mobile device. Image search is currently a hot topic in both computer vision and information retrieval, with many applications. The proposed intelligent image search system fully utilizes the multimodal and multi-touch functionalities of smartphones, allowing search by image, voice, and text on mobile phones. The system is most useful for users who already have pictures in mind but have no proper descriptions or names for them. The paper presents a system able to form a composite visual query that expresses the user’s intention more clearly, which helps return more precise or appropriate results. The proposed algorithm is expected to improve considerably on several aspects. The system also uses a context-based image retrieval scheme to give significant outcomes. As a result, the system achieves gains in search performance, accuracy, and user satisfaction.

Keywords: color space, histogram, mobile device, mobile visual search, multimodal search

Downloads: 352
1090 Exploring Research Trends and Topics in Intervention on Metabolic Syndrome Using Network Analysis

Authors: Lee Soo-Kyoung, Kim Young-Su

Abstract:

This study established a network related to metabolic syndrome intervention by conducting a social network analysis of titles, keywords, and abstracts, and it identified emerging topics of research. It visualized the interconnections between critical keywords and investigated their frequency of appearance to construe the trends in metabolic syndrome intervention measures used in studies conducted over 38 years (1979–2017). It examined a collection of keywords from 8,285 studies using a text rank analyzer, NetMiner 4.0. The analysis revealed 5 groups of newly emerging keywords in the research. By examining the relationship between keywords with reference to their betweenness centrality, the following clusters were identified. Thus, new researchers can refer to existing trends to establish the subject of their study, and the direction of future research on metabolic syndrome intervention can be predicted.

Keywords: intervention, metabolic syndrome, network analysis, research, the trend

Downloads: 188
1089 Affects Associations Analysis in Emergency Situations

Authors: Joanna Grzybowska, Magdalena Igras, Mariusz Ziółko

Abstract:

Association rule learning is an approach for discovering interesting relationships in large databases. The analysis of relations, invisible at first glance, is a source of new knowledge which can be subsequently used for prediction. We used this data mining technique (which is an automatic and objective method) to learn about interesting affects associations in a corpus of emergency phone calls. We also made an attempt to match revealed rules with their possible situational context. The corpus was collected and subjectively annotated by two researchers. Each of 3306 recordings contains information on emotion: (1) type (sadness, weariness, anxiety, surprise, stress, anger, frustration, calm, relief, compassion, contentment, amusement, joy) (2) valence (negative, neutral, or positive) (3) intensity (low, typical, alternating, high). Also, additional information, that is a clue to speaker’s emotional state, was annotated: speech rate (slow, normal, fast), characteristic vocabulary (filled pauses, repeated words) and conversation style (normal, chaotic). Exponentially many rules can be extracted from a set of items (an item is a previously annotated single information). To generate the rules in the form of an implication X → Y (where X and Y are frequent k-itemsets) the Apriori algorithm was used - it avoids performing needless computations. Then, two basic measures (Support and Confidence) and several additional symmetric and asymmetric objective measures (e.g. Laplace, Conviction, Interest Factor, Cosine, correlation coefficient) were calculated for each rule. Each applied interestingness measure revealed different rules - we selected some top rules for each measure. Owing to the specificity of the corpus (emergency situations), most of the strong rules contain only negative emotions. There are though strong rules including neutral or even positive emotions. 
Three examples of the strongest rules are: {sadness} → {anxiety}; {sadness, weariness, stress, frustration} → {anger}; {compassion} → {sadness}. Association rule learning revealed the strongest configurations of affects (as well as configurations of affects with affect-related information) in our emergency phone calls corpus. The acquired knowledge can be used for prediction to fulfill the emotional profile of a new caller. Furthermore, a rule-related possible context analysis may be a clue to the situation a caller is in.
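The two basic measures mentioned in this abstract, support and confidence of a rule X → Y, follow their standard definitions and can be sketched as below. The annotated calls are invented toy data, not the authors' corpus, and the function names are chosen only for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(set(antecedent) | set(consequent), transactions) / base

# Each set stands for the labels annotated on one call (toy data).
calls = [
    {"sadness", "anxiety", "fast_speech"},
    {"sadness", "anxiety"},
    {"anger", "stress"},
    {"sadness", "calm"},
]
s = support({"sadness", "anxiety"}, calls)       # 2 of 4 calls
c = confidence({"sadness"}, {"anxiety"}, calls)  # 2 of the 3 calls with sadness
```

Apriori speeds up the search for frequent itemsets by pruning any candidate that has an infrequent subset; the measures themselves are computed as shown.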

Keywords: data mining, emergency phone calls, emotional profiles, rules

Downloads: 393
1088 Surface to the Deeper: A Universal Entity Alignment Approach Focusing on Surface Information

Authors: Zheng Baichuan, Li Shenghui, Li Bingqian, Zhang Ning, Chen Kai

Abstract:

Entity alignment (EA) tasks play a pivotal role in the integration of knowledge graphs, where structural differences often exist between the source and target graphs, such as the presence or absence of attribute information and the types of attribute information (text, timestamps, images, etc.). However, most current research efforts focus on improving alignment accuracy, often at the cost of an increased reliance on specific structures, a dependency that inevitably diminishes their practical value and causes difficulties when facing knowledge graph alignment tasks with varying structures. Therefore, we propose a universal knowledge graph alignment approach that only utilizes the common basic structures shared by knowledge graphs. We have demonstrated through experiments that our method achieves state-of-the-art performance in fair comparisons.

Keywords: knowledge graph, entity alignment, transformer, deep learning

Downloads: 27
1087 Quality and Quantity in the Strategic Network of Higher Education Institutions

Authors: Juha Kettunen

Abstract:

This study analyzes the quality and the size of the strategic network of higher education institutions. The study analyses the concept of fitness for purpose in quality assurance. It also analyses the transaction costs of networking that have consequences on the number of members in the network. Empirical evidence is presented of the Consortium on Applied Research and Professional Education, which is a European strategic network of six higher education institutions. The results of the study support the argument that the number of members in the strategic network should be relatively small to provide high quality results. The practical importance is that networking has been able to promote international research and development projects. The results of this study are important for those who want to design and improve international networks in higher education.

Keywords: balanced scorecard, higher education, social networking, strategic planning

Downloads: 324
1086 Interacting with Multi-Scale Structures of Online Political Debates by Visualizing Phylomemies

Authors: Quentin Lobbe, David Chavalarias, Alexandre Delanoe

Abstract:

The ICT revolution has given birth to an unprecedented world of digital traces and has impacted a large number of knowledge-driven domains such as science, education or policy making. Nowadays, we are daily fueled by unlimited flows of articles, blogs, messages, tweets, etc. The internet itself can thus be considered as an unsteady hyper-textual environment where websites emerge and expand every day. But there are structures inside knowledge. A given text can always be studied in relation to others or in light of a specific socio-cultural context. By way of their textual traces, human beings are calling each other out: hypertext citations, retweets, vocabulary similarity, etc. We are in fact the architects of a giant web of elements of knowledge whose structures and shapes convey their own information. The global shapes of these digital traces represent a source of collective knowledge, and the question of their visualization remains an open challenge. How can we explore, browse and interact with such shapes? In order to navigate across these growing constellations of words and texts, interdisciplinary innovations are emerging at the crossroads between the fields of social and computational sciences. In particular, complex systems approaches now make it possible to reconstruct the hidden structures of textual knowledge by means of multi-scale objects of research such as semantic maps and phylomemies. Phylomemy reconstruction is a generic method related to the co-word analysis framework. Phylomemies aim to reveal the temporal dynamics of large corpora of textual contents by performing inter-temporal matching on extracted knowledge domains in order to identify their conceptual lineages. This study aims to address the question of visualizing the global shapes of online political discussions related to the French presidential and legislative elections of 2017. 
We aim to build phylomemies on top of a dedicated collection of thousands of French political tweets enriched with archived contemporary news web articles. Our goal is to reconstruct the temporal evolution of online debates fueled by each political community during the elections. To that end, we want to introduce an iterative data exploration methodology implemented and tested within the free software Gargantext. There we combine synchronic and diachronic axis of visualization to reveal the dynamics of our corpora of tweets and web pages as well as their inner syntagmatic and paradigmatic relationships. In doing so, we aim to provide researchers with innovative methodological means to explore online semantic landscapes in a collaborative and reflective way.
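
The inter-temporal matching step described in the abstract can be pictured with a small sketch. This is an illustration only, not the Gargantext implementation: it assumes each knowledge domain is a set of terms, uses Jaccard similarity as a stand-in for whatever proximity measure the authors apply, and the names `jaccard`, `match_periods` and the 0.3 threshold are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Group = FrozenSet[str]  # one knowledge domain = a set of co-occurring terms


def jaccard(a: Group, b: Group) -> float:
    """Share of terms two groups have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_periods(
    earlier: List[Group], later: List[Group], threshold: float = 0.3
) -> List[Tuple[int, int, float]]:
    """Link each group of the later period to its most similar earlier group.

    Returns (earlier_index, later_index, similarity) triples; groups whose
    best similarity falls below the (assumed) threshold start a new lineage.
    """
    links = []
    for j, child in enumerate(later):
        best_i, best_sim = -1, 0.0
        for i, parent in enumerate(earlier):
            sim = jaccard(parent, child)
            if sim > best_sim:
                best_i, best_sim = i, sim
        if best_sim >= threshold:
            links.append((best_i, j, best_sim))
    return links


# Toy term groups for two consecutive periods (invented for illustration).
period_1 = [frozenset({"election", "candidate", "debate"}),
            frozenset({"tax", "budget", "reform"})]
period_2 = [frozenset({"election", "candidate", "poll"}),
            frozenset({"climate", "energy"})]

links = match_periods(period_1, period_2)
print(links)
```

Chaining such links over many periods yields branches that can be read as conceptual lineages; a group with no sufficiently similar predecessor marks the emergence of a new topic.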

Keywords: online political debate, French election, hyper-text, phylomemy

Procedia PDF Downloads 175
1085 Economic Characteristics of Bitcoin: "An Analytical Study"

Authors: Abdelhalem Shahen

Abstract:

The world is experiencing a digital revolution and rapidly accelerating technological development, along with a transition from the traditional economy to the digital economy, which has resulted in the emergence of new tools suited to those developments. Against this background, this paper explores the economic characteristics of the Bitcoin currency, which has recently entered circulation and has many advantages that distinguish it from money in its traditional forms and that have a range of economic effects. The study found that Bitcoin is a technological innovation with a set of characteristics that are worth studying and that make it a focus of attention: its digital nature, its peer-to-peer design, lower transaction costs and faster transactions, transparency, decentralized control, privacy, the prevention of double-spending, cryptographic security, and, finally, mining.

Keywords: Digital Economics, Digital Currencies, Bitcoin, Features of Bitcoin

Procedia PDF Downloads 120
1084 Pueblos Mágicos in Mexico: The Loss of Intangible Cultural Heritage and Cultural Tourism

Authors: Claudia Rodriguez-Espinosa, Erika Elizabeth Pérez Múzquiz

Abstract:

Since the creation of the “Pueblos Mágicos” program in 2001, a series of social and cultural events directly affected the heritage conservation of the 121 registered localities until 2018, when the federal government terminated the program. Many studies have sought to analyze, from different perspectives and disciplines, the consequences that these designations have had for the “Pueblos Mágicos.” Multidisciplinary groups, such as the one headed by Carmen Valverde and Liliana López Levi, have brought together specialists from all over the Mexican Republic to create a set of diagnoses of most of these settlements. Although each one has its own specificities, there is a constant in most of them: the loss of cultural heritage, which is related to transculturality. Several factors have been identified that have fostered cultural loss as a direct reflection of the economic crisis that prevails in Mexico. It is important to remember that the main objective of this program was to promote the growth and development of local economies, since one of the conditions for entering the program is that localities have fewer than 20,000 inhabitants. With this goal in mind, one of the first actions that many “Pueblos Mágicos” carried out was to improve or create infrastructure to receive both national and foreign tourists, since this was practically non-existent. Creating hotels, restaurants and cafes and training certified tour guides, among other actions, have led to one of the great problems they face: globalization. Although globalization is not bad in itself, its impact has in many cases been negative for heritage conservation. The entry of and contact with new cultures has led to the undervaluation of cultural traditions, their transformation and even their total loss.
This work seeks to present specific cases of transformation and loss of cultural heritage, as well as to reflect on the problem and propose scenarios in which the negative effects can be reversed. For this text, 36 “Pueblos Mágicos” have been selected for study, based on the settlements cited in volumes I and IV (the first and last of the collection) of the series produced by the multidisciplinary group led by Carmen Valverde and Liliana López Levi (researchers from UNAM and UAM Xochimilco, respectively) in the CONACyT-supported project entitled “Pueblos Mágicos. An interdisciplinary vision”, of which we are part. This sample is considered representative since it makes up 30% of the 121 “Pueblos Mágicos” existing at that time. With this information, the elements of intangible heritage loss or transformation have been identified in every chapter, based on the texts written by the participants of that project. Finally, this text presents an analysis of the effects that this federal program, as a public policy applied to 132 populations, has had on the conservation or transformation of the intangible cultural heritage of the “Pueblos Mágicos.” Transculturality, globalization, the creation of identities and the desire to increase the flow of tourists have shaped the changes that traditions (the main intangible cultural heritage) have undergone in the 18 years that the federal program lasted.

Keywords: public policies, cultural tourism, heritage preservation, pueblos mágicos program

Procedia PDF Downloads 171
1083 Frequent Pattern Mining for Digenic Human Traits

Authors: Atsuko Okazaki, Jurg Ott

Abstract:

Some genetic diseases (‘digenic traits’) are due to the interaction between two DNA variants. For example, certain forms of Retinitis Pigmentosa (a genetic form of blindness) occur in the presence of two mutant variants, one in the ROM1 gene and one in the RDS gene, while the occurrence of only one of these mutant variants leads to a completely normal phenotype. Detecting such digenic traits by genetic methods is difficult. A common approach to finding disease-causing variants is to compare hundreds of thousands of variants between individuals with a trait (cases) and those without the trait (controls). Such genome-wide association studies (GWASs) have been very successful but hinge on genetic effects of single variants; that is, there should be a difference in allele or genotype frequencies between cases and controls at a disease-causing variant. Frequent pattern mining (FPM) methods offer an avenue for detecting digenic traits even in the absence of single-variant effects. The idea is to enumerate pairs of genotypes (genotype patterns), with each of the two genotypes originating from a different variant; the two variants may be located at very different genomic positions. What is needed is for genotype patterns to be significantly more common in cases than in controls. Let Y = 2 refer to cases and Y = 1 to controls, with X denoting a specific genotype pattern. We seek association rules, ‘X → Y’, with high confidence, P(Y = 2|X), that is significantly higher than the proportion of cases, P(Y = 2), in the study. Clearly, generally available FPM methods are very suitable for detecting disease-associated genotype patterns. We use fpgrowth as the basic FPM algorithm and have built a framework around it to enumerate high-frequency digenic genotype patterns and to evaluate their statistical significance by permutation analysis. Application to a published dataset on opioid dependence furnished results that could not be found with classical GWAS methodology.
The dataset comprised 143 cases and 153 healthy controls, each genotyped for 82 variants in eight genes of the opioid system. The aim was to find out whether any of these variants were disease-associated. The single-variant analysis did not lead to significant results. Application of our FPM implementation resulted in one significant (p < 0.01) genotype pattern, with both genotypes in the pattern being heterozygous and originating from two variants on different chromosomes. This pattern occurred in 14 cases and none of the controls. Thus, the pattern seems quite specific to this form of substance abuse and is also rather predictive of disease. An algorithm called Multifactor Dimension Reduction (MDR) was developed some 20 years ago and has been in use in human genetics ever since. MDR and our algorithm share some properties but differ considerably in other respects. The main difference seems to be that our algorithm focuses on patterns of genotypes, while the main object of inference in MDR is the 3 × 3 table of genotypes at two variants.
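
The rule confidence P(Y = 2|X) and the permutation analysis can be illustrated with a minimal sketch. It is not the authors' framework: plain brute-force enumeration of genotype pairs is swapped in for fpgrowth, the toy genotypes are invented, and the names `pattern_counts`, `best_pattern`, `permutation_p_value` and the `min_support` cut-off are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def pattern_counts(genotypes, labels):
    """Count every genotype pair overall and among cases (Y = 2)."""
    total, cases = Counter(), Counter()
    for person, y in zip(genotypes, labels):
        # Each person has one genotype per variant, so every pair drawn here
        # combines genotypes from two different variants.
        for pattern in combinations(sorted(person.items()), 2):
            total[pattern] += 1
            if y == 2:
                cases[pattern] += 1
    return total, cases


def best_pattern(genotypes, labels, min_support=3):
    """Return the sufficiently frequent pattern X with the highest P(Y = 2 | X)."""
    total, cases = pattern_counts(genotypes, labels)
    best, best_conf = None, 0.0
    for pattern, n in total.items():
        if n >= min_support:
            conf = cases[pattern] / n
            if conf > best_conf:
                best, best_conf = pattern, conf
    return best, best_conf


def permutation_p_value(genotypes, labels, min_support=3, n_perm=200, seed=0):
    """Share of label shufflings whose best confidence reaches the observed one."""
    rng = random.Random(seed)
    _, observed = best_pattern(genotypes, labels, min_support)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        _, conf = best_pattern(genotypes, shuffled, min_support)
        if conf >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (invented): genotypes coded 0/1/2 per variant; Y = 2 case, Y = 1 control.
genotypes = (
    [{"rs_a": 1, "rs_b": 1, "rs_c": 0}] * 4     # cases carrying the pattern
    + [{"rs_a": 1, "rs_b": 0, "rs_c": 0}] * 2   # controls with only rs_a = 1
    + [{"rs_a": 0, "rs_b": 1, "rs_c": 0}] * 2   # controls with only rs_b = 1
)
labels = [2] * 4 + [1] * 4

best, confidence = best_pattern(genotypes, labels)
p_value = permutation_p_value(genotypes, labels)
print(best, confidence, p_value)
```

Comparing against the best pattern of each shuffled dataset, rather than the same pattern, accounts for the many patterns tested at once. With hundreds of thousands of variants, exhaustive pair enumeration becomes expensive, which is where a dedicated FPM algorithm such as fpgrowth earns its place.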

Keywords: digenic traits, DNA variants, epistasis, statistical genetics

Procedia PDF Downloads 105
1082 A Novel Image Steganography Scheme Based on Mandelbrot Fractal

Authors: Adnan H. M. Al-Helali, Hamza A. Ali

Abstract:

With the growth of censorship and pervasive monitoring on the Internet, steganography arises as a new means of achieving secret communication. Steganography is the art and science of embedding information within electronic media used by common applications and systems. Generally, hiding multimedia information within images changes some of their properties, which may introduce some degradation or unusual characteristics. This paper presents a new image steganography approach for hiding multimedia information (images, text, and audio) using a generated Mandelbrot fractal image as a cover. The proposed technique has been extensively tested with different images. The results show that the method is a very secure means of hiding and retrieving steganographic information. Experimental results demonstrate an effective improvement in the values of the Peak Signal to Noise Ratio (PSNR), Mean Square Error (MSE), Normalized Cross Correlation (NCC) and Image Fidelity (IF) over previous techniques.
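
Two of the quality measures named above, MSE and PSNR, have standard definitions that can be sketched briefly. The snippet assumes 8-bit grayscale images stored as equally sized nested lists; the pixel values and the function names `mse` and `psnr` are illustrative and are not taken from the paper.

```python
import math


def mse(cover, stego):
    """Mean squared difference between two equally sized grayscale images."""
    n_pixels = len(cover) * len(cover[0])
    squared_error = sum(
        (c - s) ** 2
        for row_c, row_s in zip(cover, stego)
        for c, s in zip(row_c, row_s)
    )
    return squared_error / n_pixels


def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


# Toy 2 x 2 "images": two pixels differ by one gray level after embedding.
cover = [[10, 20], [30, 40]]
stego = [[10, 21], [30, 39]]

print(mse(cover, stego))   # mean of the squared differences (0, 1, 0, 1)
print(psnr(cover, stego))  # higher values mean less visible distortion
```

A lower MSE and a higher PSNR both indicate that the stego image stays closer to the cover image.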

Keywords: fractal image, information hiding, Mandelbrot set fractal, steganography

Procedia PDF Downloads 520
1081 Documents Emotions Classification Model Based on TF-IDF Weighting Measure

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Emotion classification of text documents is applied to reveal whether a document expresses a particular emotion of its writer. While various supervised methods have previously been used for classifying emotion in documents, in this research we present a novel model that supports the classification algorithms in producing more accurate results through the TF-IDF measure. Several experiments were conducted to demonstrate the applicability of the proposed model. The model succeeds in raising accuracy according to the chosen metrics (precision, recall, and f-measure) by refining the lexicon, integrating lexicons from different perspectives, and applying the TF-IDF weighting measure to the classification features. The proposed model has also been compared with other research to demonstrate its competitiveness in improving accuracy.
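
For readers unfamiliar with the TF-IDF weighting mentioned above, the following is a minimal sketch of one common variant (relative term frequency times the natural log of N over document frequency). The abstract does not state which variant the authors use or how it is combined with their lexicons, so the formula choice, the toy documents and the function name `tf_idf` are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy tokenized documents (invented for illustration).
docs = [
    ["happy", "joy", "day"],
    ["sad", "day"],
    ["happy", "happy", "smile"],
]

weights = tf_idf(docs)
print(weights[0])
```

In the first document, "joy" receives a higher weight than "day" because "day" also appears in another document; terms that occur in every document would receive a weight of zero. Weights of this kind can then serve as feature values for a classifier.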

Keywords: emotion detection, TF-IDF, WEKA tool, classification algorithms

Procedia PDF Downloads 461
1080 Gender Construction in Contemporary Dystopian Fiction in Young Adult Literature: A South African Example

Authors: Johan Anker

Abstract:

The purpose of this paper is to discuss the nature of gender construction in modern dystopian fiction, the development of this genre in Young Adult Literature and the reasons for its enormous appeal to adolescent readers. A recent award-winning South African text in this genre, The Mark by Edith Bullring (2014), will be used as an example and compared to international bestsellers like Divergent (Roth, 2011), The Hunger Games (Collins, 2008) and others. Theoretical insights from critics and academics in the field of children’s literature, like Ames, Coats, Bradford, Booker, Basu, Green-Barteet, Hintz, McAlear, McCallum, Moylan, Ostry, Ryan, Stephens and Westerfield, will be used as part of the analysis of The Mark. The role of relevant and recurring themes in this genre, like global concerns, environmental destruction, liberty, self-determination, social and political critique, and surveillance and repression by the state or other institutions, will also be discussed. The paper will briefly refer to the history and emergence of dystopian literature as a genre in adult and young adult literature, as part of the long tradition since the publication of Orwell’s 1984 and Huxley’s Brave New World. Different factors appeal to adolescent readers in the modern versions of this hybrid genre for young adults: teenage protagonists who question the underlying values of a flawed society, such as an inhuman or tyrannical government; a growing understanding of the society around them; feelings of isolation; and the dynamics of relationships. This unease leads to a growing sense of the potential to act against society (rebellion), of their role as agents in a larger community, and of their independent decision-making abilities. This awareness also leads to a growing sense of self (identity and agency) and the development of romantic relationships.
Emphasis will be placed on the specifically modern tendency to cast a female protagonist as the leader of the rebellion against the state and state apparatus. Her gain in agency and independence through this rebellion is an important part of the identification with and construction of gender, while remaining part of the traditional coming-of-age young adult novel. The traditional themes, structures and plots of young adult literature (YAL) and of adult dystopian literature will be compared with those of recent dystopian YAL, and the hybrid nature of this genre will be discussed, together with the 'sense of unease' but also of hope, an essential part of youth literature, in the closure of these novels. Important questions about the didactic nature of these texts, the political issues they raise, and the importance of the formation of agency and identity for the young adult reader, as well as identification with the protagonists in this genre, are also part of this discussion of The Mark and other YAL novels.

Keywords: agency, dystopian literature, gender construction, young adult literature

Procedia PDF Downloads 163
1079 Automated Evaluation Approach for Time-Dependent Question Answering Pairs on Web Crawler Based Question Answering System

Authors: Shraddha Chaudhary, Raksha Agarwal, Niladri Chatterjee

Abstract:

This work demonstrates a web crawler-based, generalized, end-to-end open-domain Question Answering (QA) system. An efficient QA system requires a significant amount of domain knowledge to answer any question, with the aim of finding an exact and correct answer to the user's question in the form of a number, a noun, a short phrase, or a brief piece of text. Analyzing the question, searching for relevant documents, and choosing an answer are the three important steps in a QA system. This work uses a web scraper (Beautiful Soup) to extract K documents from the web. The value of K can be calibrated on the basis of a trade-off between time and accuracy. This is followed by a passage-ranking process trained on 500K queries from the MS-Marco dataset, which extracts the most relevant text passage in order to shorten the lengthy documents. Further, a QA system is used to extract the answers from the shortened documents based on the query and return the top 3 answers. For the evaluation of such systems, accuracy is judged by the exact match between predicted answers and gold answers. But automatic evaluation methods fail due to the linguistic ambiguities inherent in the questions. Moreover, reference answers are often not exhaustive or are out of date. Hence, correct answers predicted by the system are often judged incorrect according to the automated metrics. One such scenario arises from the original Google Natural Question (GNQ) dataset, which was collected and made available in 2016. Use of any such dataset proves to be inefficient for questions that have time-varying answers. For illustration, consider the query “Where will the next Olympics be?” The gold answer for this query in the GNQ dataset is “Tokyo”. Since the dataset was collected in 2016 and the next Olympics after 2016, those of 2020, were held in Tokyo, this answer was correct at the time. But if the same question is asked in 2022, then the answer is “Paris, 2024”.
Consequently, any evaluation based on the GNQ dataset will be incorrect. Such erroneous predictions are usually given to human evaluators for further validation, which is quite expensive and time-consuming. To address this erroneous evaluation, the present work proposes an automated approach for evaluating time-dependent question-answer pairs. In particular, it proposes a metric using the current timestamp along with the top-n predicted answers from a given QA system. To test the proposed approach, the GNQ dataset was used, and the system achieved an accuracy of 78% on a test dataset comprising 100 QA pairs. This test data was automatically extracted using an analysis-based approach from 10K QA pairs of the GNQ dataset. The results obtained are encouraging. The proposed technique appears to have the potential to develop into a useful scheme for gathering precise, reliable, and specific information in a real-time and efficient manner. Our subsequent experiments will be directed towards establishing the efficacy of the above system for a larger set of time-dependent QA pairs.
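
The failure mode described above, and one way a timestamp could enter the evaluation, can be sketched as follows. This is not the metric proposed in the paper, whose definition the abstract does not give: the idea of storing dated gold answers and picking the latest one not newer than the query year is an assumption, as are the names `normalize`, `exact_match_top_n`, `gold_at` and the predicted answers.

```python
import re


def normalize(text):
    """Lowercase, drop punctuation and collapse whitespace before comparing."""
    return " ".join(re.sub(r"[^a-z0-9 ]", "", text.lower()).split())


def exact_match_top_n(predictions, gold, n=3):
    """True if any of the top-n predicted answers exactly matches the gold answer."""
    target = normalize(gold)
    return any(normalize(p) == target for p in predictions[:n])


def gold_at(gold_by_year, year):
    """Pick the most recent gold answer recorded in or before the given year."""
    valid_years = [y for y in gold_by_year if y <= year]
    if not valid_years:
        return None
    return gold_by_year[max(valid_years)]


# The two dated answers come from the abstract's Olympics example;
# the predictions are invented for illustration.
gold_by_year = {2016: "Tokyo", 2022: "Paris"}
predictions = ["Paris", "Paris, France", "France"]

static_result = exact_match_top_n(predictions, "Tokyo")
time_aware_result = exact_match_top_n(predictions, gold_at(gold_by_year, 2022))
print(static_result, time_aware_result)
```

Against the static 2016 gold answer the prediction is scored as wrong, while the time-aware lookup for a 2022 query scores it as correct, which is the discrepancy the abstract sets out to remove without human re-validation.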

Keywords: web-based information retrieval, open domain question answering system, time-varying QA, QA evaluation

Procedia PDF Downloads 86