Search results for: large-scale text clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1855

1555 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

The increasing amount of collected data has limited the performance of current analysis algorithms. Thus, developing new algorithms that are cost-effective in terms of complexity, scalability, and accuracy has attracted significant interest. In this paper, a modified k-means-based algorithm is developed and evaluated. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block (Manhattan) distance and a new stop criterion to guarantee convergence. Experiments conducted on a real data set show its high performance when compared with the original k-means version.
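As a rough illustration of clustering under the City Block metric described in the abstract above, the following sketch assigns points by Manhattan distance and updates each center with the coordinate-wise median. The median update and the "assignments unchanged" stop rule are assumptions made for this example; the paper's actual update step and stop criterion are not given in the abstract. The function names `manhattan` and `kmeans_manhattan` are hypothetical.

```python
import random
from statistics import median


def manhattan(a, b):
    """City Block (L1) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans_manhattan(points, k, max_iter=100, seed=0):
    """Cluster points with L1 distance (assumed variant, not the paper's).

    Centers are updated with the coordinate-wise median, which minimizes
    the summed L1 distance within a cluster. Iteration stops when the
    assignments no longer change (an assumed stop criterion).
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center under Manhattan distance.
        new_labels = [
            min(range(k), key=lambda c: manhattan(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: coordinate-wise median of each non-empty cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(median(col) for col in zip(*members))
    return centers, labels
```

Computing an L1 distance needs only subtractions and absolute values, with no squaring, which is one plausible source of the reduced computational load the abstract mentions.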

Keywords: pattern recognition, global terrorism database, Manhattan distance, k-means clustering, terrorism data analysis

Procedia PDF Downloads 358
1554 Evaluation Means in English and Russian Academic Discourse: Through Comparative Analysis towards Translation

Authors: Albina Vodyanitskaya

Abstract:

Given the culture- and language-specific nature of evaluation, this phenomenon is widely studied by linguists around the world and may be regarded as a challenge for translators. Evaluation penetrates all levels of a scientific text and influences its composition and the reader’s attitude towards the information presented. One of the most challenging and rarely studied phenomena is the individual style of the scientific writer, which is mostly reflected in the use of evaluative language means. The evaluative and expressive potential of a scientific text is becoming an increasingly attractive area for researchers, which stems from the shift towards the anthropocentric paradigm in linguistics. Other reasons include the cognitive and psycholinguistic processes that accompany knowledge acquisition, the genre-determined nature of a scientific text, and the increasing public concern about the quality of scientific papers. One more important issue is the fact that linguists all over the world still argue about the definition of evaluation and its functions in the text. The author analyzes various approaches towards the study of evaluation and scientific texts. A comparative analysis of English and Russian dissertations and other scientific papers with regard to evaluative language means reveals major differences and similarities between English and Russian scientific style. Though standardized and genre-specific, English scientific texts contain more figurative and expressive evaluative means than the Russian ones, which should be taken into account while translating scientific papers. The processes that evaluation undergoes while being expressed by means of a target language are also analyzed. The author offers a target-language-dependent strategy for the translation of evaluation in English and Russian scientific texts. 
The findings may contribute to the theory and practice of translation and can increase scientific writers’ awareness of inter-language and intercultural differences in evaluative language means.

Keywords: academic discourse, evaluation, scientific text, scientific writing, translation

Procedia PDF Downloads 329
1553 The Syntactic Features of Islamic Legal Texts and Their Implications for Translation

Authors: Rafat Y. Alwazna

Abstract:

Certain religious texts are deemed part of legal texts that are characterised by high sensitivity and sacredness. Amongst such religious texts are Islamic legal texts that are replete with Islamic legal terms that designate particular legal concepts peculiar to Islamic legal system and legal culture. However, from the syntactic perspective, Islamic legal texts prove lengthy, condensed and convoluted, with little use of punctuation system, but with an extensive use of subordinations and co-ordinations, which separate the main verb from the subject, and which, of course, carry a heavy load of legal detail. The present paper seeks to examine the syntactic features of Islamic legal texts through analysing a short text of Islamic jurisprudence in an attempt at exploring the syntactic features that characterise this type of legal text. A translation of this text into legal English is then exercised to find the translation implications that have emerged as a result of the English translation. Based on these implications, the paper compares and contrasts the syntactic features of Islamic legal texts to those of legal English texts. Finally, the present paper argues that there are a number of syntactic features of Islamic legal texts, such as nominalisation, passivisation, little use of punctuation system, the use of the Arabic cohesive device, etc., which are also possessed by English legal texts except for the last feature and with some variations. The paper also claims that when rendering an Islamic legal text into legal English, certain implications emerge, such as the necessity of a sentence break, the omission of the cohesive device concerned and the increase in the use of nominalisation, passivisation, passive participles, and so on.

Keywords: English legal texts, Islamic legal texts, nominalisation, participles, passivisation, syntactic features, translation implications

Procedia PDF Downloads 195
1552 Altered Network Organization in Mild Alzheimer's Disease Compared to Mild Cognitive Impairment Using Resting-State EEG

Authors: Chia-Feng Lu, Yuh-Jen Wang, Shin Teng, Yu-Te Wu, Sui-Hing Yan

Abstract:

Brain functional networks based on resting-state EEG data were compared between patients with mild Alzheimer’s disease (mAD) and matched patients with the amnestic subtype of mild cognitive impairment (aMCI). We integrated the time–frequency cross mutual information (TFCMI) method, which estimates the EEG functional connectivity between cortical regions, with network analysis based on graph theory to further investigate the alterations of functional networks in the mAD group compared with the aMCI group. We aimed to investigate the changes in network integrity, local clustering, information processing efficiency, and fault tolerance in mAD brain networks for different frequency bands based on several topological properties, including degree, strength, clustering coefficient, shortest path length, and efficiency. Results showed disrupted network integrity and reduced network efficiency in mAD, characterized by lower degree, decreased clustering coefficient, higher shortest path length, and reduced global and local efficiencies in the delta, theta, beta2, and gamma bands. These significant changes in network organization can assist in discriminating mAD from aMCI in clinical practice.
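The topological properties named in the abstract above can be made concrete with a small sketch. It assumes a binary, undirected graph stored as a dictionary of neighbor sets, which is a simplification: the study works with TFCMI-derived connectivity and also reports strength, a weighted measure not covered here. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def shortest_path_lengths(adj, source):
    """Hop counts from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean inverse shortest path length over all ordered node pairs.

    Unreachable pairs contribute zero, so the measure stays defined
    for disconnected graphs.
    """
    n = len(adj)
    total = 0.0
    for s in adj:
        dist = shortest_path_lengths(adj, s)
        total += sum(1.0 / d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))
```

In this representation the degree of a node is simply `len(adj[node])`. A lower clustering coefficient and a lower global efficiency correspond to the less integrated networks the abstract reports for the mAD group.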

Keywords: EEG, functional connectivity, graph theory, TFCMI

Procedia PDF Downloads 402
1551 Communication through Technology: SMS Taking Most of the Time Impacting the Standard English

Authors: Nazia Sulemna, Sadia Gul

Abstract:

With the spread of mobile phones, text messaging has become a popular medium of communication. Its users are multiplying with every passing day. Its use is not limited to informal communication but extends to formal communication as well. Students are avid users of mobile phones and of SMS. The present study shows that students use SMS for a number of reasons and spend a good amount of time on it, which results in typographical features, graphones, and rebus writing. Data were collected through questionnaires, and the study concluded that the effect is evident among L2 users and in exams as well.

Keywords: text messaging, technology, exams, formal writing

Procedia PDF Downloads 715
1550 Amharic Text News Classification Using Supervised Learning

Authors: Misrak Assefa

Abstract:

The Amharic language is the second most widely spoken Semitic language in the world. There is an overload of news on the web. Searching the web for useful documents on a specific topic written in the Amharic language is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap that needs to be addressed. This study attempts to design an automatic Amharic news classifier using a supervised learning mechanism on four previously untouched classes. For this research, 4,182 news articles were used. Naive Bayes (NB) and decision tree (J48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifiers. The results show that these algorithms are applicable to Amharic news categorization. The best average accuracies achieved by the J48 decision tree and naive Bayes are 95.2345% and 94.6245%, respectively, using three categories. This research indicates that a typical decision tree algorithm is more applicable to Amharic news categorization.
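To illustrate how k-fold cross-validation estimates classifier accuracy, as mentioned in the abstract above, here is a minimal sketch pairing it with a multinomial naive Bayes classifier. Laplace smoothing and the interleaved fold assignment are assumptions made for this example; the abstract does not state the value of k, the smoothing, or the tokenization used. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count class frequencies and per-class token frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        total = sum(word_counts[label].values())
        score = math.log(count / n_docs)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_accuracy(docs, labels, k):
    """Average accuracy over k folds; fold i holds documents i, i+k, i+2k, ..."""
    n = len(docs)
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_docs = [docs[i] for i in range(n) if i not in test_idx]
        train_labels = [labels[i] for i in range(n) if i not in test_idx]
        model = train_nb(train_docs, train_labels)
        hits = sum(predict_nb(model, docs[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k
```

Each document is held out exactly once, so the averaged accuracy reflects performance on text the model did not see during training.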

Keywords: text categorization, supervised machine learning, naive Bayes, decision tree

Procedia PDF Downloads 166
1549 Google Translate: AI Application

Authors: Shaima Almalhan, Lubna Shukri, Miriam Talal, Safaa Teskieh

Abstract:

Since artificial intelligence is a rapidly evolving topic that has had a significant impact on technical growth and innovation, this paper examines people's awareness, use, and engagement with the Google Translate application. To see how familiar users are with the app and its features, quantitative and qualitative research was conducted. The findings revealed that consumers have a high level of confidence in the application, and showed how far they benefit from this sort of innovation and how convenient it makes communication.

Keywords: artificial intelligence, google translate, speech recognition, language translation, camera translation, speech to text, text to speech

Procedia PDF Downloads 127
1548 A Rational Intelligent Agent to Promote Metacognition a Situation of Text Comprehension

Authors: Anass Hsissi, Hakim Allali, Abdelmajid Hajami

Abstract:

This article presents the results of doctoral research that aims to integrate a metacognitive dimension into the design of computing environments for human learning (ILE). We conducted a detailed study on the relationship between metacognitive processes and learning, specifically their positive impact on the performance of learners in the area of reading comprehension. Our contribution is to implement methods using an intelligent agent based on the BDI paradigm to ensure intelligent and reliable support for weak readers, in order to encourage regulation and a conscious and rational use of their metacognitive abilities.

Keywords: metacognition, text comprehension, EIAH, autoregulation, BDI agent

Procedia PDF Downloads 301
1547 Research on the Landscape of Xi'an Ancient City Based on the Poetry Text of Tang Dynasty

Authors: Zou Yihui

Abstract:

The integration of the traditional landscape of the ancient city with the poet's emotions and symbolism in ancient poetry is the unique cultural gene and spiritual core of the historical city. Re-understanding the historical landscape pattern from poetry is conducive to continuing the historical city context and to countering the gradual decline of poetic quality in the modern historical urban landscape. Starting from Tang poetry, semantic analysis methods were combined with text mining technology; entry mining, word frequency analysis, and cluster analysis of the landscape information of Tang Chang'an City were carried out, and a method framework for analyzing the urban landscape form based on poetry text was constructed. Nearly 160 poems describing the landscape of Tang Chang'an City were screened, and the poetic landscape characteristics of Tang Chang'an City were sorted out locally in order to combine with modern urban spatial development and continue the urban spatial context.

Keywords: Tang Chang'an City, poetic texts, semantic analysis, historical landscape

Procedia PDF Downloads 17
1546 Increasing the Ability of State Senior High School 12 Pekanbaru Students in Writing an Analytical Exposition Text through Comic Strips

Authors: Budiman Budiman

Abstract:

This research aimed to describe and test whether the students’ ability to write analytical exposition texts is increased by using comic strips at SMAN 12 Pekanbaru. The respondents of this study were second-grade students of class XI Science 3 in the 2011-2012 academic year. The class comprised forty-two (42) students. Quantitative and qualitative data were collected using a writing test and observation sheets. The findings reveal a significant increase in the students’ ability to write analytical exposition texts through comic strips. This is shown by the average pre-test score of 43.7 and the average post-test score of 65.37. In addition, the students’ interest and motivation in learning also improved, as seen from their increased awareness and activeness in the learning process based on the observation sheets. The findings suggest that the use of comic strips in teaching and learning is beneficial for better learning outcomes.

Keywords: analytical exposition, comic strips, secondary school students, writing ability

Procedia PDF Downloads 135
1545 Improving Technical Translation Ability of the Iranian Students of Translation Through Multimedia: An Empirical Study

Authors: Dina Zakeri, Ali Aminzad

Abstract:

Multimedia-assisted teaching results in eliminating traditional training barriers, facilitating the cognition process and upgrading learning outcomes. This study attempted to examine the effects of implementing multimedia on teaching technical translation model and on the technical text translation ability of Iranian students of translation. To fulfill the purpose of the study, a total of forty-six learners were selected out of fifty-seven participants in a higher education center in Tehran based on their scores in Preliminary English Test (PET) and were divided randomly into the experimental and control groups. Prior to the treatment, a technical text translation questionnaire was devised and then approved and validated by three assistant professors of technical fields and three assistant professors of Teaching English as a Foreign Language (TEFL) at the university. This questionnaire was administered as a pretest to both groups. Control and experimental groups were trained for five successive weeks using identical course books but with a different lesson plan that allowed employing multimedia for the experimental group only. The devised and approved questionnaire was administered as a posttest to both groups at the end of the instruction. A multivariate ANOVA was run to compare the two groups’ means on the PET, pretest and posttest. The results showed the rejection of all null hypotheses of the study and revealed that multimedia significantly improved technical text translation ability of the learners.

Keywords: multimedia, multimedia-mediated teaching, technical translation model, technical text, translation ability

Procedia PDF Downloads 102
1544 Temporality, Place and Autobiography in J.M. Coetzee’s 'Summertime'

Authors: Barbara Janari

Abstract:

In this paper it is argued that the effect of the disjunctive temporality in Summertime (the third of J.M. Coetzee’s fictionalised memoirs) is two-fold: firstly, it reflects the memoir’s ambivalent, contradictory representations of place in order to emphasize the fractured sense of self growing up in South Africa during apartheid entailed for Coetzee. Secondly, it reconceives the autobiographical discourse as one that foregrounds the inherent fictionality of all texts. The memoir’s narrative is filtered through intricate textual strategies that disrupt the chronological movement of the narrative, evoking the labyrinthine ways in which the past and present intersect and interpenetrate each other. It is framed by entries from Coetzee’s Notebooks: it opens with entries that cover the years 1972–1975, and ends with a number of undated fragments from his Notebooks. Most of the entries include a short ‘memo’ at the end, added between 1999 and 2000. While the memos follow the Notebook entries in the text, they are separated by decades. Between the Notebook entries is a series of interviews conducted by Vincent, the text’s putative biographer, between 2007 and 2008, based on recollections from five people who had known Coetzee in the 1970s – a key period in John’s life as it marks both his return to South Africa after a failed emigration attempt to America, and the beginning of his writing career, with the publication of Dusklands in 1974. The relationship between the memoir’s various parts is a key feature of Coetzee’s representation of place in Summertime, which is constructed as a composite one in which the principle of reflexive referencing has to be adopted. In other words, readers have to suspend individual references temporarily until the relationships between the parts have been connected to each other. In order to apprehend meaning in the text, the disparate narrative elements have to first be tied together. 
In this text, then, the experience of time as ordered and chronological is ruptured. Instead, the memoir’s themes and patterns become apparent most clearly through reflexive referencing, by which relationships between disparate sections of the text are linked. The image of the fictional John that emerges from the text is a composite of this John and the author, J.M. Coetzee, and is one which embodies Coetzee’s often fraught relationship with his home country, South Africa.

Keywords: autobiography, place, reflexive referencing, temporality

Procedia PDF Downloads 47
1543 Automatic Landmark Selection Based on Feature Clustering for Visual Autonomous Unmanned Aerial Vehicle Navigation

Authors: Paulo Fernando Silva Filho, Elcio Hideiti Shiguemori

Abstract:

The selection of specific landmarks for an Unmanned Aerial Vehicle’s Visual Navigation system based on Automatic Landmark Recognition has a significant influence on the precision of the system’s estimated position. At the same time, manual selection of the landmarks does not guarantee a high recognition rate, which would also result in poor precision. This work aims to develop an automatic landmark selection method that takes the image of the flight area and identifies the best landmarks to be recognized by the Visual Navigation Landmark Recognition System. The criterion for selecting a landmark is based on features detected by ORB or AKAZE and on edge information for each possible landmark. Results have shown that the disposition of possible landmarks is quite different from human perception.

Keywords: clustering, edges, feature points, landmark selection, X-means

Procedia PDF Downloads 253
1542 Effect of Mobile Phone Text Message Reminders on Adherence to Routine Prenatal Iron/Folic Acid Supplement among Pregnant Women: A Pilot Study

Authors: Nneka U. Igboeli, Maxwell O. Adibe

Abstract:

Iron and folate supplementation in pregnancy are important interventions that prevent maternal anaemia and fetal anomaly. Thus, daily oral doses of iron and folic acid are recommended throughout pregnancy as part of antenatal care. However, low adherence has been a major drawback leading to low effectiveness of these programs. The effect on adherence of mobile text message reminders to pregnant women to take their routine medications was evaluated in this study. The first 100 women who consented to the study were recruited and randomized to either receive a text message reminder on adherence to routine medications or not. Adherence was assessed using the 8-item Modified Morisky Adherence Scale (8-MMAS). The folders of successfully recruited women were tagged with a study number assigned to each of them. The women’s phone numbers were collected and used to send text message reminders on adhering to routine drugs only to women in the intervention group. The text messages were sent three times per week for a period of four weeks, with an adherence reassessment at the one-month follow-up antenatal visit for recruited women. At the one-month follow-up, 6 (16%) women in the intervention group and 17 (34%) in the control group were lost to follow-up. The across-group mean difference in adherence score was 0.07 (-0.96 – 1.10) at baseline and 0.3 (-0.31 – 0.92) after intervention, both insignificant at p > 0.05. The within-group changes were increases of 0.58 (0.00 – 1.16) (p = 0.05) from baseline for the intervention group and 0.35 (-0.51 – 1.20) (p = 0.395) for the control group. Non-significant increases in adherence scores were recorded for both groups. However, the increase in adherence scores of women in the intervention group was greater and shows promise for more positive results if the study period is extended and study drop-outs are reduced.

Keywords: adherence, mobile phone, pregnant women, reminders

Procedia PDF Downloads 152
1541 Clustering Based and Centralized Routing Table Topology of Control Protocol in Mobile Wireless Sensor Networks

Authors: Mbida Mohamed, Ezzati Abdellah

Abstract:

A major challenge in wireless sensor networks (WSN) is to save energy and achieve a long network lifetime without a high rate of information loss. Topology control (TC) protocols are designed so that the network is divided and nodes exchange packets through a standard scheme. In this article, we propose a clustering-based and centralized routing table protocol of TC (CBCRT), which delegates a leader node that encapsulates a single routing table for all nodes in each cluster. Hence, if a node wants to send packets to the sink, it requests the routing table information of the current cluster from the leader node in order to route the packet.
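The idea of a single routing table held by a cluster leader, as outlined in the abstract above, can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `ClusterLeader`, the function `route_to_sink`, the next-hop table layout, and the hop limit are all hypothetical and are not taken from the CBCRT protocol itself.

```python
class ClusterLeader:
    """Hypothetical leader node holding the only routing table of a cluster."""

    def __init__(self, routing_table):
        # routing_table maps a node id to its next hop toward the sink.
        self.routing_table = dict(routing_table)

    def next_hop(self, node_id):
        """Answer a member's request for its next hop."""
        return self.routing_table[node_id]


def route_to_sink(leader, source, sink, max_hops=32):
    """Follow next hops supplied by the leader until the sink is reached."""
    path = [source]
    current = source
    while current != sink:
        if len(path) > max_hops:
            raise RuntimeError("hop limit exceeded; routing table may contain a loop")
        current = leader.next_hop(current)
        path.append(current)
    return path
```

Keeping the table in one place means member nodes store no routing state of their own, at the cost of a request to the leader whenever a packet has to be forwarded.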

Keywords: mobile wireless sensor networks, routing, topology of control, protocols

Procedia PDF Downloads 244
1540 Multimodal Sentiment Analysis With Web Based Application

Authors: Shreyansh Singh, Afroz Ahmed

Abstract:

Sentiment analysis aims to automatically uncover the underlying attitude that we hold towards an entity. The aggregation of this sentiment over a population represents opinion polling and has numerous applications. Current text-based sentiment analysis relies on the construction of word embeddings and machine learning models that learn sentiment from large text corpora. Sentiment analysis from text is currently widely used for customer satisfaction assessment and brand perception analysis. With the proliferation of social media, multimodal sentiment analysis is set to bring new opportunities with the arrival of complementary data streams for improving and going beyond text-based sentiment analysis using new transform methods. Since sentiment can be detected through the affective traces it leaves, such as facial and vocal displays, multimodal sentiment analysis offers promising avenues for analyzing facial and vocal expressions in addition to the transcript or textual content. These approaches use Recurrent Neural Networks (RNNs) with LSTM to increase their performance. In this study, we define sentiment and the problem of multimodal sentiment analysis and review recent developments in multimodal sentiment analysis in different domains, including spoken reviews, images, video blogs, human-machine, and human-human interactions. Challenges and opportunities of this emerging field are also discussed, leading to our thesis that multimodal sentiment analysis holds significant untapped potential.

Keywords: sentiment analysis, RNN, LSTM, word embeddings

Procedia PDF Downloads 95
1539 A Word-to-Vector Formulation for Word Representation

Authors: Sandra Rizkallah, Amir F. Atiya

Abstract:

This work presents a novel word to vector representation that is based on embedding the words into a sphere, whereby the dot product of the corresponding vectors represents the similarity between any two words. Embedding the vectors into a sphere enabled us to take into consideration the antonymity between words, not only the synonymity, because of the suitability to handle the polarity nature of words. For example, a word and its antonym can be represented as a vector and its negative. Moreover, we have managed to extract an adequate vocabulary. The obtained results show that the proposed approach can capture the essence of the language, and can be generalized to estimate a correct similarity of any new pair of words.
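The geometric idea in the abstract above can be shown with a few lines of code: once vectors lie on the unit sphere, their dot product falls between -1 and 1, and negating a vector yields its opposite pole. The helper names below are hypothetical, and the sketch says nothing about how the paper actually learns the embeddings.

```python
import math


def normalize(v):
    """Scale a vector to unit length so that it lies on the sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity(u, v):
    """Dot product of two unit vectors: 1 = same direction, -1 = opposite."""
    return sum(a * b for a, b in zip(u, v))


def antonym_of(v):
    """Represent an antonym as the negative of the word's vector."""
    return [-x for x in v]
```

With this representation a word and its antonym receive similarity -1, while unrelated words can sit near 0, which is how the polarity of words is captured alongside synonymy.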

Keywords: natural language processing, word to vector, text similarity, text mining

Procedia PDF Downloads 244
1538 Fake News Detection for Korean News Using Machine Learning Techniques

Authors: Tae-Uk Yun, Pullip Chung, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Fake news is defined as news articles that are intentionally and verifiably false and could mislead readers. The spread of fake news may provoke anxiety, chaos, fear, or irrational decisions among the public. Thus, detecting fake news and preventing its spread has become a very important issue in our society. However, due to the huge amount of fake news produced every day, it is almost impossible for humans to identify it. In this context, researchers have tried to develop automated fake news detection using machine learning techniques over the past years. To the best of our knowledge, however, no prior study has proposed an automated fake news detection method for Korean news. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news content to be analyzed is converted to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). In the second step, classifiers are trained using the values produced in step 1. As classifiers, machine learning techniques such as logistic regression, backpropagation network, support vector machine, and deep neural network can be applied. To validate the effectiveness of the proposed method, we collected about 200 short Korean news items from Seoul National University’s FactCheck, which provides detailed analysis reports from 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.
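The first step described in the abstract above, turning text into quantified values, can be sketched with a plain TF-IDF computation. The weighting used here (relative term frequency times the natural log of N over document frequency) is one common variant and is an assumption; the abstract does not say which formulation or tokenizer was used. The function name `tfidf_vectors` is hypothetical, and documents are assumed to be non-empty token lists.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Convert tokenized documents into TF-IDF vectors.

    Returns the sorted vocabulary and one vector per document, with
    tf = count / document length and idf = ln(N / document frequency).
    """
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    vocab = sorted(df)
    idf = {t: math.log(n / df[t]) for t in vocab}
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append([tf[t] / len(tokens) * idf[t] for t in vocab])
    return vocab, vectors
```

The resulting vectors are the kind of input that the classifiers named in step 2 (logistic regression, support vector machine, and so on) would then be trained on. Terms that appear in every document receive weight 0, so they do not influence the classifier.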

Keywords: fake news detection, Korean news, machine learning, text mining

Procedia PDF Downloads 246
1537 Translation and Ideology: New Perspectives

Authors: Hamza Salih

Abstract:

Since translation is no longer viewed as a mere replacement of linguistic codes from one language to another, it has increasingly been considered, especially with the advent of the cultural turn in the late 70's, in relation to the broader external context in which it takes place. According to scholars in the field, the translation process is determined by the political, economic and cultural values which exert external pressures on the translator. Correspondingly, the relationship between translation as an act of re-writing the original text and ideology has already been established. This paper addresses the issue of how ideology comes into play in the translational process and what strategies the translator adopts to foreground or circumvent ideological constraints. Along with this, the paper will touch upon the notions of censorship, manipulation, subversion and domestication which are deemed of relevance to this very topic. In fact, after the domination of the empirically-oriented linguistic approaches in translation studies, the relationship between translation and ideology has to be foregrounded to draw attention to the fact that the translation process is not a mere text-to-text linguistic transfer, but, on the contrary, takes place in the midst of economic, political, cultural and religious variables, which some scholars subsume under the category ideology.

Keywords: translation, language, ideology, subversion, censorship and manipulation

Procedia PDF Downloads 226
1536 How Is a Machine-Translated Literary Text Organized in Coherence? An Analysis Based upon Theme-Rheme Structure

Authors: Jiang Niu, Yue Jiang

Abstract:

With the ultimate goal to automatically generate translated texts with high quality, machine translation has made tremendous improvements. However, its translations of literary works are still plagued with problems in coherence, esp. the translation between distant language pairs. One of the causes of the problems is probably the lack of linguistic knowledge to be incorporated into the training of machine translation systems. In order to enable readers to better understand the problems of machine translation in coherence, to seek out the potential knowledge to be incorporated, and thus to improve the quality of machine translation products, this study applies Theme-Rheme structure to examine how a machine-translated literary text is organized and developed in terms of coherence. Theme-Rheme structure in Systemic Functional Linguistics is a useful tool for analysis of textual coherence. Theme is the departure point of a clause and Rheme is the rest of the clause. In a text, as Themes and Rhemes may be connected with each other in meaning, they form thematic and rhematic progressions throughout the text. Based on this structure, we can look into how a text is organized and developed in terms of coherence. Methodologically, we chose Chinese and English as the language pair to be studied. Specifically, we built a comparable corpus with two modes of English translations, viz. machine translation (MT) and human translation (HT) of one Chinese literary source text. The translated texts were annotated with Themes, Rhemes and their progressions throughout the texts. The annotated texts were analyzed from two respects, the different types of Themes functioning differently in achieving coherence, and the different types of thematic and rhematic progressions functioning differently in constructing texts. 
By analyzing and contrasting the two modes of translations, it is found that compared with the HT, 1) the MT features “pseudo-coherence”, with lots of ill-connected fragments of information using “and”; 2) the MT system produces a static and less interconnected text that reads like a list; these two points, in turn, lead to the less coherent organization and development of the MT than that of the HT; 3) novel to traditional and previous studies, Rhemes do contribute to textual connection and coherence though less than Themes do and thus are worthy of notice in further studies. Hence, the findings suggest that Theme-Rheme structure be applied to measuring and assessing the coherence of machine translation, to being incorporated into the training of the machine translation system, and Rheme be taken into account when studying the textual coherence of both MT and HT.

Keywords: coherence, corpus-based, literary translation, machine translation, Theme-Rheme structure

Procedia PDF Downloads 180
1535 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still at an early stage, as this is a relatively new phenomenon in terms of the interest it has raised in society. Machine learning helps to solve complex problems and to build AI systems, especially in cases where the knowledge is tacit or not known. For the identification of fake news, we applied three machine learning classifiers: Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and then incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake news due to the unavailability of corpora. We applied the three classifiers to two publicly available datasets. Experimental analysis based on the existing datasets indicates very encouraging and improved performance.

Keywords: fake news detection, natural language processing, machine learning, classification techniques

Procedia PDF Downloads 135
1534 Evaluation of Security and Performance of Master Node Protocol in the Bitcoin Peer-To-Peer Network

Authors: Muntadher Sallal, Gareth Owenson, Mo Adda, Safa Shubbar

Abstract:

Bitcoin is a digital currency based on a peer-to-peer network to propagate and verify transactions. Bitcoin is gaining wider adoption than any previous crypto-currency. However, the mechanism of peers randomly choosing logical neighbors without any knowledge of the underlying physical topology can cause a delay overhead in information propagation, which makes the system vulnerable to double-spend attacks. To alleviate the propagation delay problem, this paper introduces proximity-aware extensions to the current Bitcoin protocol, named Master Node Based Clustering (MNBC). The ultimate purpose of the proposed protocol, which is based on how clusters are formed and how nodes define their membership, is to reduce the information propagation delay in the Bitcoin network. In the MNBC protocol, physical internet connectivity increases and the number of hops between nodes decreases by assigning nodes to be responsible for maintaining clusters based on physical internet proximity. We show, through simulations, that the proposed protocol defines better clustering structures that optimize the performance of transaction propagation compared with the Bitcoin protocol. Partition attacks in the MNBC protocol, as well as in the Bitcoin network, are also evaluated in this paper. Evaluation results indicate that, even though the Bitcoin network is more resistant to the partitioning attack than the MNBC protocol, more resources must be spent to split the network in the MNBC protocol, especially with a higher number of nodes.

Keywords: Bitcoin network, propagation delay, clustering, scalability

Procedia PDF Downloads 95
1533 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network

Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi

Abstract:

Energy, delay and bandwidth are the prime issues in wireless sensor networks (WSNs). Energy usage optimization and efficient bandwidth utilization are particularly important. Event-triggered data aggregation facilitates both for the event-affected area of a WSN. Reliable delivery of critical information to the sink node is also a major challenge. To tackle these issues, we propose an event-driven dynamic clustering and data aggregation scheme for WSNs that extends the lifetime of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever an event is triggered, the event-triggered node selects the cluster head. (2) The cluster head gathers data from the sensor nodes within the cluster. (3) The cluster head identifies and classifies the events in the collected data using a Bayesian classifier. (4) The data are aggregated using a statistical method. (5) The cluster head discovers the paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data is critical, the cluster head sends it over multiple paths for reliable data communication. (7) Otherwise, the aggregated data is transmitted towards the sink node over the single path with the most bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.
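Steps (4) to (7) of the scheme can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the aggregation statistic, the path-ranking formula or the criticality test, so the mean, the bandwidth-times-energy score, the threshold and every name and number below are hypothetical.

```python
from collections import namedtuple

# A candidate route from the cluster head to the sink node (fields are assumed).
Path = namedtuple("Path", ["name", "residual_energy", "bandwidth", "distance"])

def aggregate(readings):
    """Step (4): statistical aggregation; the mean is an assumed choice."""
    return sum(readings) / len(readings)

def select_paths(paths, critical, k=2):
    """Steps (6)-(7): the k best paths for critical data, otherwise the single best."""
    # Assumed ranking: larger bandwidth * residual energy first, fewer hops on ties.
    ranked = sorted(paths, key=lambda p: (-(p.bandwidth * p.residual_energy), p.distance))
    return ranked[:k] if critical else ranked[:1]

# Hypothetical sensor readings and paths discovered in step (5).
readings = [41.0, 43.0, 45.0]
paths = [
    Path("A", residual_energy=5.0, bandwidth=100.0, distance=3),
    Path("B", residual_energy=4.0, bandwidth=200.0, distance=4),
    Path("C", residual_energy=1.0, bandwidth=50.0, distance=2),
]
CRITICAL_THRESHOLD = 40.0  # hypothetical cut-off for treating data as critical

value = aggregate(readings)
chosen = select_paths(paths, critical=value > CRITICAL_THRESHOLD)
print(value, [p.name for p in chosen])
```

With these toy numbers the aggregated value exceeds the threshold, so the two highest-ranked paths are used; a non-critical value would fall back to the single best path.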

Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication

Procedia PDF Downloads 418
1532 Approach Based on Fuzzy C-Means for Band Selection in Hyperspectral Images

Authors: Diego Saqui, José H. Saito, José R. Campos, Lúcio A. de C. Jorge

Abstract:

Hyperspectral images and remote sensing are important for many applications. A problem in the use of these images is the high volume of data to be processed, stored and transferred. Dimensionality reduction techniques can be used to reduce the volume of data. In this paper, an approach to band selection based on clustering algorithms is presented. This approach makes it possible to reduce the volume of data. The proposed structure is based on the Fuzzy C-Means (or K-Means) and NWHFC algorithms. Attributes that are new in relation to other studies in the literature, such as kurtosis and low correlation, are also considered. The results of the approach using Fuzzy C-Means and K-Means with different attributes are compared. Both algorithms show similarly good results, particularly when the attributes variance and kurtosis are used in the clustering process, and the approach is applicable to hyperspectral images.
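One possible reading of clustering-based band selection with the variance and kurtosis attributes is sketched below: each band is described by those two statistics, the bands are grouped with K-Means, and the band closest to each centroid is kept. This is an assumption-laden illustration, not the paper's procedure. The number of clusters is fixed by hand (NWHFC is not implemented), the farthest-point initialization is a choice made for determinism, and the four tiny bands are invented.

```python
def variance_and_kurtosis(values):
    """Population variance and kurtosis (fourth standardized moment) of one band."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return (0.0, 0.0)
    kurt = sum((v - mean) ** 4 for v in values) / (n * var ** 2)
    return (var, kurt)

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    """K-Means with a deterministic farthest-point start (an assumed choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

def select_bands(bands, k):
    """Keep, for each cluster, the index of the band nearest to the centroid."""
    features = [variance_and_kurtosis(b) for b in bands]
    labels, centroids = kmeans(features, k)
    selected = []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            selected.append(min(members, key=lambda i: sq_dist(features[i], centroids[j])))
    return sorted(selected)

# Hypothetical pixel values for four bands: two low-variance, two high-variance.
bands = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [10, 20, 30, 40, 50],
    [10, 20, 30, 40, 51],
]
selected = select_bands(bands, k=2)
print(selected)
```

In practice the attributes would likely need rescaling before clustering, since variance and kurtosis live on very different scales.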

Keywords: band selection, fuzzy c-means, k-means, hyperspectral image

Procedia PDF Downloads 376
1531 Privacy Preserving Data Publishing Based on Sensitivity in Context of Big Data Using Hive

Authors: P. Srinivasa Rao, K. Venkatesh Sharma, G. Sadhya Devi, V. Nagesh

Abstract:

Privacy Preserving Data Publication is a major concern at present because the amount of data being published through the internet is increasing day by day. Because of its size, this huge amount of data is referred to as Big Data. This project deals with privacy preservation in the context of Big Data using a data warehousing solution called Hive. We implemented Nearest Similarity Based Clustering (NSB) with bottom-up generalization to achieve (v,l)-anonymity. (v,l)-Anonymity deals with sensitivity vulnerabilities and ensures individual privacy. We also calculate the sensitivity levels by a simple comparison method using the index values, classifying the different levels of sensitivity. The experiments were carried out in the Hive environment to verify the efficiency of the algorithms with Big Data. This framework also supports the execution of existing algorithms without any changes. The model in the paper outperforms existing models.

Keywords: sensitivity, sensitive level, clustering, Privacy Preserving Data Publication (PPDP), bottom-up generalization, Big Data

Procedia PDF Downloads 264
1530 The Use of Punctuation by Primary School Students Writing Texts Collaboratively: A Franco-Brazilian Comparative Study

Authors: Cristina Felipeto, Catherine Bore, Eduardo Calil

Abstract:

This work aims to analyze and compare the punctuation marks (PM) in school texts of Brazilian and French students and the comments on these PM made spontaneously by the students while the text was being written. Assuming textual genetics as an investigative field within a dialogical and enunciative approach, we defined a common methodological design in two 1st year classrooms (7-year-olds) of primary school, one in Brazil (Maceio) and the other in France (Paris). Through a multimodal capture system of writing processes in real time and space (Ramos System), we recorded the collaborative writing proposal in dyads in each of the classrooms. This system preserves the classroom’s ecological characteristics and provides a video recording synchronized with the dialogues, gestures and facial expressions of the students, the stroke of the pen’s ink on the sheet of paper and the movement of the teacher and students in the classroom. The multimodal register of the writing process allowed access to the text in progress and to the comments made by the students on what was being written. In each proposed text production, teachers organized their students in dyads and asked them to talk, agree on and write a fictional narrative. We selected a Dyad of Brazilian students (BD) and another Dyad of French students (FD) and filmed 6 proposals for each of the dyads. The proposals were collected during the 2nd Term of 2013 (Brazil) and 2014 (France). In the 6 texts written by the BD, 39 PMs and 825 written words were identified (on average, a PM every 23 words); of these 39 PMs, 27 were highlighted orally and commented on by either student. In the texts written by the FD, 48 PMs and 258 written words were identified (on average, 1 PM every 5 words); of these 48 PMs, 39 were commented on by the French students. Unlike what the studies on punctuation acquisition point out, the PM that occurred the most were hyphens (BD) and commas (FD).
Despite the significant difference between the types and quantities of PM in the written texts, the recognition of the need to write PM in the text in progress and the comments share some characteristics: i) the writing of the PM was not anticipated in relation to the text in progress; rather, the PM were added after the end of a sentence or after the text itself was finished; ii) the need to add punctuation marks to the text arose after one of the students had ‘remembered’ that a particular sign was needed; iii) most of the PM inscribed were related not to their linguistic functions but to the graphic-visual features of the text; iv) the comments justify or explain the PM, indicating metalinguistic reflections made by the students. Our results indicate how the comments of the BD and FD express the dialogic and subjective nature of knowledge acquisition. Our study suggests that the initial learning of PM depends more on its graphic features and interactional conditions than on its linguistic functions.

Keywords: collaborative writing, erasure, graphic marks, learning, metalinguistic awareness, textual genesis

Procedia PDF Downloads 141
1529 Architectural Experience of the Everyday in Bangkok CBD

Authors: Thirayu Jumsai Na Ayudhya

Abstract:

An attempt to understand what architecture means to people as they go about their everyday lives revealed that fields such as environmental psychology, environmental perception and environmental aesthetics do not adequately provide a contextualized and holistic theoretical framework. In my previous research, it was found that the way people make sense of their everyday architecture can be addressed in terms of four super‐ordinate themes: (1) building in urban (text), (2) building in (text), (3) building in human (text), and (4) building in time (text). In this research, Bangkok CBD was selected as the focal urban context, one in which an integrated style of architecture is noticeable. It is expected that in a unique urban context like Bangkok CBD, unprecedented super-ordinate themes will be unveiled through the reflection of people’s everyday experiences. Research on people’s architectural experience conducted in Bangkok CBD, Thailand, is presented succinctly. The research addresses the question of how people make sense of their everyday architecture and buildings, especially in a unique urban context such as Bangkok CBD, and identifies the ways in which they do so. Two key methodologies are adopted. First, Participant-Produced-Photograph (PPP) allows people to express their experiences of the everyday urban context freely, without any interference or forced data generation by researchers. Second, Interpretative Phenomenological Analysis (IPA) is applied. With IPA, a small pool of participants is considered, given the detailed level of analysis and its potential to produce a meaningful outcome.

Keywords: architectural experience, building appreciation, design psychology, environmental psychology, sense-making, the everyday experience, transactional theory

Procedia PDF Downloads 302
1528 Intertextuality as a Dialogue Between Postmodern Writer J. Fowles and Mid-English Writer J. Donne

Authors: Isahakyan Heghine

Abstract:

Intertextuality, which is at the centre of attention of both linguists and literary critics, is vividly expressed in the works of the outstanding British novelist and philosopher J. Fowles. ‘The Magus’ is a deep psychological and philosophical novel with vivid intertextual links with Greek mythology and authors from different epochs. The aim of the paper is to show how intertextuality might serve as a dialogue between two authors (J. Fowles and J. Donne) disguised in the dialogue of two protagonists of the novel: Conchis and Nicholas. Contrasting viewpoints concerning man's isolation and loneliness are stated in the dialogue. Conceptual analysis of the text makes it possible both to decode the conceptual information of the text and to find out its intertextual links.

Keywords: dialogue, conceptual analysis, isolation, intertextuality

Procedia PDF Downloads 306
1527 Identification of Nonlinear Systems Using Radial Basis Function Neural Network

Authors: C. Pislaru, A. Shebani

Abstract:

This paper uses the radial basis function neural network (RBFNN) for system identification of nonlinear systems. Five nonlinear systems are used to examine the performance of the RBFNN in system modeling: a dual tank system, a single tank system, a DC motor system, and two academic models. The feed-forward method is considered in this work for modelling the nonlinear dynamic models. The K-Means clustering algorithm is used to select the centers of the radial basis function network because it is reliable, offers fast convergence and can handle large data sets. The least mean square method is used to adjust the weights of the output layer, and the Euclidean distance method is used to determine the width of the Gaussian function.
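The three ingredients named above (K-Means for the centers, a Euclidean distance rule for the Gaussian widths, least mean square for the output weights) can be combined as in the following minimal sketch. It is not the paper's implementation: the width rule (distance to the nearest other center), the learning rate, the number of centers and the sine target standing in for a nonlinear system are all assumptions chosen for illustration.

```python
import math

def kmeans_1d(xs, k, iters=50):
    """Plain K-Means on scalars with a deterministic, evenly spread start."""
    s = sorted(xs)
    centers = [s[(2 * j + 1) * len(s) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def widths_from_centers(centers):
    """Assumed heuristic: width = Euclidean distance to the nearest other center."""
    return [min(abs(c - o) for j, o in enumerate(centers) if j != i)
            for i, c in enumerate(centers)]

def hidden_layer(x, centers, widths):
    """Gaussian activations of the hidden units for input x."""
    return [math.exp(-((x - c) ** 2) / (2 * w ** 2)) for c, w in zip(centers, widths)]

def predict(x, centers, widths, weights, bias):
    phi = hidden_layer(x, centers, widths)
    return bias + sum(w * p for w, p in zip(weights, phi))

def train_lms(xs, ys, centers, widths, lr=0.05, epochs=200):
    """Least mean square: nudge the output weights along the per-sample error."""
    weights, bias = [0.0] * len(centers), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            phi = hidden_layer(x, centers, widths)
            err = y - (bias + sum(w * p for w, p in zip(weights, phi)))
            weights = [w + lr * err * p for w, p in zip(weights, phi)]
            bias += lr * err
    return weights, bias

def mse(xs, ys, centers, widths, weights, bias):
    return sum((y - predict(x, centers, widths, weights, bias)) ** 2
               for x, y in zip(xs, ys)) / len(xs)

# Hypothetical nonlinear target (a sine wave), not one of the paper's five systems.
xs = [i * 2 * math.pi / 39 for i in range(40)]
ys = [math.sin(x) for x in xs]
centers = kmeans_1d(xs, k=5)
widths = widths_from_centers(centers)
weights, bias = train_lms(xs, ys, centers, widths)
print(mse(xs, ys, centers, widths, weights, bias))
```

Only the output layer is trained iteratively here; the centers and widths are fixed once from the data, which is what keeps this kind of network cheap to fit.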

Keywords: system identification, nonlinear systems, neural networks, radial basis function, K-means clustering algorithm

Procedia PDF Downloads 447
1526 Discriminating Between Energy Drinks and Sports Drinks Based on Their Chemical Properties Using Chemometric Methods

Authors: Robert Cazar, Nathaly Maza

Abstract:

Energy drinks and sports drinks are quite popular among young adults and teenagers worldwide. Some concerns regarding their health effects, particularly those of energy drinks, have been raised based on scientific findings. Differentiating between these two types of drinks by means of their chemical properties seems to be an instructive task, and chemometrics provides an appropriate strategy to do so. In this study, a discrimination analysis of energy and sports drinks has been carried out applying chemometric methods. A set of eleven samples of available commercial brands of drinks (seven energy drinks and four sports drinks) was collected. Each sample was characterized by eight chemical variables (carbohydrates, energy, sugar, sodium, pH, degrees Brix, density, and citric acid). The data set was standardized and examined by exploratory chemometric techniques such as clustering and principal component analysis. As a preliminary step, a variable selection was carried out by inspecting the variable correlation matrix. It was detected that some variables are redundant, so they can be safely removed, leaving only five variables that are sufficient for this analysis: sugar, sodium, pH, density, and citric acid. Then, a hierarchical clustering employing the average-linkage criterion and the Euclidean distance metric was performed. It perfectly separates the two types of drinks, since the resultant dendrogram, cut at the 25% similarity level, sorts the samples into two well-defined groups, one containing the energy drinks and the other the sports drinks. Further assurance of the complete discrimination is provided by the principal component analysis. The projection of the data set on the first two principal components, which retain 71% of the data information, makes it possible to visualize the distribution of the samples in the two groups identified in the clustering stage.
Since the first principal component is the discriminating one, the inspection of its loadings makes it possible to characterize these groups. The energy drinks group possesses medium to high values of density, citric acid, and sugar. The sports drinks group, on the other hand, exhibits low values of those variables. In conclusion, the application of chemometric methods to a data set that features some chemical properties of a number of energy and sports drinks provides an accurate, dependable way to discriminate between these two types of beverages.
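The clustering stage described in this abstract (standardization followed by average-linkage hierarchical clustering with Euclidean distance) can be sketched as below. The five samples and their two variables are invented for illustration and are not the study's measurements, and the tree is stopped at two clusters instead of being cut at a similarity level.

```python
import math

def standardize(rows):
    """Rescale every column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def average_linkage(rows, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters whose members
    have the smallest mean pairwise Euclidean distance."""
    clusters = [[i] for i in range(len(rows))]

    def linkage(a, b):
        return sum(math.dist(rows[i], rows[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        _, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical (sugar, citric acid) values: rows 0-2 mimic energy drinks,
# rows 3-4 mimic sports drinks. These are NOT the data from the study.
samples = [
    (11.0, 0.60),
    (10.5, 0.55),
    (11.4, 0.65),
    (6.0, 0.20),
    (5.5, 0.25),
]
clusters = average_linkage(standardize(samples), n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

Standardizing first matters because Euclidean distance would otherwise be dominated by whichever variable has the largest numeric range.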

Keywords: chemometrics, clustering, energy drinks, principal component analysis, sports drinks

Procedia PDF Downloads 82