Search results for: text obfuscation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1322

1052 Financial Reports and Common Ownership: An Analysis of the Mechanisms Common Owners Use to Induce Anti-Competitive Behavior

Authors: Kevin Smith

Abstract:

Publicly traded companies in the US are legally obligated to host earnings calls that discuss their most recent financial reports. During these calls, investors can ask the companies questions about these financial reports and about the future direction of the company. This paper examines whether common institutional owners use these calls to indirectly signal to companies in their portfolios not to take actions that could hurt the common owner's interests. The paper uses transcripts from the earnings calls of the six largest health insurance companies in the US from 2014 to 2019. The data are analyzed using text analysis and sentiment analysis to look for patterns in the statements made by common owners. The analysis found that common owners were more likely to recommend against direct price competition and instead redirect the insurance companies towards more passive actions, like investing in new technologies. This result indicates a mechanism that common owners use to reduce competition in the health insurance market.

Keywords: common ownership, text analysis, sentiment analysis, machine learning

Procedia PDF Downloads 74
1051 A Review of Research on Pre-training Technology for Natural Language Processing

Authors: Moquan Gong

Abstract:

In recent years, with the rapid development of deep learning, pre-training technology for natural language processing has made great progress. Early natural language processing long relied on word vector methods such as Word2Vec to encode text. These word vector methods can also be regarded as static pre-training techniques. However, this context-free text representation brings very limited improvement to downstream natural language processing tasks and cannot solve the problem of word polysemy. ELMo proposed a context-sensitive text representation method that can effectively handle polysemy. Since then, pre-trained language models such as GPT and BERT have been proposed one after another. Among them, the BERT model significantly improved performance on many typical downstream tasks, greatly promoting technological development in the field, and natural language processing has since entered the era of dynamic pre-training technology. Since then, a large number of pre-trained language models based on BERT and XLNet have continued to emerge, and pre-training technology has become an indispensable mainstream technology in the field of natural language processing. This article first gives an overview of pre-training technology and its development history and introduces in detail the classic pre-training technology in the field of natural language processing, including early static pre-training technology and classic dynamic pre-training technology; it then briefly surveys a series of enlightening pre-training technologies, including improved models based on BERT and XLNet; on this basis, it analyzes the problems faced by current pre-training technology research; finally, it looks ahead to the future development trend of pre-training technology.

Keywords: natural language processing, pre-training, language model, word vectors

Procedia PDF Downloads 57
1050 StockTwits Sentiment Analysis on Stock Price Prediction

Authors: Min Chen, Rubi Gupta

Abstract:

Understanding and predicting stock market movements is a challenging problem. Stock markets are believed to be partially driven by public sentiment, which has led to numerous research efforts to predict stock market trends using public sentiment expressed on social media such as Twitter, but with limited success. Recently, the microblogging website StockTwits has become increasingly popular for users to share their discussions and sentiments about stocks and financial markets. In this project, we analyze the text content of StockTwits tweets and extract financial sentiment using text featurization and machine learning algorithms. StockTwits tweets are first pre-processed using techniques including stopword removal, special character removal, and case normalization to remove noise. Features are extracted from these preprocessed tweets through a text featurization process using bags of words, N-gram models, TF-IDF (term frequency-inverse document frequency), and latent semantic analysis. Machine learning models are then trained to classify the tweets' sentiment as positive (bullish) or negative (bearish). The correlation between the aggregated daily sentiment and daily stock price movement is then investigated using Pearson’s correlation coefficient. Finally, the sentiment information is applied together with time series stock data to predict stock price movement. Experiments on five companies (Apple, Amazon, General Electric, Microsoft, and Target) over a period of nine months demonstrate the effectiveness of our study in improving prediction accuracy.
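
The preprocessing, TF-IDF featurization, and Pearson correlation steps named in this abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the stopword list, the function names `preprocess`, `tf_idf`, and `pearson`, and the particular TF-IDF weighting are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the authors' list).
STOPWORDS = {"the", "a", "is", "to", "and", "of", "on", "in"}

def preprocess(tweet):
    """Case-normalize, replace special characters with spaces, drop stopwords."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", tweet.lower()).split()
    return [t for t in tokens if t not in STOPWORDS]

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / document frequency).
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

In a setup like the one described, `pearson` would be applied to a series of aggregated daily sentiment scores and the matching series of daily price changes; the classifier itself is left out of this sketch.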

Keywords: machine learning, sentiment analysis, stock price prediction, tweet processing

Procedia PDF Downloads 156
1049 Shaking the Iceberg: Metaphoric Shifting and Loss in the German Translations of 'The Sun Also Rises'

Authors: Christopher Dick

Abstract:

While the translation of 'literal language' poses numerous challenges for the translator, the translation of 'figurative language' creates even more complicated issues. It has been only in the last several decades that scholars have attempted to propose theories of figurative language translation, including metaphor translation. Even less work has applied these theories to metaphoric translation in literary texts. And almost no work has linked an analysis of metaphors in translation with the recent scholarship on conceptual metaphors. A study of literature in translation must not only examine the inevitable shifts that occur as specific metaphors move from source language to target language but also analyze the ways in which these shifts impact conceptual metaphors and, ultimately, the text as a whole. Doing so contributes to on-going efforts to bridge the sometimes wide gulf between considerations of content and form in literary studies. This paper attempts to add to the body of scholarly literature on metaphor translation and the function of metaphor in a literary text. Specifically, the study examines the metaphoric expressions in Hemingway’s The Sun Also Rises. First, the issue of Hemingway and metaphor is addressed. Next, the study examines the specific metaphors in the original novel in English and the German translations, first in Annemarie Horschitz’s 1928 German version and then in the recent Werner Schmitz 2013 translation. Hemingway’s metaphors, far from being random occurrences of figurative language, are linguistic manifestations of deeper conceptual metaphors that are central to an interpretation of the text. By examining the modifications that are made to these original metaphoric expressions as they are translated into German, one can begin to appreciate the shifts involved with metaphor translation. 
The translation of Hemingway’s metaphors into German represents significant metaphoric loss and shifting that subsequently shakes the important conceptual metaphors in the novel.

Keywords: Hemingway, Conceptual Metaphor, Translation, Stylistics

Procedia PDF Downloads 356
1048 The Power of Words: The Use of Language in Ethan Frome

Authors: Ritu Sharma

Abstract:

In order to be objective, critics must examine the dynamic relationships between the author, the reader, the text, and the outside world. However, it is also crucial to recognize that because the language was created by God, meaning is ingrained in it. Meaning is located in and discovered through literature rather than being limited to the author, reader, text, or the outside world. The link between the author, the reader, and the text is crucial because literature unites an author and a reader through the use of language. Literature is a potent kind of communication, and Ethan Frome's audience is forever changed as a result of the book's language and the language its characters use. The narrative of Ethan Frome and his wife Zeena is presented in Ethan Frome. Ethan's story is told throughout the course of the book, revealed through the eyes of the narrator, an outsider passing through Starkfield, as well as through the insight that the narrator gains from the townspeople and his stay on the Frome farm. The story is set in the rural New England community of Starkfield, Massachusetts. The weather provides the ideal setting for Ethan and the narrator to get to know one another as the narrator gets preoccupied with unraveling the narrative that underlies Ethan's physical anomalies. In addition to telling a gripping tale and capturing human nature as it is, Ethan Frome uses its storyline to achieve something more significant. The book by Edith Wharton supports language. Zeena's deliberate and convincing language challenges relativity and meaninglessness. Ethan and Mattie's effort to effectively use words reflects the complexity of language, and their battle illustrates the influence that language may have if and when it is used. Ethan Frome defends the written word, the foundation upon which it is constructed, as a literary work. 
Communication is based on language, and as the characters respond to and get involved in disputes throughout the book, Zeena, Ethan, and Mattie, each reflects particular theories of communication that help define their uses of communication within the broader context of language.

Keywords: dynamic relationships, potent, communication, complexity

Procedia PDF Downloads 91
1047 Improved Safety Science: Utilizing a Design Hierarchy

Authors: Ulrica Pettersson

Abstract:

Collection of information on incidents is regularly done through pre-printed incident report forms. These tend to be incomplete and frequently lack essential information. One consequence is that reports with inadequate information, which do not fulfil analysts’ requirements, are transferred into the analysis process. To improve an incident reporting form, theory from design science, witness psychology, and interview and questionnaire research has been used. Three experiments have previously been conducted to evaluate the form and have shown significantly improved results. The form has proved to capture knowledge regardless of the incidents’ character or context. The aim of this paper is to describe how design science, and more specifically a design hierarchy, can be used to construct a collection form for improvements in safety science.

Keywords: data collection, design science, incident reports, safety science

Procedia PDF Downloads 223
1046 Mobile Phone Text Reminders and Voice Call Follow-ups Improve Attendance for Community Retail Pharmacy Refills; Learnings from Lango Sub-region in Northern Uganda

Authors: Jonathan Ogwal, Louis H. Kamulegeya, John M. Bwanika, Davis Musinguzi

Abstract:

Introduction: Community retail Pharmacy drug distribution points (CRPDDP) were implemented in the Lango sub-region as part of the Ministry of Health’s response to improving access and adherence to antiretroviral treatment (ART). Clients received their ART refills from nearby local pharmacies; as such, the need for continuous engagement through mobile phone appointment reminders and health messages. We share learnings from the implementation of mobile text reminders and voice call follow-ups among ART clients attending the CRPDDP program in northern Uganda. Methods: A retrospective data review of electronic medical records from four pharmacies allocated for CRPDDP in the Lira and Apac districts of the Lango sub-region in Northern Uganda was done from February to August 2022. The process involved collecting phone contacts of eligible clients from the health facility appointment register and uploading them onto a messaging platform customized by Rapid-pro, an open-source software. Client information, including code name, phone number, next appointment date, and the allocated pharmacy for ART refill, was collected and kept confidential. Contacts received appointment reminder messages and other messages on positive living as an ART client. Routine voice call follow-ups were done to ascertain the picking of ART from the refill pharmacy. Findings: In total, 1,354 clients were reached from the four allocated pharmacies found in urban centers. 972 clients received short message service (SMS) appointment reminders, and 382 were followed up through voice calls. The majority (75%) of the clients returned for refills on the appointed date, 20% returned within four days after the appointment date, and the remaining 5% needed follow-up where they reported that they were not in the district by the appointment date due to other engagements. Conclusion: The use of mobile text reminders and voice call follow-ups improves the attendance of community retail pharmacy refills.

Keywords: antiretroviral treatment, community retail drug distribution points, mobile text reminders, voice call follow-up

Procedia PDF Downloads 99
1045 The Popular Imagination through the Poem of “Ras B’Nadam”

Authors: Hirreche Baghdad Mohamed

Abstract:

One of the main texts in popular culture in Algeria is a symbolic and imaginary tale, through which the author was able to derive from the world and popular cultural stock and symbolic capital elements that enabled him to create a synthesis between a number of imaginary and real events. Thanks to the level of spirituality that the author was experiencing, he was able to go deep in order to redraw the boundaries of human life in view of its existence and status (life experiences, its end, and its fate). It is a text that is consistent with religious values and has a philosophical depth. This poem can be shared in official and unofficial meetings, during feasts, and during popular celebrations, such as circumcision ceremonies, marriage, and condolences. It has also the ability to draw attention and appeal to the listener and let him travel into the imaginary world. It is the text related to the story of "Ras b’nadem", or "the head of a man", or rather, a "human skull", for which only a few academic studies have been devoted, and there are two copies of it, one attributed to Lakhdar Ibn Khalouf as a matter of suspicion, while the other is attributed to Qadour Ibn Ashour Al-Zarhouni.

Keywords: ras B’Nadam, ras al mahna, lakhdar ibn khalouf, qadour ibn ashour, sufism, melhoun poetry, resistance poetry

Procedia PDF Downloads 192
1044 The Arab Spring Rebellion or Revolution: An Analysis of the Text

Authors: Sulaiman Ahmed

Abstract:

This paper will analyse the classical Islamic text in order to determine whether the Arab Spring was a rebellion or a revolution. Commencing in 2010, we saw a series of revolutions, or what some would call rebellions, throughout the Arab peninsula. Many of the religious clergy came out emphatically in support of the people who wanted to overthrow the leaders. This brought forth the important question of the acceptability of rebelling against unjust leaders in Islamic theological texts. The paper will analyse the Islamic legal and theological position on the permissibility of rebelling, whether there is scholarly consensus on the issue, and how the texts are analysed in order to arrive at the position we have today. The position of the clergy who supported the Arab Spring will also be analysed in order to deduce whether their position falls within the religious framework. An inquiry will be made to determine the ideology of those who joined the rebellion after its inception and whether these ideas can be found in classical Islamic texts. The nuances of these positions will be analysed in order to determine whether what we witnessed was a rebellion or a revolution.

Keywords: rebellion, revolution, Arab spring, scholarly consensus

Procedia PDF Downloads 154
1043 Measuring Text-Based Semantics Relatedness Using WordNet

Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed

Abstract:

Measuring semantic similarity between texts means calculating the semantic relatedness between them using various techniques. Our web application (Measuring Relatedness of Concepts, MRC) allows a user to input two text corpora and get the semantic similarity percentage between them using WordNet. The application goes through five stages to compute semantic relatedness: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into Parts-of-Speech), Synonyms Extraction (retrieves synonyms for each keyword), Measuring Similarity (similarity is measured using keywords and synonyms), and Visualization (graphical representation of the similarity measure). Hence the user can also measure similarity on the basis of features. The end result is a percentage score and the word(s) that form the basis of similarity between both texts, obtained using different tools on the same platform. In future work, we look forward to a Web-as-a-live-corpus application that provides a simpler and more user-friendly tool to compare documents and extract useful information.
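
As a rough illustration of the preprocessing, synonym extraction, and similarity stages listed above, the sketch below compares two texts after mapping keywords to a shared canonical form. Everything here is an assumption for illustration: the small `SYNONYMS` table stands in for WordNet lookups, Jaccard overlap is only one possible similarity measure (the abstract does not name one), and the part-of-speech and visualization stages are omitted.

```python
import re

# Hypothetical stand-in for WordNet synonym sets; a real system would query WordNet.
SYNONYMS = {
    "car": {"automobile", "auto"},
    "fast": {"quick", "rapid"},
}
# Tiny illustrative stopword list.
STOPWORDS = {"the", "a", "an", "is", "was", "very"}

def keywords(text):
    """Preprocessing stage: lowercase, tokenize, drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def canonical(word):
    """Synonym stage: map a word to the head word of its synonym group, if any."""
    for head, syns in SYNONYMS.items():
        if word == head or word in syns:
            return head
    return word

def relatedness(text_a, text_b):
    """Return (percentage, shared concepts) from Jaccard overlap of canonical keywords."""
    a = {canonical(w) for w in keywords(text_a)}
    b = {canonical(w) for w in keywords(text_b)}
    if not a and not b:
        return 0.0, set()
    shared = a & b
    return 100.0 * len(shared) / len(a | b), shared
```

Like the application described, `relatedness` returns both a percentage score and the words on which the match is based.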

Keywords: Graphviz representation, semantic relatedness, similarity measurement, WordNet similarity

Procedia PDF Downloads 238
1042 Deciding on Customary International Law: The ICJ's Approach Using Induction, Deduction, and Assertion

Authors: Maryam Nimehforush, Hamid Vahidkia

Abstract:

The International Court of Justice, as well as international law in general, may not excel in methodology. In contrast to how it interprets treaties, the Court rarely explains how it determines the existence, content, and scope of customary international law rules it uses. The Court's jurisprudence only mentions the inductive and deductive methods of law determination sporadically. Both the Court and legal literature have not extensively discussed their approach to determining customary international law. Surprisingly, the question of the Court's methodology has not garnered much attention despite the fact that interpreting and shaping the law have always been intertwined. This article seeks to redirect focus to the method used by the Court in deciding the customs of international law it enforces, emphasizing the importance of methodology in the evolution of customary international law. The text begins by giving explanations for the concepts of ‘induction’ and ‘deduction’ and explores how the Court utilizes them. It later examines when the Court employs inductive and deductive reasoning, the varied types and purposes of deduction, and the connection between the two approaches. The text questions the different concepts of inductive and deductive tradition and proves that the primary approach utilized by the Court is not induction or deduction but instead, assertion.

Keywords: ICJ, law, international, induction, deduction, assertion

Procedia PDF Downloads 11
1041 An Interdisciplinary Approach to Investigating Style: A Case Study of a Chinese Translation of Gilbert’s (2006) Eat Pray Love

Authors: Elaine Y. L. Ng

Abstract:

Elizabeth Gilbert’s (2006) biography Eat, Pray, Love describes her travels to Italy, India, and Indonesia after a painful divorce. The author’s experiences with love, loss, search for happiness, and meaning have resonated with a huge readership. As regards the translation of Gilbert’s (2006) Eat, Pray, Love into Chinese, it was first translated by a Taiwanese translator He Pei-Hua and published in Taiwan in 2007 by Make Boluo Wenhua Chubanshe with the fairly catching title “Enjoy! Traveling Alone.” The same translation was translocated to China, republished in simplified Chinese characters by Shanxi Shifan Daxue Chubanshe in 2008 and renamed in China, entitled “To Be a Girl for the Whole Life.” Later on, the same translation in simplified Chinese characters was reprinted by Hunan Wenyi Chubanshe in 2013. This study employs Munday’s (2002) systemic model for descriptive translation studies to investigate the translation of Gilbert’s (2006) Eat, Pray, Love into Chinese by the Taiwanese translator Hu Pei-Hua. It employs an interdisciplinary approach, combining systemic functional linguistics and corpus stylistics with sociohistorical research within a descriptive framework to study the translator’s discursive presence in the text. The research consists of three phases. The first phase is to locate the target text within its socio-cultural context. The target-text context concerning the para-texts, readers’ responses, and the publishers’ orientation will be explored. The second phase is to compare the source text and the target text for the categorization of translation shifts by using the methodological tools of systemic functional linguistics and corpus stylistics. The investigation concerns the rendering of mental clauses and speech and thought presentation. The final phase is an explanation of the causes of translation shifts. 
The linguistic findings are related to the extra-textual information collected in an effort to ascertain the motivations behind the translator’s choices. There exist sets of possible factors that may have contributed to shaping the textual features of the given translation within a specific socio-cultural context. The study finds that the translator generally reproduces the mental clauses and speech and thought presentation closely according to the original. Nevertheless, the language of the translation has been widely criticized to be unidiomatic and stiff, losing the elegance of the original. In addition, the several Chinese translations of the given text produced by one Taiwanese and two Chinese publishers are basically the same. They are repackaged slightly differently, mainly with the change of the book cover and its captions for each version. By relating the textual findings to the extra-textual data of the study, it is argued that the popularity of the Chinese translation of Gilbert’s (2006) Eat, Pray, Love may not be attributed to the quality of the translation. Instead, it may have to do with the way the work is promoted strategically by the social media manipulated by the four e-bookstores promoting and selling the book online in China.

Keywords: chinese translation of eat pray love, corpus stylistics, motivations for translation shifts, systemic approach to translation studies

Procedia PDF Downloads 175
1040 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers

Authors: Yogendra Sisodia

Abstract:

Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. After collecting sentence embeddings, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author investigated two pre-trained language models for this task, MiniLM and RoBERTa, and further fine-tuned them on legal contracts. The author used Multiple Negative Ranking Loss for the creation of sentence transformers. The fine-tuned language models and sentence transformers showed promising results.
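
For readers unfamiliar with the loss named here, the following sketch shows the usual formulation of a multiple-negatives ranking loss on a batch of (anchor, positive) embedding pairs: each anchor should score highest against its own positive, and the other positives in the batch serve as negatives. It is a plain-Python illustration, not the author's training code; the function names and the `scale` value of 20 are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over the batch.

    positives[i] is the correct match for anchors[i]; every other positive
    in the batch is treated as a negative for that anchor.
    """
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [scale * cosine(a, p) for p in positives]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)
```

The loss is close to zero when every anchor is most similar to its own positive and grows when an in-batch negative scores higher, which is why larger batches supply more negatives at no labelling cost.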

Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity

Procedia PDF Downloads 108
1039 Information Extraction for Short-Answer Question for the University of the Cordilleras

Authors: Thelma Palaoag, Melanie Basa, Jezreel Mark Panilo

Abstract:

Checking short-answer questions and essays, whether on paper or in electronic form, is a tiring and tedious task for teachers. Evaluating a student’s output requires knowledge of a wide array of domains, and scoring the work is often a critical task. Several attempts have been made in the past few years to create automated writing assessment software, but these have received negative responses from teachers and students alike because of unreliable scoring, a lack of feedback, and other shortcomings. The study aims to create an application that can check short-answer questions by incorporating information extraction. Information extraction is a subfield of Natural Language Processing (NLP) in which a chunk of text (technically known as unstructured text) is broken down to gather the necessary bits of data and/or keywords (structured text) to be further analyzed or used by query tools. The proposed system shall be able to extract keywords or phrases from an individual’s answers and match them against a corpus of words (as defined by the instructor), which shall be the basis for evaluating the individual’s answer. The proposed system shall also enable the teacher to provide feedback and re-evaluate the student’s output for writing elements that the computer cannot fully evaluate, such as creativity and logic. Teachers can formulate, design, and check short-answer questions efficiently by defining keywords or phrases as parameters and assigning weights for checking answers. With the proposed system, the time teachers spend checking and evaluating students’ output shall be lessened, making the teacher more productive and the task easier.
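
The idea of checking an answer against instructor-defined, weighted keywords can be sketched as below. This is a simplified illustration under stated assumptions, not the proposed system: `score_answer` and the rubric format (a dict from keyword or phrase to weight) are hypothetical, and matching is by exact word or phrase only.

```python
import re

def score_answer(answer, weighted_keywords):
    """Return (score, matched) for one answer.

    weighted_keywords maps an instructor-defined keyword or phrase to its weight;
    score is the matched share of the total weight, between 0.0 and 1.0.
    """
    # Normalize the answer to lowercase alphanumeric words separated by single spaces.
    text = " ".join(re.findall(r"[a-z0-9]+", answer.lower()))
    matched = [kw for kw in weighted_keywords
               if re.search(r"\b" + re.escape(kw.lower()) + r"\b", text)]
    total = sum(weighted_keywords.values())
    earned = sum(weighted_keywords[kw] for kw in matched)
    return (earned / total if total else 0.0), matched
```

For example, with the rubric `{"photosynthesis": 2.0, "carbon dioxide": 1.0, "chlorophyll": 1.0}`, an answer that mentions photosynthesis and carbon dioxide but not chlorophyll earns three quarters of the weight. The list of matched keywords gives the teacher a starting point for feedback on what was missed.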

Keywords: information extraction, short-answer question, natural language processing, application

Procedia PDF Downloads 428
1038 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done in R and Greenplum in-database analytic tools.

Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining

Procedia PDF Downloads 353
1037 AI Tutor: A Computer Science Domain Knowledge Graph-Based QA System on JADE platform

Authors: Yingqi Cui, Changran Huang, Raymond Lee

Abstract:

In this paper, we propose an AI Tutor that uses ontology and natural language processing techniques to generate a computer science domain knowledge graph and answer users’ questions based on the knowledge graph. We define eight types of relation to extract relationships between entities from computer science domain text. The AI Tutor is separated into two agents, a learning agent and a Question-Answer (QA) agent, and is developed on the JADE (a multi-agent system) platform. The learning agent is responsible for reading text to extract information and generate a corresponding knowledge graph using defined patterns. The QA agent can understand users’ questions and answer them based on the knowledge graph generated by the learning agent.

Keywords: artificial intelligence, natural language processing, knowledge graph, intelligent agents, QA system
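
The split between a learning agent that builds a graph from patterns and a QA agent that queries it can be illustrated with a toy sketch. It is written in Python rather than on JADE, and everything in it is an assumption: the abstract does not list the eight relation types, so the two patterns (`is_a`, `part_of`) and the functions `learn` and `answer` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical relation patterns; the more specific pattern is tried first.
PATTERNS = [
    ("part_of", re.compile(r"^(?:an? |the )?(.+?) is part of (?:an? |the )?(.+?)\.?$", re.I)),
    ("is_a", re.compile(r"^(?:an? |the )?(.+?) is an? (.+?)\.?$", re.I)),
]

def learn(sentences):
    """Learning step: build {subject: [(relation, object), ...]} from pattern matches."""
    graph = defaultdict(list)
    for sentence in sentences:
        for relation, pattern in PATTERNS:
            m = pattern.match(sentence.strip())
            if m:
                graph[m.group(1).lower()].append((relation, m.group(2).lower()))
                break  # keep only the first matching relation per sentence
    return graph

def answer(graph, question):
    """QA step: answer 'What is X?' from is_a facts; return None if nothing is known."""
    m = re.match(r"^what is (?:an? |the )?(.+?)\??$", question.strip(), re.I)
    if not m:
        return None
    for relation, obj in graph.get(m.group(1).lower(), []):
        if relation == "is_a":
            return obj
    return None
```

Feeding `learn` the sentences "A stack is a data structure." and "The heap is part of the memory." yields two facts, after which `answer(graph, "What is a stack?")` returns the stored definition.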

Procedia PDF Downloads 187
1036 A Teaching Method for Improving Sentence Fluency in Writing

Authors: Manssour Habbash, Srinivasa Rao Idapalapati

Abstract:

Although writing is a multifaceted task, teaching writing is demanding for two basic reasons: grammar and syntax. This article provides a method of teaching writing that was found to be effective in improving students’ academic writing composition skills. The article explains the concepts of ‘guided-discovery’ and ‘guided-construction’ upon which the method is grounded and developed. After a brief commentary on what the core could primarily mean, the article presents an exposition of understanding and identifying the core and building upon it, demonstrating how a teacher can use these concepts to improve the writing skills of their students. The method is an adaptation of the grammar translation method, improvised to suit a student-centered classroom environment. An intervention using this method was tried out with positive outcomes in a formal classroom research setup. Because the content relates closely to classroom practice and is useful to practicing teachers, the process and findings are presented in narrative form, along with the results in tabular form.

Keywords: core of a text, guided construction, guided discovery, theme of a text

Procedia PDF Downloads 381
1035 Linguistics and Islamic Studies in Historical Perspective: The Case of Interdisciplinary Communication

Authors: Olga Bernikova, Oleg Redkin

Abstract:

Islamic Studies and the Arabic language have been indivisible from each other since the appearance of Islam and the formation of the Classical language. The present paper demonstrates the correlation between linguistics and religion in historical perspective with regard to the peculiarities of the Arabic language that distinguish it from the other prophetic languages. Islamic Studies and Linguistics have likewise been indivisible since the advent of Islam and the formation of the Classical language. In historical perspective, the Arabic language has been and remains a tool for the expression of Islamic rhetoric, being a prophetic language. No other language in the world has preserved its stability for more than 14 centuries. Islam is considered to be one of the most important factors that secure this stability. The analysis and study of the text of the Qurʾān are of special importance for those who study Islamic civilization, its role in the destinies of mankind, and its values and virtues. Without understanding the polyphony of this sacred text and the indivisible unity of its form and content, it is impossible to understand social developments in either the present or the past. Since the first years of Islam, the Qurʾān has been at the center of attention of Muslim scholars, and of theologians, historians, philologists, jurists, and mathematicians. Only quite recently has it become an object of analysis for specialists in computer technologies. In Arabic and Islamic studies, mediaeval texts, i.e. textual documents, are considered the main source of information. Hence the analysis of a multiplicity of texts and the finding of interconnections between them help to assemble scattered fragments of the riddle into a common and eloquent picture of the past, which reflects the state of society at certain stages of its development. The text of the Qurʾān, like any other phenomenon, is a multifaceted object that should be studied from different points of view.
As a result, this complex study will allow obtaining a three-dimensional image rather than a flat picture alone.

Keywords: Arabic, Islamic studies, linguistics, religion

Procedia PDF Downloads 223
1034 The Impact of Recurring Events in Fake News Detection

Authors: Ali Raza, Shafiq Ur Rehman Khan, Raja Sher Afgun Usmani, Asif Raza, Basit Umair

Abstract:

Detection of fake news and missing information is gaining popularity, especially with the advancement of social media and online news platforms. Social media platforms are the main and fastest source of fake news propagation, whereas online news websites contribute to fake news dissemination. In this study, we propose a framework to detect fake news using the temporal features of text and consider user feedback to identify whether the news is fake or not. In recent studies, temporal features in text documents have received considerable attention in Natural Language Processing, together with user feedback, but these studies only try to classify the textual data as fake or true. This research article shows the impact of recurring and non-recurring events on fake and true news. We investigate using two models, BERT and Bi-LSTM, and conclude that BERT gives better results and that 70% of true news items are recurring while the remaining 30% are non-recurring.

Keywords: natural language processing, fake news detection, machine learning, Bi-LSTM

Procedia PDF Downloads 23
1033 'Wandering Uterus': An Analogy of Perception of Women in Hippocratic Corpus and Post-Modern Times

Authors: Ankita Sharma

Abstract:

The study proposes to review the perception of women in the Classical Age (500-336 BC), when Greek philosophy was in bloom. It was observed that women had very few rights and were still under the control of men. One of the possible reasons for this exclusion was woman’s biology, which had a huge influence on her being seen as inferior to men. The text ‘Hippocratic Corpus’ focuses on the biological construct of the female body in classical Greek science, which perpetuated the idea of women as second-class citizens who were considered inherently weaker than men. The research highlights the significance of the text, which was used to encourage women of that time to get married and produce children, and how the perception remains the same to this day. The Greek belief in the need for confinement and control of the 'wandering uterus' has led to an understanding of men as superior. The pivotal emphasis of this research is on women and their bodies, which are depicted in a misogynistic way; this depiction paved the way for Hippocratic writers to influence society’s attitude towards women through their writings. It is intended to draw attention to the prevailing cultural assumptions and preconceived notions about female anatomy, rooted in ancient science, that had a pervasive influence in the following centuries.

Keywords: classical Greek theory, women, wandering womb, modern ideology

Procedia PDF Downloads 195
1032 Degraded Document Analysis and Extraction of Original Text Document: An Approach without Optical Character Recognition

Authors: L. Hamsaveni, Navya Prakash, Suresha

Abstract:

Document Image Analysis recognizes text and graphics in documents acquired as images. This paper adopts an approach to degraded document image analysis that does not use Optical Character Recognition (OCR). The technique involves document imaging methods such as Image Fusing and Speeded Up Robust Features (SURF) Detection to identify and extract the degraded regions from a set of document images and obtain an original document with complete information. If a captured degraded document image is skewed, it has to be straightened (deskewed) before further processing. A special image storage format known as YCbCr is used as a tool to convert the grayscale image to RGB image format. The presented algorithm is tested on various types of degraded documents, such as printed documents, handwritten documents, old script documents, and handwritten image sketches in documents. The purpose of this research is to obtain an original document from a given set of degraded documents of the same source.
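
As a rough illustration of how a YCbCr representation relates grayscale and RGB values, the sketch below uses the full-range JFIF (BT.601) conversion equations. The abstract does not say which YCbCr variant or processing pipeline the authors use, so the function names and the choice of coefficients are assumptions made only for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (JFIF / BT.601 coefficients, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def _clamp8(x):
    """Round to the nearest integer and clamp to the 0..255 range."""
    return max(0, min(255, int(round(x))))


def ycbcr_to_rgb(y, cb, cr):
    """Convert full-range YCbCr back to 8-bit RGB."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return _clamp8(r), _clamp8(g), _clamp8(b)


def gray_to_rgb(gray):
    """Treat a gray level as luma (Y) with neutral chroma (Cb = Cr = 128)."""
    return ycbcr_to_rgb(gray, 128, 128)
```

With neutral chroma, a gray level maps to an RGB triple with three equal components; colour only appears once Cb or Cr moves away from 128.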

Keywords: grayscale image format, image fusing, RGB image format, SURF detection, YCbCr image format

Procedia PDF Downloads 377
1031 High Secure Data Hiding Using Cropping Image and Least Significant Bit Steganography

Authors: Khalid A. Al-Afandy, El-Sayyed El-Rabaie, Osama Salah, Ahmed El-Mhalaway

Abstract:

This paper presents a highly secure data hiding technique using image cropping and Least Significant Bit (LSB) steganography. Crops at predefined secret coordinates are extracted from the cover image. The secret text message is divided into sections, and the number of sections equals the number of image crops. Each section of the secret text message is embedded into an image crop with a secret sequence using the LSB technique. The embedding is done using the color channels of the cover image. The stego image is obtained by reassembling the image and the stego crops. The results of the technique are compared with other state-of-the-art techniques. Evaluation is based on visual inspection to detect any degradation of the stego image, the difficulty of extracting the embedded data by any unauthorized viewer, the Peak Signal-to-Noise Ratio (PSNR) of the stego image, and the CPU time of the embedding algorithm. Experimental results indicate that the proposed technique is more secure than other traditional techniques.
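
The general idea of LSB embedding across several crops can be sketched as follows. This is a minimal illustration, not the authors' implementation: each crop is modelled as a flat list of 8-bit channel values, the message is split evenly across the crops, and bits are written in plain sequential order. The secret coordinates and the secret embedding sequence described in the abstract are not modelled, and all function names are hypothetical.

```python
def message_to_bits(message: bytes):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]


def embed_lsb(channel_values, message: bytes):
    """Return a copy of channel_values whose lowest bits carry the message."""
    bits = message_to_bits(message)
    if len(bits) > len(channel_values):
        raise ValueError("message does not fit in this crop")
    stego = list(channel_values)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return stego


def extract_lsb(channel_values, n_bytes: int) -> bytes:
    """Read n_bytes back from the lowest bits of channel_values."""
    out = bytearray()
    for k in range(n_bytes):
        byte = 0
        for value in channel_values[8 * k: 8 * k + 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)


def split_sections(message: bytes, n_crops: int):
    """Split the message into n_crops consecutive sections (the last may be shorter)."""
    size = -(-len(message) // n_crops)  # ceiling division
    return [message[i * size:(i + 1) * size] for i in range(n_crops)]
```

Because only the lowest bit of each channel value changes, every value in a stego crop differs from the cover value by at most 1, which is why LSB embedding is hard to notice visually.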

Keywords: steganography, stego, LSB, crop

Procedia PDF Downloads 269
1030 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing, i.e. expressing the same concept in alternative ways by using different words or phrases, is one of the important tasks in natural language processing. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases, we create a system that automatically extracts paraphrases from a corpus built from different sources of news articles, since these are likely to contain paraphrases when they report the same event on the same day. Simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment techniques (e.g. Dynamic Time Warping (DTW)) for extracting paraphrases already exist and have been applied to English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic, due to the presence of phenomena that are not present in English, such as free word order, zero copula, and pro-dropping. Thus, if we can analyze how the existing algorithms for English fail for Arabic, then we can find a solution for Arabic. The results are promising.
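
The two baseline techniques named in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions: whitespace tokenization, a smoothed IDF variant, and a 0/1 word-mismatch cost for DTW are choices made for this example, not details taken from the paper, and the function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) using a smoothed IDF (assumed variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def dtw_distance(a, b):
    """Dynamic Time Warping over two word sequences with a 0/1 mismatch cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = 0.0 if a[i - 1] == b[j - 1] else 1.0
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]
```

Both measures compare surface word forms, so a language with free word order or dropped pronouns can score true paraphrases as dissimilar, which is the kind of failure the abstract sets out to analyze.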

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 388
1029 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

With the evolution of technology, the expression of opinions has shifted to the digital world. Politics, one of the hottest topics of opinion mining research, has merged with behavior analysis for determining affiliation in text, which constitutes the subject of this paper. This study aims to classify the text in news/blogs as either Republican or Democrat with the minimum number of features. As an initial set, 68 features, 64 of which are Linguistic Inquiry and Word Count (LIWC) features, are tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector are reduced based on 7 feature selection algorithms. The results show that Decision Tree, Rule Induction, and M5 Rule classifiers, when used with SVM and IGR feature selection algorithms, performed the best, with up to 82.5% accuracy on a given dataset. Further tests on a single feature and on the linguistic-based feature sets showed similar results. The feature “function”, an aggregate feature of the linguistic category, is found to be the most differentiating of the 68 features, with 81% accuracy by itself in classifying articles as either Republican or Democrat.
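
To make the idea of ranking features concrete, the sketch below scores a single feature by information gain ratio. It assumes that "IGR" in the abstract stands for information gain ratio and that feature values have already been discretized into bins; neither assumption is confirmed by the source, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of the feature divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0
```

A feature that separates the two classes perfectly scores 1.0, while a feature whose bins each contain both classes in equal proportion scores 0.0; features can then be ranked by this score and the lowest-ranked ones dropped.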

Keywords: feature selection, LIWC, machine learning, politics

Procedia PDF Downloads 383
1028 Comics Scanlation and Publishing Houses Translation

Authors: Sharifa Alshahrani

Abstract:

Comics is a multimodal text wherein meaning is created by taking in all modes of expression at once. It uses two different semiotic modes, the verbal and the visual, together to make meaning, and these semiotic modes can be socially and culturally shaped to give meaning. Therefore, comics translation cannot treat comics as a monomodal text by translating only the verbal mode inside or outside the speech balloons, as cultural differences are encoded in the visual mode as well. Due to the development of the internet and editing software, comics translation is no longer confined to publishing houses and official translation: scanlation, or fan translation, has taken the initiative in translating comics, driven by emotional attraction to the culture and genre. Scanlation is carried out by volunteer fans who translate out of passion. However, quality is one of the debatable issues relating to scanlation and fan translation. This study will investigate how the dynamic multimodal relationship in comics is exploited and interpreted in translation by exploring the translation strategies and procedures adopted by publishing houses and scanlation in interpreting comics into Arabic, using three analytical frameworks: the cultural references model, the multimodal relation model, and translation strategies and procedures models.

Keywords: comics, multimodality, translation, scanlation

Procedia PDF Downloads 212
1027 Linguistic Analysis of Argumentation Structures in Georgian Political Speeches

Authors: Mariam Matiashvili

Abstract:

Argumentation is an integral part of our daily communications, formal or informal. Argumentative reasoning, techniques, and language tools are used both in personal conversations and in the business environment. Verbalization of opinions requires the use of extraordinary syntactic-pragmatic structural quantities: arguments that add credibility to the statement. The study of argumentative structures allows us to identify the linguistic features that make the text argumentative. Knowing what elements make up an argumentative text in a particular language helps the users of that language improve their skills. Also, natural language processing (NLP) has become especially relevant recently. In this context, one of the main emphases is on the computational processing of argumentative texts, which will enable the automatic recognition and analysis of large volumes of textual data. The research deals with the linguistic analysis of the argumentative structures of Georgian political speeches, particularly the linguistic structure, characteristics, and functions of the parts of the argumentative text: claims, support, and attack statements. The research aims to describe the linguistic cues that give the sentence a judgmental/controversial character and help to identify the reasoning parts of the argumentative text. The empirical data comes from the Georgian Political Corpus, particularly TV debates. Consequently, the texts are of a dialogical nature, representing a discussion between two or more people (most often between a journalist and a politician).
The research uses the following approaches to identify and analyze the argumentative structures: (1) Lexical Classification and Analysis: identifying lexical items that are relevant in the process of creating argumentative texts and creating the lexicon of argumentation (groups of words gathered from a semantic point of view); (2) Grammatical Analysis and Classification: grammatical analysis of the words and phrases identified on the basis of the argumentation lexicon; (3) Argumentation Schemas: describing and identifying the argumentation schemes that are most likely used in Georgian political speeches. As a final step, we analyzed the relations between the above-mentioned components. For example, if an identified argument scheme is “Argument from Analogy”, the identified lexical items semantically express analogy too, and they are most likely adverbs in Georgian. As a result, we created a lexicon of the words that play a significant role in creating Georgian argumentative structures. Linguistic analysis has shown that verbs play a crucial role in creating argumentative structures.

Keywords: Georgian, argumentation schemas, argumentation structures, argumentation lexicon

Procedia PDF Downloads 74
1026 Death of the Author and Birth of the Adapter in a Literary Work

Authors: Slwa Al-Hammad

Abstract:

Adaptation studies have been closely aligned to translation studies as both deal with the process of rendering the meaning from one culture to another. These two disciplines are related to each other, but the theories are still being developed. This research aims to fill this gap and provide a contribution to the growing discipline of adaptation studies through a theoretical perspective while investigating how different cultural interpretations of adaptation influence the final literary product. This research focuses on the theoretical concepts of Barthes’s death of the author and Benjamin’s afterlife of the text in translation, which is believed to lead to the birth of the adapter in a literary work. That is, in adaptation, the ‘death’ of the author allows for the ‘birth’ of the adapter, offering them all the creative possibilities of authorship. It also explores the differences between the meanings of adaptation in the West and the Arab world through the analysis of adapted texts in Arabic initially deriving from the European and American literature of the 19th and 20th centuries. The methodology of this thesis is based upon qualitative literary analysis, in which original and adapted works are compared and contrasted, with the additional insights of literary and adaptation theories and prior scholarship. The main works discussed are the Arabic adaptations of William Faulkner’s novels. The analysis is guided by theories of adaptation studies to help in explaining the concepts of relocating, recreating, and rewriting in the process of adaptation. It draws on scholarship on adaptations to inquire into the status of the adapted texts in relation to the original texts. Also, these theories prove that adaptation is the process that is used to transfer text from source to adapted text, not some other analytical practice. 
Through the textual analysis, concepts of the death of the author and the birth of the adapter will be illustrated, as will the roles of the adapter and the task of rendering works for a different culture, and the understanding of adaptation and Arabization in Arabic literature.

Keywords: adaptation, Arabization, authorship, recreating, relocating

Procedia PDF Downloads 138
1025 Anaphora and Cataphora on the Selected State of the City Addresses of the Mayor of Dapitan

Authors: Mark Herman Sumagang Potoy

Abstract:

A State of the City Address (SOCA) is a speech, modelled after the State of the Nation Address, that is given not as mandated by law but usually as a matter of practice or tradition, and is delivered before the chief executive’s constituents. Through it, the general public is made aware of the performance of the local government unit and its agenda for the coming year. It is therefore imperative for SOCAs to convey their message clearly and carry out their function of enlightening their readers, which can be achieved through the proper use of reference. Anaphora and cataphora are the two major types of reference; the former refers back to something that has already been mentioned, while the latter points forward to something that is yet to be said. This paper seeks to identify the types of reference employed in the 2014 to 2016 SOCAs of Hon. Rosalina Garcia Jalosjos, Mayor of Dapitan City, and to look into how the references contribute to the clarity of the message of the text. The qualitative method of research is used in this study through an in-depth analysis of the corpus. Once copies of the SOCAs are secured from the Office of the City Mayor, they are analyzed using the documentary technique: categorizing the types of reference as anaphora or cataphora, counting each type, and describing the implications of the dominant types used in the addresses. The analysis finds that both reference types, anaphora and cataphora, are employed in the three SOCAs, the former being used more frequently than the latter, accounting for 80% and 20% of actual usage, respectively. Moreover, the use of anaphora and cataphora in the three addresses helps convey the message clearly because they primarily serve as aids to avoid repeating the same element in the text, especially when there is no need to emphasize a point.
Finally, it is recommended that writers of State of the City Addresses have extensive knowledge of how references should be used and the functions they take in the text, since this is a vital tool for transmitting a message clearly. Moreover, English teachers should explicitly teach the proper usage of anaphora and cataphora as instruments for developing cohesion in written discourse, to enable students to write not only with sense but also with fluidity in tying utterances together.

Keywords: anaphora, cataphora, reference, State of the City Address

Procedia PDF Downloads 192
1024 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification

Authors: Zhaoxin Luo, Michael Zhu

Abstract:

In natural languages, there are always complex semantic hierarchies. Obtaining a feature representation based on these complex semantic hierarchies is key to the success of a model. Several RNN models have recently been proposed that use latent indicators to obtain the hierarchical structure of documents. However, a model that uses only a single-layer latent indicator cannot capture the true hierarchical structure of a language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. By using EM and training hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many of them. Our deep hierarchical model not only achieves results comparable to large pre-trained models on the Chinese short text classification problem but also achieves state-of-the-art results on the Chinese long text classification problem.

Keywords: natural language processing, recurrent neural network, hierarchical structure, document classification, Chinese

Procedia PDF Downloads 68
1023 Semantic Based Analysis in Complaint Management System with Analytics

Authors: Francis Alterado, Jennifer Enriquez

Abstract:

Semantic Based Analysis in Complaint Management System with Analytics is an enhanced tool for clients to submit complaints, as well as a mechanism for Palawan Polytechnic College to gather, process, and monitor the status of these complaints. The study includes a mobile application that serves as a remote communication facility between the students and the school management regarding the issues encountered by students and the resolution of every complaint received. Text mining and clustering algorithms were utilized in processing the complaints. Every module of the system was tested, and based on the results, the modules were 100% free from error before integration was done. System testing was also done by checking the expected functionality of the system, which was 100% functional. The system was tested by 10 students who forwarded complaints to 10 departments. Based on the results, the students were able to submit complaints, the system was able to process them accordingly by identifying the department for which each complaint was intended, and the concerned department was able to give the student feedback on the complaint received. With this, the system received a rating of 4.7, which means Excellent.

Keywords: technology adoption, emerging technology, issues challenges, algorithm, text mining, mobile technology

Procedia PDF Downloads 199