Search results for: text summarization
1043 Measuring Text-Based Semantics Relatedness Using WordNet
Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed
Abstract:
Measuring semantic similarity between texts is calculating semantic relatedness between texts using various techniques. Our web application (Measuring Relatedness of Concepts-MRC) allows user to input two text corpuses and get semantic similarity percentage between both using WordNet. Our application goes through five stages for the computation of semantic relatedness. Those stages are: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into Parts-of-Speech), Synonyms Extraction (retrieves synonyms against each keyword), Measuring Similarity (using keywords and synonyms, similarity is measured) and Visualization (graphical representation of similarity measure). Hence the user can measure similarity on basis of features as well. The end result is a percentage score and the word(s) which form the basis of similarity between both texts with use of different tools on same platform. In future work we look forward for a Web as a live corpus application that provides a simpler and user friendly tool to compare documents and extract useful information.Keywords: Graphviz representation, semantic relatedness, similarity measurement, WordNet similarity
Procedia PDF Downloads 2391042 Deciding on Customary International Law: The ICJ's Approach Using Induction, Deduction, and Assertion
Authors: Maryam Nimehforush, Hamid Vahidkia
Abstract:
The International Court of Justice, as well as international law in general, may not excel in methodology. In contrast to how it interprets treaties, the Court rarely explains how it determines the existence, content, and scope of customary international law rules it uses. The Court's jurisprudence only mentions the inductive and deductive methods of law determination sporadically. Both the Court and legal literature have not extensively discussed their approach to determining customary international law. Surprisingly, the question of the Court's methodology has not garnered much attention despite the fact that interpreting and shaping the law have always been intertwined. This article seeks to redirect focus to the method used by the Court in deciding the customs of international law it enforces, emphasizing the importance of methodology in the evolution of customary international law. The text begins by giving explanations for the concepts of ‘induction’ and ‘deduction’ and explores how the Court utilizes them. It later examines when the Court employs inductive and deductive reasoning, the varied types and purposes of deduction, and the connection between the two approaches. The text questions the different concepts of inductive and deductive tradition and proves that the primary approach utilized by the Court is not induction or deduction but instead, assertion.Keywords: ICJ, law, international, induction, deduction, assertion
Procedia PDF Downloads 181041 An Interdisciplinary Approach to Investigating Style: A Case Study of a Chinese Translation of Gilbert’s (2006) Eat Pray Love
Authors: Elaine Y. L. Ng
Abstract:
Elizabeth Gilbert’s (2006) biography Eat, Pray, Love describes her travels to Italy, India, and Indonesia after a painful divorce. The author’s experiences with love, loss, search for happiness, and meaning have resonated with a huge readership. As regards the translation of Gilbert’s (2006) Eat, Pray, Love into Chinese, it was first translated by a Taiwanese translator He Pei-Hua and published in Taiwan in 2007 by Make Boluo Wenhua Chubanshe with the fairly catching title “Enjoy! Traveling Alone.” The same translation was translocated to China, republished in simplified Chinese characters by Shanxi Shifan Daxue Chubanshe in 2008 and renamed in China, entitled “To Be a Girl for the Whole Life.” Later on, the same translation in simplified Chinese characters was reprinted by Hunan Wenyi Chubanshe in 2013. This study employs Munday’s (2002) systemic model for descriptive translation studies to investigate the translation of Gilbert’s (2006) Eat, Pray, Love into Chinese by the Taiwanese translator Hu Pei-Hua. It employs an interdisciplinary approach, combining systemic functional linguistics and corpus stylistics with sociohistorical research within a descriptive framework to study the translator’s discursive presence in the text. The research consists of three phases. The first phase is to locate the target text within its socio-cultural context. The target-text context concerning the para-texts, readers’ responses, and the publishers’ orientation will be explored. The second phase is to compare the source text and the target text for the categorization of translation shifts by using the methodological tools of systemic functional linguistics and corpus stylistics. The investigation concerns the rendering of mental clauses and speech and thought presentation. The final phase is an explanation of the causes of translation shifts. The linguistic findings are related to the extra-textual information collected in an effort to ascertain the motivations behind the translator’s choices. There exist sets of possible factors that may have contributed to shaping the textual features of the given translation within a specific socio-cultural context. The study finds that the translator generally reproduces the mental clauses and speech and thought presentation closely according to the original. Nevertheless, the language of the translation has been widely criticized to be unidiomatic and stiff, losing the elegance of the original. In addition, the several Chinese translations of the given text produced by one Taiwanese and two Chinese publishers are basically the same. They are repackaged slightly differently, mainly with the change of the book cover and its captions for each version. By relating the textual findings to the extra-textual data of the study, it is argued that the popularity of the Chinese translation of Gilbert’s (2006) Eat, Pray, Love may not be attributed to the quality of the translation. Instead, it may have to do with the way the work is promoted strategically by the social media manipulated by the four e-bookstores promoting and selling the book online in China.Keywords: chinese translation of eat pray love, corpus stylistics, motivations for translation shifts, systemic approach to translation studies
Procedia PDF Downloads 1761040 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers
Authors: Yogendra Sisodia
Abstract:
Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. After collecting sentence embeddings, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author of this research investigated two pre-trained language models for this task: MiniLM and Roberta, and further fine-tuned them on Legal Contracts. The author used Multiple Negative Ranking Loss for the creation of sentence transformers. The fine-tuned language models and sentence transformers showed promising results.Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity
Procedia PDF Downloads 1091039 Information Extraction for Short-Answer Question for the University of the Cordilleras
Authors: Thelma Palaoag, Melanie Basa, Jezreel Mark Panilo
Abstract:
Checking short-answer questions and essays, whether it may be paper or electronic in form, is a tiring and tedious task for teachers. Evaluating a student’s output require wide array of domains. Scoring the work is often a critical task. Several attempts in the past few years to create an automated writing assessment software but only have received negative results from teachers and students alike due to unreliability in scoring, does not provide feedback and others. The study aims to create an application that will be able to check short-answer questions which incorporate information extraction. Information extraction is a subfield of Natural Language Processing (NLP) where a chunk of text (technically known as unstructured text) is being broken down to gather necessary bits of data and/or keywords (structured text) to be further analyzed or rather be utilized by query tools. The proposed system shall be able to extract keywords or phrases from the individual’s answers to match it into a corpora of words (as defined by the instructor), which shall be the basis of evaluation of the individual’s answer. The proposed system shall also enable the teacher to provide feedback and re-evaluate the output of the student for some writing elements in which the computer cannot fully evaluate such as creativity and logic. Teachers can formulate, design, and check short answer questions efficiently by defining keywords or phrases as parameters by assigning weights for checking answers. With the proposed system, teacher’s time in checking and evaluating students output shall be lessened, thus, making the teacher more productive and easier.Keywords: information extraction, short-answer question, natural language processing, application
Procedia PDF Downloads 4281038 Enhance the Power of Sentiment Analysis
Authors: Yu Zhang, Pedro Desouza
Abstract:
Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done in R and Greenplum in-database analytic tools.Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining
Procedia PDF Downloads 3531037 AI Tutor: A Computer Science Domain Knowledge Graph-Based QA System on JADE platform
Authors: Yingqi Cui, Changran Huang, Raymond Lee
Abstract:
In this paper, we proposed an AI Tutor using ontology and natural language process techniques to generate a computer science domain knowledge graph and answer users’ questions based on the knowledge graph. We define eight types of relation to extract relationships between entities according to the computer science domain text. The AI tutor is separated into two agents: learning agent and Question-Answer (QA) agent and developed on JADE (a multi-agent system) platform. The learning agent is responsible for reading text to extract information and generate a corresponding knowledge graph by defined patterns. The QA agent can understand the users’ questions and answer humans’ questions based on the knowledge graph generated by the learning agent.Keywords: artificial intelligence, natural Language processing, knowledge graph, intelligent agents, QA system
Procedia PDF Downloads 1891036 A Teaching Method for Improving Sentence Fluency in Writing
Authors: Manssour Habbash, Srinivasa Rao Idapalapati
Abstract:
Although writing is a multifaceted task, teaching writing is a demanding task basically for two reasons: Grammar and Syntax. This article provides a method of teaching writing that was found to be effective in improving students’ academic writing composition skill. The article explains the concepts of ‘guided-discovery’ and ‘guided-construction’ upon which a method of teaching writing is grounded and developed. Providing a brief commentary on what the core could mean primarily, the article presents an exposition of understanding and identifying the core and building upon the core that can demonstrate the way a teacher can make use of the concepts in teaching for improving the writing skills of their students. The method is an adaptation of grammar translation method that has been improvised to suit to a student-centered classroom environment. An intervention of teaching writing through this method was tried out with positive outcomes in formal classroom research setup, and in view of the content’s quality that relates more to the classroom practices and also in consideration of its usefulness to the practicing teachers the process and the findings are presented in a narrative form along with the results in tabular form.Keywords: core of a text, guided construction, guided discovery, theme of a text
Procedia PDF Downloads 3811035 Linguistics and Islamic Studies in Historical Perspective: The Case of Interdisciplinary Communication
Authors: Olga Bernikova, Oleg Redkin
Abstract:
Islamic Studies and the Arabic language are indivisible from each other starting from the appearance of Islam and formation of the Classical language. The present paper demonstrates correlation among linguistics and religion in historical perspective with regard to peculiarities of the Arabic language which distinguish it from the other prophetic languages. Islamic Studies and Linguistics are indivisible from each other starting from the invent of Islam and formation of the Classical language. In historical perspective, the Arabic language has been and remains a tool for the expression of Islamic rhetoric being a prophetic language. No other language in the world has preserved its stability for more than 14 centuries. Islam is considered to be one of the most important factors which secure this stability. The analysis and study of the text of Qurʾān are of special importance for those who study Islamic civilization, its role in the destinies of the mankind, its values and virtues. Without understanding of the polyphony of this sacred text, indivisible unity of its form and content it is impossible to understand social developments both in the present and the past. Since the first years of Islam Qurʾān had been in the center of attention of Muslim scholars, and in the center of attention of theologians, historians, philologists, jurists, mathematicians. Only quite recently it has become an object of analysis of the specialists of computer technologies. In Arabic and Islamic studies mediaeval texts i.e. textual documents are considered the main source of information. Hence the analysis of the multiplicity of various texts and finding of interconnections between them help to set scattered fragments of the riddle into a common and eloquent picture of the past, which reflects the state of the society on certain stages of its development. The text of the Qurʾān like any other phenomenon is a multifaceted object that should be studied from different points of view. As a result, this complex study will allow obtaining a three-dimensional image rather than a flat picture alone.Keywords: Arabic, Islamic studies, linguistics, religion
Procedia PDF Downloads 2231034 The Impact of Recurring Events in Fake News Detection
Authors: Ali Raza, Shafiq Ur Rehman Khan, Raja Sher Afgun Usmani, Asif Raza, Basit Umair
Abstract:
Detection of Fake news and missing information is gaining popularity, especially after the advancement in social media and online news platforms. Social media platforms are the main and speediest source of fake news propagation, whereas online news websites contribute to fake news dissipation. In this study, we propose a framework to detect fake news using the temporal features of text and consider user feedback to identify whether the news is fake or not. In recent studies, the temporal features in text documents gain valuable consideration from Natural Language Processing and user feedback and only try to classify the textual data as fake or true. This research article indicates the impact of recurring and non-recurring events on fake and true news. We use two models BERT and Bi-LSTM to investigate, and it is concluded from BERT we get better results and 70% of true news are recurring and rest of 30% are non-recurring.Keywords: natural language processing, fake news detection, machine learning, Bi-LSTM
Procedia PDF Downloads 251033 'Wandering Uterus': An Analogy of Perception of Women in Hippocratic Corpus and Post-Modern Times
Authors: Ankita Sharma
Abstract:
The study proposes to review the perception of women in the Classical Age (500-336 BC) when Greek Philosophy was in bloom. It was observed that women had very few rights and were still under the control of men. One of the possible reasons for this exclusion was woman’s biology that had a huge influence on her being seen as inferior to men. The text ‘Hippocratic Corpus’ focuses on the biological construct of the female body in classical Greek science that perpetuated the idea of women as second-class citizens and were considered inherently weaker than men. The research highlights the significance of the text that was used to encourage women of that time to get married and produce children and how till today the perception remains the same. The Greek belief of need for confinement and control of 'wandering uterus' has led to superior understanding of men. The pivotal emphasis of this research is to women and their bodies that are depicted in a misogynistic way which paved the way for Hippocratic writers to influence the society’s attitude towards women in their writings. It is intended to draw attention to the prevailing cultural assumptions and preconceived notions about female anatomy that had a pervasive influence in the following centuries with its roots being in ancient science.Keywords: classical Greek theory, women, wandering womb, modern ideology
Procedia PDF Downloads 1971032 Degraded Document Analysis and Extraction of Original Text Document: An Approach without Optical Character Recognition
Authors: L. Hamsaveni, Navya Prakash, Suresha
Abstract:
Document Image Analysis recognizes text and graphics in documents acquired as images. An approach without Optical Character Recognition (OCR) for degraded document image analysis has been adopted in this paper. The technique involves document imaging methods such as Image Fusing and Speeded Up Robust Features (SURF) Detection to identify and extract the degraded regions from a set of document images to obtain an original document with complete information. In case, degraded document image captured is skewed, it has to be straightened (deskew) to perform further process. A special format of image storing known as YCbCr is used as a tool to convert the Grayscale image to RGB image format. The presented algorithm is tested on various types of degraded documents such as printed documents, handwritten documents, old script documents and handwritten image sketches in documents. The purpose of this research is to obtain an original document for a given set of degraded documents of the same source.Keywords: grayscale image format, image fusing, RGB image format, SURF detection, YCbCr image format
Procedia PDF Downloads 3771031 High Secure Data Hiding Using Cropping Image and Least Significant Bit Steganography
Authors: Khalid A. Al-Afandy, El-Sayyed El-Rabaie, Osama Salah, Ahmed El-Mhalaway
Abstract:
This paper presents a high secure data hiding technique using image cropping and Least Significant Bit (LSB) steganography. The predefined certain secret coordinate crops will be extracted from the cover image. The secret text message will be divided into sections. These sections quantity is equal the image crops quantity. Each section from the secret text message will embed into an image crop with a secret sequence using LSB technique. The embedding is done using the cover image color channels. Stego image is given by reassembling the image and the stego crops. The results of the technique will be compared to the other state of art techniques. Evaluation is based on visualization to detect any degradation of stego image, the difficulty of extracting the embedded data by any unauthorized viewer, Peak Signal-to-Noise Ratio of stego image (PSNR), and the embedding algorithm CPU time. Experimental results ensure that the proposed technique is more secure compared with the other traditional techniques.Keywords: steganography, stego, LSB, crop
Procedia PDF Downloads 2701030 Classification of Political Affiliations by Reduced Number of Features
Authors: Vesile Evrim, Aliyu Awwal
Abstract:
By the evolvement in technology, the way of expressing opinions switched the direction to the digital world. The domain of politics as one of the hottest topics of opinion mining research merged together with the behavior analysis for affiliation determination in text which constitutes the subject of this paper. This study aims to classify the text in news/blogs either as Republican or Democrat with the minimum number of features. As an initial set, 68 features which 64 are constituted by Linguistic Inquiry and Word Count (LIWC) features are tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector reduced based on the 7 feature selection algorithms. The results show that Decision Tree, Rule Induction and M5 Rule classifiers when used with SVM and IGR feature selection algorithms performed the best up to 82.5% accuracy on a given dataset. Further tests on a single feature and the linguistic based feature sets showed the similar results. The feature “function” as an aggregate feature of the linguistic category, is obtained as the most differentiating feature among the 68 features with 81% accuracy by itself in classifying articles either as Republican or Democrat.Keywords: feature selection, LIWC, machine learning, politics
Procedia PDF Downloads 3831029 Comics Scanlation and Publishing Houses Translation
Authors: Sharifa Alshahrani
Abstract:
Comics is a multimodal text wherein meaning is created by taking in all modes of expression at once. It uses two different semiotic modes, the verbal and the visual modes, together to make meaning and these different semiotic modes can be socially and culturally shaped to give meaning. Therefore, comics translation cannot treat comics as a monomodal text by translating only the verbal mode inside or outside the speech balloons as the cultural differences are encoded in the visual mode as well. Due to the development of the internet and editing software, comics translation is not anymore confined to the publishing houses and official translation as scanlation, or the fan translation took the initiative in translating comics for being emotionally attracted to the culture and genre. Scanlation is carried out by volunteering fans who translate out of passion. However, quality is one of the debatable issues relating to scanlation and fan translation. This study will investigate how the dynamic multimodal relationship in comics is exploited and interpreted in the translation by exploring the translation strategies and procedures adopted by the publishing houses and scanlation in interpreting comics into Arabic using three analytical frameworks; cultural references model, multimodal relation model and translation strategies and procedures models.Keywords: comics, multimodality, translation, scanlation
Procedia PDF Downloads 2131028 Linguistic Analysis of Argumentation Structures in Georgian Political Speeches
Authors: Mariam Matiashvili
Abstract:
Argumentation is an integral part of our daily communications - formal or informal. Argumentative reasoning, techniques, and language tools are used both in personal conversations and in the business environment. Verbalization of the opinions requires the use of extraordinary syntactic-pragmatic structural quantities - arguments that add credibility to the statement. The study of argumentative structures allows us to identify the linguistic features that make the text argumentative. Knowing what elements make up an argumentative text in a particular language helps the users of that language improve their skills. Also, natural language processing (NLP) has become especially relevant recently. In this context, one of the main emphases is on the computational processing of argumentative texts, which will enable the automatic recognition and analysis of large volumes of textual data. The research deals with the linguistic analysis of the argumentative structures of Georgian political speeches - particularly the linguistic structure, characteristics, and functions of the parts of the argumentative text - claims, support, and attack statements. The research aims to describe the linguistic cues that give the sentence a judgmental/controversial character and helps to identify reasoning parts of the argumentative text. The empirical data comes from the Georgian Political Corpus, particularly TV debates. Consequently, the texts are of a dialogical nature, representing a discussion between two or more people (most often between a journalist and a politician). The research uses the following approaches to identify and analyze the argumentative structures Lexical Classification & Analysis - Identify lexical items that are relevant in argumentative texts creating process - Creating the lexicon of argumentation (presents groups of words gathered from a semantic point of view); Grammatical Analysis and Classification - means grammatical analysis of the words and phrases identified based on the arguing lexicon. Argumentation Schemas - Describe and identify the Argumentation Schemes that are most likely used in Georgian Political Speeches. As a final step, we analyzed the relations between the above mentioned components. For example, If an identified argument scheme is “Argument from Analogy”, identified lexical items semantically express analogy too, and they are most likely adverbs in Georgian. As a result, we created the lexicon with the words that play a significant role in creating Georgian argumentative structures. Linguistic analysis has shown that verbs play a crucial role in creating argumentative structures.Keywords: georgian, argumentation schemas, argumentation structures, argumentation lexicon
Procedia PDF Downloads 741027 Death of the Author and Birth of the Adapter in a Literary Work
Authors: Slwa Al-Hammad
Abstract:
Adaptation studies have been closely aligned to translation studies as both deal with the process of rendering the meaning from one culture to another. These two disciplines are related to each other, but the theories are still being developed. This research aims to fill this gap and provide a contribution to the growing discipline of adaptation studies through a theoretical perspective while investigating how different cultural interpretations of adaptation influence the final literary product. This research focuses on the theoretical concepts of Barthes’s death of the author and Benjamin’s afterlife of the text in translation, which is believed to lead to the birth of the adapter in a literary work. That is, in adaptation, the ‘death’ of the author allows for the ‘birth’ of the adapter, offering them all the creative possibilities of authorship. It also explores the differences between the meanings of adaptation in the West and the Arab world through the analysis of adapted texts in Arabic initially deriving from the European and American literature of the 19th and 20th centuries. The methodology of this thesis is based upon qualitative literary analysis, in which original and adapted works are compared and contrasted, with the additional insights of literary and adaptation theories and prior scholarship. The main works discussed are the Arabic adaptations of William Faulkner’s novels. The analysis is guided by theories of adaptation studies to help in explaining the concepts of relocating, recreating, and rewriting in the process of adaptation. It draws on scholarship on adaptations to inquire into the status of the adapted texts in relation to the original texts. Also, these theories prove that adaptation is the process that is used to transfer text from source to adapted text, not some other analytical practice. Through the textual analysis, concepts of the death of the author and the birth of the adapter will be illustrated, as will the roles of the adapter and the task of rendering works for a different culture, and the understanding of adaptation and Arabization in Arabic literature.Keywords: adaptation, Arabization, authorship, recreating, relocating
Procedia PDF Downloads 1431026 Anaphora and Cataphora on the Selected State of the City Addresses of the Mayor of Dapitan
Authors: Mark Herman Sumagang Potoy
Abstract:
State of the City Address (SOCA) is a speech, modelled after the State of the Nation Address, given not as mandated by law but usually a matter of practice or tradition delivered before the chief executive’s constituents. Through this, the general public is made to know the performance of the local government unit and its agenda for the coming year. Therefore, it is imperative for SOCAs to clearly convey its message and carry out the myriad function of enlightening its readers which could be achieved through the proper use of reference. Anaphora and cataphora are the two major types of reference; the former refer back to something that has already been mentioned while the latter points forward to something which is yet to be said. This paper seeks to identify the types of reference employed on the SOCAs from 2014 to 2016 of Hon. Rosalina Garcia Jalosjos, Mayor of Dapitan City and look into how the references contribute to the clarity of the message of the text. The qualitative method of research is used in this study through an in-depth analysis of the corpus. As soon as the copies of the SOCAs are secured from the Office of the City Mayor, they are then analyzed using documentary technique categorizing the types of reference as to anaphora and cataphora, counting each of these types and describing the implications of the dominant types used in the addresses. After a thorough analysis, it is found out that the two reference types namely, anaphora and cataphora are both employed on the three SOCAs, the former being used more frequently than the latter accounting to 80% and 20% of actual usage, respectively. Moreover, the use of anaphors and cataphora on the three addresses helps in conveying the message clearly because they primarily become aids to avoid the repetition of the same element in the text especially when there wasn’t a need to emphasize a point. Finally, it is recommended that writers of State of the City Addresses should have a vast knowledge on how reference should be used and the functions they take in the text since this is a vital tool to clearly transmit a message. Moreover, English teachers should explicitly teach the proper usage of anaphora and cataphora, as instruments to develop cohesion in written discourse, to enable students to write not only with sense but also with fluidity in tying utterances together.Keywords: anaphora, cataphora, reference, State of the City Address
Procedia PDF Downloads 1931025 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification
Authors: Zhaoxin Luo, Michael Zhu
Abstract:
In natural languages, there are always complex semantic hierarchies. Obtaining the feature representation based on these complex semantic hierarchies becomes the key to the success of the model. Several RNN models have recently been proposed to use latent indicators to obtain the hierarchical structure of documents. However, the model that only uses a single-layer latent indicator cannot achieve the true hierarchical structure of the language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. After using EM and training it hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many RNN layers. Our deep hierarchical model not only achieves comparable results to large pre-trained models on the Chinese short text classification problem but also achieves state of art results on the Chinese long text classification problem.Keywords: nature language processing, recurrent neural network, hierarchical structure, document classification, Chinese
Procedia PDF Downloads 691024 Semantic Based Analysis in Complaint Management System with Analytics
Authors: Francis Alterado, Jennifer Enriquez
Abstract:
Semantic Based Analysis in Complaint Management System with Analytics is an enhanced tool of providing complaints by the clients as well as a mechanism for Palawan Polytechnic College to gather, process, and monitor status of these complaints. The study has a mobile application that serves as a remote facility of communication between the students and the school management on the issues encountered by the student and the solution of every complaint received. In processing the complaints, text mining and clustering algorithms were utilized. Every module of the systems was tested and based on the results; these are 100% free from error before integration was done. A system testing was also done by checking the expected functionality of the system which was 100% functional. The system was tested by 10 students by forwarding complaints to 10 departments. Based on results, the students were able to submit complaints, the system was able to process accordingly by identifying to which department the complaints are intended, and the concerned department was able to give feedback on the complaint received to the student. With this, the system gained 4.7 rating which means Excellent.Keywords: technology adoption, emerging technology, issues challenges, algorithm, text mining, mobile technology
Procedia PDF Downloads 1991023 Compilation and Statistical Analysis of an Arabic-English Legal Corpus in Sketch Engine
Authors: C. Brierley, H. El-Farahaty, A. Farhan
Abstract:
The Leeds Parallel Corpus of Arabic-English Constitutions is a parallel corpus for the Arabic legal domain. Analysis of legal language via Corpus Linguistics techniques is an important development. In legal proceedings, a corpus-based approach to disambiguating meaning is set to replace the dictionary as an interpretative tool, and legal scholarship in the States is now attuned to the potential for Text Analytics over vast quantities of text-based legal material, following the business and medical industries. This trend is reflected in Europe: the interdisciplinary research group in Computer Assisted Legal Linguistics mines big data collections of legal and non-legal texts to analyse: legal interpretations; legal discourse; the comprehensibility of legal texts; conflict resolution; and linguistic human rights. This paper focuses on ‘dignity’ as an important aspect of the overarching concept of human rights in current constitutions across the Arab world. We have compiled a parallel, Arabic-English raw text corpus (169,861 Arabic words and 205,893 English words) from reputable websites such as the World Intellectual Property Organisation and CONSTITUTE, and uploaded and queried our corpus in Sketch Engine. Our most challenging task was sentence-level alignment of Arabic-English data. This entailed manual intervention to ensure correspondence on a one-to-many basis since Arabic sentences differ from English in length and punctuation. We have searched for morphological variants of ‘dignity’ (رامة ك, karāma) in the Arabic data and inspected their English translation equivalents. The term occurs most frequently in the Sudanese constitution (10 instances), and not at all in the constitution of Palestine. Its most frequent collocate, determined via the logDice statistic in Sketch Engine, is ‘human’ as in ‘human dignity’.Keywords: Arabic constitution, corpus-based legal linguistics, human rights, parallel Arabic-English legal corpora
Procedia PDF Downloads 1831022 Integrating Critical Stylistics and Visual Grammar: A Multimodal Stylistic Approach to the Analysis of Non-Literary Texts
Authors: Shatha Khuzaee
Abstract:
The study develops multimodal stylistic approach to analyse a number of BBC online news articles reporting some key events from the so called ‘Arab Uprisings’. Critical stylistics (CS) and visual grammar (VG) provide insightful arguments to the ways ideology is projected through different verbal and visual modes, yet they are mode specific because they examine how each mode projects its meaning separately and do not attempt to clarify what happens intersemiotically when the two modes co-occur. Therefore, it is the task undertaken in this research to propose multimodal stylistic approach that addresses the issue of ideology construction when the two modes co-occur. Informed by functional grammar and social semiotics, the analysis attempts to integrate three linguistic models developed in critical stylistics, namely, transitivity choices, prioritizing and hypothesizing along with their visual equivalents adopted from visual grammar to investigate the way ideology is constructed, in multimodal text, when text/image participate and interrelate in the process of meaning making on the textual level of analysis. The analysis provides comprehensive theoretical and analytical elaborations on the different points of integration between CS linguistic models and VG equivalents which operate on the textual level of analysis to better account for ideology construction in news as non-literary multimodal texts. It is argued that the analysis well thought out a plan that would remark the first step towards the integration between the well-established linguistic models of critical stylistics and that of visual analysis to analyse multimodal texts on the textual level. Both approaches are compatible to produce multimodal stylistic approach because they intend to analyse text and image depending on whatever textual evidence is available. This supports the analysis maintain the rigor and replicability needed for a stylistic analysis like the one undertaken in this study.Keywords: multimodality, stylistics, visual grammar, social semiotics, functional grammar
Procedia PDF Downloads 2211021 Developing Students’ Academic Writing Skills through Scientific Reading: Using Questions and Answer Activities
Authors: Makhim Artikova, Shavkat Duschanov
Abstract:
So far, there have been a plethora of attempts to improve learners’ academic writing skills. However, this issue remains to be a real concern among the majority of students, especially those who are standing on their academic life threshold. The purpose of this research is improving students’ academic writing skills through 'Questions and Answer Reading' activities. Using well-prepared and well-chosen reading materials (from textbooks, scientific journals, or magazines) and applying questions and answer activities in the classroom facilitate learners to become great critical readers. Furthermore, it boosts their writing skills, which are the most crucial part of students’ personal and academic developments. In this activity, the class is divided into small groups of four. Then, the instructor will give students whether one section of the text or full text asking them to read and to find unfamiliar words within the group. After discovering the meaning of unknown words, each group has to share their findings with the class. In the next stage of the activity, students should be asked to create questions in a group based on the given reading material. Follow by each group should ask the other groups their questions which are an excellent opportunity to challenge leads to improve critical thinking skills. In the last part, the students are asked to write the text or article summary, which is the activity core that pilots to the writing skills perfection. This engaging activity highlights the effectiveness of incorporating reading materials into the classroom when it comes to improving students’ composition writings. Structural writing after every reading activity resulted in improving students’ coherence and cohesion in writing well-organized essays. Having experimented with high school 9th and 11th-grade students, implementing reading activities into the classroom is proved to be a productive tool to enhance one’s academic writing skills. In the future, this method planning to be implemented among university students.Keywords: academic writing, coherence and cohesion, questions and answer activities, scientific reading
Procedia PDF Downloads 1111020 Literature as a Strategic Tool to Conscientise Africans: An Attempt by Postcolonial Writers and Critics to Reverse the Socio-Economics Imbalances of Colonialism
Authors: Lutendo Nendauni
Abstract:
Colonialism breaks things, colonisers exploded native cultural solidarity, producing the spiritual confusion, psychic wounding, and economic exploitation of a new and dominated ‘other’. Colonialism as the cultural and economic exploitation began when the West defended in their seizure of foreign territories for the exploitation of its natural resources; this resulted in brutal socio-economic imbalances. The Western profited at the detriment of the weak Africa. However, colonialism has since passed, but the effects are still evident culturally, socially, and economically. This paper explored how postcolonial writers and critics attempt to reverse the socio-economic imbalances resulting from the fragmentation of colonialism, with a focus on the play 'I will Marry When I Want' by Ngugi wa Thiong’o and Ngugi wa Mirii, as a primary text. Using qualitative discourse-textual analysis as the research methodology, the researcher purposively extracts discourse segments from the text for analysis and interpretation. The findings reveal that Postcolonial critics and writers attempt to reverse the socio-economic effects of colonialism through various counter discourses; their literature is concerned with the destruction of colonised identity, the search for this identity, and its assertion. It is manifest in the text that writers offer corrective views about Africans; they stress that they write their literary texts to conscientise their fellow Africans. Postcolonial writers and critics argue that language is a carrier of culture and that the only way to break free from colonial influence is by not adopting a foreign language. They further through their poems, novels, plays, and music strategically shine the spotlight on the previously nameless and destitute people so that they can develop the human spirit’s desire to overcome defeat, socio-political deprivation, and isolation.Keywords: colonialism, postcoloniality, critics, socio-economic imbalances
Procedia PDF Downloads 1581019 Instructional Consequences of the Transiency of Spoken Words
Authors: Slava Kalyuga, Sujanya Sombatteera
Abstract:
In multimedia learning, written text is often transformed into spoken (narrated) text. This transient information may overwhelm limited processing capacity of working memory and inhibit learning instead of improving it. The paper reviews recent empirical studies in modality and verbal redundancy effects within a cognitive load framework and outlines conditions under which negative effects of transiency may occur. According to the modality effect, textual information accompanying pictures should be presented in an auditory rather than visual form in order to engage two available channels of working memory – auditory and visual - instead of only one of them. However, some studies failed to replicate the modality effect and found differences opposite to those expected. Also, according to the multimedia redundancy effect, the same information should not be presented simultaneously in different modalities to avoid unnecessary cognitive load imposed by the integration of redundant sources of information. However, a few studies failed to replicate the multimedia redundancy effect too. Transiency of information is used to explain these controversial results.Keywords: cognitive load, transient information, modality effect, verbal redundancy effect
Procedia PDF Downloads 3811018 Detecting Elderly Abuse in US Nursing Homes Using Machine Learning and Text Analytics
Authors: Minh Huynh, Aaron Heuser, Luke Patterson, Chris Zhang, Mason Miller, Daniel Wang, Sandeep Shetty, Mike Trinh, Abigail Miller, Adaeze Enekwechi, Tenille Daniels, Lu Huynh
Abstract:
Machine learning and text analytics have been used to analyze child abuse, cyberbullying, domestic abuse and domestic violence, and hate speech. However, to the authors’ knowledge, no research to date has used these methods to study elder abuse in nursing homes or skilled nursing facilities from field inspection reports. We used machine learning and text analytics methods to analyze 356,000 inspection reports, which have been extracted from CMS Form-2567 field inspections of US nursing homes and skilled nursing facilities between 2016 and 2021. Our algorithm detected occurrences of the various types of abuse, including physical abuse, psychological abuse, verbal abuse, sexual abuse, and passive and active neglect. For example, to detect physical abuse, our algorithms search for combinations or phrases and words suggesting willful infliction of damage (hitting, pinching or burning, tethering, tying), or consciously ignoring an emergency. To detect occurrences of elder neglect, our algorithm looks for combinations or phrases and words suggesting both passive neglect (neglecting vital needs, allowing malnutrition and dehydration, allowing decubiti, deprivation of information, limitation of freedom, negligence toward safety precautions) and active neglect (intimidation and name-calling, tying the victim up to prevent falls without consent, consciously ignoring an emergency, not calling a physician in spite of indication, stopping important treatments, failure to provide essential care, deprivation of nourishment, leaving a person alone for an inappropriate amount of time, excessive demands in a situation of care). We further compare the prevalence of abuse before and after Covid-19 related restrictions on nursing home visits. We also identified the facilities with the most number of cases of abuse with no abuse facilities within a 25-mile radius as most likely candidates for additional inspections. We also built an interactive display to visualize the location of these facilities.Keywords: machine learning, text analytics, elder abuse, elder neglect, nursing home abuse
Procedia PDF Downloads 1481017 Secure Text Steganography for Microsoft Word Document
Authors: Khan Farhan Rafat, M. Junaid Hussain
Abstract:
Seamless modification of an entity for the purpose of hiding a message of significance inside its substance in a manner that the embedding remains oblivious to an observer is known as steganography. Together with today's pervasive registering frameworks, steganography has developed into a science that offers an assortment of strategies for stealth correspondence over the globe that must, however, need a critical appraisal from security breach standpoint. Microsoft Word is amongst the preferably used word processing software, which comes as a part of the Microsoft Office suite. With a user-friendly graphical interface, the richness of text editing, and formatting topographies, the documents produced through this software are also most suitable for stealth communication. This research aimed not only to epitomize the fundamental concepts of steganography but also to expound on the utilization of Microsoft Word document as a carrier for furtive message exchange. The exertion is to examine contemporary message hiding schemes from security aspect so as to present the explorative discoveries and suggest enhancements which may serve a wellspring of information to encourage such futuristic research endeavors.Keywords: hiding information in plain sight, stealth communication, oblivious information exchange, conceal, steganography
Procedia PDF Downloads 2431016 Mechanisms Underlying Comprehension of Visualized Personal Health Information: An Eye Tracking Study
Authors: Da Tao, Mingfu Qin, Wenkai Li, Tieyan Wang
Abstract:
While the use of electronic personal health portals has gained increasing popularity in the healthcare industry, users usually experience difficulty in comprehending and correctly responding to personal health information, partly due to inappropriate or poor presentation of the information. The way personal health information is visualized may affect how users perceive and assess their personal health information. This study was conducted to examine the effects of information visualization format and visualization mode on the comprehension and perceptions of personal health information among personal health information users with eye tracking techniques. A two-factor within-subjects experimental design was employed, where participants were instructed to complete a series of personal health information comprehension tasks under varied types of visualization mode (i.e., whether the information visualization is static or dynamic) and three visualization formats (i.e., bar graph, instrument-like graph, and text-only format). Data on a set of measures, including comprehension performance, perceptions, and eye movement indicators, were collected during the task completion in the experiment. Repeated measure analysis of variance analyses (RM-ANOVAs) was used for data analysis. The results showed that while the visualization format yielded no effects on comprehension performance, it significantly affected users’ perceptions (such as perceived ease of use and satisfaction). The two graphic visualizations yielded significantly higher favorable scores on subjective evaluations than that of the text format. While visualization mode showed no effects on users’ perception measures, it significantly affected users' comprehension performance in that dynamic visualization significantly reduced users' information search time. Both visualization format and visualization mode had significant main effects on eye movement behaviors, and their interaction effects were also significant. While the bar graph format and text format had similar time to first fixation across dynamic and static visualizations, instrument-like graph format had a larger time to first fixation for dynamic visualization than for static visualization. The two graphic visualization formats yielded shorter total fixation duration compared with the text-only format, indicating their ability to improve information comprehension efficiency. The results suggest that dynamic visualization can improve efficiency in comprehending important health information, and graphic visualization formats were favored more by users. The findings are helpful in the underlying comprehension mechanism of visualized personal health information and provide important implications for optimal design and visualization of personal health information.Keywords: eye tracking, information comprehension, personal health information, visualization
Procedia PDF Downloads 1091015 A Mixed Methods Study Aimed at Exploring the Conceptualization of Orthorexia Nervosa on Instagram
Authors: Elena V. Syurina, Sophie Renckens, Martina Valente
Abstract:
Objective: The objective of this study was to investigate the nature of the conversation around orthorexia nervosa (ON) on Instagram. Methods: The present study was conducted using mixed methods, combining a concurrent triangulation and sequential explanatory design. First, 3027 pictures posted on Instagram using #Orthorexia were analyzed. Then, a questionnaire about Instagram use related to ON was completed entirely by 185 respondents. These two quantitative data sources were statistically analyzed and triangulated afterwards. Finally, 9 interviews were conducted, to more deeply investigate what is being said about ON on Instagram and what the motivations to post about it are. Results: Four main categories of pictures were found to be represented in Instagram posts about ON: ‘food’, ‘people’, ‘text’, and ‘other.’ Savory and unprocessed food was most highly represented within the food category, and pictures of people were mostly pictures of the account holder. People who self-identify as having ON were more likely to post about ON, and they were significantly more likely to post about ‘food’, ‘people’ and ‘text.’ The goal of the posts was to raise awareness around ON, as well as to provide support for people who believe to be suffering from it. Conclusion: Since the conversation around ON on Instagram is supportive, it could be beneficial to consider Instagram use in the treatment of ON. However, more research is needed on a larger scale.Keywords: orthorexia nervosa, Instagram, social media, disordered eating
Procedia PDF Downloads 1381014 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review
Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha
Abstract:
Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision-making has not been far-fetched. Proper classification of this textual information in a given context has also been very difficult. As a result, we decided to conduct a systematic review of previous literature on sentiment classification and AI-based techniques that have been used in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that can correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy by assessing different artificial intelligence techniques. We evaluated over 250 articles from digital sources like ScienceDirect, ACM, Google Scholar, and IEEE Xplore and whittled down the number of research to 31. Findings revealed that Deep learning approaches such as CNN, RNN, BERT, and LSTM outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also necessary for developing a robust sentiment classifier and can be obtained from places like Twitter, movie reviews, Kaggle, SST, and SemEval Task4. Hybrid Deep Learning techniques like CNN+LSTM, CNN+GRU, CNN+BERT outperformed single Deep Learning techniques and machine learning techniques. Python programming language outperformed Java programming language in terms of sentiment analyzer development due to its simplicity and AI-based library functionalities. Based on some of the important findings from this study, we made a recommendation for future research.Keywords: artificial intelligence, natural language processing, sentiment analysis, social network, text
Procedia PDF Downloads 116