Search results for: text preprocessing
1132 Degraded Document Analysis and Extraction of Original Text Document: An Approach without Optical Character Recognition
Authors: L. Hamsaveni, Navya Prakash, Suresha
Abstract:
Document image analysis recognizes text and graphics in documents acquired as images. This paper adopts an approach to degraded document image analysis that does not rely on Optical Character Recognition (OCR). The technique uses document imaging methods such as image fusing and Speeded Up Robust Features (SURF) detection to identify and extract the degraded regions from a set of document images and so obtain an original document with complete information. If a captured degraded document image is skewed, it has to be straightened (deskewed) before further processing. The YCbCr image storage format is used as a tool to convert the grayscale image to RGB image format. The presented algorithm is tested on various types of degraded documents, such as printed documents, handwritten documents, old script documents, and handwritten image sketches in documents. The purpose of this research is to obtain an original document from a given set of degraded documents of the same source.
Keywords: grayscale image format, image fusing, RGB image format, SURF detection, YCbCr image format
Procedia PDF Downloads 377
1131 High Secure Data Hiding Using Cropping Image and Least Significant Bit Steganography
Authors: Khalid A. Al-Afandy, El-Sayyed El-Rabaie, Osama Salah, Ahmed El-Mhalaway
Abstract:
This paper presents a highly secure data hiding technique using image cropping and Least Significant Bit (LSB) steganography. Crops at predefined secret coordinates are extracted from the cover image. The secret text message is divided into sections, and the number of sections equals the number of image crops. Each section of the secret text message is embedded into an image crop in a secret sequence using the LSB technique. The embedding uses the color channels of the cover image. The stego image is obtained by reassembling the image and the stego crops. The results of the technique are compared with other state-of-the-art techniques. Evaluation is based on visual inspection to detect any degradation of the stego image, the difficulty of extracting the embedded data by an unauthorized viewer, the Peak Signal-to-Noise Ratio (PSNR) of the stego image, and the CPU time of the embedding algorithm. Experimental results indicate that the proposed technique is more secure than traditional techniques.
Keywords: steganography, stego, LSB, crop
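The embedding step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single-channel list-of-rows image, the crop tuple `(top, left, height, width)`, and every function name are assumptions made for illustration, and the secret ordering of crops is left out.

```python
import math

def text_to_bits(text):
    """Turn a UTF-8 string into a flat list of bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]

def bits_to_text(bits):
    """Inverse of text_to_bits: pack groups of 8 bits back into bytes."""
    data = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )
    return data.decode("utf-8")

def embed_in_crop(image, crop, bits):
    """Return a copy of `image` with `bits` written into the LSBs of one crop.

    `image` is a list of rows of 0-255 ints (one colour channel).
    `crop` is a hypothetical secret coordinate: (top, left, height, width).
    """
    top, left, height, width = crop
    if len(bits) > height * width:
        raise ValueError("message section does not fit in this crop")
    stego = [row[:] for row in image]
    for i, bit in enumerate(bits):
        r, c = top + i // width, left + i % width
        stego[r][c] = (stego[r][c] & ~1) | bit  # clear the LSB, then set it
    return stego

def extract_from_crop(image, crop, n_bits):
    """Read back the first `n_bits` LSBs of the crop, in row-major order."""
    top, left, _, width = crop
    return [image[top + i // width][left + i % width] & 1 for i in range(n_bits)]

def psnr(original, stego, peak=255):
    """Peak Signal-to-Noise Ratio in dB between two equally sized images."""
    diffs = [(a - b) ** 2
             for row_o, row_s in zip(original, stego)
             for a, b in zip(row_o, row_s)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

Because only the lowest bit of each pixel in the crop changes, every pixel value moves by at most 1, which is why LSB embedding tends to keep PSNR high.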
Procedia PDF Downloads 270
1130 Detecting Paraphrases in Arabic Text
Authors: Amal Alshahrani, Allan Ramsay
Abstract:
Paraphrasing, i.e., expressing the same concept in alternative ways using different words or phrases, is one of the important tasks in natural language processing. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, and Information Extraction. To obtain pairs of sentences that are paraphrases, we create a system that automatically extracts paraphrases from a corpus built from different sources of news articles, since these are likely to contain paraphrases when they report the same event on the same day. Simple standard approaches (e.g., TF-IDF vector space, cosine similarity) and alignment techniques (e.g., Dynamic Time Warping (DTW)) for extracting paraphrases have been applied to English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic, due to phenomena that are not present in English, such as free word order, zero copula, and pro-dropping. Thus, if we can analyse how the existing algorithms for English fail for Arabic, then we can find a solution for Arabic. The results are promising.
Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)
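The standard approaches named in this abstract can be sketched as follows. This is an illustrative sketch only, not the authors' system: whitespace tokenisation, the smoothed IDF variant, the 0/1 word-mismatch cost used for DTW, and all function names are assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each document to a sparse {term: weight} TF-IDF vector."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            # term frequency times a smoothed inverse document frequency
            term: (count / len(tokens))
                  * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def dtw_distance(a, b):
    """Dynamic Time Warping cost between two word sequences (0/1 mismatch cost)."""
    inf = float("inf")
    table = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    table[0][0] = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            table[i][j] = cost + min(table[i - 1][j],      # stretch b
                                     table[i][j - 1],      # stretch a
                                     table[i - 1][j - 1])  # advance both
    return table[-1][-1]
```

A bag-of-words score such as cosine similarity ignores word order, while DTW aligns sequences monotonically. That difference is one reason free word order, as in Arabic, could affect the two families of methods differently.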
Procedia PDF Downloads 388
1129 Classification of Political Affiliations by Reduced Number of Features
Authors: Vesile Evrim, Aliyu Awwal
Abstract:
With the evolution of technology, the expression of opinions has shifted to the digital world. Politics, one of the hottest topics in opinion mining research, is combined here with behavior analysis to determine affiliation in text, which is the subject of this paper. This study aims to classify the text in news/blogs as either Republican or Democrat with the minimum number of features. As an initial set, 68 features, 64 of which are Linguistic Inquiry and Word Count (LIWC) features, are tested against 14 benchmark classification algorithms. In later experiments, the dimensionality of the feature vector is reduced using 7 feature selection algorithms. The results show that Decision Tree, Rule Induction, and M5 Rule classifiers, when used with SVM and IGR feature selection algorithms, performed best, with up to 82.5% accuracy on a given dataset. Further tests on a single feature and on the linguistic-based feature sets showed similar results. The feature “function”, an aggregate feature of the linguistic category, is found to be the most differentiating of the 68 features, achieving 81% accuracy by itself in classifying articles as either Republican or Democrat.
Keywords: feature selection, LIWC, machine learning, politics
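As an illustration of score-based feature selection, the sketch below computes an information gain ratio for one feature, assuming that "IGR" in the abstract stands for information gain ratio and that the feature has already been discretised into categories. Both points are assumptions, as are the function names; the abstract does not describe the authors' procedure.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())

def gain_ratio(feature_values, labels):
    """Information gain of the feature divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A feature with a single value carries no split information
    return info_gain / split_info if split_info else 0.0
```

Features would then be ranked by this score and only the top few kept, which is one common way to reduce a feature vector before training a classifier.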
Procedia PDF Downloads 383
1128 Automatic Segmentation of Lung Pleura Based On Curvature Analysis
Authors: Sasidhar B., Bhaskar Rao N., Ramesh Babu D. R., Ravi Shankar M.
Abstract:
Segmentation of lung pleura is a preprocessing step in Computer-Aided Diagnosis (CAD) which helps in reducing false positives in the detection of lung cancer. Existing methods fail to extract lung regions with nodules at the pleura of the lungs. In this paper, a new method is proposed which segments lung regions with nodules at the pleura of the lungs based on curvature analysis and morphological operators. The proposed algorithm is tested on a dataset of 6 patients consisting of 60 images from the Lung Image Database Consortium (LIDC), and the results are found to be satisfactory, with a 98.3% average overlap measure (AΩ).
Keywords: curvature analysis, image segmentation, morphological operators, thresholding
Procedia PDF Downloads 596
1127 Comics Scanlation and Publishing Houses Translation
Authors: Sharifa Alshahrani
Abstract:
Comics is a multimodal text wherein meaning is created by taking in all modes of expression at once. It uses two different semiotic modes, the verbal and the visual, together to make meaning, and these modes can be socially and culturally shaped to give meaning. Therefore, comics translation cannot treat comics as a monomodal text by translating only the verbal mode inside or outside the speech balloons, as cultural differences are encoded in the visual mode as well. Due to the development of the internet and editing software, comics translation is no longer confined to publishing houses and official translation: scanlation, or fan translation, has taken the initiative in translating comics, driven by an emotional attraction to the culture and genre. Scanlation is carried out by volunteer fans who translate out of passion. However, quality is one of the debatable issues relating to scanlation and fan translation. This study will investigate how the dynamic multimodal relationship in comics is exploited and interpreted in translation by exploring the translation strategies and procedures adopted by publishing houses and scanlation in rendering comics into Arabic, using three analytical frameworks: the cultural references model, the multimodal relation model, and the translation strategies and procedures models.
Keywords: comics, multimodality, translation, scanlation
Procedia PDF Downloads 213
1126 Linguistic Analysis of Argumentation Structures in Georgian Political Speeches
Authors: Mariam Matiashvili
Abstract:
Argumentation is an integral part of our daily communications - formal or informal. Argumentative reasoning, techniques, and language tools are used both in personal conversations and in the business environment. Verbalization of the opinions requires the use of extraordinary syntactic-pragmatic structural quantities - arguments that add credibility to the statement. The study of argumentative structures allows us to identify the linguistic features that make the text argumentative. Knowing what elements make up an argumentative text in a particular language helps the users of that language improve their skills. Also, natural language processing (NLP) has become especially relevant recently. In this context, one of the main emphases is on the computational processing of argumentative texts, which will enable the automatic recognition and analysis of large volumes of textual data. The research deals with the linguistic analysis of the argumentative structures of Georgian political speeches - particularly the linguistic structure, characteristics, and functions of the parts of the argumentative text - claims, support, and attack statements. The research aims to describe the linguistic cues that give the sentence a judgmental/controversial character and helps to identify reasoning parts of the argumentative text. The empirical data comes from the Georgian Political Corpus, particularly TV debates. Consequently, the texts are of a dialogical nature, representing a discussion between two or more people (most often between a journalist and a politician). 
The research uses the following approaches to identify and analyze the argumentative structures: Lexical Classification and Analysis, which identifies lexical items that are relevant in creating argumentative texts and builds the lexicon of argumentation (groups of words gathered from a semantic point of view); Grammatical Analysis and Classification, which is the grammatical analysis of the words and phrases identified on the basis of the argumentation lexicon; and Argumentation Schemas, which describes and identifies the argumentation schemes that are most likely used in Georgian political speeches. As a final step, we analyzed the relations between the above-mentioned components. For example, if an identified argument scheme is “Argument from Analogy”, the identified lexical items semantically express analogy too, and they are most likely adverbs in Georgian. As a result, we created a lexicon of the words that play a significant role in creating Georgian argumentative structures. Linguistic analysis has shown that verbs play a crucial role in creating argumentative structures.
Keywords: Georgian, argumentation schemas, argumentation structures, argumentation lexicon
Procedia PDF Downloads 74
1125 Death of the Author and Birth of the Adapter in a Literary Work
Authors: Slwa Al-Hammad
Abstract:
Adaptation studies have been closely aligned to translation studies as both deal with the process of rendering the meaning from one culture to another. These two disciplines are related to each other, but the theories are still being developed. This research aims to fill this gap and provide a contribution to the growing discipline of adaptation studies through a theoretical perspective while investigating how different cultural interpretations of adaptation influence the final literary product. This research focuses on the theoretical concepts of Barthes’s death of the author and Benjamin’s afterlife of the text in translation, which is believed to lead to the birth of the adapter in a literary work. That is, in adaptation, the ‘death’ of the author allows for the ‘birth’ of the adapter, offering them all the creative possibilities of authorship. It also explores the differences between the meanings of adaptation in the West and the Arab world through the analysis of adapted texts in Arabic initially deriving from the European and American literature of the 19th and 20th centuries. The methodology of this thesis is based upon qualitative literary analysis, in which original and adapted works are compared and contrasted, with the additional insights of literary and adaptation theories and prior scholarship. The main works discussed are the Arabic adaptations of William Faulkner’s novels. The analysis is guided by theories of adaptation studies to help in explaining the concepts of relocating, recreating, and rewriting in the process of adaptation. It draws on scholarship on adaptations to inquire into the status of the adapted texts in relation to the original texts. Also, these theories prove that adaptation is the process that is used to transfer text from source to adapted text, not some other analytical practice. 
Through the textual analysis, concepts of the death of the author and the birth of the adapter will be illustrated, as will the roles of the adapter and the task of rendering works for a different culture, and the understanding of adaptation and Arabization in Arabic literature.
Keywords: adaptation, Arabization, authorship, recreating, relocating
Procedia PDF Downloads 143
1124 Anaphora and Cataphora on the Selected State of the City Addresses of the Mayor of Dapitan
Authors: Mark Herman Sumagang Potoy
Abstract:
State of the City Address (SOCA) is a speech, modelled after the State of the Nation Address, given not as mandated by law but usually a matter of practice or tradition delivered before the chief executive’s constituents. Through this, the general public is made to know the performance of the local government unit and its agenda for the coming year. Therefore, it is imperative for SOCAs to clearly convey its message and carry out the myriad function of enlightening its readers which could be achieved through the proper use of reference. Anaphora and cataphora are the two major types of reference; the former refer back to something that has already been mentioned while the latter points forward to something which is yet to be said. This paper seeks to identify the types of reference employed on the SOCAs from 2014 to 2016 of Hon. Rosalina Garcia Jalosjos, Mayor of Dapitan City and look into how the references contribute to the clarity of the message of the text. The qualitative method of research is used in this study through an in-depth analysis of the corpus. As soon as the copies of the SOCAs are secured from the Office of the City Mayor, they are then analyzed using documentary technique categorizing the types of reference as to anaphora and cataphora, counting each of these types and describing the implications of the dominant types used in the addresses. After a thorough analysis, it is found out that the two reference types namely, anaphora and cataphora are both employed on the three SOCAs, the former being used more frequently than the latter accounting to 80% and 20% of actual usage, respectively. Moreover, the use of anaphors and cataphora on the three addresses helps in conveying the message clearly because they primarily become aids to avoid the repetition of the same element in the text especially when there wasn’t a need to emphasize a point. 
Finally, it is recommended that writers of State of the City Addresses have a thorough knowledge of how reference should be used and of the functions it serves in the text, since it is a vital tool for clearly transmitting a message. Moreover, English teachers should explicitly teach the proper usage of anaphora and cataphora as instruments for developing cohesion in written discourse, to enable students to write not only with sense but also with fluidity in tying utterances together.
Keywords: anaphora, cataphora, reference, State of the City Address
Procedia PDF Downloads 193
1123 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification
Authors: Zhaoxin Luo, Michael Zhu
Abstract:
In natural languages, there are always complex semantic hierarchies. Obtaining a feature representation based on these complex semantic hierarchies is key to the success of a model. Several RNN models have recently been proposed that use latent indicators to obtain the hierarchical structure of documents. However, a model that uses only a single-layer latent indicator cannot capture the true hierarchical structure of a language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. By using EM and training hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many RNN layers. Our deep hierarchical model not only achieves results comparable to large pre-trained models on the Chinese short text classification problem but also achieves state-of-the-art results on the Chinese long text classification problem.
Keywords: natural language processing, recurrent neural network, hierarchical structure, document classification, Chinese
Procedia PDF Downloads 69
1122 Semantic Based Analysis in Complaint Management System with Analytics
Authors: Francis Alterado, Jennifer Enriquez
Abstract:
Semantic Based Analysis in Complaint Management System with Analytics is an enhanced tool for clients to submit complaints, as well as a mechanism for Palawan Polytechnic College to gather, process, and monitor the status of these complaints. The study includes a mobile application that serves as a remote communication facility between the students and the school management regarding the issues encountered by students and the resolution of every complaint received. Text mining and clustering algorithms were used to process the complaints. Every module of the system was tested and, based on the results, was 100% free from error before integration. System testing was also done by checking the expected functionality of the system, which was 100% functional. The system was tested by 10 students, who forwarded complaints to 10 departments. Based on the results, the students were able to submit complaints, the system processed them by identifying the department for which each complaint was intended, and the department concerned was able to give the student feedback on the complaint received. The system received a rating of 4.7, which corresponds to Excellent.
Keywords: technology adoption, emerging technology, issues challenges, algorithm, text mining, mobile technology
Procedia PDF Downloads 199
1121 Compilation and Statistical Analysis of an Arabic-English Legal Corpus in Sketch Engine
Authors: C. Brierley, H. El-Farahaty, A. Farhan
Abstract:
The Leeds Parallel Corpus of Arabic-English Constitutions is a parallel corpus for the Arabic legal domain. Analysis of legal language via Corpus Linguistics techniques is an important development. In legal proceedings, a corpus-based approach to disambiguating meaning is set to replace the dictionary as an interpretative tool, and legal scholarship in the States is now attuned to the potential for Text Analytics over vast quantities of text-based legal material, following the business and medical industries. This trend is reflected in Europe: the interdisciplinary research group in Computer Assisted Legal Linguistics mines big data collections of legal and non-legal texts to analyse: legal interpretations; legal discourse; the comprehensibility of legal texts; conflict resolution; and linguistic human rights. This paper focuses on ‘dignity’ as an important aspect of the overarching concept of human rights in current constitutions across the Arab world. We have compiled a parallel, Arabic-English raw text corpus (169,861 Arabic words and 205,893 English words) from reputable websites such as the World Intellectual Property Organisation and CONSTITUTE, and uploaded and queried our corpus in Sketch Engine. Our most challenging task was sentence-level alignment of Arabic-English data. This entailed manual intervention to ensure correspondence on a one-to-many basis since Arabic sentences differ from English in length and punctuation. We have searched for morphological variants of ‘dignity’ (كرامة, karāma) in the Arabic data and inspected their English translation equivalents. The term occurs most frequently in the Sudanese constitution (10 instances), and not at all in the constitution of Palestine. Its most frequent collocate, determined via the logDice statistic in Sketch Engine, is ‘human’ as in ‘human dignity’.
Keywords: Arabic constitution, corpus-based legal linguistics, human rights, parallel Arabic-English legal corpora
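For readers unfamiliar with the logDice statistic mentioned in this abstract, it is commonly defined as 14 + log2(2·f_xy / (f_x + f_y)), where f_xy is the co-occurrence frequency of a node word and a collocate and f_x, f_y are their individual frequencies. The sketch below implements that definition; the function name is an assumption and the frequencies in the usage note are hypothetical, not taken from the corpus.

```python
import math

def log_dice(f_xy, f_x, f_y):
    """logDice association score for a word pair.

    f_xy: how often the two words co-occur
    f_x, f_y: how often each word occurs on its own
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))
```

The score has a theoretical maximum of 14, reached when the two words occur only together; for example, `log_dice(10, 10, 10)` evaluates to 14, and halving the Dice ratio lowers the score by exactly 1. Because it depends only on these three frequencies and not on corpus size, scores can be compared across corpora of different sizes.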
Procedia PDF Downloads 183
1120 Rapid Detection of Cocaine Using Aggregation-Induced Emission and Aptamer Combined Fluorescent Probe
Authors: Jianuo Sun, Jinghan Wang, Sirui Zhang, Chenhan Xu, Hongxia Hao, Hong Zhou
Abstract:
In recent years, the diversification and industrialization of drug-related crimes have posed significant threats to public health and safety globally. The widespread and increasingly younger demographics of drug users and the persistence of drug-impaired driving incidents underscore the urgency of this issue. Drug detection, a specialized forensic activity, is pivotal in identifying and analyzing substances involved in drug crimes. It relies on pharmacological and chemical knowledge and employs analytical chemistry and modern detection techniques. However, current drug detection methods are limited by their inability to perform semi-quantitative, real-time field analyses. They require extensive, complex laboratory-based preprocessing, expensive equipment, and specialized personnel and are hindered by long processing times. This study introduces an alternative approach using nucleic acid aptamers and Aggregation-Induced Emission (AIE) technology. Nucleic acid aptamers, selected artificially for their specific binding to target molecules and stable spatial structures, represent a new generation of biosensors following antibodies. Rapid advancements in AIE technology, particularly in tetraphenyl ethene-based luminous, offer simplicity in synthesis and versatility in modifications, making them ideal for fluorescence analysis. This work successfully synthesized, isolated, and purified an AIE molecule and constructed a probe comprising the AIE molecule, nucleic acid aptamers, and exonuclease for cocaine detection. The probe demonstrated significant relative fluorescence intensity changes and selectivity towards cocaine over other drugs. Using 4-Butoxytriethylammonium Bromide Tetraphenylethene (TPE-TTA) as the fluorescent probe, the aptamer as the recognition unit, and Exo I as an auxiliary, the system achieved rapid detection of cocaine within 5 mins in aqueous and urine, with detection limits of 1.0 and 5.0 µmol/L respectively. 
The probe maintained stability and interference resistance in urine, enabling quantitative cocaine detection within a certain concentration range. This fluorescent sensor significantly reduces sample preprocessing time, offers a basis for rapid onsite cocaine detection, and shows promise for miniaturized testing setups.
Keywords: drug detection, aggregation-induced emission (AIE), nucleic acid aptamer, exonuclease, cocaine
Procedia PDF Downloads 64
1119 Integrating Critical Stylistics and Visual Grammar: A Multimodal Stylistic Approach to the Analysis of Non-Literary Texts
Authors: Shatha Khuzaee
Abstract:
The study develops a multimodal stylistic approach to analyse a number of BBC online news articles reporting some key events from the so-called ‘Arab Uprisings’. Critical stylistics (CS) and visual grammar (VG) offer insight into the ways ideology is projected through different verbal and visual modes, yet they are mode-specific: they examine how each mode projects its meaning separately and do not attempt to clarify what happens intersemiotically when the two modes co-occur. This research therefore proposes a multimodal stylistic approach that addresses the issue of ideology construction when the two modes co-occur. Informed by functional grammar and social semiotics, the analysis attempts to integrate three linguistic models developed in critical stylistics, namely transitivity choices, prioritizing, and hypothesizing, with their visual equivalents adopted from visual grammar, in order to investigate how ideology is constructed in multimodal text when text and image participate and interrelate in the process of meaning making on the textual level of analysis. The analysis provides comprehensive theoretical and analytical elaborations on the different points of integration between the CS linguistic models and their VG equivalents, which operate on the textual level of analysis, to better account for ideology construction in news as non-literary multimodal texts. It is argued that the analysis marks a first step towards integrating the well-established linguistic models of critical stylistics with those of visual analysis in order to analyse multimodal texts on the textual level. The two approaches are compatible and can be combined into a multimodal stylistic approach because both analyse text and image on the basis of whatever textual evidence is available.
This helps the analysis maintain the rigor and replicability needed for a stylistic analysis like the one undertaken in this study.
Keywords: multimodality, stylistics, visual grammar, social semiotics, functional grammar
Procedia PDF Downloads 221
1118 Developing Students’ Academic Writing Skills through Scientific Reading: Using Questions and Answer Activities
Authors: Makhim Artikova, Shavkat Duschanov
Abstract:
There have been a plethora of attempts to improve learners’ academic writing skills. However, the issue remains a real concern for the majority of students, especially those standing on the threshold of their academic life. The purpose of this research is to improve students’ academic writing skills through 'Questions and Answer Reading' activities. Using well-prepared and well-chosen reading materials (from textbooks, scientific journals, or magazines) and applying questions and answer activities in the classroom helps learners become strong critical readers. Furthermore, it boosts their writing skills, which are the most crucial part of students’ personal and academic development. In this activity, the class is divided into small groups of four. The instructor then gives students either one section of the text or the full text and asks them to read it and find unfamiliar words within the group. After discovering the meaning of the unknown words, each group shares its findings with the class. In the next stage of the activity, students are asked to create questions in their group based on the given reading material. Each group then asks the other groups its questions, which is an excellent opportunity for challenge that helps improve critical thinking skills. In the last part, the students are asked to write a summary of the text or article; this is the core of the activity and the step that leads to better writing skills. This engaging activity highlights the effectiveness of incorporating reading materials into the classroom for improving students’ composition writing. Structured writing after every reading activity improved students’ coherence and cohesion in writing well-organized essays. In an experiment with 9th- and 11th-grade high school students, implementing reading activities in the classroom proved to be a productive tool for enhancing academic writing skills.
In the future, this method is planned to be implemented among university students.
Keywords: academic writing, coherence and cohesion, questions and answer activities, scientific reading
Procedia PDF Downloads 111
1117 Literature as a Strategic Tool to Conscientise Africans: An Attempt by Postcolonial Writers and Critics to Reverse the Socio-Economics Imbalances of Colonialism
Authors: Lutendo Nendauni
Abstract:
Colonialism breaks things: colonisers exploded native cultural solidarity, producing spiritual confusion, psychic wounding, and the economic exploitation of a new and dominated ‘other’. Colonialism as cultural and economic exploitation began when the West seized foreign territories in order to exploit their natural resources; this resulted in brutal socio-economic imbalances. The West profited to the detriment of a weaker Africa. Colonialism has since passed, but its effects are still evident culturally, socially, and economically. This paper explores how postcolonial writers and critics attempt to reverse the socio-economic imbalances resulting from the fragmentation of colonialism, with a focus on the play 'I will Marry When I Want' by Ngugi wa Thiong’o and Ngugi wa Mirii as a primary text. Using qualitative discourse-textual analysis as the research methodology, the researcher purposively extracts discourse segments from the text for analysis and interpretation. The findings reveal that postcolonial critics and writers attempt to reverse the socio-economic effects of colonialism through various counter-discourses; their literature is concerned with the destruction of colonised identity, the search for this identity, and its assertion. It is manifest in the text that writers offer corrective views about Africans; they stress that they write their literary texts to conscientise their fellow Africans. Postcolonial writers and critics argue that language is a carrier of culture and that the only way to break free from colonial influence is by not adopting a foreign language. Further, through their poems, novels, plays, and music, they strategically shine the spotlight on previously nameless and destitute people so that they can develop the human spirit’s desire to overcome defeat, socio-political deprivation, and isolation.
Keywords: colonialism, postcoloniality, critics, socio-economic imbalances
Procedia PDF Downloads 158
1116 Instructional Consequences of the Transiency of Spoken Words
Authors: Slava Kalyuga, Sujanya Sombatteera
Abstract:
In multimedia learning, written text is often transformed into spoken (narrated) text. This transient information may overwhelm the limited processing capacity of working memory and inhibit learning instead of improving it. The paper reviews recent empirical studies of modality and verbal redundancy effects within a cognitive load framework and outlines conditions under which negative effects of transiency may occur. According to the modality effect, textual information accompanying pictures should be presented in an auditory rather than visual form in order to engage the two available channels of working memory, auditory and visual, instead of only one of them. However, some studies failed to replicate the modality effect and found differences opposite to those expected. Also, according to the multimedia redundancy effect, the same information should not be presented simultaneously in different modalities, to avoid unnecessary cognitive load imposed by the integration of redundant sources of information. However, a few studies failed to replicate the multimedia redundancy effect too. Transiency of information is used to explain these controversial results.
Keywords: cognitive load, transient information, modality effect, verbal redundancy effect
Procedia PDF Downloads 381
1115 A Study on Sentiment Analysis Using Various ML/NLP Models on Historical Data of Indian Leaders
Authors: Sarthak Deshpande, Akshay Patil, Pradip Pandhare, Nikhil Wankhede, Rushali Deshmukh
Abstract:
Sentiment analysis is among the most significant tasks for any language and a key area of NLP that has recently made impressive strides. Several models and datasets are available for this task in popular and commonly used languages like English, Russian, and Spanish. While sentiment analysis research is performed extensively, it lags behind for regional languages with few resources, such as Hindi and Marathi. Marathi is one of the languages included in the Indian Constitution’s 8th schedule; it is the third most widely spoken language in the country and is primarily spoken in the Deccan region, which encompasses Maharashtra and Goa. There has not been sufficient study of sentiment analysis methods for Marathi text due to the lack of available resources and information. Therefore, this project proposes the use of different ML/NLP models for the analysis of Marathi data from comments below YouTube content, tweets, or Instagram posts. We aim to achieve a short and precise analysis and summary of the related data using our dataset (dates, names, root words) and lexicons to locate exact information.
Keywords: multilingual sentiment analysis, Marathi, natural language processing, text summarization, lexicon-based approaches
Procedia PDF Downloads 76
1114 Detecting Elderly Abuse in US Nursing Homes Using Machine Learning and Text Analytics
Authors: Minh Huynh, Aaron Heuser, Luke Patterson, Chris Zhang, Mason Miller, Daniel Wang, Sandeep Shetty, Mike Trinh, Abigail Miller, Adaeze Enekwechi, Tenille Daniels, Lu Huynh
Abstract:
Machine learning and text analytics have been used to analyze child abuse, cyberbullying, domestic abuse and domestic violence, and hate speech. However, to the authors’ knowledge, no research to date has used these methods to study elder abuse in nursing homes or skilled nursing facilities from field inspection reports. We used machine learning and text analytics methods to analyze 356,000 inspection reports extracted from CMS Form-2567 field inspections of US nursing homes and skilled nursing facilities between 2016 and 2021. Our algorithm detected occurrences of the various types of abuse, including physical abuse, psychological abuse, verbal abuse, sexual abuse, and passive and active neglect. For example, to detect physical abuse, our algorithm searches for combinations of phrases and words suggesting willful infliction of damage (hitting, pinching or burning, tethering, tying), or consciously ignoring an emergency. To detect occurrences of elder neglect, our algorithm looks for combinations of phrases and words suggesting both passive neglect (neglecting vital needs, allowing malnutrition and dehydration, allowing decubiti, deprivation of information, limitation of freedom, negligence toward safety precautions) and active neglect (intimidation and name-calling, tying the victim up to prevent falls without consent, consciously ignoring an emergency, not calling a physician in spite of indication, stopping important treatments, failure to provide essential care, deprivation of nourishment, leaving a person alone for an inappropriate amount of time, excessive demands in a situation of care). We further compare the prevalence of abuse before and after Covid-19 related restrictions on nursing home visits. We also identified the facilities with the highest number of cases of abuse with no abuse facilities within a 25-mile radius as the most likely candidates for additional inspections. We also built an interactive display to visualize the location of these facilities.
Keywords: machine learning, text analytics, elder abuse, elder neglect, nursing home abuse
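As a rough illustration of the phrase-and-word search this abstract describes, the sketch below flags abuse categories in a report by matching terms from a small lexicon. It is not the authors' algorithm: the category names, the term lists (taken from the examples quoted in the abstract), and the function names are assumptions made for illustration only.

```python
import re
from typing import Dict, List, Pattern

# Hypothetical lexicon: terms are taken from the examples in the abstract;
# a real system would need far larger, validated phrase lists.
CATEGORY_TERMS: Dict[str, List[str]] = {
    "physical_abuse": ["hitting", "pinching", "burning", "tethering", "tying"],
    "neglect": ["malnutrition", "dehydration", "decubiti"],
}


def compile_patterns(terms: Dict[str, List[str]]) -> Dict[str, Pattern[str]]:
    """Build one case-insensitive, whole-word regex per category."""
    return {
        category: re.compile(
            r"\b(?:" + "|".join(re.escape(word) for word in words) + r")\b",
            re.IGNORECASE,
        )
        for category, words in terms.items()
    }


PATTERNS = compile_patterns(CATEGORY_TERMS)


def detect_categories(report_text: str) -> List[str]:
    """Return the sorted list of categories whose terms occur in the report."""
    return sorted(
        category
        for category, pattern in PATTERNS.items()
        if pattern.search(report_text)
    )
```

Whole-word matching keeps "tying" from firing on "untying", but a keyword match alone says nothing about context or negation, so flagged reports would still need review.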
Procedia PDF Downloads 148
1113 Secure Text Steganography for Microsoft Word Document
Authors: Khan Farhan Rafat, M. Junaid Hussain
Abstract:
Seamless modification of an entity for the purpose of hiding a message of significance inside its substance in a manner that the embedding remains oblivious to an observer is known as steganography. Together with today's pervasive registering frameworks, steganography has developed into a science that offers an assortment of strategies for stealth correspondence over the globe that must, however, need a critical appraisal from security breach standpoint. Microsoft Word is amongst the preferably used word processing software, which comes as a part of the Microsoft Office suite. With a user-friendly graphical interface, the richness of text editing, and formatting topographies, the documents produced through this software are also most suitable for stealth communication. This research aimed not only to epitomize the fundamental concepts of steganography but also to expound on the utilization of Microsoft Word document as a carrier for furtive message exchange. The exertion is to examine contemporary message hiding schemes from security aspect so as to present the explorative discoveries and suggest enhancements which may serve a wellspring of information to encourage such futuristic research endeavors.Keywords: hiding information in plain sight, stealth communication, oblivious information exchange, conceal, steganography
Procedia PDF Downloads 243
1112 Sweepline Algorithm for Voronoi Diagram of Polygonal Sites
Authors: Dmitry A. Koptelov, Leonid M. Mestetskiy
Abstract:
The Voronoi diagram (VD) of a finite set of disjoint simple polygons, called sites, is a partition of the plane into loci (one locus per site), that is, regions consisting of the points that are closer to a given site than to any other. A set of polygons is a universal model for many applications in engineering, geoinformatics, design, computer vision, and graphics. Construction of the VD of polygons is usually done by reduction to the task of constructing the VD of segments, for which there are efficient O(n log n) algorithms for n segments. The reduction also includes preprocessing (constructing segments from the polygons’ sides) and postprocessing (constructing each polygon’s locus by merging the loci of its sides). This approach does not take into account two specific properties of the resulting segment sites. First, the segments are connected in pairs at the vertices of the polygons. Second, the interior of the polygon lies on one side of each segment. The polygon is obviously included in its locus. Using these properties in the algorithm for VD construction offers a way to reduce computation. The article proposes an algorithm for the direct construction of the VD of polygonal sites. The algorithm is based on the sweepline paradigm, which allows these properties to be taken into account effectively. The solution is again based on reduction. Preprocessing constructs a set of sites from the vertices and edges of the polygons. Each site has an orientation such that the interior of the polygon lies to its left. The proposed algorithm constructs the VD for the set of oriented sites using the sweepline paradigm. Postprocessing selects the edges of this VD formed by the centers of empty circles touching different polygons. The efficiency gain of the proposed sweepline algorithm over the general Fortune algorithm is achieved through the following fundamental solutions: 1. The algorithm constructs only those VD edges that lie outside the polygons. The concept of oriented sites makes it possible to avoid constructing VD edges located inside the polygons. 2. The list of events in the sweepline algorithm has a special property: the majority of events are associated with “medium” polygon vertices, where one incident polygon side lies behind the sweepline and the other in front of it. The proposed algorithm processes such events in constant time rather than in logarithmic time, as in the general Fortune algorithm. The proposed algorithm is fully implemented and tested on a large number of examples. The high reliability and efficiency of the algorithm are also confirmed by computational experiments with complex sets of several thousand polygons. It should be noted that, despite the considerable time that has passed since the publication of Fortune's algorithm in 1986, a full-scale implementation of this algorithm for an arbitrary set of segment sites has not been made. The proposed algorithm fills this gap for an important special case: a set of sites formed by polygons.
Keywords: voronoi diagram, sweepline, polygon sites, Fortune's algorithm, segment sites
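The preprocessing step and the notion of "medium" vertices in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: polygons are given as vertex lists, the sweepline is assumed to move in the x direction, and the function names are hypothetical. The sweepline VD construction itself is not shown.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Edge = Tuple[Point, Point]


def signed_area(polygon: List[Point]) -> float:
    """Shoelace formula: positive for counter-clockwise vertex order."""
    pairs = zip(polygon, polygon[1:] + polygon[:1])
    return 0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)


def oriented_sites(polygon: List[Point]) -> Tuple[List[Point], List[Edge]]:
    """Return vertex sites and directed edge sites with the interior on the left.

    Traversing a simple polygon counter-clockwise keeps its interior to the
    left of every directed edge, so clockwise input is reversed first.
    """
    if signed_area(polygon) < 0:
        polygon = polygon[::-1]
    n = len(polygon)
    edges = [(polygon[i], polygon[(i + 1) % n]) for i in range(n)]
    return list(polygon), edges


def medium_vertices(polygon: List[Point]) -> List[Point]:
    """Vertices whose two incident sides lie on opposite sides of a vertical
    sweepline passing through the vertex (assumed sweep direction: x)."""
    n = len(polygon)
    result = []
    for i in range(n):
        prev_x, x, next_x = polygon[i - 1][0], polygon[i][0], polygon[(i + 1) % n][0]
        if prev_x < x < next_x or next_x < x < prev_x:
            result.append(polygon[i])
    return result
```

For a diamond with vertices (0, 0), (1, -1), (2, 0), (1, 1), only the top and bottom vertices are "medium" under this assumption; the leftmost and rightmost vertices have both sides on the same side of the sweepline.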
Procedia PDF Downloads 177
1111 Mechanisms Underlying Comprehension of Visualized Personal Health Information: An Eye Tracking Study
Authors: Da Tao, Mingfu Qin, Wenkai Li, Tieyan Wang
Abstract:
While the use of electronic personal health portals has gained increasing popularity in the healthcare industry, users usually experience difficulty in comprehending and correctly responding to personal health information, partly due to inappropriate or poor presentation of the information. The way personal health information is visualized may affect how users perceive and assess their personal health information. This study was conducted to examine the effects of information visualization format and visualization mode on the comprehension and perceptions of personal health information among personal health information users with eye tracking techniques. A two-factor within-subjects experimental design was employed, where participants were instructed to complete a series of personal health information comprehension tasks under varied types of visualization mode (i.e., whether the information visualization is static or dynamic) and three visualization formats (i.e., bar graph, instrument-like graph, and text-only format). Data on a set of measures, including comprehension performance, perceptions, and eye movement indicators, were collected during the task completion in the experiment. Repeated measure analysis of variance analyses (RM-ANOVAs) was used for data analysis. The results showed that while the visualization format yielded no effects on comprehension performance, it significantly affected users’ perceptions (such as perceived ease of use and satisfaction). The two graphic visualizations yielded significantly higher favorable scores on subjective evaluations than that of the text format. While visualization mode showed no effects on users’ perception measures, it significantly affected users' comprehension performance in that dynamic visualization significantly reduced users' information search time. Both visualization format and visualization mode had significant main effects on eye movement behaviors, and their interaction effects were also significant. 
While the bar graph format and text format had similar time to first fixation across dynamic and static visualizations, the instrument-like graph format had a larger time to first fixation for dynamic visualization than for static visualization. The two graphic visualization formats yielded shorter total fixation duration compared with the text-only format, indicating their ability to improve information comprehension efficiency. The results suggest that dynamic visualization can improve efficiency in comprehending important health information, and graphic visualization formats were favored more by users. The findings help explain the underlying comprehension mechanism of visualized personal health information and provide important implications for the optimal design and visualization of personal health information.
Keywords: eye tracking, information comprehension, personal health information, visualization
Procedia PDF Downloads 109
1110 A Mixed Methods Study Aimed at Exploring the Conceptualization of Orthorexia Nervosa on Instagram
Authors: Elena V. Syurina, Sophie Renckens, Martina Valente
Abstract:
Objective: The objective of this study was to investigate the nature of the conversation around orthorexia nervosa (ON) on Instagram. Methods: The present study was conducted using mixed methods, combining a concurrent triangulation and sequential explanatory design. First, 3027 pictures posted on Instagram using #Orthorexia were analyzed. Then, a questionnaire about Instagram use related to ON was completed entirely by 185 respondents. These two quantitative data sources were statistically analyzed and triangulated afterwards. Finally, 9 interviews were conducted, to more deeply investigate what is being said about ON on Instagram and what the motivations to post about it are. Results: Four main categories of pictures were found to be represented in Instagram posts about ON: ‘food’, ‘people’, ‘text’, and ‘other.’ Savory and unprocessed food was most highly represented within the food category, and pictures of people were mostly pictures of the account holder. People who self-identify as having ON were more likely to post about ON, and they were significantly more likely to post about ‘food’, ‘people’ and ‘text.’ The goal of the posts was to raise awareness around ON, as well as to provide support for people who believe to be suffering from it. Conclusion: Since the conversation around ON on Instagram is supportive, it could be beneficial to consider Instagram use in the treatment of ON. However, more research is needed on a larger scale.Keywords: orthorexia nervosa, Instagram, social media, disordered eating
Procedia PDF Downloads 138
1109 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review
Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha
Abstract:
Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision-making has not been far-fetched. Proper classification of this textual information in a given context has also been very difficult. As a result, we decided to conduct a systematic review of previous literature on sentiment classification and AI-based techniques that have been used in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that can correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy by assessing different artificial intelligence techniques. We evaluated over 250 articles from digital sources like ScienceDirect, ACM, Google Scholar, and IEEE Xplore and whittled down the number of research to 31. Findings revealed that Deep learning approaches such as CNN, RNN, BERT, and LSTM outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also necessary for developing a robust sentiment classifier and can be obtained from places like Twitter, movie reviews, Kaggle, SST, and SemEval Task4. Hybrid Deep Learning techniques like CNN+LSTM, CNN+GRU, CNN+BERT outperformed single Deep Learning techniques and machine learning techniques. Python programming language outperformed Java programming language in terms of sentiment analyzer development due to its simplicity and AI-based library functionalities. Based on some of the important findings from this study, we made a recommendation for future research.Keywords: artificial intelligence, natural language processing, sentiment analysis, social network, text
Procedia PDF Downloads 116
1108 An Eco-Translatology Approach to the Translation of Spanish Tourism Advertising in Digital Communication in Chinese
Authors: Mingshu Liu, Laura Santamaria, Xavier Carmaniu Mainadé
Abstract:
As one of the sectors most affected by the COVID-19 pandemic, tourism faces challenges in revitalizing the industry. At the same time, the situation offers a good opportunity to use digital communication as an effective tool for tourism promotion. Our proposal aims to verify the linguistic operations on online platforms in China. The research is based on the theory of Eco-Translatology put forward by Gengshen Hu, whose contribution focuses on the translator's adaptation to the ecosystem environment and on three elaborated parameters (linguistic, cultural, and communicative). We also relate it to Even-Zohar's and Toury's theoretical postulates on the Polysystem to develop an interdisciplinary methodology. Such a methodology allows us to analyze personal treatments and phraseology in the target text. As for the corpus, we adopt the official Spanish-language website of Turismo de España as the source text and the postings on the two major social networks in China, Weibo and WeChat, in 2019. Through qualitative analysis, we conclude that, in the tourism advertising campaign on Chinese social networks, chengyu (Chinese phraseology) and honorific titles are used very frequently.
Keywords: digital communication, eco-translatology, polysystem theory, tourism advertising
Procedia PDF Downloads 229
1107 Critical Mathematics Education and School Education in India: A Study of the National Curriculum Framework 2022 for Foundational Stage
Authors: Eish Sharma
Abstract:
Literature around Mathematics education suggests that democratic attitudes can be strengthened through teaching and learning Mathematics. Furthermore, connections between critical education and Mathematics education are observed in the light of critical pedagogy to locate Critical Mathematics Education (CME) as the theoretical framework. Critical pedagogy applied to Mathematics education is identified as one of the key themes subsumed under Critical Mathematics Education. Through the application of critical pedagogy in mathematics, unequal power relations and social injustice can be identified, analyzed, and challenged. The research question is: have educational policies in India viewed the role of critical pedagogy applied to mathematics education (i.e., critical mathematics education) to ensure social justice as an educational aim? The National Curriculum Framework (NCF), 2005 upholds education for democracy and the role of mathematics education in facilitating the same. More than this, NCF 2005 rests on Critical Pedagogy Framework and it recommends that critical pedagogy must be practiced in all dimensions of school education. NCF 2005 visualizes critical pedagogy for social sciences as well as sciences, stating that the science curriculum, including mathematics, must be used as an “instrument for achieving social change to reduce the divide based on economic class, gender, caste, religion, and the region”. Furthermore, the implementation of NCF 2005 led to a reform in the syllabus and textbooks in school mathematics at the national level, and critical pedagogy was applied to mathematics textbooks at the primary level. This intervention led to ethnomathematics and critical mathematics education in the school curriculum in India for the first time at the national level. 
In October 2022, the Ministry of Education launched the National Curriculum Framework for Foundational Stage (NCF-FS), developed in light of the National Education Policy, 2020, for children in the three to eight years age group. I want to find out whether critical pedagogy-based education and critical pedagogy-based mathematics education are carried forward in NCF 2022. To find this out, an argument analysis of specific sections of the National Curriculum Framework 2022 document needs to be carried out. Des Gasper suggests two tables. The first table contains four columns, namely, text component, comments on meanings, possible reformulation of the same text, and identified conclusions and assumptions (both stated and unstated). This table is for understanding the components and meanings of the text and is based on Scriven’s model for understanding the components and meanings of words in the text. The second table contains four columns, i.e., claim identified, given data, warrant, and stated qualifier/rebuttal. This table is for describing the structure of the argument, how and how well the components fit together, and is called ‘George Table diagram based on Toulmin-Bunn Model’.
Keywords: critical mathematics education, critical pedagogy, social justice, ethnomathematics
Procedia PDF Downloads 82
1106 Inclusion in Rabbinic and Protestant Translations of the Hebrew book of Proverbs (1865) History of Translations and Cultural Inclusion Terms of Reference
Authors: Mh. D Tammam Ayoubi
Abstract:
The Old Testament has been translated into many languages, including Arabic. There have been consecutive translations of it since Islamic antiquity. The Rabbinic translation, which rendered the Hebrew text into Arabic without a linguistic medium, appeared later. It was followed by several Orthodox and Jesuit trials, including the Protestant translation. Those two translations were chosen to study the book of Proverbs, which is classified as one of the books of Wisdom; something that distances it from being either symbolical or historical and makes the translation the subject of the translator's ideology starting from the incorporated cultural element be it Jewish, Aramaic or Islamist (Mu'tazila) of the first translation, or through the choice of the equivalent signs of origin, and the neutralization of the Rabbinic, Arabic, and Greek element of the second translation. The various Protestant translation of different authors has contributed to the multiplicity of the term of reference, mostly Christian, in contrast with the single reference of one author, which carries multiple conflicting cultural facades when it comes to the Rabbinic translation. This has led to a change in the origin through the inclusion of those various verbal or interpretative elements in the book of Proverbs, which will be examined in the verses through a comparative study with the original Hebrew text or the cultural terms or references.Keywords: rabbinic and protestant translations, book of proverbs, hebrew, protestant translation
Procedia PDF Downloads 80
1105 An Examination of the Effectiveness of iPad-Based Augmentative and Alternative Intervention on Acquisition, Generalization and Maintenance of the Requesting Information Skills of Children with Autism
Authors: Amaal Almigal
Abstract:
Technology has been argued to offer distinct advantages and benefits for teaching children with autism spectrum disorder (ASD) to communicate. One aspect of this technology is augmentative and alternative communication (AAC) systems such as picture exchange or speech generation devices. Whilst there has been significant progress in teaching these children to request their wants and needs with AAC, there remains a need for developing technologies that can really make a difference in teaching them to ask questions. iPad-based AAC can be effective for communication. However, the effectiveness of this type of AAC in teaching children to ask questions needs to be examined. Thus, to examine the effectiveness of iPad-based AAC in teaching children with ASD to ask questions, this research will test whether an iPad leads to more learning than a traditional approach using picture and text cards. Two groups of children who use AAC will be taught to ask ‘What is it?’ questions. With the first group, low-tech AAC picture and text cards will be used, while an iPad-based AAC application called Proloquo2Go will be used with the second group. Interviews with teachers and parents will be conducted before and after the experiment. The children’s perspectives will also be considered. The initial outcomes of this research indicate that the iPad can be an effective tool to help children with autism to ask questions.
Keywords: autism, communication, information, iPad, pictures, requesting
Procedia PDF Downloads 264
1104 A Critical Discourse Study of Gender Identity Issues in Daniyal Mueenuddin’s Short Story “Saleema”
Authors: Zafar Ali
Abstract:
The aim of this research is to highlight problems that are faced by women at the hands of men. Males in Pakistani society have power and use this power for the exploitation of women. Further, the purpose of the study is to make societies like Pakistan and especially the young generation, aware and enable them to resist such issues, and the role of discourse in this regard is to minimize its political and social repercussions. The study finds out different discursive techniques and manipulative language used in the short story to construct gender identity. The study also investigates socio-economic roles in the construction of gender identity. This study has been completed with the help of Critical Discourse Analysis (CDA) principles. CDA principles have been applied to the text of the selected short story Saleema from Daniyal Mueenuddin’s collection In Other Rooms, Other Wonders. Related passages, structures, expressions, and text are analyzed from the point of view of CDA, especially Norman Fairclough’s CDA approach. It was found from the analysis that women have no identity of their own in patriarchal societies like Pakistan. Further, it was found women are mistreated, and they have a very limited and defined role in Pakistan. They cannot go beyond the limit defined to them by men.Keywords: gender issues, resourceful groups, CDA, exploitation
Procedia PDF Downloads 132
1103 Forecasting Residential Water Consumption in Hamilton, New Zealand
Authors: Farnaz Farhangi
Abstract:
Many people in New Zealand believe that access to water is inexhaustible, a belief that stems from a history of virtually unrestricted access to it. For a region like Hamilton, one of New Zealand’s fastest growing cities, it is crucial for policy makers to know about future water consumption and the implementation of rules and regulations such as universal water metering. Hamilton residents use water freely and have no idea how much water they use. Hence, one of the proposed objectives of this research is to forecast water consumption using different methods. Residential water consumption time series exhibit seasonal and trend variations. Seasonality is the pattern caused by repeating events such as weather conditions in summer and winter, public holidays, etc. The problem with this seasonal fluctuation is that it dominates the other time series components and makes it difficult to determine other variations (such as the effect of educational campaigns, regulation, etc.) in the time series. Apart from seasonality, a stochastic trend is combined with the seasonality and affects the forecasting results differently. According to the forecasting literature, preprocessing (de-trending and de-seasonalization) is essential for better-performing forecasts, while other researchers argue that seasonally non-adjusted data should be used. Hence, I address the question: is preprocessing essential? A wide range of forecasting methods exists, with different pros and cons. In this research, I apply double seasonal ARIMA and an Artificial Neural Network (ANN), considering diverse elements such as seasonality and calendar effects (public and school holidays), and combine their results to find the best predicted values. I examine the results of the combined method (hybrid model) and the individual methods and compare their accuracy and robustness. In order to use ARIMA, the data should be stationary. ANN, in turn, has been applied successfully to forecasting seasonal and trend time series. Using a hybrid model is a way to improve the accuracy of the methods. Because water demand is dominated by different seasonalities, I combine different methods to capture its sensitivity to weather conditions, calendar effects, and other seasonal patterns. The advantage of this combination is the reduction of errors by averaging the individual models. It is also useful when we are not sure about the accuracy of each forecasting model, and it can ease the problem of model selection. Using daily residential water consumption data from January 2000 to July 2015 in Hamilton, I show how predictions by different methods vary. ANN yields more accurate forecasts than the other method, and preprocessing is essential when we use seasonal time series. Using the hybrid model reduces average forecasting errors and increases performance.
Keywords: artificial neural network (ANN), double seasonal ARIMA, forecasting, hybrid model
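The combination step described in this abstract, averaging the forecasts of the individual models, can be sketched as below. This is a minimal illustration, not the author's code: the function names are hypothetical, equal weights are an assumption (the abstract does not state the weighting), and the numbers in the usage example are invented purely to show the arithmetic.

```python
from typing import List, Sequence


def combine_forecasts(forecasts: Sequence[Sequence[float]]) -> List[float]:
    """Equal-weight average of several models' forecasts, step by step."""
    if not forecasts:
        raise ValueError("at least one forecast is required")
    horizon = len(forecasts[0])
    if any(len(f) != horizon for f in forecasts):
        raise ValueError("all forecasts must cover the same horizon")
    return [sum(values) / len(values) for values in zip(*forecasts)]


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented illustrative values, not data from the study.
arima_forecast = [100.0, 110.0, 120.0]
ann_forecast = [104.0, 106.0, 124.0]
observed = [102.0, 108.0, 122.0]

hybrid_forecast = combine_forecasts([arima_forecast, ann_forecast])
```

In this toy case the two models err in opposite directions, so the average cancels their errors exactly; with real data, averaging helps only to the extent that the individual errors are not strongly correlated.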
Procedia PDF Downloads 339