Search results for: text information retrieval
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 11544

11184 Animated Poetry-Film: Poetry in Action

Authors: Linette van der Merwe

Abstract:

It is known that visual artists, performing artists, and literary artists have inspired each other since time immemorial. The enduring, symbiotic relationship between the various art genres is evident where words, colours, lines, and sounds act as metaphors, a physical separation of the transcendental reality of art. Simonides of Keos (c. 556-468 BC) confirmed this, stating that a poem is a talking picture, or, in a more modern expression, a picture is worth a thousand words. It can be seen as an ancient relationship, originating from the epigram (tombstone or artefact inscriptions), the carmen figuratum (figure poem), and the ekphrasis (a description in the form of a poem of a work of art). Visual artists, including Michelangelo, Leonardo da Vinci, and Goethe, wrote poems and songs. Goya, Degas, and Picasso are famous for their works of art and for trying their hands at poetry. Afrikaans writers whose fine art is often published together with their writing, as in the case of Andries Bezuidenhout, Breyten Breytenbach, Sheila Cussons, Hennie Meyer, Carina Stander, and Johan van Wyk, among others, are not a strange phenomenon either. Imitating one art form into another art form is a form of translation, transposition, contemplation, and discovery of artistic impressions, showing parallel interpretations rather than physical comparison. It is especially about the harmony that exists between the different art genres, i.e., a poem that describes a painting or a visual text that portrays a poem that becomes a translation, interpretation, and rediscovery of the verbal text, or rather, from the word text to the image text. Poetry-film, as a form of such a translation of the word text into an image text, can be considered a hybrid, transdisciplinary art form that connects poetry and film. Poetry-film is regarded as an intertwined entity of word, sound, and visual image. 
It is an attempt to transpose and transform a poem into a new artwork that makes the poem more accessible to people who are not necessarily open to the written word and will, in effect, attract a larger audience to a genre that usually has a limited market. Poetry-film is considered a creative expression of an inverted ekphrastic inspiration, a visual description, interpretation, and expression of a poem. Research also emphasises that animated poetry-film is not widely regarded as a genre of anything and is thus severely under-theorized. This paper will focus on Afrikaans animated poetry-films as a multimodal transposition of a poem text to an animated poetry film, with specific reference to animated poetry-films in Filmverse I (2014) and Filmverse II (2016).

Keywords: poetry film, animated poetry film, poetic metaphor, conceptual metaphor, monomodal metaphor, multimodal metaphor, semiotic metaphor, multimodality, metaphor analysis, target domain, source domain

Procedia PDF Downloads 40
11183 Recognition of Spelling Problems during the Text in Progress: A Case Study on the Comments Made by Portuguese Students Newly Literate

Authors: E. Calil, L. A. Pereira

Abstract:

The acquisition of orthography is a complex process, involving both lexical and grammatical questions. This learning occurs simultaneously with the mastery of multiple textual aspects (e.g., graphs, punctuation, etc.). However, most of the research on orthographic acquisition focuses on this acquisition from an autonomous point of view, separated from the process of textual production. This means that its object of analysis is the production of words selected by the researcher or of requested sentences in an experimental and controlled setting. In addition, the Spelling Problems (SPs) are identified by the researcher on the sheet of paper. Considering the perspective of Textual Genetics, from an enunciative approach, this study will discuss the SPs recognized by dyads of newly literate students while they are writing a text collaboratively. Six proposals of textual production were registered, requested by a 2nd year teacher of a Portuguese Primary School between January and March 2015. In our case study we discuss the SPs recognized by the dyad B and L (7 years old). We adopted as a methodological tool the Ramos System audiovisual record. This system allows real-time capture of the text in process and of the face-to-face dialogue between both students and their teacher, and also captures the body movements and facial expressions of the participants during textual production proposals in the classroom. In these ecological conditions of multimodal registration of collaborative writing, we could identify the emergence of SPs in two dimensions: i. In the product (finished text): identification of SPs without recursive graphic marks (without erasures) and identification of SPs with erasures, indicating the recognition of the SP by the student; ii. In the process (text in progress): identification of comments made by students about recognized SPs. Given this, we analyzed the comments on identified SPs during the text in progress. 
These comments characterize a type of reformulation referred to as Commented Oral Erasure (COE). The COE has two enunciative forms: Simple Comment (SC), such as ' 'X' is written with 'Y' '; or Unfolded Comment (UC), such as ' 'X' is written with 'Y' because...'. The spelling COE may also occur before or during the SP (Early Spelling Recognition - ESR) or after the SP has been entered (Later Spelling Recognition - LSR). There were 631 words entered in the 6 stories written by the B-L dyad, 145 of them containing some type of SP. During the text in progress, the students orally recognized 174 SPs, 46 of which were identified in advance (ESRs) and 128 of which were identified later (LSRs). If we consider that the 88 erasure SPs in the product indicate some form of SP recognition, we can observe that there were twice as many SPs recognized orally. The ESR was characterized by SC when students asked their colleague or teacher how to spell a given word. The LSR presented predominantly UC, verbalizing meta-orthographic arguments, mostly made by L. These results indicate that writing in dyads is an important didactic strategy for the promotion of metalinguistic reflection, favoring the learning of spelling.

Keywords: collaborative writing, erasure, learning, metalinguistic awareness, spelling, text production

Procedia PDF Downloads 145
11182 Detecting Elderly Abuse in US Nursing Homes Using Machine Learning and Text Analytics

Authors: Minh Huynh, Aaron Heuser, Luke Patterson, Chris Zhang, Mason Miller, Daniel Wang, Sandeep Shetty, Mike Trinh, Abigail Miller, Adaeze Enekwechi, Tenille Daniels, Lu Huynh

Abstract:

Machine learning and text analytics have been used to analyze child abuse, cyberbullying, domestic abuse and domestic violence, and hate speech. However, to the authors’ knowledge, no research to date has used these methods to study elder abuse in nursing homes or skilled nursing facilities from field inspection reports. We used machine learning and text analytics methods to analyze 356,000 inspection reports, which were extracted from CMS Form-2567 field inspections of US nursing homes and skilled nursing facilities between 2016 and 2021. Our algorithm detected occurrences of the various types of abuse, including physical abuse, psychological abuse, verbal abuse, sexual abuse, and passive and active neglect. For example, to detect physical abuse, our algorithm searches for combinations of phrases and words suggesting willful infliction of damage (hitting, pinching or burning, tethering, tying), or consciously ignoring an emergency. To detect occurrences of elder neglect, our algorithm looks for combinations of phrases and words suggesting both passive neglect (neglecting vital needs, allowing malnutrition and dehydration, allowing decubiti, deprivation of information, limitation of freedom, negligence toward safety precautions) and active neglect (intimidation and name-calling, tying the victim up to prevent falls without consent, consciously ignoring an emergency, not calling a physician in spite of indication, stopping important treatments, failure to provide essential care, deprivation of nourishment, leaving a person alone for an inappropriate amount of time, excessive demands in a situation of care). We further compare the prevalence of abuse before and after Covid-19 related restrictions on nursing home visits. We also identified the facilities with the highest number of cases of abuse with no abuse facilities within a 25-mile radius as the most likely candidates for additional inspections. 
We also built an interactive display to visualize the location of these facilities.
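
The phrase-and-word search described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the category names, the phrase lists in `ABUSE_PATTERNS`, and the function `detect_abuse_types` are hypothetical, loosely drawn from the examples quoted above, and a real system would need far richer patterns and context handling.

```python
import re

# Hypothetical phrase lists, loosely based on the examples quoted in the abstract.
ABUSE_PATTERNS = {
    "physical_abuse": ["hitting", "pinching", "burning", "tethering", "tying"],
    "passive_neglect": ["malnutrition", "dehydration", "decubiti"],
    "active_neglect": ["intimidation", "name-calling", "deprivation of nourishment"],
}


def detect_abuse_types(report_text):
    """Return the sorted list of categories with at least one phrase in the text."""
    text = report_text.lower()
    found = set()
    for category, phrases in ABUSE_PATTERNS.items():
        for phrase in phrases:
            # Word boundaries avoid matching a phrase inside a longer word.
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                found.add(category)
                break  # one hit is enough to flag the category
    return sorted(found)


report = "Staff was observed hitting the resident; the resident showed signs of dehydration."
print(detect_abuse_types(report))
```

Plain keyword matching of this kind flags candidate reports only; negations ("no evidence of hitting") would still match, which is one reason a production pipeline would combine it with a trained classifier.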

Keywords: machine learning, text analytics, elder abuse, elder neglect, nursing home abuse

Procedia PDF Downloads 120
11181 Adapting Tools for Text Monitoring and for Scenario Analysis Related to the Field of Social Disasters

Authors: Svetlana Cojocaru, Mircea Petic, Inga Titchiev

Abstract:

Humanity is faced more and more often with various social disasters, which in turn can generate new accidents and catastrophes. To mitigate their consequences, it is important to obtain the earliest possible signals about events that are occurring or may occur and to prepare corresponding scenarios that could be applied. Our research focuses on solving two problems in this domain: identifying signals that an accident has occurred or may occur, and mitigating some consequences of disasters. To solve the first problem, methods of selecting and processing texts from the Internet are developed. Information in Romanian is of special interest to us. In order to obtain these tools, we follow several steps, divided into a preparatory stage and a processing stage. During the first stage, we manually collected over 724 news articles and classified them into 10 categories of social disasters. Together they amount to more than 150 thousand words. Using this information, a controlled vocabulary of more than 300 keywords was developed, which will help in the classification and identification of texts related to the field of social disasters. To solve the second problem, the formalism of Petri nets has been used. We deal with the problem of evacuating inhabitants in useful time. Analysis methods such as the reachability or coverability tree and the invariants technique will be used to determine dynamic properties of the modeled systems. To perform a case study of the properties of the evacuation system extended with time, the analysis modules of PIPE, such as Generalized Stochastic Petri Nets (GSPN) Analysis, Simulation, State Space Analysis, and Invariant Analysis, have been used. These modules helped us to obtain the average number of persons situated in the rooms and other quantitative properties and characteristics related to its dynamics.
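
To make the reachability and invariant analysis mentioned above concrete, here is a minimal sketch of a reachability search over a toy evacuation Petri net. The net itself (places `room`, `corridor`, `outside` and the two transitions) is a hypothetical example, not the model from the paper, and the code does not use PIPE.

```python
from collections import deque

# Hypothetical toy net: people move room -> corridor -> outside.
PLACES = ("room", "corridor", "outside")

# Each transition maps to (tokens consumed per place, tokens produced per place).
TRANSITIONS = {
    "leave_room": ({"room": 1}, {"corridor": 1}),
    "exit_building": ({"corridor": 1}, {"outside": 1}),
}


def enabled(marking, consume):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking[PLACES.index(p)] >= n for p, n in consume.items())


def fire(marking, consume, produce):
    """Return the marking obtained by firing a transition."""
    new_marking = list(marking)
    for place, n in consume.items():
        new_marking[PLACES.index(place)] -= n
    for place, n in produce.items():
        new_marking[PLACES.index(place)] += n
    return tuple(new_marking)


def reachable_markings(initial):
    """Breadth-first exploration of all markings reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for consume, produce in TRANSITIONS.values():
            if enabled(marking, consume):
                nxt = fire(marking, consume, produce)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


markings = reachable_markings((2, 0, 0))  # two people start in the room
print(sorted(markings))
```

In this toy net the total number of tokens is a place invariant (nobody appears or disappears), and the marking with everyone in `outside` is reachable, which is the kind of property the reachability tree and invariant techniques are meant to establish.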

Keywords: lexicon of disasters, modelling, Petri nets, text annotation, social disasters

Procedia PDF Downloads 184
11180 Prosody of Text Communication: Inducing Synchronization and Coherence in Chat Conversations

Authors: Karolina Ziembowicz, Andrzej Nowak

Abstract:

In the current study, we examined the consequences of adding prosodic cues to text communication by allowing users to observe the process of message creation while engaged in dyadic conversations. In the first condition, users interacted through a traditional chat that requires pressing ‘enter’ to make a message visible to an interlocutor. In another, text appeared on the screen simultaneously as the sender was writing it, letter after letter (Synchat condition), so that users could observe the varying rhythm of message production, precise timing of message appearance, typos and their corrections. The results show that the ability to observe the dynamics of message production had a twofold effect on the social interaction process. First, it enhanced the relational aspect of communication – interlocutors synchronized their emotional states during the interaction, their communication included more statements on relationship building, and they evaluated the Synchat medium as more personal and emotionally engaging. Second, it increased the coherence of communication, reflected in greater continuity of the topics raised in Synchat conversations. The results are discussed from the interaction design (IxD) perspective.

Keywords: chat communication, online conversation, prosody, social synchronization, interaction incoherence, relationship building

Procedia PDF Downloads 122
11179 An Unsupervised Domain-Knowledge Discovery Framework for Fake News Detection

Authors: Yulan Wu

Abstract:

With the rapid development of social media, the issue of fake news has gained considerable prominence, drawing the attention of both the public and governments. The widespread dissemination of false information poses a tangible threat across multiple domains of society, including politics, economy, and health. However, much research has concentrated on supervised training models within specific domains, and their effectiveness diminishes when applied to identify fake news across multiple domains. To solve this problem, some approaches based on domain labels have been proposed. By assigning news to its specific domain in advance, judges in the corresponding field may be more accurate at identifying fake news. However, these approaches disregard the fact that news records can pertain to multiple domains, resulting in a significant loss of valuable information. In addition, the datasets used for training must all be domain-labeled, which creates unnecessary complexity. To solve these problems, an unsupervised domain knowledge discovery framework for fake news detection is proposed. Firstly, to effectively retain the multidomain knowledge of the text, a low-dimensional vector is generated for each news text to capture domain embeddings. Subsequently, a feature extraction module utilizing the domain embeddings discovered without supervision is used to extract the comprehensive features of the news. Finally, a classifier is employed to determine the authenticity of the news. To verify the proposed framework, a test is conducted on existing widely used datasets, and the experimental results demonstrate that this method is able to improve the detection performance for fake news across multiple domains. Moreover, even in datasets that lack domain labels, this method can still effectively transfer domain knowledge, which can reduce the time consumed by tagging without sacrificing detection accuracy.

Keywords: fake news, deep learning, natural language processing, multiple domains

Procedia PDF Downloads 60
11178 Optimizing the Readability of Orthopaedic Trauma Patient Education Materials Using ChatGPT-4

Authors: Oscar Covarrubias, Diane Ghanem, Christopher Murdock, Babar Shafiq

Abstract:

Introduction: ChatGPT is an advanced language AI tool designed to understand and generate human-like text. The aim of this study is to assess the ability of ChatGPT-4 to re-write orthopaedic trauma patient education materials at the recommended 6th-grade level. Methods: Two independent reviewers accessed ChatGPT-4 (chat.openai.com) and gave identical instructions to simplify the readability of provided text to a 6th-grade level. All trauma-related articles by the Orthopaedic Trauma Association (OTA) and American Academy of Orthopaedic Surgeons (AAOS) were sequentially provided. The academic grade level was determined using the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE). Paired t-tests and Wilcoxon rank-sum tests were used to compare the FKGL and FRE between the ChatGPT-4 revised and original text. Inter-rater correlation coefficient (ICC) was used to assess variability in ChatGPT-4 generated text between the two reviewers. Results: ChatGPT-4 significantly reduced FKGL and increased FRE scores in the OTA (FKGL: 5.7±0.5 compared to the original 8.2±1.1, FRE: 76.4±5.7 compared to the original 65.5±6.6, p < 0.001) and AAOS articles (FKGL: 5.8±0.8 compared to the original 8.9±0.8, FRE: 76±5.5 compared to the original 56.7±5.9, p < 0.001). On average, 14.6% of OTA and 28.6% of AAOS articles required at least two revisions by ChatGPT-4 to achieve a 6th-grade reading level. ICC demonstrated poor reliability for FKGL (OTA 0.24, AAOS 0.45) and moderate reliability for FRE (OTA 0.61, AAOS 0.73). Conclusion: This study provides a novel, simple and efficient method using language AI to optimize the readability of patient education content which may only require the surgeon’s final proofreading. This method would likely be as effective for other medical specialties.
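
For readers unfamiliar with the two readability scores used in this study, the standard Flesch formulas can be sketched as follows. The function names and the example counts are illustrative assumptions; the abstract does not say which tool the authors used to count words, sentences, and syllables, so the sketch takes those counts as inputs rather than estimating them from text.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical passage: 100 words, 10 sentences, 130 syllables.
print(round(flesch_reading_ease(100, 10, 130), 3))
print(round(flesch_kincaid_grade(100, 10, 130), 2))
```

Both scores depend only on average sentence length and average syllables per word, which is why shortening sentences and replacing long words lowers FKGL and raises FRE together, as in the results reported above.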

Keywords: artificial intelligence, AI, chatGPT, patient education, readability, trauma education

Procedia PDF Downloads 52
11177 Architectural Experience of the Everyday in Phuket Old Town

Authors: Thirayu Jumsai na Ayudhya

Abstract:

Initial attempts, in my previous research, to understand what architecture means to people as they go about their everyday lives revealed that bodies of knowledge such as environmental psychology, environmental perception, and environmental aesthetics did not adequately address a perceived need for a contextualized and holistic theoretical framework. That research found that the way people make sense of their everyday architecture can be described in terms of four super-ordinate themes: (1) building in urban (text), (2) building in (text), (3) building in human (text), and (4) building in time (text). To understand more comprehensively how people make sense of their everyday architectural experience, this ongoing research selected Phuket Old Town as the focal urban context, where the distinctive Chino-Portuguese character is remarkable. It is expected that in a unique urban context like Phuket Old Town, unprecedented super-ordinate themes will be unveiled through the reflection of people’s everyday experiences. The ongoing research on people’s architectural experience conducted on Phuket Island, Thailand, will be presented succinctly. The research will address the question of how people make sense of their everyday architecture and buildings, especially in a unique urban context such as Phuket Old Town, and identify the ways in which they do so. Participant-Produced-Photograph (PPP) and Interpretative Phenomenological Analysis (IPA) are adopted as the main methodologies. PPP allows people to express experiences of their everyday urban context freely, without any interference or forced data generation by researchers. With the IPA methodology, a small pool of participants is considered desirable given the detailed level of analysis required and its potential to produce a meaningful outcome.

Keywords: architectural experience, the everyday architecture, Phuket, Thailand

Procedia PDF Downloads 273
11176 Exploring Social Impact of Emerging Technologies from Futuristic Data

Authors: Heeyeul Kwon, Yongtae Park

Abstract:

Despite their highly touted benefits, emerging technologies have unleashed pervasive concerns regarding unintended and unforeseen social impacts. Thus, those wishing to create safe and socially acceptable products need to identify such side effects and mitigate them prior to market proliferation. Various methodologies in the field of technology assessment (TA), namely Delphi, impact assessment, and scenario planning, have been widely used in such circumstances. However, the literature faces a major limitation in its sole reliance on participatory workshop activities. It unfortunately overlooks the availability of a massive untapped source of futuristic information flooding through the Internet. This research thus seeks to gain insights into the utilization of futuristic data, that is, future-oriented documents from the Internet, as a supplementary method to generate social impact scenarios whilst capturing the perspectives of experts from a wide variety of disciplines. To this end, network analysis is conducted based on the social keywords extracted from the futuristic documents by text mining, which is then used as a guide to produce a comprehensive set of detailed scenarios. Our proposed approach facilitates harmonized depictions of possible hazardous consequences of emerging technologies and thereby makes decision makers more aware of, and responsive to, broad qualitative uncertainties.

Keywords: emerging technologies, futuristic data, scenario, text mining

Procedia PDF Downloads 472
11175 National Image in the Age of Mass Self-Communication: An Analysis of Internet Users' Perception of Portugal

Authors: L. Godinho, N. Teixeira

Abstract:

Nowadays, the massification of Internet access represents one of the major challenges to the traditional powers of the State, among which is the power to control its external image. The virtual world has also sparked the interest of the social sciences, which consider it a new field of study, an immense open text where sense is expressed. In this paper, that immense text has been accessed so as to understand the perception Internet users from all over the world have of Portugal. Ours is a quantitative and qualitative approach, as we have resorted to buzz, thematic and category analysis. The results confirm the predominance of the sea stereotype in others' vision of the Portuguese people, and show that national image has adapted to network communication through processes of individuation and paganization.

Keywords: national image, internet, self-communication, perception

Procedia PDF Downloads 238
11174 The Capabilities of New Communication Devices in Development of Informing: Case Study Mobile Functions in Iran

Authors: Mohsen Shakerinejad

Abstract:

Due to the growing momentum of technology, the present age is called the age of communication and information. With the astounding progress of communication and information tools, the current world is likened to a "global village", in which a message can be sent from one point of the world to another in less than a minute. However, one of the new sociologists, Alain Touraine, in describing the destructive effects of the changes arising from the development of information appliances, refers to the "new fields for undemocratic social control and the incidence of acute and unrest social and political tensions". Yet in this era, in which people's lives have become industrialized along with the advancement of industry, fast and accurate data transfer breathes new life into the body of society, and various tools should be used according to the features of each society and the progress of science and technology. One of these communication tools is the mobile phone. The cellular phone, as a communication and telecommunication revolution of recent years, has had a great influence on the individual and collective life of societies. This powerful communication tool has had an undeniable effect on all aspects of life, including the social, economic, cultural, and scientific, so that ignoring it in the design, implementation, and enforcement of any system is not wise. Nowadays, knowledge and information are among the most important aspects of human life. Therefore, this article attempts to introduce the potential of mobile phones for receiving and transmitting news and information. Among the numerous capabilities of current mobile phones, features such as text messaging, photography, sound recording, filming, and Internet connectivity indicate the potential of this medium of communication in the process of sending and receiving information. Thus, mobile journalism, as an important component of citizen journalism, now has a unique role in information dissemination.

Keywords: mobile, informing, receiving information, mobile journalism, citizen journalism

Procedia PDF Downloads 384
11173 From Text to Data: Sentiment Analysis of Presidential Election Political Forums

Authors: Sergio V Davalos, Alison L. Watkins

Abstract:

User generated content (UGC) such as website posts has data associated with it: time of the post, gender, location, type of device, and number of words. The text entered in UGC can provide a valuable dimension for analysis. In this research, each user post is treated as a collection of terms (words). In addition to the number of words per post, the frequency of each term is determined by post and by the sum of occurrences in all posts. This research focuses on one specific aspect of UGC: sentiment. Sentiment analysis (SA) was applied to the content (user posts) of two sets of political forums related to the US presidential elections for 2012 and 2016. Sentiment analysis derives data from the text, which enables the subsequent application of data analytic methods. The SASA (SAIL/SAI Sentiment Analyzer) model was used for sentiment analysis. The application of SASA resulted in a sentiment score for each post. Based on the sentiment scores for the posts, there are significant differences between the content and sentiment of the two sets for the 2012 and 2016 presidential election forums. In the 2012 forums, 38% of the forums started with positive sentiment and 16% with negative sentiment. In the 2016 forums, 29% started with positive sentiment and 15% with negative sentiment. There were also changes in sentiment over time. For both elections, as the election got closer, the cumulative sentiment score became negative. The candidate who won each election was in more posts than the losing candidates. In the case of Trump, the number of negative posts exceeded Clinton’s highest number of posts, which were positive. KNIME topic modeling was used to derive topics from the posts. There were also changes in topics and keyword emphasis over time. Initially, the political parties were the most referenced, and as the election got closer the emphasis changed to the candidates. 
The performance of the SASA method proved to predict sentiment better than four other methods in Sentibench. The research resulted in deriving sentiment data from text. In combination with other data, the sentiment data provided insight and discovery about user sentiment in the US presidential elections for 2012 and 2016.
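
The per-post and overall term counts described in this abstract can be sketched in a few lines. This is only an illustration of the counting step: the tokenizer, the function names, and the sample posts are assumptions, and the sketch does not reproduce SASA or the KNIME workflow.

```python
import re
from collections import Counter


def tokenize(post):
    """Lowercase a post and split it into simple word tokens."""
    return re.findall(r"[a-z']+", post.lower())


def term_frequencies(posts):
    """Return (per-post term counts, term counts summed over all posts)."""
    per_post = [Counter(tokenize(post)) for post in posts]
    total = Counter()
    for counts in per_post:
        total.update(counts)
    return per_post, total


# Hypothetical forum posts, invented for illustration.
posts = ["The debate was great", "Great debate, great turnout"]
per_post, total = term_frequencies(posts)
print(per_post[1]["great"], total["great"], sum(per_post[0].values()))
```

The word count per post is simply the sum of that post's term counts, so both quantities mentioned in the abstract fall out of the same structure.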

Keywords: sentiment analysis, text mining, user generated content, US presidential elections

Procedia PDF Downloads 162
11172 Automatic Tagging and Accuracy in Assamese Text Data

Authors: Chayanika Hazarika Bordoloi

Abstract:

This paper is an attempt to work on a highly inflectional language called Assamese. It is also one of the national languages of India, and very little has been achieved for it in terms of computational research. Building a language processing tool for a natural language is not straightforward, as the standard and language representation change at various levels. This paper presents the inflectional suffixes of Assamese verbs and shows how statistical tools, along with linguistic features, can improve tagging accuracy. A conditional random fields (CRF) tool was used to train on and automatically tag the text data; accuracy improved after linguistic features were fed into the training data. Because Assamese is highly inflectional, standardizing its morphology is challenging. Inflectional suffixes are used as a feature of the text data. In order to analyze the inflections of Assamese word forms, a list of suffixes was prepared, comprising all possible suffixes that the various categories can take. Assamese words can be classified into inflected classes (noun, pronoun, adjective and verb) and un-inflected classes (adverb and particle). The corpus used for this morphological analysis has a large number of tokens. The corpus is a mixed corpus, and it has given satisfactory accuracy. The accuracy rate of the tagger has gradually improved with the modified training data.
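
As a rough illustration of how a suffix list can be turned into features for a CRF tagger, the sketch below builds one feature dictionary per token, a form that many CRF toolkits accept. Everything here is an assumption for illustration: the function names, the feature set, and especially the tokens and suffixes, which are synthetic placeholder strings and not real Assamese forms.

```python
def longest_suffix(word, suffixes):
    """Return the longest listed suffix that ends the word, or '' if none does."""
    matches = [s for s in suffixes if word.endswith(s) and len(s) < len(word)]
    return max(matches, key=len) if matches else ""


def token_features(tokens, i, suffixes):
    """Feature dictionary for token i, including its matched inflectional suffix."""
    word = tokens[i]
    return {
        "word": word,
        "last2": word[-2:],
        "last3": word[-3:],
        "known_suffix": longest_suffix(word, suffixes),
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }


def sentence_features(tokens, suffixes):
    """One feature dictionary per token of the sentence."""
    return [token_features(tokens, i, suffixes) for i in range(len(tokens))]


# Synthetic placeholders only: not actual Assamese words or suffixes.
suffixes = ["y", "xy", "qxy"]
tokens = ["abc", "abcxy", "deqxy"]
for features in sentence_features(tokens, suffixes):
    print(features["word"], repr(features["known_suffix"]))
```

Preferring the longest matching suffix is a design choice made here for simplicity; a linguistically informed tagger might instead emit every matching suffix as a separate feature and let the model weigh them.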

Keywords: CRF, morphology, tagging, tagset

Procedia PDF Downloads 174
11171 Examining the Effects of Increasing Lexical Retrieval Attempts in Tablet-Based Naming Therapy for Aphasia

Authors: Jeanne Gallee, Sofia Vallila-Rohter

Abstract:

Technology-based applications are increasingly being utilized in aphasia rehabilitation as a means of increasing intensity of treatment and improving accessibility to treatment. These interactive therapies, often available on tablets, lead individuals to complete language and cognitive rehabilitation tasks that draw upon skills such as the ability to name items, recognize semantic features, count syllables, rhyme, and categorize objects. Tasks involve visual and auditory stimulus cues and provide feedback about the accuracy of a person’s response. Research has begun to examine the efficacy of tablet-based therapies for aphasia, yet much remains unknown about how individuals interact with these therapy applications. Thus, the current study aims to examine the efficacy of a tablet-based therapy program for anomia, further examining how strategy training might influence the way that individuals with aphasia engage with and benefit from therapy. Individuals with aphasia are enrolled in one of two treatment paradigms: traditional therapy or strategy therapy. For ten weeks, all participants receive 2 hours of weekly in-house therapy using Constant Therapy, a tablet-based therapy application. Participants are provided with iPads and are additionally encouraged to work on therapy tasks for one hour a day at home (home logins). For those enrolled in traditional therapy, in-house sessions involve completing therapy tasks while a clinician researcher is present. For those enrolled in the strategy training group, in-house sessions focus on limiting cue use in order to maximize lexical retrieval attempts and naming opportunities. The strategy paradigm is based on the principle that retrieval attempts may foster long-term naming gains. Data have been collected from 7 participants with aphasia (3 in the traditional therapy group, 4 in the strategy training group). 
We examine cue use, latency of responses and accuracy through the course of therapy, comparing results across group and setting (in-house sessions vs. home logins).

Keywords: aphasia, speech-language pathology, traumatic brain injury, language

Procedia PDF Downloads 180
11170 Against Language Disorder: A Way of Reading Dialects in Yan Lianke’s Novels

Authors: Thuy Hanh Nguyen Thi

Abstract:

Using deep reading and text analysis, this article analyzes the use and creation of dialects as a way of demonstrating Yan Lianke's creative stance. The article argues that this is the writer’s narrative strategy in a fight against aphasia, a language disorder of Chinese people and culture, demonstrating a sense of return to folklore and marking his own linguistic style. In terms of verbal text, the dialect in Yan Lianke’s novels is manifested through the use of words, sentences and dialects. There are two types of dialects in Yan Lianke’s novels: the current dialect system and the particular dialect system of the Pa Lau world, created by the writer himself in order to enrich the vocabulary of Han Chinese.

Keywords: Yan Lianke, aphasia, dialect, Pa Lou world

Procedia PDF Downloads 103
11169 Analyzing Semantic Feature Using Multiple Information Sources for Reviews Summarization

Authors: Yu Hung Chiang, Hei Chia Wang

Abstract:

Nowadays, tourism has become a part of life. Before reserving hotels, customers need information about them to help make decisions, and the most important source of that information is online reviews. Due to the dramatic growth of online reviews, it is impossible for tourists to read all reviews manually. Therefore, an automatic review analysis system that summarizes reviews is necessary. The main purpose of the system is to understand the opinions in reviews, which may be positive or negative. In other words, the system analyzes whether the customers who visited the hotel liked it or not. Sentiment analysis methods help the system achieve this purpose. In sentiment analysis, the targets of opinion (here called features) should be recognized to clarify the polarity of the opinion, because the polarity may otherwise be ambiguous. Hence, the study proposes an unsupervised method using Part-Of-Speech patterns and multi-lexicon sentiment analysis to summarize all reviews. We expect this method to help customers find the information they want and make decisions efficiently.

Keywords: text mining, sentiment analysis, product feature extraction, multi-lexicons

Procedia PDF Downloads 308
11168 Off-Line Text-Independent Arabic Writer Identification Using Optimum Codebooks

Authors: Ahmed Abdullah Ahmed

Abstract:

The task of recognizing the writer of a handwritten text has been an attractive research problem in the document analysis and recognition community, with applications in handwriting forensics, paleography, document examination and handwriting recognition. This research presents an automatic method for writer recognition from digitized images of unconstrained writings. Although previous studies have made a great effort to come up with various methods, their performance, especially in terms of accuracy, has fallen short, and room for improvement is still wide open. The proposed technique employs optimal codebook-based writer characterization, where each writing sample is represented by a set of features computed from two codebooks, beginning and ending. Unlike most of the classical codebook-based approaches, which segment the writing into graphemes, this study is based on fragmenting particular areas of writing, namely the beginning and ending strokes. The proposed method starts with contour detection to extract significant information from the handwriting; curve fragmentation is then employed to divide the Beginning and Ending zones of the handwriting into small fragments. Similar fragments of beginning strokes are grouped together to create the Beginning cluster, and similarly, the ending strokes are grouped to create the Ending cluster. These two clusters lead to the development of two codebooks (beginning and ending) by choosing the center of every group of similar fragments. Writings under study are then represented by computing the probability of occurrence of codebook patterns. The probability distribution is used to characterize each writer. Two writings are then compared by computing distances between their respective probability distributions. The evaluations were carried out on the ICFHR standard dataset of 206 writers using the Beginning and Ending codebooks separately.
Finally, the Ending codebook achieved the highest identification rate of 98.23%, which is the best result so far on the ICFHR dataset.
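
The representation and comparison steps described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the fragment feature vectors, the toy codebook, the nearest-neighbour assignment and the chi-square distance are all assumptions chosen for illustration, since the abstract does not name the features or the distance measure.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def nearest_codeword(fragment, codebook):
    """Index of the codebook entry closest to the fragment (assumed Euclidean)."""
    return min(range(len(codebook)), key=lambda i: dist(fragment, codebook[i]))

def occurrence_distribution(fragments, codebook):
    """Probability of occurrence of each codebook pattern in one writing."""
    counts = [0] * len(codebook)
    for fragment in fragments:
        counts[nearest_codeword(fragment, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)

def chi_square_distance(p, q):
    """Chi-square distance between two distributions (an assumed choice)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)

# Hypothetical 2-D fragment features and a three-entry "ending" codebook.
ending_codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
writing_a = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9), (2.1, 0.1)]
writing_b = [(0.0, 0.2), (1.1, 1.0), (0.9, 0.8), (1.9, 0.0)]
writing_c = [(2.0, 0.1), (2.2, 0.0), (1.8, -0.1), (0.1, 0.1)]

prob_a = occurrence_distribution(writing_a, ending_codebook)
prob_b = occurrence_distribution(writing_b, ending_codebook)
prob_c = occurrence_distribution(writing_c, ending_codebook)

# A smaller distance suggests the two writings share a writer.
print(chi_square_distance(prob_a, prob_b))
print(chi_square_distance(prob_a, prob_c))
```

In this toy data, writings A and B use the codebook patterns in the same proportions, so their distance is smaller than the distance between A and C.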

Keywords: off-line text-independent writer identification, feature extraction, codebook, fragments

Procedia PDF Downloads 489
11167 Reading and Writing of Biscriptal Children with and Without Reading Difficulties in Two Alphabetic Scripts

Authors: Baran Johansson

Abstract:

This PhD dissertation aimed to explore children’s writing and reading in L1 (Persian) and L2 (Swedish). It adds new perspectives to reading and writing studies of bilingual biscriptal children with and without reading and writing difficulties (RWD). The study used standardised tests to examine linguistic and cognitive skills related to word reading and writing fluency in both languages. Furthermore, all participants produced two texts (one descriptive and one narrative) in each language. The writing processes and the writing product of these children were explored using logging methodologies (Eye and Pen) for both languages. Furthermore, this study investigated how two bilingual children with RWD presented themselves through writing across their languages. To my knowledge, studies utilizing standardised tests and logging tools to investigate bilingual children’s word reading and writing fluency across two different alphabetic scripts are scarce. There have been few studies analysing how bilingual children construct meaning in their writing, and none have focused on children who write in two different alphabetic scripts or those with RWD. Therefore, some aspects of the systemic functional linguistics (SFL) perspective were employed to examine how two participants with RWD created meaning in their written texts in each language. The results revealed that children with and without RWD had higher writing fluency in all measures (e.g. text lengths, writing speed) in their L2 compared to their L1. Word reading abilities in both languages were found to influence their writing fluency. The findings also showed that bilingual children without reading difficulties performed 1 standard deviation below the mean when reading words in Persian. However, their reading performance in Swedish aligned with the expected age norms, suggesting greater efficiency in reading Swedish than in Persian.
Furthermore, the results showed that the level of orthographic depth, consistency between graphemes and phonemes, and orthographic features can probably explain these differences across languages. The analysis of meaning-making indicated that the participants with RWD exhibited varying levels of difficulty, which influenced their knowledge and usage of writing across languages. For example, the participant with poor word recognition (PWR) presented himself similarly across genres, irrespective of the language in which he wrote. He employed the listing technique similarly across his L1 and L2. However, the participant with mixed reading difficulties (MRD) had difficulties with both transcription and text production. He produced spelling errors and frequently paused in both languages. He also struggled with word retrieval and producing coherent texts, consistent with studies of monolingual children with poor comprehension or with developmental language disorder. The results suggest that the mother tongue instruction provided to the participants has not been sufficient for them to become balanced biscriptal readers and writers in both languages. Therefore, increasing the number of hours dedicated to mother tongue instruction and motivating the children to participate in these classes could be potential strategies to address this issue.

Keywords: reading, writing, reading and writing difficulties, bilingual children, biscriptal

Procedia PDF Downloads 46
11166 From Shallow Semantic Representation to Deeper One: Verb Decomposition Approach

Authors: Aliaksandr Huminski

Abstract:

Semantic Role Labeling (SRL), as a shallow semantic parsing approach, involves recognizing and labeling the arguments of a verb in a sentence. Verb participants are linked with specific semantic roles (Agent, Patient, Instrument, Location, etc.). Thus, SRL can answer key questions such as ‘Who’, ‘When’, ‘What’, ‘Where’ in a text, and it is widely applied in dialog systems, question-answering, named entity recognition, information retrieval, and other fields of NLP. However, SRL has the following flaw: two sentences with identical (or almost identical) meaning can have different semantic role structures. Consider two sentences: (1) John put butter on the bread. (2) John buttered the bread. SRL for (1) and (2) will be significantly different. For the verb put in (1) it is [Agent + Patient + Goal], but for the verb butter in (2) it is [Agent + Goal]. This happens because of one of the most interesting and intriguing features of a verb: its ability to capture participants, as in the case of the verb butter, or their features, as, say, in the case of the verb drink, where the participant’s feature of being liquid is shared with the verb. This capture looks like a total fusion of meaning and cannot be decomposed in a direct way (in comparison with compound verbs like babysit or breastfeed). From this perspective, SRL looks really shallow as a representation of semantic structure. If the key point of a semantic representation is the opportunity to use it for making inferences and finding hidden reasons, it assumes by default that two different but semantically identical sentences must have the same semantic structure. Otherwise we will have different inferences from the same meaning. To overcome the above-mentioned flaw, the following approach is suggested.
Assume that: P is a participant of a relation; F is a feature of a participant; Vcp is a verb that captures a participant; Vcf is a verb that captures a feature of a participant; Vpr is a primitive verb, that is, a verb that does not capture any participant and represents only a relation. In other words, a primitive verb is a verb whose meaning does not include meanings from its surroundings. Then Vcp and Vcf can be decomposed as: Vcp = Vpr + P; Vcf = Vpr + F. If all Vcp and Vcf are represented this way, then primitive verbs Vpr can be considered a canonical form for SRL. As a result, there will be no hidden participants caught by a verb, since all participants will be explicitly unfolded. An obvious example of Vpr is the verb go, which represents pure movement. In this case the verb drink can be represented as man-made movement of liquid in a specific direction. Extracting and using primitive verbs for SRL creates a canonical representation that is unique for semantically identical sentences. This leads to the unification of semantic representation. In this case, the critical flaw related to SRL will be resolved.
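
The decomposition Vcp = Vpr + P can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the one-entry lexicon is hypothetical, and treating put as the primitive verb behind butter is a simplification, since the abstract only names go as an example of a primitive verb.

```python
# Hypothetical lexicon: capturing verb -> (primitive verb, role, captured participant).
# Each entry is one instance of Vcp = Vpr + P.
DECOMPOSITION = {
    "butter": ("put", "Patient", "butter"),
}

def canonicalize(verb, roles):
    """Unfold a participant captured by the verb into an explicit semantic role."""
    if verb not in DECOMPOSITION:
        return verb, dict(roles)  # treated as already primitive in this sketch
    primitive, role, participant = DECOMPOSITION[verb]
    unfolded = dict(roles)
    unfolded.setdefault(role, participant)  # make the hidden participant explicit
    return primitive, unfolded

# (1) John put butter on the bread.  (2) John buttered the bread.
frame_1 = canonicalize("put", {"Agent": "John", "Patient": "butter", "Goal": "bread"})
frame_2 = canonicalize("butter", {"Agent": "John", "Goal": "bread"})

print(frame_1 == frame_2)
```

After unfolding, both sentences map to the same verb and the same role structure, which is the property the abstract asks of a canonical representation.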

Keywords: decomposition, labeling, primitive verbs, semantic roles

Procedia PDF Downloads 344
11165 Personal Information Classification Based on Deep Learning in Automatic Form Filling System

Authors: Shunzuo Wu, Xudong Luo, Yuanxiu Liao

Abstract:

Recently, the rapid development of deep learning has allowed artificial intelligence (AI) to penetrate many fields, replacing manual work there. In particular, AI systems have become a research focus in the field of office automation. To meet real needs in office automation, in this paper we develop an automatic form filling system. Specifically, it uses two classical neural network models and several word embedding models to classify various relevant information elicited from the Internet. When training the neural network models, we use less noisy and balanced data. We conduct a series of experiments to test our system, and the results show that it can achieve better classification results.

Keywords: artificial intelligence and office, NLP, deep learning, text classification

Procedia PDF Downloads 160
11164 The Challenges of Hyper-Textual Learning Approach for Religious Education

Authors: Elham Shirvani–Ghadikolaei, Seyed Mahdi Sajjadi

Abstract:

State-of-the-art technology has a tremendous impact on our lives, and the education system has been influenced as well. This paper compares two learning spaces, text and hypertext, and discusses some challenges of using hypertext in religious education. Hypertext is an undeniable part of learning in today's world and is highly beneficial for the education process, from the classroom to the office and home. The paper addresses the following question: what are the consequences and challenges of applying hypertext in religious education? The results of this survey also demonstrate the role of curriculum designers and education planners in solving this problem.

Keywords: Hyper-textual, learning, religious education, learning text

Procedia PDF Downloads 290
11163 Human Action Recognition Using Wavelets of Derived Beta Distributions

Authors: Neziha Jaouedi, Noureddine Boujnah, Mohamed Salim Bouhlel

Abstract:

In the framework of enhancing human-machine interaction systems, this paper focuses on human behavior analysis and action recognition. Human behavior is characterized by a duality of actions and reactions (movements, psychological modification, verbal and emotional expression). It is worth noting that much information is hidden behind gestures, sudden motions, point trajectories and speeds, and many research works have recast this as an information retrieval issue. In our work we focus on motion extraction, tracking and action recognition using wavelet network approaches. Our contribution uses an analysis of human subtraction by Gaussian Mixture Model (GMM) and of body movement through trajectory models of motion constructed from a Kalman filter. These models make it possible to remove noise by extracting the main motion features, and they constitute a stable base for identifying the evolution of human activity. Each modality is used to recognize a human action using the wavelets of derived beta distributions approach. The proposed approach has been validated successfully on a subset of the KTH and UCF sports databases.

Keywords: feature extraction, human action classifier, wavelet neural network, beta wavelet

Procedia PDF Downloads 388
11162 Enhancing Large Language Models' Data Analysis Capability with Planning-and-Execution and Code Generation Agents: A Use Case for Southeast Asia Real Estate Market Analytics

Authors: Kien Vu, Jien Min Soh, Mohamed Jahangir Abubacker, Piyawut Pattamanon, Soojin Lee, Suvro Banerjee

Abstract:

Recent advances in Generative Artificial Intelligence (GenAI), in particular Large Language Models (LLMs), have shown promise to disrupt multiple industries at scale. However, LLMs also present unique challenges, notably so-called "hallucination", the generation of outputs that are not grounded in the input data, which hinders their adoption in production. A common practice to mitigate the hallucination problem is to use a Retrieval Augmented Generation (RAG) system to ground LLMs' responses in ground truth. RAG converts the grounding documents into embeddings, retrieves the relevant parts using vector similarity between the user's query and the documents, then generates a response that is based not only on the model's pre-trained knowledge but also on the specific information from the retrieved documents. However, the RAG system is not suitable for tabular data and subsequent data analysis tasks for multiple reasons, such as information loss, data format, and retrieval mechanism. In this study, we explore a novel methodology that combines planning-and-execution and code generation agents to enhance LLMs' data analysis capabilities. The approach enables LLMs to autonomously dissect a complex analytical task into simpler sub-tasks and requirements, then convert them into executable segments of code. In the final step, it generates the complete response from the output of the executed code. When a beta version was deployed on DataSense, the property insight tool of PropertyGuru, the approach yielded promising results: it met market insight and data visualization needs with high accuracy and extensive coverage by abstracting the complexities for real-estate agents and developers from non-programming backgrounds. In essence, the methodology not only refines the analytical process but also serves as a strategic tool for real estate professionals, aiding in market understanding and enhancement without the need for programming skills.
The implication extends beyond immediate analytics, paving the way for a new era in the real estate industry characterized by efficiency and advanced data utilization.
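
The retrieval step of RAG described in this abstract (embed documents, rank them by vector similarity to the query) can be sketched as follows. This is only an illustration under stated assumptions: the bag-of-words `embed` function is a toy stand-in for a learned embedding model, and the documents are invented; neither comes from the paper or from DataSense.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[term] * v[term] for term in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)
    return ranked[:k]

# Invented grounding documents, for illustration only.
documents = [
    "condo prices rose in the city centre last quarter",
    "the office cafeteria menu changes every monday",
    "rental demand for landed houses stayed flat",
]

top = retrieve("how did condo prices change", documents, k=1)
print(top)
```

In a full RAG pipeline the retrieved text would then be placed in the prompt so the model's response is grounded in it. The sketch also hints at why the abstract considers this a poor fit for tabular data: similarity ranking returns passages, not computed aggregates.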

Keywords: large language model, reasoning, planning and execution, code generation, natural language processing, prompt engineering, data analysis, real estate, data sense, PropertyGuru

Procedia PDF Downloads 49
11161 Exchanges between Literature and Cinema: Scripted Writing in the Novel "Miguel e os Demônios", by Lourenço Mutarelli

Authors: Marilia Correa Parecis De Oliveira

Abstract:

This research looks at the novel Miguel e os demônios (2009), by the contemporary Brazilian author Lourenço Mutarelli. The presence of film language resources in it is remarkable, creating a kind of scripted writing. We analyze the presence of film language in the work under study, in which the characteristics of the novel and screenplay genres are mixed, and explore which aesthetic and meaning effects the appropriation of a visual language for the creation of a literary text produces in the novel. The objective of this research is to identify and analyze the formal and thematic aspects that characterize the hybridity of literature and film in the novel by Lourenço Mutarelli. The method comprises reading and cataloguing theoretical and critical texts on literary and film theory, a historical review of the author, and an analytical and interpretative reading of the novel. In Miguel e os demônios there is a range of formal and thematic elements of popular narrative genres such as the detective story and the action film, with a predominance of present-tense verb forms and noun phrases, features that tend to make the narrated scenes present, as in the cinema. The novel, in this sense, occupies an intermediate position between the literary text and the pre-film text: although it is filled with elements proper to the language of film, it cannot be fitted categorically into the screenplay genre, since it is not reduced to a script and aspires to be read as a novel. The difficulty of fitting the work into a single genre is therefore due not to extra-textual factors, such as its publication as a novel, but rather to the fact that binary classifications serve solely to imprison the work under a label, which impoverishes not only the reading of the text but also the possibility of recognizing literature as a space of constant dialogue and interaction with other media.
We can say, therefore, that framing Miguel e os demônios in one of the two genres (novel or screenplay) proves insufficient, since the text reveals itself to be a hybrid narrative, a kind of scripted writing. In this sense, it is a text born in a society saturated by the audiovisual in daily life, to be consumed by readers who increasingly exchange books for visual narratives. However, the novel uses film's resources without giving up its constitution as literature; on the contrary, it is enriched visually and linguistically, in dialogue with the complex contemporary horizon marked by the cultural industry.

Keywords: Brazilian literature, cinema, Lourenço Mutarelli, screenplay

Procedia PDF Downloads 288
11160 Medical Image Watermark and Tamper Detection Using Constant Correlation Spread Spectrum Watermarking

Authors: Peter U. Eze, P. Udaya, Robin J. Evans

Abstract:

Data hiding can be achieved by steganography or invisible digital watermarking. For digital watermarking, both accurate retrieval of the embedded watermark and the integrity of the cover image are important. Medical image security in teleradiology is one application in which the embedded patient record must be extracted accurately and the integrity of the medical image verified. This research paper introduces Constant Correlation Spread Spectrum digital watermarking for medical image tamper detection and accurate retrieval of the embedded watermark. In the proposed method, a watermark bit from a patient record is spread in a medical image sub-block such that the correlation of all watermarked sub-blocks with a spreading code, W, has a constant value, p. The constant correlation p, the spreading code W and the size of the sub-blocks constitute the secret key. Tamper detection is achieved by flagging any sub-block whose correlation value deviates by more than a small value, ℇ, from p. The major features of our new scheme include: (1) improved watermark detection accuracy for high-pixel-depth medical images by reducing the Bit Error Rate (BER) to zero and (2) block-level tamper detection in a single computational process with simultaneous watermark detection, thereby increasing utility at the same computational cost.
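
The tamper-detection rule in this abstract (flag any sub-block whose correlation with W deviates from p by more than a small tolerance) can be sketched as below. This covers the detection check only and rests on assumptions: correlation is taken to be the mean of the element-wise product, sub-blocks are flattened to short lists, and the pixel values are invented. The embedding step that produces the constant correlation is not described in the abstract and is not reproduced here.

```python
def correlation(block, code):
    """Correlation of a flattened sub-block with the spreading code.
    Assumed definition: mean of the element-wise product."""
    return sum(b * w for b, w in zip(block, code)) / len(code)

def tampered_blocks(blocks, code, p, eps):
    """Indices of sub-blocks whose correlation deviates from p by more than eps."""
    return [i for i, block in enumerate(blocks)
            if abs(correlation(block, code) - p) > eps]

# Secret key in this toy setting: spreading code W, constant correlation p, block size 4.
W = [1, -1, 1, -1]
p = 2.0
eps = 0.1

# Invented pixel values. The first two blocks satisfy the constant-correlation
# property; the last pixel of the third block has been altered.
blocks = [
    [10, 6, 9, 5],
    [7, 3, 4, 0],
    [10, 6, 9, 9],
]

flagged = tampered_blocks(blocks, W, p, eps)
print(flagged)
```

Because the check runs per sub-block, a single pass over the image localizes tampering at block level, which matches the second feature listed in the abstract.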

Keywords: Constant Correlation, Medical Image, Spread Spectrum, Tamper Detection, Watermarking

Procedia PDF Downloads 170
11159 A Contrastive Rhetoric Study: The Use of Textual and Interpersonal Metadiscoursal Markers in Persian and English Newspaper Editorials

Authors: Habibollah Mashhady, Moslem Fatollahi

Abstract:

This study contrasts the use of metadiscoursal markers in English and Persian newspaper editorials as persuasive text types. These markers are linguistic elements in the text that do not add to its propositional content; rather, they serve to realize Halliday’s (1985) textual and interpersonal functions of language. First, some of the most common markers from five subcategories (Text Connectives, Illocution Markers, Hedges, Emphatics, and Attitude Markers) were identified in both English and Persian newspapers. Then, the frequency of occurrence of these markers was recorded in English and Persian corpora consisting of 44 randomly selected editorials (18,000 words in each) from several English and Persian newspapers. After that, using a two-way chi-square analysis, the overall observed χ² was found to be highly significant, so the null hypothesis of no difference was confidently rejected. Finally, in order to determine the contribution of each subcategory to the overall χ² value, one-way chi-square analyses were applied to the individual subcategories. The results indicated that only two of the five subcategories of markers showed statistically significant differences. This difference is then attributed to the differing spirits prevailing in the linguistic communities involved. Regarding the minor research question, it was found that, in contrast to English writers, Persian writers are more writer-oriented in their writings.
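
The two-step chi-square procedure described in this abstract can be sketched as follows. The frequency counts are invented toy numbers, not the study's data, and which subcategories come out significant here says nothing about the study's actual findings. The critical values 9.488 (4 degrees of freedom) and 3.841 (1 degree of freedom) are the standard chi-square thresholds at the 0.05 level.

```python
def chi_square_two_way(table):
    """Pearson chi-square statistic for a contingency table (rows x columns)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def chi_square_one_way(observed):
    """Goodness-of-fit statistic against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Invented (English, Persian) marker counts per subcategory, for illustration only.
counts = {
    "Text Connectives": (120, 100),
    "Illocution Markers": (30, 30),
    "Hedges": (50, 100),
    "Emphatics": (40, 40),
    "Attitude Markers": (60, 30),
}

english = [pair[0] for pair in counts.values()]
persian = [pair[1] for pair in counts.values()]

# Step 1: overall two-way test, (2 - 1) * (5 - 1) = 4 degrees of freedom.
overall = chi_square_two_way([english, persian])
overall_significant = overall > 9.488

# Step 2: one-way test per subcategory, 1 degree of freedom each.
significant = [name for name, pair in counts.items()
               if chi_square_one_way(pair) > 3.841]

print(round(overall, 3), overall_significant, significant)
```

The per-subcategory tests show where the overall difference comes from, which is the purpose the abstract gives for the one-way analyses.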

Keywords: metadiscoursal markers, textual meta-function, interpersonal meta-function, persuasive texts, English and Persian newspaper editorials

Procedia PDF Downloads 552
11158 Translating the Gendered Discourse: A Corpus-Based Study of the Chinese Science Fiction The Three Body Problem

Authors: Yi Gu

Abstract:

The Three-Body Problem by Cixin Liu has been a bestselling Chinese sci-fi novel since 2008. The book was translated into English by Ken Liu in 2014 and won the prestigious 2015 Hugo Award for science fiction and fantasy writing, drawing greater attention from wider international communities. The story exposes the horrors of the Chinese Cultural Revolution in the 1960s in an intriguing narrative for readers at home and abroad. However, without access to the source text, western readers may not be aware that the original Chinese version of the book is rich in gender bias. Some Chinese scholars have previously applied feminist translation theories to their analysis of this book, based on isolated, cherry-picked examples. This paper therefore aims to obtain a more thorough picture of how translators can cope with gender discrimination and reshape the gendered discourse of the source text, by systematically investigating the lexical and syntactic patterns in the translation of Liu’s entire book of 400 pages. The source text and the translation were downloaded into digital files, automatically aligned at paragraph level and then manually post-edited. They were then compiled into a parallel corpus of 114,629 English words and 204,145 Chinese characters using Sketch Engine. Gender-discrimination markers, such as the overuse of ‘girl’ to describe an adult woman, were searched for in the source text, and the alignment made it possible to identify the strategies adopted by the translator to mitigate gender discrimination. The results provide a framework for translators to address gender bias. The study also shows how corpus methods can be used to further research in feminist translation and critical discourse analysis.

Keywords: corpus, discourse analysis, feminist translation, science fiction translation

Procedia PDF Downloads 233
11157 Managing Cognitive Load in Accounting: An Analysis of Three Instructional Designs in Financial Accounting

Authors: Seedwell Sithole

Abstract:

One of the persistent problems in accounting education is how to effectively support students’ learning. A promising approach to this issue is to investigate the extent to which learning is determined by the design of instructional material. This study examines the academic performance of students using three instructional designs in financial accounting. Students’ performance scores and reported mental effort ratings were used to determine instructional effectiveness. The findings of this study show that accounting students prefer graph and text designs that are integrated. The results suggest that spatially separated graph and text presentations in accounting should be reorganized to align with the requirements of human cognitive architecture.

Keywords: accounting, cognitive load, education, instructional preferences, students

Procedia PDF Downloads 118
11156 Information Literacy: Concept and Importance

Authors: Gaurav Kumar

Abstract:

An information literate person is one who uses information effectively in all its forms. When presented with questions or problems, an information literate person would know what information to look for, how to search efficiently and be able to access relevant sources. In addition, an information literate person would have the ability to evaluate and select appropriate information sources and to use the information effectively and ethically to answer questions or solve problems. Information literacy has become an important element in higher education. The information literacy movement has internationally recognized standards and learning outcomes. The step-by-step process of achieving information literacy is particularly crucial in an era where knowledge could be disseminated through a variety of media. What is the relationship between information literacy as we define it in higher education and information literacy among non-academic populations? What forces will change how we think about the definition of information literacy in the future and how we will apply the definition in all environments?

Keywords: information literacy, human beings, visual media and computer network etc, information literacy

Procedia PDF Downloads 310
11155 Evaluating the Effectiveness of Animated Videos in Learning Economics

Authors: J. Chow

Abstract:

In laboratory settings, this study measured and reported the effects of undergraduate students watching animated videos on learning microeconomics as compared with the effectiveness of reading written texts. The study described an experiment on learning microeconomics in higher education using two different types of learning materials. It reported the effectiveness on microeconomics learning of watching animated videos and reading written texts. Undergraduate students in the university were randomly assigned to either a ‘video group’ or a ‘text group’ in the experiment. Previously validated multiple-choice questions on fundamental concepts of microeconomics were administered. Both groups showed improvement between the pre-test and post-test. The experience of learning using text and video materials was also assessed. After controlling for student characteristic variables, the analyses showed that both types of materials yielded a comparable level of perceived learning experience. The effect size and statistical significance of these results supported the hypothesis that animated video is an effective alternative to text materials as a learning tool for students. The findings suggest that such animated videos may support teaching microeconomics in higher education.

Keywords: animated videos for education, laboratory experiment, microeconomics education, undergraduate economics education

Procedia PDF Downloads 125