Search results for: short text analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29290

Search results for: short text analysis

29290 Experimental Study of Hyperparameter Tuning a Deep Learning Convolutional Recurrent Network for Text Classification

Authors: Bharatendra Rai

Abstract:

The sequence of words in text data has long-term dependencies and is known to suffer from vanishing gradient problems when developing deep learning models. Although recurrent networks such as long short-term memory networks help to overcome this problem, achieving high text classification performance is a challenging problem. Convolutional recurrent networks that combine the advantages of long short-term memory networks and convolutional neural networks can be useful for text classification performance improvements. However, arriving at suitable hyperparameter values for convolutional recurrent networks is still a challenging task where fitting a model requires significant computing resources. This paper illustrates the advantages of using convolutional recurrent networks for text classification with the help of statistically planned computer experiments for hyperparameter tuning.

Keywords: long short-term memory networks, convolutional recurrent networks, text classification, hyperparameter tuning, Tukey honest significant differences

Procedia PDF Downloads 84
29289 Graph-Based Semantical Extractive Text Analysis

Authors: Mina Samizadeh

Abstract:

In the past few decades, there has been an explosion in the amount of available data produced from various sources with different topics. The availability of this enormous data necessitates us to adopt effective computational tools to explore the data. This leads to an intense growing interest in the research community to develop computational methods focused on processing this text data. A line of study focused on condensing the text so that we are able to get a higher level of understanding in a shorter time. The two important tasks to do this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the key important words from a text. This makes us familiar with the general topic of a text. In text summarization, we are interested in producing a short-length text which includes important information about the document. The TextRank algorithm, an unsupervised learning method that is an extension of the PageRank (algorithm which is the base algorithm of Google search engine for searching pages and ranking them), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and declare them as a result. However, this algorithm neglects the semantic similarity between the different parts. In this work, we improved the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used individually or as a part of generating the summary to overcome coverage problems.

Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis

Procedia PDF Downloads 43
29288 Programmed Speech to Text Summarization Using Graph-Based Algorithm

Authors: Hamsini Pulugurtha, P. V. S. L. Jagadamba

Abstract:

Programmed Speech to Text and Text Summarization Using Graph-based Algorithms can be utilized in gatherings to get the short depiction of the gathering for future reference. This gives signature check utilizing Siamese neural organization to confirm the personality of the client and convert the client gave sound record which is in English into English text utilizing the discourse acknowledgment bundle given in python. At times just the outline of the gathering is required, the answer for this text rundown. Thus, the record is then summed up utilizing the regular language preparing approaches, for example, solo extractive text outline calculations

Keywords: Siamese neural network, English speech, English text, natural language processing, unsupervised extractive text summarization

Procedia PDF Downloads 180
29287 Long Short-Term Memory (LSTM) Matters: A Sequential Brief Text that Assistive Approach of Text Summarization

Authors: Sharun Akter Khushbu

Abstract:

‘SOS’ addresses text summary such as feasibility study and allows more comprehensive methods on text of language resources. Resources language has been exploited by the importance of text documental procedure. Throughout this key idea will come out a machine interpreter called an SOS that has built an argumentative as an employed model is LSTM-CNN(long short-term memory- recurrent neural network). Summarization of Bengali text formulated by the information of latent structure instead of brief input string counting as text. Text summarization is the proper utilization of optimal solutions being time reduction, and easy interpretation whenever human-generated summary and machine targeted summary remain similar and without degrading the semantic summarization quality. According to the problem affirmation key idea has advanced an algorithm with the method of encoder and decoder describing a sequential structure that is rigorously connected with actual predicted and meaningful output. Regarding the seq2seq approach aimed in the future with high semantic summarization similarity on behalf of the large data samples that are also enlisted by the method. Thus, the SOS method assigns a discriminator over Bengali text documents where encoded input sequences such as summary and decoded the targeted summary of gist will be an error-free machine.

Keywords: LSTM-CNN, NN, SOS, text summarization

Procedia PDF Downloads 39
29286 Urdu Text Extraction Method from Images

Authors: Samabia Tehsin, Sumaira Kausar

Abstract:

Due to the vast increase in the multimedia data in recent years, efficient and robust retrieval techniques are needed to retrieve and index images/ videos. Text embedded in the images can serve as the strong retrieval tool for images. This is the reason that text extraction is an area of research with increasing attention. English text extraction is the focus of many researchers but very less work has been done on other languages like Urdu. This paper is focusing on Urdu text extraction from video frames. This paper presents a text detection feature set, which has the ability to deal up with most of the problems connected with the text extraction process. To test the validity of the method, it is tested on Urdu news dataset, which gives promising results.

Keywords: caption text, content-based image retrieval, document analysis, text extraction

Procedia PDF Downloads 480
29285 Short Text Classification for Saudi Tweets

Authors: Asma A. Alsufyani, Maram A. Alharthi, Maha J. Althobaiti, Manal S. Alharthi, Huda Rizq

Abstract:

Twitter is one of the most popular microblogging sites that allows users to publish short text messages called 'tweets'. Increasing the number of accounts to follow (followings) increases the number of tweets that will be displayed from different topics in an unclassified manner in the timeline of the user. Therefore, it can be a vital solution for many Twitter users to have their tweets in a timeline classified into general categories to save the user’s time and to provide easy and quick access to tweets based on topics. In this paper, we developed a classifier for timeline tweets trained on a dataset consisting of 3600 tweets in total, which were collected from Saudi Twitter and annotated manually. We experimented with the well-known Bag-of-Words approach to text classification, and we used support vector machines (SVM) in the training process. The trained classifier performed well on a test dataset, with an average F1-measure equal to 92.3%. The classifier has been integrated into an application, which practically proved the classifier’s ability to classify timeline tweets of the user.

Keywords: corpus creation, feature extraction, machine learning, short text classification, social media, support vector machine, Twitter

Procedia PDF Downloads 122
29284 A Critical Discourse Study of Gender Identity Issues in Daniyal Mueenuddin’s Short Story “Saleema”

Authors: Zafar Ali

Abstract:

The aim of this research is to highlight problems that are faced by women at the hands of men. Males in Pakistani society have power and use this power for the exploitation of women. Further, the purpose of the study is to make societies like Pakistan and especially the young generation, aware and enable them to resist such issues, and the role of discourse in this regard is to minimize its political and social repercussions. The study finds out different discursive techniques and manipulative language used in the short story to construct gender identity. The study also investigates socio-economic roles in the construction of gender identity. This study has been completed with the help of Critical Discourse Analysis (CDA) principles. CDA principles have been applied to the text of the selected short story Saleema from Daniyal Mueenuddin’s collection In Other Rooms, Other Wonders. Related passages, structures, expressions, and text are analyzed from the point of view of CDA, especially Norman Fairclough’s CDA approach. It was found from the analysis that women have no identity of their own in patriarchal societies like Pakistan. Further, it was found women are mistreated, and they have a very limited and defined role in Pakistan. They cannot go beyond the limit defined to them by men.

Keywords: gender issues, resourceful groups, CDA, exploitation

Procedia PDF Downloads 96
29283 “A Built-In, Shockproof, Shit Detector”: Major Challenges and Peculiarities of Translating Ernest Hemingway’s Short Stories Into Georgian

Authors: Natia Kvachakidze

Abstract:

Translating fiction is a complicated and multidimensional issue. However, studying and analyzing literary translations is not less challenging. This becomes even more complex due to the existence of several alternative translations of one and the same literary work. However, this also makes the research process more interesting at the same time. The aim of the given work is to distinguish major obstacles and challenges translators come across while working on Ernest Hemingway’s short fiction, as well as to analyze certain peculiarities and characteristic features of some existing Georgian translations of the writer’s work (especially in the context of various alternative versions of some well-known short stories). Consequently, the focus is on studying how close these translations come to the form and the context of the original text in order to see if the linguistic and stylistic characteristics of the original author are preserved. Moreover, it is interesting not only to study the relevance of each translation to the original text but also to present a comparative analysis of some major peculiarities of the given translations, which are naturally characterized by certain strengths and weaknesses. The latter is at times inevitable, but in certain cases, there is room for improvement. The given work also attempts to humbly suggest certain ways of possible improvements of some translation inadequacies, as this can provide even more opportunities for deeper and detailed studies in the future.

Keywords: Hemingway, short fiction, translation, Georgian

Procedia PDF Downloads 56
29282 Sentiment Analysis of Chinese Microblog Comments: Comparison between Support Vector Machine and Long Short-Term Memory

Authors: Xu Jiaqiao

Abstract:

Text sentiment analysis is an important branch of natural language processing. This technology is widely used in public opinion analysis and web surfing recommendations. At present, the mainstream sentiment analysis methods include three parts: sentiment analysis based on a sentiment dictionary, based on traditional machine learning, and based on deep learning. This paper mainly analyzes and compares the advantages and disadvantages of the SVM method of traditional machine learning and the Long Short-term Memory (LSTM) method of deep learning in the field of Chinese sentiment analysis, using Chinese comments on Sina Microblog as the data set. Firstly, this paper classifies and adds labels to the original comment dataset obtained by the web crawler, and then uses Jieba word segmentation to classify the original dataset and remove stop words. After that, this paper extracts text feature vectors and builds document word vectors to facilitate the training of the model. Finally, SVM and LSTM models are trained respectively. After accuracy calculation, it can be obtained that the accuracy of the LSTM model is 85.80%, while the accuracy of SVM is 91.07%. But at the same time, LSTM operation only needs 2.57 seconds, SVM model needs 6.06 seconds. Therefore, this paper concludes that: compared with the SVM model, the LSTM model is worse in accuracy but faster in processing speed.

Keywords: sentiment analysis, support vector machine, long short-term memory, Chinese microblog comments

Procedia PDF Downloads 58
29281 Adaptation in Translation of 'Christmas Every Day' Short Story by William Dean Howells

Authors: Mohsine Khazrouni

Abstract:

The present study is an attempt to highlight the importance of adaptation in translation. To convey the message, the translator needs to take into account not only the text but also extra-linguistic factors such as the target audience. The present paper claims that adaptation is an unavoidable translation strategy when dealing with texts that are heavy with religious and cultural themes. The translation task becomes even more challenging when dealing with children’s literature as the audience are children whose comprehension, experience and world knowledge are limited. The study uses the Arabic translation of the short story ‘Christmas Every Day’ as a case study. The short story will be translated, and the pragmatic problems involved will be discussed. The focus will be on the issue of adaptation. i.e., the source text should be adapted to the target language audience`s social and cultural environment.

Keywords: pragmatic adaptation, Arabic translation, children's literature, equivalence

Procedia PDF Downloads 175
29280 Distorted Document Images Dataset for Text Detection and Recognition

Authors: Ilia Zharikov, Philipp Nikitin, Ilia Vasiliev, Vladimir Dokholyan

Abstract:

With the increasing popularity of document analysis and recognition systems, text detection (TD) and optical character recognition (OCR) in document images become challenging tasks. However, according to our best knowledge, no publicly available datasets for these particular problems exist. In this paper, we introduce a Distorted Document Images dataset (DDI-100) and provide a detailed analysis of the DDI-100 in its current state. To create the dataset we collected 7000 unique document pages, and extend it by applying different types of distortions and geometric transformations. In total, DDI-100 contains more than 100,000 document images together with binary text masks, text and character locations in terms of bounding boxes. We also present an analysis of several state-of-the-art TD and OCR approaches on the presented dataset. Lastly, we demonstrate the usefulness of DDI-100 to improve accuracy and stability of the considered TD and OCR models.

Keywords: document analysis, open dataset, optical character recognition, text detection

Procedia PDF Downloads 134
29279 Extraction of Text Subtitles in Multimedia Systems

Authors: Amarjit Singh

Abstract:

In this paper, a method for extraction of text subtitles in large video is proposed. The video data needs to be annotated for many multimedia applications. Text is incorporated in digital video for the motive of providing useful information about that video. So need arises to detect text present in video to understanding and video indexing. This is achieved in two steps. First step is text localization and the second step is text verification. The method of text detection can be extended to text recognition which finds applications in automatic video indexing; video annotation and content based video retrieval. The method has been tested on various types of videos.

Keywords: video, subtitles, extraction, annotation, frames

Procedia PDF Downloads 567
29278 A Summary-Based Text Classification Model for Graph Attention Networks

Authors: Shuo Liu

Abstract:

In Chinese text classification tasks, redundant words and phrases can interfere with the formation of extracted and analyzed text information, leading to a decrease in the accuracy of the classification model. To reduce irrelevant elements, extract and utilize text content information more efficiently and improve the accuracy of text classification models. In this paper, the text in the corpus is first extracted using the TextRank algorithm for abstraction, the words in the abstract are used as nodes to construct a text graph, and then the graph attention network (GAT) is used to complete the task of classifying the text. Testing on a Chinese dataset from the network, the classification accuracy was improved over the direct method of generating graph structures using text.

Keywords: Chinese natural language processing, text classification, abstract extraction, graph attention network

Procedia PDF Downloads 63
29277 Structure Analysis of Text-Image Connection in Jalayrid Period Illustrated Manuscripts

Authors: Mahsa Khani Oushani

Abstract:

Text and image are two important elements in the field of Iranian art, the text component and the image component have always been manifested together. The image narrates the text and the text is the factor in the formation of the image and they are closely related to each other. The connection between text and image is an interactive and two-way connection in the tradition of Iranian manuscript arrangement. The interaction between the narrative description and the image scene is the result of a direct and close connection between the text and the image, which in addition to the decorative aspect, also has a descriptive aspect. In this article the connection between the text element and the image element and its adaptation to the theory of Roland Barthes, the structuralism theorist, in this regard will be discussed. This study tends to investigate the question of how the connection between text and image in illustrated manuscripts of the Jalayrid period is defined according to Barthes’ theory. And what kind of proportion has the artist created in the composition between text and image. Based on the results of reviewing the data of this study, it can be inferred that in the Jalayrid period, the image has a reference connection and although it is of major importance on the page, it also maintains a close connection with the text and is placed in a special proportion. It is not necessarily balanced and symmetrical and sometimes uses imbalance for composition. This research has been done by descriptive-analytical method, which has been done by library collection method.

Keywords: structure, text, image, Jalayrid, painter

Procedia PDF Downloads 188
29276 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study coalesce morphological evaluation and sentiment analysis for the task of classification of farmer suicide cases reported in Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens followed by the training of proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45% respectively. The overall accuracy of sentiment classification obtained using proposed framework on 275 Punjabi text documents is found to be 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis

Procedia PDF Downloads 285
29275 N400 Investigation of Semantic Priming Effect to Symbolic Pictures in Text

Authors: Thomas Ousterhout

Abstract:

The purpose of this study was to investigate if incorporating meaningful pictures of gestures and facial expressions in short sentences of text could supplement the text with enough semantic information to produce and N400 effect when probe words incongruent to the picture were subsequently presented. Event-related potentials (ERPs) were recorded from a 14-channel commercial grade EEG headset while subjects performed congruent/incongruent reaction time discrimination tasks. Since pictures of meaningful gestures have been shown to be semantically processed in the brain in a similar manner as words are, it is believed that pictures will add supplementary information to text just as the inclusion of their equivalent synonymous word would. The hypothesis is that when subjects read the text/picture mixed sentences, they will process the images and words just like in face-to-face communication and therefore probe words incongruent to the image will produce an N400.

Keywords: EEG, ERP, N400, semantics, congruency, facilitation, Emotiv

Procedia PDF Downloads 235
29274 Small Text Extraction from Documents and Chart Images

Authors: Rominkumar Busa, Shahira K. C., Lijiya A.

Abstract:

Text recognition is an important area in computer vision which deals with detecting and recognising text from an image. The Optical Character Recognition (OCR) is a saturated area these days and with very good text recognition accuracy. However the same OCR methods when applied on text with small font sizes like the text data of chart images, the recognition rate is less than 30%. In this work, aims to extract small text in images using the deep learning model, CRNN with CTC loss. The text recognition accuracy is found to improve by applying image enhancement by super resolution prior to CRNN model. We also observe the text recognition rate further increases by 18% by applying the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The efficiency of the proposed method shows that further pre-processing on chart image text and other small text images will improve the accuracy further, thereby helping text extraction from chart images.

Keywords: small text extraction, OCR, scene text recognition, CRNN

Procedia PDF Downloads 94
29273 Text Data Preprocessing Library: Bilingual Approach

Authors: Kabil Boukhari

Abstract:

In the context of information retrieval, the selection of the most relevant words is a very important step. In fact, the text cleaning allows keeping only the most representative words for a better use. In this paper, we propose a library for the purpose text preprocessing within an implemented application to facilitate this task. This study has two purposes. The first, is to present the related work of the various steps involved in text preprocessing, presenting the segmentation, stemming and lemmatization algorithms that could be efficient in the rest of study. The second, is to implement a developed tool for text preprocessing in French and English. This library accepts unstructured text as input and provides the preprocessed text as output, based on a set of rules and on a base of stop words for both languages. The proposed library has been made on different corpora and gave an interesting result.

Keywords: text preprocessing, segmentation, knowledge extraction, normalization, text generation, information retrieval

Procedia PDF Downloads 58
29272 The Morphology of Sri Lankan Text Messages

Authors: Chamindi Dilkushi Senaratne

Abstract:

Communicating via a text or an SMS (Short Message Service) has become an integral part of our daily lives. With the increase in the use of mobile phones, text messaging has become a genre by itself worth researching and studying. It is undoubtedly a major phenomenon revealing language change. This paper attempts to describe the morphological processes of text language of urban bilinguals in Sri Lanka. It will be a typological study based on 500 English text messages collected from urban bilinguals residing in Colombo. The messages are selected by categorizing the deviant forms of language use apparent in text messages. These stylistic deviations are a deliberate skilled performance by the users of the language possessing an in-depth knowledge of linguistic systems to create new words and thereby convey their linguistic identity and individual and group solidarity via the message. The findings of the study solidifies arguments that the manipulation of language in text messages is both creative and appropriate. In addition, code mixing theories will be used to identify how existing morphological processes are adapted by bilingual users in Sri Lanka when texting. The study will reveal processes such as omission, initialism, insertion and alternation in addition to other identified linguistic features in text language. The corpus reveals the most common morphological processes used by Sri Lankan urban bilinguals when sending texts.

Keywords: bilingual, deviations, morphology, texts

Procedia PDF Downloads 241
29271 Anatomical Survey for Text Pattern Detection

Authors: S. Tehsin, S. Kausar

Abstract:

The ultimate aim of machine intelligence is to explore and materialize the human capabilities, one of which is the ability to detect various text objects within one or more images displayed on any canvas including prints, videos or electronic displays. Multimedia data has increased rapidly in past years. Textual information present in multimedia contains important information about the image/video content. However, it needs to technologically testify the commonly used human intelligence of detecting and differentiating the text within an image, for computers. Hence in this paper feature set based on anatomical study of human text detection system is proposed. Subsequent examination bears testimony to the fact that the features extracted proved instrumental to text detection.

Keywords: biologically inspired vision, content based retrieval, document analysis, text extraction

Procedia PDF Downloads 417
29270 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 149
29269 Exploratory Analysis of A Review of Nonexistence Polarity in Native Speech

Authors: Deawan Rakin Ahamed Remal, Sinthia Chowdhury, Sharun Akter Khushbu, Sheak Rashed Haider Noori

Abstract:

Native Speech to text synthesis has its own leverage for the purpose of mankind. The extensive nature of art to speaking different accents is common but the purpose of communication between two different accent types of people is quite difficult. This problem will be motivated by the extraction of the wrong perception of language meaning. Thus, many existing automatic speech recognition has been placed to detect text. Overall study of this paper mentions a review of NSTTR (Native Speech Text to Text Recognition) synthesis compared with Text to Text recognition. Review has exposed many text to text recognition systems that are at a very early stage to comply with the system by native speech recognition. Many discussions started about the progression of chatbots, linguistic theory another is rule based approach. In the Recent years Deep learning is an overwhelming chapter for text to text learning to detect language nature. To the best of our knowledge, In the sub continent a huge number of people speak in Bangla language but they have different accents in different regions therefore study has been elaborate contradictory discussion achievement of existing works and findings of future needs in Bangla language acoustic accent.

Keywords: TTR, NSTTR, text to text recognition, deep learning, natural language processing

Procedia PDF Downloads 98
29268 An Event Relationship Extraction Method Incorporating Deep Feedback Recurrent Neural Network and Bidirectional Long Short-Term Memory

Authors: Yin Yuanling

Abstract:

A Deep Feedback Recurrent Neural Network (DFRNN) and Bidirectional Long Short-Term Memory (BiLSTM) are designed to address the problem of low accuracy of traditional relationship extraction models. This method combines a deep feedback-based recurrent neural network (DFRNN) with a bi-directional long short-term memory (BiLSTM) approach. The method combines DFRNN, which extracts local features of text based on deep feedback recurrent mechanism, BiLSTM, which better extracts global features of text, and Self-Attention, which extracts semantic information. Experiments show that the method achieves an F1 value of 76.69% on the CEC dataset, which is 0.0652 better than the BiLSTM+Self-ATT model, thus optimizing the performance of the deep learning method in the event relationship extraction task.

Keywords: event relations, deep learning, DFRNN models, bi-directional long and short-term memory networks

Procedia PDF Downloads 97
29267 A Study on Sentiment Analysis Using Various ML/NLP Models on Historical Data of Indian Leaders

Authors: Sarthak Deshpande, Akshay Patil, Pradip Pandhare, Nikhil Wankhede, Rushali Deshmukh

Abstract:

Among the highly significant duties for any language most effective is the sentiment analysis, which is also a key area of NLP, that recently made impressive strides. There are several models and datasets available for those tasks in popular and commonly used languages like English, Russian, and Spanish. While sentiment analysis research is performed extensively, however it is lagging behind for the regional languages having few resources such as Hindi, Marathi. Marathi is one of the languages that included in the Indian Constitution’s 8th schedule and is the third most widely spoken language in the country and primarily spoken in the Deccan region, which encompasses Maharashtra and Goa. There isn’t sufficient study on sentiment analysis methods based on Marathi text due to lack of available resources, information. Therefore, this project proposes the use of different ML/NLP models for the analysis of Marathi data from the comments below YouTube content, tweets or Instagram posts. We aim to achieve a short and precise analysis and summary of the related data using our dataset (Dates, names, root words) and lexicons to locate exact information.

Keywords: multilingual sentiment analysis, Marathi, natural language processing, text summarization, lexicon-based approaches

Procedia PDF Downloads 35
29266 Improving Depression Symptoms and Antidepressant Medication Adherence Using Encrypted Short Message Service Text Message Reminders

Authors: Ogbonna Olelewe

Abstract:

This quality improvement project seeks to address the background and significance of promoting antidepressant (AD) medication adherence to reduce depression symptoms in patients diagnosed with major depression. This project aims to substantiate using daily encrypted short message service (SMS) text reminders to take prescribed antidepressant medications with the goal of increasing medication adherence to reduce depression scores in patients diagnosed with major depression, thereby preventing relapses and increasing remission rates. Depression symptoms were measured using the Patient Health Questionnaire-9 (PHQ-9) scale. The PHQ-9 provides a total score of depression symptoms from mild to severe, ranging from 0 to 27. A -pretest/post-test design was used, with a convenience sample size of 35 adult patients aged 18 years old to 45 years old, diagnosed with MDD, and prescribed at least one antidepressant for one year or more. Pre- and post-test PHQ-9 scores were conducted to compare depression scores before and after the four-week intervention period. The results indicated improved post-intervention PHQ-9 scores, improved AD medication adherence, and a significant reduction in depression symptoms.

Keywords: major depressive disorder, antidepressants, short message services, text reminders, Medication adherence/non-adherence, Patient Health Questionnaire 9

Procedia PDF Downloads 120
29265 Information Extraction for Short-Answer Question for the University of the Cordilleras

Authors: Thelma Palaoag, Melanie Basa, Jezreel Mark Panilo

Abstract:

Checking short-answer questions and essays, whether it may be paper or electronic in form, is a tiring and tedious task for teachers. Evaluating a student’s output require wide array of domains. Scoring the work is often a critical task. Several attempts in the past few years to create an automated writing assessment software but only have received negative results from teachers and students alike due to unreliability in scoring, does not provide feedback and others. The study aims to create an application that will be able to check short-answer questions which incorporate information extraction. Information extraction is a subfield of Natural Language Processing (NLP) where a chunk of text (technically known as unstructured text) is being broken down to gather necessary bits of data and/or keywords (structured text) to be further analyzed or rather be utilized by query tools. The proposed system shall be able to extract keywords or phrases from the individual’s answers to match it into a corpora of words (as defined by the instructor), which shall be the basis of evaluation of the individual’s answer. The proposed system shall also enable the teacher to provide feedback and re-evaluate the output of the student for some writing elements in which the computer cannot fully evaluate such as creativity and logic. Teachers can formulate, design, and check short answer questions efficiently by defining keywords or phrases as parameters by assigning weights for checking answers. With the proposed system, teacher’s time in checking and evaluating students output shall be lessened, thus, making the teacher more productive and easier.

Keywords: information extraction, short-answer question, natural language processing, application

Procedia PDF Downloads 403
29264 OCR/ICR Text Recognition Using ABBYY FineReader as an Example Text

Authors: A. R. Bagirzade, A. Sh. Najafova, S. M. Yessirkepova, E. S. Albert

Abstract:

This article describes a text recognition method based on Optical Character Recognition (OCR). The features of the OCR method were examined using the ABBYY FineReader program. It describes automatic text recognition in images. OCR is necessary because optical input devices can only transmit raster graphics as a result. Text recognition describes the task of recognizing letters shown as such, to identify and assign them an assigned numerical value in accordance with the usual text encoding (ASCII, Unicode). The peculiarity of this study conducted by the authors using the example of the ABBYY FineReader, was confirmed and shown in practice, the improvement of digital text recognition platforms developed by Electronic Publication.

Keywords: ABBYY FineReader system, algorithm symbol recognition, OCR/ICR techniques, recognition technologies

Procedia PDF Downloads 134
29263 Assessment of the Validity of Sentiment Analysis as a Tool to Analyze the Emotional Content of Text

Authors: Trisha Malhotra

Abstract:

Sentiment analysis is a recent field of study that computationally assesses the emotional nature of a body of text. To assess its test-validity, sentiment analysis was carried out on the emotional corpus of text from a personal 15-day mood diary. Self-reported mood scores varied more or less accurately with daily mood evaluation score given by the software. On further assessment, it was found that while sentiment analysis was good at assessing ‘global’ mood, it was not able to ‘locally’ identify and differentially score synonyms of various emotional words. It is further critiqued for treating the intensity of an emotion as universal across cultures. Finally, the software is shown not to account for emotional complexity in sentences by treating emotions as strictly positive or negative. Hence, it is posited that a better output could be two (positive and negative) affect scores for the same body of text.

Keywords: analysis, data, diary, emotions, mood, sentiment

Procedia PDF Downloads 237
29262 A Quantitative Evaluation of Text Feature Selection Methods

Authors: B. S. Harish, M. B. Revanasiddappa

Abstract:

Due to rapid growth of text documents in digital form, automated text classification has become an important research in the last two decades. The major challenge of text document representations are high dimension, sparsity, volume and semantics. Since the terms are only features that can be found in documents, selection of good terms (features) plays an very important role. In text classification, feature selection is a strategy that can be used to improve classification effectiveness, computational efficiency and accuracy. In this paper, we present a quantitative analysis of most widely used feature selection (FS) methods, viz. Term Frequency-Inverse Document Frequency (tfidf ), Mutual Information (MI), Information Gain (IG), CHISquare (x2), Term Frequency-Relevance Frequency (tfrf ), Term Strength (TS), Ambiguity Measure (AM) and Symbolic Feature Selection (SFS) to classify text documents. We evaluated all the feature selection methods on standard datasets like 20 Newsgroups, 4 University dataset and Reuters-21578.

Keywords: classifiers, feature selection, text classification

Procedia PDF Downloads 421
29261 The Composer’s Hand: An Analysis of Arvo Pärt’s String Orchestral Work, Psalom

Authors: Mark K. Johnson

Abstract:

Arvo Pärt has composed over 80 text-based compositions based on nine different languages. But prior to 2015, it was not publicly known what texts the composer used in composing a number of his non-vocal works, nor the language of those texts. Because of this lack of information, few if any musical scholars have illustrated in any detail how textual structure applies to any of Pärt’s instrumental compositions. However, in early 2015, the Arvo Pärt Centre in Estonia published In Principio, a compendium of the texts Pärt has used to derive many of the parameters of his text-based compositions. This paper provides the first detailed analysis of the relationship between structural aspects of the Church Slavonic Eastern Orthodox text of Psalm 112 and the musical parameters that Pärt used when composing the string orchestral work Psalom. It demonstrates that Pärt’s text-based compositions are carefully crafted works, and that evidence of the presence of the ‘invisible’ hand of the composer can be found within every aspect of the underpinning structures, at the more elaborate middle ground level, and even within surface aspects of these works. Based on the analysis of Psalom, it is evident that the text Pärt selected for Psalom informed many of his decisions regarding the musical structures, parameters and processes that he deployed in composing this non-vocal text-based work. Many of these composerly decisions in relation to these various aspects cannot be fathomed without access to, and an understanding of, the text associated with the work.

Keywords: Arvo Pärt, minimalism, psalom, text-based process music

Procedia PDF Downloads 197