Search results for: text information retrieval
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 11544

11244 Making Sense of Places: A Comparative Study of Three Contexts in Thailand

Authors: Thirayu Jumsai Na Ayudhya

Abstract:

Studies of what architecture means to people in their everyday lives have not adequately provided a contextualized and holistic theoretical framework. This article succinctly presents a theoretical framework obtained from a comparative study of how people experience everyday architecture in three different contexts: 1) the Bangkok CBD, 2) Phuket island old town, and 3) Nan province old town. The way people make sense of everyday architecture can be addressed in four super-ordinate themes: (1) building in urban (text), (2) building in (text), (3) building in human (text), and (4) building in time (text). In this article, these super-ordinate themes were checked for recurrence across the three studied contexts. In each studied context, the participants were divided into two groups: 1) local people and 2) visitors. Participants were asked to take photographs of everyday architecture during their everyday routine and to take part in an elicitation interview using the photographs they had produced. Interpretative phenomenological analysis (IPA) was adopted to interpret the interview data. Sub-themes emerging in each studied context were cross-compared among the three studied contexts. The four super-ordinate themes were found to recur, with additional distinctive sub-themes. Further studies in other contexts, such as those with socio-political, economic, or cultural differences, are recommended to complete the theoretical framework.

Keywords: sense of place, the everyday architecture, architectural experience, the everyday

Procedia PDF Downloads 135
11243 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. It is one of the most fundamental tasks in almost all natural language processing. In natural language processing, providing a large amount of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the availability of a tagged corpus is a bottleneck for natural language processing, especially for POS tagging. A promising direction for tackling this problem is a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for Amharic using K-means clustering, motivated by the large amount of data produced in day-to-day activities. The tagger is developed in four phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this level, the raw text is split into sentences and then into words. The second phase is feature extraction, which covers word frequency and the syntactic and morphological features of a word. The third phase is clustering: among different clustering algorithms, K-means is selected and implemented in this study to bring similar words together. The fourth phase is mapping, in which each cluster is inspected carefully and the most common tag is assigned to the group. The study identifies two features capable of distinguishing one part of speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging. To increase the performance of the unsupervised part-of-speech tagger, other features not included in this study, such as semantic information, need to be incorporated. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
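
The clustering and mapping phases described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the two-dimensional feature vectors (a hypothetical "has a verbal suffix" score and a hypothetical relative sentence position), the seeding rule, and the tag names are all assumptions made for illustration.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def map_clusters(assign, sample_tags):
    """Mapping phase: label each cluster with its most common known tag."""
    votes = {}
    for idx, tag in sample_tags.items():
        votes.setdefault(assign[idx], Counter())[tag] += 1
    return {c: counter.most_common(1)[0][0] for c, counter in votes.items()}


# Hypothetical word vectors: (suffix score, relative position in sentence).
features = [(1.0, 0.9), (0.1, 0.1), (0.9, 1.0), (0.0, 0.2), (1.0, 0.8), (0.1, 0.0)]
clusters = kmeans(features, k=2)
# A handful of inspected words (index -> tag) is enough to name each cluster.
cluster_tags = map_clusters(clusters, {0: "VERB", 1: "NOUN"})
print(clusters, cluster_tags)
```

Only the mapping step needs any tagged words, and then only a few per cluster, which is why the approach suits a language with little annotated data.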

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 423
11242 Degraded Document Analysis and Extraction of Original Text Document: An Approach without Optical Character Recognition

Authors: L. Hamsaveni, Navya Prakash, Suresha

Abstract:

Document Image Analysis recognizes text and graphics in documents acquired as images. This paper adopts an approach to degraded document image analysis that does not use Optical Character Recognition (OCR). The technique involves document imaging methods such as Image Fusing and Speeded Up Robust Features (SURF) Detection to identify and extract the degraded regions from a set of document images and so obtain an original document with complete information. If a captured degraded document image is skewed, it has to be straightened (deskewed) before further processing. The YCbCr image storage format is used as a tool to convert the grayscale image to RGB image format. The presented algorithm is tested on various types of degraded documents, such as printed documents, handwritten documents, old script documents, and handwritten image sketches in documents. The purpose of this research is to obtain an original document from a given set of degraded documents of the same source.
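
The abstract does not say how YCbCr is used in the grayscale-to-RGB step, so the following is only a sketch under assumptions: the gray level is taken as the luma channel with neutral chroma, and the full-range BT.601 (JFIF) formulas convert YCbCr back to RGB. The function names are hypothetical.

```python
def gray_to_ycbcr(gray):
    """A gray level has no colour, so both chroma channels take the neutral value 128."""
    return (gray, 128, 128)


def ycbcr_to_rgb(y, cb, cr):
    """Full-range BT.601 (JFIF) conversion, rounded and clamped to 0..255."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(max(0, min(255, round(v))) for v in (r, g, b))


# With neutral chroma the three RGB channels all equal the gray level.
print(ycbcr_to_rgb(*gray_to_ycbcr(100)))
```

Routing the image through YCbCr keeps the luminance in one channel, so a three-channel RGB image can be produced without altering the brightness of the original scan.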

Keywords: grayscale image format, image fusing, RGB image format, SURF detection, YCbCr image format

Procedia PDF Downloads 355
11241 A U-Net Based Architecture for Fast and Accurate Diagram Extraction

Authors: Revoti Prasad Bora, Saurabh Yadav, Nikita Katyal

Abstract:

In the context of educational data mining, extracting information from images containing both text and diagrams is of high importance. Document analysis therefore requires extracting the diagrams from such images and processing the text and diagrams separately. To the best of the authors’ knowledge, none of the many approaches for extracting tables, figures, etc., meets the need for real-time processing with the high accuracy required in multiple applications. In the education domain, diagrams can have varied characteristics, e.g., line-based geometric diagrams, chemical bonds, and mathematical formulas. There are two broad categories of approaches that try to solve similar problems: traditional computer vision based approaches and deep learning approaches. The traditional computer vision based approaches mainly leverage connected components and distance transform based processing and hence perform well only in very limited scenarios. The existing deep learning approaches leverage either YOLO or Faster R-CNN architectures and suffer from a performance-accuracy tradeoff. This paper proposes a U-Net based architecture that formulates diagram extraction as a segmentation problem. The proposed method provides similar accuracy with a much faster extraction time than the state-of-the-art approaches mentioned. Further, the segmentation mask in this approach allows the extraction of diagrams of irregular shapes.
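
To show why a segmentation mask supports irregular shapes, here is a minimal sketch that groups the foreground pixels of a binary mask into connected regions. The mask is a hand-written stand-in for the output of a segmentation network, and the 4-connectivity rule and function name are assumptions, not details taken from the paper.

```python
from collections import deque


def connected_regions(mask):
    """Group foreground pixels (value 1) into 4-connected regions."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first flood fill from an unvisited foreground pixel.
                queue = deque([(r, c)])
                seen[r][c] = True
                region = []
                while queue:
                    y, x = queue.popleft()
                    region.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(sorted(region))
    return regions


# Hypothetical mask: an L-shaped diagram and a T-shaped one.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
]
regions = connected_regions(mask)
print([len(region) for region in regions])
```

Each region is an exact pixel set, so a non-rectangular diagram is recovered without the surrounding text that a bounding box from a detector would include.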

Keywords: computer vision, deep-learning, educational data mining, faster-RCNN, figure extraction, image segmentation, real-time document analysis, text extraction, U-Net, YOLO

Procedia PDF Downloads 108
11240 Surface to the Deeper: A Universal Entity Alignment Approach Focusing on Surface Information

Authors: Zheng Baichuan, Li Shenghui, Li Bingqian, Zhang Ning, Chen Kai

Abstract:

Entity alignment (EA) plays a pivotal role in the integration of knowledge graphs, where structural differences often exist between the source and target graphs, such as the presence or absence of attribute information and the types of attribute information (text, timestamps, images, etc.). However, most current research efforts focus on improving alignment accuracy, often with an increased reliance on specific structures. This dependency diminishes their practical value and causes difficulties when aligning knowledge graphs with varying structures. Therefore, we propose a universal knowledge graph alignment approach that utilizes only the common basic structures shared by knowledge graphs. We have demonstrated through experiments that our method achieves state-of-the-art performance in fair comparisons.

Keywords: knowledge graph, entity alignment, transformer, deep learning

Procedia PDF Downloads 22
11239 Integrating Natural Language Processing (NLP) and Machine Learning in Lung Cancer Diagnosis

Authors: Mehrnaz Mostafavi

Abstract:

The assessment and categorization of incidental lung nodules present a considerable challenge in healthcare, often necessitating resource-intensive multiple computed tomography (CT) scans for growth confirmation. This research addresses this issue by introducing a distinct computational approach leveraging radiomics and deep-learning methods. However, understanding local services is essential before implementing these advancements. With diverse tracking methods in place, there is a need for efficient and accurate identification approaches, especially in the context of managing lung nodules alongside pre-existing cancer scenarios. This study explores the integration of text-based algorithms in medical data curation, indicating their efficacy in conjunction with machine learning and deep-learning models for identifying lung nodules. Combining medical images with text data has demonstrated superior data retrieval compared to using each modality independently. While deep learning and text analysis show potential in detecting previously missed nodules, challenges persist, such as increased false positives. The presented research introduces a Structured-Query-Language (SQL) algorithm designed for identifying pulmonary nodules in a tertiary cancer center, externally validated at another hospital. Leveraging natural language processing (NLP) and machine learning, the algorithm categorizes lung nodule reports based on sentence features, aiming to facilitate research and assess clinical pathways. The hypothesis posits that the algorithm can accurately identify lung nodule CT scans and predict concerning nodule features using machine-learning classifiers. Through a retrospective observational study spanning a decade, CT scan reports were collected, and an algorithm was developed to extract and classify data. Results underscore the complexity of lung nodule cohorts in cancer centers, emphasizing the importance of careful evaluation before assuming a metastatic origin. 
The SQL and NLP algorithms demonstrated high accuracy in identifying lung nodule sentences, indicating potential for local service evaluation and research dataset creation. Machine-learning models exhibited strong accuracy in predicting concerning changes in lung nodule scan reports. While limitations include variability in disease group attribution, the potential for correlation rather than causality in clinical findings, and the need for further external validation, the algorithm's accuracy and potential to support clinical decision-making and healthcare automation represent a significant stride in lung nodule management and research.

Keywords: lung cancer diagnosis, structured-query-language (SQL), natural language processing (NLP), machine learning, CT scans

Procedia PDF Downloads 58
11238 The Syntactic Features of Islamic Legal Texts and Their Implications for Translation

Authors: Rafat Y. Alwazna

Abstract:

Certain religious texts are deemed part of legal texts that are characterised by high sensitivity and sacredness. Amongst such religious texts are Islamic legal texts, which are replete with Islamic legal terms that designate particular legal concepts peculiar to the Islamic legal system and legal culture. From the syntactic perspective, however, Islamic legal texts prove lengthy, condensed and convoluted, with little use of punctuation but extensive use of subordination and co-ordination, which separate the main verb from the subject and carry a heavy load of legal detail. The present paper examines the syntactic features of Islamic legal texts by analysing a short text of Islamic jurisprudence in order to explore the syntactic features that characterise this type of legal text. The text is then translated into legal English to identify the implications that emerge from the English translation. Based on these implications, the paper compares and contrasts the syntactic features of Islamic legal texts with those of legal English texts. Finally, the paper argues that a number of syntactic features of Islamic legal texts, such as nominalisation, passivisation, little use of punctuation, and the use of the Arabic cohesive device, are also possessed by English legal texts, except for the last feature and with some variations. The paper also claims that when rendering an Islamic legal text into legal English, certain implications emerge, such as the necessity of a sentence break, the omission of the cohesive device concerned and the increase in the use of nominalisation, passivisation, passive participles, and so on.

Keywords: English legal texts, Islamic legal texts, nominalisation, participles, passivisation, syntactic features, translation implications

Procedia PDF Downloads 198
11237 Communication through Technology: SMS Taking Most of the Time Impacting the Standard English

Authors: Nazia Sulemna, Sadia Gul

Abstract:

With the spread of mobile phones, text messaging has become a popular medium of communication. Its users are multiplying with every passing day. Its use is not limited to informal communication but extends to formal communication as well. Students are avid users of mobile phones and of SMS. The present study shows that students use SMS for a number of reasons and spend a good amount of time on it, which results in typographical features, graphones and rebus writing. Data were collected through questionnaires, and the study concludes that the effect is evident among L2 users and in exams as well.

Keywords: text messaging, technology, exams, formal writing

Procedia PDF Downloads 720
11236 StockTwits Sentiment Analysis on Stock Price Prediction

Authors: Min Chen, Rubi Gupta

Abstract:

Understanding and predicting stock market movements is a challenging problem. It is believed that stock markets are partially driven by public sentiment, which has led to numerous research efforts to predict stock market trends using public sentiment expressed on social media such as Twitter, but with limited success. Recently, the microblogging website StockTwits has become increasingly popular for users to share their discussions and sentiments about stocks and the financial market. In this project, we analyze the text content of StockTwits tweets and extract financial sentiment using text featurization and machine learning algorithms. StockTwits tweets are first pre-processed using techniques including stopword removal, special character removal, and case normalization to remove noise. Features are extracted from these preprocessed tweets through a text featurization process using bags of words, N-gram models, TF-IDF (term frequency-inverse document frequency), and latent semantic analysis. Machine learning models are then trained to classify the tweets' sentiment as positive (bullish) or negative (bearish). The correlation between the aggregated daily sentiment and daily stock price movement is then investigated using Pearson’s correlation coefficient. Finally, the sentiment information is applied together with time series stock data to predict stock price movement. The experiments on five companies (Apple, Amazon, General Electric, Microsoft, and Target) over a period of nine months demonstrate the effectiveness of our study in improving the prediction accuracy.
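
Two of the steps named in this abstract, noise-removing pre-processing and the Pearson correlation between daily sentiment and price movement, can be sketched as follows. The stopword list, the sample message, and the daily figures are invented for illustration and are not data from the study.

```python
import math
import re

# A tiny illustrative stopword list (an assumption, not the authors' list).
STOPWORDS = {"a", "is", "the", "to"}


def preprocess(message):
    """Case normalization, special character removal, then stopword removal."""
    lowered = message.lower()
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)
    return [token for token in cleaned.split() if token not in STOPWORDS]


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


tokens = preprocess("$AAPL to the MOON!!! Buying more :)")
# Hypothetical daily aggregates: share of bullish messages and price change (%).
daily_bullish_share = [0.40, 0.55, 0.35, 0.70, 0.60]
daily_price_change = [-0.2, 0.4, -0.5, 1.1, 0.3]
r = pearson(daily_bullish_share, daily_price_change)
print(tokens, round(r, 3))
```

A coefficient near +1 or -1 on real data would indicate that the aggregated sentiment moves with, or against, the daily price; values near 0 would suggest the sentiment signal adds little on its own.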

Keywords: machine learning, sentiment analysis, stock price prediction, tweet processing

Procedia PDF Downloads 131
11235 Information Literacy Initiatives in India in Present Era Age

Authors: Darshan Lal

Abstract:

The paper describes the concept of information literacy, a critical component of the information age and a vital process in the changing modern world. Information literacy initiatives in India are discussed, as are information literacy programmes for LIS professionals. Information literacy enables a person to recognize when information is needed and to locate, evaluate, and use the needed information effectively.

Keywords: information literacy, information communication technology (ICT), information literacy programmes

Procedia PDF Downloads 344
11234 Structured-Ness and Contextual Retrieval Underlie Language Comprehension

Authors: Yao-Ying Lai, Maria Pinango, Ashwini Deo

Abstract:

While grammatical devices are essential to language processing, how comprehension utilizes cognitive mechanisms is less emphasized. This study addresses this issue by probing the complement coercion phenomenon: an entity-denoting complement following verbs like begin and finish receives an eventive interpretation. For example, (1) “The queen began the book” receives an agentive reading like (2) “The queen began [reading/writing/etc.…] the book.” Such sentences engender additional processing cost in real-time comprehension. The traditional account attributes this cost to an operation that coerces the entity-denoting complement to an event, assuming that these verbs require eventive complements. However, in closer examination, examples like “Chapter 1 began the book” undermine this assumption. An alternative, Structured Individual (SI) hypothesis, proposes that the complement following aspectual verbs (AspV; e.g. begin, finish) is conceptualized as a structured individual, construed as an axis along various dimensions (e.g. spatial, eventive, temporal, informational). The composition of an animate subject and an AspV such as (1) engenders an ambiguity between an agentive reading along the eventive dimension like (2), and a constitutive reading along the informational/spatial dimension like (3) “[The story of the queen] began the book,” in which the subject is interpreted as a subpart of the complement denotation. Comprehenders need to resolve the ambiguity by searching contextual information, resulting in additional cost. To evaluate the SI hypothesis, a questionnaire was employed. Method: Target AspV sentences such as “Shakespeare began the volume.” were preceded by one of the following types of context sentence: (A) Agentive-biasing, in which an event was mentioned (…writers often read…), (C) Constitutive-biasing, in which a constitutive meaning was hinted (Larry owns collections of Renaissance literature.), (N) Neutral context, which allowed both interpretations. 
Thirty-nine native speakers of English were asked to (i) rate each context-target sentence pair on a 1-5 scale (5 = fully understandable), and (ii) choose possible interpretations for the target sentence given the context. The SI hypothesis predicts that comprehension is harder for the Neutral condition than for the biasing conditions because no contextual information is provided to resolve the ambiguity. Also, comprehenders should obtain the specific interpretation corresponding to the context type. Results: (A) Agentive-biasing and (C) Constitutive-biasing conditions were rated higher than the (N) Neutral condition (p < .001), while all conditions were within the acceptable range (> 3.5 on the 1-5 scale). This suggests that, when relevant contextual information is lacking, semantic ambiguity decreases comprehensibility. The interpretation task shows that the participants selected the biased agentive/constitutive reading for conditions (A) and (C) respectively. For the Neutral condition, the agentive and constitutive readings were chosen equally often. Conclusion: These findings support the SI hypothesis: the meaning of AspV sentences is conceptualized as a parthood relation involving structured individuals. We argue that semantic representation makes reference to spatial structured-ness (abstracted axis). To obtain an appropriate interpretation, comprehenders utilize contextual information to enrich the conceptual representation of the sentence in question. This study connects semantic structure to human conceptual structure, and provides a processing model that incorporates contextual retrieval.

Keywords: ambiguity resolution, contextual retrieval, spatial structured-ness, structured individual

Procedia PDF Downloads 310
11233 Poultry in Motion: Text Mining Social Media Data for Avian Influenza Surveillance in the UK

Authors: Samuel Munaf, Kevin Swingler, Franz Brülisauer, Anthony O’Hare, George Gunn, Aaron Reeves

Abstract:

Background: Avian influenza, more commonly known as bird flu, is a viral zoonotic respiratory disease stemming from various species of poultry, including pets and migratory birds. Researchers have argued that the accessibility of health information online, in addition to the low-cost data collection methods the internet provides, has revolutionized the ways in which epidemiological and disease surveillance data are used. This paper examines the feasibility of using internet data sources, such as Twitter and livestock forums, for the early detection of avian flu outbreaks through the use of text mining algorithms and social network analysis. Methods: Social media mining was conducted on Twitter for the period 01/01/2021 to 31/12/2021 via the Twitter API in Python. The results were filtered first by hashtags (#avianflu, #birdflu) and word occurrences (avian flu, bird flu, H5N1), and then refined further by location to include only results from within the UK. The text was analysed as a time series to determine keyword frequencies, and topic modeling was applied to uncover insights in the text prior to a confirmed outbreak. Further analysis examined clinical signs (e.g., swollen head, blue comb, dullness) within the time series prior to the avian flu outbreak confirmed by the Animal and Plant Health Agency (APHA). Results: Increases in Google search results and in avian flu-related tweets correlated in time with the confirmed cases. Topic modeling uncovered clusters of word occurrences relating to livestock biosecurity, disposal of dead birds, and prevention measures. Conclusions: Text mining social media data can prove useful for analysing discussed topics for epidemiological surveillance purposes, especially given the lack of applied research in the veterinary domain. The small sample size of tweets for certain weekly time periods, together with a great amount of textual noise in the data, makes it difficult to provide statistically plausible results.
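
The filtering and keyword-frequency steps in the Methods can be sketched without any API access. The hashtags and keywords below are the ones listed in the abstract; the posts, the helper names, and the choice of ISO weeks as the time bucket are assumptions made for illustration.

```python
from collections import Counter
from datetime import date

HASHTAGS = ("#avianflu", "#birdflu")
KEYWORDS = ("avian flu", "bird flu", "h5n1")


def is_relevant(text):
    """Keep a post if it contains any listed hashtag or keyword (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in HASHTAGS + KEYWORDS)


def weekly_counts(posts):
    """Count relevant posts per (ISO year, ISO week)."""
    counts = Counter()
    for day, text in posts:
        if is_relevant(text):
            iso = day.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return counts


# Hypothetical posts, already restricted to UK locations.
posts = [
    (date(2021, 11, 1), "Bird flu confirmed near us #birdflu"),
    (date(2021, 11, 3), "H5N1 found in swans at the reservoir"),
    (date(2021, 11, 9), "Lovely walk by the lake today"),
    (date(2021, 11, 10), "New avian flu housing order announced"),
]
counts = weekly_counts(posts)
print(sorted(counts.items()))
```

Weekly buckets like these make the small-sample problem mentioned in the conclusions visible: a week with only one or two matching posts cannot support a meaningful trend estimate.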

Keywords: veterinary epidemiology, disease surveillance, infodemiology, infoveillance, avian influenza, social media

Procedia PDF Downloads 85
11232 Arabic Quran Search Tool Based on Ontology

Authors: Mohammad Alqahtani, Eric Atwell

Abstract:

This paper reviews and classifies most of the important types of search techniques that have been applied to the holy Quran. It then addresses the limitations of these techniques. Additionally, the paper surveys most existing Quranic ontologies and their deficiencies. Finally, it presents a new semantic search tool for Al Quran based on Qur’anic ontologies. This tool is intended to overcome the limitations of existing Quranic search applications.

Keywords: holy Quran, natural language processing (NLP), semantic search, information retrieval (IR), ontology

Procedia PDF Downloads 546
11231 Source Separation for Global Multispectral Satellite Images Indexing

Authors: Aymen Bouzid, Jihen Ben Smida

Abstract:

In this paper, we aim to demonstrate the importance of applying blind source separation methods to remote sensing data in order to index multispectral images. The proposed method starts with Gabor filtering and the application of blind source separation to obtain a more effective representation of the information contained in the observed images. A feature vector is then extracted from each image in order to index it. Experimental results show the superior performance of this approach.

Keywords: blind source separation, content based image retrieval, feature extraction, multispectral satellite images

Procedia PDF Downloads 379
11230 Google Translate: AI Application

Authors: Shaima Almalhan, Lubna Shukri, Miriam Talal, Safaa Teskieh

Abstract:

Since artificial intelligence is a rapidly evolving topic that has had a significant impact on technical growth and innovation, this paper examines people's awareness of, use of, and engagement with the Google Translate application. To see how familiar users are with the app and its features, quantitative and qualitative research was conducted. The findings revealed that consumers have a high level of confidence in the application, and showed how far they benefit from this sort of innovation and how convenient it makes communication.

Keywords: artificial intelligence, google translate, speech recognition, language translation, camera translation, speech to text, text to speech

Procedia PDF Downloads 134
11229 A Rational Intelligent Agent to Promote Metacognition a Situation of Text Comprehension

Authors: Anass Hsissi, Hakim Allali, Abdelmajid Hajami

Abstract:

This article presents the results of doctoral research that aims to integrate a metacognitive dimension into the design of computing environments for human learning (ILE). We conducted a detailed study of the relationship between metacognitive processes and learning, specifically their positive impact on learners' performance in reading comprehension. Our contribution is to implement methods, using an intelligent agent based on the belief-desire-intention (BDI) paradigm, that provide intelligent and reliable support for weak readers in order to encourage regulation and a conscious and rational use of their metacognitive abilities.

Keywords: metacognition, text comprehension, EIAH, autoregulation, BDI agent

Procedia PDF Downloads 302
11228 Developmental Trends on Initial Letter Fluency in Typically Developing Children

Authors: Sunila John, B. Rajashekhar

Abstract:

Initial letter fluency tasks are among the simple behavioral measures used to evaluate the complex nature of word retrieval ability. The task requires the participant to retrieve as many words as possible beginning with a particular letter within a fixed time frame. Though verbal fluency tasks are popular in adult clinical conditions, their role in children has been less emphasized. There is a lack of in-depth understanding of the processes underlying verbal fluency performance in typically developing children. The present study, therefore, aims to delineate the developmental trend on the initial letter fluency task observed in typically developing Malayalam-speaking children. The participants were aged 5 to 10 years and categorized into three groups: Group I (class I and II; mean (SD) age in years: 6.44 (.78)), Group II (class III and IV; mean (SD) age in years: 8.59 (.83)), and Group III (class V and VI; mean (SD) age in years: 10.28 (.80)). The verbal fluency outcome measures were analyzed on two initial letter fluency tasks. The study findings revealed a distinct pattern of initial letter fluency development, which may enhance its usefulness in clinical and research settings.

Keywords: children, development, initial letter fluency, word retrieval

Procedia PDF Downloads 437
11227 Increasing the Ability of State Senior High School 12 Pekanbaru Students in Writing an Analytical Exposition Text through Comic Strips

Authors: Budiman Budiman

Abstract:

This research aimed to describe and test whether students’ ability to write analytical exposition texts increased through the use of comic strips at SMAN 12 Pekanbaru. The respondents were second-grade students of class XI Science 3 in the 2011-2012 academic year. The class comprised forty-two (42) students. Quantitative and qualitative data were collected using a writing test and observation sheets. The findings reveal a significant increase in students’ ability to write analytical exposition texts through comic strips: the average pre-test score was 43.7 and the average post-test score was 65.37. In addition, the students’ interest and motivation in learning improved, as seen in their increased awareness and activeness in the learning process recorded on the observation sheets. The findings suggest that the use of comic strips in teaching and learning is beneficial for better learning outcomes.

Keywords: analytical exposition, comic strips, secondary school students, writing ability

Procedia PDF Downloads 136
11226 Improving Technical Translation Ability of the Iranian Students of Translation Through Multimedia: An Empirical Study

Authors: Dina Zakeri, Ali Aminzad

Abstract:

Multimedia-assisted teaching results in eliminating traditional training barriers, facilitating the cognition process and upgrading learning outcomes. This study attempted to examine the effects of implementing multimedia on teaching technical translation model and on the technical text translation ability of Iranian students of translation. To fulfill the purpose of the study, a total of forty-six learners were selected out of fifty-seven participants in a higher education center in Tehran based on their scores in Preliminary English Test (PET) and were divided randomly into the experimental and control groups. Prior to the treatment, a technical text translation questionnaire was devised and then approved and validated by three assistant professors of technical fields and three assistant professors of Teaching English as a Foreign Language (TEFL) at the university. This questionnaire was administered as a pretest to both groups. Control and experimental groups were trained for five successive weeks using identical course books but with a different lesson plan that allowed employing multimedia for the experimental group only. The devised and approved questionnaire was administered as a posttest to both groups at the end of the instruction. A multivariate ANOVA was run to compare the two groups’ means on the PET, pretest and posttest. The results showed the rejection of all null hypotheses of the study and revealed that multimedia significantly improved technical text translation ability of the learners.

Keywords: multimedia, multimedia-mediated teaching, technical translation model, technical text, translation ability

Procedia PDF Downloads 109
11225 Genetic Algorithms for Feature Generation in the Context of Audio Classification

Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes

Abstract:

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a representation useful for machine learning tasks. In automatic audio classification tasks, this is interesting since audio, usually complex information, needs to be transformed into a computationally convenient input. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have been used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.
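
As a rough illustration of searching a feature space with a genetic algorithm, the sketch below evolves a bit mask that switches candidate features on or off. It is not the authors' system: the fitness function is a made-up weighted sum standing in for classifier accuracy, and the PCA step and confusion-matrix constraints mentioned in the abstract are left out.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Elitist genetic algorithm over bit strings; returns (best, fitness history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Selection: the better half survives unchanged (elitism).
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            # Crossover: splice two surviving parents at a random cut point.
            mum, dad = rng.sample(elite, 2)
            cut = rng.randrange(1, n_genes)
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = elite + children
    best = max(population, key=fitness)
    return best, history + [fitness(best)]


# Hypothetical usefulness of six candidate audio features (negative = harmful).
WEIGHTS = [3.0, -1.0, 2.0, -2.0, 1.0, 0.5]


def toy_fitness(mask):
    """Stand-in for classifier accuracy: sum of weights of the selected features."""
    return sum(w for w, bit in zip(WEIGHTS, mask) if bit)


best_mask, history = evolve(toy_fitness, n_genes=len(WEIGHTS))
print(best_mask, history[-1])
```

In a real setting the fitness call would train and score a classifier on the selected or generated features, which is where constraints such as a penalty on specific confusion-matrix cells could be added.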

Keywords: feature generation, feature learning, genetic algorithm, music information retrieval

Procedia PDF Downloads 409
11224 Identification of Coauthors in Scientific Database

Authors: Thiago M. R Dias, Gray F. Moita

Abstract:

The analysis of scientific collaboration networks has contributed significantly to the understanding of how collaboration between researchers takes place and of how the scientific production of researchers or research groups evolves. However, the identification of collaborations in large scientific databases is not a trivial task, given the high computational cost of the methods commonly used. This paper proposes a method for identifying collaborations in large databases of researchers’ curricula. The proposed method has a low computational cost and gives satisfactory results, proving to be an interesting alternative for the modeling and characterization of large scientific collaboration networks.

Keywords: extraction, data integration, information retrieval, scientific collaboration

Procedia PDF Downloads 367
11223 Temporality, Place and Autobiography in J.M. Coetzee’s 'Summertime'

Authors: Barbara Janari

Abstract:

In this paper it is argued that the effect of the disjunctive temporality in Summertime (the third of J.M. Coetzee’s fictionalised memoirs) is two-fold: firstly, it reflects the memoir’s ambivalent, contradictory representations of place in order to emphasize the fractured sense of self growing up in South Africa during apartheid entailed for Coetzee. Secondly, it reconceives the autobiographical discourse as one that foregrounds the inherent fictionality of all texts. The memoir’s narrative is filtered through intricate textual strategies that disrupt the chronological movement of the narrative, evoking the labyrinthine ways in which the past and present intersect and interpenetrate each other. It is framed by entries from Coetzee’s Notebooks: it opens with entries that cover the years 1972–1975, and ends with a number of undated fragments from his Notebooks. Most of the entries include a short ‘memo’ at the end, added between 1999 and 2000. While the memos follow the Notebook entries in the text, they are separated by decades. Between the Notebook entries is a series of interviews conducted by Vincent, the text’s putative biographer, between 2007 and 2008, based on recollections from five people who had known Coetzee in the 1970s – a key period in John’s life as it marks both his return to South Africa after a failed emigration attempt to America, and the beginning of his writing career, with the publication of Dusklands in 1974. The relationship between the memoir’s various parts is a key feature of Coetzee’s representation of place in Summertime, which is constructed as a composite one in which the principle of reflexive referencing has to be adopted. In other words, readers have to suspend individual references temporarily until the relationships between the parts have been connected to each other. In order to apprehend meaning in the text, the disparate narrative elements have to first be tied together. 
In this text, then, the experience of time as ordered and chronological is ruptured. Instead, the memoir’s themes and patterns become apparent most clearly through reflexive referencing, by which relationships between disparate sections of the text are linked. The image of the fictional John that emerges from the text is a composite of this John and the author, J.M. Coetzee, and is one which embodies Coetzee’s often fraught relationship with his home country, South Africa.

Keywords: autobiography, place, reflexive referencing, temporality

Procedia PDF Downloads 50
11222 Metadiscourse in EFL, ESP and Subject-Teaching Online Courses in Higher Education

Authors: Maria Antonietta Marongiu

Abstract:

Propositional information in discourse is made coherent, intelligible, and persuasive through metadiscourse. The linguistic and rhetorical choices that writers/speakers make to organize and negotiate content matter are intended to help relate a text to its context. Besides, they help the audience to connect to and interpret a text according to the values of a specific discourse community. Based on these assumptions, this work aims to analyse the use of metadiscourse in the spoken performance of teachers in online EFL, ESP, and subject-teaching courses taught in English to non-native learners in higher education. The global spread of COVID-19 has forced universities to transition their in-class courses to online delivery. This has inevitably placed on the instructor a heavier interactional responsibility compared to in-class courses. Accordingly, online delivery needs greater structuring as regards establishing the reader/listener’s resources for text understanding and negotiating. Indeed, in online as well as in in-class courses, lessons are social acts which take place in contexts where interlocutors, as members of a community, affect the ways ideas are presented and understood. Following Hyland’s Interactional Model of Metadiscourse (2005), this study intends to investigate Teacher Talk in online academic courses during the COVID-19 lockdown in Italy. The selected corpus includes the transcripts of online EFL and ESP courses and online subject-teaching courses taught in English. The objective of the investigation is, firstly, to ascertain the presence of metadiscourse in the form of interactive devices (to guide the listener through the text) and interactional features (to involve the listener in the subject).
Previous research on metadiscourse in academic discourse, in college students' presentations in EAP (English for Academic Purposes) lessons, as well as in online teaching methodology courses and MOOCs (Massive Open Online Courses), has shown that instructors use a vast array of metadiscoursal features intended to express the speakers’ intentions and standing with respect to discourse. Besides, they tend to use directions to orient their listeners and logical connectors referring to the structure of the text. Accordingly, the purpose of the investigation is also to find out whether metadiscourse is used as a rhetorical strategy by instructors to control, evaluate and negotiate the impact of the ongoing talk, and eventually to signal their attitudes towards the content and the audience. Thus, the use of metadiscourse can contribute to the informative and persuasive impact of discourse, and to the effectiveness of online communication, especially in learning contexts.

Keywords: discourse analysis, metadiscourse, online EFL and ESP teaching, rhetoric

Procedia PDF Downloads 111
11221 Effect of Mobile Phone Text Message Reminders on Adherence to Routine Prenatal Iron/Folic Acid Supplement among Pregnant Women: A Pilot Study

Authors: Nneka U. Igboeli, Maxwell O. Adibe

Abstract:

Iron and folate supplementation in pregnancy are important interventions that prevent maternal anaemia and fetal anomaly. Thus, daily oral doses of iron and folic acid are recommended throughout pregnancy as part of antenatal care. However, low adherence has been a major drawback, leading to low effectiveness of these programs. This study evaluated the effect on adherence of mobile text message reminders sent to pregnant women to take their routine medications. The first 100 women who consented to the study were recruited and randomized either to receive text message reminders on adherence to routine medications or not. Adherence was assessed using the 8-item Modified Morisky Adherence Scale (8-MMAS). The folders of successfully recruited women were tagged with the study number assigned to each of them. The women’s phone numbers were collected and used to send text message reminders on adhering to routine drugs only to women in the intervention group. The text messages were sent three times per week for a period of four weeks, with an adherence reassessment at the one-month follow-up antenatal visit for recruited women. At the one-month follow-up, 6 (16%) women in the intervention group and 17 (34%) in the control group were lost to follow-up. The between-group mean difference in adherence score was 0.07 (-0.96 – 1.10) at baseline and 0.3 (-0.31 – 0.92) after the intervention, both non-significant (p > 0.05). The within-group changes were increases of 0.58 (0.00 – 1.16) (p = 0.05) from baseline for the intervention group and 0.35 (-0.51 – 1.20) (p = 0.395) for the control group. Non-significant increases in adherence scores were thus recorded for both groups. However, the increase in adherence scores of women in the intervention group was greater, and it may translate into more positive results if the study period is extended and study drop-outs are reduced.

Keywords: adherence, mobile phone, pregnant women, reminders

Procedia PDF Downloads 157
11220 Content-Based Mammograms Retrieval Based on Breast Density Criteria Using Bidimensional Empirical Mode Decomposition

Authors: Sourour Khouaja, Hejer Jlassi, Nadia Feddaoui, Kamel Hamrouni

Abstract:

Most medical images, and especially mammograms, are now stored in large databases. Retrieving a desired image is of great importance for finding the diagnoses of previous similar cases. Our method is implemented to assist radiologists in retrieving mammographic images containing breasts with a density aspect similar to that seen on the query mammogram. This is challenging given the importance of density criteria in cancer prediction and their effect on segmentation issues. We used BEMD (Bidimensional Empirical Mode Decomposition) to characterize the content of images and the Euclidean distance to measure similarity between images. Experiments on the MIAS mammography image database show that the results are promising. The performance was evaluated using precision and recall curves comparing query and retrieved images. Computing recall and precision showed the effectiveness of applying CBIR to large mammographic image databases. We found a precision of 91.2% for mammography with a recall of 86.8%.
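The retrieval and evaluation steps described in this abstract can be illustrated with a minimal sketch. It assumes each image has already been reduced to a numeric feature vector; the two-dimensional vectors, the image ids, and the density labels below are hypothetical stand-ins, not BEMD descriptors or MIAS data.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, database, k):
    """Return the ids of the k database images closest to the query."""
    ranked = sorted(database, key=lambda image_id: euclidean(query, database[image_id]))
    return ranked[:k]

def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved), hits / len(relevant)

# Hypothetical feature vectors and density labels (illustration only).
database = {
    "img_a": [0.10, 0.20],
    "img_b": [0.12, 0.18],
    "img_c": [0.80, 0.75],
    "img_d": [0.15, 0.25],
    "img_e": [0.78, 0.80],
}
density = {"img_a": "fatty", "img_b": "fatty", "img_c": "dense",
           "img_d": "fatty", "img_e": "dense"}

query_vector, query_density = [0.11, 0.21], "fatty"
retrieved = retrieve(query_vector, database, k=2)
relevant = [i for i, label in density.items() if label == query_density]
precision, recall = precision_recall(retrieved, relevant)
print(retrieved, precision, recall)
```

Sweeping k from 1 to the database size and recording each (recall, precision) pair would give the kind of precision-recall curve the abstract mentions.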

Keywords: BEMD, breast density, content-based, image retrieval, mammography

Procedia PDF Downloads 213
11219 A Postmodern Framework for Quranic Hermeneutics

Authors: Christiane Paulus

Abstract:

Post-Islamism assumes that the Quran should not be viewed in terms of what Lyotard identifies as a ‘meta-narrative’. However, its socio-ethical content can be viewed as critical of power discourse (Foucault). Practicing religion seems to be limited to rites and individual spirituality, taqwa. Alternatively, can we build on Muhammad Abduh's classic-modern reform and develop it through a postmodernist frame? This is the main question of this study. Through his general and vague remarks on the context of the Quran, Abduh was the first to refer to the historical and cultural distance of the text as an obstacle for interpretation. His application, however, corresponded to the modern absolute idea of authentic sharia. He was followed by Amin al-Khuli, who hermeneutically linked the content of the Quran to the theory of evolution. Fazlur Rahman and Nasr Hamid abu Zeid remain reluctant to go beyond the general level in terms of context. The hermeneutic circle therefore persists as a challenge: how to get out of it and overcome one’s own assumptions. The insight into and the acceptance of the lasting ambivalence of understanding can be grasped as a postmodern approach; it is documented in Derrida's discovery of the shift in text meanings, difference, and also in Lyotard's theory of différend. The resulting mixture of meanings (Wolfgang Welsch) can be read together with the classic ambiguity of the premodern interpreters of the Quran (Thomas Bauer). Confronting hermeneutic difficulties in general, Niklas Luhmann shows every description to be an attribution, a tautology, i.e., remaining in the circle. ‘De-tautologization’ is possible, namely by analyzing the distinctions in the sense of objective, temporal and social information that every text contains. This could be expanded with the Kantian aesthetic dimension of reason (critique of pure judgment) corresponding to the iʽgaz of the Quran.
Luhmann asks, ‘What distinction does the observer/author make?’ The Quran as a speech from God to the first listeners could be seen as a discourse responding to the problems of everyday life of that time, which can be viewed as the general goal of the entire Quran. Through reconstructing Quranic lifeworlds (Alfred Schütz) in detail, the social structure crystallizes the socio-economic differences, the enormous poverty. The Quranic instruction to provide the basic needs for the neglected groups, which often intersect (old, poor, slaves, women, children), can be seen immediately in the text. First, the references to lifeworlds/social problems and discourses in longer Quranic passages should be hypothesized. Subsequently, information from the classic commentaries could be extracted; the classical Tafseer, in particular, contains rich narrative material for reconstruction. By selecting and assigning suitable, specific context information, the meaning of the description becomes condensed (Clifford Geertz). In this manner, the text necessarily undergoes an alienation and becomes newly accessible. The socio-ethical implications can thus be grasped from the difference between the original problem and the revealed/improved order/procedure; this small step can be materialized as such, not as an absolute solution but as offering plausible patterns for today’s challenges such as the Agenda 2030.

Keywords: postmodern hermeneutics, condensed description, sociological approach, small steps of reform

Procedia PDF Downloads 196
11218 A Word-to-Vector Formulation for Word Representation

Authors: Sandra Rizkallah, Amir F. Atiya

Abstract:

This work presents a novel word to vector representation that is based on embedding the words into a sphere, whereby the dot product of the corresponding vectors represents the similarity between any two words. Embedding the vectors into a sphere enabled us to take into consideration the antonymity between words, not only the synonymity, because of the suitability to handle the polarity nature of words. For example, a word and its antonym can be represented as a vector and its negative. Moreover, we have managed to extract an adequate vocabulary. The obtained results show that the proposed approach can capture the essence of the language, and can be generalized to estimate a correct similarity of any new pair of words.
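As a small illustration of the idea in this abstract, the sketch below places toy word vectors on the unit sphere and uses the dot product as the similarity score, with an antonym modeled as the negated vector. The numeric vectors are hypothetical and are not taken from the paper; how the authors actually learn their embeddings is not described here.

```python
import math

def normalize(v):
    """Scale a vector to unit length so that it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity(u, v):
    """Dot product of two unit vectors; ranges from -1 (opposite) to +1 (same)."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical toy vectors for three words (illustration only).
hot = normalize([0.9, 0.1, 0.4])
warm = normalize([0.8, 0.2, 0.5])
cold = [-x for x in hot]  # an antonym represented as the negative of the vector

print("hot vs warm:", similarity(hot, warm))
print("hot vs cold:", similarity(hot, cold))
```

Because all vectors have unit length, the dot product equals the cosine of the angle between them, so near-synonyms score close to +1 and a word paired with its negated vector scores -1.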

Keywords: natural language processing, word to vector, text similarity, text mining

Procedia PDF Downloads 249
11217 Fake News Detection for Korean News Using Machine Learning Techniques

Authors: Tae-Uk Yun, Pullip Chung, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Fake news is defined as news articles that are intentionally and verifiably false and could mislead readers. The spread of fake news may provoke anxiety, chaos, fear, or irrational decisions among the public. Thus, detecting fake news and preventing its spread has become a very important issue in our society. However, due to the huge amount of fake news produced every day, it is almost impossible for humans to identify it. In this context, researchers have tried to develop automated fake news detection using machine learning techniques over the past years. However, to the best of our knowledge, no prior study has proposed an automated fake news detection method for Korean news. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news content to be analyzed is converted to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). In the second step, classifiers are trained using the values produced in step 1. As classifiers, machine learning techniques such as logistic regression, backpropagation networks, support vector machines, and deep neural networks can be applied. To validate the effectiveness of the proposed method, we collected about 200 short Korean news items from Seoul National University’s FactCheck, which provides detailed analysis reports from 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.
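The two-step scheme in this abstract (text to numeric features, then a trained classifier) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses one common smoothed TF-IDF variant, logistic regression fitted by batch gradient descent, and four invented English token lists in place of real Korean news items.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Step 1: convert tokenized documents into TF-IDF weight vectors."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))  # document frequency
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[tok] / len(doc) * idf[tok] for tok in vocab])
    return vocab, vectors

def train_logistic(X, y, lr=0.5, epochs=500):
    """Step 2: fit logistic regression weights by batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, x):
    """Return 1 (fake) if the linear score is positive, else 0 (real)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

# Invented toy corpus (illustration only): label 1 = fake, 0 = real.
docs = [
    ["shocking", "secret", "cure", "revealed"],
    ["miracle", "cure", "shocking", "claim"],
    ["ministry", "publishes", "budget", "report"],
    ["court", "publishes", "ruling", "report"],
]
labels = [1, 1, 0, 0]

vocab, X = tfidf_vectors(docs)
w, b = train_logistic(X, labels)
print([predict(w, b, x) for x in X])
```

The sketch only checks predictions on its own training documents; a real evaluation would hold out test data and compare several classifiers, as the abstract proposes.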

Keywords: fake news detection, Korean news, machine learning, text mining

Procedia PDF Downloads 248
11216 Translation and Ideology: New Perspectives

Authors: Hamza Salih

Abstract:

Since translation is no longer viewed as a mere replacement of linguistic codes from one language to another, it has increasingly been considered, especially with the advent of the cultural turn in the late 70's, in relation to the broader external context in which it takes place. According to scholars in the field, the translation process is determined by the political, economic and cultural values which exert external pressures on the translator. Correspondingly, the relationship between translation as an act of re-writing the original text and ideology has already been established. This paper addresses the issue of how ideology comes into play in the translational process and what strategies the translator adopts to foreground or circumvent ideological constraints. Along with this, the paper will touch upon the notions of censorship, manipulation, subversion and domestication which are deemed of relevance to this very topic. In fact, after the domination of the empirically-oriented linguistic approaches in translation studies, the relationship between translation and ideology has to be foregrounded to draw attention to the fact that the translation process is not a mere text-to-text linguistic transfer, but, on the contrary, takes place in the midst of economic, political, cultural and religious variables, which some scholars subsume under the category ideology.

Keywords: translation, language, ideology, subversion, censorship and manipulation

Procedia PDF Downloads 227
11215 Robustness of the Deep Chroma Extractor and Locally-Normalized Quarter Tone Filters in Automatic Chord Estimation under Reverberant Conditions

Authors: Luis Alvarado, Victor Poblete, Isaac Gonzalez, Yetzabeth Gonzalez

Abstract:

In MIREX 2016 (http://www.music-ir.org/mirex), the Deep Chroma Extractor, a deep neural network (DNN) proposed by Korzeniowski and Widmer, reached the highest score in an audio chord recognition task. In the present paper, this tool is assessed under reverberant acoustic environments and at distinct source-microphone distances. The evaluation dataset comprises The Beatles and Queen datasets. These datasets are sequentially re-recorded with a single microphone in a real reverberant chamber at four reverberation times (0 -anechoic-, 1, 2, and 3 s, approximately), as well as four source-microphone distances (32, 64, 128, and 256 cm). It is expected that the performance of the trained DNN will decrease dramatically under these acoustic conditions, with signals degraded by room reverberation and distance to the source. Recently, the effect of the bio-inspired Locally-Normalized Cepstral Coefficients (LNCC) has been assessed in a text-independent speaker verification task using speech signals degraded by additive noise at different signal-to-noise ratios with variations of recording distance, and it has also been assessed under reverberant conditions with variations of recording distance. LNCC showed a performance as high as that of the state-of-the-art Mel Frequency Cepstral Coefficient filters. Based on these results, this paper proposes a variation of locally-normalized triangular filters called Locally-Normalized Quarter Tone (LNQT) filters. By using the LNQT spectrogram, robustness improvements of the trained Deep Chroma Extractor are expected compared with classical triangular filters, thus compensating for the music signal degradation and improving the accuracy of the chord recognition system.
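To make the quarter-tone spacing concrete, the sketch below computes center frequencies a quarter tone apart (24 steps per octave) and evaluates a plain triangular filter between neighboring centers. The reference frequency of 440 Hz and the function names are assumptions for illustration; the local normalization step that defines LNQT is not specified in the abstract and is deliberately left out.

```python
def quarter_tone_centers(f_ref, n_filters):
    """Center frequencies spaced a quarter tone (1/24 octave) apart, starting at f_ref."""
    return [f_ref * 2.0 ** (k / 24.0) for k in range(n_filters)]

def triangular_response(freq, left, center, right):
    """Weight of a triangular filter at freq: 0 at both edges, 1 at the center."""
    if freq <= left or freq >= right:
        return 0.0
    if freq <= center:
        return (freq - left) / (center - left)
    return (right - freq) / (right - center)

# 25 centers starting at an assumed reference of 440 Hz span exactly one octave.
centers = quarter_tone_centers(440.0, 25)
print(centers[0], centers[1], centers[24])

# Response of the filter centered on centers[1], whose edges are its two neighbors.
print(triangular_response(centers[1], centers[0], centers[1], centers[2]))
```

Each filter here spans from the previous center to the next one, which is the usual layout for a triangular filter bank; other edge choices are possible.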

Keywords: chord recognition, deep neural networks, feature extraction, music information retrieval

Procedia PDF Downloads 213