Search results for: text retrieval
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1571

Search results for: text retrieval

1091 Source Separation for Global Multispectral Satellite Images Indexing

Authors: Aymen Bouzid, Jihen Ben Smida

Abstract:

In this paper, we propose to prove the importance of the application of blind source separation methods on remote sensing data in order to index multispectral images. The proposed method starts with Gabor Filtering and the application of a Blind Source Separation to get a more effective representation of the information contained on the observation images. After that, a feature vector is extracted from each image in order to index them. Experimental results show the superior performance of this approach.

Keywords: blind source separation, content based image retrieval, feature extraction multispectral, satellite images

Procedia PDF Downloads 396
1090 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded presentations of natural language that in text form. However, it is unknown how much linguistic information is encoded and how. In this paper, we construct several corresponding probing tasks for multiple linguistic information to clarify the encoding capabilities of different language models and performed a visual display. We firstly obtain word presentations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small scale of parameters and unsupervised tasks are then applied on these word vectors to discriminate their capability to encode corresponding linguistic information. The constructed probe tasks contain both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the grammatical aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.

Keywords: language models, probing task, text presentation, linguistic information

Procedia PDF Downloads 101
1089 On ‘Freaks’ and the Feminine in Margaret Atwood’s ‘Lusus Naturae’

Authors: Shahd Alshammari

Abstract:

This paper considers one of Margaret Atwood’s short stories ‘Lusus Naturae'. Through a critical lens that makes use of Julia Kristeva’s work on Powers of Horror and abjection, this paper suggests that the monstrous girl is the disabled woman, the abject in society. The monster is used as a metaphor for the unknown, the misunderstood, and the ‘different’ woman. Culturally Relevant Teaching (CRT) is a pedagogy that calls for making course material accessible and relevant to students. Through the study of literary texts, we are able to help create agency inside and outside the classroom. Stories are a necessary part of establishing connections across borders and boundaries. Stories are meant to raise awareness both inside and outside the classroom. The discussion is equally important, and the text is meant to facilitate relevant questions that the students need to consider when it comes to identity. Questions to consider are: what does it mean to be a ‘girl’ today, and what implications and consequences are at hand when you fail to perform this gendered identity? Gender is sometimes a fatal bond in the Middle East, and even more so, is the disability. In the case of our unnamed protagonist, she undergoes a process of un-becoming, a non-linear process of growing up. In a sense, it is a counter-Bildungsroman. The reading of this text emphasizes that a non-linear narrative is sometimes necessary for the female protagonist’s self-awareness and development. Discussion in class facilitates this sense of agency and questioning of gender and disability.

Keywords: disability, gender, literature, pedagogy

Procedia PDF Downloads 653
1088 Identification of Coauthors in Scientific Database

Authors: Thiago M. R Dias, Gray F. Moita

Abstract:

The analysis of scientific collaboration networks has contributed significantly to improving the understanding of how does the process of collaboration between researchers and also to understand how the evolution of scientific production of researchers or research groups occurs. However, the identification of collaborations in large scientific databases is not a trivial task given the high computational cost of the methods commonly used. This paper proposes a method for identifying collaboration in large data base of curriculum researchers. The proposed method has low computational cost with satisfactory results, proving to be an interesting alternative for the modeling and characterization of large scientific collaboration networks.

Keywords: extraction, data integration, information retrieval, scientific collaboration

Procedia PDF Downloads 387
1087 The Use of Neuter in Oedipus Lines to Refer to Antigone in Phoenissae of Seneca

Authors: Cíntia Martins Sanches

Abstract:

In the first part of Phoenissae of Seneca, Antigone is a guide to Oedipus, and they leave Thebes: he is blind searching for death (inflicting the punishment himself wished on the killer of Laius, ie exile and death); she is trying to convince him to give up such punishment and bring him back to Thebes. Concerning Oedipus lines, we observed a high frequency of Latin neuter in the treatment the protagonist gave to his daughter Antigone. We considered in this study that such frequency may be related to the sanctification of the daughter, who is seen by him as an enlightened being and without defects, free of the human condition (which takes on the existence of failures by essence). This study, thus, puts forward an analysis of the passages the said feature is present, relating them to the effect of meaning found in each occurrence. As part of a doctorate, this study investigates the stylistic idiom of Seneca in the Oedipus and Phoenissae tragedies, aiming at translating both tragedies expressively. The concept of stylistic idiom concerns the stylistic affinity required for a translation to be equivalent to the source text. In this wise, this study inquires into how the Latin text is organized poetically, pointing out the expressive features frequently appearing in both dramas. The method we used is based on the Semiotics theory — observing how connotation, ie a language use in which prevails the poetic function, naturally polysemous, acts to achieve each expressive effect.

Keywords: antigone, neuter, Oedipus, Phoenissae, Seneca

Procedia PDF Downloads 282
1086 Ranking Priorities for Digital Health in Portugal: Aligning Health Managers’ Perceptions with Official Policy Perspectives

Authors: Pedro G. Rodrigues, Maria J. Bárrios, Sara A. Ambrósio

Abstract:

The digitalisation of health is a profoundly transformative economic, political, and social process. As is often the case, such processes need to be carefully managed if misunderstandings, policy misalignments, or outright conflicts between the government and a wide gamut of stakeholders with competing interests are to be avoided. Thus, ensuring open lines of communication where all parties know what each other’s concerns are is key to good governance, as well as efficient and effective policymaking. This project aims to make a small but still significant contribution in this regard in that we seek to determine the extent to which health managers’ perceptions of what is a priority for digital health in Portugal are aligned with official policy perspectives. By applying state-of-the-art artificial intelligence technology first to the indexed literature on digital health and then to a set of official policy documents on the same topic, followed by a survey directed at health managers working in public and private hospitals in Portugal, we obtain two priority rankings that, when compared, will allow us to produce a synthesis and toolkit on digital health policy in Portugal, with a view to identifying areas of policy convergence and divergence. This project is also particularly peculiar in the sense that sophisticated digital methods related to text analytics are employed to study good governance aspects of digitalisation applied to health care.

Keywords: digital health, health informatics, text analytics, governance, natural language understanding

Procedia PDF Downloads 57
1085 The Role of Digital Text in School and Vernacular Literacies: Students Digital Practices at Cybercafés in Mexico

Authors: Guadalupe López-Bonilla

Abstract:

Students of all educational levels participate in literacy practices that may involve print or digital media. Scholars from the New Literacy Studies distinguish practices that fulfill institutional purposes such as those established at schools from literate practices aimed at doing other kinds of activities, such as reading instructions in order to play a video game; the first are known as institutional practices while the latter are considered vernacular literacies. When students perform these kinds of activities they engage with print and digital media according to the demands of the task. In this paper, it is aimed to discuss the results of a research project focusing on literacy practices of high school students at 10 urban cybercafés in Mexico. The main objective was to analyze the literacy practices of students performing both school tasks and vernacular literacies. The methodology included a focused ethnography with online and face to face observations of 10 high school students (5 male and 5 female) and interviews after performing each task. In the results, it is presented how students treat texts as open, dynamic and relational artifacts when engaging in vernacular literacies; while texts are conceived as closed, authoritarian and fixed documents when performing school activities. Samples of each type of activity are shown followed by a discussion of the pedagogical implications for improving school literacy.

Keywords: digital literacy, text, school literacy, vernacular practices

Procedia PDF Downloads 265
1084 Information Extraction Based on Search Engine Results

Authors: Mohammed R. Elkobaisi, Abdelsalam Maatuk

Abstract:

The search engines are the large scale information retrieval tools from the Web that are currently freely available to all. This paper explains how to convert the raw resulted number of search engines into useful information. This represents a new method for data gathering comparing with traditional methods. When a query is submitted for a multiple numbers of keywords, this take a long time and effort, hence we develop a user interface program to automatic search by taking multi-keywords at the same time and leave this program to collect wanted data automatically. The collected raw data is processed using mathematical and statistical theories to eliminate unwanted data and converting it to usable data.

Keywords: search engines, information extraction, agent system

Procedia PDF Downloads 421
1083 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features

Authors: Kyi Pyar Zaw, Zin Mar Kyu

Abstract:

Character recognition is the process of converting a text image file into editable and searchable text file. Feature Extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed paper, 25 features for one character are used in character recognition. Basically, there are three steps of character recognition such as character segmentation, feature extraction and classification. In segmentation step, horizontal cropping method is used for line segmentation and vertical cropping method is used for character segmentation. In the Feature extraction step, features are extracted in two ways. The first way is that the 8 features are extracted from the entire input character using eight direction chain code frequency extraction. The second way is that the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a feature for one block. Therefore, 16 features are extracted from that 16 blocks in the second way. We use the number of holes feature to cluster the similar characters. We can recognize the almost Myanmar common characters with various font sizes by using these features. All these 25 features are used in both training part and testing part. In the classification step, the characters are classified by matching the all features of input character with already trained features of characters.

Keywords: chain code frequency, character recognition, feature extraction, features matching, segmentation

Procedia PDF Downloads 311
1082 Teaching Tolerance in the Language Classroom through a Text

Authors: Natalia Kasatkina

Abstract:

In an ever-increasing globalization, one’s grasp of diversity and tolerance has never been more indispensable, and it is a vital duty for all those in the field of foreign language teaching to help children cultivate such values. The present study explores the role of DIVERSITY and TOLERANCE in the language classroom and elementary, middle, and high school students’ perceptions of these two concepts. It draws on several theoretical domains of language acquisition, cultural awareness, and school psychology. Relying on these frameworks, the major findings are synthesized, and a paradigm of teaching tolerance through language-teaching is formulated. Upon analysing how tolerant our children are with ‘others’ in and outside the classroom, we have concluded that intolerance and aggression towards the ‘other’ increase with age, and that a feeling of supremacy over migrants and a sense of fear towards them begin to manifest more apparently when the students are in high school. In addition, we have also found that children in elementary school do not exhibit such prejudiced thoughts and behavior, which leads us to the believe that tolerance as well as intolerance are learned. Therefore, it is within our reach to teach our children to be open-minded and accepting. We have used the novel ‘Uncle Tom’s Cabin’ by Harriet Beecher Stowe as a springboard for lessons which are not only targeted at shedding light on the role of language in the modern world, but also aim to stimulate an awareness of cultural diversity. We equally strive to conduct further cross-cultural research in order to solidify the theory behind this study, and thus devise a language-based curriculum which would encourage tolerance through the examination of various literary texts.

Keywords: literary text, tolerance, EFL classroom, word-association test

Procedia PDF Downloads 287
1081 Multimodal Content: Fostering Students’ Language and Communication Competences

Authors: Victoria L. Malakhova

Abstract:

The research is devoted to multimodal content and its effectiveness in developing students’ linguistic and intercultural communicative competences as an indefeasible constituent of their future professional activity. Description of multimodal content both as a linguistic and didactic phenomenon makes the study relevant. The objective of the article is the analysis of creolized texts and the effect they have on fostering higher education students’ skills and their productivity. The main methods used are linguistic text analysis, qualitative and quantitative methods, deduction, generalization. The author studies texts with full and partial creolization, their features and role in composing multimodal textual space. The main verbal and non-verbal markers and paralinguistic means that enhance the linguo-pragmatic potential of creolized texts are covered. To reveal the efficiency of multimodal content application in English teaching, the author conducts an experiment among both undergraduate students and teachers. This allows specifying main functions of creolized texts in the process of language learning, detecting ways of enhancing students’ competences, and increasing their motivation. The described stages of using creolized texts can serve as an algorithm for work with multimodal content in teaching English as a foreign language. The findings contribute to improving the efficiency of the academic process.

Keywords: creolized text, English language learning, higher education, language and communication competences, multimodal content

Procedia PDF Downloads 109
1080 Empirical Study on Factors Influencing SEO

Authors: Pakinee Aimmanee, Phoom Chokratsamesiri

Abstract:

Search engine has become an essential tool nowadays for people to search for their needed information on the internet. In this work, we evaluate the performance of the search engine from three factors: the keyword frequency, the number of inbound links, and the difficulty of the keyword. The evaluations are based on the ranking position and the number of days that Google has seen or detect the webpage. We find that the keyword frequency and the difficulty of the keyword do not affect the Google ranking where the number of inbound links gives remarkable improvement of the ranking position. The optimal number of inbound links found in the experiment is 10.

Keywords: SEO, information retrieval, web search, knowledge technologies

Procedia PDF Downloads 277
1079 A Postmodern Framework for Quranic Hermeneutics

Authors: Christiane Paulus

Abstract:

Post-Islamism assumes that the Quran should not be viewed in terms of what Lyotard identifies as a ‘meta-narrative'. However, its socio-ethical content can be viewed as critical of power discourse (Foucault). Practicing religion seems to be limited to rites and individual spirituality, taqwa. Alternatively, can we build on Muhammad Abduh's classic-modern reform and develop it through a postmodernist frame? This is the main question of this study. Through his general and vague remarks on the context of the Quran, Abduh was the first to refer to the historical and cultural distance of the text as an obstacle for interpretation. His application, however, corresponded to the modern absolute idea of authentic sharia. He was followed by Amin al-Khuli, who hermeneutically linked the content of the Quran to the theory of evolution. Fazlur Rahman and Nasr Hamid abu Zeid remain reluctant to go beyond the general level in terms of context. The hermeneutic circle, therefore, persists in challenging, how to get out to overcome one’s own assumptions. The insight into and the acceptance of the lasting ambivalence of understanding can be grasped as a postmodern approach; it is documented in Derrida's discovery of the shift in text meanings, difference, also in Lyotard's theory of différend. The resulting mixture of meanings (Wolfgang Welsch) can be read together with the classic ambiguity of the premodern interpreters of the Quran (Thomas Bauer). Confronting hermeneutic difficulties in general, Niklas Luhmann proves every description an attribution, tautology, i.e., remaining in the circle. ‘De-tautologization’ is possible, namely by analyzing the distinctions in the sense of objective, temporal and social information that every text contains. This could be expanded with the Kantian aesthetic dimension of reason (critique of pure judgment) corresponding to the iʽgaz of the Coran. Luhmann asks, ‘What distinction does the observer/author make?’ Quran as a speech from God to the first listeners could be seen as a discourse responding to the problems of everyday life of that time, which can be viewed as the general goal of the entire Qoran. Through reconstructing koranic Lifeworlds (Alfred Schütz) in detail, the social structure crystallizes the socio-economic differences, the enormous poverty. The koranic instruction to provide the basic needs for the neglected groups, which often intersect (old, poor, slaves, women, children), can be seen immediately in the text. First, the references to lifeworlds/social problems and discourses in longer koranic passages should be hypothesized. Subsequently, information from the classic commentaries could be extracted, the classical Tafseer, in particular, contains rich narrative material for reconstructing. By selecting and assigning suitable, specific context information, the meaning of the description becomes condensed (Clifford Geertz). In this manner, the text gets necessarily an alienation and is newly accessible. The socio-ethical implications can thus be grasped from the difference of the original problem and the revealed/improved order/procedure; this small step can be materialized as such, not as an absolute solution but as offering plausible patterns for today’s challenges as the Agenda 2030.

Keywords: postmodern hermeneutics, condensed description, sociological approach, small steps of reform

Procedia PDF Downloads 214
1078 Neurophysiology of Domain Specific Execution Costs of Grasping in Working Memory Phases

Authors: Rumeysa Gunduz, Dirk Koester, Thomas Schack

Abstract:

Previous behavioral studies have shown that working memory (WM) and manual actions share limited capacity cognitive resources, which in turn results in execution costs of manual actions in WM. However, to the best of our knowledge, there is no study investigating the neurophysiology of execution costs. The current study aims to fill this research gap investigating the neurophysiology of execution costs of grasping in WM phases (encoding, maintenance, retrieval) considering verbal and visuospatial domains of WM. A WM-grasping dual task paradigm was implemented to examine execution costs. Baseline single task required performing verbal or visuospatial version of a WM task. Dual task required performing the WM task embedded in a high precision grasp to place task. 30 participants were tested in a 2 (single vs. dual task) x 2 (visuo-spatial vs. verbal WM) within subject design. Event related potentials (ERPs) were extracted for each WM phase separately in the single and dual tasks. Memory performance for visuospatial WM, but not for verbal WM, was significantly lower in the dual task compared to the single task. Encoding related ERPs in the single task revealed different ERPs of verbal WM and visuospatial WM at bilateral anterior sites and right posterior site. In the dual task, bilateral anterior difference disappeared due to bilaterally increased anterior negativities for visuospatial WM. Maintenance related ERPs in the dual task revealed different ERPs of verbal WM and visuospatial WM at bilateral posterior sites. There was also anterior negativity for visuospatial WM. Retrieval related ERPs in the single task revealed different ERPs of verbal WM and visuospatial WM at bilateral posterior sites. In the dual task, there was no difference between verbal WM and visuospatial WM. Behavioral and ERP findings suggest that execution of grasping shares cognitive resources only with visuospatial WM, which in turn results in domain specific execution costs. Moreover, ERP findings suggest unique patterns of costs in each WM phase, which supports the idea that each WM phase reflects a separate cognitive process. This study not only contributes to the understanding of cognitive principles of manual action control, but also contributes to the understanding of WM as an entity consisting of separate modalities and cognitive processes.

Keywords: dual task, grasping execution, neurophysiology, working memory domains, working memory phases

Procedia PDF Downloads 419
1077 Alphabet Recognition Using Pixel Probability Distribution

Authors: Vaidehi Murarka, Sneha Mehta, Dishant Upadhyay

Abstract:

Our project topic is “Alphabet Recognition using pixel probability distribution”. The project uses techniques of Image Processing and Machine Learning in Computer Vision. Alphabet recognition is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files etc. Alphabet Recognition based OCR application is sometimes used in signature recognition which is used in bank and other high security buildings. One of the popular mobile applications includes reading a visiting card and directly storing it to the contacts. OCR's are known to be used in radar systems for reading speeders license plates and lots of other things. The implementation of our project has been done using Visual Studio and Open CV (Open Source Computer Vision). Our algorithm is based on Neural Networks (machine learning). The project was implemented in three modules: (1) Training: This module aims “Database Generation”. Database was generated using two methods: (a) Run-time generation included database generation at compilation time using inbuilt fonts of OpenCV library. Human intervention is not necessary for generating this database. (b) Contour–detection: ‘jpeg’ template containing different fonts of an alphabet is converted to the weighted matrix using specialized functions (contour detection and blob detection) of OpenCV. The main advantage of this type of database generation is that the algorithm becomes self-learning and the final database requires little memory to be stored (119kb precisely). (2) Preprocessing: Input image is pre-processed using image processing concepts such as adaptive thresholding, binarizing, dilating etc. and is made ready for segmentation. “Segmentation” includes extraction of lines, words, and letters from the processed text image. (3) Testing and prediction: The extracted letters are classified and predicted using the neural networks algorithm. The algorithm recognizes an alphabet based on certain mathematical parameters calculated using the database and weight matrix of the segmented image.

Keywords: contour-detection, neural networks, pre-processing, recognition coefficient, runtime-template generation, segmentation, weight matrix

Procedia PDF Downloads 380
1076 Structured-Ness and Contextual Retrieval Underlie Language Comprehension

Authors: Yao-Ying Lai, Maria Pinango, Ashwini Deo

Abstract:

While grammatical devices are essential to language processing, how comprehension utilizes cognitive mechanisms is less emphasized. This study addresses this issue by probing the complement coercion phenomenon: an entity-denoting complement following verbs like begin and finish receives an eventive interpretation. For example, (1) “The queen began the book” receives an agentive reading like (2) “The queen began [reading/writing/etc.…] the book.” Such sentences engender additional processing cost in real-time comprehension. The traditional account attributes this cost to an operation that coerces the entity-denoting complement to an event, assuming that these verbs require eventive complements. However, in closer examination, examples like “Chapter 1 began the book” undermine this assumption. An alternative, Structured Individual (SI) hypothesis, proposes that the complement following aspectual verbs (AspV; e.g. begin, finish) is conceptualized as a structured individual, construed as an axis along various dimensions (e.g. spatial, eventive, temporal, informational). The composition of an animate subject and an AspV such as (1) engenders an ambiguity between an agentive reading along the eventive dimension like (2), and a constitutive reading along the informational/spatial dimension like (3) “[The story of the queen] began the book,” in which the subject is interpreted as a subpart of the complement denotation. Comprehenders need to resolve the ambiguity by searching contextual information, resulting in additional cost. To evaluate the SI hypothesis, a questionnaire was employed. Method: Target AspV sentences such as “Shakespeare began the volume.” were preceded by one of the following types of context sentence: (A) Agentive-biasing, in which an event was mentioned (…writers often read…), (C) Constitutive-biasing, in which a constitutive meaning was hinted (Larry owns collections of Renaissance literature.), (N) Neutral context, which allowed both interpretations. Thirty-nine native speakers of English were asked to (i) rate each context-target sentence pair from a 1~5 scale (5=fully understandable), and (ii) choose possible interpretations for the target sentence given the context. The SI hypothesis predicts that comprehension is harder for the Neutral condition, as compared to the biasing conditions because no contextual information is provided to resolve an ambiguity. Also, comprehenders should obtain the specific interpretation corresponding to the context type. Results: (A) Agentive-biasing and (C) Constitutive-biasing were rated higher than (N) Neutral conditions (p< .001), while all conditions were within the acceptable range (> 3.5 on the 1~5 scale). This suggests that when lacking relevant contextual information, semantic ambiguity decreases comprehensibility. The interpretation task shows that the participants selected the biased agentive/constitutive reading for condition (A) and (C) respectively. For the Neutral condition, the agentive and constitutive readings were chosen equally often. Conclusion: These findings support the SI hypothesis: the meaning of AspV sentences is conceptualized as a parthood relation involving structured individuals. We argue that semantic representation makes reference to spatial structured-ness (abstracted axis). To obtain an appropriate interpretation, comprehenders utilize contextual information to enrich the conceptual representation of the sentence in question. This study connects semantic structure to human’s conceptual structure, and provides a processing model that incorporates contextual retrieval.

Keywords: ambiguity resolution, contextual retrieval, spatial structured-ness, structured individual

Procedia PDF Downloads 326
1075 Improving Topic Quality of Scripts by Using Scene Similarity Based Word Co-Occurrence

Authors: Yunseok Noh, Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park

Abstract:

Scripts are one of the basic text resources to understand broadcasting contents. Since broadcast media wields lots of influence over the public, tools for understanding broadcasting contents are more required. Topic modeling is the method to get the summary of the broadcasting contents from its scripts. Generally, scripts represent contents descriptively with directions and speeches. Scripts also provide scene segments that can be seen as semantic units. Therefore, a script can be topic modeled by treating a scene segment as a document. Because scripts consist of speeches mainly, however, relatively small co-occurrences among words in the scene segments are observed. This causes inevitably the bad quality of topics based on statistical learning method. To tackle this problem, we propose a method of learning with additional word co-occurrence information obtained using scene similarities. The main idea of improving topic quality is that the information that two or more texts are topically related can be useful to learn high quality of topics. In addition, by using high quality of topics, we can get information more accurate whether two texts are related or not. In this paper, we regard two scene segments are related if their topical similarity is high enough. We also consider that words are co-occurred if they are in topically related scene segments together. In the experiments, we showed the proposed method generates a higher quality of topics from Korean drama scripts than the baselines.

Keywords: broadcasting contents, scripts, text similarity, topic model

Procedia PDF Downloads 314
1074 Treating Voxels as Words: Word-to-Vector Methods for fMRI Meta-Analyses

Authors: Matthew Baucum

Abstract:

With the increasing popularity of fMRI as an experimental method, psychology and neuroscience can greatly benefit from advanced techniques for summarizing and synthesizing large amounts of data from brain imaging studies. One promising avenue is automated meta-analyses, in which natural language processing methods are used to identify the brain regions consistently associated with certain semantic concepts (e.g. “social”, “reward’) across large corpora of studies. This study builds on this approach by demonstrating how, in fMRI meta-analyses, individual voxels can be treated as vectors in a semantic space and evaluated for their “proximity” to terms of interest. In this technique, a low-dimensional semantic space is built from brain imaging study texts, allowing words in each text to be represented as vectors (where words that frequently appear together are near each other in the semantic space). Consequently, each voxel in a brain mask can be represented as a normalized vector sum of all of the words in the studies that showed activation in that voxel. The entire brain mask can then be visualized in terms of each voxel’s proximity to a given term of interest (e.g., “vision”, “decision making”) or collection of terms (e.g., “theory of mind”, “social”, “agent”), as measured by the cosine similarity between the voxel’s vector and the term vector (or the average of multiple term vectors). Analysis can also proceed in the opposite direction, allowing word cloud visualizations of the nearest semantic neighbors for a given brain region. This approach allows for continuous, fine-grained metrics of voxel-term associations, and relies on state-of-the-art “open vocabulary” methods that go beyond mere word-counts. An analysis of over 11,000 neuroimaging studies from an existing meta-analytic fMRI database demonstrates that this technique can be used to recover known neural bases for multiple psychological functions, suggesting this method’s utility for efficient, high-level meta-analyses of localized brain function. While automated text analytic methods are no replacement for deliberate, manual meta-analyses, they seem to show promise for the efficient aggregation of large bodies of scientific knowledge, at least on a relatively general level.

Keywords: FMRI, machine learning, meta-analysis, text analysis

Procedia PDF Downloads 441
1073 Book Recommendation Using Query Expansion and Information Retrieval Methods

Authors: Ritesh Kumar, Rajendra Pamula

Abstract:

In this paper, we present our contribution for book recommendation. In our experiment, we combine the results of Sequential Dependence Model (SDM) and exploitation of book information such as reviews, tags and ratings. This social information is assigned by users. For this, we used CLEF-2016 Social Book Search Track Suggestion task. Finally, our proposed method extensively evaluated on CLEF -2015 Social Book Search datasets, and has better performance (nDCG@10) compared to other state-of-the-art systems. Recently we got the good performance in CLEF-2016.

Keywords: sequential dependence model, social information, social book search, query expansion

Procedia PDF Downloads 284
1072 A Systematic Review: Prevalence and Risk Factors of Low Back Pain among Waste Collection Workers

Authors: Benedicta Asante, Brenna Bath, Olugbenga Adebayo, Catherine Trask

Abstract:

Background: Waste Collection Workers’ (WCWs) activities contribute greatly to the recycling sector and are an important component of the waste management industry. As the recycling sector evolves, reports of injuries and fatal accidents in the industry demand notice particularly common and debilitating musculoskeletal disorders such as low back pain (LBP). WCWs are likely exposed to diverse work-related hazards that could contribute to LBP. However, to our knowledge there has never been a systematic review or other synthesis of LBP findings within this workforce. The aim of this systematic review was to determine the prevalence and risk factors of LBP among WCWs. Method: A comprehensive search was conducted in Ovid Medline, EMBASE, and Global Health e-publications with search term categories ‘low back pain’ and ‘waste collection workers’. Articles were screened at title, abstract, and full-text stages by two reviewers. Data were extracted on study design, sampling strategy, socio-demographic, geographical region, and exposure definition, definition of LBP, risk factors, response rate, statistical techniques, and LBP prevalence. Risk of bias (ROB) was assessed based on Hoy Damien’s ROB scale. Results: The search of three databases generated 79 studies. Thirty-two studies met the study inclusion criteria for both title and abstract; thirteen full-text articles met the study criteria at the full-text stage. Seven articles (54%) reported prevalence within 12 months of LBP between 42-82% among WCW. The major risk factors for LBP among WCW included: awkward posture; lifting; pulling; pushing; repetitive motions; work duration; and physical loads. Summary data and syntheses of findings was presented in trend-lines and tables to establish the several prevalence periods based on age and region distribution. Public health implications: LBP is a major occupational hazard among WCWs. In light of these risks and future growth in this industry, further research should focus on more detail ergonomic exposure assessment and LBP prevention efforts.

Keywords: low back pain, scavenger, waste collection workers, waste pickers

Procedia PDF Downloads 318
1071 A BERT-Based Model for Financial Social Media Sentiment Analysis

Authors: Josiel Delgadillo, Johnson Kinyua, Charles Mutigwe

Abstract:

The purpose of sentiment analysis is to determine the sentiment strength (e.g., positive, negative, neutral) from a textual source for good decision-making. Natural language processing in domains such as financial markets requires knowledge of domain ontology, and pre-trained language models, such as BERT, have made significant breakthroughs in various NLP tasks by training on large-scale un-labeled generic corpora such as Wikipedia. However, sentiment analysis is a strong domain-dependent task. The rapid growth of social media has given users a platform to share their experiences and views about products, services, and processes, including financial markets. StockTwits and Twitter are social networks that allow the public to express their sentiments in real time. Hence, leveraging the success of unsupervised pre-training and a large amount of financial text available on social media platforms could potentially benefit a wide range of financial applications. This work is focused on sentiment analysis using social media text on platforms such as StockTwits and Twitter. To meet this need, SkyBERT, a domain-specific language model pre-trained and fine-tuned on financial corpora, has been developed. The results show that SkyBERT outperforms current state-of-the-art models in financial sentiment analysis. Extensive experimental results demonstrate the effectiveness and robustness of SkyBERT.

Keywords: BERT, financial markets, Twitter, sentiment analysis

Procedia PDF Downloads 148
1070 Mining User-Generated Contents to Detect Service Failures with Topic Model

Authors: Kyung Bae Park, Sung Ho Ha

Abstract:

Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and a pressing need to handle the overwhelmingly plethora amount of various UGC is one of the paramount issues for management. However, a current approach (e.g., sentiment analysis) is often ineffective for leveraging textual information to detect the problems or issues that a certain management suffers from. In this paper, we employ text mining of Latent Dirichlet Allocation (LDA) on a popular online review site dedicated to complaint from users. We find that the employed LDA efficiently detects customer complaints, and a further inspection with the visualization technique is effective to categorize the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given the limited amount of resources. The findings provide managerial insights into how analytics on social media can help maintain and improve their reputation management. Our interdisciplinary approach also highlights several insights by applying machine learning techniques in marketing research domain. On a broader technical note, this paper illustrates the details of how to implement LDA in R program from a beginning (data collection in R) to an end (LDA analysis in R) since the instruction is still largely undocumented. In this regard, it will help lower the boundary for interdisciplinary researcher to conduct related research.

Keywords: latent dirichlet allocation, R program, text mining, topic model, user generated contents, visualization

Procedia PDF Downloads 184
1069 On the Relationship between the Concepts of "[New] Social Democracy" and "Democratic Socialism"

Authors: Gintaras Mitrulevičius

Abstract:

This text, which is based on the conference report, seeks to briefly examine the relationship between the concepts of social democracy and democratic socialism, drawing attention to the essential aspects of its development and, in particular, discussing the contradictions in the relationship between these concepts in the modern period. In the preparation of this text, such research methods as historical, historical-comparative methods were used, as well as methods of analyzing, synthesizing, and generalizing texts. The history of the use of terms in social democracy and democratic socialism shows that these terms were used alternately and almost synonymously. At the end of the 20th century, traditional social democracy was transformed into the so-called "new social democracy." Many of the new social democrats do not consider themselves democratic socialists and avoid the historically characteristic identification of social democracy with democratic socialism. It has become quite popular to believe that social democracy is a separate ideology from democratic socialism. Or that it has become a variant of the ideology of liberalism. This is a testimony to the crisis of ideological self-awareness of social democracy. Since the beginning of the 21st century, social democracy has also experienced a growing crisis of electoral support. This, among other things, led to her slight shift to the left. In this context, some social democrats are once again talking about democratic socialism. The rise of the ideas of democratic socialism in the United States was catalyzed by Bernie Sanders. But the proponents of democratic socialism in the United States have different concepts of democratic socialism. In modern Europe, democratic socialism is also spoken of by leftists of non-social democratic origin, whose understanding is different from that of democratic socialism inherent in classical social democracy. Some political scientists also single out the concepts in question. Analysis of the problem shows that there are currently several concepts of democratic socialism on the spectrum of the political left, both social-democratic and non-social-democratic.

Keywords: democratic socializm, socializm, social democracy, new social democracy, political ideologies

Procedia PDF Downloads 107
1068 Examining Reading Comprehension Skills Based on Different Reading Comprehension Frameworks and Taxonomies

Authors: Seval Kula-Kartal

Abstract:

Developing students’ reading comprehension skills is an aim that is difficult to accomplish and requires to follow long-term and systematic teaching and assessment processes. In these processes, teachers need tools to provide guidance to them on what reading comprehension is and which comprehension skills they should develop. Due to a lack of clear and evidence-based frameworks defining reading comprehension skills, especially in Turkiye, teachers and students mostly follow various processes in the classrooms without having an idea about what their comprehension goals are and what those goals mean. Since teachers and students do not have a clear view of comprehension targets, strengths, and weaknesses in students’ comprehension skills, the formative feedback processes cannot be managed in an effective way. It is believed that detecting and defining influential comprehension skills may provide guidance both to teachers and students during the feedback process. Therefore, in the current study, some of the reading comprehension frameworks that define comprehension skills operationally were examined. The aim of the study is to develop a simple and clear framework that can be used by teachers and students during their teaching, learning, assessment, and feedback processes. The current study is qualitative research in which documents related to reading comprehension skills were analyzed. Therefore, the study group consisted of recourses and frameworks which made big contributions to theoretical and operational definitions of reading comprehension. A content analysis was conducted on the resources included in the study group. To determine the validity of the themes and sub-categories revealed as the result of content analysis, three educational assessment experts were asked to examine the content analysis results. The Fleiss’ Cappa coefficient revealed that there is consistency among themes and categories defined by three different experts. The content analysis of the reading comprehension frameworks revealed that comprehension skills could be examined under four different themes. The first and second themes focus on understanding information given explicitly or implicitly within a text. The third theme includes skills used by the readers to make connections between their personal knowledge and the information given in the text. Lastly, the fourth theme focus on skills used by readers to examine the text with a critical view. The results suggested that fundamental reading comprehension skills can be examined under four themes. Teachers are recommended to use these themes in their reading comprehension teaching and assessment processes. Acknowledgment: This research is supported by Pamukkale University Scientific Research Unit within the project, whose title is Developing A Reading Comprehension Rubric.

Keywords: reading comprehension, assessing reading comprehension, comprehension taxonomies, educational assessment

Procedia PDF Downloads 80
1067 Retrieval of Aerosol Optical Depth and Correlation Analysis of PM2.5 Based on GF-1 Wide Field of View Images

Authors: Bo Wang

Abstract:

This paper proposes a method that can estimate PM2.5 by the images of GF-1 Satellite that called WFOV images (Wide Field of View). AOD (Aerosol Optical Depth) over land surfaces was retrieved in Shanghai area based on DDV (Dark Dense Vegetation) method. PM2.5 information, gathered from ground monitoring stations hourly, was fitted with AOD using different polynomial coefficients, and then the correlation coefficient between them was calculated. The results showed that, the GF-1 WFOV images can meet the requirement of retrieving AOD, and the correlation coefficient between the retrieved AOD and PM2.5 was high. If more detailed and comprehensive data is provided, the accuracy could be improved and the parameters can be more precise in the future.

Keywords: remote sensing retrieve, PM 2.5, GF-1, aerosol optical depth

Procedia PDF Downloads 239
1066 Translation as a Cultural Medium: Understanding the Mauritian Culture and History through an English Translation

Authors: Pooja Booluck

Abstract:

This project seeks to translate a chapter in Le Silence des Chagos by Shenaz Patel a Mauritian author whose work has never been translated before. The chapter discusses the attempt of the protagonist to return to her home country Diego Garcia after her deportation. The English translation will offer an historical account to the target audience of the deportation of Chagossians to Mauritius during the 1970s. The target audience comprises of English-speaking translation scholars translation students and African literature scholars. In light of making the cultural elements of Mauritian culture accessible the translation will maintain the cultural items such as food and oral discourses in Creole so as to preserve the authenticity of the source culture. In order to better comprehend the cultural elements mentioned the target reader will be provided with detailed footnotes explaining the cultural and historical references. This translation will also address the importance of folkloric songs in Mauritius and its intergenerational function in Mauritian communities which will also remain in Creole. While such an approach will help to preserve the meaning of the source text the borrowing technique and the foreignizing method will be employed which will in turn help the reader in becoming more familiar with the Mauritian community. Translating a text from French to English while maintaining certain words or discourses in a minority language such as Creole bears certain challenges: How does the translator ensure the comprehensibility of the reader? Are there any translation losses? What are the choices of the translator?

Keywords: Chagos archipelagos in Exile, English translation, Le Silence des Chagos, Mauritian culture and history

Procedia PDF Downloads 311
1065 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Su-Hyeon Jeon, ByeoungKug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually provided with a specific category for the convenience of the users. In the past, the categorization was performed manually. However, in the case of manual categorization, not only can the accuracy of the categorization be not guaranteed but the categorization also requires a large amount of time and huge costs. Many studies have been conducted towards the automatic creation of categories to solve the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorizing complex documents with multiple topics because the methods work by assuming that one document can be categorized into one category only. In order to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process involves training using a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To overcome the limitation of the requirement of a multi-categorized training set by traditional multi-categorization algorithms, we previously proposed a new methodology that can extend a category of a single-categorized document to multiple categorizes by analyzing relationships among categories, topics, and documents. In this paper, we design a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: big data analysis, document classification, multi-category, text mining, topic analysis

Procedia PDF Downloads 265
1064 Cognitive Translation and Conceptual Wine Tasting Metaphors: A Corpus-Based Research

Authors: Christine Demaecker

Abstract:

Many researchers have underlined the importance of metaphors in specialised language. Their use of specific domains helps us understand the conceptualisations used to communicate new ideas or difficult topics. Within the wide area of specialised discourse, wine tasting is a very specific example because it is almost exclusively metaphoric. Wine tasting metaphors express various conceptualisations. They are not linguistic but rather conceptual, as defined by Lakoff & Johnson. They correspond to the linguistic expression of a mental projection from a well-known or more concrete source domain onto the target domain, which is the taste of wine. But unlike most specialised terminologies, the vocabulary is never clearly defined. When metaphorical terms are listed in dictionaries, their definitions remain vague, unclear, and circular. They cannot be replaced by literal linguistic expressions. This makes it impossible to transfer them into another language with the traditional linguistic translation methods. Qualitative research investigates whether wine tasting metaphors could rather be translated with the cognitive translation process, as well described by Nili Mandelblit (1995). The research is based on a corpus compiled from two high-profile wine guides; the Parker’s Wine Buyer’s Guide and its translation into French and the Guide Hachette des Vins and its translation into English. In this small corpus with a total of 68,826 words, 170 metaphoric expressions have been identified in the original English text and 180 in the original French text. They have been selected with the MIPVU Metaphor Identification Procedure developed at the Vrije Universiteit Amsterdam. The selection demonstrates that both languages use the same set of conceptualisations, which are often combined in wine tasting notes, creating conceptual integrations or blends. The comparison of expressions in the source and target texts also demonstrates the use of the cognitive translation approach. In accordance with the principle of relevance, the translation always uses target language conceptualisations, but compared to the original, the highlighting of the projection is often different. Also, when original metaphors are complex with a combination of conceptualisations, at least one element of the original metaphor underlies the target expression. This approach perfectly integrates into Lederer’s interpretative model of translation (2006). In this triangular model, the transfer of conceptualisation could be included at the level of ‘deverbalisation/reverbalisation’, the crucial stage of the model, where the extraction of meaning combines with the encyclopedic background to generate the target text.

Keywords: cognitive translation, conceptual integration, conceptual metaphor, interpretative model of translation, wine tasting metaphor

Procedia PDF Downloads 128
1063 The Effect of Metacognitive Think-Aloud Strategy on Form 1 Pupils’ Reading Comprehension Skills via DELIMa Platform

Authors: Fatin Khairani Khairul 'Azam

Abstract:

Reading comprehension requires the formation of an articulate mental representation of the information in a text. It involves three interdepended elements—the reader, the text, and the activity, all situated into an extensive sociocultural context. Incorporating metacognitive think-aloud strategy into teaching reading comprehension would improve learners’ reading comprehension skills as it helps to monitor their thinking as they read. Furthermore, by integrating Digital Educational Learning Initiative Malaysia (DELIMa) platform in teaching reading comprehension, it can make the process interactive and fun. A quasi-experimental one-group pre-test post-test design was used to identify the effectiveness of using metacognitive think-aloud strategy via DELIMa platform in improving pupils’ reading comprehension performance and their perceptions towards reading comprehension. The participants of the study comprised 82 of form 1 pupils from a secondary school in Pasir Gudang, Johor, Malaysia. All participants were required to sit for pre-and post-tests to track their reading comprehension performance and perceptions. The findings revealed that incorporating metacognitive think-aloud strategy is an effective strategy in teaching reading comprehension as the performance of pupils in reading comprehension and their perceptions towards reading comprehension were improved during the post tests. It is hoped that the findings of the study would be useful to the teachers incorporating the same strategy in teaching to improve pupils' reading skills. It is suggested that future study should involve the motivation factor of the participants on incorporating think-aloud strategy into teaching reading comprehension as well.

Keywords: DELIMa Platform, ESL Learners, Metacognitive Strategy, Pupils' Perceptions, Reading Comprehension, Think-Aloud Strategy

Procedia PDF Downloads 201
1062 Methodologies for Deriving Semantic Technical Information Using an Unstructured Patent Text Data

Authors: Jaehyung An, Sungjoo Lee

Abstract:

Patent documents constitute an up-to-date and reliable source of knowledge for reflecting technological advance, so patent analysis has been widely used for identification of technological trends and formulation of technology strategies. But, identifying technological information from patent data entails some limitations such as, high cost, complexity, and inconsistency because it rely on the expert’ knowledge. To overcome these limitations, researchers have applied to a quantitative analysis based on the keyword technique. By using this method, you can include a technological implication, particularly patent documents, or extract a keyword that indicates the important contents. However, it only uses the simple-counting method by keyword frequency, so it cannot take into account the sematic relationship with the keywords and sematic information such as, how the technologies are used in their technology area and how the technologies affect the other technologies. To automatically analyze unstructured technological information in patents to extract the semantic information, it should be transformed into an abstracted form that includes the technological key concepts. Specific sentence structure ‘SAO’ (subject, action, object) is newly emerged by representing ‘key concepts’ and can be extracted by NLP (Natural language processor). An SAO structure can be organized in a problem-solution format if the action-object (AO) states that the problem and subject (S) form the solution. In this paper, we propose the new methodology that can extract the SAO structure through technical elements extracting rules. Although sentence structures in the patents text have a unique format, prior studies have depended on general NLP (Natural language processor) applied to the common documents such as newspaper, research paper, and twitter mentions, so it cannot take into account the specific sentence structure types of the patent documents. To overcome this limitation, we identified a unique form of the patent sentences and defined the SAO structures in the patents text data. There are four types of technical elements that consist of technology adoption purpose, application area, tool for technology, and technical components. These four types of sentence structures from patents have their own specific word structure by location or sequence of the part of speech at each sentence. Finally, we developed algorithms for extracting SAOs and this result offer insight for the technology innovation process by providing different perspectives of technology.

Keywords: NLP, patent analysis, SAO, semantic-analysis

Procedia PDF Downloads 259