Search results for: topic familiarity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1442

Search results for: topic familiarity

1442 Effects of Topic Familiarity on Linguistic Aspects in EFL Learners’ Writing Performance

Authors: Jeong-Won Lee, Kyeong-Ok Yoon

Abstract:

The current study aimed to investigate the effects of topic familiarity and language proficiency on linguistic aspects (lexical complexity, syntactic complexity, accuracy, and fluency) in EFL learners’ argumentative essays. For the study 64 college students were asked to write an argumentative essay for the two different topics (Driving and Smoking) chosen by the consideration of topic familiarity. The students were divided into two language proficiency groups (high-level and intermediate) according to their English writing proficiency. The findings of the study are as follows: 1) the participants of this study exhibited lower levels of lexical and syntactic complexity as well as accuracy when performing writing tasks with unfamiliar topics; and 2) they demonstrated the use of a wider range of vocabulary, and longer and more complex structures, and produced accurate and lengthier texts compared to their intermediate peers. Discussion and pedagogical implications for instruction of writing classes in EFL contexts were addressed.

Keywords: topic familiarity, complexity, accuracy, fluency

Procedia PDF Downloads 19
1441 The Impact of Content Familiarity of Receptive Skills on Language Learning

Authors: Sara Fallahi

Abstract:

This paper reviews the importance of content familiarity of receptive skills and offers solutions to the issue of content unfamiliarity in language learning materials. Presently, language learning materials are mainly comprised of global issues and target language speakers’ culture(s) in receptive skills. This might leadlearners to focus on content rather than the language. As a solution, materials on receptive skills can be developed with a focus on learners’culture and social concerns, especially in the beginner levels of learning. Language learners often learn their target language through the receptive skills of listening and reading before language production ensues through speaking and writing. Students’ journey from receptive skills to productive skills is mainly concentrated on by teachers. There are barriers to language learning, such as time and energy, that can hinder learners’ understanding and ability to build the required background knowledge of the content. This is generated due to learners’ unfamiliarity with the skill’s content. Therefore, materials that improve content familiarity will help learners improve their language comprehension, learning, and usage. This presentation will conclude with practical solutions to help teachers and learners more authentically integrate language and culture to elevate language learning.

Keywords: language learning, listening content, reading content, content familiarity, ESL books, language learning books, cultural familiarity

Procedia PDF Downloads 76
1440 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.

Keywords: independent topic analysis, topic extraction, topic naming, web search engine

Procedia PDF Downloads 95
1439 Lecture Video Indexing and Retrieval Using Topic Keywords

Authors: B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa

Abstract:

In this paper, we propose a framework to help users to search and retrieve the portions in the lecture video of their interest. This is achieved by temporally segmenting and indexing the lecture video using the topic keywords. We use transcribed text from the video and documents relevant to the video topic extracted from the web for this purpose. The keywords for indexing are found by applying the non-negative matrix factorization (NMF) topic modeling techniques on the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end time of the portions of the video for a particular topic. This time information is stored in the index table along with the topic keyword which is used to retrieve the specific portions of the video for the query provided by the users.

Keywords: video indexing and retrieval, lecture videos, content based video search, multimodal indexing

Procedia PDF Downloads 204
1438 A Phenomenological Approach to Computational Modeling of Analogy

Authors: José Eduardo García-Mendiola

Abstract:

In this work, a phenomenological approach to computational modeling of analogy processing is carried out. The paper goes through the consideration of the structure of the analogy, based on the possibility of sustaining the genesis of its elements regarding Husserl's genetic theory of association. Among particular processes which take place in order to get analogical inferences, there is one which arises crucial for enabling efficient base cases retrieval through long-term memory, namely analogical transference grounded on familiarity. In general, it has been argued that analogical reasoning is a way by which a conscious agent tries to determine or define a certain scope of objects and relationships between them using previous knowledge of other familiar domain of objects and relations. However, looking for a complete description of analogy process, a deeper consideration of phenomenological nature is required in so far, its simulation by computational programs is aimed. Also, one would get an idea of how complex it would be to have a fully computational account of the analogy elements. In fact, familiarity is not a result of a mere chain of repetitions of objects or events but generated insofar as the object/attribute or event in question is integrable inside a certain context that is taking shape as functionalities and functional approaches or perspectives of the object are being defined. Its familiarity is generated not by the identification of its parts or objective determinations as if they were isolated from those functionalities and approaches. Rather, at the core of such a familiarity between entities of different kinds lays the way they are functionally encoded. So, and hoping to make deeper inroads towards these topics, this essay allows us to consider that cognitive-computational perspectives can visualize, from the phenomenological projection of the analogy process reviewing achievements already obtained as well as exploration of new theoretical-experimental configurations towards implementation of analogy models in specific as well as in general purpose machines.

Keywords: analogy, association, encoding, retrieval

Procedia PDF Downloads 87
1437 Off-Topic Text Detection System Using a Hybrid Model

Authors: Usama Shahid

Abstract:

Be it written documents, news columns, or students' essays, verifying the content can be a time-consuming task. Apart from the spelling and grammar mistakes, the proofreader is also supposed to verify whether the content included in the essay or document is relevant or not. The irrelevant content in any document or essay is referred to as off-topic text and in this paper, we will address the problem of off-topic text detection from a document using machine learning techniques. Our study aims to identify the off-topic content from a document using Echo state network model and we will also compare data with other models. The previous study uses Convolutional Neural Networks and TFIDF to detect off-topic text. We will rearrange the existing datasets and take new classifiers along with new word embeddings and implement them on existing and new datasets in order to compare the results with the previously existing CNN model.

Keywords: off topic, text detection, eco state network, machine learning

Procedia PDF Downloads 51
1436 Investigation of the Cognition Factors of Fire Response Performances Based on Survey

Authors: Jingjing Yan, Gengen He, Anahid Basiri

Abstract:

The design of an indoor navigation system for fire evacuation support requires not only physical feasibility but also a relatively thorough consideration of the human factors. This study has taken a survey to investigate the fire response performances (FRP) of the indoor occupants in age of 20s, virtually in an environment for their routine life, focusing on the aspects of indoor familiarity (spatial cognition), psychological stress and decision makings. For indoor familiarity, it is interested in three factors, i.e., the familiarity to exits and risky places as well as the satisfaction degree of the current indoor sign installation. According to the results, males have a higher average familiarity with the indoor exits while both genders have a relatively low level of risky place awareness. These two factors are positively correlated with the satisfaction degree of the current installation of the indoor signs, and this correlation is more evident for the exit familiarity. The integration of the height factor with the other two indoor familiarity factors can improve the degree of indoor sign satisfaction. For psychological stress, this study concentrates on the situated cognition of moving difficulty, nervousness, and speed reduction when using a bending posture during the fire evacuation to avoid smoke inhalation. The results have shown that both genders have a similar mid-level of hardness sensation. The females have a higher average level of nervousness, while males have a higher average level of speed reduction sensation. This study has assumed that the growing indoor spatial cognition can help ease the psychological hardness and nervousness. However, it only seems to be true after reaching a certain level. When integrating the effects from indoor familiarity and the other two psychological factors, the correlation to the sensation of speed change can be strengthened, based on a stronger positive correlation with the integrated factors. This study has also investigated the participants’ attitude to the navigation support during evacuation, and the majority of the participants have shown positive attitudes. For following the guidance under some extreme cases, i.e., changing to a longer path and to an alternative exit, the majority of the participants has shown the confidence of keeping trusting the guidance service. These decisions are affected by the combined influences from indoor familiarity, psychological stress, and attitude of using navigation service. For the decision time of the selected extreme cases, it costs more time in average for deciding to use a longer route than to use an alternative exit, and this situation is more evident for the female participants. This requires further considerations when designing a personalized smartphone-based navigation app. This study has also investigated the calming factors for people being trapped during evacuation. The top consideration is the distance to the nearest firefighters, and the following considerations are the current fire conditions in the surrounding environment and the locations of all firefighters. The ranking of the latter two considerations is very gender-dependent according to the results.

Keywords: fire response performances, indoor spatial cognition, situated cognition, survey analysis

Procedia PDF Downloads 101
1435 Topic-to-Essay Generation with Event Element Constraints

Authors: Yufen Qin

Abstract:

Topic-to-Essay generation is a challenging task in Natural language processing, which aims to generate novel, diverse, and topic-related text based on user input. Previous research has overlooked the generation of articles under the constraints of event elements, resulting in issues such as incomplete event elements and logical inconsistencies in the generated results. To fill this gap, this paper proposes an event-constrained approach for a topic-to-essay generation that enforces the completeness of event elements during the generation process. Additionally, a language model is employed to verify the logical consistency of the generated results. Experimental results demonstrate that the proposed model achieves a better BLEU-2 score and performs better than the baseline in terms of subjective evaluation on a real dataset, indicating its capability to generate higher-quality topic-related text.

Keywords: event element, language model, natural language processing, topic-to-essay generation.

Procedia PDF Downloads 192
1434 Familiarity with Intercultural Conflicts and Global Work Performance: Testing a Theory of Recognition Primed Decision-Making

Authors: Thomas Rockstuhl, Kok Yee Ng, Guido Gianasso, Soon Ang

Abstract:

Two meta-analyses show that intercultural experience is not related to intercultural adaptation or performance in international assignments. These findings have prompted calls for a deeper grounding of research on international experience in the phenomenon of global work. Two issues, in particular, may limit current understanding of the relationship between international experience and global work performance. First, intercultural experience is too broad a construct that may not sufficiently capture the essence of global work, which to a large part involves sensemaking and managing intercultural conflicts. Second, the psychological mechanisms through which intercultural experience affects performance remains under-explored, resulting in a poor understanding of how experience is translated into learning and performance outcomes. Drawing on recognition primed decision-making (RPD) research, the current study advances a cognitive processing model to highlight the importance of intercultural conflict familiarity. Compared to intercultural experience, intercultural conflict familiarity is a more targeted construct that captures individuals’ previous exposure to dealing with intercultural conflicts. Drawing on RPD theory, we argue that individuals’ intercultural conflict familiarity enhances their ability to make accurate judgments and generate effective responses when intercultural conflicts arise. In turn, the ability to make accurate situation judgements and effective situation responses is an important predictor of global work performance. A relocation program within a multinational enterprise provided the context to test these hypotheses using a time-lagged, multi-source field study. Participants were 165 employees (46% female; with an average of 5 years of global work experience) from 42 countries who relocated from country to regional offices as part a global restructuring program. Within the first two weeks of transfer to the regional office, employees completed measures of their familiarity with intercultural conflicts, cultural intelligence, cognitive ability, and demographic information. They also completed an intercultural situational judgment test (iSJT) to assess their situation judgment and situation response. The iSJT comprised four validated multimedia vignettes of challenging intercultural work conflicts and prompted employees to provide protocols of their situation judgment and situation response. Two research assistants, trained in intercultural management but blind to the study hypotheses, coded the quality of employee’s situation judgment and situation response. Three months later, supervisors rated employees’ global work performance. Results using multilevel modeling (vignettes nested within employees) support the hypotheses that greater familiarity with intercultural conflicts is positively associated with better situation judgment, and that situation judgment mediates the effect of intercultural familiarity on situation response quality. Also, aggregated situation judgment and situation response quality both predicted supervisor-rated global work performance. Theoretically, our findings highlight the important but under-explored role of familiarity with intercultural conflicts; a shift in attention from the general nature of international experience assessed in terms of number and length of overseas assignments. Also, our cognitive approach premised on RPD theory offers a new theoretical lens to understand the psychological mechanisms through which intercultural conflict familiarity affects global work performance. Third, and importantly, our study contributes to the global talent identification literature by demonstrating that the cognitive processes engaged in resolving intercultural conflicts predict actual performance in the global workplace.

Keywords: intercultural conflict familiarity, job performance, judgment and decision making, situational judgment test

Procedia PDF Downloads 142
1433 Trend Detection Using Community Rank and Hawkes Process

Authors: Shashank Bhatnagar, W. Wilfred Godfrey

Abstract:

We develop in this paper, an approach to find the trendy topic, which not only considers the user-topic interaction but also considers the community, in which user belongs. This method modifies the previous approach of user-topic interaction to user-community-topic interaction with better speed-up in the range of [1.1-3]. We assume that trend detection in a social network is dependent on two things. The one is, broadcast of messages in social network governed by self-exciting point process, namely called Hawkes process and the second is, Community Rank. The influencer node links to others in the community and decides the community rank based on its PageRank and the number of users links to that community. The community rank decides the influence of one community over the other. Hence, the Hawkes process with the kernel of user-community-topic decides the trendy topic disseminated into the social network.

Keywords: community detection, community rank, Hawkes process, influencer node, pagerank, trend detection

Procedia PDF Downloads 350
1432 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 119
1431 Investigating Medical Students’ Perspectives toward University Teachers’ Talking Features in an English as a Foreign Language Context in Urmia, Iran

Authors: Ismail Baniadam, Nafisa Tadayyon, Javid Fereidoni

Abstract:

This study aimed to investigate medical students’ attitudes toward some teachers’ talking features regarding their gender in the Iranian context. To do so, 60 male and 60 female medical students of Urmia University of Medical Sciences (UMSU) participated in the research. A researcher made Likert-type questionnaire which was initially piloted and was used to gather the data. Comparing the four different factors regarding the features of teacher talk, it was revealed that visual and extra-linguistic information factor, Lexical and syntactic familiarity, Speed of speech, and the use of Persian language had the highest to the lowest mean score, respectively. It was also indicated that female students rather than male students were significantly more in favor of speed of speech and lexical and syntactic familiarity.

Keywords: attitude, gender, medical student, teacher talk

Procedia PDF Downloads 149
1430 Recognizing an Individual, Their Topic of Conversation and Cultural Background from 3D Body Movement

Authors: Gheida J. Shahrour, Martin J. Russell

Abstract:

The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects, arranged them into groups according to their culture. We arranged each group into pairs and each pair communicated with each other about different topics. A state-of-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that inter-subject differences (i.e. subject’s personality traits) are a major source of variation. When removing these traits from culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and the Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from culture and topic recognition systems respectively.

Keywords: person recognition, topic recognition, culture recognition, 3D body movement signals, variability compensation

Procedia PDF Downloads 510
1429 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 149
1428 Optimized Text Summarization Model on Mobile Screens for Sight-Interpreters: An Empirical Study

Authors: Jianhua Wang

Abstract:

To obtain key information quickly from long texts on small screens of mobile devices, sight-interpreters need to establish optimized summarization model for fast information retrieval. Four summarization models based on previous studies were studied including title+key words (TKW), title+topic sentences (TTS), key words+topic sentences (KWTS) and title+key words+topic sentences (TKWTS). Psychological experiments were conducted on the four models for three different genres of interpreting texts to establish the optimized summarization model for sight-interpreters. This empirical study shows that the optimized summarization model for sight-interpreters to quickly grasp the key information of the texts they interpret is title+key words (TKW) for cultural texts, title+key words+topic sentences (TKWTS) for economic texts and topic sentences+key words (TSKW) for political texts.

Keywords: different genres, mobile screens, optimized summarization models, sight-interpreters

Procedia PDF Downloads 281
1427 Visualization and Performance Measure to Determine Number of Topics in Twitter Data Clustering Using Hybrid Topic Modeling

Authors: Moulana Mohammed

Abstract:

Topic models are widely used in building clusters of documents for more than a decade, yet problems occurring in choosing optimal number of topics. The main problem is the lack of a stable metric of the quality of topics obtained during the construction of topic models. The authors analyzed from previous works, most of the models used in determining the number of topics are non-parametric and quality of topics determined by using perplexity and coherence measures and concluded that they are not applicable in solving this problem. In this paper, we used the parametric method, which is an extension of the traditional topic model with visual access tendency for visualization of the number of topics (clusters) to complement clustering and to choose optimal number of topics based on results of cluster validity indices. Developed hybrid topic models are demonstrated with different Twitter datasets on various topics in obtaining the optimal number of topics and in measuring the quality of clusters. The experimental results showed that the Visual Non-negative Matrix Factorization (VNMF) topic model performs well in determining the optimal number of topics with interactive visualization and in performance measure of the quality of clusters with validity indices.

Keywords: interactive visualization, visual mon-negative matrix factorization model, optimal number of topics, cluster validity indices, Twitter data clustering

Procedia PDF Downloads 106
1426 Using Bidirectional Encoder Representations from Transformers to Extract Topic-Independent Sentiment Features for Social Media Bot Detection

Authors: Maryam Heidari, James H. Jones Jr.

Abstract:

Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event or product. However, this use raises an important question: what percentage of information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a bot, instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. In this paper, we introduce a model for social media bot detection which uses Bidirectional Encoder Representations from Transformers (Google Bert) for sentiment classification of tweets to identify topic-independent features. Our use of a Natural Language Processing approach to derive topic-independent features for our new bot detection model distinguishes this work from previous bot detection models. We achieve 94\% accuracy classifying the contents of data as generated by a bot or a human, where the most accurate prior work achieved accuracy of 92\%.

Keywords: bot detection, natural language processing, neural network, social media

Procedia PDF Downloads 88
1425 English as a Lingua Franca Elicited in ASEAN Accents

Authors: Choedchoo Kwanhathai

Abstract:

This study explores attitudes towards ASEAN plus ONE (namely ASEAN plus China) accents of English as a Lingua Franca. The study draws attention to features of ASEAN’s diversity of English and specifically examines the extent of which the English accent in ASEAN countries of three of the ten members plus one were perceived in terms of correctness, acceptability, pleasantness, and familiarity. Three accents were used for this study; Chinese, Philippine and Thai. The participants were ninety eight Thai students enrolled in a foundation course of Suan Dusit Rajabhat University, Bangkok Thailand. The students were asked in questionnaires to rank how they perceived each specifically ASEAN plus One English accent after listening to audio recordings of three stories spoken by the three different ASEAN plus ONE English speakers. SPSS was used to analyze the data. The findings of attitudes towards varieties of English accent from the 98 respondents regarding correctness, acceptability, pleasantness, and familiarity of Thai English accents found that Thai accent was overall at level 3 (X = 2.757, SD= o.33), %Then Philippines accents was at level 2 (X = 2.326, SD = 16.12), and Chinese accents w2as at level 3 (X 3.198, SD = 0.18). Finally, the present study proposes pedagogical implications for teaching regarding awareness of ‘Englishes’ of ASEAN and their respective accents and their lingua cultural background of instructors.

Keywords: English as a lingua franca, English accents, English as an international language, ASEAN plus one, ASEAN English varieties

Procedia PDF Downloads 392
1424 Analysis of Trends in Environmental Health Research Using Topic Modeling

Authors: Hayoung Cho, Gabi Cho

Abstract:

In response to the continuing increase of demands for living environment safety, the Korean government has established and implemented various environmental health policies and set a high priority to the related R&D. However, the level of related technologies such as environmental risk assessment are still relatively low, and there is a need for detailed investment strategies in the field of environmental health research. As scientific research papers can give valuable implications on the development of a certain field, this study analyzed the global research trends in the field of environmental health over the past 10 years (2005~2015). Research topics were extracted from abstracts of the collected SCI papers using topic modeling to study the changes in research trends and discover emerging technologies. The method of topic modeling can improve the traditional bibliometric approach and provide a more comprehensive review of the global research development. The results of this study are expected to help provide insights for effective policy making and R&D investment direction.

Keywords: environmental health, paper analysis, research trends, topic modeling

Procedia PDF Downloads 260
1423 Artificial Intelligence Assisted Sentiment Analysis of Hotel Reviews Using Topic Modeling

Authors: Sushma Ghogale

Abstract:

With a surge in user-generated content or feedback or reviews on the internet, it has become possible and important to know consumers' opinions about products and services. This data is important for both potential customers and businesses providing the services. Data from social media is attracting significant attention and has become the most prominent channel of expressing an unregulated opinion. Prospective customers look for reviews from experienced customers before deciding to buy a product or service. Several websites provide a platform for users to post their feedback for the provider and potential customers. However, the biggest challenge in analyzing such data is in extracting latent features and providing term-level analysis of the data. This paper proposes an approach to use topic modeling to classify the reviews into topics and conduct sentiment analysis to mine the opinions. This approach can analyse and classify latent topics mentioned by reviewers on business sites or review sites, or social media using topic modeling to identify the importance of each topic. It is followed by sentiment analysis to assess the satisfaction level of each topic. This approach provides a classification of hotel reviews using multiple machine learning techniques and comparing different classifiers to mine the opinions of user reviews through sentiment analysis. This experiment concludes that Multinomial Naïve Bayes classifier produces higher accuracy than other classifiers.

Keywords: latent Dirichlet allocation, topic modeling, text classification, sentiment analysis

Procedia PDF Downloads 73
1422 Investigating Dynamic Transition Process of Issues Using Unstructured Text Analysis

Authors: Myungsu Lim, William Xiu Shun Wong, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Namgyu Kim

Abstract:

The amount of real-time data generated through various mass media has been increasing rapidly. In this study, we had performed topic analysis by using the unstructured text data that is distributed through news article. As one of the most prevalent applications of topic analysis, the issue tracking technique investigates the changes of the social issues that identified through topic analysis. Currently, traditional issue tracking is conducted by identifying the main topics of documents that cover an entire period at the same time and analyzing the occurrence of each topic by the period of occurrence. However, this traditional issue tracking approach has limitation that it cannot discover dynamic mutation process of complex social issues. The purpose of this study is to overcome the limitations of the existing issue tracking method. We first derived core issues of each period, and then discover the dynamic mutation process of various issues. In this study, we further analyze the mutation process from the perspective of the issues categories, in order to figure out the pattern of issue flow, including the frequency and reliability of the pattern. In other words, this study allows us to understand the components of the complex issues by tracking the dynamic history of issues. This methodology can facilitate a clearer understanding of complex social phenomena by providing mutation history and related category information of the phenomena.

Keywords: Data Mining, Issue Tracking, Text Mining, topic Analysis, topic Detection, Trend Detection

Procedia PDF Downloads 373
1421 Diversity in Finance Literature Revealed through the Lens of Machine Learning: A Topic Modeling Approach on Academic Papers

Authors: Oumaima Lahmar

Abstract:

This paper aims to define a structured topography for finance researchers seeking to navigate the body of knowledge in their extrapolation of finance phenomena. To make sense of the body of knowledge in finance, a probabilistic topic modeling approach is applied on 6000 abstracts of academic articles published in three top journals in finance between 1976 and 2020. This approach combines both machine learning techniques and natural language processing to statistically identify the conjunctions between research articles and their shared topics described each by relevant keywords. The topic modeling analysis reveals 35 coherent topics that can well depict finance literature and provide a comprehensive structure for the ongoing research themes. Comparing the extracted topics to the Journal of Economic Literature (JEL) classification system, a significant similarity was highlighted between the characterizing keywords. On the other hand, we identify other topics that do not match the JEL classification despite being relevant in the finance literature.

Keywords: finance literature, textual analysis, topic modeling, perplexity

Procedia PDF Downloads 130
1420 Topic Prominence and Temporal Encoding in Mandarin Chinese

Authors: Tzu-I Chiang

Abstract:

A central question for finite-nonfinite distinction in Mandarin Chinese is how does Mandarin encode temporal information without the grammatical contrast between past and present tense. Moreover, how do L2 learners of Mandarin whose native language is English and whose L1 system has tense morphology, acquire the temporal encoding system in L2 Mandarin? The current study reports preliminary findings on the relationship between topic prominence and the temporal encoding in L1 and L2 Chinese. Oral narratives data from 30 natives and learners of Mandarin Chinese were collected via a film-retell task. In terms of coding, predicates collected from the narratives were transcribed and then coded based on four major verb types: n-degree Statives (quality-STA), point-scale Statives (status-STA), n-atom EVENT (ACT), and point EVENT (resultative-ACT). How native speakers and non-native speakers started retelling the story was calculated. Results of the study show that native speakers of Chinese tend to express Topic Time (TT) syntactically at the topic position; whereas L2 learners of Chinese across levels rely mainly on the default time encoded in the event types. Moreover, as the proficiency level of the learner increases, learners’ appropriate use of the event predicates increased, which supports the argument that L2 development of temporal encoding is affected by lexical aspect.

Keywords: topic prominence, temporal encoding, lexical aspect, L2 acquisition

Procedia PDF Downloads 169
1419 Improving Topic Quality of Scripts by Using Scene Similarity Based Word Co-Occurrence

Authors: Yunseok Noh, Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park

Abstract:

Scripts are one of the basic text resources to understand broadcasting contents. Since broadcast media wields lots of influence over the public, tools for understanding broadcasting contents are more required. Topic modeling is the method to get the summary of the broadcasting contents from its scripts. Generally, scripts represent contents descriptively with directions and speeches. Scripts also provide scene segments that can be seen as semantic units. Therefore, a script can be topic modeled by treating a scene segment as a document. Because scripts consist of speeches mainly, however, relatively small co-occurrences among words in the scene segments are observed. This causes inevitably the bad quality of topics based on statistical learning method. To tackle this problem, we propose a method of learning with additional word co-occurrence information obtained using scene similarities. The main idea of improving topic quality is that the information that two or more texts are topically related can be useful to learn high quality of topics. In addition, by using high quality of topics, we can get information more accurate whether two texts are related or not. In this paper, we regard two scene segments are related if their topical similarity is high enough. We also consider that words are co-occurred if they are in topically related scene segments together. In the experiments, we showed the proposed method generates a higher quality of topics from Korean drama scripts than the baselines.

Keywords: broadcasting contents, scripts, text similarity, topic model

Procedia PDF Downloads 288
1418 Emotion Oriented Students' Opinioned Topic Detection for Course Reviews in Massive Open Online Course

Authors: Zhi Liu, Xian Peng, Monika Domanska, Lingyun Kang, Sannyuya Liu

Abstract:

Massive Open education has become increasingly popular among worldwide learners. An increasing number of course reviews are being generated in Massive Open Online Course (MOOC) platform, which offers an interactive feedback channel for learners to express opinions and feelings in learning. These reviews typically contain subjective emotion and topic information towards the courses. However, it is time-consuming to artificially detect these opinions. In this paper, we propose an emotion-oriented topic detection model to automatically detect the students’ opinioned aspects in course reviews. The known overall emotion orientation and emotional words in each review are used to guide the joint probabilistic modeling of emotion and aspects in reviews. Through the experiment on real-life review data, it is verified that the distribution of course-emotion-aspect can be calculated to capture the most significant opinioned topics in each course unit. This proposed technique helps in conducting intelligent learning analytics for teachers to improve pedagogies and for developers to promote user experiences.

Keywords: Massive Open Online Course (MOOC), course reviews, topic model, emotion recognition, topical aspects

Procedia PDF Downloads 238
1417 AIPM:An Integrator and Pull Request Matching Model in Github

Authors: Zhifang Liao, Yanbing Li, Li Xu, Yan Zhang, Xiaoping Fan, Jinsong Wu

Abstract:

Pull Request (PR) is the primary method for code contributions from the external contributors in Github. PR review is an essential part of open source software developments for maintaining the quality of software. Matching a new PR of an appropriate integrator will make the PR review more effective. However, PR and integrator matching are now organized manually in Github. To reduce this cost, we presented an AIPM model to predict highly relevant integrator of incoming PRs. AIPM uses topic model to extract topics from the PRs, and builds a one-to-one correspondence between topics and integrators. Then, AIPM finds the most suitable integrator according to the maximum entry of the topic-document distribution. On average, AIPM can reach a precision of 60%, and even in some projects, can reach a precision of 80%.

Keywords: pull Request, integrator matching, Github, open source project, topic model

Procedia PDF Downloads 267
1416 Priming through Open Book MCQ Test: A Tool for Enhancing Learning in Medical Undergraduates

Authors: Bharti Bhandari, Bharati Mehta, Sabyasachi Sircar

Abstract:

Medical education is advancing in India, with its advancement newer innovations are being incorporated in teaching and assessment methodology. Our study focusses on a teaching innovation that is more student-centric than teacher-centric and is the need of the day. The teaching innovation was carried out in 1st year MBBS students of our institute. Students were assigned control and test groups. Priming was done for the students in the test group with an open-book MCQ based test in a particular topic before delivering formal didactic lecture on that topic. The control group was not assigned any such exercise. This was followed by formal didactic lecture on the same topic. Thereafter, both groups were assessed on the same topic. The marks were compiled and analysed using appropriate statistical tests. Students were also given questionnaire to elicit their views on the benefits of “self-priming”. The mean marks scored in theory assessment by the test group were statistically higher than the marks scored by the controls. According to students’ feedback, the ‘self-priming “process was interesting, helped in better orientation during class-room lectures and better understanding of the topic. They want it to be repeated for other topics with moderate difficulty level. Better performance of the students in the primed group validates the combination of student-centric priming model and didactic lecture as superior to the conventional, teacher-centric methods alone. If this system is successfully followed, the present teacher-centric pedagogy should increasingly give way to student-centric activities where the teacher is only a facilitator.

Keywords: medical education, open-book test, pedagogy, priming

Procedia PDF Downloads 408
1415 Building an Ontology for Researchers: An Application of Topic Maps and Social Information

Authors: Yu Hung Chiang, Hei Chia Wang

Abstract:

In the academic area, it is important for research to find proper research domain. Many researchers may refer to conference issues to find their interesting or new topics. Furthermore, conferences issues can help researchers realize current research trends in their field and learn about cutting-edge developments in their specialty. However, online published conference information may widely be distributed; it is not easy to be concluded. Many researchers use search engine of journals or conference issues to filter information in order to get what they want. However, this search engine has its limitation. There will still be some issues should be considered; i.e. researchers cannot find the associated topics which may be useful information for them. Hence, use Knowledge Management (KM) could be a way to resolve these issues. In KM, ontology is widely adopted; but most existed ontology construction methods do not consider social information between target users. To effective in academic KM, this study proposes a method of constructing research Topic Maps using Open Directory Project (ODP) and Social Information Processing (SIP). Through catching of social information in conference website: i.e. the information of co-authorship or collaborator, research topics can be associated among related researchers. Finally, the experiments show Topic Maps successfully help researchers to find the information they need more easily and quickly as well as construct associations between research topics.

Keywords: knowledge management, topic map, social information processing, ontology extraction

Procedia PDF Downloads 268
1414 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 275
1413 Topic Sentiments toward the COVID-19 Vaccine on Twitter

Authors: Melissa Vang, Raheyma Khan, Haihua Chen

Abstract:

The coronavirus disease 2019 (COVID‐19) pandemic has changed people's lives from all over the world. More people have turned to Twitter to engage online and discuss the COVID-19 vaccine. This study aims to present a text mining approach to identify people's attitudes towards the COVID-19 vaccine on Twitter. To achieve this purpose, we collected 54,268 COVID-19 vaccine tweets from September 01, 2020, to November 01, 2020, then the BERT model is used for the sentiment and topic analysis. The results show that people had more negative than positive attitudes about the vaccine, and countries with an increasing number of confirmed cases had a higher percentage of negative attitudes. Additionally, the topics discussed in positive and negative tweets are different. The tweet datasets can be helpful to information professionals to inform the public about vaccine-related informational resources. Our findings may have implications for understanding people's cognitions and feelings about the vaccine.

Keywords: BERT, COVID-19 vaccine, sentiment analysis, topic modeling

Procedia PDF Downloads 117