Search results for: topic%20modeling
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1335

Search results for: topic%20modeling

1335 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.

Keywords: independent topic analysis, topic extraction, topic naming, web search engine

Procedia PDF Downloads 88
1334 Lecture Video Indexing and Retrieval Using Topic Keywords

Authors: B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa

Abstract:

In this paper, we propose a framework to help users to search and retrieve the portions in the lecture video of their interest. This is achieved by temporally segmenting and indexing the lecture video using the topic keywords. We use transcribed text from the video and documents relevant to the video topic extracted from the web for this purpose. The keywords for indexing are found by applying the non-negative matrix factorization (NMF) topic modeling techniques on the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end time of the portions of the video for a particular topic. This time information is stored in the index table along with the topic keyword which is used to retrieve the specific portions of the video for the query provided by the users.

Keywords: video indexing and retrieval, lecture videos, content based video search, multimodal indexing

Procedia PDF Downloads 198
1333 Off-Topic Text Detection System Using a Hybrid Model

Authors: Usama Shahid

Abstract:

Be it written documents, news columns, or students' essays, verifying the content can be a time-consuming task. Apart from the spelling and grammar mistakes, the proofreader is also supposed to verify whether the content included in the essay or document is relevant or not. The irrelevant content in any document or essay is referred to as off-topic text and in this paper, we will address the problem of off-topic text detection from a document using machine learning techniques. Our study aims to identify the off-topic content from a document using Echo state network model and we will also compare data with other models. The previous study uses Convolutional Neural Networks and TFIDF to detect off-topic text. We will rearrange the existing datasets and take new classifiers along with new word embeddings and implement them on existing and new datasets in order to compare the results with the previously existing CNN model.

Keywords: off topic, text detection, eco state network, machine learning

Procedia PDF Downloads 46
1332 Topic-to-Essay Generation with Event Element Constraints

Authors: Yufen Qin

Abstract:

Topic-to-Essay generation is a challenging task in Natural language processing, which aims to generate novel, diverse, and topic-related text based on user input. Previous research has overlooked the generation of articles under the constraints of event elements, resulting in issues such as incomplete event elements and logical inconsistencies in the generated results. To fill this gap, this paper proposes an event-constrained approach for a topic-to-essay generation that enforces the completeness of event elements during the generation process. Additionally, a language model is employed to verify the logical consistency of the generated results. Experimental results demonstrate that the proposed model achieves a better BLEU-2 score and performs better than the baseline in terms of subjective evaluation on a real dataset, indicating its capability to generate higher-quality topic-related text.

Keywords: event element, language model, natural language processing, topic-to-essay generation.

Procedia PDF Downloads 184
1331 Trend Detection Using Community Rank and Hawkes Process

Authors: Shashank Bhatnagar, W. Wilfred Godfrey

Abstract:

We develop in this paper, an approach to find the trendy topic, which not only considers the user-topic interaction but also considers the community, in which user belongs. This method modifies the previous approach of user-topic interaction to user-community-topic interaction with better speed-up in the range of [1.1-3]. We assume that trend detection in a social network is dependent on two things. The one is, broadcast of messages in social network governed by self-exciting point process, namely called Hawkes process and the second is, Community Rank. The influencer node links to others in the community and decides the community rank based on its PageRank and the number of users links to that community. The community rank decides the influence of one community over the other. Hence, the Hawkes process with the kernel of user-community-topic decides the trendy topic disseminated into the social network.

Keywords: community detection, community rank, Hawkes process, influencer node, pagerank, trend detection

Procedia PDF Downloads 342
1330 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 113
1329 Recognizing an Individual, Their Topic of Conversation and Cultural Background from 3D Body Movement

Authors: Gheida J. Shahrour, Martin J. Russell

Abstract:

The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects, arranged them into groups according to their culture. We arranged each group into pairs and each pair communicated with each other about different topics. A state-of-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that inter-subject differences (i.e. subject’s personality traits) are a major source of variation. When removing these traits from culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and the Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from culture and topic recognition systems respectively.

Keywords: person recognition, topic recognition, culture recognition, 3D body movement signals, variability compensation

Procedia PDF Downloads 502
1328 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 142
1327 Optimized Text Summarization Model on Mobile Screens for Sight-Interpreters: An Empirical Study

Authors: Jianhua Wang

Abstract:

To obtain key information quickly from long texts on small screens of mobile devices, sight-interpreters need to establish optimized summarization model for fast information retrieval. Four summarization models based on previous studies were studied including title+key words (TKW), title+topic sentences (TTS), key words+topic sentences (KWTS) and title+key words+topic sentences (TKWTS). Psychological experiments were conducted on the four models for three different genres of interpreting texts to establish the optimized summarization model for sight-interpreters. This empirical study shows that the optimized summarization model for sight-interpreters to quickly grasp the key information of the texts they interpret is title+key words (TKW) for cultural texts, title+key words+topic sentences (TKWTS) for economic texts and topic sentences+key words (TSKW) for political texts.

Keywords: different genres, mobile screens, optimized summarization models, sight-interpreters

Procedia PDF Downloads 277
1326 Visualization and Performance Measure to Determine Number of Topics in Twitter Data Clustering Using Hybrid Topic Modeling

Authors: Moulana Mohammed

Abstract:

Topic models are widely used in building clusters of documents for more than a decade, yet problems occurring in choosing optimal number of topics. The main problem is the lack of a stable metric of the quality of topics obtained during the construction of topic models. The authors analyzed from previous works, most of the models used in determining the number of topics are non-parametric and quality of topics determined by using perplexity and coherence measures and concluded that they are not applicable in solving this problem. In this paper, we used the parametric method, which is an extension of the traditional topic model with visual access tendency for visualization of the number of topics (clusters) to complement clustering and to choose optimal number of topics based on results of cluster validity indices. Developed hybrid topic models are demonstrated with different Twitter datasets on various topics in obtaining the optimal number of topics and in measuring the quality of clusters. The experimental results showed that the Visual Non-negative Matrix Factorization (VNMF) topic model performs well in determining the optimal number of topics with interactive visualization and in performance measure of the quality of clusters with validity indices.

Keywords: interactive visualization, visual mon-negative matrix factorization model, optimal number of topics, cluster validity indices, Twitter data clustering

Procedia PDF Downloads 100
1325 Using Bidirectional Encoder Representations from Transformers to Extract Topic-Independent Sentiment Features for Social Media Bot Detection

Authors: Maryam Heidari, James H. Jones Jr.

Abstract:

Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event or product. However, this use raises an important question: what percentage of information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a bot, instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. In this paper, we introduce a model for social media bot detection which uses Bidirectional Encoder Representations from Transformers (Google Bert) for sentiment classification of tweets to identify topic-independent features. Our use of a Natural Language Processing approach to derive topic-independent features for our new bot detection model distinguishes this work from previous bot detection models. We achieve 94\% accuracy classifying the contents of data as generated by a bot or a human, where the most accurate prior work achieved accuracy of 92\%.

Keywords: bot detection, natural language processing, neural network, social media

Procedia PDF Downloads 81
1324 Analysis of Trends in Environmental Health Research Using Topic Modeling

Authors: Hayoung Cho, Gabi Cho

Abstract:

In response to the continuing increase of demands for living environment safety, the Korean government has established and implemented various environmental health policies and set a high priority to the related R&D. However, the level of related technologies such as environmental risk assessment are still relatively low, and there is a need for detailed investment strategies in the field of environmental health research. As scientific research papers can give valuable implications on the development of a certain field, this study analyzed the global research trends in the field of environmental health over the past 10 years (2005~2015). Research topics were extracted from abstracts of the collected SCI papers using topic modeling to study the changes in research trends and discover emerging technologies. The method of topic modeling can improve the traditional bibliometric approach and provide a more comprehensive review of the global research development. The results of this study are expected to help provide insights for effective policy making and R&D investment direction.

Keywords: environmental health, paper analysis, research trends, topic modeling

Procedia PDF Downloads 254
1323 Effects of Topic Familiarity on Linguistic Aspects in EFL Learners’ Writing Performance

Authors: Jeong-Won Lee, Kyeong-Ok Yoon

Abstract:

The current study aimed to investigate the effects of topic familiarity and language proficiency on linguistic aspects (lexical complexity, syntactic complexity, accuracy, and fluency) in EFL learners’ argumentative essays. For the study 64 college students were asked to write an argumentative essay for the two different topics (Driving and Smoking) chosen by the consideration of topic familiarity. The students were divided into two language proficiency groups (high-level and intermediate) according to their English writing proficiency. The findings of the study are as follows: 1) the participants of this study exhibited lower levels of lexical and syntactic complexity as well as accuracy when performing writing tasks with unfamiliar topics; and 2) they demonstrated the use of a wider range of vocabulary, and longer and more complex structures, and produced accurate and lengthier texts compared to their intermediate peers. Discussion and pedagogical implications for instruction of writing classes in EFL contexts were addressed.

Keywords: topic familiarity, complexity, accuracy, fluency

Procedia PDF Downloads 13
1322 Artificial Intelligence Assisted Sentiment Analysis of Hotel Reviews Using Topic Modeling

Authors: Sushma Ghogale

Abstract:

With a surge in user-generated content or feedback or reviews on the internet, it has become possible and important to know consumers' opinions about products and services. This data is important for both potential customers and businesses providing the services. Data from social media is attracting significant attention and has become the most prominent channel of expressing an unregulated opinion. Prospective customers look for reviews from experienced customers before deciding to buy a product or service. Several websites provide a platform for users to post their feedback for the provider and potential customers. However, the biggest challenge in analyzing such data is in extracting latent features and providing term-level analysis of the data. This paper proposes an approach to use topic modeling to classify the reviews into topics and conduct sentiment analysis to mine the opinions. This approach can analyse and classify latent topics mentioned by reviewers on business sites or review sites, or social media using topic modeling to identify the importance of each topic. It is followed by sentiment analysis to assess the satisfaction level of each topic. This approach provides a classification of hotel reviews using multiple machine learning techniques and comparing different classifiers to mine the opinions of user reviews through sentiment analysis. This experiment concludes that Multinomial Naïve Bayes classifier produces higher accuracy than other classifiers.

Keywords: latent Dirichlet allocation, topic modeling, text classification, sentiment analysis

Procedia PDF Downloads 64
1321 Investigating Dynamic Transition Process of Issues Using Unstructured Text Analysis

Authors: Myungsu Lim, William Xiu Shun Wong, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Namgyu Kim

Abstract:

The amount of real-time data generated through various mass media has been increasing rapidly. In this study, we had performed topic analysis by using the unstructured text data that is distributed through news article. As one of the most prevalent applications of topic analysis, the issue tracking technique investigates the changes of the social issues that identified through topic analysis. Currently, traditional issue tracking is conducted by identifying the main topics of documents that cover an entire period at the same time and analyzing the occurrence of each topic by the period of occurrence. However, this traditional issue tracking approach has limitation that it cannot discover dynamic mutation process of complex social issues. The purpose of this study is to overcome the limitations of the existing issue tracking method. We first derived core issues of each period, and then discover the dynamic mutation process of various issues. In this study, we further analyze the mutation process from the perspective of the issues categories, in order to figure out the pattern of issue flow, including the frequency and reliability of the pattern. In other words, this study allows us to understand the components of the complex issues by tracking the dynamic history of issues. This methodology can facilitate a clearer understanding of complex social phenomena by providing mutation history and related category information of the phenomena.

Keywords: Data Mining, Issue Tracking, Text Mining, topic Analysis, topic Detection, Trend Detection

Procedia PDF Downloads 366
1320 Diversity in Finance Literature Revealed through the Lens of Machine Learning: A Topic Modeling Approach on Academic Papers

Authors: Oumaima Lahmar

Abstract:

This paper aims to define a structured topography for finance researchers seeking to navigate the body of knowledge in their extrapolation of finance phenomena. To make sense of the body of knowledge in finance, a probabilistic topic modeling approach is applied on 6000 abstracts of academic articles published in three top journals in finance between 1976 and 2020. This approach combines both machine learning techniques and natural language processing to statistically identify the conjunctions between research articles and their shared topics described each by relevant keywords. The topic modeling analysis reveals 35 coherent topics that can well depict finance literature and provide a comprehensive structure for the ongoing research themes. Comparing the extracted topics to the Journal of Economic Literature (JEL) classification system, a significant similarity was highlighted between the characterizing keywords. On the other hand, we identify other topics that do not match the JEL classification despite being relevant in the finance literature.

Keywords: finance literature, textual analysis, topic modeling, perplexity

Procedia PDF Downloads 125
1319 Topic Prominence and Temporal Encoding in Mandarin Chinese

Authors: Tzu-I Chiang

Abstract:

A central question for finite-nonfinite distinction in Mandarin Chinese is how does Mandarin encode temporal information without the grammatical contrast between past and present tense. Moreover, how do L2 learners of Mandarin whose native language is English and whose L1 system has tense morphology, acquire the temporal encoding system in L2 Mandarin? The current study reports preliminary findings on the relationship between topic prominence and the temporal encoding in L1 and L2 Chinese. Oral narratives data from 30 natives and learners of Mandarin Chinese were collected via a film-retell task. In terms of coding, predicates collected from the narratives were transcribed and then coded based on four major verb types: n-degree Statives (quality-STA), point-scale Statives (status-STA), n-atom EVENT (ACT), and point EVENT (resultative-ACT). How native speakers and non-native speakers started retelling the story was calculated. Results of the study show that native speakers of Chinese tend to express Topic Time (TT) syntactically at the topic position; whereas L2 learners of Chinese across levels rely mainly on the default time encoded in the event types. Moreover, as the proficiency level of the learner increases, learners’ appropriate use of the event predicates increased, which supports the argument that L2 development of temporal encoding is affected by lexical aspect.

Keywords: topic prominence, temporal encoding, lexical aspect, L2 acquisition

Procedia PDF Downloads 162
1318 Improving Topic Quality of Scripts by Using Scene Similarity Based Word Co-Occurrence

Authors: Yunseok Noh, Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park

Abstract:

Scripts are one of the basic text resources to understand broadcasting contents. Since broadcast media wields lots of influence over the public, tools for understanding broadcasting contents are more required. Topic modeling is the method to get the summary of the broadcasting contents from its scripts. Generally, scripts represent contents descriptively with directions and speeches. Scripts also provide scene segments that can be seen as semantic units. Therefore, a script can be topic modeled by treating a scene segment as a document. Because scripts consist of speeches mainly, however, relatively small co-occurrences among words in the scene segments are observed. This causes inevitably the bad quality of topics based on statistical learning method. To tackle this problem, we propose a method of learning with additional word co-occurrence information obtained using scene similarities. The main idea of improving topic quality is that the information that two or more texts are topically related can be useful to learn high quality of topics. In addition, by using high quality of topics, we can get information more accurate whether two texts are related or not. In this paper, we regard two scene segments are related if their topical similarity is high enough. We also consider that words are co-occurred if they are in topically related scene segments together. In the experiments, we showed the proposed method generates a higher quality of topics from Korean drama scripts than the baselines.

Keywords: broadcasting contents, scripts, text similarity, topic model

Procedia PDF Downloads 283
1317 Emotion Oriented Students' Opinioned Topic Detection for Course Reviews in Massive Open Online Course

Authors: Zhi Liu, Xian Peng, Monika Domanska, Lingyun Kang, Sannyuya Liu

Abstract:

Massive Open education has become increasingly popular among worldwide learners. An increasing number of course reviews are being generated in Massive Open Online Course (MOOC) platform, which offers an interactive feedback channel for learners to express opinions and feelings in learning. These reviews typically contain subjective emotion and topic information towards the courses. However, it is time-consuming to artificially detect these opinions. In this paper, we propose an emotion-oriented topic detection model to automatically detect the students’ opinioned aspects in course reviews. The known overall emotion orientation and emotional words in each review are used to guide the joint probabilistic modeling of emotion and aspects in reviews. Through the experiment on real-life review data, it is verified that the distribution of course-emotion-aspect can be calculated to capture the most significant opinioned topics in each course unit. This proposed technique helps in conducting intelligent learning analytics for teachers to improve pedagogies and for developers to promote user experiences.

Keywords: Massive Open Online Course (MOOC), course reviews, topic model, emotion recognition, topical aspects

Procedia PDF Downloads 232
1316 AIPM:An Integrator and Pull Request Matching Model in Github

Authors: Zhifang Liao, Yanbing Li, Li Xu, Yan Zhang, Xiaoping Fan, Jinsong Wu

Abstract:

Pull Request (PR) is the primary method for code contributions from the external contributors in Github. PR review is an essential part of open source software developments for maintaining the quality of software. Matching a new PR of an appropriate integrator will make the PR review more effective. However, PR and integrator matching are now organized manually in Github. To reduce this cost, we presented an AIPM model to predict highly relevant integrator of incoming PRs. AIPM uses topic model to extract topics from the PRs, and builds a one-to-one correspondence between topics and integrators. Then, AIPM finds the most suitable integrator according to the maximum entry of the topic-document distribution. On average, AIPM can reach a precision of 60%, and even in some projects, can reach a precision of 80%.

Keywords: pull Request, integrator matching, Github, open source project, topic model

Procedia PDF Downloads 263
1315 Priming through Open Book MCQ Test: A Tool for Enhancing Learning in Medical Undergraduates

Authors: Bharti Bhandari, Bharati Mehta, Sabyasachi Sircar

Abstract:

Medical education is advancing in India, with its advancement newer innovations are being incorporated in teaching and assessment methodology. Our study focusses on a teaching innovation that is more student-centric than teacher-centric and is the need of the day. The teaching innovation was carried out in 1st year MBBS students of our institute. Students were assigned control and test groups. Priming was done for the students in the test group with an open-book MCQ based test in a particular topic before delivering formal didactic lecture on that topic. The control group was not assigned any such exercise. This was followed by formal didactic lecture on the same topic. Thereafter, both groups were assessed on the same topic. The marks were compiled and analysed using appropriate statistical tests. Students were also given questionnaire to elicit their views on the benefits of “self-priming”. The mean marks scored in theory assessment by the test group were statistically higher than the marks scored by the controls. According to students’ feedback, the ‘self-priming “process was interesting, helped in better orientation during class-room lectures and better understanding of the topic. They want it to be repeated for other topics with moderate difficulty level. Better performance of the students in the primed group validates the combination of student-centric priming model and didactic lecture as superior to the conventional, teacher-centric methods alone. If this system is successfully followed, the present teacher-centric pedagogy should increasingly give way to student-centric activities where the teacher is only a facilitator.

Keywords: medical education, open-book test, pedagogy, priming

Procedia PDF Downloads 400
1314 Building an Ontology for Researchers: An Application of Topic Maps and Social Information

Authors: Yu Hung Chiang, Hei Chia Wang

Abstract:

In the academic area, it is important for research to find proper research domain. Many researchers may refer to conference issues to find their interesting or new topics. Furthermore, conferences issues can help researchers realize current research trends in their field and learn about cutting-edge developments in their specialty. However, online published conference information may widely be distributed; it is not easy to be concluded. Many researchers use search engine of journals or conference issues to filter information in order to get what they want. However, this search engine has its limitation. There will still be some issues should be considered; i.e. researchers cannot find the associated topics which may be useful information for them. Hence, use Knowledge Management (KM) could be a way to resolve these issues. In KM, ontology is widely adopted; but most existed ontology construction methods do not consider social information between target users. To effective in academic KM, this study proposes a method of constructing research Topic Maps using Open Directory Project (ODP) and Social Information Processing (SIP). Through catching of social information in conference website: i.e. the information of co-authorship or collaborator, research topics can be associated among related researchers. Finally, the experiments show Topic Maps successfully help researchers to find the information they need more easily and quickly as well as construct associations between research topics.

Keywords: knowledge management, topic map, social information processing, ontology extraction

Procedia PDF Downloads 260
1313 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 271
1312 Topic Sentiments toward the COVID-19 Vaccine on Twitter

Authors: Melissa Vang, Raheyma Khan, Haihua Chen

Abstract:

The coronavirus disease 2019 (COVID‐19) pandemic has changed people's lives from all over the world. More people have turned to Twitter to engage online and discuss the COVID-19 vaccine. This study aims to present a text mining approach to identify people's attitudes towards the COVID-19 vaccine on Twitter. To achieve this purpose, we collected 54,268 COVID-19 vaccine tweets from September 01, 2020, to November 01, 2020, then the BERT model is used for the sentiment and topic analysis. The results show that people had more negative than positive attitudes about the vaccine, and countries with an increasing number of confirmed cases had a higher percentage of negative attitudes. Additionally, the topics discussed in positive and negative tweets are different. The tweet datasets can be helpful to information professionals to inform the public about vaccine-related informational resources. Our findings may have implications for understanding people's cognitions and feelings about the vaccine.

Keywords: BERT, COVID-19 vaccine, sentiment analysis, topic modeling

Procedia PDF Downloads 112
1311 Scientific Recommender Systems Based on Neural Topic Model

Authors: Smail Boussaadi, Hassina Aliane

Abstract:

With the rapid growth of scientific literature, it is becoming increasingly challenging for researchers to keep up with the latest findings in their fields. Academic, professional networks play an essential role in connecting researchers and disseminating knowledge. To improve the user experience within these networks, we need effective article recommendation systems that provide personalized content.Current recommendation systems often rely on collaborative filtering or content-based techniques. However, these methods have limitations, such as the cold start problem and difficulty in capturing semantic relationships between articles. To overcome these challenges, we propose a new approach that combines BERTopic (Bidirectional Encoder Representations from Transformers), a state-of-the-art topic modeling technique, with community detection algorithms in a academic, professional network. Experiences confirm our performance expectations by showing good relevance and objectivity in the results.

Keywords: scientific articles, community detection, academic social network, recommender systems, neural topic model

Procedia PDF Downloads 48
1310 More Than a Game: An Educational Application Where Students Compete to Learn

Authors: Kadir Özsoy

Abstract:

Creating a moderately competitive learning environment is believed to have positive effects on student interest and motivation. The best way today to attract young learners to get involved in a fun, competitive learning experience is possible through mobile applications as these learners mostly rely on games and applications on their phones and tablets to have fun, communicate, look for information and study. In this study, a mobile application called ‘QuizUp’ is used to create a specific game topic for elementary level students at Anadolu University Preparatory School. The topic is specially designed with weekly-added questions in accordance with the course syllabus. Students challenge their classmates or randomly chosen opponents to answer questions related to their course subjects. They also chat and post on the topic’s wall in English. The study aims at finding out students’ perceptions towards the use of the application as a classroom and extra-curricular activity through a survey. The study concludes that educational games boost students’ motivation, lead to increased effort, and positively change their studying habits.

Keywords: competitive learning, educational application, effort, motivation 'QuizUp', study habits

Procedia PDF Downloads 323
1309 Green Accounting and Firm Performance: A Bibliometric Literature Review

Authors: Francesca di Donato, Sara Trucco

Abstract:

Green accounting is a growing topic of interest. Indeed, nowadays, most firms affect the environment; therefore, companies are seeking the best way to disclose environmental information. Furthermore, companies are increasingly committed to improving the environment, and the topic is gaining more importance to the public, governments, and policymakers. Green accounting is a type of accounting that considers environmental costs and their impact on the financial performance of firms. Thus, the motivation of the current research is to investigate the state-of-the-art literature on the relationship between green accounting and firm performance since the birth of the topic of green accounting and to investigate gaps in the literature that represent fruitful terrain for future research. In doing so, this study provides a bibliometric literature review of existing evidence related to the link between green accounting and firm performance since 2000. The search, based on the most relevant databases for scientific journals (which are Scopus, Emerald, Web of Science, Google Scholar, and Econlit), returned 1917 scientific articles. The articles were manually reviewed in order to identify only the relevant studies in the field by excluding articles with titles and abstracts out of scope. The final sample was composed of 107 articles. A content analysis was carried out on the final sample of articles; in doing so, a classification system has been proposed. Findings show the most relevant environmental costs and issues considered in previous studies and how green accounting may be linked to the financial and non-financial performance of a firm. The study also offers suggestions for future research in this domain. This study has several practical implications. Indeed, the topic of green accounting may be applied to different sectors and different types of companies. Therefore, this study may help managers to better understand the most relevant environmental information to disclose and how environmental issues may be managed to improve the performance of the firms. Moreover, the bibliometric literature review may be of interest to those stakeholders who are interested in the historical evolution of the topic.

Keywords: bibliometric literature review, firm performance, green accounting, literature review

Procedia PDF Downloads 27
1308 Towards Law Data Labelling Using Topic Modelling

Authors: Daniel Pinheiro Da Silva Junior, Aline Paes, Daniel De Oliveira, Christiano Lacerda Ghuerren, Marcio Duran

Abstract:

The Courts of Accounts are institutions responsible for overseeing and point out irregularities of Public Administration expenses. They have a high demand for processes to be analyzed, whose decisions must be grounded on severity laws. Despite the existing large amount of processes, there are several cases reporting similar subjects. Thus, previous decisions on already analyzed processes can be a precedent for current processes that refer to similar topics. Identifying similar topics is an open, yet essential task for identifying similarities between several processes. Since the actual amount of topics is considerably large, it is tedious and error-prone to identify topics using a pure manual approach. This paper presents a tool based on Machine Learning and Natural Language Processing to assists in building a labeled dataset. The tool relies on Topic Modelling with Latent Dirichlet Allocation to find the topics underlying a document followed by Jensen Shannon distance metric to generate a probability of similarity between documents pairs. Furthermore, in a case study with a corpus of decisions of the Rio de Janeiro State Court of Accounts, it was noted that data pre-processing plays an essential role in modeling relevant topics. Also, the combination of topic modeling and a calculated distance metric over document represented among generated topics has been proved useful in helping to construct a labeled base of similar and non-similar document pairs.

Keywords: courts of accounts, data labelling, document similarity, topic modeling

Procedia PDF Downloads 134
1307 Technology Assessment: Exploring Possibilities to Encounter Problems Faced by Intellectual Property through Blockchain

Authors: M. Ismail, E. Grifell-Tatjé, A. Paz

Abstract:

A significant discussion on the topic of blockchain as a solution to the issues of intellectual property highlights the relevance that this topic holds. Some experts label this technology as destructive since it holds immense potential to change course of traditional practices. The extent and areas to which this technology can be of use are still being researched. This paper provides an in-depth review on the intellectual property and blockchain technology. Further it explores what makes blockchain suitable for intellectual property, the practical solutions available and the support different governments are offering. This paper further studies the framework of universities in context of its outputs and how can they be streamlined using blockchain technology. The paper concludes by discussing some limitations and future research question.

Keywords: blockchain, decentralization, open innovation, intellectual property, patents, university-industry relationship

Procedia PDF Downloads 144
1306 Protection of a Doctor’s Reputation Against the Unjustified Medical Malpractice Allegations

Authors: Anna Wszołek

Abstract:

For a very long time, the doctor-patient relationship had a paternalistic character. The events of the II World War, as well as fast development of the biotechnology and medicine caused an important change in that relationship. Human beings and their dignity were put in the centre of philosophical and legal debate. The increasing frequency of clinical trials led to the emergence of bioethics, which dealt with the topic of the possibilities and boundaries of such research in relation to individual’s autonomy. Thus, there was a transformation from a paternalistic relationship to a more collaborative one in which the patient has more room for self-determination. Today, patients are more and more aware of their rights and the obligations placed on doctors and the health care system, which is linked to an increase in medical malpractice claims. Unfortunately, these claims are not always justified. There is a strong concentration around the topic of patient’s good, however, at the other side there are doctors who feel, on the example of Poland, they might be easily accused and sued for medical malpractice even though they fulfilled their duties. Such situation may have a negative impact on the quality of health care services and patient’s interests. This research is going to present doctor’s perspective on the topic of medical malpractice allegations. It is supposed to show possible damage to a doctor’s reputation caused by frivolous and weakly justified medical malpractice accusations, as well as means to protect this reputation.

Keywords: doctor's reputation, medical malpractice, personal rights, unjustified allegations

Procedia PDF Downloads 59