Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 27381

Search results for: unsupervised sentiment analysis

27231 COVID_ICU_BERT: A Fine-Tuned Language Model for COVID-19 Intensive Care Unit Clinical Notes

Authors: Shahad Nagoor, Lucy Hederman, Kevin Koidl, Annalina Caputo

Abstract:

Doctors’ notes reflect their impressions, attitudes, clinical sense, and opinions about patients’ conditions and progress, as well as other information that is essential for daily clinical decisions. Despite their value, clinical notes are insufficiently researched within the language processing community. Automatically extracting information from unstructured text is known to be more difficult than dealing with structured information such as vital physiological signs, images, and laboratory results. The aim of this research is to investigate how Natural Language Processing (NLP) and machine learning techniques applied to clinician notes can assist doctors’ decision-making in the Intensive Care Unit (ICU) for coronavirus disease 2019 (COVID-19) patients. The hypothesis is that clinical outcomes like survival or mortality can be useful in influencing the judgement of clinical sentiment in ICU clinical notes. This paper makes two contributions. First, we introduce COVID_ICU_BERT, a fine-tuned version of clinical transformer models that can reliably predict clinical sentiment for notes of COVID patients in the ICU. We train the model on clinical notes for COVID-19 patients, a type of note not previously seen by clinicalBERT and Bio_Discharge_Summary_BERT. The model based on clinicalBERT achieves higher predictive accuracy (Acc 93.33%, AUC 0.98, and precision 0.96). Second, we perform data augmentation using clinical contextual word embeddings based on a pre-trained clinical model to balance the samples in each class in the data (survived vs. deceased patients). Data augmentation improves the accuracy of prediction slightly (Acc 96.67%, AUC 0.98, and precision 0.92).

Keywords: BERT fine-tuning, clinical sentiment, COVID-19, data augmentation

Procedia PDF Downloads 189

27230 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: early warning system, knowledge management, market prediction, topic modeling

Procedia PDF Downloads 324

27229 Electrical Decomposition of Time Series of Power Consumption

Authors: Noura Al Akkari, Aurélie Foucquier, Sylvain Lespinats

Abstract:

Load monitoring is a management process for energy consumption aimed at energy savings and energy efficiency. Non-Intrusive Load Monitoring (NILM) is one method of load monitoring used for disaggregation purposes. NILM is a technique for identifying individual appliances based on the analysis of whole-residence data retrieved from the main power meter of the house. Our NILM framework starts with data acquisition, followed by data preprocessing, event detection, and feature extraction, with general appliance modeling and identification at the final stage. The event detection stage is a core component of the NILM process, since event detection techniques lead to the extraction of appliance features. Appliance features are required for the accurate identification of household devices. In this research work, we aim to develop a new event detection methodology with accurate load disaggregation to extract appliance features. The extracted time-domain features are used for tuning general appliance models for the appliance identification and classification steps. We use unsupervised algorithms such as Dynamic Time Warping (DTW). The proposed method relies on detecting the areas of operation of each residential appliance based on the power demand, and then detecting the times at which each selected appliance changes state. In order to fit the capabilities of existing smart meters, we work on low-sampling data with a frequency of 1/60 Hz. The data is simulated with the Load Profile Generator (LPG) software, which has not previously been considered for NILM purposes in the literature. LPG is numerical software that uses behaviour simulation of people inside the house to generate residential energy consumption data. The proposed event detection method targets low-consumption loads that are difficult to detect. It also facilitates the extraction of specific features used for general appliance modeling.
In addition, the identification process includes unsupervised techniques such as DTW. To the best of our knowledge, few unsupervised techniques have been employed with low-sampling data, in comparison to the many supervised techniques used for such cases. We extract a power interval within which the operation of the selected appliance falls, along with a time vector for the values delimiting the state transitions of the appliance. Appliance signatures are then formed from the extracted power, geometrical, and statistical features. Those signatures are used to tune general model types for appliance identification using unsupervised algorithms. This method is evaluated using both data simulated with LPG and the real-time Reference Energy Disaggregation Dataset (REDD). For that, we compute confusion-matrix-based performance metrics: accuracy, precision, recall, and error rate. The performance of our methodology is then compared with other detection techniques previously used in the literature, such as detection techniques based on statistical variations and abrupt changes (Variance Sliding Window and Cumulative Sum).

Keywords: electrical disaggregation, DTW, general appliance modeling, event detection

Procedia PDF Downloads 64
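
The abstract above names Dynamic Time Warping (DTW) as one of its unsupervised techniques but does not say how it is applied. As a rough illustration only, the following sketch computes the classic DTW distance between a short appliance template and two observed power traces. The function name, the absolute-difference local cost, and the one-minute wattage readings are all assumptions made for this example, not details taken from the paper.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW using |x - y| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat b[j-1] (stretch b)
                cost[i][j - 1],      # repeat a[i-1] (stretch a)
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[n][m]


# Hypothetical 1/60 Hz power readings in watts (illustrative values only).
kettle_template = [0, 2000, 2000, 0]
observed_kettle = [0, 0, 2000, 2000, 2000, 0]   # same shape, different duration
observed_fridge = [0, 120, 120, 120, 120, 0]    # different power level

same_shape = dtw_distance(kettle_template, observed_kettle)
other_shape = dtw_distance(kettle_template, observed_fridge)
```

Because DTW allows samples to be repeated during alignment, the two kettle traces match at zero cost even though their durations differ, while the fridge trace remains far from the template.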

27228 Component Based Testing Using Clustering and Support Vector Machine

Authors: Iqbaldeep Kaur, Amarjeet Kaur

Abstract:

Software reusability is an important part of software development. Component-based software development, as applied to software testing, has therefore gained a lot of practical importance in the field of software engineering, from the perspective of both academic researchers and the software development industry. Finding test cases for efficient reuse is one of the important problems addressed by researchers. Clustering reduces the search space and supports the reuse of test cases by grouping similar entities according to requirements, ensuring reduced time complexity, as it reduces the search time for retrieving test cases. In this research paper, we propose an unsupervised approach to the reusability of test cases, using k-means and Support Vector Machine. We have designed an algorithm for clustering requirement and test case documents according to their tf-idf vector space; the output is a set of highly cohesive pattern groups.

Keywords: software testing, reusability, clustering, k-mean, SVM

Procedia PDF Downloads 418
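
To make the tf-idf clustering idea in the abstract above more concrete, here is a minimal sketch that builds normalised tf-idf vectors for a few test-case descriptions and groups them with k-means. It covers only the k-means part; the Support Vector Machine step is not shown. The helper names, the whitespace tokeniser, the choice of the first k documents as initial centroids, and the sample documents are all assumptions for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return L2-normalised tf-idf vectors over a shared, sorted vocabulary."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    doc_freq = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vec = [tf[t] / len(doc) * math.log(n_docs / doc_freq[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def kmeans(vectors, k, iters=20):
    """Plain Lloyd's k-means; the first k vectors seed the centroids (assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical test-case descriptions (not from the paper).
test_cases = [
    "login password reset user",
    "payment card refund invoice",
    "user login password failure",
    "invoice payment card declined",
]
cluster_labels = kmeans(tfidf_vectors(test_cases), k=2)
```

With these sample documents, the login-related and payment-related test cases fall into separate groups, so a search for a reusable test case only has to look inside one cluster.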

27227 A Recommender System for Job Seekers to Show up Companies Based on Their Psychometric Preferences and Company Sentiment Scores

Authors: A. Ashraff

Abstract:

The increasing importance of the web as a medium for electronic and business transactions has served as a catalyst, or rather a driving force, for the introduction and implementation of recommender systems. Recommender systems play a major role in processing and analyzing thousands of data rows or reviews and help humans make a purchase decision about a product or service. They can also predict whether a particular user would rate a product or service, based on the user’s profile and behavioral pattern. At present, recommender systems are used extensively in every domain known to us and are said to be ubiquitous. However, in the field of recruitment, they are not being utilized exclusively. Recent statistics show an increase in staff turnover, which has negatively impacted organizations as well as employees. The reasons include company culture, working flexibility (work-from-home opportunity), no learning advancements, and pay scale. Further investigations revealed a lack of guidance or support that helps a job seeker find the company that will suit them best; although information about companies is available, job seekers cannot read all the reviews by themselves and reach an analytical decision. In this paper, we propose an approach to study the available review data on IT companies (scoring their reviews based on user review sentiments) and gather information on job seekers, including their psychometric evaluations. The system then presents the job seeker with useful information on which company is most suitable for them. The theoretical approach, the algorithmic approach, and the importance of such a system are discussed in this paper.

Keywords: psychometric tests, recommender systems, sentiment analysis, hybrid recommender systems

Procedia PDF Downloads 96

27226 Optimal Pricing Based on Real Estate Demand Data

Authors: Vanessa Kummer, Maik Meusel

Abstract:

Real estate demand estimates are typically derived from transaction data. However, in regions with excess demand, transactions are driven by supply and therefore do not indicate what people are actually looking for. To estimate the demand for housing in Switzerland, search subscriptions from all important Swiss real estate platforms are used. These data do, however, suffer from missing information—for example, many users do not specify how many rooms they would like or what price they would be willing to pay. In economic analyses, it is often the case that only complete data is used. Usually, however, the proportion of complete data is rather small which leads to most information being neglected. Also, the data might have a strong distortion if it is complete. In addition, the reason that data is missing might itself also contain information, which is however ignored with that approach. An interesting issue is, therefore, if for economic analyses such as the one at hand, there is an added value by using the whole data set with the imputed missing values compared to using the usually small percentage of complete data (baseline). Also, it is interesting to see how different algorithms affect that result. The imputation of the missing data is done using unsupervised learning. Out of the numerous unsupervised learning approaches, the most common ones, such as clustering, principal component analysis, or neural networks techniques are applied. By training the model iteratively on the imputed data and, thereby, including the information of all data into the model, the distortion of the first training set—the complete data—vanishes. In a next step, the performances of the algorithms are measured. This is done by randomly creating missing values in subsets of the data, estimating those values with the relevant algorithms and several parameter combinations, and comparing the estimates to the actual data. 
After having found the optimal parameter set for each algorithm, the missing values are being imputed. Using the resulting data sets, the next step is to estimate the willingness to pay for real estate. This is done by fitting price distributions for real estate properties with certain characteristics, such as the region or the number of rooms. Based on these distributions, survival functions are computed to obtain the functional relationship between characteristics and selling probabilities. Comparing the survival functions shows that estimates which are based on imputed data sets do not differ significantly from each other; however, the demand estimate that is derived from the baseline data does. This indicates that the baseline data set does not include all available information and is therefore not representative for the entire sample. Also, demand estimates derived from the whole data set are much more accurate than the baseline estimation. Thus, in order to obtain optimal results, it is important to make use of all available data, even though it involves additional procedures such as data imputation.

Keywords: demand estimate, missing-data imputation, real estate, unsupervised learning

Procedia PDF Downloads 273
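
The evaluation procedure described in the abstract above (randomly hiding known values, estimating them, and comparing the estimates with the actual data) can be sketched as follows. This is an illustration under stated assumptions only: the paper uses unsupervised learning methods such as clustering, principal component analysis, and neural networks, whereas this sketch swaps in simple mean and median imputers so that it stays self-contained. The function name and the price values are hypothetical.

```python
import random
import statistics


def mask_and_score(values, imputer, frac=0.3, seed=0):
    """Hide a random fraction of known values, impute them, return mean absolute error.

    `imputer` is any function that maps the observed values to a single estimate.
    """
    rng = random.Random(seed)
    n_hidden = max(1, int(len(values) * frac))
    hidden = set(rng.sample(range(len(values)), n_hidden))
    observed = [v for i, v in enumerate(values) if i not in hidden]
    estimate = imputer(observed)
    errors = [abs(values[i] - estimate) for i in hidden]
    return sum(errors) / len(errors)


# Hypothetical monthly rents with one outlier (illustrative values only).
prices = [1200.0, 1350.0, 1500.0, 1450.0, 9000.0, 1300.0, 1400.0, 1250.0, 1550.0, 1600.0]

# The same seed hides the same entries, so the two imputers are compared fairly.
mean_error = mask_and_score(prices, statistics.mean)
median_error = mask_and_score(prices, statistics.median)
```

Running the same masking with several imputers and parameter settings gives a like-for-like error comparison, which is the role this step plays before the final imputation.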

27225 Unsupervised Segmentation Technique for Acute Leukemia Cells Using Clustering Algorithms

Authors: N. H. Harun, A. S. Abdul Nasir, M. Y. Mashor, R. Hassan

Abstract:

Leukaemia is a blood cancer that contributes to the increase in the mortality rate in Malaysia each year. There are two main categories of leukaemia: acute and chronic. The production and development of acute leukaemia cells occur rapidly and uncontrollably. Therefore, if acute leukaemia cells could be identified quickly and effectively, proper treatment and medicine could be delivered. Given the need for prompt and accurate diagnosis of leukaemia, the current study proposes unsupervised pixel segmentation based on clustering algorithms in order to obtain a fully segmented abnormal white blood cell (blast) in acute leukaemia images. To obtain the segmented blast, three clustering algorithms, namely k-means, fuzzy c-means, and moving k-means, are applied to the saturation component image. Then, median filter and seeded region growing area extraction algorithms are applied to smooth the region of the segmented blast and to remove large unwanted regions from the image, respectively. The three clustering algorithms are compared in order to measure the performance of each in segmenting the blast area. Based on the good sensitivity value obtained, the results indicate that the moving k-means clustering algorithm successfully produces the fully segmented blast region in acute leukaemia images. Hence, the resultant images could be helpful to haematologists for further analysis of acute leukaemia.

Keywords: acute leukaemia images, clustering algorithms, image segmentation, moving k-means

Procedia PDF Downloads 281
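
As a small illustration of clustering on the saturation component mentioned in the abstract above, the sketch below converts RGB pixels to HSV saturation and separates them with plain one-dimensional k-means. It shows only standard k-means; fuzzy c-means, moving k-means, median filtering, and seeded region growing are not covered. The pixel values, the function names, and the assumption that stained blast pixels are the more saturated cluster are all hypothetical.

```python
import colorsys


def saturation(rgb_pixels):
    """Return the HSV saturation (0..1) of each 8-bit RGB pixel."""
    return [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in rgb_pixels]


def kmeans_1d(values, k=2, iters=50):
    """1-D k-means (k >= 2) with centroids initialised evenly between min and max."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Hypothetical pixels: pale background and strongly stained purple cell regions.
pixels = [(230, 225, 235), (90, 20, 140), (100, 30, 150), (240, 238, 245)]
labels, centroids = kmeans_1d(saturation(pixels), k=2)

# Assumption: the cluster with the highest mean saturation is the candidate blast.
blast_cluster = max(range(len(centroids)), key=lambda c: centroids[c])
blast_mask = [lab == blast_cluster for lab in labels]
```

On a real image the same steps would run over every pixel, and the resulting mask would then be passed to the smoothing and region-extraction stages.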

27224 Domain Adaptive Dense Retrieval with Query Generation

Authors: Rui Yin, Haojie Wang, Xun Li

Abstract:

Recently, mainstream dense retrieval methods have obtained state-of-the-art results on some datasets and tasks. However, they require large amounts of training data, which is not available in most domains. The severe performance degradation of dense retrievers on new data domains has limited the use of dense retrieval methods to only a few domains with large training datasets. In this paper, we propose an unsupervised domain-adaptive approach based on query generation. First, a generative model is used to generate relevant queries for each passage in the target corpus, and then, the generated queries are used for mining negative passages. Finally, the query-passage pairs are labeled with a cross-encoder and used to train a domain-adapted dense retriever. We also explore contrastive learning as a method for training domain-adapted dense retrievers and show that it leads to strong performance in various retrieval settings. Experiments show that our approach is more robust than previous methods in target domains that require less unlabeled data.

Keywords: dense retrieval, query generation, contrastive learning, unsupervised training

Procedia PDF Downloads 83
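
The abstract above mentions contrastive learning for training dense retrievers without giving the objective. One commonly used form is an InfoNCE-style loss over a query, one positive passage, and mined negative passages; the sketch below computes it on toy embeddings. The dot-product scoring, the temperature value, and the vectors are assumptions for illustration and may differ from what the authors used.

```python
import math


def info_nce_loss(query, positive, negatives, temperature=0.05):
    """Negative log-probability of the positive passage under a softmax over all candidates."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    logits = [dot(query, positive) / temperature]
    logits += [dot(query, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_denominator - logits[0]


# Toy 2-D embeddings (hypothetical): the query matches the first passage.
query = [1.0, 0.0]
aligned_passage = [1.0, 0.0]
unrelated_passage = [0.0, 1.0]

good_loss = info_nce_loss(query, aligned_passage, [unrelated_passage])
bad_loss = info_nce_loss(query, unrelated_passage, [aligned_passage])
```

The loss is close to zero when the query embedding is nearest to its positive passage and grows when a negative passage scores higher, which is the signal that pulls generated queries toward their source passages during training.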

27223 Deep Learning Based Unsupervised Sport Scene Recognition and Highlights Generation

Authors: Ksenia Meshkova

Abstract:

With the increasing amount of multimedia data, it is very important to automate and speed up the process of obtaining metadata. This process means not just recognition of some object or its movement, but recognition of the entire scene rather than separate frames, with timeline segmentation as the final result. Labeling datasets is time-consuming; besides, attributing characteristics to particular scenes is clearly difficult due to their nature. In this article, we consider the application of autoencoders to unsupervised scene recognition and clusterization based on interpretable features. Further, we focus on the particular types of autoencoders that are relevant to our study. We take a look at the specificity of deep learning related to information theory and rate-distortion theory and describe solutions that address the poor interpretability of deep learning in media content processing. In conclusion, we present the results of a custom framework based on autoencoders that is capable of the scene recognition studied above, with highlights generated from this recognition. We do not describe the mathematics of neural networks in detail but clarify the necessary concepts and pay attention to important nuances.

Keywords: neural networks, computer vision, representation learning, autoencoders

Procedia PDF Downloads 111

27222 Neural Networks Models for Measuring Hotel Users Satisfaction

Authors: Asma Ameur, Dhafer Malouche

Abstract:

Nowadays, user comments on the Internet have an important impact on hotel bookings. This confirms that the e-reputation issue can influence the likelihood of customer loyalty to a hotel. In this way, e-reputation has become a real differentiator between hotels. For this reason, we have a unique opportunity in the opinion mining field to analyze the comments. This field makes it possible to extract information related to the polarity of user reviews. This sentiment study (opinion mining) represents a new line of research for analyzing unstructured textual data. Knowing the e-reputation score helps the hotelier to better manage his marketing strategy. The score obtained is translated into the image of hotels to differentiate between them. Therefore, the present research highlights the importance of hotel satisfaction scoring. To calculate the satisfaction score, sentiment analysis can be carried out with several machine learning techniques. This study treats the extracted textual data using the Artificial Neural Networks (ANNs) approach. In this context, we adopt this technique to extract information from the comments available on the ‘Trip Advisor’ website. This paper details the description and modeling of the ANNs approach for the scoring of online hotel reviews. In summary, the validation of this method provides a significant model for hotel sentiment analysis, making it possible to determine precisely the polarity of hotel users’ reviews. The empirical results show that ANNs are an accurate approach for sentiment analysis. The results also show that the proposed approach serves dimensionality reduction for textual data clustering. Thus, this study provides researchers with a useful exploration of this technique. Finally, we outline guidelines for future research in the hotel e-reputation field, such as comparing ANNs with other techniques.

Keywords: clustering, consumer behavior, data mining, e-reputation, machine learning, neural network, online hotel reviews, opinion mining, scoring

Procedia PDF Downloads 119

27221 The Association between Affective States and Sexual/Health-Related Status among Men Who Have Sex with Men in China: An Exploration Study Using Social Media Data

Authors: Zhi-Wei Zheng, Zhong-Qi Liu, Jia-Ling Qiu, Shan-Qing Guo, Zhong-Wei Jia, Chun Hao

Abstract:

Objectives: The purpose of this study was to understand and examine the association between diurnal mood variation and sexual/health-related status among men who have sex with men (MSM) using data from MSM Chinese Twitter messages. The study consists of 843,745 postings of 377,610 MSM users located in Guangdong that were culled from the MSM Chinese Twitter App. Positive affect, negative affect, sexual related behaviors, and health-related status were measured using the Simplified Chinese Linguistic Inquiry and Word Count. Emotions, including joy, sadness, anger, fear, and disgust were measured using the Weibo Basic Mood Lexicon. A positive sentiment score and a positive emotions score were also calculated. Linear regression models based on a permutation test were used to assess associations between affective states and sexual/health-related status. In the results, 5,871 active MSM users and their 477,374 postings were finally selected. MSM expressed positive affect and joy at 8 a.m. and expressed negative affect and negative emotions between 2 a.m. and 4 a.m. In addition, 25.1% of negative postings were directly related to health and 13.4% reported seeking social support during that sensitive period. MSM who were senior, educated, overweight or obese, self-identified as performing a versatile sex role, and with less followers, more followers, and less chat groups mainly expressed more negative affect and negative emotions. MSM who talked more about sexual-related behaviors had a higher positive sentiment score (β=0.29, p < 0.001) and a higher positive emotions score (β = 0.16, p < 0.001). MSM who reported more on their health status had a lower positive sentiment score (β = -0.83, p < 0.001) and a lower positive emotions score (β = -0.37, p < 0.001). The study concluded that psychological intervention based on an app for MSM should be conducted, as it may improve mental health.

Keywords: affect, men who have sex with men, sexual related behavior, health-related status, social media

Procedia PDF Downloads 147

27220 Evaluation Metrics for Machine Learning Techniques: A Comprehensive Review and Comparative Analysis of Performance Measurement Approaches

Authors: Seyed-Ali Sadegh-Zadeh, Kaveh Kavianpour, Hamed Atashbar, Elham Heidari, Saeed Shiry Ghidary, Amir M. Hajiyavand

Abstract:

Evaluation metrics play a critical role in assessing the performance of machine learning models. In this review paper, we provide a comprehensive overview of performance measurement approaches for machine learning models. For each category, we discuss the most widely used metrics, including their mathematical formulations and interpretation. Additionally, we provide a comparative analysis of performance measurement approaches for metric combinations. Our review paper aims to provide researchers and practitioners with a better understanding of performance measurement approaches and to aid in the selection of appropriate evaluation metrics for their specific applications.

Keywords: evaluation metrics, performance measurement, supervised learning, unsupervised learning, reinforcement learning, model robustness and stability, comparative analysis

Procedia PDF Downloads 47
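
The review above discusses widely used metrics and their mathematical formulations without listing them in the abstract. As a brief reminder of the standard definitions for binary classification, the sketch below derives accuracy, precision, recall, and F1 from the confusion-matrix counts. The function name and the label vectors are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 2 true positives, 2 true negatives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

Reporting several of these together matters because accuracy alone can look high on imbalanced data while precision or recall is poor.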

27219 Using Bidirectional Encoder Representations from Transformers to Extract Topic-Independent Sentiment Features for Social Media Bot Detection

Authors: Maryam Heidari, James H. Jones Jr.

Abstract:

Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event or product. However, this use raises an important question: what percentage of information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a bot, instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. In this paper, we introduce a model for social media bot detection which uses Bidirectional Encoder Representations from Transformers (Google BERT) for sentiment classification of tweets to identify topic-independent features. Our use of a Natural Language Processing approach to derive topic-independent features for our new bot detection model distinguishes this work from previous bot detection models. We achieve 94% accuracy classifying the contents of data as generated by a bot or a human, where the most accurate prior work achieved accuracy of 92%.

Keywords: bot detection, natural language processing, neural network, social media

Procedia PDF Downloads 105

27218 Using Deep Learning Neural Networks and Candlestick Chart Representation to Predict Stock Market

Authors: Rosdyana Mangir Irawan Kusuma, Wei-Chun Kao, Ho-Thi Trang, Yu-Yen Ou, Kai-Lung Hua

Abstract:

Stock market prediction is still a challenging problem because many factors affect stock prices, such as company news and performance, industry performance, investor sentiment, social media sentiment, and economic factors. This work explores predictability in the stock market using deep convolutional networks and candlestick charts. The outcome is used to design a decision support framework that provides traders with suggested indications of future stock price direction. We perform this work using various types of neural networks, such as a convolutional neural network, a residual network, and a visual geometry group network. We convert historical stock market data to candlestick charts. These candlestick charts are then fed as input for training a convolutional neural network model, which helps us analyze the patterns inside the candlestick chart and predict the future movements of the stock market. The effectiveness of our method is evaluated in stock market prediction with promising results: 92.2% and 92.1% accuracy for the Taiwan and Indonesian stock market datasets, respectively.

Keywords: candlestick chart, deep learning, neural network, stock market prediction

Procedia PDF Downloads 425

27217 Efficient Schemes of Classifiers for Remote Sensing Satellite Imageries of Land Use Pattern Classifications

Authors: S. S. Patil, Sachidanand Kini

Abstract:

Classification of land use patterns is challenging because of the complexity and variability of remote sensing imagery data. This research in remote sensing applications mines some of the significant spatially variable factors, such as land cover and land use, from satellite images of remote arid areas in Karnataka State, India. Diverse classification techniques, unsupervised and supervised, consisting of maximum likelihood, Mahalanobis distance, and minimum distance, are applied in Bellary District in Karnataka State, India, for the classification of the raw satellite images. The accuracy of the results is evaluated by visual comparison with standard maps with ground truths. The maximum likelihood technique gave the finest results, while both the minimum distance and Mahalanobis distance methods overvalued agricultural land areas. Despite the loss of a few irrelevant features due to the low resolution of the satellite images, good agreement was found between parameters extracted automatically from the developed maps and field observations.

Keywords: Mahalanobis distance, minimum distance, supervised, unsupervised, user classification accuracy, producer's classification accuracy, maximum likelihood, kappa coefficient

Procedia PDF Downloads 170
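
Of the classifiers named in the abstract above, the minimum distance method is the simplest to illustrate: each pixel is assigned to the class whose mean spectral vector is nearest. The sketch below uses Euclidean distance over three bands. The class names, band means, and pixel values are hypothetical and do not come from the Bellary District data; the maximum likelihood and Mahalanobis distance variants, which also use class covariance, are not shown.

```python
import math


def minimum_distance_classify(pixel, class_means):
    """Assign the pixel to the class whose mean vector is closest (Euclidean)."""
    return min(class_means, key=lambda name: math.dist(pixel, class_means[name]))


# Hypothetical per-class mean reflectance in three spectral bands.
class_means = {
    "water": (30.0, 20.0, 10.0),
    "agriculture": (60.0, 90.0, 140.0),
    "barren": (150.0, 140.0, 130.0),
}

pixel_a = (35.0, 25.0, 15.0)
pixel_b = (140.0, 150.0, 120.0)
label_a = minimum_distance_classify(pixel_a, class_means)
label_b = minimum_distance_classify(pixel_b, class_means)
```

Because this rule ignores how spread out each class is, a class with a wide spectral range can absorb or lose pixels, which is one plausible reason distance-based methods may misjudge the extent of a land class.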

27216 Sentiment Analysis of Tourist Online Reviews Concerning Lisbon Cultural Patrimony, as a Contribute to the City Attractiveness Evaluation

Authors: Joao Ferreira Do Rosario, Maria De Lurdes Calisto, Ana Teresa Machado, Nuno Gustavo, Rui Gonçalves

Abstract:

The tourism sector is increasingly important to the economic performance of countries and a relevant theme for academic research, which increases the importance of understanding how and why tourists evaluate tourism locations. The city of Lisbon is currently a tourist destination of excellence in the European and worldwide panorama, registering significant growth in the economic weight of its tourist activities in the Gross Added Value of the region. Although there is research on the feedback of those who visit tourist sites, and different methodologies for studying tourist sites have been applied, this research seeks to be innovative in its objective of obtaining insights into the competitiveness, in terms of attractiveness, of the city of Lisbon as a tourist destination, based on the feedback of tourists on the Facebook pages of the most visited museums and monuments of Lisbon, an interpretation that is relevant to the development of tourist attraction strategies. The intangible dimension of the tourism offer, due to its unique condition of simultaneous production and consumption, makes eWOM particularly relevant. The testimony of consumers is thus a decisive factor in the decision-making and buying process in tourism. Online social networks are among the platforms most used by tourists to evaluate the attraction points of a tourism destination (e.g. cultural and historical heritage), and this user-generated feedback provides relevant information about the customer-tourists. This information is related to the tourist experience, representing the true voice of the customer. Furthermore, this voice, perceived by others as genuine in contrast to marketing messages, may have a powerful word-of-mouth influence on other potential tourists.
The relevance of sharing online reviews, however, becomes particularly complex considering social media users’ different profiles and the different possible sources of information available, as well as the reputation associated with each source. In the light of these trends, our research focuses on tourists’ feedback on the Facebook pages of the most visited museums and monuments of Lisbon that contribute to its attractiveness as a tourism destination. Sentiment Analysis is the methodology selected for this research, using publicly available information in the online context, which was deemed an appropriate non-participatory observation method. Data will be collected from the Facebook pages of two museums (Museu dos Coches and Museu de Arte Antiga) and three monuments (Mosteiro dos Jerónimos, Torre de Belém and Panteão Nacional) over a period of one year. The research results will help in the evaluation of the considered places by the tourists and of their contribution to the city’s attractiveness, and will present insights helpful for management decisions regarding these museums and monuments. The results of this study will also contribute to a better knowledge of the tourism sector, namely the identification of attributes in the evaluation and choice of the city of Lisbon as a tourist destination. Further research will evaluate Lisbon’s attraction points for tourists in categories beyond museums and monuments, evaluate tourist feedback from other sources such as TripAdvisor, and apply the same methodology in other cities and country regions.

Keywords: Lisbon tourism, opinion mining, sentiment analysis, tourism location attractiveness evaluation

Procedia PDF Downloads 220

27215 The Impact of the Economic Crisis in the European Identity

Authors: Sofía Luna, Carla González Salamanca

Abstract:

The 2008 economic crisis had huge implications in Europe. In this continent, the repercussions of the crisis were not only economic but also political and institutional. The economic stress has generated changes in citizens’ perceptions, their attitudes, and the confidence placed in political organizations. The loss of confidence is present not only in the debtor countries but also in European economic powers like Germany and France. This research explains how the economic crisis affected identity and the population’s attitudes, and how this generated the rise of extreme right parties. In addition, it defines the different types of attitudes and support that exist towards these political and economic institutions. The results of this investigation show that the depression, besides its economic implications, caused institutional, social, and political difficulties for the Union. Moreover, the support and attitudes of the population were severely strained because confidence in the political organization decreased. Furthermore, a rise in the otherness sentiment was shown. In other words, the distinction between “us” and “them” increased, causing repercussions for the collective European identity. Additionally, there was a spread of national identities that caused the rise of the extreme right-wing parties. In conclusion, the 2008 economic crisis caused not only economic stress but also a political, social, and institutional crisis in Europe.

Keywords: Europe, identity, economic crisis, otherness sentiment

Procedia PDF Downloads 489

27214 Analysis of Pangasinan State University: Bayambang Students’ Concerns Through Social Media Analytics and Latent Dirichlet Allocation Topic Modelling Approach

Authors: Matthew John F. Sino Cruz, Sarah Jane M. Ferrer, Janice C. Francisco

Abstract:

The COVID-19 pandemic has affected more than 114 countries all over the world since it was considered a global health concern in 2020. Different sectors, including education, have shifted to remote/distance setups to follow the guidelines set to prevent the spread of the disease. One of the higher education institutions that shifted to a remote setup is Pangasinan State University (PSU). In order to continue providing quality instruction to its students, PSU designed a Flexible Learning Model to keep providing services to its stakeholders amidst the pandemic. The model covers the redesign of instructional delivery in a remote setup and the technology needed to support these adjustments. The primary goal of this study is to determine the insights of the PSU – Bayambang students into the remote setup implemented during the pandemic and how they perceived the initiatives employed in relation to their experiences in flexible learning. In this study, the topic modelling approach was implemented using Latent Dirichlet Allocation. The dataset used in the study. The results show that the most common concerns of the students include time and resource management, poor internet connection, and difficulty coping with the flexible learning modality. Furthermore, the findings of the study can be used as one of the bases for the administration to review and improve the policies and initiatives implemented during the pandemic in relation to remote service delivery. In addition, further studies can be conducted to determine the overall sentiment of the other stakeholders towards the policies implemented at the University.
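To make the topic modelling step more concrete, the following is a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling. It is not the authors' implementation: the abstract does not describe the dataset or the tooling, so the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy tokenised comments are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenised documents with a collapsed Gibbs sampler.

    Returns (doc_topic_counts, top_words_per_topic).
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]            # n(d, k)
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                           # n(k)
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the full conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    top_words = []
    for k in range(num_topics):
        used = [w for w in topic_word[k] if topic_word[k][w] > 0]
        used.sort(key=lambda w: (-topic_word[k][w], w))
        top_words.append(used[:3])
    return doc_topic, top_words


# Hypothetical tokenised student comments (invented for illustration).
toy_docs = [
    ["internet", "connection", "slow", "internet"],
    ["deadline", "time", "management", "deadline"],
    ["slow", "connection", "module", "download"],
    ["time", "workload", "deadline", "management"],
]
doc_topic_counts, topics = lda_gibbs(toy_docs, num_topics=2)
print(topics)
```

On a real corpus the comments would first need tokenisation and stop-word removal, and the number of topics would have to be chosen, for example by inspecting topic coherence.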

Keywords: COVID-19, topic modelling, students’ sentiment, flexible learning, Latent Dirichlet allocation

Procedia PDF Downloads 110

27213 Emotional Analysis for Text Search Queries on Internet

Authors: Gemma García López

Abstract:

The goal of this study is to analyze whether search queries carried out in search engines such as Google can offer emotional information about the user who performs them. Knowing the Internet user’s emotional state can be key to achieving the maximum personalization of content and to detecting worrying behaviors. For this, two studies were carried out using tools with advanced natural language processing techniques. The first study determines whether a query can be classified as positive, negative or neutral, while the second study extracts emotional content from words and applies the categorical and dimensional models for the representation of emotions. In addition, we use search queries in Spanish and English to establish similarities and differences between the two languages. The results revealed that text search queries performed by users on the Internet can be classified emotionally. This allows us to better understand the emotional state of the user at the time of the search, which could involve adapting the technology and personalizing the responses to different emotional states.
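As a rough illustration of the first study (positive, negative or neutral classification of a query), the sketch below uses a tiny word lexicon. This is a stand-in, not the tools used in the study, which the abstract does not name; the word lists `POSITIVE` and `NEGATIVE` and the function `classify_query` are hypothetical.

```python
# Hypothetical mini-lexicons; a real system would use a full sentiment resource.
POSITIVE = {"best", "happy", "love", "great"}
NEGATIVE = {"worst", "sad", "hate", "pain"}


def classify_query(query):
    """Label a search query by counting lexicon hits."""
    tokens = query.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for q in ["best pizza near me", "why am I so sad", "weather tomorrow"]:
    print(q, "->", classify_query(q))
```

A lexicon per language would be needed to compare Spanish and English queries in the same way.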

Keywords: emotion classification, text search queries, emotional analysis, sentiment analysis in text, natural language processing

Procedia PDF Downloads 132

27212 Advancing the Analysis of Physical Activity Behaviour in Diverse, Rapidly Evolving Populations: Using Unsupervised Machine Learning to Segment and Cluster Accelerometer Data

Authors: Christopher Thornton, Niina Kolehmainen, Kianoush Nazarpour

Abstract:

Background: Accelerometers are widely used to measure physical activity behavior, including in children. The traditional method for processing acceleration data uses cut points, relying on calibration studies that relate the quantity of acceleration to energy expenditure. As these relationships do not generalise across diverse populations, they must be parametrised for each subpopulation, including different age groups, which is costly and makes studies across diverse populations difficult. A data-driven approach that allows physical activity intensity states to emerge from the data under study without relying on parameters derived from external populations offers a new perspective on this problem and potentially improved results. We evaluated the data-driven approach in a diverse population with a range of rapidly evolving physical and mental capabilities, namely very young children (9-38 months old), where this new approach may be particularly appropriate. Methods: We applied an unsupervised machine learning approach (a hidden semi-Markov model - HSMM) to segment and cluster the accelerometer data recorded from 275 children with a diverse range of physical and cognitive abilities. The HSMM was configured to identify a maximum of six physical activity intensity states and the output of the model was the time spent by each child in each of the states. For comparison, we also processed the accelerometer data using published cut points with available thresholds for the population. This provided us with time estimates for each child’s sedentary (SED), light physical activity (LPA), and moderate-to-vigorous physical activity (MVPA). Data on the children’s physical and cognitive abilities were collected using the Paediatric Evaluation of Disability Inventory (PEDI-CAT). 
Results: The HSMM identified two inactive states (INS, comparable to SED), two lightly active long duration states (LAS, comparable to LPA), and two short-duration high-intensity states (HIS, comparable to MVPA). Overall, the children spent on average 237/392 minutes per day in INS/SED, 211/129 minutes per day in LAS/LPA, and 178/168 minutes in HIS/MVPA. We found that INS overlapped with 53% of SED, LAS overlapped with 37% of LPA and HIS overlapped with 60% of MVPA. We also looked at the correlation between the time spent by a child in either HIS or MVPA and their physical and cognitive abilities. We found that HIS was more strongly correlated with physical mobility (R²HIS =0.5, R²MVPA= 0.28), cognitive ability (R²HIS =0.31, R²MVPA= 0.15), and age (R²HIS =0.15, R²MVPA= 0.09), indicating increased sensitivity to key attributes associated with a child’s mobility. Conclusion: An unsupervised machine learning technique can segment and cluster accelerometer data according to the intensity of movement at a given time. It provides a potentially more sensitive, appropriate, and cost-effective approach to analysing physical activity behavior in diverse populations, compared to the current cut points approach. This, in turn, supports research that is more inclusive across diverse populations.
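The comparison between cut-point categories and model states can be sketched as follows. The thresholds, the per-epoch counts and the state sequence are invented for illustration (the abstract does not give the published cut points or the HSMM output), and `overlap_fraction` is one plausible reading of "INS overlapped with 53% of SED", namely the share of SED epochs that the model also labels INS.

```python
def classify_epochs(counts, light_cut, mvpa_cut):
    """Traditional cut-point processing: label each epoch by its count."""
    labels = []
    for c in counts:
        if c < light_cut:
            labels.append("SED")
        elif c < mvpa_cut:
            labels.append("LPA")
        else:
            labels.append("MVPA")
    return labels


def overlap_fraction(reference, other, ref_label, other_label):
    """Fraction of epochs labelled ref_label that are also other_label."""
    ref_idx = [i for i, lab in enumerate(reference) if lab == ref_label]
    if not ref_idx:
        return 0.0
    both = sum(1 for i in ref_idx if other[i] == other_label)
    return both / len(ref_idx)


# Hypothetical accelerometer counts per epoch and hypothetical thresholds.
counts = [10, 50, 120, 400, 30, 900]
cut_labels = classify_epochs(counts, light_cut=100, mvpa_cut=500)

# Hypothetical state sequence standing in for the HSMM output.
hsmm_states = ["INS", "LAS", "LAS", "HIS", "INS", "HIS"]
print(overlap_fraction(cut_labels, hsmm_states, "SED", "INS"))
```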

Keywords: physical activity, machine learning, under 5s, disability, accelerometer

Procedia PDF Downloads 194

27211 The Role of Macroeconomic Condition and Volatility in Credit Risk: An Empirical Analysis of Credit Default Swap Index Spread on Structural Models in U.S. Market during Post-Crisis Period

Authors: Xu Wang

Abstract:

This research regresses investment grade and high yield Credit Default Swap index (CDX) spreads on U.S. macroeconomic condition and volatility measures, using monthly data from March 2009 to July 2016, to study the relationship between different dimensions of the macroeconomy and overall credit risk quality. The most significant contribution of this research is systematically examining the individual and joint effects of macroeconomic condition and volatility on CDX spreads by including macroeconomic time series that capture different dimensions of the U.S. economy. The industrial production index growth, non-farm payroll growth, consumer price index growth, 3-month treasury rate and consumer sentiment are introduced to capture the condition of real economic activity, employment, inflation, monetary policy and risk aversion respectively. The conditional variance of the macroeconomic series is constructed using an ARMA-GARCH model and is used to measure macroeconomic volatility. A linear regression model is estimated to capture relationships between monthly average CDX spreads and macroeconomic variables. The Newey–West estimator is used to control for autocorrelation and heteroskedasticity in error terms. Furthermore, the sensitivity factor analysis and standardized coefficients analysis are conducted to compare the sensitivity of CDX spreads to different macroeconomic variables and to compare relative effects of macroeconomic condition versus macroeconomic uncertainty respectively. This research shows that macroeconomic condition can have a negative effect on CDX spread while macroeconomic volatility has a positive effect on determining CDX spread. Macroeconomic condition and volatility variables can jointly explain more than 70% of the whole variation of the CDX spread. In addition, sensitivity factor analysis shows that the CDX spread is the most sensitive to the Consumer Sentiment index.
Finally, the standardized coefficients analysis shows that both macroeconomic condition and volatility variables are important in determining CDX spread, but the macroeconomic condition variables are relatively more important than the macroeconomic volatility variables. This research shows that the CDX spread can reflect the individual and joint effects of macroeconomic condition and volatility, which suggests that individual investors or governments should carefully regard CDX spread as a measure of overall credit risk because the CDX spread is influenced by the macroeconomy. In addition, the significance of macroeconomic condition and volatility variables, such as Non-farm Payroll growth rate and Industrial Production Index growth volatility, suggests that the government should pay more attention to the overall credit quality in the market when the macroeconomy is weak or volatile.
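The volatility measure described above can be illustrated with the GARCH(1,1) conditional variance recursion, sigma2[t] = omega + alpha * eps[t-1]**2 + beta * sigma2[t-1]. This is only a sketch: the abstract does not state the model orders, the ARMA mean equation is omitted (the residuals are taken as given), and the parameter values are hypothetical rather than estimated.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variance path of a GARCH(1,1) model.

    The first value is the unconditional variance omega / (1 - alpha - beta),
    which requires alpha + beta < 1.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite variance")
    if not residuals:
        return []
    sigma2 = [omega / (1.0 - alpha - beta)]
    for eps in residuals[:-1]:
        # Next variance depends on the previous shock and previous variance.
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2


# Hypothetical residuals from a mean equation, with hypothetical parameters.
path = garch11_variance([0.0, 1.0, 0.0], omega=0.1, alpha=0.1, beta=0.8)
print(path)
```

In practice the parameters would be estimated by maximum likelihood for each macroeconomic series before the variance path enters the regression.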

Keywords: autoregressive moving average model, credit spread puzzle, credit default swap spread, generalized autoregressive conditional heteroskedasticity model, macroeconomic conditions, macroeconomic uncertainty

Procedia PDF Downloads 153

27210 Influence of the Popularity of Opera during Risorgimento on Foreign Presence in Italy

Authors: Andrew Wee

Abstract:

As a result of the Italian Independence Wars starting in 1848, Italy began to change through unification. People gradually moved away from some of their traditional practices and values, such as the long-held belief that women were inferior to men, as part of the Risorgimento. Italians began to take interest in opera as a form of emotional release. As opera became more popular and prominent in their culture, it aided in the dissemination of ideas, especially stimulating the spread of imperialism, in the late 19th century, as Italy began extending its presence to other countries. In order to collect the information needed to analyze Italy’s foreign presence, it was necessary to consult texts concerning the culture of the Risorgimento. These texts included primary sources from operatic composers and contemporary recorded accounts. Letters from Giuseppe Verdi, a leader in opera during the Risorgimento, have been scrutinized for indications of popular attitudes of the time. The cultural context of the Risorgimento is essential to understanding the Italian motives and attitudes towards the outside world. On the more political side, research has also entailed the study of historical data of general laws, policies, and their purposes concerning geopolitical boundaries and foreign affairs, such as Edward Said’s thesis on Orientalism. By establishing these two characteristics of Italy, the paper will thoroughly illustrate Italy’s presence in foreign affairs. Texts have been searched with the intent of using information that reveals Italian attitudes toward exotic countries to determine whether their demeanor was positive or condescending. Motives behind sources have been interpreted in context in order to form a complete picture of the Italian sentiment towards foreigners. Additionally, research pertaining to Italian nationalism and imperialism such as song and literature has been used. 
The primary form of research has been the division of sources into those that are culturally based and those that are political in nature. Opera had been developing continuously since its creation in the 17th century, and in the 19th century, the bel canto movement revolutionized opera and its role in Italian society. This paper uses evidence that popular sentiment was influenced by opera to support the belief that the evolution of opera was a result of nationalist sentiment and in turn fueled the cultural movement known as the Risorgimento. In this way, opera proceeded to affect Italian culture by spreading the idea of imperialism.

Keywords: opera, Italian unification, music history, imperialism

Procedia PDF Downloads 334

27209 Fuzzy Set Approach to Study Appositives and Its Impact Due to Positional Alterations

Authors: E. Mike Dison, T. Pathinathan

Abstract:

Computing with Words (CWW) and Possibilistic Relational Universal Fuzzy (PRUF) are two concepts widely used to represent and measure vaguely defined natural phenomena. In this paper, we study how altering the position of phrases affects and/or modifies the impact of a natural language proposition. We observe the gradations in the sensitivity/feeling of a statement caused by positional alterations. We derive the classification and modification of the meaning of words due to positional alteration. We present the results with reference to set-theoretic interpretations.

Keywords: appositive, computing with words, possibilistic relational universal fuzzy (PRUF), semantic sentiment analysis, set-theoretic interpretations

Procedia PDF Downloads 146

27208 Sterilization Incident Analysis by the Association of Litigation and Risk Management Method

Authors: Souhir Chelly, Asma Ben Cheikh, Hela Ghali, Salwa Khefacha, Lamine Dhidah, Mohamed Ben Rejeb, Houyem Said Latiri

Abstract:

The hospital risk management department is the first to be involved in the methodological analysis of grade zero sterilization incidents. The system is based on a subsequent analysis process in compliance with the ongoing requirements of the Haute Autorité de santé (HAS) for a reactive approach to risk, making it possible to identify failures and initiate the appropriate preventive and corrective measures. The use of the association of litigation and risk management (ALARM) method makes the grade zero analysis easier and brings to light the team, institutional, organizational, temporal and individual factors behind undesirable effects. Two main factors emerge from this analysis. First, the pre-disinfection step in the emergency block was poorly done by an unsupervised instrumentalist intern, who did not remove the battery from the micro air motor. Second, at the sterilization unit, the worker, who was not supervised by the nurse, carried out the conditioning of the motor without checking whether it still contained the battery. The main cause is that the management of human resources was inadequate at both levels: the instrumentalist trainee in the block was not supervised by their supervisor, and the worker of the sterilization unit was not supervised by the responsible nurse. There is a lack of research help, advice, and collaboration. The difficulties encountered during this type of analysis are multiple. The first is its necessary acceptance by the various care providers involved, who should not perceive it as a tool leading to individual punishment, but rather as a means to improve their practices.

Keywords: ALARM (Association of Litigation and Risk Management Method), incident, risk management, sterilization

Procedia PDF Downloads 206

27207 Natural Language Processing for the Classification of Social Media Posts in Post-Disaster Management

Authors: Ezgi Şendil

Abstract:

Information extracted from social media has received great attention since it has become an effective alternative for collecting people’s opinions and emotions based on specific experiences in a faster and easier way. The paper aims to organize data in a meaningful way in order to analyze users’ posts and draw conclusions about the experiences and opinions of users during and after natural disasters. The posts collected from Reddit are classified into nine different categories, including injured/dead people, infrastructure and utility damage, missing/found people, donation needs/offers, caution/advice, and emotional support, identified by using labelled Twitter data and four different machine learning (ML) classifiers.

Keywords: disaster, NLP, postdisaster management, sentiment analysis

Procedia PDF Downloads 61

27206 Circulating Public Perception on Agroforestry: Discourse Networks Analysis Using Social Media and Online News Media in Four Countries of the Sahel Region

Authors: Luisa Müting, Wisnu Harto Adiwijoyo

Abstract:

Agroforestry systems transform the agricultural landscapes in the Sahel region of Africa, providing food and farming products consumed for subsistence or sold for income. In the increasingly dry climate of the Sahel region, the spread of agroforestry practices is integral to policymakers’ efforts to counteract land degradation and provide soil restoration in the region. Several measures on agroforestry practices have been implemented in the region by governmental and non-governmental institutions in recent years. However, despite the efforts, past research shows that awareness of how policies and interventions are being consumed and perceived by the public remains low. Therefore, interpreting public policy dilemmas by analyzing the public perception regarding agroforestry concepts and practices is necessary. Public perceptions and discourses can be an essential driver or constraint for the adoption of agroforestry practices in the region. Thus, understanding the public discourse behavior of crucial stakeholders could assist policymakers in developing inclusive policies that are relevant to the context of agroforestry adoption in the Sahel region. To answer how information about agroforestry spreads and is perceived by the public, and because internet usage has increased drastically over the past decade, with 33 percent of the population now connected to the internet, this research is based on online conversation data. Social media data from Facebook are gathered daily between April 2021 and April 2022 in Djibouti, Senegal, Mali, and Nigeria based on their share of active internet users compared to other countries in the Sahel region. A systematic methodology was applied to the extracted social media data using discourse network analysis (DNA). This study then clustered the data by the types of agroforestry practices, sentiments, and country.
Additionally, this research extracted the text data from online news media during the same period to pinpoint events related to the topic of agroforestry. The preliminary result indicates that tree management, crops and livestock integration, diversifying species and genetic resources, and focusing on interactions and productivity across the agricultural system are the most notable keywords in agroforestry-related conversations within the four countries in the Sahel region. Additionally, approximately 84 percent of the discussions were still dominated by big actors, such as NGO or government actors. Furthermore, as a subject of communication within agroforestry discourse, the Great Green Wall initiative generates almost 60 percent positive sentiment within the captured social media data, effectively having a more significant outreach than general agroforestry topics. Using novel data from the internet that had not previously been captured systematically, this study provides scholars and policymakers with a springboard for further research or policy design on agroforestry in the four countries of the Sahel region.

Keywords: Sahel, Djibouti, Senegal, Mali, Nigeria, social networks analysis, public discourse analysis, sentiment analysis, content analysis, social media, online news, agroforestry, land restoration

Procedia PDF Downloads 83

27205 A Quality Index Optimization Method for Non-Invasive Fetal ECG Extraction

Authors: Lucia Billeci, Gennaro Tartarisco, Maurizio Varanini

Abstract:

Fetal cardiac monitoring by fetal electrocardiogram (fECG) can provide significant clinical information about the health of the fetus. Despite this potential, the use of fECG in clinical practice has so far been quite limited due to the difficulty of measuring it. The recovery of fECG from the signals acquired non-invasively by using electrodes placed on the maternal abdomen is a challenging task because abdominal signals are a mixture of several components and the fetal one is very weak. This paper presents an approach for fECG extraction from abdominal maternal recordings, which exploits the pseudo-periodicity of fetal ECG. It consists of devising a quality index (fQI) for fECG and of finding the linear combinations of preprocessed abdominal signals that maximize this fQI (quality index optimization - QIO). It aims at improving the performance of the most commonly adopted methods for fECG extraction, which are usually based on estimating and canceling the maternal ECG (mECG). The procedure for the fECG extraction and fetal QRS (fQRS) detection is completely unsupervised and based on the following steps: signal pre-processing; mECG extraction and maternal QRS detection; mECG component approximation and canceling by weighted principal component analysis; fECG extraction by fQI maximization and fetal QRS detection. The proposed method was compared with our previously developed procedure, which obtained the highest score at the PhysioNet/Computing in Cardiology Challenge 2013. That procedure was based on removing the mECG, estimated by principal component analysis (PCA), from the abdominal signals and applying independent component analysis (ICA) to the residual signals. Both methods were developed and tuned using 69 one-minute abdominal measurements with fetal QRS annotations from dataset A provided by the PhysioNet/Computing in Cardiology Challenge 2013.
The QIO-based and the ICA-based methods were compared by analyzing two databases of abdominal maternal ECG available on the PhysioNet site. The first is the Abdominal and Direct Fetal Electrocardiogram Database (ADdb), which contains fetal QRS annotations and thus allows a quantitative performance comparison. The second is the Non-Invasive Fetal Electrocardiogram Database (NIdb), which does not contain fetal QRS annotations, so the comparison between the two methods can only be qualitative. In particular, the comparison on NIdb was performed by defining an index of quality for the fetal RR series. On the annotated database ADdb, the QIO method provided the performance indexes Sens=0.9988, PPA=0.9991, F1=0.9989, outperforming the ICA-based one, which provided Sens=0.9966, PPA=0.9972, F1=0.9969. The index of quality was higher for the QIO-based method than for the ICA-based one in 35 of the 55 records of the NIdb. The QIO-based method gave very high performance on both databases. The results of this study point to the application of the algorithm in a fully unsupervised way in wearable devices for self-monitoring of fetal health.
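The reported indexes are related in the usual way: F1 is the harmonic mean of sensitivity and positive predictive accuracy. The short sketch below assumes that standard definition (the abstract does not spell it out) and checks the reported values against it; the function names are illustrative.

```python
def sensitivity(tp, fn):
    """Share of annotated fetal QRS complexes that were detected."""
    return tp / (tp + fn)


def positive_predictive_accuracy(tp, fp):
    """Share of detections that match an annotated fetal QRS complex."""
    return tp / (tp + fp)


def f1_score(sens, ppa):
    """Harmonic mean of sensitivity and positive predictive accuracy."""
    return 2.0 * sens * ppa / (sens + ppa)


# Values reported in the abstract for ADdb.
reported = {
    "QIO": (0.9988, 0.9991, 0.9989),
    "ICA": (0.9966, 0.9972, 0.9969),
}
for name, (sens, ppa, f1) in reported.items():
    # Each recomputed F1 agrees with the reported one to within rounding.
    print(name, abs(f1_score(sens, ppa) - f1) < 1e-4)
```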

Keywords: fetal electrocardiography, fetal QRS detection, independent component analysis (ICA), optimization, wearable

Procedia PDF Downloads 268

27204 Deep Reinforcement Learning Approach for Trading Automation in The Stock Market

Authors: Taylan Kabbani, Ekrem Duman

Abstract:

The design of adaptive systems that take advantage of financial markets while reducing the risk can bring more stagnant wealth into the global market. However, most efforts made to generate successful deals in trading financial assets rely on Supervised Learning (SL), which suffers from various limitations. Deep Reinforcement Learning (DRL) offers to solve these drawbacks of SL approaches by combining the financial asset price "prediction" step and the portfolio "allocation" step in one unified process to produce fully autonomous systems capable of interacting with their environment to make optimal decisions through trial and error. In this paper, a continuous action space approach is adopted to give the trading agent the ability to gradually adjust the portfolio's positions with each time step (dynamically re-allocate investments), resulting in better agent-environment interaction and faster convergence of the learning process. In addition, the approach supports managing a portfolio with several assets instead of a single one. This work presents a novel DRL model to generate profitable trades in the stock market, effectively overcoming the limitations of supervised learning approaches. We formulate the trading problem, or what is referred to as the agent environment, as a Partially Observed Markov Decision Process (POMDP) model, considering the constraints imposed by the stock market, such as liquidity and transaction costs. More specifically, we design an environment that simulates the real-world trading process by augmenting the state representation with ten different technical indicators and sentiment analysis of news articles for each stock. We then solve the formulated POMDP problem using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, which can learn policies in high-dimensional and continuous action spaces like those typically found in the stock market environment.
From the point of view of stock market forecasting and the intelligent decision-making mechanism, this paper demonstrates the superiority of deep reinforcement learning in financial markets over other types of machine learning such as supervised learning and demonstrates its credibility and advantages in strategic decision-making.
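For readers unfamiliar with TD3, the sketch below shows how the algorithm forms its critic target: clipped noise is added to the target policy's action (target policy smoothing) and the smaller of two target critics is used (clipped double Q-learning). It is a generic illustration with a scalar action, not the authors' trading agent; the callables `target_actor`, `target_q1` and `target_q2` and all default values are assumptions.

```python
import random


def clip(x, lo, hi):
    """Restrict x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Bellman target used to train both critics in TD3."""
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise, action_low, action_high)
    # Clipped double Q-learning: take the more pessimistic critic estimate.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    # No bootstrapping from terminal states.
    return reward + gamma * (1.0 - float(done)) * q_min


# Hypothetical stand-ins for the target networks.
actor = lambda s: 0.5
q1 = lambda s, a: 2.0 + a
q2 = lambda s, a: 4.0 - a
print(td3_target(1.0, next_state=None, done=False, target_actor=actor,
                 target_q1=q1, target_q2=q2, noise_std=0.0))
```

In a portfolio setting the action would be a vector of position adjustments, and the reward would account for transaction costs; both are left out here to keep the target computation visible.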

Keywords: the stock market, deep reinforcement learning, MDP, twin delayed deep deterministic policy gradient, sentiment analysis, technical indicators, autonomous agent

Procedia PDF Downloads 169

27203 Correlation Analysis to Quantify Learning Outcomes for Different Teaching Pedagogies

Authors: Kanika Sood, Sijie Shang

Abstract:

A fundamental goal of education is preparing students to become a part of the global workforce by making beneficial contributions to society. In this paper, we analyze student performance for multiple courses that involve different teaching pedagogies: a cooperative learning technique and an inquiry-based learning strategy. Student performance includes student engagement, grades, and attendance records. We perform this study in the Computer Science department for online and in-person courses for 450 students. We will perform correlation analysis to study the relationship between student scores and other parameters such as gender and mode of learning. We use natural language processing and machine learning to analyze student feedback data and performance data. We assess the learning outcomes of two teaching pedagogies for undergraduate and graduate courses to showcase the impact of pedagogical adoption and learning outcome as determinants of academic achievement. Early findings suggest that when using the specified pedagogies, students become experts on their topics and show enhanced engagement with peers.
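The correlation analysis mentioned above could, for numeric variables, take the form of a Pearson correlation coefficient. The abstract does not say which coefficient is used, so this is an assumed choice, and the attendance and score values are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical attendance rates and final scores for five students.
attendance = [0.60, 0.75, 0.80, 0.90, 0.95]
scores = [58, 70, 72, 85, 88]
print(round(pearson_r(attendance, scores), 3))
```

Categorical parameters such as gender or mode of learning would need a binary encoding (giving a point-biserial correlation) or a different association measure.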

Keywords: bag-of-words, cooperative learning, education, inquiry-based learning, in-person learning, natural language processing, online learning, sentiment analysis, teaching pedagogy

Procedia PDF Downloads 68

27202 Fake News Detection Based on Fusion of Domain Knowledge and Expert Knowledge

Authors: Yulan Wu

Abstract:

The spread of fake news on social media has caused significant harm to the public and the nation, with its threats spanning various domains, including politics, economics, health, and more. News on social media often covers multiple domains, while existing models studied by researchers and relevant organizations often perform well only on datasets from a single domain. When these methods are applied to social platforms with news spanning multiple domains, their performance deteriorates significantly. Existing research has attempted to enhance detection performance on multi-domain datasets by adding single-domain labels to the data. However, these methods overlook the fact that a news article typically belongs to multiple domains, leading to the loss of domain knowledge contained within the news text. Research has also found that news records in different domains often use different vocabularies to describe their content. Building on this observation, we propose a fake news detection framework that combines domain knowledge and expert knowledge. Firstly, it utilizes an unsupervised domain discovery module to generate a low-dimensional vector for each news article, representing domain embeddings, which can retain multi-domain knowledge of the news content. Then, a feature extraction module uses the domain embeddings discovered without supervision to guide multiple experts in extracting news knowledge for the total feature representation. Finally, a classifier is used to determine whether the news is fake or not. Experiments show that this approach can improve multi-domain fake news detection performance while reducing the cost of manually labeling domains.
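One common way for an embedding to "guide multiple experts" is a softmax gate that weights each expert's output, as in mixture-of-experts models. The abstract does not specify the mechanism, so the sketch below is an assumed reading rather than the authors' architecture; `gate_weights`, `expert_outputs` and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(domain_embedding, gate_weights, expert_outputs):
    """Combine expert feature vectors using gates computed from the embedding."""
    # One gate score per expert: dot product of its weight row and the embedding.
    scores = [sum(w * x for w, x in zip(row, domain_embedding))
              for row in gate_weights]
    gates = softmax(scores)
    dim = len(expert_outputs[0])
    # Gate-weighted sum of the expert outputs, dimension by dimension.
    return [sum(g * out[j] for g, out in zip(gates, expert_outputs))
            for j in range(dim)]


# Hypothetical 2-d domain embedding, two experts with 2-d feature outputs.
embedding = [1.0, 0.0]
gate_w = [[2.0, 0.0], [0.0, 2.0]]
outputs = [[1.0, 0.0], [0.0, 1.0]]
print(mix_experts(embedding, gate_w, outputs))
```

The combined vector would then be passed to the final fake/real classifier.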

Keywords: fake news, deep learning, natural language processing, multiple domains

Procedia PDF Downloads 55