Search results for: Savas Guzel

2 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception

Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu

Abstract:

Opinion mining (OM) is one of the natural language processing (NLP) problems to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it’s worth understanding social media and taking actions accordingly. OM comes to the fore here as the scale of the discussion about companies increases, and it becomes unfeasible to gauge opinion on individual levels. Thus, the companies opt to automize this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to a bag of n-gram representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. Transfer learning paradigm that has been commonly used in computer vision (CV) problems started to shape NLP approaches and language models (LM) lately. This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations that are obtained by training it on the large datasets using self-supervised learning objectives. The PTMs are further fine-tuned by a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, NER (Named-Entity Recognition), Question Answering (QA), and so forth. In this study, the traditional and modern NLP approaches have been evaluated for OM by using a sizable corpus belonging to a large private company containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT). The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. The BERT is a monolingual model in our case and transformers-based neural networks. It uses a masked language model and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction was not used since the experiments showed that their contribution to the model performance was found insignificant even though Turkish is a highly agglutinative and inflective language. The results show that usage of deep learning methods with pre-trained models and fine-tuning achieve about 11% improvement over SVM for OM. The BERT model achieved around 94% prediction accuracy while the MUSE model achieved around 88% and SVM did around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.

Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish

Procedia PDF Downloads 150

1 Analysis of the Effects of Institutions on the Sub-National Distribution of Aid Using Geo-Referenced AidData

Authors: Savas Yildiz

Abstract:

The article assesses the performance of international aid donors to determine the sub-national distribution of their aid projects dependent on recipient countries’ governance. The present paper extends the scope from a cross-country perspective to a more detailed analysis by looking at the effects of institutional qualities on the sub-national distribution of foreign aid. The analysis examines geo-referenced aid project in 37 countries and 404 regions at the first administrative division level in Sub-Saharan Africa from the World Bank (WB) and the African Development Bank (ADB) that were approved between the years 2000 and 2011. To measure the influence of institutional qualities on the distribution of aid the following measures are used: control of corruption, government effectiveness, regulatory quality and rule of law from the World Governance Indicators (WGI) and the corruption perception index from Transparency International. Furthermore, to assess the importance of ethnic heterogeneity on the sub-national distribution of aid projects, the study also includes interaction terms measuring ethnic fragmentation. The regression results indicate a general skew of aid projects towards regions which hold capital cities, however, being incumbent presidents’ birth region does not increase the allocation of aid projects significantly. Nevertheless, with increasing quality of institutions aid projects are less skewed towards capital regions and the previously estimated coefficients loose significance in most cases. Higher ethnic fragmentation also seems to impede the possibility to allocate aid projects mainly in capital city regions and presidents’ birth places. Additionally, to assess the performance of the WB based on its own proclaimed goal to aim the poor in a country, the study also includes sub-national wealth data from the Demographic and Health Surveys (DSH), and finds that, even with better institutional qualities, regions with a larger share from the richest quintile receive significantly more aid than regions with a larger share of poor people. With increasing ethnic diversity, the allocation of aid projects towards regions where the richest citizens reside diminishes, but still remains high and significant. However, regions with a larger share of poor people still do not receive significantly more aid. This might imply that the sub-national distribution of aid projects increases in general with higher ethnic fragmentation, independent of the diverse regional needs. The results provide evidence that institutional qualities matter to undermine the influence of incumbent presidents on the allocation of aid projects towards their birth regions and capital regions. Moreover, even for countries with better institutional qualities the WB and the ADB do not seem to be able to aim the poor in a country with their aid projects. Even, if one considers need-based variables, such as infant mortality and child mortality rates, aid projects do not seem to be allocated in districts with a larger share of people in need. Therefore, the study provides further evidence using more detailed information on the sub-national distribution of aid projects that aid is not being allocated effectively towards regions with a larger share of poor people to alleviate poverty in recipient countries directly. Institutions do not have any significant influence on the sub-national distribution of aid towards the poor.

Keywords: aid allocation, georeferenced data, institutions, spatial analysis

Procedia PDF Downloads 121