Search results for: urdu sentiment analysis
27890 Document-level Sentiment Analysis: An Exploratory Case Study of Low-resource Language Urdu
Authors: Ammarah Irum, Muhammad Ali Tahir
Abstract:
Document-level sentiment analysis in Urdu is a challenging Natural Language Processing (NLP) task due to the difficulty of working with lengthy texts in a language with constrained resources. Deep learning models, which are complex neural network architectures, are well-suited to text-based applications in addition to data formats like audio, image, and video. To investigate the potential of deep learning for Urdu sentiment analysis, we implemented five different deep learning models, including Bidirectional Long Short-Term Memory (BiLSTM), Convolutional Neural Network (CNN), Convolutional Neural Network with Bidirectional Long Short-Term Memory (CNN-BiLSTM), and Bidirectional Encoder Representations from Transformers (BERT). In this study, we developed a hybrid deep learning model called BiLSTM-Single Layer Multi Filter Convolutional Neural Network (BiLSTM-SLMFCNN) by fusing the BiLSTM and CNN architectures. The proposed and baseline techniques are applied to the Urdu Customer Support data set and the IMDB Urdu movie review data set using pre-trained Urdu word embeddings that are suitable for sentiment analysis at the document level. The results of these techniques are evaluated, and our proposed model outperforms all other deep learning techniques for Urdu sentiment analysis. BiLSTM-SLMFCNN outperformed the baseline deep learning models and achieved 83%, 79%, 83% and 94% accuracy on the small, medium and large IMDB Urdu movie review data sets and the Urdu Customer Support data set, respectively.
Keywords: Urdu sentiment analysis, deep learning, natural language processing, opinion mining, low-resource language
Procedia PDF Downloads 71
27889 Lexicon-Based Sentiment Analysis for Stock Movement Prediction
Authors: Zane Turner, Kevin Labille, Susan Gauch
Abstract:
Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches use a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We present a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.
Keywords: computational finance, sentiment analysis, sentiment lexicon, stock movement prediction
Procedia PDF Downloads 126
27888 Lexicon-Based Sentiment Analysis for Stock Movement Prediction
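The lexicon lookup described in this abstract (words mapped to sentiment scores, used to rate a text chunk) can be illustrated with a minimal sketch. It is not the authors' method: the lexicon entries, the averaging rule, and the names `TOY_LEXICON` and `score_text` are assumptions made for illustration only.

```python
# Hypothetical toy lexicon: the words and scores below are illustrative
# assumptions, not values taken from the paper.
TOY_LEXICON = {
    "growth": 1.0,
    "strong": 0.5,
    "decline": -1.0,
    "weak": -0.5,
}


def score_text(text, lexicon):
    """Return the mean lexicon score of the words in `text` found in `lexicon`.

    Words missing from the lexicon are ignored; a text with no lexicon
    words gets a neutral score of 0.0.
    """
    # Lowercase, split on whitespace, and strip common punctuation.
    tokens = [token.strip(".,;:!?") for token in text.lower().split()]
    scores = [lexicon[token] for token in tokens if token in lexicon]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


print(score_text("Strong growth this quarter.", TOY_LEXICON))
print(score_text("Weak demand and a decline in sales.", TOY_LEXICON))
```

Averaging is only one possible aggregation; a sum or a weighted score would fit the same lookup structure.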
Authors: Zane Turner, Kevin Labille, Susan Gauch
Abstract:
Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches use a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We introduce a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.
Keywords: computational finance, sentiment analysis, sentiment lexicon, stock movement prediction
Procedia PDF Downloads 169
27887 Fine-Grained Sentiment Analysis: Recent Progress
Authors: Jie Liu, Xudong Luo, Pingping Lin, Yifan Fan
Abstract:
Facebook, Twitter, Weibo, and other social media and significant e-commerce sites generate a massive amount of online text, which can be used to analyse people’s opinions or sentiments for better decision-making. Consequently, sentiment analysis, especially fine-grained sentiment analysis, is a very active research topic. In this paper, we survey various methods for fine-grained sentiment analysis, including traditional sentiment lexicon-based methods, machine learning-based methods, and deep learning-based methods in aspect/target/attribute-based sentiment analysis tasks. We also discuss their advantages and the problems that merit careful study in the future.
Keywords: sentiment analysis, fine-grained, machine learning, deep learning
Procedia PDF Downloads 261
27886 A Survey of the Applications of Sentiment Analysis
Authors: Pingping Lin, Xudong Luo
Abstract:
Natural language often conveys the emotions of speakers. Therefore, sentiment analysis of what people say is prevalent in the field of natural language processing and has great application value in many practical problems. To help people understand its application value, in this paper we survey various applications of sentiment analysis, including those in online and offline business as well as other types of applications. In particular, we give some application examples in intelligent customer service systems in China. We also compare the applications of sentiment analysis on Twitter, Weibo, Taobao and Facebook. Finally, we point out the challenges faced in the applications of sentiment analysis and the work that is worth studying in the future.
Keywords: application, natural language processing, online comments, sentiment analysis
Procedia PDF Downloads 260
27885 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques
Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel
Abstract:
Sentiment analysis and opinion mining have become emerging topics of research in recent years, but most of the work is focused on data in the English language. Comprehensive research and analysis that considers multiple languages, machine translation techniques, and different classifiers is essential. This paper presents a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: the first classifies text without language translation, and the second translates the testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or whether building language-specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and the accuracy of various classifiers for multilingual sentiment analysis are also discussed in this study.
Keywords: cross-language analysis, machine learning, machine translation, sentiment analysis
Procedia PDF Downloads 712
27884 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents
Authors: Chothmal, Basant Agarwal
Abstract:
Sentiment analysis classifies a given review document as positive or negative. Sentiment analysis research has increased tremendously in recent times due to its large number of applications in industry and academia. Sentiment analysis models can be used to determine the opinion of the user towards any entity or product. E-commerce companies can use a sentiment analysis model to improve their products on the basis of users’ opinions. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we initially extract features from one class of documents, and then use the one-class SVM model to test whether a given new document lies within the model or is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.
Keywords: feature selection methods, machine learning, NB, one-class SVM, sentiment analysis, support vector machine
Procedia PDF Downloads 516
27883 A Survey of Sentiment Analysis Based on Deep Learning
Authors: Pingping Lin, Xudong Luo, Yifan Fan
Abstract:
Sentiment analysis is a very active research topic. Every day, Facebook, Twitter, Weibo, and other social media, as well as significant e-commerce websites, generate a massive amount of comments, which can be used to analyse people's opinions or emotions. The existing methods for sentiment analysis are based mainly on sentiment dictionaries, machine learning, and deep learning. The first two kinds of methods rely heavily on sentiment dictionaries or large amounts of labelled data. The third one overcomes these two problems. So, in this paper, we focus on the third one. Specifically, we survey various sentiment analysis methods based on convolutional neural network, recurrent neural network, long short-term memory, deep neural network, deep belief network, and memory network. We compare their features, advantages, and disadvantages. Also, we point out the main problems of these methods, which may be worthy of careful study in the future. Finally, we also examine the application of deep learning in multimodal sentiment analysis and aspect-level sentiment analysis.
Keywords: document analysis, deep learning, multimodal sentiment analysis, natural language processing
Procedia PDF Downloads 163
27882 Comparative Study of Urdu and Hindko Language
Authors: Tahseen Bibi
Abstract:
Language is a means of communicating ideas, emotions and feelings to others. Languages differ from one another on the basis of symbols and articulation. Regional languages play a unifying role in any country. The national language of any country gives strength to its masses as it evaporates the mutual indifferences. There are various regional languages in Pakistan, like Sindhi, Pushto, Hindko and Balochi. The Hindko language dates back to ancient times, and Hindko speakers can also easily understand and speak Urdu. Urdu is an amalgam of various languages. These languages are interconnected. Thus we can draw an analogy between the two languages under discussion on the basis of pronunciation. The research will show that there are many words in both languages which have similar pronunciation. It will further show that the roots of the Urdu language lie in Hindko. The reason behind this resemblance is that Urdu was derived from Hindko and other languages. Hindko has played a prominent role in the development of Urdu. Thus the role of Hindko in the emergence and development of Urdu cannot be denied. This article will use qualitative and comparative study as its methodology. The research will highlight that there is a close resemblance between the two languages on the basis of pronunciation, signifying that Urdu was derived from the Hindko language.
Keywords: Hindko, Urdu, regional languages, vocabulary
Procedia PDF Downloads 412
27881 Exposing Investor Sentiment In Stock Returns
Authors: Qiang Bu
Abstract:
This paper compares the explanatory power of sentiment level and sentiment shock. The preliminary test results show that sentiment shock plays a more significant role in explaining stock returns, including the raw return and abnormal return. We also find that sentiment shock beta has a higher statistical significance than sentiment beta. These findings shed new light on the relationship between investor sentiment and stock returns.
Keywords: sentiment level, sentiment shock, explanatory power, abnormal stock return, beta
Procedia PDF Downloads 136
27880 Tracing the Evolution of English and Urdu Languages: A Linguistic and Cultural Analysis
Authors: Aamna Zafar
Abstract:
Through linguistic and cultural analysis, this study seeks to trace the development of the English and Urdu languages. Along with examining how the vocabulary and syntax of English and Urdu have evolved over time and the linguistic trends that may be seen in these changes, this study will also look at the historical and cultural influences that have shaped the languages throughout time. The study will also look at how English and Urdu have changed over time, both in terms of language use and communication inside each other's cultures and globally. We will examine how these changes affect social relations and cultural identity, as well as how they might affect the future of these languages.
Keywords: linguistic and cultural analysis, historical factors, cultural factors, vocabulary, syntax, significance
Procedia PDF Downloads 74
27879 Unsupervised Sentiment Analysis for Indonesian Political Message on Twitter
Authors: Omar Abdillah, Mirna Adriani
Abstract:
In this work, we present a new approach for analyzing public sentiment towards the presidential candidates in the 2014 Indonesian election as expressed on Twitter. We propose a procedure for analyzing sentiment in Indonesian political messages by understanding the behavior of Indonesian society in sending messages on Twitter. We took a different approach from previous works by utilizing punctuation marks and an Indonesian sentiment lexicon, complemented by a new procedure for determining sentiment towards the candidates. Our experiment shows performance of up to 83.31% average precision. In brief, this work makes two contributions: first, it is a preliminary study of sentiment analysis in the domain of political messages, which has not been addressed before. Second, we propose a method to conduct sentiment analysis by creating a decision-making procedure that is in line with the characteristics of Indonesian messages on Twitter.
Keywords: unsupervised sentiment analysis, political message, lexicon based, user behavior understanding
Procedia PDF Downloads 479
27878 Saudi Twitter Corpus for Sentiment Analysis
Authors: Adel Assiri, Ahmed Emam, Hmood Al-Dossari
Abstract:
Sentiment analysis (SA) has received growing attention in Arabic language research. However, few studies have directly applied SA to Arabic due to the lack of a publicly available dataset for this language. This paper partially bridges this gap by focusing on one of the Arabic dialects, the Saudi dialect. This paper presents an annotated data set of size 4700 for Saudi dialect sentiment analysis with (K = 0.807). Our next work is to extend this corpus and create a large-scale lexicon for the Saudi dialect from the corpus.
Keywords: Arabic, sentiment analysis, Twitter, annotation
Procedia PDF Downloads 628
27877 Collision Theory Based Sentiment Detection Using Discourse Analysis in Hadoop
Authors: Anuta Mukherjee, Saswati Mukherjee
Abstract:
Data is growing every day. Social networking sites such as Twitter are becoming an integral part of our daily lives, contributing to a large increase in the growth of data. Twitter is a rich source for sentiment detection or mining, since people often express honest opinions through tweets. However, although sentiment analysis is a well-researched topic in text, applying it to Twitter data poses additional challenges, since these are unstructured data with abbreviations and without strict grammatical correctness. We have employed collision theory to achieve sentiment analysis in Twitter data. We have also incorporated discourse analysis in the collision theory based model to detect accurate sentiment from tweets. We have also used the retweet field to assign weights to certain tweets and obtained the overall weightage of a topic provided in the form of a query. Hadoop has been exploited for speed. Our experiments show effective results.
Keywords: sentiment analysis, twitter, collision theory, discourse analysis
Procedia PDF Downloads 534
27876 The Use of AI to Measure Gross National Happiness
Authors: Riona Dighe
Abstract:
This research attempts to identify an alternative approach to the measurement of Gross National Happiness (GNH). It uses artificial intelligence (AI), incorporating natural language processing (NLP) and sentiment analysis to measure GNH. We use ‘off the shelf’ NLP models responsible for the sentiment analysis of a sentence as a building block for this research. We constructed an algorithm using NLP models to derive a sentiment analysis score against sentences. This was then tested against a sample of 20 respondents to derive a sentiment analysis score. The scores generated resembled human responses. By utilising the MLP classifier, decision tree, linear model, and K-nearest neighbors, we were able to obtain test accuracies of 89.97%, 54.63%, 52.13%, and 47.9%, respectively. This gave us the confidence to use the NLP models against sentences in websites to measure the GNH of a country.
Keywords: artificial intelligence, NLP, sentiment analysis, gross national happiness
Procedia PDF Downloads 116
27875 Urdu Text Extraction Method from Images
Authors: Samabia Tehsin, Sumaira Kausar
Abstract:
Due to the vast increase in multimedia data in recent years, efficient and robust retrieval techniques are needed to retrieve and index images and videos. Text embedded in images can serve as a strong retrieval tool for images. For this reason, text extraction is an area of research receiving increasing attention. English text extraction is the focus of many researchers, but very little work has been done on other languages like Urdu. This paper focuses on Urdu text extraction from video frames. It presents a text detection feature set that can deal with most of the problems connected with the text extraction process. To test the validity of the method, it is tested on an Urdu news dataset, which gives promising results.
Keywords: caption text, content-based image retrieval, document analysis, text extraction
Procedia PDF Downloads 513
27874 Performance Evaluation of an Ontology-Based Arabic Sentiment Analysis
Authors: Salima Behdenna, Fatiha Barigou, Ghalem Belalem
Abstract:
Due to the quick increase in the volume of Arabic opinions posted on various social media, Arabic sentiment analysis has become one of the most important areas of research. Compared to English, there is very little work on Arabic sentiment analysis, in particular aspect-based sentiment analysis (ABSA). In ABSA, aspect extraction is the most important task. In this paper, we propose a semantic aspect-based sentiment analysis approach for standard Arabic reviews to extract explicit aspect terms and identify the polarity of the extracted aspects. The proposed approach was evaluated using HAAD datasets. Experiments showed that the proposed approach achieved a good level of performance compared with baseline results. The F-measure was improved by 19% for the aspect term extraction task and by 55% for the aspect term polarity task.
Keywords: sentiment analysis, opinion mining, Arabic, aspect level, opinion, polarity
Procedia PDF Downloads 162
27873 Sentiment Analysis in Social Networks Sites Based on a Bibliometrics Analysis: A Comprehensive Analysis and Trends for Future Research Planning
Authors: Jehan Fahim M. Alsulami
Abstract:
Academic research on sentiment analysis in social networks has advanced significantly over recent years and is flourishing from the collection of knowledge provided by various academic disciplines. In the current study, the status and development trend of the field of sentiment analysis in social networks is evaluated through a bibliometric analysis of academic publications. In particular, the distributions of publications and citations, the distribution of subjects, and the predominant journals, authors, and countries are analyzed. The collaboration degree is applied to measure scientific connections from different aspects. Moreover, keyword co-occurrence analysis is used to find the major research topics and their evolution throughout the time span. The area of sentiment analysis in social networks has gained growing attention in academia, with computer science and engineering as the top research subjects. China and the USA contribute the most to the area's development. Authors prefer to collaborate with those within the same nation. Among the research topics, newly risen topics such as COVID-19 and customer satisfaction are discovered.
Keywords: bibliometric analysis, sentiment analysis, social networks, social media
Procedia PDF Downloads 217
27872 Women Characters in Pakistani Films: A Critical Evaluation
Authors: Ali Arshad
Abstract:
The study examines the depiction of women characters in Urdu and Punjabi films. It is a critical evaluation of forty-eight Pakistani films. It explores the women characters portrayed in the Urdu and Punjabi films of Pakistan. The study uses content analysis as its methodology, together with feminist research, which helps to investigate the phenomena and supports the study. The findings show that women characters in Urdu and Punjabi films are not a reflection of true Pakistani women; rather, this picture creates a negative image of Pakistani women in viewers' minds. These characters neither address women’s issues nor present solutions to the problems faced by Pakistani women. The characters of Pakistani women are not free from male prejudice, and these films do not portray the social and political roles performed by actual Pakistani women. The analysis shows that the characters of women in Urdu and Punjabi films are based on assumptions.
Keywords: women, Pakistani, film, characters
Procedia PDF Downloads 301
27871 An Enhanced Support Vector Machine Based Approach for Sentiment Classification of Arabic Tweets of Different Dialects
Authors: Gehad S. Kaseb, Mona F. Ahmed
Abstract:
Arabic Sentiment Analysis (SA) is one of the most common research fields with many open areas. Few studies apply SA to Arabic dialects. This paper proposes different pre-processing steps and a modified methodology to improve the accuracy using normal Support Vector Machine (SVM) classification. The paper works on two datasets, Arabic Sentiment Tweets Dataset (ASTD) and Extended Arabic Tweets Sentiment Dataset (Extended-AATSD), which are publicly available for academic use. The results show that the classification accuracy approaches 86%.
Keywords: Arabic, classification, sentiment analysis, tweets
Procedia PDF Downloads 147
27870 Purposes of Urdu Translations of the Meanings of Holy Quran
Authors: Muhammad Saleem
Abstract:
This research paper is a comprehensive and critical study of translations of the meanings of the Holy Qur’an. The discussion deals with the targets and purposes of Urdu (the national language of Pakistan) translators of the meanings of the Holy Qur’an. There are more than 400 translations of the meanings of the Holy Qur’an in the Urdu language. Muslims, non-Muslims and some organizations have made translations of the meanings of the Holy Qur’an to meet various targets. It is observed that not all Urdu translators have translated the Qur’an with a single objective and motivation; rather, some are biased and strive to discredit the Qur’an. Thus, they have made unauthentic and fabricated translations of the Qur’an. Some optimistically believe that they intend to do a service, whereas others pessimistically hold that they treacherously seek to further their rule. Some of them have been observed to be against Islam, starting their activities with spite, but after perceiving the truths of Islam and the miracle and greatness of the Holy Qur’an, they submitted to Islam, embracing it with pure hearts. Some translators made their translations of the meanings of the Holy Qur’an to serve Allah, and some of them made their translations only to earn money. All these translations vary from one another in style, trend, type and method. Some Urdu translations have been made to fulfill lingual requirements. Some translations have been made by Muslim scholars to reduce the influence of Urdu translations of the meanings of the Holy Qur’an by non-Muslims. The article deals with the various purposes of the translators of the meanings of the Holy Qur’an.
Keywords: Qur'an, translation, Urdu, language
Procedia PDF Downloads 39
27869 1/Sigma Term Weighting Scheme for Sentiment Analysis
Authors: Hanan Alshaher, Jinsheng Xu
Abstract:
Large amounts of data on the web can provide valuable information. For example, product reviews help business owners measure customer satisfaction. Sentiment analysis classifies texts into two polarities: positive and negative. This paper examines movie reviews and tweets using a new term weighting scheme, called one-over-sigma (1/sigma), on benchmark datasets for sentiment classification. The proposed method aims to improve the performance of sentiment classification. The results show that 1/sigma is more accurate than the popular term weighting schemes. In order to verify if the entropy reflects the discriminating power of terms, we report a comparison of entropy values for different term weighting schemes.
Keywords: 1/sigma, natural language processing, sentiment analysis, term weighting scheme, text classification
Procedia PDF Downloads 200
27868 Mask-Prompt-Rerank: An Unsupervised Method for Text Sentiment Transfer
Authors: Yufen Qin
Abstract:
Text sentiment transfer is an important branch of text style transfer. The goal is to generate text with another sentiment attribute from a text with a specific sentiment attribute, while keeping the content and semantic information unrelated to sentiment unchanged in the process. There are currently two main challenges in this field: the lack of a parallel corpus and text attribute entanglement. In response to these problems, this paper proposes a novel solution: Mask-Prompt-Rerank. The method masks the sentiment words and then uses prompt-based regeneration to transfer the sentence's sentiment. Experiments on two sentiment benchmark datasets and one formality transfer benchmark dataset show that this approach makes the performance of small pre-trained language models comparable to that of the most advanced large models, while consuming two orders of magnitude less computing and memory.
Keywords: language model, natural language processing, prompt, text sentiment transfer
Procedia PDF Downloads 79
27867 Investor Sentiment and Commodity Trading Advisor Fund Performance
Authors: Tian Lan
Abstract:
Arbitrageurs employ a variety of techniques in response to the existence of fluctuating sentiment, resulting in sparse sentiment exposures. This paper finds that Commodity Trading Advisor (CTA) funds in the top decile ranked by sentiment beta outperformed those in the bottom decile by 0.33% per month on a risk-adjusted basis, with the difference being larger among skilled managers. This paper also finds that around ten percent of CTA funds could accurately predict market sentiment, which has a positive correlation with fund sentiment beta and acts as a determinant of fund performance. Instead of betting against mispricing, this research demonstrates that a competent manager can achieve remarkable returns by forecasting and reacting to shifts in investor sentiment.
Keywords: investment sentiment, CTA fund, market timing, fund performance
Procedia PDF Downloads 83
27866 From Text to Data: Sentiment Analysis of Presidential Election Political Forums
Authors: Sergio V Davalos, Alison L. Watkins
Abstract:
User generated content (UGC), such as website posts, has data associated with it: time of the post, gender, location, type of device, and number of words. The text entered in UGC can provide a valuable dimension for analysis. In this research, each user post is treated as a collection of terms (words). In addition to the number of words per post, the frequency of each term is determined by post and by the sum of occurrences in all posts. This research focuses on one specific aspect of UGC: sentiment. Sentiment analysis (SA) was applied to the content (user posts) of two sets of political forums related to the US presidential elections for 2012 and 2016. Sentiment analysis derives data from the text, which enables the subsequent application of data analytic methods. The SASA (SAIL/SAI Sentiment Analyzer) model was used for sentiment analysis. The application of SASA resulted in a sentiment score for each post. Based on the sentiment scores for the posts, there are significant differences between the content and sentiment of the two sets for the 2012 and 2016 presidential election forums. In the 2012 forums, 38% of the forums started with positive sentiment and 16% with negative sentiment. In the 2016 forums, 29% started with positive sentiment and 15% with negative sentiment. There also were changes in sentiment over time. For both elections, as the election got closer, the cumulative sentiment score became negative. The candidate who won each election appeared in more posts than the losing candidates. In the case of Trump, there were more negative posts than Clinton’s highest number of posts, which were positive. KNIME topic modeling was used to derive topics from the posts. There were also changes in topics and keyword emphasis over time. Initially, the political parties were the most referenced, and as the election got closer the emphasis changed to the candidates.
The SASA method proved to predict sentiment better than four other methods in Sentibench. The research resulted in deriving sentiment data from text. In combination with other data, the sentiment data provided insight and discovery about user sentiment in the US presidential elections for 2012 and 2016.
Keywords: sentiment analysis, text mining, user generated content, US presidential elections
Procedia PDF Downloads 190
27865 Morpho-Syntactic Pattern in Maithili Urdu
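The term-counting step mentioned in this abstract (word count per post, term frequency per post, and summed occurrences across all posts) can be sketched as follows. This is a minimal illustration under stated assumptions: whitespace tokenization, lowercasing, and the function name `term_frequencies` are hypothetical choices, not details from the paper, and the sketch does not reproduce SASA or KNIME.

```python
from collections import Counter


def term_frequencies(posts):
    """Count terms per post and across all posts.

    Returns (words_per_post, per_post_counts, total_counts), where each
    post is treated as a collection of lowercase, whitespace-separated terms.
    """
    per_post_counts = [Counter(post.lower().split()) for post in posts]
    words_per_post = [sum(counts.values()) for counts in per_post_counts]
    total_counts = Counter()
    for counts in per_post_counts:
        total_counts.update(counts)  # add this post's counts to the running total
    return words_per_post, per_post_counts, total_counts


# Hypothetical example posts, invented for illustration.
example_posts = ["vote early vote often", "early results are in"]
words, per_post, total = term_frequencies(example_posts)
print(words)
print(per_post[0]["vote"], total["early"])
```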
Authors: Mohammad Jahangeer Warsi
Abstract:
This is, perhaps, the first linguistic study of Maithili Urdu, a dialect of the Urdu language of the Indo-Aryan family, spoken by around four million speakers in the Darbhanga, Samastipur, Begusarai, Madhubani, and Muzaffarpur districts of Bihar. It has subject–object–verb (SOV) word order, and it lacks script and literature. This work is an attempt to document this dialect so that it contributes to the field of descriptive linguistics. It is also spoken by the majority of the Maithili diaspora community. Maithili Urdu does not have its own script or literature, yet it has maintained an oral history over many centuries. It has contributed profoundly to enriching the Maithili, Hindi and Urdu languages and literature. Dialects are the contact languages of particular regions, and they have a deep impact on their cultural heritage. Slowly, with time, these dialects begin to take the shape of languages. The convergence of a dialect into a language is a symbol and pride of the people who speak it. Although confined to the five districts of northern Bihar, it is highly popular among the natives and is the primary mode of communication of the local Muslims. The paper will focus on the structure of expressions in Maithili Urdu, including the structure of words, phrases, clauses, and sentences. There are clear differences in the linguistic features of Maithili Urdu vis-à-vis Urdu, Maithili and Hindi. Although it is a dialect of Urdu, it interestingly has only one second person pronoun, tu, and lacks the agentive marker –ne. Although spoken in the vicinity of Hindi, Urdu and Maithili, it undoubtedly has its own linguistic features; of them, verb conjugation is remarkably unique. Because of the oral tradition of this link language, intonation has become significantly prominent.
This paper will discuss the morpho-syntactic pattern of Maithili Urdu and will go through a sample text to authenticate the findings.
Keywords: cultural heritage, morpho-syntactic pattern, Maithili Urdu, verb conjugation
Procedia PDF Downloads 214
27864 Hybrid Feature Selection Method for Sentiment Classification of Movie Reviews
Authors: Vishnu Goyal, Basant Agarwal
Abstract:
Sentiment analysis research provides methods for identifying people's opinions expressed in blogs, reviews, social networking websites, etc. Sentiment analysis aims to understand what opinion people have about a given entity, object, or thing. Sentiment analysis research can be broadly categorised into three types of approaches, namely semantic orientation, machine learning, and lexicon-based approaches. Feature selection methods improve the performance of machine learning algorithms by eliminating irrelevant features. The information gain feature selection method has been considered the best method for sentiment analysis; however, it has the drawback that a threshold must be selected. Therefore, in this paper, we propose a hybrid feature selection method comprising information gain and a proposed feature selection method. Features are first selected using Information Gain (IG), and the remaining noisy features are then eliminated using the proposed feature selection method. Experimental results show the efficiency of the proposed feature selection methods.
Keywords: feature selection, sentiment analysis, hybrid feature selection
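The information gain step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary term presence, documents represented as sets of tokens, and a hand-picked threshold, and the names `entropy`, `information_gain`, and `select_by_information_gain` are hypothetical. The second, proposed selection stage is not described in the abstract and is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG(term) = H(class) - H(class | term present or absent)."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def select_by_information_gain(docs, labels, threshold):
    """Keep terms whose IG reaches the threshold (threshold is an assumption)."""
    vocabulary = set().union(*docs)
    scores = {t: information_gain(docs, labels, t) for t in vocabulary}
    return {t: s for t, s in scores.items() if s >= threshold}


# Toy data, invented for illustration only.
docs = [{"great", "film"}, {"great", "plot"},
        {"boring", "film"}, {"boring", "plot"}]
labels = ["pos", "pos", "neg", "neg"]
selected = select_by_information_gain(docs, labels, threshold=0.5)
```

On this toy data the polarity words pass the threshold while "film" and "plot" do not. The result depends entirely on the threshold value, which is the drawback the abstract points out.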
Procedia PDF Downloads 336
27863 Enhance the Power of Sentiment Analysis
Authors: Yu Zhang, Pedro Desouza
Abstract:
Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done in R and Greenplum in-database analytic tools.
Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining
Procedia PDF Downloads 350
27862 Linguistic Features for Sentence Difficulty Prediction in Aspect-Based Sentiment Analysis
Authors: Adrian-Gabriel Chifu, Sebastien Fournier
Abstract:
One of the challenges of natural language understanding is to deal with the subjectivity of sentences, which may express opinions and emotions that add layers of complexity and nuance. Sentiment analysis is a field that aims to extract and analyze these subjective elements from text, and it can be applied at different levels of granularity, such as document, paragraph, sentence, or aspect. Aspect-based sentiment analysis is a well-studied topic with many available data sets and models. However, there is no clear definition of what makes a sentence difficult for aspect-based sentiment analysis. In this paper, we explore this question by conducting an experiment with three data sets: “Laptops”, “Restaurants”, and “MTSC” (Multi-Target-dependent Sentiment Classification), and a merged version of these three datasets. We study the impact of domain diversity and syntactic diversity on difficulty. We use a combination of classifiers to identify the most difficult sentences and analyze their characteristics. We employ two ways of defining sentence difficulty. The first one is binary and labels a sentence as difficult if the classifiers fail to correctly predict the sentiment polarity. The second one is a six-level scale based on how many of the top five best-performing classifiers can correctly predict the sentiment polarity. We also define 9 linguistic features that, combined, aim at estimating the difficulty at sentence level.
Keywords: sentiment analysis, difficulty, classification, machine learning
Procedia PDF Downloads 88
27861 Sentiment Analysis of Consumers’ Perceptions on Social Media about the Main Mobile Providers in Jamaica
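The two difficulty definitions in this abstract can be read as follows. This is only a sketch under stated assumptions: the abstract does not say whether the binary label requires every classifier to fail, nor how the six levels are numbered. Here a sentence counts as difficult only when all classifiers are wrong, and the level is the number of top-five classifiers that are wrong (0 = easiest, 5 = hardest). The function names are hypothetical.

```python
def binary_difficulty(gold, predictions):
    """True if no classifier predicts the gold polarity.

    Assumption: "the classifiers fail" is read as "all of them fail".
    """
    return all(p != gold for p in predictions)


def difficulty_level(gold, top5_predictions):
    """Six-level scale (0..5): number of top-five classifiers that are wrong.

    Assumption: the numbering direction (higher = harder) is our choice.
    """
    if len(top5_predictions) != 5:
        raise ValueError("expected predictions from exactly five classifiers")
    correct = sum(p == gold for p in top5_predictions)
    return 5 - correct


# Invented predictions for one sentence whose gold polarity is "pos".
level = difficulty_level("pos", ["pos", "pos", "neg", "pos", "neu"])
hard = binary_difficulty("neg", ["pos", "neu", "pos"])
```

Counting correct classifiers out of five gives six possible values (0 through 5), which matches the six-level scale named in the abstract.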
Authors: Sherrene Bogle, Verlia Bogle, Tyrone Anderson
Abstract:
In recent years, organizations have become increasingly interested in the possibility of analyzing social media as a means of gaining meaningful feedback about their products and services. An aspect-based sentiment analysis approach is used to predict sentiment in Twitter datasets for Digicel and Lime, the main mobile companies in Jamaica, using supervised learning classification techniques. The results indicate an average accuracy of 82.2 percent in classifying tweets across three separate classification algorithms, compared with the purported baseline of 70 percent, and an average root mean squared error of 0.31. These results indicate that analyzing sentiment on social media to gain customer feedback can be a viable solution for mobile companies looking to improve business performance.
Keywords: machine learning, sentiment analysis, social media, supervised learning
Procedia PDF Downloads 441