Search results for: qualitative text analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29863

29713 Disowning of ‘Our Lady of Alice Bhatti’ by Mohammad Hanif Through Gendered and Religious Discourse

Authors: Abrar Ajmal

Abstract:

The language used in literature reveals the culture and social gestalt of the society in which it has been constructed and consumed. Following this rationale, this paper aims to track socio-religious and cultural-economic disparities and discrepancies towards minorities, particularly Christians, in an Islamic re(public) with a clear Muslim majority, by analysing instances of language used in the narrative of “Our Lady of Alice Bhatti” by Mohammad Hanif. It would highlight social inequalities deeply embedded in sociocultural discourse. Moreover, this research would also touch upon the question of gender discrimination and the construction of gender as a female entity in a male-chauvinistic setting through language, since the novel revolves around the communicative forfeits of Alice Bhatti’s life, in which she struggles to fit into a society that does not accommodate her. It would employ Fairclough's framework to conduct a critical discourse analysis of the text at three levels, namely textual analysis, discursive practices, and socio-cultural analysis. Thus, the results would reveal textual findings in linguistic analysis, a range of embedded discourses in discursive practices, and the consumption of the text in socio-cultural explications, based on the language and lexicalization employed in the selected excerpts.

Keywords: gendered discourse, socio-economic disparities, minorities, Islamization, analytical framework

Procedia PDF Downloads 26
29712 The Platform for Digitization of Georgian Documents

Authors: Erekle Magradze, Davit Soselia, Levan Shughliashvili, Irakli Koberidze, Shota Tsiskaridze, Victor Kakhniashvili, Tamar Chaghiashvili

Abstract:

Since the beginning of active publishing activity in Georgia, voluminous printed material has been accumulated, the digitization of which is an important task. Digitized materials will be available to the audience, and it will be possible to find text in them and conduct various factual research. Digitizing scanned documents means scanning documents, extracting text from the scanned documents, and processing the text into a corresponding language model to detect inaccuracies and grammatical errors. Implementing these stages requires a unified, scalable, and automated platform, where the digital service developed for each stage will perform the task assigned to it; at the same time, it will be possible to develop these services dynamically so that there is no interruption in the work of the platform.

Keywords: NLP, OCR, BERT, Kubernetes, transformers

Procedia PDF Downloads 118
29711 Aviation versus Aerospace: A Differential Analysis of Workforce Jobs via Text Mining

Authors: Sarah Werner, Michael J. Pritchard

Abstract:

From pilots to engineers, the skills development within the aerospace industry is exceptionally broad. Employers often struggle with finding the right mixture of qualified skills to fill their organizational demands. This effort to find qualified talent is further complicated by the industrial delineation between two key areas: aviation and aerospace. In a broad sense, the aerospace industry overlaps with the aviation industry. In turn, the aviation industry is a smaller sector segment within the context of the broader definition of the aerospace industry. Furthermore, it could be conceptually argued that -in practice- there is little distinction between these two sectors (i.e., aviation and aerospace). However, through our unstructured text analysis of over 6,000 job listings captured, our team found a clear delineation between aviation-related jobs and aerospace-related jobs. Using techniques in natural language processing, our research identifies an integrated workforce skill pattern that clearly breaks between these two sectors. While the aviation sector has largely maintained its need for pilots, mechanics, and associated support personnel, the staffing needs of the aerospace industry are being progressively driven by integrative engineering needs. Increasingly, this is leading many aerospace-based organizations towards the acquisition of 'system level' staffing requirements. This research helps to better align higher educational institutions with the current industrial staffing complexities within the broader aerospace sector.

Keywords: aerospace industry, job demand, text mining, workforce development

Procedia PDF Downloads 236
29710 On-Line Consumer Comments (E-Wom): A Case Qualitative Analysis on Resort Hotel Consumers

Authors: Yasin Bilim, Alaaddin Başoda

Abstract:

The recent growth of internet applications in hospitality and tourism has prompted on-line consumer comments and reviews. Many researchers and practitioners have named this enormous potential “e-WOM (electronic word of mouth)”. Travel comments are important experiential information for potential travellers. Many studies have been conducted to analyse the effects of e-WOM on hotel consumers, and quantitative methods have broadly been used for analysing online comments. However, few studies have mentioned the positive practical aspects of the comments for hotel marketers. The study aims to show different uses and effects of hotel consumers’ comments. As qualitative analysis methods, grounded theory, content analysis and discourse analysis were used. The data are based on the 10 resort hotel consumers’ on-line comments. Results show that consumers tend to write comments about service personnel, rooms, food services and the pool in their online space. These indicators can be used by hotel marketers as a marketing information tool.

Keywords: comments, E-WOM, hotel consumer, qualitative

Procedia PDF Downloads 201
29709 ExactData Smart Tool For Marketing Analysis

Authors: Aleksandra Jonas, Aleksandra Gronowska, Maciej Ścigacz, Szymon Jadczak

Abstract:

Exact Data is a smart tool which helps with meaningful marketing content creation. It helps marketers achieve this by analyzing the text of an advertisement before and after its publication on social media sites like Facebook or Instagram. In our research we focus on four areas of natural language processing (NLP): grammar correction, sentiment analysis, irony detection and advertisement interpretation. Our research has identified a considerable lack of NLP tools for the Polish language, which specifically aid online marketers. In light of this, our research team has set out to create a robust and versatile NLP tool for the Polish language. The primary objective of our research is to develop a tool that can perform a range of language processing tasks in this language, such as sentiment analysis, text classification, text correction and text interpretation. Our team has been working diligently to create a tool that is accurate, reliable, and adaptable to the specific linguistic features of Polish, and that can provide valuable insights for a wide range of marketers needs. In addition to the Polish language version, we are also developing an English version of the tool, which will enable us to expand the reach and impact of our research to a wider audience. Another area of focus in our research involves tackling the challenge of the limited availability of linguistically diverse corpora for non-English languages, which presents a significant barrier in the development of NLP applications. One approach we have been pursuing is the translation of existing English corpora, which would enable us to use the wealth of linguistic resources available in English for other languages. Furthermore, we are looking into other methods, such as gathering language samples from social media platforms. 
By analyzing the language used in social media posts, we can collect a wide range of data that reflects the unique linguistic characteristics of specific regions and communities, which can then be used to enhance the accuracy and performance of NLP algorithms for non-English languages. In doing so, we hope to broaden the scope and capabilities of NLP applications. Our research focuses on several key NLP techniques including sentiment analysis, text classification, text interpretation and text correction. To ensure that we can achieve the best possible performance for these techniques, we are evaluating and comparing different approaches and strategies for implementing them. We are exploring a range of different methods, including transformers and convolutional neural networks (CNNs), to determine which ones are most effective for different types of NLP tasks. By analyzing the strengths and weaknesses of each approach, we can identify the most effective techniques for specific use cases, and further enhance the performance of our tool. Our research aims to create a tool, which can provide a comprehensive analysis of advertising effectiveness, allowing marketers to identify areas for improvement and optimize their advertising strategies. The results of this study suggest that a smart tool for advertisement analysis can provide valuable insights for businesses seeking to create effective advertising campaigns.

Keywords: NLP, AI, IT, language, marketing, analysis

Procedia PDF Downloads 56
29708 HPTLC Fingerprint Profiling of Protorhus longifolia Methanolic Leaf Extract and Qualitative Analysis of Common Biomarkers

Authors: P. S. Seboletswe, Z. Mkhize, L. M. Katata-Seru

Abstract:

Protorhus longifolia is known as a medicinal plant that has been used traditionally to treat various ailments such as hemiplegic paralysis, blood clotting related diseases, diarrhoea, heartburn, etc. The study reports a High-Performance Thin Layer Chromatography (HPTLC) fingerprint profile of Protorhus longifolia methanolic extract and its qualitative analysis for gallic acid, rutin, and quercetin. HPTLC analysis was performed using a CAMAG HPTLC system equipped with CAMAG automatic TLC sampler 4, CAMAG Automatic Developing Chamber 2 (ADC2), CAMAG visualizer 2, CAMAG Thin Layer Chromatography (TLC) scanner and visionCATS CAMAG HPTLC software. A mobile phase comprising toluene, ethyl acetate, formic acid (21:15:3) was used for qualitative analysis of gallic acid and revealed eight peaks, while the mobile phase containing ethyl acetate, water, glacial acetic acid, formic acid (100:26:11:11) for qualitative analysis of rutin and quercetin revealed six peaks. HPTLC silica gel 60 F254 glass plates (10 × 10) were used as the stationary phase. Gallic acid was detected at Rf = 0.35, while rutin and quercetin were not evident in the extract. Further studies will be performed to quantify gallic acid in Protorhus longifolia leaves and also identify other biomarkers.

Keywords: biomarkers, fingerprint profiling, gallic acid, HPTLC, Protorhus longifolia

Procedia PDF Downloads 116
29707 Recognizing Customer Preferences Using Review Documents: A Hybrid Text and Data Mining Approach

Authors: Oshin Anand, Atanu Rakshit

Abstract:

The vast growth in e-commerce ventures makes this area a prominent research stream. Besides several quantified parameters, the textual content of reviews is a storehouse of much information that can educate companies and help them earn profit. This study is an attempt in this direction. The article attempts to categorize data based on a computed metric that quantifies the influencing capacity of reviews, yielding two categories of high and low influential reviews. Further, each of these documents is studied to derive several product feature categories. Each of these categories, along with the computed metric, is converted to linguistic identifiers and used in an association mining model. The article makes a novel attempt to combine feature attraction with a quantified metric to categorize review text and finally provide frequent patterns that depict customer preferences. Frequent mentions in highly influential reviews depict customer likes or preferred features in the product, whereas prominent patterns in low influencing reviews highlight what is not important for customers. This is achieved using a hybrid approach of text mining for feature and term extraction, sentiment analysis, a multicriteria decision-making technique and an association mining model.
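The frequent-pattern step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it uses brute-force support counting rather than a full association mining algorithm, and the feature labels, the `high_influence`/`low_influence` identifiers and the function name `frequent_itemsets` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets whose support >= min_support.

    Each transaction is a list of labels attached to one review. Support is
    the fraction of transactions that contain every label in the itemset.
    """
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # ignore repeats within one review
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    total = len(transactions)
    return {
        combo: count / total
        for combo, count in counts.items()
        if count / total >= min_support
    }


# Hypothetical reviews, already reduced to feature labels plus an influence label.
reviews = [
    ["battery", "screen", "high_influence"],
    ["battery", "high_influence"],
    ["price", "low_influence"],
    ["battery", "screen", "high_influence"],
]

patterns = frequent_itemsets(reviews, min_support=0.5)
for itemset, support in sorted(patterns.items()):
    print(itemset, support)
```

In this toy data, the pair `("battery", "high_influence")` appears in three of four reviews, which is the kind of pattern the abstract reads as a preferred feature.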

Keywords: association mining, customer preference, frequent pattern, online reviews, text mining

Procedia PDF Downloads 365
29706 Degraded Document Analysis and Extraction of Original Text Document: An Approach without Optical Character Recognition

Authors: L. Hamsaveni, Navya Prakash, Suresha

Abstract:

Document Image Analysis recognizes text and graphics in documents acquired as images. An approach without Optical Character Recognition (OCR) for degraded document image analysis has been adopted in this paper. The technique involves document imaging methods such as Image Fusing and Speeded Up Robust Features (SURF) Detection to identify and extract the degraded regions from a set of document images to obtain an original document with complete information. If the captured degraded document image is skewed, it has to be straightened (deskewed) before further processing. A special image storage format known as YCbCr is used as a tool to convert the Grayscale image to RGB image format. The presented algorithm is tested on various types of degraded documents such as printed documents, handwritten documents, old script documents and handwritten image sketches in documents. The purpose of this research is to obtain an original document for a given set of degraded documents of the same source.

Keywords: grayscale image format, image fusing, RGB image format, SURF detection, YCbCr image format

Procedia PDF Downloads 348
29705 Experiences Using Autoethnography as a Methodology for Research in Education

Authors: Sarah Amodeo

Abstract:

Drawing on the author’s research about the experiences of female immigrant students in academic Adult Education in Montreal, Quebec, this paper deconstructs the benefits of autoethnography as a methodology for educators in Adult Education. Autoethnography is an advantageous methodology for teachers in Adult Education as it allows for deep engagement, allowing educators to reflect on student experiences and their day-to-day realities, and in turn, allowing for professional development, improved andragogy, and changes to classroom practices. Autoethnography is a qualitative research methodology that cultivates strategies for improving adult learning. The paper begins by outlining the context that inspired autoethnography for the author’s work, highlighting the emergence of autoethnography as a method, while examining how it is evolving and drawing on foundational work that continues to inspire research. The basic autoethnographic methodologies that are explored in this paper include the use of memory work in episode formation, the use of personal photographs, and textual readings of artworks. Memory work allows the researcher to use their professional experience and the lived/shared experiences of their students in their research, drawing on episodes from their past. Personal photographs and descriptions of artwork allow researchers to explore images of learning environments/realities in ways that complement student experiences. Major findings of the text are examined through the analysis of categories of autoethnography. Specific categories include realism, impressionism, and conceptualism, which aid in orientating the analysis and emergent themes that develop through self-study. Finally, the text presents a discussion surrounding the limitations of autoethnography, with attention to trustworthiness and ethical issues.
The paper concludes with a consideration of the implications of autoethnography for adult educators in juxtaposition with youth sector work.

Keywords: artwork, autoethnography, conceptualism, episode formation, impressionism, memory work, personal photographs, realism

Procedia PDF Downloads 156
29704 Qualitative Case Study Research in Accounting: Challenges and Prospects the Libyan Case Study

Authors: Bubaker F. Shareia

Abstract:

Much of the literature on research design has focussed on research conducted in developed, uni-cultural or primarily English speaking countries. Studies of qualitative case study research, the challenges and prospects have been embedded in Western/Euro-centric society and social theories. Although there have been some theoretical studies, few empirical studies have been conducted to explore the nature of the challenges of qualitative case study in developing countries. These challenges include accessibility to organizations, conducting interviews in developing countries, accessing documents and observing official meetings, language and cultural challenges, the use of consent forms, issues affecting access to companies, respondent issues and data analysis. The author, while conducting qualitative case study research in Libya, faced all these issues. The discussion in this paper examines these issues in order to make a contribution toward the literature in this area.

Keywords: accounting, challenges, prospects, developing countries, Libya, qualitative case study

Procedia PDF Downloads 277
29703 Pregnancy through the Lens of Iranian Women with HIV: A Qualitative

Authors: Zahra BehboodiI-Moghadam, Zohre Khalajinia, Ali Reza Nikbakht Nasrabadi, Minoo Mohraz

Abstract:

The purpose of our study was to explore and describe the experiences of pregnant women with HIV in Iran. A qualitative exploratory study with conventional content analysis was used. Twelve pregnant women with HIV who were referred for perinatal care at the Imam Khomeini Hospital Behavioral Diseases Consultation Center in Tehran were recruited to participate in in-depth interviews. The average age of the participants was 32.5 years. Four main themes were extracted from the data: “fear and hope”, “stigma and discrimination”, “marital life stability” and “trust”. The findings reveal that pregnant women living with HIV are vulnerable and need professional support. Improving the knowledge of healthcare professionals, especially midwives, on pregnancy complications for women with HIV is crucial in order to provide high-quality care to HIV-positive pregnant women.

Keywords: HIV, pregnancy, content analysis, experiences, Iran, qualitative research

Procedia PDF Downloads 441
29702 Linguistic Analysis of Argumentation Structures in Georgian Political Speeches

Authors: Mariam Matiashvili

Abstract:

Argumentation is an integral part of our daily communications - formal or informal. Argumentative reasoning, techniques, and language tools are used both in personal conversations and in the business environment. Verbalization of the opinions requires the use of extraordinary syntactic-pragmatic structural quantities - arguments that add credibility to the statement. The study of argumentative structures allows us to identify the linguistic features that make the text argumentative. Knowing what elements make up an argumentative text in a particular language helps the users of that language improve their skills. Also, natural language processing (NLP) has become especially relevant recently. In this context, one of the main emphases is on the computational processing of argumentative texts, which will enable the automatic recognition and analysis of large volumes of textual data. The research deals with the linguistic analysis of the argumentative structures of Georgian political speeches - particularly the linguistic structure, characteristics, and functions of the parts of the argumentative text - claims, support, and attack statements. The research aims to describe the linguistic cues that give the sentence a judgmental/controversial character and helps to identify reasoning parts of the argumentative text. The empirical data comes from the Georgian Political Corpus, particularly TV debates. Consequently, the texts are of a dialogical nature, representing a discussion between two or more people (most often between a journalist and a politician). 
The research uses the following approaches to identify and analyze the argumentative structures: (1) Lexical Classification and Analysis, which identifies lexical items that are relevant in the process of creating argumentative texts and builds the lexicon of argumentation (groups of words gathered from a semantic point of view); (2) Grammatical Analysis and Classification, which means grammatical analysis of the words and phrases identified based on the arguing lexicon; and (3) Argumentation Schemas, which describes and identifies the argumentation schemes that are most likely used in Georgian political speeches. As a final step, we analyzed the relations between the above-mentioned components. For example, if an identified argument scheme is “Argument from Analogy”, the identified lexical items semantically express analogy too, and they are most likely adverbs in Georgian. As a result, we created the lexicon with the words that play a significant role in creating Georgian argumentative structures. Linguistic analysis has shown that verbs play a crucial role in creating argumentative structures.

Keywords: Georgian, argumentation schemas, argumentation structures, argumentation lexicon

Procedia PDF Downloads 50
29701 An Emphasis on Creativity-Speak Words Increases Crowdfunding Success

Authors: Trayan Kushev, E. Shaunn Mattingly, Andrew S. Manikas

Abstract:

This study utilizes computer-aided text analysis (CATA) on the descriptions of 248,614 Kickstarter crowdfunding campaigns to reveal that backers are more likely to provide funding to projects that contain a higher percentage of creativity-speak words. Further, this relationship is observed to be stronger for product-based campaigns (e.g., games, technology, design) and weaker for content-based campaigns (e.g., film, music, publishing). In addition, both positive linguistic tone and the use of words expressing gratitude in the text of the campaign strengthen the positive effect of creativity-speak on campaign success.
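The dictionary-based measure behind computer-aided text analysis can be sketched as follows. This is only an illustration of the general idea (the share of a text's words that fall in a word list). The word list, the tokenization rule and the name `creativity_share` are assumptions, since the study's actual creativity-speak dictionary is not given here.

```python
import re

# Hypothetical mini-dictionary; the study's real creativity-speak list is not shown here.
CREATIVITY_WORDS = {"creative", "innovative", "original", "novel", "imaginative"}


def creativity_share(text, dictionary=CREATIVITY_WORDS):
    """Return the percentage of word tokens in `text` that appear in `dictionary`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0  # avoid division by zero for empty descriptions
    hits = sum(1 for token in tokens if token in dictionary)
    return 100.0 * hits / len(tokens)


sample = "An innovative and original board game for creative families"
print(round(creativity_share(sample), 1))
```

A score like this could then be used as a predictor of funding success, with campaign category as a moderator, which is the relationship the abstract reports.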

Keywords: creativity-speak, crowdfunding, entrepreneurship, gratitude, tone

Procedia PDF Downloads 47
29700 Encryption and Decryption of Nucleic Acid Using Deoxyribonucleic Acid Algorithm

Authors: Iftikhar A. Tayubi, Aabdulrahman Alsubhi, Abdullah Althrwi

Abstract:

The deoxyribonucleic acid text provides a single source of high-quality cryptography for deoxyribonucleic acid sequences for structural biologists. We will provide an intuitive, well-organized and user-friendly web interface that allows users to encrypt and decrypt deoxyribonucleic acid sequence text. It secures the sequence by using an algorithm to encrypt and decrypt it. The utility of this deoxyribonucleic acid sequence text is that it can provide a user-friendly interface for users to encrypt, decrypt and store information about deoxyribonucleic acid sequences. The interfaces created in this project will satisfy the demands of the scientific community by providing full encryption of deoxyribonucleic acid sequences through this website. We have adopted a methodology using C# and Active Server Pages .NET (ASP.NET) for programming, which is smart and secure. Deoxyribonucleic acid sequence text is a wonderful tool for encrypting large quantities of data efficiently. The users can thus navigate from one encoding and store orange text, depending on the field for user’s interest. Algorithm classification allows a user to protect the deoxyribonucleic acid sequence from change, whether an alteration or error occurred during the deoxyribonucleic acid sequence data transfer. It will check the integrity of the deoxyribonucleic acid sequence data during access.

Keywords: algorithm, ASP.NET, DNA, encrypt, decrypt

Procedia PDF Downloads 204
29699 Applying Dictogloss Technique to Improve Auditory Learners’ Writing Skills in Second Language Learning

Authors: Aji Budi Rinekso

Abstract:

There are some common problems that students often face in writing. The problems are related to macro and micro skills of writing, such as incorrect spellings, inappropriate diction, grammatical errors, random ideas, and irrelevant supporting sentences. Therefore, a teaching technique that can solve those problems is needed. The dictogloss technique is a teaching technique that involves listening practice, so it is a suitable teaching technique for students with an auditory learning style. The dictogloss technique comprises four basic steps: (1) warm up, (2) dictation, (3) reconstruction and (4) analysis and correction. Warm up is when students find out about topics and do some preparatory vocabulary work. Then, dictation is when the students listen to texts read at normal speed by a teacher. The text is read by the teacher twice: at the first reading the students only listen to the teacher, and at the second reading the students listen to the teacher again and take notes. Next, reconstruction is when the students discuss the information from the text read by the teacher and start to write a text. Lastly, analysis and correction is when the students check their writings and revise them. Dictogloss offers some advantages in relation to the efforts of improving writing skills. Through the use of the dictogloss technique, students can solve their problems in both macro skills and micro skills. Easier idea generation and better writing mechanics are the benefits of dictogloss.

Keywords: auditory learners, writing skills, dictogloss technique, second language learning

Procedia PDF Downloads 122
29698 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian

Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak

Abstract:

The Persian language is an inflectional subject-object-verb language. This fact makes Persian a more uncertain language. However, using techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction will lead us to a more understandable and precise language. In most works on Persian, these techniques are addressed individually. Despite that, we believe that for text refinement in Persian, all of these tasks are necessary. In this work, we propose the ViraPart framework, which uses embedded ParsBERT at its core for text clarification. First, we used the BERT variant for Persian followed by a classifier layer for classification procedures. Next, we combined the models' outputs to produce clear text. In the end, the proposed models for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction achieve averaged macro F1 scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.

Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers

Procedia PDF Downloads 178
29697 Understanding the Challenges of Lawbook Translation via the Framework of Functional Theory of Language

Authors: Tengku Sepora Tengku Mahadi

Abstract:

Where the speed of book writing lags behind the high need for such material for tertiary studies, translation offers a way to enhance the equilibrium in this demand-supply equation. Nevertheless, translation is confronted by obstacles that threaten its effectiveness. The primary challenge to the production of efficient translations may well be related to the text-type and in terms of its complexity. A text that is intricately written with unique rhetorical devices, subject-matter foundation and cultural references will undoubtedly challenge the translator. Longer time and greater effort would be the consequence. To understand these text-related challenges, the present paper set out to analyze a lawbook entitled Learning the Law by David Melinkoff. The book is chosen because it has often been used as a textbook or for reference in many law courses in the United Kingdom and has seen over thirteen editions; therefore, it can be said to be a worthy book for studies in law. Another reason is the existence of a ready translation in Malay. Reference to this translation enables confirmation to some extent of the potential problems that might occur in its translation. Understanding the organization and the language of the book will help translators to prepare themselves better for the task. They can anticipate the research and time that may be needed to produce an effective translation. Another premise here is that this text-type implies certain ways of writing and organization. Accordingly, it seems practicable to adopt the functional theory of language as suggested by Michael Halliday as its theoretical framework. Concepts of the context of culture, the context of situation and measures of the field, tenor and mode form the instruments for analysis. Additional examples from similar materials can also be used to validate the findings. 
Some interesting findings include the presence of several other text-types or sub-text-types in the book and the dependence on literary discourse and devices to capture the meanings better or add color to the dry field of law. In addition, many elements of culture can be seen, for example, the use of familiar alternatives, allusions, and even terminology and references that date back to various periods of time and languages. Also found are parts which discuss origins of words and terms that may be relevant to readers within the United Kingdom but make little sense to readers of the book in other languages. In conclusion, the textual analysis in terms of its functions and the linguistic and textual devices used to achieve them can then be applied as a guide to determine the effectiveness of the translation that is produced.

Keywords: functional theory of language, lawbook text-type, rhetorical devices, culture

Procedia PDF Downloads 125
29696 Improved Processing Speed for Text Watermarking Algorithm in Color Images

Authors: Hamza A. Al-Sewadi, Akram N. A. Aldakari

Abstract:

Copyright protection and ownership proof of digital multimedia are achieved nowadays by digital watermarking techniques. A text watermarking algorithm for protecting the property rights and ownership judgment of color images is proposed in this paper. Embedding is achieved by inserting text elements randomly into the color image as noise. The YIQ image processing model is found to be faster than other image processing methods, and hence, it is adopted for the embedding process. An optional choice of encrypting the text watermark before embedding is also suggested (in case it is required by some applications), where the text can be encrypted using any enciphering technique, adding more difficulty for hackers. Experiments resulted in an embedding speed of more than double that of the other systems considered (such as the least significant bit method and separate color code methods), and a fairly acceptable level of peak signal to noise ratio (PSNR) with low mean square error values for watermarking purposes.
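The two quality measures named in this abstract, mean square error and peak signal to noise ratio, follow standard definitions and can be sketched in a few lines. The function names and the toy pixel values below are hypothetical, and the sketch assumes 8-bit samples (maximum value 255) given as flat lists; it does not reproduce the paper's embedding algorithm.

```python
import math


def mse(original, modified):
    """Mean square error between two equally sized sequences of pixel values."""
    if len(original) != len(modified) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)


def psnr(original, modified, max_value=255):
    """Peak signal to noise ratio in decibels; infinite for identical inputs."""
    error = mse(original, modified)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


# Toy example: four pixel values before and after a hypothetical watermark is embedded.
cover = [10, 20, 30, 40]
marked = [10, 21, 30, 38]
print(mse(cover, marked))   # 1.25
print(psnr(cover, marked))
```

A higher PSNR and a lower MSE both indicate that the watermarked image stays closer to the original.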

Keywords: steganography, watermarking, time complexity measurements, private keys

Procedia PDF Downloads 121
29695 Investigating the Effectiveness of Multilingual NLP Models for Sentiment Analysis

Authors: Othmane Touri, Sanaa El Filali, El Habib Benlahmar

Abstract:

Natural Language Processing (NLP) has gained significant attention lately. It has proved its ability to analyze and extract insights from unstructured text data in various languages. One of the most popular NLP applications is sentiment analysis, which aims to identify the sentiment expressed in a piece of text, such as positive, negative, or neutral, in multiple languages. While there are several multilingual NLP models available for sentiment analysis, there is a need to investigate their effectiveness in different contexts and applications. In this study, we aim to investigate the effectiveness of different multilingual NLP models for sentiment analysis on a dataset of online product reviews in multiple languages. The performance of several NLP models, including Google Cloud Natural Language API, Microsoft Azure Cognitive Services, Amazon Comprehend, Stanford CoreNLP, spaCy, and Hugging Face Transformers, is compared. The models are evaluated on several metrics, including accuracy, precision, recall, and F1 score, and their performance is compared across different categories of product reviews. To run the study, the dataset was preprocessed by cleaning and tokenizing the text data in multiple languages. Each model was then trained and tested using a cross-validation approach, in which the dataset is randomly divided into training and testing sets and the process is repeated multiple times. A grid search approach was applied to optimize the hyperparameters of each model and select the best-performing model for each category of product reviews and language. The findings of this study provide insights into the effectiveness of different multilingual NLP models for multilingual sentiment analysis and their suitability for different languages and applications.
The strengths and limitations of each model were identified, and recommendations for selecting the most performant model based on the specific requirements of a project were provided. This study contributes to the advancement of research methods in multilingual NLP and provides a practical guide for researchers and practitioners in the field.
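The evaluation procedure described in this abstract (repeated random train/test splits combined with a grid search over hyperparameters) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the toy cue-word model, the `threshold` hyperparameter, and the sample reviews are all hypothetical, and accuracy stands in for the fuller set of metrics the study reports.

```python
import random
from statistics import mean

def k_fold_indices(n_items, k, seed=0):
    """Shuffle item indices and split them into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(texts, labels, train_fn, k=5, seed=0):
    """Mean held-out accuracy of train_fn over k train/test splits."""
    scores = []
    for held_out in k_fold_indices(len(texts), k, seed):
        held = set(held_out)
        train = [(texts[i], labels[i]) for i in range(len(texts)) if i not in held]
        predict = train_fn(train)  # train_fn returns a prediction function
        correct = sum(predict(texts[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)

def grid_search(texts, labels, make_train_fn, grid, k=5):
    """Return the grid value with the best cross-validated accuracy."""
    results = {value: cross_validate(texts, labels, make_train_fn(value), k)
               for value in grid}
    return max(results, key=results.get), results

# Hypothetical stand-in model: predicts "pos" when a review contains at
# least `threshold` positive cue words. A real model would learn from `train`.
POSITIVE_CUES = {"good", "great"}

def make_train_fn(threshold):
    def train_fn(train):
        def predict(text):
            hits = sum(tok in POSITIVE_CUES for tok in text.split())
            return "pos" if hits >= threshold else "neg"
        return predict
    return train_fn

# Invented sample data for illustration only.
texts = ["good", "great good", "bad", "poor", "good great",
         "awful", "great", "terrible", "good good", "bad bad"]
labels = ["pos", "pos", "neg", "neg", "pos",
          "neg", "pos", "neg", "pos", "neg"]

best, results = grid_search(texts, labels, make_train_fn, grid=[0, 1, 2])
print(best, results)
```

In a multilingual setting, the same loop would presumably be run once per language and per review category, so that the selected model and hyperparameters can differ between them.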

Keywords: NLP, multilingual, sentiment analysis, texts

Procedia PDF Downloads 60
29694 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still at an early stage, as the phenomenon has only recently attracted society's interest. Machine learning helps to solve complex problems and to build AI systems, especially in cases where the knowledge is tacit or not known. To identify fake news, we applied three machine learning classifiers: Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification alone is not fully reliable for fake news detection because classification methods are not specialized for fake news. By integrating machine learning with text-based processing, we can detect fake news and build classifiers that classify news data. Text classification mainly focuses on extracting various features from text and then incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake news, owing to the unavailability of corpora. We applied the three classifiers to two publicly available datasets. Experimental analysis on the existing datasets indicates very encouraging and improved performance.
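To make one of the three named classifiers concrete, the following is a minimal sketch of a binary Passive Aggressive learner (the PA-I variant) over bag-of-words features. It is an illustration only: the abstract does not say which variant, features, or library the authors used, and the function names, the aggressiveness parameter `C`, and the four toy headlines are hypothetical.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())

def dot(weights, features):
    """Sparse dot product between a weight dict and a feature Counter."""
    return sum(weights.get(tok, 0.0) * val for tok, val in features.items())

def train_passive_aggressive(examples, C=1.0, epochs=5):
    """examples: list of (text, label), label +1 (fake) or -1 (not fake)."""
    weights = {}
    for _ in range(epochs):
        for text, label in examples:
            x = featurize(text)
            loss = max(0.0, 1.0 - label * dot(weights, x))  # hinge loss
            if loss > 0.0 and x:
                norm_sq = sum(v * v for v in x.values())
                tau = min(C, loss / norm_sq)  # PA-I step size, capped at C
                for tok, val in x.items():
                    weights[tok] = weights.get(tok, 0.0) + tau * label * val
    return weights

def predict(weights, text):
    """Return +1 (fake) or -1 (not fake); a score of zero maps to +1."""
    return 1 if dot(weights, featurize(text)) >= 0 else -1

# Invented toy headlines for illustration only.
examples = [
    ("shocking miracle cure revealed", 1),
    ("council approves annual budget", -1),
    ("miracle diet shocking secret", 1),
    ("ministry publishes budget report", -1),
]
weights = train_passive_aggressive(examples)
print(predict(weights, "shocking miracle"))
print(predict(weights, "annual budget report"))
```

The learner is "passive" when an example is already classified with a margin of at least 1 (no update) and "aggressive" otherwise, moving the weights just far enough to fix the margin, subject to the cap `C`.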

Keywords: fake news detection, natural language processing, machine learning, classification techniques

Procedia PDF Downloads 133
29693 Patronage Network and Ideological Manipulations in Translation of Literary Texts: A Case Study of George Orwell's “1984” in Persian Translation in the Period 1980 to 2015

Authors: Masoud Hassanzade Novin, Bahloul Salmani

Abstract:

Translation is not merely a matter of linguistic aspects; it must also be considered within the cultural framework of both the source and target texts. In the 20th century, the translation process and translated texts came to be examined from a new angle, mostly within the patronage and ideological frameworks of the target language. To scrutinize these factors in the translation process, both micro-level and macro-level factors can be taken into consideration. In this qualitative study, based on a critical discourse analysis approach, George Orwell's novel “1984” was chosen as the corpus for a contrastive analysis with its Persian translations. The results revealed distortions in the target texts that were influenced by ideology and the patronage network. The manipulated terms differed across categories, revealing the aspects of manipulation in the translated texts.

Keywords: critical discourse analysis, ideology, patronage network, translated texts

Procedia PDF Downloads 298
29692 The Arabic Literary Text, between Proficiency and Pedagogy

Authors: Abdul Rahman M. Chamseddine, Mahmoud El-ashiri

Abstract:

In language teaching, communication skills are essential for the learner to acquire; however, these skills alone might not support the comprehension of texts of a literary or artistic nature, such as poetry. Understanding sentences and expressions is not enough to understand a poem; other skills are needed to grasp the special structure of a text whose literary meaning remains inapprehensible even when its linguistic meaning is well understood. Further components are also needed that reach beyond a single text to other similar texts, which can be understood through solid traditions that do not form an obstacle to change and progress. This is not exclusive to texts classified as literary; it also applies to some short everyday phrases and indicatively charged expressions that can be classified as literary or that bear a literary flavor. Such expressions can be found in newspaper headlines, TV news reports, and perhaps football commentaries. Understanding this special linguistic use, described as literary, is highly important for understanding discourse that would generally be classified as very far from literature. This work explores the role of the literary text in the language class and the way it is covered or dealt with across all levels of acquiring proficiency. It also attempts to survey the position of the literary text in some of the most important books for teaching Arabic around the world. Just as grammar is needed to understand the language, another (literary) grammar is needed to understand literature.

Keywords: language teaching, Arabic, literature, pedagogy, language proficiency

Procedia PDF Downloads 245
29691 Understanding Cyber Terrorism from Motivational Perspectives: A Qualitative Data Analysis

Authors: Yunos Zahri, Ariffin Aswami

Abstract:

Cyber terrorism represents the convergence of two worlds: virtual and physical. The virtual world is a place in which computer programs function and data move, whereas the physical world is where people live and function. The merging of these two domains is the interface being targeted in the incidence of cyber terrorism. To better understand why cyber terrorism acts are committed, this study presents the context of cyber terrorism from motivational perspectives. Motivational forces behind cyber terrorism can be social, political, ideological and economic. In this research, data are analyzed using a qualitative method. A semi-structured interview with purposive sampling was used for data collection. With the growing interconnectedness between critical infrastructures and Information & Communication Technology (ICT), selecting targets that facilitate maximum disruption can significantly influence terrorists. This work provides a baseline for defining the concept of cyber terrorism from motivational perspectives.

Keywords: cyber terrorism, terrorism, motivation, qualitative analysis

Procedia PDF Downloads 381
29690 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The aim of this research is to develop an algorithm capable of classifying news articles from the automobile industry according to the competitive actions they entail, using Text Mining (TM) methods. This requires testing how to properly preprocess the data by preparing the pipeline that best fits each algorithm. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing to identify the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines: Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are then optimized further by testing several parameters of each. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.
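For a multi-class task like this one, precision, recall, and F1 are usually computed per class and then averaged. The abstract does not state which averaging scheme was used, so the sketch below assumes macro averaging purely for illustration; the function names and the four-item label lists are hypothetical. It also shows that a macro-averaged F1 can fall below both the macro precision and the macro recall, since it averages per-class F1 values rather than combining the averaged precision and recall.

```python
def per_class_scores(y_true, y_pred):
    """Return {class: (precision, recall, f1)} from two label sequences."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[cls] = (precision, recall, f1)
    return scores

def macro_average(scores):
    """Unweighted mean of precision, recall, and F1 across classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))

def accuracy(y_true, y_pred):
    """Share of items whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented labels for illustration only.
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]

macro_p, macro_r, macro_f1 = macro_average(per_class_scores(y_true, y_pred))
print(accuracy(y_true, y_pred), macro_p, macro_r, macro_f1)
```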

Keywords: Artificial Neural network, Competitive dynamics, Logistic Regression, Text classification, Text mining

Procedia PDF Downloads 95
29689 A Lean Manufacturing Profile of Practices in the Metallurgical Industry: A Methodology for Multivariate Analysis

Authors: M. Jonathan D. Morales, R. Ramón Silva

Abstract:

The purpose of this project is to analyze and determine the profile of actual lean manufacturing processes in the Metropolitan Area of Bucaramanga. Through the analysis of qualitative and quantitative variables, it was possible to establish how these manufacturers develop production practices that ensure their competitiveness and productivity in the market. In this study, a random sample of metallurgic and wrought iron companies was used, following which a quantitative focus and analysis were used to formulate a qualitative methodology for measuring the level of lean manufacturing procedures in the industry. A qualitative evaluation was also carried out through a multivariate analysis using the Numerical Taxonomy System (NTSYS) program, which allows Lean Manufacturing profiles to be determined. The results show how the companies in the sector are doing with respect to Lean Manufacturing practices and identify the level of management that these companies practice with respect to this topic. In addition, it was possible to ascertain that there is no single dominant profile in the sector when it comes to Lean Manufacturing. It was established that the companies in the metallurgic and wrought iron industry show low levels of Lean Manufacturing implementation. Each one carries out diverse actions that are insufficient to consolidate a sectoral strategy for developing a competitive advantage that would enable them to tie together a production strategy.

Keywords: production line management, metallurgic industry, lean manufacturing, productivity

Procedia PDF Downloads 440
29688 A Methodology for Investigating Public Opinion Using Multilevel Text Analysis

Authors: William Xiu Shun Wong, Myungsu Lim, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, many users have begun to frequently share their opinions on diverse issues using various social media. Therefore, numerous governments have attempted to establish or improve national policies according to the public opinions captured from various social media. In this paper, we indicate several limitations of the traditional approaches to analyzing public opinion on science and technology and provide an alternative methodology to overcome these limitations. First, we distinguish between the science and technology analysis phase and the social issue analysis phase to reflect the fact that public opinion can be formed only when a certain science and technology is applied to a specific social issue. Next, we successively apply a start list and a stop list to acquire clarified and interesting results. Finally, to identify the most appropriate documents for a given subject, we develop a new logical filter concept that consists not only of keywords but also of logical relationships among the keywords. This study then analyzes the possibilities for the practical use of the proposed methodology through its application to discovering core issues and public opinions from 1,700,886 documents comprising SNS, blogs, news, and discussions.
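The start list, stop list, and logical filter mentioned in this abstract might be sketched as follows. The abstract does not define their exact semantics, so this sketch assumes the common text-mining reading (a start list keeps only listed terms, a stop list then drops listed terms) and represents a logical filter as nested and/or/not rules over keywords. All function names, the rule format, and the sample documents are hypothetical.

```python
def tokenize(text):
    """Lowercase a text and split it on whitespace."""
    return text.lower().split()

def apply_start_then_stop(tokens, start_list, stop_list):
    """Keep only start-list terms, then drop any stop-list terms."""
    kept = [tok for tok in tokens if tok in start_list]
    return [tok for tok in kept if tok not in stop_list]

def matches(tokens, rule):
    """Evaluate a rule: a keyword string, or (operator, *sub_rules)."""
    if isinstance(rule, str):
        return rule in tokens
    operator, *operands = rule
    if operator == "and":
        return all(matches(tokens, sub) for sub in operands)
    if operator == "or":
        return any(matches(tokens, sub) for sub in operands)
    if operator == "not":
        return not matches(tokens, operands[0])
    raise ValueError(f"unknown operator: {operator}")

# Invented documents and rule for illustration only.
docs = [
    "Drone delivery raises privacy concerns",
    "Drone racing league opens",
    "New privacy law passed",
]
# Documents about drones AND (privacy OR safety), excluding racing.
rule = ("and", "drone", ("or", "privacy", "safety"), ("not", "racing"))
selected = [doc for doc in docs if matches(tokenize(doc), rule)]

terms = apply_start_then_stop(
    tokenize("The drone delivery raises privacy concerns"),
    start_list={"the", "drone", "delivery", "privacy", "concerns"},
    stop_list={"the", "concerns"},
)
print(selected, terms)
```

Expressing the filter as a rule tree rather than a flat keyword set is what lets it exclude documents that share a keyword with the subject but belong to a different topic.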

Keywords: big data, social network analysis, text mining, topic modeling

Procedia PDF Downloads 267
29687 Against Language Disorder: A Way of Reading Dialects in Yan Lianke’s Novels

Authors: Thuy Hanh Nguyen Thi

Abstract:

Using deep reading and text analysis, this article analyzes the use and creation of dialects as a way of demonstrating Yan Lianke’s creative stance. It argues that this is the writer’s narrative strategy in a fight against aphasia, a language disorder of Chinese people and culture, demonstrating a sense of return to folklore and marking his own linguistic style. In terms of verbal text, dialect in Yan Lianke’s novels is manifested through the use of words, sentences and dialects. Two types of dialect exist in Yan Lianke’s novels: the current dialect system and the particular dialect system of the Pa Lou world, created by the writer himself in order to enrich the vocabulary of Han Chinese.

Keywords: Yan Lianke, aphasia, dialect, Pa Lou world

Procedia PDF Downloads 101
29686 Fake News Detection for Korean News Using Machine Learning Techniques

Authors: Tae-Uk Yun, Pullip Chung, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Fake news is defined as news articles that are intentionally and verifiably false and could mislead readers. The spread of fake news may provoke anxiety, chaos, fear, or irrational decisions among the public. Thus, detecting fake news and preventing its spread has become a very important issue in our society. However, due to the huge amount of fake news produced every day, it is almost impossible for humans to identify it. In this context, researchers have tried over the past years to develop automated fake news detection using machine learning techniques. To the best of our knowledge, however, no prior study has proposed an automated fake news detection method for Korean news. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news content to be analyzed is converted to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). In the second step, classifiers are trained using the values produced in the first step. Machine learning techniques such as logistic regression, backpropagation network, support vector machine, and deep neural network can be applied as the classifiers. To validate the effectiveness of the proposed method, we collected about 200 short Korean news items from Seoul National University’s FactCheck, which provides detailed analysis reports from 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.
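The first step of the method, converting news content into quantified values, can be illustrated with TF-IDF, one of the techniques the abstract names. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses one common TF-IDF variant (term frequency normalized by document length, multiplied by log(N / document frequency)), the function name and sample texts are hypothetical, and whitespace tokenization is a simplification, since Korean text would normally need a morphological analyzer.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(tok for tokens in tokenized for tok in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            tok: (count / len(tokens)) * math.log(n_docs / doc_freq[tok])
            for tok, count in counts.items()
        })
    return vectors

# Invented English placeholder texts for illustration only.
documents = [
    "news vaccine rumor",
    "news election result",
    "news vaccine approved",
]
vectors = tf_idf(documents)
print(vectors[0])
```

A term that occurs in every document, such as "news" here, receives a weight of zero, while rarer terms receive higher weights; the resulting vectors are the kind of input the second step would pass to a classifier.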

Keywords: fake news detection, Korean news, machine learning, text mining

Procedia PDF Downloads 245
29685 Criminal Exhibit the Feminine Violent Victim within Thai Newspaper

Authors: Supaporn Wimonchailerk

Abstract:

This research aims to critically analyze the portrayal of female victims of violence in Thai daily newspapers. The study was qualitative, based on content analysis of two popular newspapers (Thairath and Dailynews) and two quality newspapers (Thaipost and Mathichon). Purposive sampling was used to select eleven specialized news reporters for in-depth interviews. The results show that the popular newspapers, Thairath and Dailynews, presented more news about violence against women than the quality newspapers, Thaipost and Mathichon. In addition, the majority of the sample presented such news in line with the code of ethics of The National Press Council of Thailand. Interestingly, the age of the female victim was the piece of information given the most focus. The popular newspapers ran crime scene photos on their front pages, while the quality newspapers used only headlines to present the same news.

Keywords: ethic, feminine, journalism, newspaper, violent victim

Procedia PDF Downloads 166
29684 A BERT-Based Model for Financial Social Media Sentiment Analysis

Authors: Josiel Delgadillo, Johnson Kinyua, Charles Mutigwe

Abstract:

The purpose of sentiment analysis is to determine the sentiment strength (e.g., positive, negative, neutral) from a textual source for good decision-making. Natural language processing in domains such as financial markets requires knowledge of domain ontology. Pre-trained language models, such as BERT, have made significant breakthroughs in various NLP tasks by training on large-scale unlabeled generic corpora such as Wikipedia. However, sentiment analysis is a strongly domain-dependent task. The rapid growth of social media has given users a platform to share their experiences and views about products, services, and processes, including financial markets. StockTwits and Twitter are social networks that allow the public to express their sentiments in real time. Hence, leveraging the success of unsupervised pre-training and the large amount of financial text available on social media platforms could potentially benefit a wide range of financial applications. This work focuses on sentiment analysis using social media text from platforms such as StockTwits and Twitter. To meet this need, SkyBERT, a domain-specific language model pre-trained and fine-tuned on financial corpora, has been developed. The results show that SkyBERT outperforms current state-of-the-art models in financial sentiment analysis. Extensive experimental results demonstrate the effectiveness and robustness of SkyBERT.

Keywords: BERT, financial markets, Twitter, sentiment analysis

Procedia PDF Downloads 127