Search results for: text mining analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29285

Search results for: text mining analysis

28895 A BERT-Based Model for Financial Social Media Sentiment Analysis

Authors: Josiel Delgadillo, Johnson Kinyua, Charles Mutigwe

Abstract:

The purpose of sentiment analysis is to determine the sentiment strength (e.g., positive, negative, neutral) from a textual source for good decision-making. Natural language processing in domains such as financial markets requires knowledge of domain ontology, and pre-trained language models, such as BERT, have made significant breakthroughs in various NLP tasks by training on large-scale un-labeled generic corpora such as Wikipedia. However, sentiment analysis is a strong domain-dependent task. The rapid growth of social media has given users a platform to share their experiences and views about products, services, and processes, including financial markets. StockTwits and Twitter are social networks that allow the public to express their sentiments in real time. Hence, leveraging the success of unsupervised pre-training and a large amount of financial text available on social media platforms could potentially benefit a wide range of financial applications. This work is focused on sentiment analysis using social media text on platforms such as StockTwits and Twitter. To meet this need, SkyBERT, a domain-specific language model pre-trained and fine-tuned on financial corpora, has been developed. The results show that SkyBERT outperforms current state-of-the-art models in financial sentiment analysis. Extensive experimental results demonstrate the effectiveness and robustness of SkyBERT.

Keywords: BERT, financial markets, Twitter, sentiment analysis

Procedia PDF Downloads 154
28894 Wasting Human and Computer Resources

Authors: Mária Csernoch, Piroska Biró

Abstract:

The legends about “user-friendly” and “easy-to-use” birotical tools (computer-related office tools) have been spreading and misleading end-users. This approach has led us to the extremely high number of incorrect documents, causing serious financial losses in the creating, modifying, and retrieving processes. Our research proved that there are at least two sources of this underachievement: (1) The lack of the definition of the correctly edited, formatted documents. Consequently, end-users do not know whether their methods and results are correct or not. They are not aware of their ignorance. They are so ignorant that their ignorance does not allow them to realize their lack of knowledge. (2) The end-users’ problem-solving methods. We have found that in non-traditional programming environments end-users apply, almost exclusively, surface approach metacognitive methods to carry out their computer related activities, which are proved less effective than deep approach methods. Based on these findings we have developed deep approach methods which are based on and adapted from traditional programming languages. In this study, we focus on the most popular type of birotical documents, the text-based documents. We have provided the definition of the correctly edited text, and based on this definition, adapted the debugging method known in programming. According to the method, before the realization of text editing, a thorough debugging of already existing texts and the categorization of errors are carried out. With this method in advance to real text editing users learn the requirements of text-based documents and also of the correctly formatted text. The method has been proved much more effective than the previously applied surface approach methods. The advantages of the method are that the real text handling requires much less human and computer sources than clicking aimlessly in the GUI (Graphical User Interface), and the data retrieval is much more effective than from error-prone documents.

Keywords: deep approach metacognitive methods, error-prone birotical documents, financial losses, human and computer resources

Procedia PDF Downloads 382
28893 Short Text Classification for Saudi Tweets

Authors: Asma A. Alsufyani, Maram A. Alharthi, Maha J. Althobaiti, Manal S. Alharthi, Huda Rizq

Abstract:

Twitter is one of the most popular microblogging sites that allows users to publish short text messages called 'tweets'. Increasing the number of accounts to follow (followings) increases the number of tweets that will be displayed from different topics in an unclassified manner in the timeline of the user. Therefore, it can be a vital solution for many Twitter users to have their tweets in a timeline classified into general categories to save the user’s time and to provide easy and quick access to tweets based on topics. In this paper, we developed a classifier for timeline tweets trained on a dataset consisting of 3600 tweets in total, which were collected from Saudi Twitter and annotated manually. We experimented with the well-known Bag-of-Words approach to text classification, and we used support vector machines (SVM) in the training process. The trained classifier performed well on a test dataset, with an average F1-measure equal to 92.3%. The classifier has been integrated into an application, which practically proved the classifier’s ability to classify timeline tweets of the user.

Keywords: corpus creation, feature extraction, machine learning, short text classification, social media, support vector machine, Twitter

Procedia PDF Downloads 155
28892 The Influence of Japanese Poetry in Spanish Piano Music: Benet Casablancas and Mercedes Zavala’s Haikus

Authors: Isabel Pérez Dobarro

Abstract:

In the mid-twentieth century, Spanish composers started looking beyond the national folkloric tradition (adopted by Albéniz, Granados, and Falla) and Rodrigo’s neoclassicism, and searched for other sources of inspiration. Japanese Haikus fascinated Spanish musicians, who found in their brevity and imagination a new avenue to develop their creativity. The goal of this research is to study how two renowned Spanish authors, Benet Casablancas and Mercedes Zavala, incorporated Haikus into their piano works. Based on Bruhn’s methodology on text and instrumental music relations, and developing a score and text analysis complemented by interviews with both composers, this study has revealed three possible interactions between the Haikus and these composers’ piano writing: inspiration, transmedialization, and mimesis. Findings also include specific technical gestures to support each of these approaches. Commonalities between their pieces and those by other non-Spanish composers such as Jonathan Harvey, John Cage, and Michael Berkeley have also been explored. According to the author's knowledge, this is the first study on the Japanese influence in Spanish piano music. Thus, it opens a new path for understanding musical exchanges between both countries as well as contemporary piano tools that support the interaction between text and music.

Keywords: Haiku, Spanish piano music, Benet Casablancas, Mercedes Zavala

Procedia PDF Downloads 153
28891 Unlocking the Potential of Short Texts with Semantic Enrichment, Disambiguation Techniques, and Context Fusion

Authors: Mouheb Mehdoui, Amel Fraisse, Mounir Zrigui

Abstract:

This paper explores the potential of short texts through semantic enrichment and disambiguation techniques. By employing context fusion, we aim to enhance the comprehension and utility of concise textual information. The methodologies utilized are grounded in recent advancements in natural language processing, which allow for a deeper understanding of semantics within limited text formats. Specifically, topic classification is employed to understand the context of the sentence and assess the relevance of added expressions. Additionally, word sense disambiguation is used to clarify unclear words, replacing them with more precise terms. The implications of this research extend to various applications, including information retrieval and knowledge representation. Ultimately, this work highlights the importance of refining short text processing techniques to unlock their full potential in real-world applications.

Keywords: information traffic, text summarization, word-sense disambiguation, semantic enrichment, ambiguity resolution, short text enhancement, information retrieval, contextual understanding, natural language processing, ambiguity

Procedia PDF Downloads 14
28890 Moral Wrongdoers: Evaluating the Value of Moral Actions Performed by War Criminals

Authors: Jean-Francois Caron

Abstract:

This text explores the value of moral acts performed by war criminals, and the extent to which they should alleviate the punishment these individuals ought to receive for violating the rules of war. Without neglecting the necessity of retribution in war crimes cases, it argues from an ethical perspective that we should not rule out the possibility of considering lesser punishments for war criminals who decide to perform a moral act, as it might produce significant positive moral outcomes. This text also analyzes how such a norm could be justified from a moral perspective.

Keywords: war criminals, pardon, amnesty, retribution

Procedia PDF Downloads 283
28889 Mining in Peru and Local Governance: Assessing the Contribution of CRS Projects

Authors: Sandra Carrillo Hoyos

Abstract:

Mining activities in South America have significantly grown during the last decades, given the abundance of natural resources, the implemented governmental policies to incentivize foreign investment as well as the boom in international prices for metals and oil between 2002 and 2008. While this context allowed the region to occupy a leading position between the top producers of minerals around the world, it has also meant an increase in socio-environmental conflicts which have generated costs and negative impacts not only for the companies but especially for the governments and local communities.During the latest decade, the mining sector in Peru has faced with the social resistance of a large number of communities, which began organizing actions against the implementation of high investing projects. The dissatisfaction has derived in the prevalence of socio-environmental conflicts associated with mining activities, some of them never solved into an agreement. In order to prevent those socio-environmental conflicts and obtain the social license from local communities, most of the mining companies have developed diverse initiatives within the framework of policies and practices of corporate social responsibility (CSR). This paper has assessed the mining sector’s contribution toward the local development management along the last decade, as part of CSR strategies as well as the policies promoted by the Peruvian State. This assessment found that, in the beginning, these initiatives have been based on a philanthropic approach and were reacting to pressures from local stakeholders to maintain the consent to operate from the surrounding communities as well as to create, as a result, a harmonious atmosphere for operations. Due to the weak State presence, such practices have increased the expectations of communities related to the participation of mining companies in solving structural development problems, especially those related to primary needs, infrastructure, education, health, among others. In other words, this paper was focused on analyze in what extent these initiatives have promoted local empowerment for development planning and integrated management of natural resources from a territorial approach. From this perspective, the analysis demonstrates that, while the design and planning of social investment initiatives have improved due to the sector´s sustainability approach, many companies have developed actions beyond their competence during this process. In some cases, the referenced actions have generated dependency with communities, even though this relationship has not exempted the companies of conflict situations with unfortunate consequences. Furthermore, the social programs developed have not necessarily generated a significant impact in improving the quality of life of affected populations. In fact, it is possible to identify that those regions with high mining resources and investment are facing with a situation of poverty and high dependency on mining production. In spite of the revenues derived from mining industry, local governments have not been able to translate the royalties into sustainable development opportunities. For this reason, the proposed paper suggests some challenges for the mining sector contribution to local development based on the best practices and lessons learnt from a benchmarking for the leading mining companies.

Keywords: corporate social responsibility, local development, mining, socio-environmental conflict

Procedia PDF Downloads 408
28888 A Deep Learning Approach to Subsection Identification in Electronic Health Records

Authors: Nitin Shravan, Sudarsun Santhiappan, B. Sivaselvan

Abstract:

Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. In this work, we classify the text present in EHRs according to their information, using machine learning and deep learning techniques. We initially describe briefly about the problem and formulate it as a text classification problem. Then, we discuss upon the methods from the literature. We try two approaches - traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based Machine Learning Models.

Keywords: deep learning, machine learning, semantic clinical classification, subsection identification, text classification

Procedia PDF Downloads 220
28887 Improved FP-Growth Algorithm with Multiple Minimum Supports Using Maximum Constraints

Authors: Elsayeda M. Elgaml, Dina M. Ibrahim, Elsayed A. Sallam

Abstract:

Association rule mining is one of the most important fields of data mining and knowledge discovery. In this paper, we propose an efficient multiple support frequent pattern growth algorithm which we called “MSFP-growth” that enhancing the FP-growth algorithm by making infrequent child node pruning step with multiple minimum support using maximum constrains. The algorithm is implemented, and it is compared with other common algorithms: Apriori-multiple minimum supports using maximum constraints and FP-growth. The experimental results show that the rule mining from the proposed algorithm are interesting and our algorithm achieved better performance than other algorithms without scarifying the accuracy.

Keywords: association rules, FP-growth, multiple minimum supports, Weka tool

Procedia PDF Downloads 487
28886 Sentiment Analysis of Chinese Microblog Comments: Comparison between Support Vector Machine and Long Short-Term Memory

Authors: Xu Jiaqiao

Abstract:

Text sentiment analysis is an important branch of natural language processing. This technology is widely used in public opinion analysis and web surfing recommendations. At present, the mainstream sentiment analysis methods include three parts: sentiment analysis based on a sentiment dictionary, based on traditional machine learning, and based on deep learning. This paper mainly analyzes and compares the advantages and disadvantages of the SVM method of traditional machine learning and the Long Short-term Memory (LSTM) method of deep learning in the field of Chinese sentiment analysis, using Chinese comments on Sina Microblog as the data set. Firstly, this paper classifies and adds labels to the original comment dataset obtained by the web crawler, and then uses Jieba word segmentation to classify the original dataset and remove stop words. After that, this paper extracts text feature vectors and builds document word vectors to facilitate the training of the model. Finally, SVM and LSTM models are trained respectively. After accuracy calculation, it can be obtained that the accuracy of the LSTM model is 85.80%, while the accuracy of SVM is 91.07%. But at the same time, LSTM operation only needs 2.57 seconds, SVM model needs 6.06 seconds. Therefore, this paper concludes that: compared with the SVM model, the LSTM model is worse in accuracy but faster in processing speed.

Keywords: sentiment analysis, support vector machine, long short-term memory, Chinese microblog comments

Procedia PDF Downloads 96
28885 CoP-Networks: Virtual Spaces for New Faculty’s Professional Development in the 21st Higher Education

Authors: Eman AbuKhousa, Marwan Z. Bataineh

Abstract:

The 21st century higher education and globalization challenge new faculty members to build effective professional networks and partnership with industry in order to accelerate their growth and success. This creates the need for community of practice (CoP)-oriented development approaches that focus on cognitive apprenticeship while considering individual predisposition and future career needs. This work adopts data mining, clustering analysis, and social networking technologies to present the CoP-Network as a virtual space that connects together similar career-aspiration individuals who are socially influenced to join and engage in a process for domain-related knowledge and practice acquisitions. The CoP-Network model can be integrated into higher education to extend traditional graduate and professional development programs.

Keywords: clustering analysis, community of practice, data mining, higher education, new faculty challenges, social network, social influence, professional development

Procedia PDF Downloads 184
28884 Opinion Mining to Extract Community Emotions on Covid-19 Immunization Possible Side Effects

Authors: Yahya Almurtadha, Mukhtar Ghaleb, Ahmed M. Shamsan Saleh

Abstract:

The world witnessed a fierce attack from the Covid-19 virus, which affected public life socially, economically, healthily and psychologically. The world's governments tried to confront the pandemic by imposing a number of precautionary measures such as general closure, curfews and social distancing. Scientists have also made strenuous efforts to develop an effective vaccine to train the immune system to develop antibodies to combat the virus, thus reducing its symptoms and limiting its spread. Artificial intelligence, along with researchers and medical authorities, has accelerated the vaccine development process through big data processing and simulation. On the other hand, one of the most important negatives of the impact of Covid 19 was the state of anxiety and fear due to the blowout of rumors through social media, which prompted governments to try to reassure the public with the available means. This study aims to proposed using Sentiment Analysis (AKA Opinion Mining) and deep learning as efficient artificial intelligence techniques to work on retrieving the tweets of the public from Twitter and then analyze it automatically to extract their opinions, expression and feelings, negatively or positively, about the symptoms they may feel after vaccination. Sentiment analysis is characterized by its ability to access what the public post in social media within a record time and at a lower cost than traditional means such as questionnaires and interviews, not to mention the accuracy of the information as it comes from what the public expresses voluntarily.

Keywords: deep learning, opinion mining, natural language processing, sentiment analysis

Procedia PDF Downloads 172
28883 A Theoretical Model for Pattern Extraction in Large Datasets

Authors: Muhammad Usman

Abstract:

Pattern extraction has been done in past to extract hidden and interesting patterns from large datasets. Recently, advancements are being made in these techniques by providing the ability of multi-level mining, effective dimension reduction, advanced evaluation and visualization support. This paper focuses on reviewing the current techniques in literature on the basis of these parameters. Literature review suggests that most of the techniques which provide multi-level mining and dimension reduction, do not handle mixed-type data during the process. Patterns are not extracted using advanced algorithms for large datasets. Moreover, the evaluation of patterns is not done using advanced measures which are suited for high-dimensional data. Techniques which provide visualization support are unable to handle a large number of rules in a small space. We present a theoretical model to handle these issues. The implementation of the model is beyond the scope of this paper.

Keywords: association rule mining, data mining, data warehouses, visualization of association rules

Procedia PDF Downloads 224
28882 Automatic Tagging and Accuracy in Assamese Text Data

Authors: Chayanika Hazarika Bordoloi

Abstract:

This paper is an attempt to work on a highly inflectional language called Assamese. This is also one of the national languages of India and very little has been achieved in terms of computational research. Building a language processing tool for a natural language is not very smooth as the standard and language representation change at various levels. This paper presents inflectional suffixes of Assamese verbs and how the statistical tools, along with linguistic features, can improve the tagging accuracy. Conditional random fields (CRF tool) was used to automatically tag and train the text data; however, accuracy was improved after linguistic featured were fed into the training data. Assamese is a highly inflectional language; hence, it is challenging to standardizing its morphology. Inflectional suffixes are used as a feature of the text data. In order to analyze the inflections of Assamese word forms, a list of suffixes is prepared. This list comprises suffixes, comprising of all possible suffixes that various categories can take is prepared. Assamese words can be classified into inflected classes (noun, pronoun, adjective and verb) and un-inflected classes (adverb and particle). The corpus used for this morphological analysis has huge tokens. The corpus is a mixed corpus and it has given satisfactory accuracy. The accuracy rate of the tagger has gradually improved with the modified training data.

Keywords: CRF, morphology, tagging, tagset

Procedia PDF Downloads 195
28881 Comparative Analysis of the Computer Methods' Usage for Calculation of Hydrocarbon Reserves in the Baltic Sea

Authors: Pavel Shcherban, Vlad Golovanov

Abstract:

Nowadays, the depletion of hydrocarbon deposits on the land of the Kaliningrad region leads to active geological exploration and development of oil and natural gas reserves in the southeastern part of the Baltic Sea. LLC 'Lukoil-Kaliningradmorneft' implements a comprehensive program for the development of the region's shelf in 2014-2023. Due to heterogeneity of reservoir rocks in various open fields, as well as with ambiguous conclusions on the contours of deposits, additional geological prospecting and refinement of the recoverable oil reserves are carried out. The key element is use of an effective technique of computer stock modeling at the first stage of processing of the received data. The following step uses information for the cluster analysis, which makes it possible to optimize the field development approaches. The article analyzes the effectiveness of various methods for reserves' calculation and computer modelling methods of the offshore hydrocarbon fields. Cluster analysis allows to measure influence of the obtained data on the development of a technical and economic model for mining deposits. The relationship between the accuracy of the calculation of recoverable reserves and the need of modernization of existing mining infrastructure, as well as the optimization of the scheme of opening and development of oil deposits, is observed.

Keywords: cluster analysis, computer modelling of deposits, correction of the feasibility study, offshore hydrocarbon fields

Procedia PDF Downloads 166
28880 Mine Project Evaluations in the Rising of Uncertainty: Real Options Analysis

Authors: I. Inthanongsone, C. Drebenstedt, J. C. Bongaerts, P. Sontamino

Abstract:

The major concern in evaluating the value of mining projects related to the deficiency of the traditional discounted cash flow (DCF) method. This method does not take uncertainties into account and, hence it does not allow for an economic assessment of managerial flexibility and operational adaptability, which are increasingly determining long-term corporate success. Such an assessment can be performed with the real options valuation (ROV) approach, since it allows for a comparative evaluation of unforeseen uncertainties in a project life cycle. This paper presents an economic evaluation model for open pit mining projects based on real options valuation approach. Uncertainties in the model are caused by metal prices and cost uncertainties and the system dynamics (SD) modeling method is used to structure and solve the real options model. The model is applied to a case study. It can be shown that that managerial flexibility reacting to uncertainties may create additional value to a mining project in comparison to the outcomes of a DCF method. One important insight for management dealing with uncertainty is seen in choosing the optimal time to exercise strategic options.

Keywords: DCF methods, ROV approach, system dynamics modeling methods, uncertainty

Procedia PDF Downloads 502
28879 Accountant Strategists Challenge the Dominant Business Model: A Strategy-as-Practice Perspective

Authors: Lindie Grebe

Abstract:

This paper reports on a study that explored the strategizing practices of professional accountants in the mining industry, based on Jarratt and Stiles’ dominant strategizing practice models framework. Drawing on a strategy-as-practice perspective, the paper recognises qualified professional accountants in strategic management such as Chief Executive Officers, as strategy practitioners that perform their strategizing practices and praxis within a specific context. The main findings of this paper were produced through semi-structured individual interviews with accountants that perform strategy on a business level in the South African mining industry. Qualitative data were analysed through conversation analysis over two coding-cycles. Findings describe accountant strategists as practitioners who challenge the dominant business model when a disconnect seems to exist between international corporate level strategy and business level strategy in the South African mining industry. Accountant strategy practitioners described their dominant strategizing practice model as incremental change during strategic planning and as a lived experience during strategy implementation. Findings portrayed these strategists as taking initiative as strategy leaders in a dynamic and volatile environment to combine their accounting background with strategic management and challenge the dominant business model. Understanding how accountant strategists perform strategizing offers insight into the social practice of strategic management. This understanding contributes to the body of knowledge on strategizing in the South African mining industry. In addition, knowledge on the transformation of accountants as strategists could provide valuable practice relevant insights for accounting educators and the accounting profession alike.

Keywords: accountant strategists, dominant strategizing practice models framework, mining industry, strategy-as-practice

Procedia PDF Downloads 177
28878 The Application of a Hybrid Neural Network for Recognition of a Handwritten Kazakh Text

Authors: Almagul Assainova , Dariya Abykenova, Liudmila Goncharenko, Sergey Sybachin, Saule Rakhimova, Abay Aman

Abstract:

The recognition of a handwritten Kazakh text is a relevant objective today for the digitization of materials. The study presents a model of a hybrid neural network for handwriting recognition, which includes a convolutional neural network and a multi-layer perceptron. Each network includes 1024 input neurons and 42 output neurons. The model is implemented in the program, written in the Python programming language using the EMNIST database, NumPy, Keras, and Tensorflow modules. The neural network training of such specific letters of the Kazakh alphabet as ә, ғ, қ, ң, ө, ұ, ү, h, і was conducted. The neural network model and the program created on its basis can be used in electronic document management systems to digitize the Kazakh text.

Keywords: handwriting recognition system, image recognition, Kazakh font, machine learning, neural networks

Procedia PDF Downloads 263
28877 Applying Sequential Pattern Mining to Generate Block for Scheduling Problems

Authors: Meng-Hui Chen, Chen-Yu Kao, Chia-Yu Hsu, Pei-Chann Chang

Abstract:

The main idea in this paper is using sequential pattern mining to find the information which is helpful for finding high performance solutions. By combining this information, it is defined as blocks. Using the blocks to generate artificial chromosomes (ACs) could improve the structure of solutions. Estimation of Distribution Algorithms (EDAs) is adapted to solve the combinatorial problems. Nevertheless many of these approaches are advantageous for this application, but only some of them are used to enhance the efficiency of application. Generating ACs uses patterns and EDAs could increase the diversity. According to the experimental result, the algorithm which we proposed has a better performance to solve the permutation flow-shop problems.

Keywords: combinatorial problems, sequential pattern mining, estimationof distribution algorithms, artificial chromosomes

Procedia PDF Downloads 612
28876 Potential Use of Leaching Gravel as a Raw Material in the Preparation of Geo Polymeric Material as an Alternative to Conventional Cement Materials

Authors: Arturo Reyes Roman, Daniza Castillo Godoy, Francisca Balarezo Olivares, Francisco Arriagada Castro, Miguel Maulen Tapia

Abstract:

Mining waste–based geopolymers are a sustainable alternative to conventional cement materials due to their contribution to the valorization of mining wastes as well as to the new construction materials with reduced fingerprints. The objective of this study was to determine the potential of leaching gravel (LG) from hydrometallurgical copper processing to be used as a raw material in the manufacture of geopolymer. NaOH, Na2SiO3 (modulus 1.5), and LG were mixed and then wetted with an appropriate amount of tap water, then stirred until a homogenous paste was obtained. A liquid/solid ratio of 0.3 was used for preparing mixtures. The paste was then cast in cubic moulds of 50 mm for the determination of compressive strengths. The samples were left to dry for 24h at room temperature, then unmoulded before analysis after 28 days of curing time. The compressive test was conducted in a compression machine (15/300 kN). According to the laser diffraction spectroscopy (LDS) analysis, 90% of LG particles were below 500 μm. The X-ray diffraction (XRD) analysis identified crystalline phases of albite (30 %), Quartz (16%), Anorthite (16 %), and Phillipsite (14%). The X-ray fluorescence (XRF) determinations showed mainly 55% of SiO2, 13 % of Al2O3, and 9% of CaO. ICP (OES) concentrations of Fe, Ca, Cu, Al, As, V, Zn, Mo, and Ni were 49.545; 24.735; 6.172; 14.152, 239,5; 129,6; 41,1;15,1, and 13,1 mg kg-1, respectively. The geopolymer samples showed resistance ranging between 2 and 10 MPa. In comparison with the raw material composition, the amorphous percentage of materials in the geopolymer was 35 %, whereas the crystalline percentage of main mineral phases decreased. Further studies are needed to find the optimal combinations of materials to produce a more resistant and environmentally safe geopolymer. Particularly are necessary compressive resistance higher than 15 MPa are necessary to be used as construction unit such as bricks.

Keywords: mining waste, geopolymer, construction material, alkaline activation

Procedia PDF Downloads 96
28875 The Effects of Three Pre-Reading Activities (Text Summary, Vocabulary Definition, and Pre-Passage Questions) on the Reading Comprehension of Iranian EFL Learners

Authors: Leila Anjomshoa, Firooz Sadighi

Abstract:

This study investigated the effects of three types of pre-reading activities (vocabulary definitions, text summary and pre-passage questions) on EFL learners’ English reading comprehension. On the basis of the results of a placement test administered to two hundred and thirty English students at Kerman Azad University, 200 subjects (one hundred intermediate and one hundred advanced) were selected.Four texts, two of them at intermediate level and two of them at advanced level were chosen. The data gathered was subjected to the statistical procedures of ANOVA. A close examination of the results through Tukey’s HSD showed the fact that the experimental groups performed better than the control group, highlighting the effect of the treatment on them. Also, the experimental group C (text summary), performed remarkably better than the other three groups (both experimental & control). Group B subjects, vocabulary definitions, performed better than groups A and D. The pre-passage questions group’s (D) performance showed higher scores than the control condition.

Keywords: pre-reading activities, text summary, vocabulary definition, and pre-passage questions, reading comprehension

Procedia PDF Downloads 340
28874 Case Study Analysis for Driver's Company in the Transport Sector with the Help of Data Mining

Authors: Diana Katherine Gonzalez Galindo, David Rolando Suarez Mora

Abstract:

With this study, we used data mining as a new alternative of the solution to evaluate the comments of the customers in order to find a pattern that helps us to determine some behaviors to reduce the deactivation of the partners of the LEVEL app. In one of the greatest business created in the last times, the partners are being affected due to an internal process that compensates the customer for a bad experience, but these comments could be false towards the driver, that’s why we made an investigation to collect information to restructure this process, many partners have been disassociated due to this internal process and many of them refuse the comments given by the customer. The main methodology used in this case study is the observation, we recollect information in real time what gave us the opportunity to see the most common issues to get the most accurate solution. With this new process helped by data mining, we could get a prediction based on the behaviors of the customer and some basic data recollected such as the age, the gender, and others; this could help us in future to improve another process. This investigation gives more opportunities to the partner to keep his account active even if the customer writes a message through the app. The term is trying to avoid a recession of drivers in the future offering improving in the processes, at the same time we are in search of stablishing a strategy which benefits both the app’s managers and the associated driver.

Keywords: agent, driver, deactivation, rider

Procedia PDF Downloads 281
28873 Adaptation of Hough Transform Algorithm for Text Document Skew Angle Detection

Authors: Kayode A. Olaniyi, Olabanji F. Omotoye, Adeola A. Ogunleye

Abstract:

The skew detection and correction form an important part of digital document analysis. This is because uncompensated skew can deteriorate document features and can complicate further document image processing steps. Efficient text document analysis and digitization can rarely be achieved when a document is skewed even at a small angle. Once the documents have been digitized through the scanning system and binarization also achieved, document skew correction is required before further image analysis. Research efforts have been put in this area with algorithms developed to eliminate document skew. Skew angle correction algorithms can be compared based on performance criteria. Most important performance criteria are accuracy of skew angle detection, range of skew angle for detection, speed of processing the image, computational complexity and consequently memory space used. The standard Hough Transform has successfully been implemented for text documentation skew angle estimation application. However, the standard Hough Transform algorithm level of accuracy depends largely on how much fine the step size for the angle used. This consequently consumes more time and memory space for increase accuracy and, especially where number of pixels is considerable large. Whenever the Hough transform is used, there is always a tradeoff between accuracy and speed. So a more efficient solution is needed that optimizes space as well as time. In this paper, an improved Hough transform (HT) technique that optimizes space as well as time to robustly detect document skew is presented. The modified algorithm of Hough Transform presents solution to the contradiction between the memory space, running time and accuracy. Our algorithm starts with the first step of angle estimation accurate up to zero decimal place using the standard Hough Transform algorithm achieving minimal running time and space but lacks relative accuracy. Then to increase accuracy, suppose estimated angle found using the basic Hough algorithm is x degree, we then run again basic algorithm from range between ±x degrees with accuracy of one decimal place. Same process is iterated till level of desired accuracy is achieved. The procedure of our skew estimation and correction algorithm of text images is implemented using MATLAB. The memory space estimation and process time are also tabulated with skew angle assumption of within 00 and 450. The simulation results which is demonstrated in Matlab show the high performance of our algorithms with less computational time and memory space used in detecting document skew for a variety of documents with different levels of complexity.

Keywords: hough-transform, skew-detection, skew-angle, skew-correction, text-document

Procedia PDF Downloads 159
28872 A Contrastive Rhetoric Study: The Use of Textual and Interpersonal Metadiscoursal Markers in Persian and English Newspaper Editorials

Authors: Habibollah Mashhady, Moslem Fatollahi

Abstract:

This study tries to contrast the use of metadiscoursal markers in English and Persian Newspaper Editorials as persuasive text types. These markers are linguistic elements in the text which do not add to the propositional content of it, rather they serve to realize the Halliday’s (1985) textual and interpersonal functions of language. At first, some of the most common markers from five subcategories of Text Connectives, Illocution Markers, Hedges, Emphatics, and Attitude Markers were identified in both English and Persian newspapers. Then, the frequency of occurrence of these markers in both English and Persian corpus consisting of 44 randomly selected editorials (18,000 words in each) from several English and Persian newspapers was recorded. After that, using a two-way chi square analysis, the overall x2 obs was found to be highly significant. So, the null hypothesis of no difference was confidently rejected. Finally, in order to determine the contribution of each subcategory to the overall x 2 value, one-way chi square analyses were applied to the individual subcategories. The results indicated that only two of the five subcategories of markers were statistically significant. This difference is then attributed to the differing spirits prevailing in the linguistic communities involved. Regarding the minor research question it was found that, in contrast to English writers, Persian writers are more writer-oriented in their writings.

Keywords: metadiscoursal markers, textual meta-function, interpersonal meta-function, persuasive texts, English and Persian newspaper editorials

Procedia PDF Downloads 575
28871 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This is because this kind of bias, when present in the text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is of importance. The goal of this work is to identify the subjectively biased language in text on a sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network incorporated with attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), which was compiled from Wikipedia edits that removed various biased instances from sentences as a benchmark dataset, with which we also compare our model to existing approaches. Experimental analysis indicates an improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing

Procedia PDF Downloads 131
28870 Rewashing for Gold: Optimizing Mine Plan for Effective Closure

Authors: O. D. Eniowo

Abstract:

“Rewashing” as it is commonly called, involves the process of scooping out and washing chunks of mud from a closed alluvial gold mine site with the purpose of extracting any leftover gold deposits in the site. It is usually carried out by illegal miners who infiltrate closed mine sites with the goal of scavenging for any leftover gold deposits. Expectedly, the practice gives little or no regard for environmental protection. This paper examines the process of “rewashing” in a mining community in Nigeria. It then discusses the looming danger it portends for health, safety, and the environment. The study draws lessons from these occurrences to examine and discuss fit-for-purpose mine closure plans that could be adopted by gold mines in Nigeria and other sub-Saharan African countries.

Keywords: mine planning, mine closure, illegal mining, artisanal mining, environmental sustainability

Procedia PDF Downloads 32
28869 The Difference of Learning Outcomes in Reading Comprehension between Text and Film as The Media in Indonesian Language for Foreign Speaker in Intermediate Level

Authors: Siti Ayu Ningsih

Abstract:

This study aims to find the differences outcomes in learning reading comprehension with text and film as media on Indonesian Language for foreign speaker (BIPA) learning at intermediate level. By using quantitative and qualitative research methods, the respondent of this study is a single respondent from D'Royal Morocco Integrative Islamic School in grade nine from secondary level. Quantitative method used to calculate the learning outcomes that have been given the appropriate action cycle, whereas qualitative method used to translate the findings derived from quantitative methods to be described. The technique used in this study is the observation techniques and testing work. Based on the research, it is known that the use of the text media is more effective than the film for intermediate level of Indonesian Language for foreign speaker learner. This is because, when using film the learner does not have enough time to take note the difficult vocabulary and don't have enough time to look for the meaning of the vocabulary from the dictionary. While the use of media texts shows the better effectiveness because it does not require additional time to take note the difficult words. For the words that are difficult or strange, the learner can immediately find its meaning from the dictionary. The presence of the text is also very helpful for Indonesian Language for foreign speaker learner to find the answers according to the questions more easily. By matching the vocabulary of the question into the text references.

Keywords: Indonesian language for foreign speaker, learning outcome, media, reading comprehension

Procedia PDF Downloads 197
28868 Performance Study of Classification Algorithms for Consumer Online Shopping Attitudes and Behavior Using Data Mining

Authors: Rana Alaa El-Deen Ahmed, M. Elemam Shehab, Shereen Morsy, Nermeen Mekawie

Abstract:

With the growing popularity and acceptance of e-commerce platforms, users face an ever increasing burden in actually choosing the right product from the large number of online offers. Thus, techniques for personalization and shopping guides are needed by users. For a pleasant and successful shopping experience, users need to know easily which products to buy with high confidence. Since selling a wide variety of products has become easier due to the popularity of online stores, online retailers are able to sell more products than a physical store. The disadvantage is that the customers might not find products they need. In this research the customer will be able to find the products he is searching for, because recommender systems are used in some ecommerce web sites. Recommender system learns from the information about customers and products and provides appropriate personalized recommendations to customers to find the needed product. In this paper eleven classification algorithms are comparatively tested to find the best classifier fit for consumer online shopping attitudes and behavior in the experimented dataset. The WEKA knowledge analysis tool, which is an open source data mining workbench software used in comparing conventional classifiers to get the best classifier was used in this research. In this research by using the data mining tool (WEKA) with the experimented classifiers the results show that decision table and filtered classifier gives the highest accuracy and the lowest accuracy classification via clustering and simple cart.

Keywords: classification, data mining, machine learning, online shopping, WEKA

Procedia PDF Downloads 352
28867 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 295
28866 Valorization of Mining Waste (Sand of Djemi Djema) from the Djbel Onk Mine (Eastern Algeria)

Authors: Rachida Malaoui, Leila Arabet , Asma Benbouza

Abstract:

The use of mining waste rock as a material for construction is one of the biggest concerns grabbing the attention of many mining countries. As these materials are abandoned, more effective solutions have been made to offset some of the building materials, and to avoid environmental pollution. The sands of the Djemi Djema deposit mines of the Djebel Onk mines are sedimentary materials of several varieties of layers with varying thicknesses and are worth far more than 300m deep. The sands from the Djemi Djema business area are medium to coarse and are discharged and accumulated, generating a huge estimated quantity of more than 77424250 tonnes. This state of "resource" is of great importance so as to be oriented towards the fields of public works and civil engineering after having reached the acceptable properties of this resource

Keywords: reuse, sands, shear tests, waste rock

Procedia PDF Downloads 148