Search results for: Text Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2211

1671 Application of Data Mining for Aquifer Environmental Assessment

Authors: Saman Javadi, Mehdi Hashemy, Mohahammad Mahmoodi

Abstract:

Vulnerability maps are an important tool for managing the entry of pollution into aquifers. The common method for producing a vulnerability map is DRASTIC. However, the method is not easy to apply to every aquifer, because appropriate constant values of weights and ranks must be chosen. In this study, a new approach using k-means clustering is applied to produce vulnerability maps. Four features (depth to groundwater, hydraulic conductivity, recharge value and vadose zone) were considered simultaneously as clustering features. Five regions, representing zones with different levels of vulnerability, are identified in the case study. The results show that clustering provides a realistic vulnerability map: the Pearson’s correlation coefficient between nitrate concentrations and the clustering-based vulnerability is 61%.
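As a minimal sketch of the clustering step described above, the following pure-Python k-means groups grid cells by the four features. The cell values, the min-max scaling, and the choice to seed centroids with the first k points are illustrative assumptions, not details taken from the study.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def min_max_scale(rows):
    """Rescale every feature to [0, 1] so no single unit dominates the distance."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical grid cells:
# (depth to groundwater, hydraulic conductivity, recharge, vadose-zone rating)
cells = [
    (5.0, 30.0, 200.0, 8.0),
    (40.0, 2.0, 50.0, 3.0),
    (6.0, 28.0, 210.0, 7.0),
    (42.0, 3.0, 45.0, 2.0),
]
labels, centroids = kmeans(min_max_scale(cells), k=2)
```

The study reports five clusters; `k=2` is used here only to keep the toy data small. Ranking the resulting clusters by vulnerability would still need a separate rule, for example comparison against measured nitrate concentrations.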

Keywords: clustering, data mining, groundwater, vulnerability assessment

Procedia PDF Downloads 573
1670 ExactData Smart Tool For Marketing Analysis

Authors: Aleksandra Jonas, Aleksandra Gronowska, Maciej Ścigacz, Szymon Jadczak

Abstract:

Exact Data is a smart tool which helps with meaningful marketing content creation. It helps marketers achieve this by analyzing the text of an advertisement before and after its publication on social media sites like Facebook or Instagram. In our research we focus on four areas of natural language processing (NLP): grammar correction, sentiment analysis, irony detection and advertisement interpretation. Our research has identified a considerable lack of NLP tools for the Polish language that specifically aid online marketers. In light of this, our research team has set out to create a robust and versatile NLP tool for the Polish language. The primary objective of our research is to develop a tool that can perform a range of language processing tasks in this language, such as sentiment analysis, text classification, text correction and text interpretation. Our team has been working diligently to create a tool that is accurate, reliable, and adaptable to the specific linguistic features of Polish, and that can provide valuable insights for a wide range of marketers' needs. In addition to the Polish language version, we are also developing an English version of the tool, which will enable us to expand the reach and impact of our research to a wider audience. Another area of focus in our research involves tackling the challenge of the limited availability of linguistically diverse corpora for non-English languages, which presents a significant barrier in the development of NLP applications. One approach we have been pursuing is the translation of existing English corpora, which would enable us to use the wealth of linguistic resources available in English for other languages. Furthermore, we are looking into other methods, such as gathering language samples from social media platforms.
By analyzing the language used in social media posts, we can collect a wide range of data that reflects the unique linguistic characteristics of specific regions and communities, which can then be used to enhance the accuracy and performance of NLP algorithms for non-English languages. In doing so, we hope to broaden the scope and capabilities of NLP applications. Our research focuses on several key NLP techniques including sentiment analysis, text classification, text interpretation and text correction. To ensure that we can achieve the best possible performance for these techniques, we are evaluating and comparing different approaches and strategies for implementing them. We are exploring a range of different methods, including transformers and convolutional neural networks (CNNs), to determine which ones are most effective for different types of NLP tasks. By analyzing the strengths and weaknesses of each approach, we can identify the most effective techniques for specific use cases, and further enhance the performance of our tool. Our research aims to create a tool, which can provide a comprehensive analysis of advertising effectiveness, allowing marketers to identify areas for improvement and optimize their advertising strategies. The results of this study suggest that a smart tool for advertisement analysis can provide valuable insights for businesses seeking to create effective advertising campaigns.

Keywords: NLP, AI, IT, language, marketing, analysis

Procedia PDF Downloads 61
1669 Attributes That Influence Respondents When Choosing a Mate in Internet Dating Sites: An Innovative Matching Algorithm

Authors: Moti Zwilling, Srečko Natek

Abstract:

This paper presents an innovative predictive analytics approach for finding the best match between two consumers who seek a partner on internet dating sites. The methodology is based on analysis of consumer preferences and involves data mining and machine learning search techniques. The study is composed of two parts. The first part examines, by means of descriptive statistics, the correlations between a set of parameters collected from men and women who intend to meet each other through social media, usually the internet. In this part, several hypotheses were examined and statistical analyses were carried out. Results show a strong correlation between the affiliated attributes of men and women with regard to how they present themselves on social media such as "Facebook". One interesting finding is the strong desire among most of the respondents to develop a serious relationship. In the second part, the authors used common data mining algorithms to search for and classify the most important and effective attributes that affect the response rate of the other side. Results show that personal presentation and educational background are the most effective attributes in eliciting a positive attitude toward one's profile from the potential mate.

Keywords: dating sites, social networks, machine learning, decision trees, data mining

Procedia PDF Downloads 278
1668 A System to Detect Inappropriate Messages in Online Social Networks

Authors: Shivani Singh, Shantanu Nakhare, Kalyani Nair, Rohan Shetty

Abstract:

As social networking grows at a rapid pace, it is vital that we work on improving its management. Research has shown that the content present in online social networks may have significant influence on impressionable minds. If such platforms are misused, negative consequences will follow. Detecting insults or inappropriate messages continues to be one of the most challenging aspects of Online Social Networks (OSNs) today. We address this problem through a machine-learning-based soft text classifier approach using the Support Vector Machine algorithm. The proposed system acts as a screening mechanism that alerts the user about such messages. The messages are classified according to their subject matter, and each comment is labeled for the presence of profanity and insults.

Keywords: machine learning, online social networks, soft text classifier, support vector machine

Procedia PDF Downloads 482
1667 Utilization of Process Mapping Tool to Enhance Production Drilling in Underground Metal Mining Operations

Authors: Sidharth Talan, Sanjay Kumar Sharma, Eoin Joseph Wallace, Nikita Agrawal

Abstract:

Underground mining is at the core of rapidly evolving metals and minerals sector due to the increasing mineral consumption globally. Even though the surface mines are still more abundant on earth, the scales of industry are slowly tipping towards underground mining due to rising depth and complexities of orebodies. Thus, the efficient and productive functioning of underground operations depends significantly on the synchronized performance of key elements such as operating site, mining equipment, manpower and mine services. Production drilling is the process of conducting long hole drilling for the purpose of charging and blasting these holes for the production of ore in underground metal mines. Thus, production drilling is the crucial segment in the underground metal mining value chain. This paper presents the process mapping tool to evaluate the production drilling process in the underground metal mining operation by dividing the given process into three segments namely Input, Process and Output. The three segments are further segregated into factors and sub-factors. As per the study, the major input factors crucial for the efficient functioning of production drilling process are power, drilling water, geotechnical support of the drilling site, skilled drilling operators, services installation crew, oils and drill accessories for drilling machine, survey markings at drill site, proper housekeeping, regular maintenance of drill machine, suitable transportation for reaching the drilling site and finally proper ventilation. The major outputs for the production drilling process are ore, waste as a result of dilution, timely reporting and investigation of unsafe practices, optimized process time and finally well fragmented blasted material within specifications set by the mining company. 
The paper also exhibits the drilling loss matrix, which is utilized to appraise the loss in planned production meters per day in a mine on account of availability loss in the machine due to breakdowns, underutilization of the machine and productivity loss in the machine measured in drilling meters per unit of percussion hour with respect to its planned productivity for the day. The given three losses would be essential to detect the bottlenecks in the process map of production drilling operation so as to instigate the action plan to suppress or prevent the causes leading to the operational performance deficiency. The given tool is beneficial to mine management to focus on the critical factors negatively impacting the production drilling operation and design necessary operational and maintenance strategies to mitigate them. 
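The three losses named above can be illustrated with a small calculation. This is only a sketch under stated assumptions: the abstract does not give the formulas, so the decomposition below (breakdown hours and idle hours valued at the planned rate, and the rate shortfall valued over actual percussion hours) is a hypothetical one chosen so that the three terms add up to the gap between planned and actual meters. All parameter names and numbers are made up for illustration.

```python
def drilling_losses(planned_hours, planned_rate, breakdown_hours, idle_hours, actual_meters):
    """Split the daily shortfall in drilled meters into three loss terms.

    planned_rate is in meters per percussion hour; every formula here is an
    assumed decomposition, not the matrix defined in the paper.
    """
    percussion_hours = planned_hours - breakdown_hours - idle_hours
    if percussion_hours <= 0:
        raise ValueError("no percussion hours left after breakdowns and idle time")
    actual_rate = actual_meters / percussion_hours
    return {
        # Meters not drilled because the machine was broken down.
        "availability_loss": breakdown_hours * planned_rate,
        # Meters not drilled because an available machine sat unused.
        "utilization_loss": idle_hours * planned_rate,
        # Meters not drilled because the machine ran below its planned rate.
        "productivity_loss": percussion_hours * (planned_rate - actual_rate),
    }

losses = drilling_losses(
    planned_hours=10, planned_rate=20,
    breakdown_hours=2, idle_hours=3, actual_meters=80,
)
planned_meters = 10 * 20
shortfall = planned_meters - 80
```

With these inputs the three terms sum to the full shortfall, so the largest term points at the segment of the process map to examine first.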

Keywords: process map, drilling loss matrix, SIPOC, productivity, percussion rate

Procedia PDF Downloads 191
1666 A Grey-Box Text Attack Framework Using Explainable AI

Authors: Esther Chiramal, Kelvin Soh Boon Kai

Abstract:

Explainable AI is a powerful strategy for understanding complex black-box model predictions in human-interpretable language. It provides the evidence required to support the use of trustworthy and reliable AI systems. However, it also opens the door to locating possible vulnerabilities in an AI model. Traditional adversarial text attacks use word substitution, data augmentation techniques, and gradient-based attacks on powerful pre-trained Bidirectional Encoder Representations from Transformers (BERT) variants to generate adversarial sentences. These attacks are generally white-box in nature and not practical, as they can be easily detected by humans, e.g., changing the word “Poor” to “Rich”. We propose a simple yet effective grey-box cum black-box approach that does not require knowledge of the target model and instead uses a set of surrogate Transformer/BERT models to perform the attack using Explainable AI techniques. As Transformers are the current state-of-the-art models for almost all Natural Language Processing (NLP) tasks, an attack generated from BERT1 is transferable to BERT2. This transferability is made possible by the attention mechanism in the transformer, which allows the model to capture long-range dependencies in a sequence. Using the power of BERT generalisation via attention, we attempt to exploit how transformers learn by attacking a few surrogate transformer variants that are each based on a different architecture. We demonstrate that this approach is highly effective at generating semantically sound sentences by changing as little as one word, in a way that is not detectable by humans while still fooling other BERT models.

Keywords: BERT, explainable AI, Grey-box text attack, transformer

Procedia PDF Downloads 117
1665 Preserving Digital Arabic Text Integrity Using Blockchain Technology

Authors: Zineb Touati Hamad, Mohamed Ridda Laouar, Issam Bendib

Abstract:

With the massive development of technology today, the Arabic language has gained a prominent position among the languages most used for writing articles, expressing opinions, and citation on many websites, despite its sensitivity in terms of structure, language skills, diacritics, writing methods, etc. In the context of the spread of the Arabic language, the Holy Quran is the most prevalent Arabic text today in many applications and websites, whether for citation purposes or for reading and learning rituals. Quranic verses/surahs are published quickly and without cost, which raises great concern about protecting the content from tampering and alteration. To protect the content of texts from distortion, it is necessary to refer to the original database and run a comparison to extract the percentage of distortion. The disadvantage of this method is that it takes time, and there is no guarantee of the integrity of the database itself, as it belongs to one central party. Blockchain technology today represents the best way to maintain immutable content. Blockchain is a distributed database that stores information in blocks linked to each other through cryptographic hashes, so that the modification of any block can be easily detected. To exploit these advantages, we seek in this paper to justify the use of this technique in preserving the integrity of Arabic texts sensitive to change by building a decentralized framework to authenticate and verify the integrity of the digital Quranic verses/surahs spread on websites.
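The tamper-evidence property the abstract relies on can be shown with a toy hash chain. This is a simplified single-machine sketch, not the decentralized framework the paper proposes: there is no network, consensus, or signing, and the block layout and function names are assumptions made for illustration. Placeholder strings stand in for the protected texts.

```python
import hashlib

GENESIS = "0" * 64  # assumed placeholder for the first block's predecessor

def block_hash(index, text, prev_hash):
    """SHA-256 over the block's position, its predecessor's hash, and its text."""
    payload = f"{index}|{prev_hash}|{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def build_chain(texts):
    """Store each text in a block that commits to the previous block's hash."""
    chain, prev = [], GENESIS
    for i, text in enumerate(texts):
        h = block_hash(i, text, prev)
        chain.append({"index": i, "text": text, "prev_hash": prev, "hash": h})
        prev = h
    return chain

def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prev = GENESIS
    for block in chain:
        expected = block_hash(block["index"], block["text"], prev)
        if block["prev_hash"] != prev or block["hash"] != expected:
            return block["index"]
        prev = block["hash"]
    return None

chain = build_chain(["verse one", "verse two", "verse three"])
intact = first_invalid_block(chain)           # no block fails
chain[1]["text"] = "verse two (altered)"      # simulate tampering
tampered_at = first_invalid_block(chain)      # the altered block is flagged
```

Because the text is hashed as UTF-8 bytes, a change to a single diacritic produces a different hash, which is the sensitivity the abstract is concerned with.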

Keywords: arabic text, authentication, blockchain, integrity, quran, verification

Procedia PDF Downloads 140
1664 Investigation of the Heavy Metal Pollution of the River Ecosystems in the Lake Sevan Basin, Armenia

Authors: G. Gevorgyan, S. Khudaverdyan, A. Vaseashta

Abstract:

The Lake Sevan basin is situated in the eastern part of the Republic of Armenia (Gegharquniq marz/district). The heavy metal pollution of some tributaries of Lake Sevan was investigated. Water sampling was performed in August and December 2014 at four observation sites: 1) Sotq river upstream (about 600 meters upstream from the Sotq gold mine); 2) Sotq river mouth; 3) Masrik river mouth; 4) Dzknaget river mouth. Heavy metal (V, Fe, Ni, Cu, As, Mo, Pb) concentrations in the water samples were determined by standard methods using an atomic absorption spectrophotometer. The results showed that heavy metal content mainly increased from the upstream of the Sotq river to the mouth of the Masrik river, which may have been caused by gold mining activity, as the Masrik river and its tributary, the Sotq river, pass through the gold mining area and were exposed to heavy metal pollution. The observation sites can be ranked by pollution degree as follows: №3 > №2 > №1 > №4. The highest degree of heavy metal pollution was observed at the Masrik river mouth, which may have been caused by the direct impact of gold mining activity and the pressure of its tributary, the Sotq river, which flows through the gold mining area. The lowest degree of heavy metal pollution was registered at the Dzknaget river mouth; this river flows through rural areas and was not subject to significant heavy metal pollution. At the observation sites of the Sotq and Masrik rivers, a high positive correlation was mainly observed between the concentrations of the investigated heavy metals (except nickel), which indicated that all the heavy metals except nickel had the same anthropogenic pollution source, namely the activity of the Sotq gold mine. In general, it is possible to state that the activity of the Sotq gold mine in the Lake Sevan basin caused the heavy metal pollution of the Sotq and Masrik rivers, which may have posed environmental hazards.
Heavy metals are nondegradable substances, and heavy metal pollution of freshwater systems may pose risks to the environment and human health through accumulation in the tissues of aquatic organisms, water-food chain as well as oral ingestion and dermal contact.

Keywords: Armenia, Lake Sevan basin, gold mining activity, river ecosystems, heavy metal pollution

Procedia PDF Downloads 566
1663 Lab Support: A Computer Laboratory Class Management Support System

Authors: Eugenia P. Ramirez, Kevin Matthe Caramancion, Mia Eleazar

Abstract:

Getting the attention of students is a constant challenge for instructors and lecturers. Although some networking and entertainment websites are blocked in computer laboratories, such websites still have countless ways of drawing students in. Thus, when an instructor gives a specific set of instructions, some students may not be able to follow the steps in sequence. The instructor then has to physically go to the specific remote terminal and show the student the details. Sometimes, during an examination in a laboratory setup, a proctor may prefer to give detailed written instructions rather than verbal ones. Even merely calling on a specific student can distract the whole class, especially when activities are being performed. What is needed is application software that can lock the student's monitor and at the same time display the instructor’s screen; that is powerful enough to do its processing on its own side and to manipulate a specific user’s terminal freely, without restrictions at the server level, which is a required functionality for a modern and optimal server structure; and that can send text messages to students, per terminal or as a group. These features are found in LabSupport. This paper outlines the LabSupport application software framework for efficiently managing computer laboratory sessions, which includes several modules: screen viewer, demonstration mode, monitor locking system, text messaging, and class management. The paper's ultimate aim is to provide a system that increases instructor productivity.

Keywords: application software, broadcast messaging, class management, locking system

Procedia PDF Downloads 418
1662 News Publication on Facebook: Emotional Analysis of Hooks

Authors: Gemma Garcia Lopez

Abstract:

The goal of this study is to perform an emotional analysis of the hooks used on Facebook by three of the most important daily newspapers in the USA. These hook texts are used to get the user's attention and invite them to read the news and linked content. Using text-based emotional analysis performed with IBM's Tone Analyzer tool, we discovered that more than 30% of the hooks can be classified emotionally as joy, sadness, anger or fear. This study gathered the publications made by The New York Times, USA Today and The Washington Post on a randomly chosen day. The results show that the journalist's choice of words can expose the reader to different emotions before clicking on the content. In the three cases analyzed, the absence of emotions in some hooks and the presence of emotions in others appear in very similar percentages. Therefore, beyond the objectivity and veracity of the content, a new factor could come into play: the emotional influence on the reader as a media manipulation tool.

Keywords: emotional analysis of newspapers hooks, emotions on Facebook, newspaper hooks on Facebook, news publication on Facebook

Procedia PDF Downloads 139
1661 Convolutional Neural Networks-Optimized Text Recognition with Binary Embeddings for Arabic Expiry Date Recognition

Authors: Mohamed Lotfy, Ghada Soliman

Abstract:

Recognizing Arabic dot-matrix digits is a challenging problem due to the unique characteristics of dot-matrix fonts, such as irregular dot spacing and varying dot sizes. This paper presents an approach for recognizing Arabic digits printed in dot-matrix format. The proposed model is based on Convolutional Neural Networks (CNNs) that take the dot matrix as input and generate embeddings, which are rounded to produce binary representations of the digits. The binary embeddings are then used to perform Optical Character Recognition (OCR) on the digit images. To overcome the limited availability of dotted Arabic expiration date images, we developed a TrueType Font (TTF) for generating synthetic images of Arabic dot-matrix characters. The model was trained on a synthetic dataset of 3287 images and tested on 658 synthetic images, representing realistic expiration dates from 2019 to 2027 in the format yyyy/mm/dd. Our model achieved an accuracy of 98.94% on expiry date recognition in Arabic dot-matrix format while using fewer parameters and less computational resources than traditional CNN-based models. By investigating and presenting our findings comprehensively, we aim to contribute substantially to the field of OCR and pave the way for advancements in Arabic dot-matrix character recognition. Our proposed approach is not limited to Arabic dot-matrix digit recognition but can also be extended to text recognition tasks, such as text classification and sentiment analysis.
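The rounding step described above can be sketched without the CNN itself. The snippet assumes the network emits one value in [0, 1] per bit (for example from sigmoid outputs) and that each digit is assigned its 4-bit binary code; both the codebook and the nearest-code decoding by Hamming distance are illustrative assumptions, not details given in the abstract.

```python
def binarize(embedding):
    """Round each value (assumed to lie in [0, 1]) to a bit, thresholding at 0.5."""
    return tuple(1 if v >= 0.5 else 0 for v in embedding)

def hamming(a, b):
    """Number of positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical codebook: digit d is represented by its 4-bit binary code.
CODEBOOK = {d: tuple(int(bit) for bit in format(d, "04b")) for d in range(10)}

def decode(embedding):
    """Map a real-valued embedding to the digit whose code is nearest in Hamming distance."""
    bits = binarize(embedding)
    return min(CODEBOOK, key=lambda d: hamming(bits, CODEBOOK[d]))

# Stand-ins for CNN outputs on two digit images.
digit_a = decode([0.1, 0.9, 0.2, 0.8])   # bits 0101
digit_b = decode([0.9, 0.1, 0.1, 0.6])   # bits 1001
```

An explicit threshold is used instead of Python's built-in `round`, which rounds exact halves to the nearest even integer and would treat 0.5 as 0.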

Keywords: computer vision, pattern recognition, optical character recognition, deep learning

Procedia PDF Downloads 58
1660 Evaluation of the Urban Regeneration Project: Land Use Transformation and SNS Big Data Analysis

Authors: Ju-Young Kim, Tae-Heon Moon, Jung-Hun Cho

Abstract:

Urban regeneration projects have been actively promoted in Korea. In particular, Jeonju Hanok Village is regarded as one of the representative cases of utilizing local cultural heritage sites in an urban regeneration project. Recently, however, there has been growing concern in this area about ‘gentrification’ caused by excessive commercialization and surging numbers of tourists. This trend has changed land and building use and resulted in the loss of the region's identity. In this regard, this study analyzed the land use transformation between 2010 and 2016 to identify the commercialization trend in Jeonju Hanok Village. In addition, it conducted SNS big data analysis on Jeonju Hanok Village from February 14th, 2016 to March 31st, 2016 to identify visitors’ awareness of the village. The study results demonstrate that rapid commercialization was underway, contrary to the initial intention, so planners and officials in the city government should reconsider the project direction and rebuild deliberate management strategies. This study is meaningful in that it analyzed the land use transformation and SNS big data to identify the current situation in an urban regeneration area. Furthermore, it is expected that the study results will contribute to the vitalization of the regeneration area.

Keywords: land use, SNS, text mining, urban regeneration

Procedia PDF Downloads 274
1659 L1 Poetry and Moral Tales as a Factor Affecting L2 Acquisition in EFL Settings

Authors: Arif Ahmed Mohammed Al-Ahdal

Abstract:

Poetry, tales, and fables have always been a part of the L1 repertoire and one that takes the learners to another amazing and fascinating world of imagination. The storytelling class and the genre of poems are activities greatly enjoyed by all age groups. The very significant idea behind their inclusion in the language curriculum is to sensitize young minds to a wide range of human emotions that are believed to greatly contribute to building their social resilience, emotional stability, empathy towards fellow creatures, and literacy. Quite certainly, the learning objective at this stage is not language acquisition (though it happens as an automatic process) but getting the young learners to be acquainted with an entire spectrum of what may be called the ‘noble’ abilities of the human race. They enrich their very existence, inspiring them to unearth ‘selves’ that help them as adults and enable them to co-exist fruitfully and symbiotically with their fellow human beings. By extension, ‘higher’ training in these literature genres shows the universality of human emotions, sufferings, aspirations, and hopes. The current study is anchored on the Reader-Response-Theory in literature learning, which suggests that the reader reconstructs work and re-enacts the author's creative role. Reiteratingly, literary works provide clues or verbal symbols in a linguistic system, widely accepted by everyone who shares the language, but everyone reads their own life experiences and situations into them. The significance of words depends on the reader, even if they have a typical relationship. In every reading, there is an interaction between the reader and the text. The process of reading is an experience in which the reader tries to comprehend the literary work, which surpasses its full potential since it provides emotional and intellectual reactions that are not anticipated from the document but cannot be affirmed just by the reader as a part of the text. 
The idea is that the text forms the basis of a unifying experience. A reinterpretation of the literary text may transform it into a guiding principle to respond to actual experiences and personal memories. The impulses delivered to the reader vary according to poetry or texts; nevertheless, the readers differ considerably even with the same material. Previous studies confirm that poetry is a useful tool for learning a language. This present paper works on these hypotheses and proposes to study the impetus given to L2 learning as a factor of exposure to poetry and meaningful stories in L1. The driving force behind the choice of this topic is the first-hand experience that the researcher had while teaching a literary text to a group of BA students who, as a reaction to the text, initially burst into tears and ultimately turned the class into an interactive session. The study also intends to compare the performance of male and female students post intervention using pre and post-tests, apart from undertaking a detailed inquiry via interviews with college learners of English to understand how L1 literature plays a great role in the acquisition of L2.

Keywords: SLA, literary text, poetry, tales, affective factors

Procedia PDF Downloads 57
1658 Using Closed Frequent Itemsets for Hierarchical Document Clustering

Authors: Cheng-Jhe Lee, Chiun-Chieh Hsu

Abstract:

Due to the rapid development of the Internet and the increased availability of digital documents, the excessive information on the Internet has led to an information overload problem. To solve this problem and support effective information retrieval, document clustering has become a popular research topic in text mining. Clustering is the unsupervised classification of data items into groups without the need for training data. Many conventional document clustering methods perform inefficiently on large document collections because they were originally designed for relational databases. They are therefore impractical for real-world document clustering, which requires special handling of high dimensionality and high volume. We propose the FIHC (Frequent Itemset-based Hierarchical Clustering) method, a hierarchical clustering method developed for document clustering, where the intuition is that some common words exist for each cluster. FIHC uses such words to cluster documents and builds a hierarchical topic tree. In this paper, we combine the FIHC algorithm with an ontology to solve the semantic problem and mine the meaning behind the words in documents. Furthermore, we use closed frequent itemsets instead of only frequent itemsets, which increases efficiency and scalability. The experimental results show that our method is more accurate than well-known document clustering algorithms.
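To make the distinction between frequent and closed frequent itemsets concrete, here is a brute-force sketch over a toy corpus. An itemset is closed when no proper superset has the same support. The documents, the support threshold, and the exhaustive enumeration are illustrative assumptions; the sketch does not reproduce FIHC or the mining algorithm used in the paper.

```python
from itertools import combinations

def frequent_itemsets(docs, min_support):
    """Brute-force enumeration; fine for a toy corpus, not for real collections."""
    vocab = sorted(set().union(*docs))
    result = {}
    for size in range(1, len(vocab) + 1):
        for combo in combinations(vocab, size):
            items = frozenset(combo)
            # Support = number of documents containing every word in the itemset.
            support = sum(1 for d in docs if items <= d)
            if support >= min_support:
                result[items] = support
    return result

def closed_itemsets(frequent):
    """Keep an itemset only if no proper superset has the same support."""
    return {
        items: sup
        for items, sup in frequent.items()
        if not any(items < other and sup == osup for other, osup in frequent.items())
    }

# Hypothetical documents, each reduced to its set of words.
docs = [
    {"apple", "fruit", "juice"},
    {"apple", "fruit"},
    {"apple", "fruit", "pie"},
    {"engine", "car"},
]
frequent = frequent_itemsets(docs, min_support=2)
closed = closed_itemsets(frequent)
```

Here three itemsets are frequent ({apple}, {fruit}, {apple, fruit}, each with support 3) but only {apple, fruit} is closed, which illustrates how the closed representation shrinks the set of candidate cluster labels without losing support information.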

Keywords: FIHC, documents clustering, ontology, closed frequent itemset

Procedia PDF Downloads 372
1657 QoS-CBMG: A Model for e-Commerce Customer Behavior

Authors: Hoda Ghavamipoor, S. Alireza Hashemi Golpayegani

Abstract:

An approach to modeling customer interaction with e-commerce websites is presented. Considering the service quality level as a predictive feature, we offer an improved method based on the Customer Behavior Model Graph (CBMG), a state-transition graph model. To derive the Quality of Service sensitive-CBMG (QoS-CBMG) model, process-mining techniques are applied to pre-processed website server logs, which are categorized as ‘buy’ or ‘visit’. Experimental results on data from an e-commerce website confirmed that the proposed method outperforms the CBMG-based method.
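A state-transition graph of the kind the abstract refers to can be estimated from session logs by counting page-to-page transitions and normalizing them into probabilities. The sketch below shows only that basic step; the state names and sessions are hypothetical, and the QoS-sensitive extension is not reproduced (one plausible reading, stated here as an assumption, is to estimate a separate graph per service-quality level).

```python
from collections import Counter, defaultdict

def build_cbmg(sessions):
    """Estimate transition probabilities between states from ordered sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Wrap each session with synthetic Entry and Exit states.
        path = ["Entry", *session, "Exit"]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    # Normalize outgoing counts so each state's probabilities sum to 1.
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }

# Hypothetical pre-processed sessions (ordered page categories per visitor).
sessions = [
    ["Home", "Browse", "AddToCart", "Pay"],
    ["Home", "Browse"],
    ["Home", "Search", "Browse", "AddToCart"],
    ["Home", "Browse", "AddToCart", "Pay"],
]
graph = build_cbmg(sessions)
```

Building one graph from sessions labeled ‘buy’ and another from sessions labeled ‘visit’ would give two models whose transition probabilities can be compared directly.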

Keywords: customer behavior model, electronic commerce, quality of service, customer behavior model graph, process mining

Procedia PDF Downloads 389
1656 Power Recovery from Waste Air of Mine Ventilation Fans Using Wind Turbines

Authors: Soumyadip Banerjee, Tanmoy Maity

Abstract:

The recovery of power from waste air generated by mine ventilation fans presents a promising avenue for enhancing energy efficiency in mining operations. This abstract explores the feasibility and benefits of utilizing turbine generators to capture the kinetic energy present in waste air and convert it into electrical power. By integrating turbine generator systems into mine ventilation infrastructures, the potential to harness and utilize the previously untapped energy within the waste air stream is realized. This study examines the principles underlying turbine generator technology and its application within the context of mine ventilation systems. The process involves directing waste air from ventilation fans through specially designed turbines, where the kinetic energy of the moving air is converted into rotational motion. This mechanical energy is then transferred to connected generators, which convert it into electrical power. The recovered electricity can be employed for various on-site applications, including powering mining equipment, lighting, and control systems. The benefits of power recovery from waste air using turbine generators are manifold. Improved energy efficiency within the mining environment results in reduced dependence on external power sources and associated cost savings. Additionally, this approach contributes to environmental sustainability by utilizing a previously wasted resource for power generation. Resource conservation is further enhanced, aligning with modern principles of sustainable mining practices. However, successful implementation requires careful consideration of factors such as waste air characteristics, turbine design, generator efficiency, and integration into existing mine infrastructure. Maintenance and monitoring protocols are necessary to ensure consistent performance and longevity of the turbine generator systems. 
While there is an initial investment associated with equipment procurement, installation, and integration, the long-term benefits of reduced energy costs and environmental impact make this approach economically viable. In conclusion, the recovery of power from waste air from mine ventilation fans using turbine generators offers a tangible solution to enhance energy efficiency and sustainability within mining operations. By capturing and converting the kinetic energy of waste air into usable electrical power, mines can optimize resource utilization, reduce operational costs, and contribute to a greener future for the mining industry.

Keywords: waste to energy, wind power generation, exhaust air, power recovery

Procedia PDF Downloads 10
1655 Text Emotion Recognition by Multi-Head Attention based Bidirectional LSTM Utilizing Multi-Level Classification

Authors: Vishwanath Pethri Kamath, Jayantha Gowda Sarapanahalli, Vishal Mishra, Siddhesh Balwant Bandgar

Abstract:

Recognition of emotional information is essential in any form of communication. The recent growth of HCI (Human-Computer Interaction) underlines the importance of understanding the emotions expressed, which is crucial for improving the system or the interaction itself. In this research work, textual data is used for emotion recognition. Text, being the least expressive of the multimodal resources, poses various challenges, such as capturing contextual information and the sequential nature of language construction. We propose a neural architecture that resolves no fewer than 8 emotions from textual data drawn from multiple datasets, using Google's pre-trained word2vec word embeddings and a multi-head attention-based bidirectional LSTM model with one-vs-all multi-level classification. The emotions targeted in this research are Anger, Disgust, Fear, Guilt, Joy, Sadness, Shame, and Surprise. Textual data from multiple datasets, such as ISEAR, GoEmotions, and Affect, were used to create the emotions dataset. Overlaps and conflicts between data samples were handled with careful preprocessing. Our results show a significant improvement with the modeling architecture, including as much as a 10-point improvement in recognizing some emotions.

Keywords: text emotion recognition, bidirectional LSTM, multi-head attention, multi-level classification, google word2vec word embeddings

Procedia PDF Downloads 152
1654 Defining Processes of Gender Restructuring: The Case of Displaced Tribal Communities of North East India

Authors: Bitopi Dutta

Abstract:

Development Induced Displacement (DID) of subaltern groups has been an issue of intense debate in India. This research will undertake a gender analysis of displacement induced by mining projects in tribal indigenous societies of North East India, centering on the primary research question: 'How does DID reorder gendered relationships in tribal matrilineal societies?' This paper will not focus primarily on the impacts of the displacement induced by coal mining on indigenous tribal women in North East India; it will rather study 'what' the processes are that lead to these transformations and 'how' they operate. In doing so, the paper will locate the cracks in traditional social systems that the discourse of displacement manipulates for its own benefit. DID in this sense will be understood not only as physical displacement, but also as social and cultural displacement. The study will cover one matrilineal tribe in the state of Meghalaya in North East India affected by several coal mining projects in the last 30 years. In-depth unstructured interviews used to collect life narratives will be the primary mode of data collection because the indigenous culture of the tribes in Meghalaya, including the matrilineal tribes, is based on oral history, where knowledge and experiences produced under a tradition of oral history exist in a continuum. This is unlike modern societies, which produce knowledge in a compartmentalized system. An interview guide designed around specific themes, rather than specific questions, will be used to ensure the flow of narratives from the interviewee. In addition, a number of focus groups will be held. The data collected through the life narratives will be supplemented and contextualized through documentary research using government data and local media sources of the region.

Keywords: displacement, gender-relations, matriliny, mining

Procedia PDF Downloads 169
1653 Planning for Environmental and Social Sustainability in Coastal Areas: A Case of Alappad

Authors: K. Vrinda

Abstract:

Coastal ecosystems across the world face many challenges from natural phenomena as well as from uncontrolled human interventions. Alappad, a coastal island in Kerala, India, is undergoing significant damage and is gradually losing its environmental and social sustainability. The area is blessed with very rare and precious black mineral sand deposits. Sand mining for these minerals started in 1911 and still continues. Unfortunately, all the problems that Alappad now faces have their roots in the mining of this mineral sand. The land area is continuously diminishing due to sea erosion. The mining has also caused displacement of people and environmental degradation. Marine life is also affected by beach mining and pollution. The inhabitants are fishermen who depend largely on the ecosystem for a living, so the loss of environmental sustainability affects social sustainability too. The damage has now reached a point beyond which our actions may not be able to make any impact. This was one of the areas most affected by the 2004 tsunami, and the environmental degradation has further increased its vulnerability. This study therefore focuses on understanding the concerns related to resource utilization, the environment and the indigenous community living there, and on formulating suitable strategies to restore the sustainability of the area. An extensive on-site study was conducted to determine the physical, social, and economic characteristics of the area. A focus group discussion with the inhabitants shed light on the different issues they face in their day-to-day life. The analysis of all these data led to a new development vision for the area, which focuses on environmental restoration and socio-economic development while allowing controlled exploitation of resources. A participatory approach is formulated which enables these three aspects through community-based programs.

Keywords: Community development, Disaster resilience, Ecological restoration, Environmental sustainability, Social-environmental planning, Social Sustainability

Procedia PDF Downloads 92
1652 A New Method to Reduce 5G Application Layer Payload Size

Authors: Gui Yang Wu, Bo Wang, Xin Wang

Abstract:

Nowadays, the 5G service-based interface architecture uses text-based payloads such as JSON to transfer business data between network functions. This has the obvious advantages of internet services but causes unnecessarily large traffic. In this paper, a new method for reducing 5G application payload size is presented. It provides a mechanism for network functions to negotiate the new capability when network communication starts up and describes how 5G application data are reduced according to the information negotiated with the peer network function. Without losing the advantages of the 5G text-based payload, this method demonstrates an excellent reduction in application payload size and does not increase the usage of computing resources. Implementing this method does not impact any standards or specifications, nor does it change any encoding or decoding functionality. In a real 5G network, this method will contribute to network efficiency and eventually save considerable computing resources.

Keywords: 5G, JSON, payload size, service-based interface

Procedia PDF Downloads 150
1651 Ancient Port Towns of Western Coastal Plain in Kerala, India: From Manuscripts to Material Remains

Authors: Saravanan R.

Abstract:

The landscape of Kerala paved the way for the growth of maritime contacts with foreigners. Pepper was the most important export item, because this region was the only pepper-producing area on the West Coast of India. This paper attempts to analyse the available references to ancient port towns in Kerala. It is a preliminary investigation of Early Historic urban centres, using the available literary evidence and excavation reports, that helps us to understand the ancient port towns on the Kerala coast. A number of ancient port towns are mentioned in classical Greek and Sangam literature. For instance, Naura, Tyndis, Nelcynda, Bacare and Muziris were major sites of Kerala that are represented only in the texts and have not so far been located on the ground. There are many site-based as well as state-based studies on various aspects of the ancient port towns, but they focus mainly on factual narration and theoretical interpretation.

Keywords: urban centre, amphora, Muziris, port town, Sangam text and trade

Procedia PDF Downloads 52
1650 Adaptation of Hough Transform Algorithm for Text Document Skew Angle Detection

Authors: Kayode A. Olaniyi, Olabanji F. Omotoye, Adeola A. Ogunleye

Abstract:

Skew detection and correction form an important part of digital document analysis, because uncompensated skew can degrade document features and complicate further document image processing steps. Efficient text document analysis and digitization can rarely be achieved when a document is skewed, even at a small angle. Once documents have been digitized by scanning and binarized, skew correction is required before further image analysis. Research efforts in this area have produced algorithms to eliminate document skew. Skew angle correction algorithms can be compared on performance criteria, the most important being the accuracy of skew angle detection, the range of skew angles detected, the speed of processing the image, the computational complexity and, consequently, the memory space used. The standard Hough Transform has been successfully applied to skew angle estimation for text documents. However, its accuracy depends largely on how fine the angular step size is; higher accuracy therefore consumes more time and memory space, especially where the number of pixels is considerably large. Whenever the Hough transform is used, there is always a tradeoff between accuracy and speed, so a more efficient solution is needed that optimizes space as well as time. In this paper, an improved Hough transform (HT) technique that optimizes space as well as time to robustly detect document skew is presented. The modified algorithm resolves the conflict between memory space, running time and accuracy. Our algorithm starts with a first angle estimate accurate to zero decimal places using the standard Hough Transform algorithm, which needs minimal running time and space but lacks accuracy. Then, to increase accuracy, if the angle estimated by the basic Hough algorithm is x degrees, we run the basic algorithm again over the range ±x degrees with an accuracy of one decimal place. The same process is iterated until the desired level of accuracy is achieved. The skew estimation and correction procedure for text images is implemented in MATLAB. The memory space and processing time are also tabulated, with the skew angle assumed to lie between 0° and 45°. The simulation results, demonstrated in MATLAB, show the high performance of our algorithm, with less computational time and memory space used in detecting document skew for a variety of documents with different levels of complexity.
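The coarse-to-fine search described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation. It assumes that each refinement pass searches a window of one previous step on either side of the current estimate (one reading of the "±x degrees" wording), that foreground pixels lie on baselines of the form y = x·tan(a) + c, and that accumulator sharpness is scored by the sum of squared votes. The function names and the synthetic data generator are hypothetical.

```python
import math


def hough_peak_angle(points, lo, hi, step):
    """Scan angles lo..hi (degrees) and return the one whose Hough
    accumulator is sharpest (largest sum of squared votes)."""
    best_angle, best_score = lo, -1
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        angle = lo + i * step
        t = math.radians(angle)
        cos_t, sin_t = math.cos(t), math.sin(t)
        votes = {}
        for x, y in points:
            # For a baseline y = x*tan(a) + c, this offset is constant when t == a.
            rho = round(y * cos_t - x * sin_t)
            votes[rho] = votes.get(rho, 0) + 1
        score = sum(v * v for v in votes.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def estimate_skew(points, lo=0.0, hi=45.0, decimals=1):
    """Coarse pass at 1 degree, then one refinement pass per decimal place."""
    step = 1.0
    angle = hough_peak_angle(points, lo, hi, step)
    for _ in range(decimals):
        window = step          # assumption: search +/- one previous step
        step /= 10.0
        angle = hough_peak_angle(points,
                                 max(lo, angle - window),
                                 min(hi, angle + window),
                                 step)
    return angle


def synthetic_text_lines(skew_deg, n_lines=5, spacing=40, width=800):
    """Hypothetical test data: pixels along parallel, skewed baselines."""
    slope = math.tan(math.radians(skew_deg))
    return [(x, x * slope + k * spacing)
            for k in range(n_lines) for x in range(0, width, 2)]


if __name__ == "__main__":
    pts = synthetic_text_lines(12.3)
    print(estimate_skew(pts, decimals=1))
```

Each pass evaluates only a few dozen candidate angles, so the accumulator stays small even as the angular resolution increases, which is the space and time saving the abstract describes.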

Keywords: hough-transform, skew-detection, skew-angle, skew-correction, text-document

Procedia PDF Downloads 132
1649 Measurement of Natural Radioactivity and Health Hazard Index Evaluation in Major Soils of Tin Mining Areas of Perak

Authors: Habila Nuhu

Abstract:

Natural radionuclides in the environment can significantly contribute to human exposure to ionizing radiation. Knowledge of their levels in an environment can help radiological protection agencies in policymaking. Measurement of natural radioactivity in major soils in the tin mining state of Perak, Malaysia, has been conducted using an HPGe detector. Seventy (70) soil samples were collected at widely distributed locations in the state. Six major soil types were sampled, and thirteen districts around the state were covered. The results for the 226Ra (238U), 228Ra (232Th), and 40K activity in the soil samples were as follows: 226Ra (238U) has a mean activity concentration of 191.83 Bq kg⁻¹, more than five times the UNSCEAR reference value of 35 Bq kg⁻¹. The mean activity concentration of 228Ra (232Th), 232.41 Bq kg⁻¹, is over seven times the UNSCEAR reference value of 30 Bq kg⁻¹. The average 40K activity concentration was 275.24 Bq kg⁻¹, which is less than the UNSCEAR reference value of 400 Bq kg⁻¹. The external hazard index (Hₑₓ) ranged from 1.03 to 2.05, while the internal hazard index (Hᵢₙ) ranged from 1.48 to 3.08. Hₑₓ and Hᵢₙ should be less than one for minimal external and internal radiation threats and for the safe use of soil material in building construction. The Hₑₓ and Hᵢₙ results generally indicate that care must be taken when using the soil types in the study area and their derivatives as building materials.
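The abstract does not state how the hazard indices were computed. As a hedged illustration, the sketch below assumes the commonly used definitions Hₑₓ = A_Ra/370 + A_Th/259 + A_K/4810 and Hᵢₙ = A_Ra/185 + A_Th/259 + A_K/4810, with activity concentrations in Bq kg⁻¹. The function names are hypothetical, and the inputs are the mean concentrations quoted in the abstract.

```python
def external_hazard_index(a_ra, a_th, a_k):
    """H_ex under the assumed standard definition (activities in Bq/kg)."""
    return a_ra / 370.0 + a_th / 259.0 + a_k / 4810.0


def internal_hazard_index(a_ra, a_th, a_k):
    """H_in under the assumed standard definition; the Ra term is doubled
    relative to H_ex to account for radon inhalation."""
    return a_ra / 185.0 + a_th / 259.0 + a_k / 4810.0


if __name__ == "__main__":
    # Mean activity concentrations reported in the abstract.
    ra, th, k = 191.83, 232.41, 275.24
    print(round(external_hazard_index(ra, th, k), 2))
    print(round(internal_hazard_index(ra, th, k), 2))
```

Under these assumed formulas the mean concentrations give roughly 1.47 and 1.99, both above one and within the ranges reported in the abstract.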

Keywords: activity concentration, hazard index, soil samples, tin mining

Procedia PDF Downloads 91
1648 Semantic Indexing Improvement for Textual Documents: Contribution of Classification by Fuzzy Association Rules

Authors: Mohsen Maraoui

Abstract:

With the aim of improving natural language processing applications such as information retrieval, machine translation, and lexical disambiguation, we focus on a statistical approach to semantic indexing for multilingual text documents based on the conceptual network formalism. We propose to use this formalism as an indexing language to represent the descriptive concepts and their weighting. These concepts represent the content of the document. Our contribution consists of two steps. In the first step, we propose the extraction of index terms using the multilingual lexical resource EuroWordNet (EWN). In the second step, we pass from the representation of index terms to the representation of index concepts through the conceptual network formalism. This network is generated using the EWN resource and passes through a classification step based on the association rules model (in an attempt to discover the non-taxonomic or contextual relations between the concepts of a document). These relations are latent relations buried in the text and carried by the semantic context of the co-occurrence of concepts in the document. Our proposed indexing approach can be applied to text documents in various languages because it is based on a linguistic method adapted to the language through a multilingual thesaurus. Next, we apply the same statistical process regardless of the language in order to extract the significant concepts and their associated weights. We show that the proposed indexing approach provides encouraging results.

Keywords: concept extraction, conceptual network formalism, fuzzy association rules, multilingual thesaurus, semantic indexing

Procedia PDF Downloads 123
1647 Troubleshooting Petroleum Equipment Based on Wireless Sensors Based on Bayesian Algorithm

Authors: Vahid Bayrami Rad

Abstract:

In this research, common methods and techniques have been investigated with a focus on intelligent fault finding and monitoring systems in the oil industry. Remote and intelligent control methods are considered a necessity for implementing various operations in the oil industry, and benefiting from the knowledge extracted, with the help of data mining algorithms, from the vast amounts of data generated is a way to speed up monitoring and troubleshooting in today's big oil companies. Therefore, by comparing data mining algorithms and examining their efficiency, structure, and response under different conditions, the proposed (Bayesian) algorithm, which uses data clustering and analysis together with data evaluation by a colored Petri net, provides an applicable and dynamic model in terms of reliability and response time. With this method, it is possible to achieve a dynamic and consistent model of the remote control system, prevent leakage in oil pipelines and refineries, and reduce costs as well as human and financial errors. The statistical data obtained from the evaluation process show that the proposed method increases reliability, availability and speed compared to previous methods.

Keywords: wireless sensors, petroleum equipment troubleshooting, Bayesian algorithm, colored Petri net, rapid miner, data mining-reliability

Procedia PDF Downloads 43
1646 Direct Blind Separation Methods for Convolutive Images Mixtures

Authors: Ahmed Hammed, Wady Naanaa

Abstract:

In this paper, we propose a general approach to the problem of a convolutive mixture of images. We use a direct blind source separation method that adds only one non-statistically justified constraint, describing the relationships between the different mixing matrices, with the aim of making the problem easier to solve. Provided that this constraint is known, the method can be applied to degraded documents affected by the overlapping of text patterns and images. This overlapping is due to chemical and physical reactions of the materials (paper, inks, ...) occurring as the documents age, and to other unpredictable causes such as humidity, microorganism infestation, human handling, etc. We will demonstrate that this problem corresponds to a convolutive mixture of images. Subsequently, we will validate our method through numerical examples. We can thus obtain clear images from unreadable ones caused by page superposition, a phenomenon that we find very often in archival documents.

Keywords: blind source separation, convoluted mixture, degraded documents, text-patterns overlapping

Procedia PDF Downloads 304
1645 Computerized Scoring System: A Stethoscope to Understand Consumer's Emotion through His or Her Feedback

Authors: Chen Yang, Jun Hu, Ping Li, Lili Xue

Abstract:

Most companies pay careful attention to consumer feedback collection, so a ‘feedback’ button is commonly found in all kinds of mobile apps. Yet it is much more challenging to analyze these feedback texts and to catch the true feelings of the consumer who submits the feedback, whether it concerns a problem or a compliment. This is especially so for Chinese content: in one context a piece of Chinese feedback may be positive, while in another context the same feedback may be negative. For example, in Chinese, the feedback 'operating with loudness' can be applied to both a refrigerator and a stereo system. This feedback is clearly negative for a refrigerator; however, the same feedback is positive for a stereo system. By introducing Bradley, M. and Lang, P.'s Affective Norms for English Text (ANET) theory and Bucci W.’s Referential Activity (RA) theory, we, usability researchers at Pingan, are able to decipher the feedback and to find the hidden feelings behind the content. We take two of the three dimensions of ANET, ‘valence’ and ‘dominance’, and two of the four dimensions of RA, ‘concreteness’ and ‘specificity’, to build our own rating system with a scale of 1 to 5 points. This rating system enables us to judge the feeling/emotion behind each piece of feedback, and it works well with a single word or phrase as well as a whole paragraph. The result of the rating reflects the strength of the consumer's feeling/emotion when he/she is typing the feedback. In our daily work, we first require a consumer to answer the net promoter score (NPS) question before writing the feedback, so we can determine whether the feedback is positive or negative. Secondly, we code the feedback content according to the company's problem list, which contains 200 problem items. In this way, we are able to count how many pieces of feedback left by consumers belong to one typical problem. Thirdly, we rate each piece of feedback with the rating system mentioned above to capture the strength of the feeling/emotion when the consumer writes the feedback. In this way, we obtain two kinds of data: 1) the portion, meaning how many pieces of feedback are ascribed to one problem item, and 2) the severity, meaning how strong the negative feeling/emotion is when the consumer is writing the feedback. By crossing these two, with the portion on the X-axis and the severity on the Y-axis, we are able to find which typical problem scores high in both portion and severity. The higher a problem scores, the more urgently it should be solved, as it means more people express stronger negative feelings in feedback regarding this problem. Moreover, by using a hidden Markov model to implement our rating system, we are able to computerize the scoring and to process thousands of pieces of feedback in a short period of time, which is efficient and accurate enough for industrial purposes.
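The portion and severity crossing described above can be sketched in a few lines of Python. This is only an illustration: it assumes that each piece of feedback has already been coded to a problem item and rated from 1 to 5, and it ranks items by the product of portion and severity, which is a choice made here and not stated in the abstract. The function name and the sample item labels are hypothetical.

```python
from collections import defaultdict


def prioritize(feedbacks):
    """feedbacks: list of (problem_item, rating) pairs, rating in 1..5.
    Returns (item, portion, severity, priority) rows, most urgent first."""
    ratings = defaultdict(list)
    for item, rating in feedbacks:
        ratings[item].append(rating)
    total = len(feedbacks)
    rows = []
    for item, rs in ratings.items():
        portion = len(rs) / total        # share of all feedback on this item
        severity = sum(rs) / len(rs)     # mean strength of negative feeling
        # Assumption: urgency is the product of the two axes.
        rows.append((item, portion, severity, portion * severity))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


if __name__ == "__main__":
    sample = [("login", 5), ("login", 4), ("login", 5),
              ("payment", 2), ("ui", 3), ("ui", 1)]
    for row in prioritize(sample):
        print(row)
```

Plotting the second and third columns against each other reproduces the X-axis and Y-axis view described in the abstract; items in the upper right corner are the most urgent.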

Keywords: computerized scoring system, feeling/emotion of consumer feedback, referential activity, text mining

Procedia PDF Downloads 152
1644 Application of a Modified Crank-Nicolson Method in Metallurgy

Authors: Kobamelo Mashaba

Abstract:

Molten slag has a high temperature, in the range 1723-1923, and carries a large amount of useful energy that can reduce energy consumption and CO₂ emissions through heat recovery. Therefore, in this study, we investigated the performance of the modified Crank-Nicolson method for a delayed partial differential equation applied to the heat recovery of molten slag in the metallurgical mining environment. It was proved that the proposed method has a unique solution and converges quickly compared to the classic method. It was inferred from the numerical results that the proposed methodology is more viable and profitable for the mining industry.
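For orientation, the classic Crank-Nicolson scheme that the abstract uses as its baseline can be sketched as follows. This is not the author's modified method and it contains no delay term. It assumes the plain 1D heat equation u_t = α·u_xx with fixed (Dirichlet) boundary values and solves the tridiagonal system with the Thomas algorithm. All names and the test problem are illustrative.

```python
import math


def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal, d: rhs)."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def crank_nicolson(u0, alpha, dx, dt, steps):
    """Advance u_t = alpha * u_xx by `steps` time steps.
    u0 includes both boundary values, which are held fixed."""
    r = alpha * dt / (2.0 * dx * dx)
    u = list(u0)
    n = len(u) - 2                       # number of interior nodes
    a, b, c = [-r] * n, [1.0 + 2.0 * r] * n, [-r] * n
    for _ in range(steps):
        # Explicit half of the scheme on the right-hand side.
        d = [r * u[i - 1] + (1.0 - 2.0 * r) * u[i] + r * u[i + 1]
             for i in range(1, n + 1)]
        # Known boundary values from the implicit half.
        d[0] += r * u[0]
        d[-1] += r * u[-1]
        u[1:-1] = thomas(a, b, c, d)
    return u


if __name__ == "__main__":
    nx, dx, dt = 21, 0.05, 0.001
    u0 = [math.sin(math.pi * i * dx) for i in range(nx)]
    u0[0] = u0[-1] = 0.0
    u = crank_nicolson(u0, alpha=1.0, dx=dx, dt=dt, steps=100)
    exact = math.exp(-math.pi ** 2 * 0.1)   # analytic value at x = 0.5, t = 0.1
    print(u[10], exact)
```

The scheme averages the explicit and implicit second differences, which makes it second-order accurate in time and unconditionally stable for this equation.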

Keywords: delayed partial differential equation, modified Crank-Nicolson Method, molten slag, heat recovery, parabolic equation

Procedia PDF Downloads 84
1643 Directional Dust Deposition Measurements: The Influence of Seasonal Changes and the Meteorological Conditions Influencing in Witbank Area and Carletonville Area

Authors: Maphuti Georgina Kwata

Abstract:

Coal mining in Mpumalanga Province is known to contribute to atmospheric pollution through various activities. Gold mining in North-West Province is also known to contribute to atmospheric pollution, especially through the production of radon gas. In this research, directional dust deposition gauges were used to determine the direction of the dust sources, and meteorological data were used to derive wind roses and assess the influence of seasonal changes. Dust was collected over fourteen months in the Witbank and Carletonville areas. The results show that the source direction for Ericson Dam was east in February 2010, while for the Tip Area it was west in October 2010. To the east there were mining operations and power stations, which account for east being the source direction. To the west there were smelters, power stations and agricultural activities, which account for west being the source direction for Driefontein Mine: East Recreational Village Club. East was the source direction for Leslie Williams hospital, which also indicates dust-generating activities such as mining operations and agricultural activities. The meteorological results for the Emalahleni area show that in summer and winter the wind blows from the east sector at 5-10 m s⁻¹. On annual average the wind blows from the east-south-eastern sector at 20 m s⁻¹; in the daytime it blows from the north-western sector at more than 20 m s⁻¹, and at night from the east-eastern direction with a maximum wind speed of 20 m s⁻¹. The meteorological results for Driefontein Mine show wind from the north-western and north-eastern sectors at 5-10 m s⁻¹. In the daytime the wind blows from the west sector and at night from the north sector. In summer the wind blows from the north-east sector at 5-10 m s⁻¹, and in winter it blows from the north-west, which is also the predominant direction. In spring the wind blows from the north-east. The conclusion is that the mining operations where the directional dust deposit gauges were installed were not the only contributors to the measured source directions; the power stations, smelters, and other activities near the mining operations also contributed. It is recommended that dust suppressant be applied regularly to unpaved roads and that weather conditions (wind speed and direction) be monitored prior to blasting to ensure minimal emissions.

Keywords: directional dust deposition gauge, BS part 5 1747 dust deposit gauge, wind rose, wind blowing

Procedia PDF Downloads 489
1642 Metadiscourse in EFL, ESP and Subject-Teaching Online Courses in Higher Education

Authors: Maria Antonietta Marongiu

Abstract:

Propositional information in discourse is made coherent, intelligible, and persuasive through metadiscourse. The linguistic and rhetorical choices that writers/speakers make to organize and negotiate content matter are intended to help relate a text to its context. Besides, they help the audience to connect to and interpret a text according to the values of a specific discourse community. Based on these assumptions, this work aims to analyse the use of metadiscourse in the spoken performance of teachers in online EFL, ESP, and subject-teacher courses taught in English to non-native learners in higher education. In point of fact, the global spread of Covid 19 has forced universities to transition their in-class courses to online delivery. This has inevitably placed on the instructor a heavier interactional responsibility compared to in-class courses. Accordingly, online delivery needs greater structuring as regards establishing the reader/listener’s resources for text understanding and negotiating. Indeed, in online as well as in in-class courses, lessons are social acts which take place in contexts where interlocutors, as members of a community, affect the ways ideas are presented and understood. Following Hyland’s Interactional Model of Metadiscourse (2005), this study intends to investigate Teacher Talk in online academic courses during the Covid 19 lock-down in Italy. The selected corpus includes the transcripts of online EFL and ESP courses and subject-teachers online courses taught in English. The objective of the investigation is, firstly, to ascertain the presence of metadiscourse in the form of interactive devices (to guide the listener through the text) and interactional features (to involve the listener in the subject). 
Previous research on metadiscourse in academic discourse, in college students' presentations in EAP (English for Academic Purposes) lessons, as well as in online teaching methodology courses and MOOC (Massive Open Online Courses) has shown that instructors use a vast array of metadiscoursal features intended to express the speakers’ intentions and standing with respect to discourse. Besides, they tend to use directions to orient their listeners and logical connectors referring to the structure of the text. Accordingly, the purpose of the investigation is also to find out whether metadiscourse is used as a rhetorical strategy by instructors to control, evaluate and negotiate the impact of the ongoing talk, and eventually to signal their attitudes towards the content and the audience. Thus, the use of metadiscourse can contribute to the informative and persuasive impact of discourse, and to the effectiveness of online communication, especially in learning contexts.

Keywords: discourse analysis, metadiscourse, online EFL and ESP teaching, rhetoric

Procedia PDF Downloads 111