Search results for: data mining techniques
29495 Integrating of Multi-Criteria Decision Making and Spatial Data Warehouse in Geographic Information System
Authors: Zohra Mekranfar, Ahmed Saidi, Abdellah Mebrek
Abstract:
This work aims to develop multi-criteria decision making (MCDM) and spatial data warehouse (SDW) methods, which will be integrated into a GIS according to a ‘GIS dominant’ approach. The GIS operating tools will be operational to operate the SDW. The MCDM methods can provide many solutions to a set of problems with various and multiple criteria. When the problem is so complex, integrating spatial dimension, it makes sense to combine the MCDM process with other approaches like data mining, ascending analyses, we present in this paper an experiment showing a geo-decisional methodology of SWD construction, On-line analytical processing (OLAP) technology which combines both basic multidimensional analysis and the concepts of data mining provides powerful tools to highlight inductions and information not obvious by traditional tools. However, these OLAP tools become more complex in the presence of the spatial dimension. The integration of OLAP with a GIS is the future geographic and spatial information solution. GIS offers advanced functions for the acquisition, storage, analysis, and display of geographic information. However, their effectiveness for complex spatial analysis is questionable due to their determinism and their decisional rigor. A prerequisite for the implementation of any analysis or exploration of spatial data requires the construction and structuring of a spatial data warehouse (SDW). This SDW must be easily usable by the GIS and by the tools offered by an OLAP system.Keywords: data warehouse, GIS, MCDM, SOLAP
Procedia PDF Downloads 17829494 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition
Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie
Abstract:
In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. It provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is expressed as a case study of multimodal data fusion approaches, and the open issues through the limitations of the current studies are presented. This paper can be considered a powerful guide for interested researchers in the field of multimodal data fusion and audiovisual speech recognition particularly.Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks
Procedia PDF Downloads 11229493 Mine Production Index (MPi): New Method to Evaluate Effectiveness of Mining Machinery
Authors: Amol Lanke, Hadi Hoseinie, Behzad Ghodrati
Abstract:
OEE has been used in many industries as measure of performance. However due to limitations of original OEE, it has been modified by various researchers. OEE for mining application is special version of classic equation, carries these limitation over. In this paper it has been aimed to modify the OEE for mining application by introducing the weights to the elements of it and termed as Mine Production index (MPi). As a special application of new index MPi shovel has been developed by team of experts and researchers for evaluating the shovel effectiveness. Based on analysis, utilization followed by performance and availability were ranked in this order. To check the applicability of this index, a case study was done on four electrical and one hydraulic shovel in a Swedish mine. The results shows that MPishovelcan properly evaluate production effectiveness of shovels and determine effectiveness values in optimistic view compared to OEE. MPi with calculation not only give the effectiveness but also can predict which elements should be focused for improving the productivity.Keywords: mining, overall equipment efficiency (OEE), mine production index, shovels
Procedia PDF Downloads 46329492 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency
Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami
Abstract:
Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means
Procedia PDF Downloads 25929491 Engineering Management and Practice in Nigeria
Authors: Harold Jideofor
Abstract:
The application of Project Management (PM) tools and techniques in the public sector is gradually becoming an important issue in developing economies, especially in a country like Nigeria where projects of different size and structures are undertaken. The paper examined the application of the project management practice in the public sector in Nigeria. The PM lifecycles, tools, and techniques were presented. The study was carried out in Lagos because of its metropolitan nature and rapidly growing economy. Twenty-three copies of questionnaire were administered to 23 public institutions in Lagos to generate primary data. The descriptive analysis techniques using percentages and table presentations coupled with the coefficient of correlation were used for data analysis. The study revealed that application of PM tools and techniques is an essential management approach that tends to achieve specified objectives within specific time and budget limits through the optimum use of resources. Furthermore, the study noted that there is a lack of in-depth knowledge of PM tools and techniques in public sector institutions sampled, also a high cost of the application was also observed by the respondents. The study recommended among others that PM tools and techniques should be applied gradually especially in old government institutions where resistance to change is perceived to be high.Keywords: project management, public sector, practice, Nigeria
Procedia PDF Downloads 34229490 The Acquisition of Case in Biological Domain Based on Text Mining
Authors: Shen Jian, Hu Jie, Qi Jin, Liu Wei Jie, Chen Ji Yi, Peng Ying Hong
Abstract:
In order to settle the problem of acquiring case in biological related to design problems, a biometrics instance acquisition method based on text mining is presented. Through the construction of corpus text vector space and knowledge mining, the feature selection, similarity measure and case retrieval method of text in the field of biology are studied. First, we establish a vector space model of the corpus in the biological field and complete the preprocessing steps. Then, the corpus is retrieved by using the vector space model combined with the functional keywords to obtain the biological domain examples related to the design problems. Finally, we verify the validity of this method by taking the example of text.Keywords: text mining, vector space model, feature selection, biologically inspired design
Procedia PDF Downloads 26229489 Combination of Artificial Neural Network Model and Geographic Information System for Prediction Water Quality
Authors: Sirilak Areerachakul
Abstract:
Water quality has initiated serious management efforts in many countries. Artificial Neural Network (ANN) models are developed as forecasting tools in predicting water quality trend based on historical data. This study endeavors to automatically classify water quality. The water quality classes are evaluated using 6 factor indices. These factors are pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (T-Coliform). The methodology involves applying data mining techniques using multilayer perceptron (MLP) neural network models. The data consisted of 11 sites of Saen Saep canal in Bangkok, Thailand. The data is obtained from the Department of Drainage and Sewerage Bangkok Metropolitan Administration during 2007-2011. The results of multilayer perceptron neural network exhibit a high accuracy multilayer perception rate at 94.23% in classifying the water quality of Saen Saep canal in Bangkok. Subsequently, this encouraging result could be combined with GIS data improves the classification accuracy significantly.Keywords: artificial neural network, geographic information system, water quality, computer science
Procedia PDF Downloads 34329488 Clustering Ethno-Informatics of Naming Village in Java Island Using Data Mining
Authors: Atje Setiawan Abdullah, Budi Nurani Ruchjana, I. Gede Nyoman Mindra Jaya, Eddy Hermawan
Abstract:
Ethnoscience is used to see the culture with a scientific perspective, which may help to understand how people develop various forms of knowledge and belief, initially focusing on the ecology and history of the contributions that have been there. One of the areas studied in ethnoscience is etno-informatics, is the application of informatics in the culture. In this study the science of informatics used is data mining, a process to automatically extract knowledge from large databases, to obtain interesting patterns in order to obtain a knowledge. While the application of culture described by naming database village on the island of Java were obtained from Geographic Indonesia Information Agency (BIG), 2014. The purpose of this study is; first, to classify the naming of the village on the island of Java based on the structure of the word naming the village, including the prefix of the word, syllable contained, and complete word. Second to classify the meaning of naming the village based on specific categories, as well as its role in the community behavioral characteristics. Third, how to visualize the naming of the village to a map location, to see the similarity of naming villages in each province. In this research we have developed two theorems, i.e theorems area as a result of research studies have collected intersection naming villages in each province on the island of Java, and the composition of the wedge theorem sets the provinces in Java is used to view the peculiarities of a location study. The methodology in this study base on the method of Knowledge Discovery in Database (KDD) on data mining, the process includes preprocessing, data mining and post processing. The results showed that the Java community prioritizes merit in running his life, always working hard to achieve a more prosperous life, and love as well as water and environmental sustainment. Naming villages in each location adjacent province has a high degree of similarity, and influence each other. Cultural similarities in the province of Central Java, East Java and West Java-Banten have a high similarity, whereas in Jakarta-Yogyakarta has a low similarity. This research resulted in the cultural character of communities within the meaning of the naming of the village on the island of Java, this character is expected to serve as a guide in the behavior of people's daily life on the island of Java.Keywords: ethnoscience, ethno-informatics, data mining, clustering, Java island culture
Procedia PDF Downloads 28329487 Compliance with the Health and Safety Standards/Regulations in the South African Mining Industry: A Literature Review
Authors: Livhuwani Muthelo, Tebogo Maria Mothiba, Rambelani Nancy Malema
Abstract:
Background: Despite occupational legislation/standards being in place in the industry, there are many reported health and safety incidents, including both occupational injuries and illnesses in the South African mining industry. Purpose: This systematic literature review aimed to describe and identify the existing gaps in health and safety compliance within the South African mining industry and propose future research areas. Methodology: A systematic literature review was conducted using the key concepts of health and safety, compliance, standards, and mining. A total of 102 papers issued from 1994 to April 2020 were extracted from an online database search, which included a combination of South African and international government OHS legislation documents, policies, standards, reports from the mineral departments and international labour office, qualitative and quantitative journal articles, dissertations, seminars and conference proceedings. Results: The literature review revealed that, though there are laws, regulations, standards to guide the industry on health and safety issues in South Africa, the main challenge is with the compliance with the existing health and safety systems, wherein systems are not being implemented. Conclusion: Gaps between research, policy, and implementation in occupational health practice in the South African mining industry were also identified.Keywords: circumstances, non-compliance, health and safety, standards, mining industry
Procedia PDF Downloads 28829486 Forest Risk and Vulnerability Assessment: A Case Study from East Bokaro Coal Mining Area in India
Authors: Sujata Upgupta, Prasoon Kumar Singh
Abstract:
The expansion of large scale coal mining into forest areas is a potential hazard for the local biodiversity and wildlife. The objective of this study is to provide a picture of the threat that coal mining poses to the forests of the East Bokaro landscape. The vulnerable forest areas at risk have been assessed and the priority areas for conservation have been presented. The forested areas at risk in the current scenario have been assessed and compared with the past conditions using classification and buffer based overlay approach. Forest vulnerability has been assessed using an analytical framework based on systematic indicators and composite vulnerability index values. The results indicate that more than 4 km2 of forests have been lost from 1973 to 2016. Large patches of forests have been diverted for coal mining projects. Forests in the northern part of the coal field within 1-3 km radius around the coal mines are at immediate risk. The original contiguous forests have been converted into fragmented and degraded forest patches. Most of the collieries are located within or very close to the forests thus threatening the biodiversity and hydrology of the surrounding regions. Based on the vulnerability values estimated, it was concluded that more than 90% of the forested grids in East Bokaro are highly vulnerable to mining. The forests in the sub-districts of Bermo and Chandrapura have been identified as the most vulnerable to coal mining activities. This case study would add to the capacity of the forest managers and mine managers to address the risk and vulnerability of forests at a small landscape level in order to achieve sustainable development.Keywords: forest, coal mining, indicators, vulnerability
Procedia PDF Downloads 38929485 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques
Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart
Abstract:
Automatic text classification applies mostly natural language processing (NLP) and other AI-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.Keywords: machine learning, text classification, NLP techniques, semantic representation
Procedia PDF Downloads 10029484 Planning Urban Sprawl in Mining Areas in Africa: How to Ensure Coherent Development
Authors: Pascal Rey, Anaïs Weber
Abstract:
Many mining projects are being developed in Africa the last decades. Due to the economic opportunities they offer, these projects result in a massive and rapid influx of migrants to the surrounding area. In areas where central government representation is low and local administration lack financial resources, urban development is often anarchical, beyond all public control. It leads to socio-spatial segregation, insecurity and the risk of social conflicts rising. Aware that their economic development is very correlated with local situation, mining companies get more and more involved in regional planning in setting up tools and Strategic Directions document. One of the commonly used tools in this regard is the “Influx Management Plan”. It consists in looking at the region’s absorption capacities in order to ensure its coherent development and by developing several urban centers than one macrocephalic city. It includes many other measures such as urban governance support, skills transfer, creation of strategic guidelines, financial support (local taxes, mining taxes, development funds etc.) local development projects. Through various examples of mining projects in Guinea, A country that is host to many large mining projects, we will look at the implications of regional and urban planning of which mining companies are key playor as well as public authorities. While their investment capacity offers advantages and accelerates development, their actions raise questions of the unilaterality of interests and local governance. By interfering in public affairs are mining companies not increasing the risk of central and local government shirking their responsibilities in terms of regional development, or even calling their legitimacy into question? Is such public-private collaboration really sustainable for the region as a whole and for all stakeholders?Keywords: Africa, guinea, mine, urban planning
Procedia PDF Downloads 49829483 Assessment for the Backfill Using the Run of the Mine Tailings and Portland Cement
Authors: Javad Someehneshin, Weizhou Quan, Abdelsalam Abugharara, Stephen Butt
Abstract:
Narrow vein mining (NVM) is exploiting very thin but valuable ore bodies that are uneconomical to extract by conventional mining methods. NVM applies the technique of Sustainable Mining by Drilling (SMD). The SMD method is used to mine stranded, steeply dipping ore veins, which are too small or isolated to mine economically using conventional methods since the dilution is minimized. This novel mining technique uses drilling rigs to extract the ore through directional drilling surgically. This paper is focusing on utilizing the run of the mine tailings and Portland cement as backfill material to support the hanging wall for providing safe mine operation. Cemented paste backfill (CPB) is designed by mixing waste tailings, water, and cement of the precise percentage for optimal outcomes. It is a non-homogenous material that contains 70-85% solids. Usually, a hydraulic binder is added to the mixture to increase the strength of the CPB. The binder fraction mostly accounts for 2–10% of the total weight. In the mining industry, CPB has been improved and expanded gradually because it provides safety and support for the mines. Furthermore, CPB helps manage the waste tailings in an economical method and plays a significant role in environmental protection.Keywords: backfilling, cement backfill, tailings, Portland cement
Procedia PDF Downloads 13829482 Cultural Dynamics in Online Consumer Behavior: Exploring Cross-Country Variances in Review Influence
Authors: Eunjung Lee
Abstract:
This research investigates the intricate connection between cultural differences and online consumer behaviors by integrating Hofstede's Cultural Dimensions theory with analysis methodologies such as text mining, data mining, and topic analysis. Our aim is to provide a comprehensive understanding of how national cultural differences influence individuals' behaviors when engaging with online reviews. To ensure the relevance of our investigation, we systematically analyze and interpret the cultural nuances influencing online consumer behaviors, especially in the context of online reviews. By anchoring our research in Hofstede's Cultural Dimensions theory, we seek to offer valuable insights for marketers to tailor their strategies based on the cultural preferences of diverse global consumer bases. In our methodology, we employ advanced text mining techniques to extract insights from a diverse range of online reviews gathered globally for a specific product or service like Netflix. This approach allows us to reveal hidden cultural cues in the language used by consumers from various backgrounds. Complementing text mining, data mining techniques are applied to extract meaningful patterns from online review datasets collected from different countries, aiming to unveil underlying structures and gain a deeper understanding of the impact of cultural differences on online consumer behaviors. The study also integrates topic analysis to identify recurring subjects, sentiments, and opinions within online reviews. Marketers can leverage these insights to inform the development of culturally sensitive strategies, enhance target audience segmentation, and refine messaging approaches aligned with cultural preferences. Anchored in Hofstede's Cultural Dimensions theory, our research employs sophisticated methodologies to delve into the intricate relationship between cultural differences and online consumer behaviors. Applied to specific cultural dimensions, such as individualism vs. collectivism, masculinity vs. femininity, uncertainty avoidance, and long-term vs. short-term orientation, the study uncovers nuanced insights. For example, in exploring individualism vs. collectivism, we examine how reviewers from individualistic cultures prioritize personal experiences while those from collectivistic cultures emphasize communal opinions. Similarly, within masculinity vs. femininity, we investigate whether distinct topics align with cultural notions, such as robust features in masculine cultures and user-friendliness in feminine cultures. Examining information-seeking behaviors under uncertainty avoidance reveals how cultures differ in seeking detailed information or providing succinct reviews based on their comfort with ambiguity. Additionally, in assessing long-term vs. short-term orientation, the research explores how cultural focus on enduring benefits or immediate gratification influences reviews. These concrete examples contribute to the theoretical enhancement of Hofstede's Cultural Dimensions theory, providing a detailed understanding of cultural impacts on online consumer behaviors. As online reviews become increasingly crucial in decision-making, this research not only contributes to the academic understanding of cultural influences but also proposes practical recommendations for enhancing online review systems. Marketers can leverage these findings to design targeted and culturally relevant strategies, ultimately enhancing their global marketing effectiveness and optimizing online review systems for maximum impact.Keywords: comparative analysis, cultural dimensions, marketing intelligence, national culture, online consumer behavior, text mining
Procedia PDF Downloads 4729481 Performance Comparison of Outlier Detection Techniques Based Classification in Wireless Sensor Networks
Authors: Ayadi Aya, Ghorbel Oussama, M. Obeid Abdulfattah, Abid Mohamed
Abstract:
Nowadays, many wireless sensor networks have been distributed in the real world to collect valuable raw sensed data. The challenge is to extract high-level knowledge from this huge amount of data. However, the identification of outliers can lead to the discovery of useful and meaningful knowledge. In the field of wireless sensor networks, an outlier is defined as a measurement that deviates from the normal behavior of sensed data. Many detection techniques of outliers in WSNs have been extensively studied in the past decade and have focused on classic based algorithms. These techniques identify outlier in the real transaction dataset. This survey aims at providing a structured and comprehensive overview of the existing researches on classification based outlier detection techniques as applicable to WSNs. Thus, we have identified key hypotheses, which are used by these approaches to differentiate between normal and outlier behavior. In addition, this paper tries to provide an easier and a succinct understanding of the classification based techniques. Furthermore, we identified the advantages and disadvantages of different classification based techniques and we presented a comparative guide with useful paradigms for promoting outliers detection research in various WSN applications and suggested further opportunities for future research.Keywords: bayesian networks, classification-based approaches, KPCA, neural networks, one-class SVM, outlier detection, wireless sensor networks
Procedia PDF Downloads 49629480 GIS for Simulating Air Traffic by Applying Different Multi-radar Positioning Techniques
Authors: Amara Rafik, Bougherara Maamar, Belhadj Aissa Mostefa
Abstract:
Radar data is one of the many data sources used by ATM Air Traffic Management systems. These data come from air navigation radar antennas. These radars intercept signals emitted by the various aircraft crossing the controlled airspace and calculate the position of these aircraft and retransmit their positions to the Air Traffic Management System. For greater reliability, these radars are positioned in such a way as to allow their coverage areas to overlap. An aircraft will therefore be detected by at least one of these radars. However, the position coordinates of the same aircraft and sent by these different radars are not necessarily identical. Therefore, the ATM system must calculate a single position (radar track) which will ultimately be sent to the control position and displayed on the air traffic controller's monitor. There are several techniques for calculating the radar track. Furthermore, the geographical nature of the problem requires the use of a Geographic Information System (GIS), i.e. a geographical database on the one hand and geographical processing. The objective of this work is to propose a GIS for traffic simulation which reconstructs the evolution over time of aircraft positions from a multi-source radar data set and by applying these different techniques.Keywords: ATM, GIS, radar data, air traffic simulation
Procedia PDF Downloads 8529479 Use of Quasi-3D Inversion of VES Data Based on Lateral Constraints to Characterize the Aquifer and Mining Sites of an Area Located in the North-East of Figuil, North Cameroon
Authors: Fofie Kokea Ariane Darolle, Gouet Daniel Hervé, Koumetio Fidèle, Yemele David
Abstract:
The electrical resistivity method is successfully used in this paper in order to have a clearer picture of the subsurface of the North-East ofFiguil in northern Cameroon. It is worth noting that this method is most often used when the objective of the study is to image the shallow subsoils by considering them as a set of stratified ground layers. The problem to be solved is very often environmental, and in this case, it is necessary to perform an inversion of the data in order to have a complete and accurate picture of the parameters of the said layers. In the case of this work, thirty-three (33) Schlumberger VES have been carried out on an irregular grid to investigate the subsurface of the study area. The 1D inversion applied as a preliminary modeling tool and in correlation with the mechanical drillings results indicates a complex subsurface lithology distribution mainly consisting of marbles and schists. Moreover, the quasi-3D inversion with lateral constraint shows that the misfit between the observed field data and the model response is quite good and acceptable with a value low than 10%. The method also reveals existence of two water bearing in the considered area. The first is the schist or weathering aquifer (unsuitable), and the other is the marble or the fracturing aquifer (suitable). The final quasi 3D inversion results and geological models indicate proper sites for groundwaters prospecting and for mining exploitation, thus allowing the economic development of the study area.Keywords: electrical resistivity method, 1D inversion, quasi 3D inversion, groundwaters, mining
Procedia PDF Downloads 15529478 A Personality-Based Behavioral Analysis on eSports
Authors: Halkiopoulos Constantinos, Gkintoni Evgenia, Koutsopoulou Ioanna, Antonopoulou Hera
Abstract:
E-sports and e-gaming have emerged in recent years since the increase in internet use have become universal and e-gamers are the new reality in our homes. The excessive involvement of young adults with e-sports has already been revealed and the adverse consequences have been reported in researches in the past few years, but the issue has not been fully studied yet. The present research is conducted in Greece and studies the psychological profile of video game players and provides information on personality traits, habits and emotional status that affect online gamers’ behaviors in order to help professionals and policy makers address the problem. Three standardized self-report questionnaires were administered to participants who were young male and female adults aged from 19-26 years old. The Profile of Mood States (POMS) scale was used to evaluate people’s perceptions of their everyday life mood; the personality features that can trace back to people’s habits and anticipated reactions were measured by Eysenck Personality Questionnaire (EPQ), and the Trait Emotional Intelligence Questionnaire (TEIQue) was used to measure which cognitive (gamers’ beliefs) and emotional parameters (gamers’ emotional abilities) mainly affected/ predicted gamers’ behaviors and leisure time activities?/ gaming behaviors. Data mining techniques were used to analyze the data, which resulted in machine learning algorithms that were included in the software package R. The research findings attempt to designate the effect of personality traits, emotional status and emotional intelligence influence and correlation with e-sports, gamers’ behaviors and help policy makers and stakeholders take action, shape social policy and prevent the adverse consequences on young adults. The need for further research, prevention and treatment strategies is also addressed.Keywords: e-sports, e-gamers, personality traits, POMS, emotional intelligence, data mining, R
Procedia PDF Downloads 23129477 Event Extraction, Analysis, and Event Linking
Authors: Anam Alam, Rahim Jamaluddin Kanji
Abstract:
With the rapid growth of event in everywhere, event extraction has now become an important matter to retrieve the information from the unstructured data. One of the challenging problems is to extract the event from it. An event is an observable occurrence of interaction among entities. The paper investigates the effectiveness of event extraction capabilities of three software tools that are Wandora, Nitro and SPSS. We performed standard text mining techniques of these tools on the data sets of (i) Afghan War Diaries (AWD collection), (ii) MUC4 and (iii) WebKB. Information retrieval measures such as precision and recall which are computed under extensive set of experiments for Event Extraction. The experimental study analyzes the difference between events extracted by the software and human. This approach helps to construct an algorithm that will be applied for different machine learning methods.Keywords: event extraction, Wandora, nitro, SPSS, event analysis, extraction method, AFG, Afghan War Diaries, MUC4, 4 universities, dataset, algorithm, precision, recall, evaluation
Procedia PDF Downloads 59629476 Machine Learning Facing Behavioral Noise Problem in an Imbalanced Data Using One Side Behavioral Noise Reduction: Application to a Fraud Detection
Authors: Salma El Hajjami, Jamal Malki, Alain Bouju, Mohammed Berrada
Abstract:
With the expansion of machine learning and data mining in the context of Big Data analytics, the common problem that affects data is class imbalance. It refers to an imbalanced distribution of instances belonging to each class. This problem is present in many real world applications such as fraud detection, network intrusion detection, medical diagnostics, etc. In these cases, data instances labeled negatively are significantly more numerous than the instances labeled positively. When this difference is too large, the learning system may face difficulty when tackling this problem, since it is initially designed to work in relatively balanced class distribution scenarios. Another important problem, which usually accompanies these imbalanced data, is the overlapping instances between the two classes. It is commonly referred to as noise or overlapping data. In this article, we propose an approach called: One Side Behavioral Noise Reduction (OSBNR). This approach presents a way to deal with the problem of class imbalance in the presence of a high noise level. OSBNR is based on two steps. Firstly, a cluster analysis is applied to groups similar instances from the minority class into several behavior clusters. Secondly, we select and eliminate the instances of the majority class, considered as behavioral noise, which overlap with behavior clusters of the minority class. The results of experiments carried out on a representative public dataset confirm that the proposed approach is efficient for the treatment of class imbalances in the presence of noise.Keywords: machine learning, imbalanced data, data mining, big data
Procedia PDF Downloads 13029475 Hybrid Energy System for the German Mining Industry: An Optimized Model
Authors: Kateryna Zharan, Jan C. Bongaerts
Abstract:
In recent years, economic attractiveness of renewable energy (RE) for the mining industry, especially for off-grid mines, and a negative environmental impact of fossil energy are stimulating to use RE for mining needs. Being that remote area mines have higher energy expenses than mines connected to a grid, integration of RE may give a mine economic benefits. Regarding the literature review, there is a lack of business models for adopting of RE at mine. The main aim of this paper is to develop an optimized model of RE integration into the German mining industry (GMI). Hereby, the GMI with amount of around 800 mill. t. annually extracted resources is included in the list of the 15 major mining country in the world. Accordingly, the mining potential of Germany is evaluated in this paper as a perspective market for RE implementation. The GMI has been classified in order to find out the location of resources, quantity and types of the mines, amount of extracted resources, and access of the mines to the energy resources. Additionally, weather conditions have been analyzed in order to figure out where wind and solar generation technologies can be integrated into a mine with the highest efficiency. Despite the fact that the electricity demand of the GMI is almost completely covered by a grid connection, the hybrid energy system (HES) based on a mix of RE and fossil energy is developed due to show environmental and economic benefits. The HES for the GMI consolidates a combination of wind turbine, solar PV, battery and diesel generation. The model has been calculated using the HOMER software. Furthermore, the demonstrated HES contains a forecasting model that predicts solar and wind generation in advance. The main result from the HES such as CO2 emission reduction is estimated in order to make the mining processing more environmental friendly.Keywords: diesel generation, German mining industry, hybrid energy system, hybrid optimization model for electric renewables, optimized model, renewable energy
Procedia PDF Downloads 34329474 Composite Kernels for Public Emotion Recognition from Twitter
Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang
Abstract:
The Internet has grown into a powerful medium for information dispersion and social interaction that leads to a rapid growth of social media which allows users to easily post their emotions and perspectives regarding certain topics online. Our research aims at using natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates tree kernel with the linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation to analyze the syntactic and content information in tweets. The experiment results demonstrate that our method can effectively detect public emotion of tweets while outperforming the other compared methods.Keywords: emotion recognition, natural language processing, composite kernel, sentiment analysis, text mining
Procedia PDF Downloads 21829473 Multi-Source Data Fusion for Urban Comprehensive Management
Authors: Bolin Hua
Abstract:
In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflects different aspects of people, events and activities. Data generated from various systems are different in form and data source are different because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need fusion together. Data from different sources using different ways of collection raised several issues which need to be resolved. Problem of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of Multi-source data fusion is to form a uniform central database, which includes people data, location data, object data, and institution data, business data and space data. We need to use meta data to be referred to and read when application needs to access, manipulate and display the data. A uniform meta data management ensures effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data
Procedia PDF Downloads 39329472 Advanced Data Visualization Techniques for Effective Decision-making in Oil and Gas Exploration and Production
Authors: Deepak Singh, Rail Kuliev
Abstract:
This research article explores the significance of advanced data visualization techniques in enhancing decision-making processes within the oil and gas exploration and production domain. With the oil and gas industry facing numerous challenges, effective interpretation and analysis of vast and diverse datasets are crucial for optimizing exploration strategies, production operations, and risk assessment. The article highlights the importance of data visualization in managing big data, aiding the decision-making process, and facilitating communication with stakeholders. Various advanced data visualization techniques, including 3D visualization, augmented reality (AR), virtual reality (VR), interactive dashboards, and geospatial visualization, are discussed in detail, showcasing their applications and benefits in the oil and gas sector. The article presents case studies demonstrating the successful use of these techniques in optimizing well placement, real-time operations monitoring, and virtual reality training. Additionally, the article addresses the challenges of data integration and scalability, emphasizing the need for future developments in AI-driven visualization. In conclusion, this research emphasizes the immense potential of advanced data visualization in revolutionizing decision-making processes, fostering data-driven strategies, and promoting sustainable growth and improved operational efficiency within the oil and gas exploration and production industry.Keywords: augmented reality (AR), virtual reality (VR), interactive dashboards, real-time operations monitoring
Procedia PDF Downloads 8629471 Beyond Voluntary Corporate Social Responsibility: Examining the Impact of the New Mandatory Community Development Agreement in the Mining Sector of Sierra Leone
Authors: Wusu Conteh
Abstract:
Since the 1990s, neo-liberalization has become a global agenda. The free market ushered in an unprecedented drive by Multinational Corporations (MNCs) to secure mineral rights in resource-rich countries. Several governments in the Global South implemented a liberalized mining policy with support from the International Financial Institutions (IFIs). MNCs have maintained that voluntary Corporate Social Responsibility (CSR) has engendered socio-economic development in mining-affected communities. However, most resource-rich countries are struggling to transform the resources into sustainable socio-economic development. They are trapped in what has been widely described as the ‘resource curse.’ In an attempt to address this resource conundrum, the African Mining Vision (AMV) of 2009 developed a model on resource governance. The advent of the AMV has engendered the introduction of mandatory community development agreement (CDA) into the legal framework of many countries in Africa. In 2009, Sierra Leone enacted the Mines and Minerals Act that obligates mining companies to invest in Primary Host Communities. The study employs interviews and field observation techniques to explicate the dynamics of the CDA program. A total of 25 respondents -government officials, NGOs/CSOs and community stakeholders were interviewed. The study focuses on a case study of the Sierra Rutile CDA program in Sierra Leone. Extant scholarly works have extensively explored the resource curse and voluntary CSR. There are limited studies to uncover the mandatory CDA and its impact on socio-economic development in mining-affected communities. Thus, the purpose of this study is to explicate the impact of the CDA in Sierra Leone. Using the theory of change helps to understand how the availability of mandatory funds can empower communities to take an active part in decision making related to the development of the communities. The results show that the CDA has engendered a predictable fund for community development. It has also empowered ordinary members of the community to determine the development program. However, the CDA has created a new ground for contestations between the pre-existing local governance structure (traditional authority) and the newly created community development committee (CDC) that is headed by an ordinary member of the community.Keywords: community development agreement, impact, mandatory, participation
Procedia PDF Downloads 12329470 Use of Multivariate Statistical Techniques for Water Quality Monitoring Network Assessment, Case of Study: Jequetepeque River Basin
Authors: Jose Flores, Nadia Gamboa
Abstract:
A proper water quality management requires the establishment of a monitoring network. Therefore, evaluation of the efficiency of water quality monitoring networks is needed to ensure high-quality data collection of critical quality chemical parameters. Unfortunately, in some Latin American countries water quality monitoring programs are not sustainable in terms of recording historical data or environmentally representative sites wasting time, money and valuable information. In this study, multivariate statistical techniques, such as principal components analysis (PCA) and hierarchical cluster analysis (HCA), are applied for identifying the most significant monitoring sites as well as critical water quality parameters in the monitoring network of the Jequetepeque River basin, in northern Peru. The Jequetepeque River basin, like others in Peru, shows socio-environmental conflicts due to economical activities developed in this area. Water pollution by trace elements in the upper part of the basin is mainly related with mining activity, and agricultural land lost due to salinization is caused by the extensive use of groundwater in the lower part of the basin. Since the 1980s, the water quality in the basin has been non-continuously assessed by public and private organizations, and recently the National Water Authority had established permanent water quality networks in 45 basins in Peru. Despite many countries use multivariate statistical techniques for assessing water quality monitoring networks, those instruments have never been applied for that purpose in Peru. For this reason, the main contribution of this study is to demonstrate that application of the multivariate statistical techniques could serve as an instrument that allows the optimization of monitoring networks using least number of monitoring sites as well as the most significant water quality parameters, which would reduce costs concerns and improve the water quality management in Peru. Main socio-economical activities developed and the principal stakeholders related to the water management in the basin are also identified. Finally, water quality management programs will also be discussed in terms of their efficiency and sustainability.Keywords: PCA, HCA, Jequetepeque, multivariate statistical
Procedia PDF Downloads 35529469 Geographic Information System for Simulating Air Traffic By Applying Different Multi-Radar Positioning Techniques
Authors: Amara Rafik, Mostefa Belhadj Aissa
Abstract:
Radar data is one of the many data sources used by ATM Air Traffic Management systems. These data come from air navigation radar antennas. These radars intercept signals emitted by the various aircraft crossing the controlled airspace and calculate the position of these aircraft and retransmit their positions to the Air Traffic Management System. For greater reliability, these radars are positioned in such a way as to allow their coverage areas to overlap. An aircraft will therefore be detected by at least one of these radars. However, the position coordinates of the same aircraft and sent by these different radars are not necessarily identical. Therefore, the ATM system must calculate a single position (radar track) which will ultimately be sent to the control position and displayed on the air traffic controller's monitor. There are several techniques for calculating the radar track. Furthermore, the geographical nature of the problem requires the use of a Geographic Information System (GIS), i.e. a geographical database on the one hand and geographical processing. The objective of this work is to propose a GIS for traffic simulation which reconstructs the evolution over time of aircraft positions from a multi-source radar data set and by applying these different techniques.Keywords: ATM, GIS, radar data, simulation
Procedia PDF Downloads 11829468 Voice of Customer: Mining Customers' Reviews on On-Line Car Community
Authors: Kim Dongwon, Yu Songjin
Abstract:
This study identifies the business value of VOC (Voice of Customer) on the business. Precisely, we intend to demonstrate how much negative and positive sentiment of VOC has an influence on car sales market share in the unites states. We extract 7 emotions such as sadness, shame, anger, fear, frustration, delight and satisfaction from the VOC data, 23,204 pieces of opinions, that had been posted on car-related on-line community from 2007 to 2009(a part of data collection from 2007 to 2015), and intend to clarify the correlation between negative and positive sentimental keywords and contribution to market share. In order to develop a lexicon for each category of negative and positive sentiment, we took advantage of Corpus program, Antconc 3.4.1.w and on-line sentimental data, SentiWordNet and identified the part of speech(POS) information of words in the customers' opinion by using a part-of-speech tagging function provided by TextAnalysisOnline. For the purpose of this present study, a total of 45,741 pieces of customers' opinions of 28 car manufacturing companies had been collected including titles and status information. We conducted an experiment to examine whether the inclusion, frequency and intensity of terms with negative and positive emotions in each category affect the adoption of customer opinions for vehicle organizations' market share. In the experiment, we statistically verified that there is correlation between customer ideas containing negative and positive emotions and variation of marker share. Particularly, "Anger," a domain of negative domains, is significantly influential to car sales market share. The domain "Delight" and "Satisfaction" increased in proportion to growth of market share.Keywords: data mining, opinion mining, sentiment analysis, VOC
Procedia PDF Downloads 21429467 Study of the Stability of Underground Mines by Numerical Method: The Mine Chaabet El Hamra, Algeria
Authors: Nakache Radouane, M. Boukelloul, M. Fredj
Abstract:
Method room and pillar sizes are key factors for safe mining and their recovery in open-stop mining. This method is advantageous due to its simplicity and requirement of little information to be used. It is probably the most representative method among the total load approach methods although it also remains a safe design method. Using a finite element software (PLAXIS 3D), analyses were carried out with an elasto-plastic model and comparisons were made with methods based on the total load approach. The results were presented as the optimization for improving the ore recovery rate while maintaining a safe working environment.Keywords: room and pillar, mining, total load approach, elasto-plastic
Procedia PDF Downloads 33029466 Literature Review on Text Comparison Techniques: Analysis of Text Extraction, Main Comparison and Visual Representation Tools
Authors: Andriana Mkrtchyan, Vahe Khlghatyan
Abstract:
The choice of a profession is one of the most important decisions people make throughout their life. With the development of modern science, technologies, and all the spheres existing in the modern world, more and more professions are being arisen that complicate even more the process of choosing. Hence, there is a need for a guiding platform to help people to choose a profession and the right career path based on their interests, skills, and personality. This review aims at analyzing existing methods of comparing PDF format documents and suggests that a 3-stage approach is implemented for the comparison, that is – 1. text extraction from PDF format documents, 2. comparison of the extracted text via NLP algorithms, 3. comparison representation using special shape and color psychology methodology.Keywords: color psychology, data acquisition/extraction, data augmentation, disambiguation, natural language processing, outlier detection, semantic similarity, text-mining, user evaluation, visual search
Procedia PDF Downloads 76