Search results for: mining methods
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 15424

Search results for: mining methods

15184 Comparison Of Data Mining Models To Predict Future Bridge Conditions

Authors: Pablo Martinez, Emad Mohamed, Osama Mohsen, Yasser Mohamed

Abstract:

Highway and bridge agencies, such as the Ministry of Transportation in Ontario, use the Bridge Condition Index (BCI) which is defined as the weighted condition of all bridge elements to determine the rehabilitation priorities for its bridges. Therefore, accurate forecasting of BCI is essential for bridge rehabilitation budgeting planning. The large amount of data available in regard to bridge conditions for several years dictate utilizing traditional mathematical models as infeasible analysis methods. This research study focuses on investigating different classification models that are developed to predict the bridge condition index in the province of Ontario, Canada based on the publicly available data for 2800 bridges over a period of more than 10 years. The data preparation is a key factor to develop acceptable classification models even with the simplest one, the k-NN model. All the models were tested, compared and statistically validated via cross validation and t-test. A simple k-NN model showed reasonable results (within 0.5% relative error) when predicting the bridge condition in an incoming year.

Keywords: asset management, bridge condition index, data mining, forecasting, infrastructure, knowledge discovery in databases, maintenance, predictive models

Procedia PDF Downloads 164
15183 Assessing Lithium Recovery from Secondary Sources

Authors: Carolina A. Santos, Alexandra B. Ribeiro

Abstract:

Climate change and environmental degradation are threats to humanity. Europe has been addressing these problems, namely through the Green Deal, with the use of batteries in mobility and energy fields. However, these require the use of critical raw materials, like lithium, which demand is estimated to grow 60 times in the next 30 years. Thus, it is fundamental to promote a circular economy with lithium recovery from secondary resources. These are nowadays key topics, which will be even more relevant in the future, so a new way to approach them is needed and must be encouraged. Therefore, one of our main goals is to analyse two methods of lithium retrieval from secondary sources, bioleaching, and electrodialysis, and assess them regarding their sustainability. The latest results show good efficiency of removal with both methods, even though there are some matrix interferences. Hence, further investment and research are needed in order to make this process sustainable and our society more circular.

Keywords: lithium, sustainable mining, social license to operate, bioleaching, electrodialysis

Procedia PDF Downloads 97
15182 Mining in Peru and Local Governance: Assessing the Contribution of CRS Projects

Authors: Sandra Carrillo Hoyos

Abstract:

Mining activities in South America have significantly grown during the last decades, given the abundance of natural resources, the implemented governmental policies to incentivize foreign investment as well as the boom in international prices for metals and oil between 2002 and 2008. While this context allowed the region to occupy a leading position between the top producers of minerals around the world, it has also meant an increase in socio-environmental conflicts which have generated costs and negative impacts not only for the companies but especially for the governments and local communities.During the latest decade, the mining sector in Peru has faced with the social resistance of a large number of communities, which began organizing actions against the implementation of high investing projects. The dissatisfaction has derived in the prevalence of socio-environmental conflicts associated with mining activities, some of them never solved into an agreement. In order to prevent those socio-environmental conflicts and obtain the social license from local communities, most of the mining companies have developed diverse initiatives within the framework of policies and practices of corporate social responsibility (CSR). This paper has assessed the mining sector’s contribution toward the local development management along the last decade, as part of CSR strategies as well as the policies promoted by the Peruvian State. This assessment found that, in the beginning, these initiatives have been based on a philanthropic approach and were reacting to pressures from local stakeholders to maintain the consent to operate from the surrounding communities as well as to create, as a result, a harmonious atmosphere for operations. Due to the weak State presence, such practices have increased the expectations of communities related to the participation of mining companies in solving structural development problems, especially those related to primary needs, infrastructure, education, health, among others. In other words, this paper was focused on analyze in what extent these initiatives have promoted local empowerment for development planning and integrated management of natural resources from a territorial approach. From this perspective, the analysis demonstrates that, while the design and planning of social investment initiatives have improved due to the sector´s sustainability approach, many companies have developed actions beyond their competence during this process. In some cases, the referenced actions have generated dependency with communities, even though this relationship has not exempted the companies of conflict situations with unfortunate consequences. Furthermore, the social programs developed have not necessarily generated a significant impact in improving the quality of life of affected populations. In fact, it is possible to identify that those regions with high mining resources and investment are facing with a situation of poverty and high dependency on mining production. In spite of the revenues derived from mining industry, local governments have not been able to translate the royalties into sustainable development opportunities. For this reason, the proposed paper suggests some challenges for the mining sector contribution to local development based on the best practices and lessons learnt from a benchmarking for the leading mining companies.

Keywords: corporate social responsibility, local development, mining, socio-environmental conflict

Procedia PDF Downloads 369
15181 Lead and Cadmium Spatial Pattern and Risk Assessment around Coal Mine in Hyrcanian Forest, North Iran

Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch

Abstract:

In this study, the effect of coal mining activities on lead and cadmium concentrations and distribution in soil was investigated in Hyrcanian forest, North Iran. 16 plots (20×20 m2) were established by systematic-randomly (60×60 m2) in an area of 4 ha (200×200 m2-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity; considered as the controlled area. In order to investigate soil lead and cadmium concentration, one sample was taken from the 0-10 cm in each plot. To study the spatial pattern of soil properties and lead and cadmium concentrations in the mining area, an area of 80×80m2 (the mine as the center) was considered and 80 soil samples were systematic-randomly taken (10 m intervals). Geostatistical analysis was performed via Kriging method and GS+ software (version 5.1). In order to estimate the impact of coal mining activities on soil quality, pollution index was measured. Lead and cadmium concentrations were significantly higher in mine area (Pb: 10.97±0.30, Cd: 184.47±6.26 mg.kg-1) in comparison to control area (Pb: 9.42±0.17, Cd: 131.71±15.77 mg.kg-1). The mean values of the PI index indicate that Pb (1.16) and Cd (1.77) presented slightly polluted. Results of the NIPI index showed that Pb (1.44) and Cd (2.52) presented slight pollution and moderate pollution respectively. Results of variography and kriging method showed that it is possible to prepare interpolation maps of lead and cadmium around the mining areas in Hyrcanian forest. According to results of pollution and risk assessments, forest soil was contaminated by heavy metals (lead and cadmium); therefore, using reclamation and remediation techniques in these areas is necessary.

Keywords: traditional coal mining, heavy metals, pollution indicators, geostatistics, Caspian forest

Procedia PDF Downloads 143
15180 Study and Analysis of the Factors Affecting Road Safety Using Decision Tree Algorithms

Authors: Naina Mahajan, Bikram Pal Kaur

Abstract:

The purpose of traffic accident analysis is to find the possible causes of an accident. Road accidents cannot be totally prevented but by suitable traffic engineering and management the accident rate can be reduced to a certain extent. This paper discusses the classification techniques C4.5 and ID3 using the WEKA Data mining tool. These techniques use on the NH (National highway) dataset. With the C4.5 and ID3 technique it gives best results and high accuracy with less computation time and error rate.

Keywords: C4.5, ID3, NH(National highway), WEKA data mining tool

Procedia PDF Downloads 305
15179 Phillips Curve Estimation in an Emerging Economy: Evidence from Sub-National Data of Indonesia

Authors: Harry Aginta

Abstract:

Using Phillips curve framework, this paper seeks for new empirical evidence on the relationship between inflation and output in a major emerging economy. By exploiting sub-national data, the contribution of this paper is threefold. First, it resolves the issue of using on-target national inflation rates that potentially causes weakening inflation-output nexus. This is very relevant for Indonesia as its central bank has been adopting inflation targeting framework based on national consumer price index (CPI) inflation. Second, the study tests the relevance of mining sector in output gap estimation. The test for mining sector is important to control for the effects of mining regulation and nominal effects of coal prices on real economic activities. Third, the paper applies panel econometric method by incorporating regional variation that help to improve model estimation. The results from this paper confirm the strong presence of Phillips curve in Indonesia. Positive output gap that reflects excess demand condition gives rise to the inflation rates. In addition, the elasticity of output gap is higher if the mining sector is excluded from output gap estimation. In addition to inflation adaptation, the dynamics of exchange rate and international commodity price are also found to affect inflation significantly. The results are robust to the alternative measurement of output gap

Keywords: Phillips curve, inflation, Indonesia, panel data

Procedia PDF Downloads 95
15178 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarship. Tree-based data mining classification technique is used in other to determine the generic rule to be used to disburse the scholarship. The system based on the defined rules from the tree is able to determine the class (status) to which an applicant shall belong whether Granted or Not Granted. The applicants that fall to the class of granted denote a successful acquirement of scholarship while those in not granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from tree-based classification was also developed. The tree-based classification is adopted because of its efficiency, effectiveness, and easy to comprehend features. The system was tested with the data of National Information Technology Development Agency (NITDA) Abuja, a Parastatal of Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found working according to the specification. It is therefore recommended for all scholarship disbursement organizations.

Keywords: classification, data mining, decision tree, scholarship

Procedia PDF Downloads 344
15177 The Use of Classifiers in Image Analysis of Oil Wells Profiling Process and the Automatic Identification of Events

Authors: Jaqueline Maria Ribeiro Vieira

Abstract:

Different strategies and tools are available at the oil and gas industry for detecting and analyzing tension and possible fractures in borehole walls. Most of these techniques are based on manual observation of the captured borehole images. While this strategy may be possible and convenient with small images and few data, it may become difficult and suitable to errors when big databases of images must be treated. While the patterns may differ among the image area, depending on many characteristics (drilling strategy, rock components, rock strength, etc.). Previously we developed and proposed a novel strategy capable of detecting patterns at borehole images that may point to regions that have tension and breakout characteristics, based on segmented images. In this work we propose the inclusion of data-mining classification strategies in order to create a knowledge database of the segmented curves. These classifiers allow that, after some time using and manually pointing parts of borehole images that correspond to tension regions and breakout areas, the system will indicate and suggest automatically new candidate regions, with higher accuracy. We suggest the use of different classifiers methods, in order to achieve different knowledge data set configurations.

Keywords: image segmentation, oil well visualization, classifiers, data-mining, visual computer

Procedia PDF Downloads 272
15176 Assessing Carbon Stock and Sequestration of Reforestation Species on Old Mining Sites in Morocco Using the DNDC Model

Authors: Nabil Elkhatri, Mohamed Louay Metougui, Ngonidzashe Chirinda

Abstract:

Mining activities have left a legacy of degraded landscapes, prompting urgent efforts for ecological restoration. Reforestation holds promise as a potent tool to rehabilitate these old mining sites, with the potential to sequester carbon and contribute to climate change mitigation. This study focuses on evaluating the carbon stock and sequestration potential of reforestation species in the context of Morocco's mining areas, employing the DeNitrification-DeComposition (DNDC) model. The research is grounded in recognizing the need to connect theoretical models with practical implementation, ensuring that reforestation efforts are informed by accurate and context-specific data. Field data collection encompasses growth patterns, biomass accumulation, and carbon sequestration rates, establishing an empirical foundation for the study's analyses. By integrating the collected data with the DNDC model, the study aims to provide a comprehensive understanding of carbon dynamics within reforested ecosystems on old mining sites. The major findings reveal varying sequestration rates among different reforestation species, indicating the potential for species-specific optimization of reforestation strategies to enhance carbon capture. This research's significance lies in its potential to contribute to sustainable land management practices and climate change mitigation strategies. By quantifying the carbon stock and sequestration potential of reforestation species, the study serves as a valuable resource for policymakers, land managers, and practitioners involved in ecological restoration and carbon management. Ultimately, the study aligns with global objectives to rejuvenate degraded landscapes while addressing pressing climate challenges.

Keywords: carbon stock, carbon sequestration, DNDC model, ecological restoration, mining sites, Morocco, reforestation, sustainable land management.

Procedia PDF Downloads 36
15175 Increasing the Capacity of Plant Bottlenecks by Using of Improving the Ratio of Mean Time between Failures to Mean Time to Repair

Authors: Jalal Soleimannejad, Mohammad Asadizeidabadi, Mahmoud Koorki, Mojtaba Azarpira

Abstract:

A significant percentage of production costs is the maintenance costs, and analysis of maintenance costs could to achieve greater productivity and competitiveness. With this is mind, the maintenance of machines and installations is considered as an essential part of organizational functions and applying effective strategies causes significant added value in manufacturing activities. Organizations are trying to achieve performance levels on a global scale with emphasis on creating competitive advantage by different methods consist of RCM (Reliability-Center-Maintenance), TPM (Total Productivity Maintenance) etc. In this study, increasing the capacity of Concentration Plant of Golgohar Iron Ore Mining & Industrial Company (GEG) was examined by using of reliability and maintainability analyses. The results of this research showed that instead of increasing the number of machines (in order to solve the bottleneck problems), the improving of reliability and maintainability would solve bottleneck problems in the best way. It should be mention that in the abovementioned study, the data set of Concentration Plant of GEG as a case study, was applied and analyzed.

Keywords: bottleneck, golgohar iron ore mining & industrial company, maintainability, maintenance costs, reliability

Procedia PDF Downloads 322
15174 Application of Advanced Remote Sensing Data in Mineral Exploration in the Vicinity of Heavy Dense Forest Cover Area of Jharkhand and Odisha State Mining Area

Authors: Hemant Kumar, R. N. K. Sharma, A. P. Krishna

Abstract:

The study has been carried out on the Saranda in Jharkhand and a part of Odisha state. Geospatial data of Hyperion, a remote sensing satellite, have been used. This study has used a wide variety of patterns related to image processing to enhance and extract the mining class of Fe and Mn ores.Landsat-8, OLI sensor data have also been used to correctly explore related minerals. In this way, various processes have been applied to increase the mineralogy class and comparative evaluation with related frequency done. The Hyperion dataset for hyperspectral remote sensing has been specifically verified as an effective tool for mineral or rock information extraction within the band range of shortwave infrared used. The abundant spatial and spectral information contained in hyperspectral images enables the differentiation of different objects of any object into targeted applications for exploration such as exploration detection, mining.

Keywords: Hyperion, hyperspectral, sensor, Landsat-8

Procedia PDF Downloads 93
15173 Heritage Value and Industrial Tourism Potential of the Urals, Russia

Authors: Anatoly V. Stepanov, Maria Y. Ilyushkina, Alexander S. Burnasov

Abstract:

Expansion of tourism, especially after WWII, has led to significant improvements in the regional infrastructure. The present study has revealed a lot of progress in the advancement of industrial heritage narrative in the Central Urals. The evidence comes from the general public’s increased fascination with some of Europe’s oldest mining and industrial sites, and the agreement of many stakeholders that the Urals industrial heritage should be preserved. The development of tourist sites in Nizhny Tagil and Nevyansk, gold-digging in Beryosovsky, gemstone search in Murzinka, and the progress with the Urals Gemstone Ring project are the examples showing the immense opportunities of industrial heritage tourism development in the region that are still to be realized. Regardless of the economic future of the Central Urals, whether it will remain an industrial region or experience a deeper deindustrialization, the sprouts of the industrial heritage tourism should be advanced and amplified for the benefit of local communities and the tourist community at large as it is hard to imagine a more suitable site for the discovery of industrial and mining heritage than the Central Urals Region of Russia.

Keywords: industrial heritage, mining heritage, Central Urals, Russia

Procedia PDF Downloads 99
15172 Using Data Mining Techniques to Evaluate the Different Factors Affecting the Academic Performance of Students at the Faculty of Information Technology in Hashemite University in Jordan

Authors: Feras Hanandeh, Majdi Shannag

Abstract:

This research studies the different factors that could affect the Faculty of Information Technology in Hashemite University students’ accumulative average. The research paper verifies the student information, background, their academic records, and how this information will affect the student to get high grades. The student information used in the study is extracted from the student’s academic records. The data mining tools and techniques are used to decide which attribute(s) will affect the student’s accumulative average. The results show that the most important factor which affects the students’ accumulative average is the student Acceptance Type. And we built a decision tree model and rules to determine how the student can get high grades in their courses. The overall accuracy of the model is 44% which is accepted rate.

Keywords: data mining, classification, extracting rules, decision tree

Procedia PDF Downloads 383
15171 A Clustering-Based Approach for Weblog Data Cleaning

Authors: Amine Ganibardi, Cherif Arab Ali

Abstract:

This paper addresses the data cleaning issue as a part of web usage data preprocessing within the scope of Web Usage Mining. Weblog data recorded by web servers within log files reflect usage activity, i.e., End-users’ clicks and underlying user-agents’ hits. As Web Usage Mining is interested in End-users’ behavior, user-agents’ hits are referred to as noise to be cleaned-off before mining. Filtering hits from clicks is not trivial for two reasons, i.e., a server records requests interlaced in sequential order regardless of their source or type, website resources may be set up as requestable interchangeably by end-users and user-agents. The current methods are content-centric based on filtering heuristics of relevant/irrelevant items in terms of some cleaning attributes, i.e., website’s resources filetype extensions, website’s resources pointed by hyperlinks/URIs, http methods, user-agents, etc. These methods need exhaustive extra-weblog data and prior knowledge on the relevant and/or irrelevant items to be assumed as clicks or hits within the filtering heuristics. Such methods are not appropriate for dynamic/responsive Web for three reasons, i.e., resources may be set up to as clickable by end-users regardless of their type, website’s resources are indexed by frame names without filetype extensions, web contents are generated and cancelled differently from an end-user to another. In order to overcome these constraints, a clustering-based cleaning method centered on the logging structure is proposed. This method focuses on the statistical properties of the logging structure at the requested and referring resources attributes levels. It is insensitive to logging content and does not need extra-weblog data. The used statistical property takes on the structure of the generated logging feature by webpage requests in terms of clicks and hits. Since a webpage consists of its single URI and several components, these feature results in a single click to multiple hits ratio in terms of the requested and referring resources. Thus, the clustering-based method is meant to identify two clusters based on the application of the appropriate distance to the frequency matrix of the requested and referring resources levels. As the ratio clicks to hits is single to multiple, the clicks’ cluster is the smallest one in requests number. Hierarchical Agglomerative Clustering based on a pairwise distance (Gower) and average linkage has been applied to four logfiles of dynamic/responsive websites whose click to hits ratio range from 1/2 to 1/15. The optimal clustering set on the basis of average linkage and maximum inter-cluster inertia results always in two clusters. The evaluation of the smallest cluster referred to as clicks cluster under the terms of confusion matrix indicators results in 97% of true positive rate. The content-centric cleaning methods, i.e., conventional and advanced cleaning, resulted in a lower rate 91%. Thus, the proposed clustering-based cleaning outperforms the content-centric methods within dynamic and responsive web design without the need of any extra-weblog. Such an improvement in cleaning quality is likely to refine dependent analysis.

Keywords: clustering approach, data cleaning, data preprocessing, weblog data, web usage data

Procedia PDF Downloads 153
15170 Relay Mining: Verifiable Multi-Tenant Distributed Rate Limiting

Authors: Daniel Olshansky, Ramiro Rodrıguez Colmeiro

Abstract:

Relay Mining presents a scalable solution employing probabilistic mechanisms and crypto-economic incentives to estimate RPC volume usage, facilitating decentralized multitenant rate limiting. Network traffic from individual applications can be concurrently serviced by multiple RPC service providers, with costs, rewards, and rate limiting governed by a native cryptocurrency on a distributed ledger. Building upon established research in token bucket algorithms and distributed rate-limiting penalty models, our approach harnesses a feedback loop control mechanism to adjust the difficulty of mining relay rewards, dynamically scaling with network usage growth. By leveraging crypto-economic incentives, we reduce coordination overhead costs and introduce a mechanism for providing RPC services that are both geopolitically and geographically distributed.

Keywords: remote procedure call, crypto-economic, commit-reveal, decentralization, scalability, blockchain, rate limiting, token bucket

Procedia PDF Downloads 27
15169 Ecological Risk Aspects of Essential Trace Metals in Soil Derived From Gold Mining Region, South Africa

Authors: Lowanika Victor Tibane, David Mamba

Abstract:

Human body, animals, and plants depend on certain essential metals in permissible quantities for their survival. Excessive metal concentration may cause severe malfunctioning of the organisms and even fatal in extreme cases. Because of gold mining in the Witwatersrand basin in South Africa, enormous untreated mine dumps comprise elevated concentration of essential trace elements. Elevated quantities of trace metal have direct negative impact on the quality of soil for different land use types, reduce soil efficiency for plant growth, and affect the health human and animals. A total of 21 subsoil samples were examined using inductively coupled plasma optical emission spectrometry and X-ray fluorescence methods and the results elevated men concentration of Fe (36,433.39) > S (5,071.83) > Cu (1,717,28) > Mn (612.81) > Cr (74.52) > Zn (68.67) > Ni (40.44) > Co (9.63) > P (3.49) > Mo > (2.74), reported in mg/kg. Using various contamination indices, it was discovered that the sites surveyed are on average moderately contaminated with Co, Cr, Cu, Mn, Ni, S, and Zn. The ecological risk assessment revealed a low ecological risk for Cr, Ni and Zn, whereas Cu poses a very high ecological risk.

Keywords: essential trace elements, soil contamination, contamination indices, toxicity, descriptive statistics, ecological risk evaluation

Procedia PDF Downloads 62
15168 Data Mining Approach: Classification Model Evaluation

Authors: Lubabatu Sada Sodangi

Abstract:

The rapid growth in exchange and accessibility of information via the internet makes many organisations acquire data on their own operation. The aim of data mining is to analyse the different behaviour of a dataset using observation. Although, the subset of the dataset being analysed may not display all the behaviours and relationships of the entire data and, therefore, may not represent other parts that exist in the dataset. There is a range of techniques used in data mining to determine the hidden or unknown information in datasets. In this paper, the performance of two algorithms Chi-Square Automatic Interaction Detection (CHAID) and multilayer perceptron (MLP) would be matched using an Adult dataset to find out the percentage of an/the adults that earn > 50k and those that earn <= 50k per year. The two algorithms were studied and compared using IBM SPSS statistics software. The result for CHAID shows that the most important predictors are relationship and education. The algorithm shows that those are married (husband) and have qualification: Bachelor, Masters, Doctorate or Prof-school whose their age is > 41<57 earn > 50k. Also, multilayer perceptron displays marital status and capital gain as the most important predictors of the income. It also shows that individuals that their capital gain is less than 6,849 and are single, separated or widow, earn <= 50K, whereas individuals with their capital gain is > 6,849, work > 35 hrs/wk, and > 27yrs their income will be > 50k. By comparing the two algorithms, it is observed that both algorithms are reliable but there is strong reliability in CHAID which clearly shows that relation and education contribute to the prediction as displayed in the data visualisation.

Keywords: data mining, CHAID, multi-layer perceptron, SPSS, Adult dataset

Procedia PDF Downloads 355
15167 On Exploring Search Heuristics for improving the efficiency in Web Information Extraction

Authors: Patricia Jiménez, Rafael Corchuelo

Abstract:

Nowadays the World Wide Web is the most popular source of information that relies on billions of on-line documents. Web mining is used to crawl through these documents, collect the information of interest and process it by applying data mining tools in order to use the gathered information in the best interest of a business, what enables companies to promote theirs. Unfortunately, it is not easy to extract the information a web site provides automatically when it lacks an API that allows to transform the user-friendly data provided in web documents into a structured format that is machine-readable. Rule-based information extractors are the tools intended to extract the information of interest automatically and offer it in a structured format that allow mining tools to process it. However, the performance of an information extractor strongly depends on the search heuristic employed since bad choices regarding how to learn a rule may easily result in loss of effectiveness and/or efficiency. Improving search heuristics regarding efficiency is of uttermost importance in the field of Web Information Extraction since typical datasets are very large. In this paper, we employ an information extractor based on a classical top-down algorithm that uses the so-called Information Gain heuristic introduced by Quinlan and Cameron-Jones. Unfortunately, the Information Gain relies on some well-known problems so we analyse an intuitive alternative, Termini, that is clearly more efficient; we also analyse other proposals in the literature and conclude that none of them outperforms the previous alternative.

Keywords: information extraction, search heuristics, semi-structured documents, web mining.

Procedia PDF Downloads 307
15166 Automatic Lead Qualification with Opinion Mining in Customer Relationship Management Projects

Authors: Victor Radich, Tania Basso, Regina Moraes

Abstract:

Lead qualification is one of the main procedures in Customer Relationship Management (CRM) projects. Its main goal is to identify potential consumers who have the ideal characteristics to establish a profitable and long-term relationship with a certain organization. Social networks can be an important source of data for identifying and qualifying leads since interest in specific products or services can be identified from the users’ expressed feelings of (dis)satisfaction. In this context, this work proposes the use of machine learning techniques and sentiment analysis as an extra step in the lead qualification process in order to improve it. In addition to machine learning models, sentiment analysis or opinion mining can be used to understand the evaluation that the user makes of a particular service, product, or brand. The results obtained so far have shown that it is possible to extract data from social networks and combine the techniques for a more complete classification.

Keywords: lead qualification, sentiment analysis, opinion mining, machine learning, CRM, lead scoring

Procedia PDF Downloads 43
15165 Characterization of Tailings From Traditional Panning of Alluvial Gold Ore (A Case Study of Ilesa - Southwestern Nigeria Goldfield Tailings Dumps)

Authors: Olaniyi Awe, Adelana R. Adetunji, Abraham Adeleke

Abstract:

Field observation revealed a lot of artisanal gold mining activities in Ilesa gold belt of southwestern Nigeria. The possibility of alluvial and lode gold deposits in commercial quantities around this location is very high, as there are many resident artisanal gold miners who have been mining and trading alluvial gold ore for decades and to date in the area. Their major process of solid gold recovery from its ore is by gravity concentration using the convectional panning method. This method is simple to learn and fast to recover gold from its alluvial ore, but its effectiveness is based on rules of thumb and the artisanal miners' experience in handling gold ore panning tool while processing the ore. Research samples from five alluvial gold ore tailings dumps were collected and studied. Samples were subjected to particle size analysis and mineralogical and elemental characterization using X-Ray Diffraction (XRD) and Particle-Induced X-ray Emission (PIXE) methods, respectively. The results showed that the tailings were of major quartz in association with albite, plagioclase, mica, gold, calcite and sulphide minerals. The elemental composition analysis revealed a 15ppm of gold concentration in particle size fraction of -90 microns in one of the tailings dumps investigated. These results are significant. It is recommended that heaps of panning tailings should be further reprocessed using other gold recovery methods such as shaking tables, flotation and controlled cyanidation that can efficiently recover fine gold particles that were previously lost into the gold panning tailings. The tailings site should also be well controlled and monitored so that these heavy minerals do not find their way into surrounding water streams and rivers, thereby causing health hazards.

Keywords: gold ore, panning, PIXE, tailings, XRD

Procedia PDF Downloads 54
15164 A Method for Reduction of Association Rules in Data Mining

Authors: Diego De Castro Rodrigues, Marcelo Lisboa Rocha, Daniela M. De Q. Trevisan, Marcos Dias Da Conceicao, Gabriel Rosa, Rommel M. Barbosa

Abstract:

The use of association rules algorithms within data mining is recognized as being of great value in the knowledge discovery in databases. Very often, the number of rules generated is high, sometimes even in databases with small volume, so the success in the analysis of results can be hampered by this quantity. The purpose of this research is to present a method for reducing the quantity of rules generated with association algorithms. Therefore, a computational algorithm was developed with the use of a Weka Application Programming Interface, which allows the execution of the method on different types of databases. After the development, tests were carried out on three types of databases: synthetic, model, and real. Efficient results were obtained in reducing the number of rules, where the worst case presented a gain of more than 50%, considering the concepts of support, confidence, and lift as measures. This study concluded that the proposed model is feasible and quite interesting, contributing to the analysis of the results of association rules generated from the use of algorithms.

Keywords: data mining, association rules, rules reduction, artificial intelligence

Procedia PDF Downloads 129
15163 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 161
15162 The Affective Motivation of Women Miners in Ghana

Authors: Adesuwa Omorede, Rufai Haruna Kilu

Abstract:

Affective motivation (motivation that is emotionally laden usually related to affect, passion, emotions, moods) in the workplace stimulates individuals to reinforce, persist and commit to their task, which leads to the individual and organizational performance. This leads individuals to reach goals especially in situations where task are highly challenging and hostile. In such situations, individuals are more disposed to be more creative, innovative and see new opportunities from the loopholes in their workplace. However, when individuals feel displaced and less important, an adverse reaction may suffice which may be detrimental to the organization and its performance. One sector where affective motivation is eminently present and relevant, is the mining industry. Due to its intense work environment; mostly dominated by men and masculinity cultures; and deliberate exclusion of women in this environment which, makes the women working in these environments to feel marginalized. In Ghana, the mining industry is mostly seen as a very physical environment especially underground and mostly considerd as 'no place for a woman'. Despite the fact that these women feel less 'needed' or 'appreciated' in such environments, they still have to juggle between intense work shifts; face violence and other health risks with their families, which put a strain on their affective motivational reaction. Beyond these challenges, however, several mining companies in Ghana today are working towards providing a fair and equal working situation for both men and women miners, by recognizing them as key stakeholders, as well as including them in the stages of mining projects from the planning and designing phase to the evaluation and implementation stage. Drawing from the psychology and gender literature, this study takes a narrative approach to identify and understand the shifting gender dynamics within the mine works in Ghana, occasioning a change in background disposition of miners, which leads to more women taking up mine jobs in the country. In doing so, a qualitative study was conducted using semi-structured interviews from Ghana. Several women working within the mining industries in Ghana shared their experiences and how they felt and still feel in their workplace. In addition, archival documents were gathered to support the findings. The results suggest a change in enrolment regimes in a mining and technology university in Ghana, making room for a more gender equal enrolments in the university. A renowned university that train and feed mine work professional into the industry. The results further acknowledge gender equal and diversity recruitment policies and initiatives among the mining companies of Ghana. This study contributes to the psychology and gender literature by highlighting the hindrances women face in the mining industry as well as highlighting several of their affective reactions towards gender inequality. The study also provides several suggestions for decision makers in the mining industry of what can be done in the future to reduce the gender inequality gap within the industry.

Keywords: affective motivation, gender shape shifting, mining industry, women miners

Procedia PDF Downloads 269
15161 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 232
15160 Educational Data Mining: The Case of the Department of Mathematics and Computing in the Period 2009-2018

Authors: Mário Ernesto Sitoe, Orlando Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.

Keywords: evasion and retention, cross-validation, bagging, stacking

Procedia PDF Downloads 55
15159 Distributed Perceptually Important Point Identification for Time Series Data Mining

Authors: Tak-Chung Fu, Ying-Kit Hung, Fu-Lai Chung

Abstract:

In the field of time series data mining, the concept of the Perceptually Important Point (PIP) identification process is first introduced in 2001. This process originally works for financial time series pattern matching and it is then found suitable for time series dimensionality reduction and representation. Its strength is on preserving the overall shape of the time series by identifying the salient points in it. With the rise of Big Data, time series data contributes a major proportion, especially on the data which generates by sensors in the Internet of Things (IoT) environment. According to the nature of PIP identification and the successful cases, it is worth to further explore the opportunity to apply PIP in time series ‘Big Data’. However, the performance of PIP identification is always considered as the limitation when dealing with ‘Big’ time series data. In this paper, two distributed versions of PIP identification based on the Specialized Binary (SB) Tree are proposed. The proposed approaches solve the bottleneck when running the PIP identification process in a standalone computer. Improvement in term of speed is obtained by the distributed versions.

Keywords: distributed computing, performance analysis, Perceptually Important Point identification, time series data mining

Procedia PDF Downloads 393
15158 The Environmental Concerns in Coal Mining, and Utilization in Pakistan

Authors: S. R. H. Baqri, T. Shahina, M. T. Hasan

Abstract:

Pakistan is facing acute shortage of energy and looking for indigenous resources of the energy mix to meet the short fall. After the discovery of huge coal resources in Thar Desert of Sindh province, focus has shifted to coal power generation. The government of Pakistan has planned power generation of 20000 MW on coal by the year 2025. This target will be achieved by mining and power generation in Thar coal Field and on imported coal in different parts of Pakistan. Total indigenous coal production of around 3.0 million tons is being utilized in brick kilns, cement and sugar industry. Coal-based power generation is only limited to three units of 50 MW near Hyderabad from nearby Lakhra Coal field. The purpose of this presentation is to identify and redressal of issues of coal mining and utilization with reference to environmental hazards. Thar coal resource is estimated at 175 billion tons out of a total resource estimate of 184 billion tons in Pakistan. Coal of Pakistan is of Tertiary age (Palaeocene/Eocene) and classified from lignite to sub-bituminous category. Coal characterization has established three main pollutants such as Sulphur, Carbon dioxide and Methane besides some others associated with coal and rock types. The element Sulphur occurs in organic as well as inorganic forms associated with coals as free sulphur and as pyrite, gypsum, respectively. Carbon dioxide, methane and minerals are mostly associated with fractures, joints local faults, seatearth and roof rocks. The abandoned and working coal mines give kerosene odour due to escape of methane in the atmosphere. While the frozen methane/methane ices in organic matter rich sediments have also been reported from the Makran coastal and offshore areas. The Sulphur escapes into the atmosphere during mining and utilization of coal in industry. The natural erosional processes due to rivers, streams, lakes and coastal waves erode over lying sediments allowing pollutants to escape into air and water. Power plants emissions should be controlled through application of appropriate clean coal technology and need to be regularly monitored. Therefore, the systematic and scientific studies will be required to estimate the quantity of methane, carbon dioxide and sulphur at various sites such as abandoned and working coal mines, exploratory wells for coal, oil and gas. Pressure gauges on gas pipes connecting the coal-bearing horizons will be installed on surface to know the quantity of gas. The quality and quantity of gases will be examined according to the defined intervals of times. This will help to design and recommend the methods and procedures to stop the escape of gases into atmosphere. The element of Sulphur can be removed partially by gravity and chemical methods after grinding and before industrial utilization of coal.

Keywords: atmosphere, coal production, energy, pollutants

Procedia PDF Downloads 405
15157 Develop a Conceptual Data Model of Geotechnical Risk Assessment in Underground Coal Mining Using a Cloud-Based Machine Learning Platform

Authors: Reza Mohammadzadeh

Abstract:

The major challenges in geotechnical engineering in underground spaces arise from uncertainties and different probabilities. The collection, collation, and collaboration of existing data to incorporate them in analysis and design for given prospect evaluation would be a reliable, practical problem solving method under uncertainty. Machine learning (ML) is a subfield of artificial intelligence in statistical science which applies different techniques (e.g., Regression, neural networks, support vector machines, decision trees, random forests, genetic programming, etc.) on data to automatically learn and improve from them without being explicitly programmed and make decisions and predictions. In this paper, a conceptual database schema of geotechnical risks in underground coal mining based on a cloud system architecture has been designed. A new approach of risk assessment using a three-dimensional risk matrix supported by the level of knowledge (LoK) has been proposed in this model. Subsequently, the model workflow methodology stages have been described. In order to train data and LoK models deployment, an ML platform has been implemented. IBM Watson Studio, as a leading data science tool and data-driven cloud integration ML platform, is employed in this study. As a Use case, a data set of geotechnical hazards and risk assessment in underground coal mining were prepared to demonstrate the performance of the model, and accordingly, the results have been outlined.

Keywords: data model, geotechnical risks, machine learning, underground coal mining

Procedia PDF Downloads 242
15156 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done in R and Greenplum in-database analytic tools.

Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining

Procedia PDF Downloads 325
15155 Data Mining in Healthcare for Predictive Analytics

Authors: Ruzanna Muradyan

Abstract:

Medical data mining is a crucial field in contemporary healthcare that offers cutting-edge tactics with enormous potential to transform patient care. This abstract examines how sophisticated data mining techniques could transform the healthcare industry, with a special focus on how they might improve patient outcomes. Healthcare data repositories have dynamically evolved, producing a rich tapestry of different, multi-dimensional information that includes genetic profiles, lifestyle markers, electronic health records, and more. By utilizing data mining techniques inside this vast library, a variety of prospects for precision medicine, predictive analytics, and insight production become visible. Predictive modeling for illness prediction, risk stratification, and therapy efficacy evaluations are important points of focus. Healthcare providers may use this abundance of data to tailor treatment plans, identify high-risk patient populations, and forecast disease trajectories by applying machine learning algorithms and predictive analytics. Better patient outcomes, more efficient use of resources, and early treatments are made possible by this proactive strategy. Furthermore, data mining techniques act as catalysts to reveal complex relationships between apparently unrelated data pieces, providing enhanced insights into the cause of disease, genetic susceptibilities, and environmental factors. Healthcare practitioners can get practical insights that guide disease prevention, customized patient counseling, and focused therapies by analyzing these associations. The abstract explores the problems and ethical issues that come with using data mining techniques in the healthcare industry. In order to properly use these approaches, it is essential to find a balance between data privacy, security issues, and the interpretability of complex models. Finally, this abstract demonstrates the revolutionary power of modern data mining methodologies in transforming the healthcare sector. Healthcare practitioners and researchers can uncover unique insights, enhance clinical decision-making, and ultimately elevate patient care to unprecedented levels of precision and efficacy by employing cutting-edge methodologies.

Keywords: data mining, healthcare, patient care, predictive analytics, precision medicine, electronic health records, machine learning, predictive modeling, disease prognosis, risk stratification, treatment efficacy, genetic profiles, precision health

Procedia PDF Downloads 31