Search results for: sentiment mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1234

934 Arabic Light Stemmer for Better Search Accuracy

Authors: Sahar Khedr, Dina Sayed, Ayman Hanafy

Abstract:

Arabic is one of the most ancient and critical languages in the world. It has more than 250 million native speakers, and more than twenty countries have Arabic as one of their official languages. In the past decade, we have witnessed a rapid evolution in smart devices, social networks, and the technology sector, which has led to the need for tools and libraries that properly handle the Arabic language in different domains. Stemming is one of the most crucial linguistic fundamentals. It is used in many applications, especially in the information extraction and text mining fields. The motivation behind this work is to enhance the Arabic light stemmer to serve the data mining industry and to make it available to the open source community. The presented implementation enhances the Arabic light stemmer by extending an existing algorithm with a new set of rules and patterns, accompanied by an adjusted procedure. The study shows a significant enhancement in search accuracy, with an average 10% improvement over previous works.

Keywords: Arabic data mining, Arabic Information extraction, Arabic Light stemmer, Arabic stemmer

Procedia PDF Downloads 291
933 Feature Selection for Production Schedule Optimization in Transition Mines

Authors: Angelina Anani, Ignacio Ortiz Flores, Haitao Li

Abstract:

The use of underground mining methods has increased significantly over the past decades. This increase has also been spurred on by several mines transitioning from surface to underground mining. However, determining the transition depth can be a challenging task, especially when coupled with production schedule optimization. Several researchers have simplified the problem by excluding operational features relevant to production schedule optimization. Our research objective is to investigate the extent to which accounting for the operational features of transition mines affects the optimal production schedule. We also provide a framework of factors to consider in production schedule optimization for transition mines. An integrated mixed-integer linear programming (MILP) model is developed that maximizes the NPV as a function of production schedule and transition depth. A case study is performed to validate the model, with a comparative sensitivity analysis to obtain operational insights.

Keywords: underground mining, transition mines, mixed-integer linear programming, production schedule

Procedia PDF Downloads 155
932 Effect of Bacillus Pumilus Strains on Heavy Metal Accumulation in Lettuce Grown on Contaminated Soil

Authors: Sabeen Alam, Mehboob Alam

Abstract:

The research work entitled “Effect of Bacillus pumilus strains on heavy metal accumulation in lettuce grown on contaminated soil” focused on the functional role of Bacillus pumilus strains inoculated with lettuce seed in mitigating heavy metals in chromite mining soil. In this experiment, factor A was three Bacillus pumilus strains (sequences C-2 PMW-8, C-1 SSK-8 and C-1 PWK-7). The soil used for the experiment was collected from the Prang Ghar mining site, and lettuce seeds were grown in three levels of chromite mining soil (2.27, 4.65 and 7.14 %). For mining soil, minimum days to germination were noted in lettuce grown on garden soil inoculated with sequence. The maximum germination percentage was noted for C-1 SSK-8 grown on garden soil, maximum lettuce height for sequence C-2 PMW-8, fresh leaf weight for C-1 PWK-7 inoculated lettuce, dry leaf weight for lettuce inoculated with the C-1 SSK-8 and C-1 PWK-7 strains, number of leaves per plant for lettuce inoculated with C-1 SSK-8, leaf area for C-2 PMW-8 inoculated lettuce, survival percentage for C-1 SSK-8 treated lettuce, and chlorophyll content for C-2 PMW-8. Results on heavy metal accumulation showed minimum chromium (Cr), cadmium (Cd), manganese (Mn) and lead (Pb) in lettuce and in soil for all three sequences. It can be concluded that chromite mining soil significantly reduced the growth and survival of lettuce, but inoculating lettuce with Bacillus pumilus strains enhanced growth and survival. Similarly, heavy metal accumulation in plant and soil was minimal regardless of the Bacillus pumilus strain used; all three sequences had the same mitigating effect on heavy metals in both soil and lettuce. All three Bacillus pumilus strains reduced the heavy metal content (Mn, Cd, Cr) in lettuce to below the maximum permissible limits of WHO 2011.

Keywords: bacillus pumilus, heavy metals, permissible limits, lettuce, chromite mining soil, mitigating effect

Procedia PDF Downloads 41
931 The Human Right to a Safe, Clean and Healthy Environment in Corporate Social Responsibility's Strategies: An Approach to Understanding Mexico's Mining Sector

Authors: Thalia Viveros-Uehara

Abstract:

The virtues of Corporate Social Responsibility (CSR) are explored widely in the academic literature. However, few studies address its link to human rights per se, specifically the right to a safe, clean and healthy environment. Fewer still are the research works in this area that relate to developing countries, where a number of areas are biodiversity hotspots. In Mexico, despite the rise and evolution of CSR schemes, grave episodes of pollution persist, especially those caused by the mining industry. These cases raise the question of the correspondence between the current CSR practices of mining companies in the country and their responsibility to respect the right to a safe, clean and healthy environment. The present study approaches precisely such a bridge, which until now has not been fully tackled in light of Mexico's 2011 constitutional human rights amendment and the United Nations Guiding Principles on Business and Human Rights (UN Guiding Principles), adopted by the Human Rights Council in 2011. To that aim, it initially presents a contextual framework; it then explores qualitatively the adoption of human rights language in the CSR strategies of the three main mining companies in Mexico, and finally, it examines their standing with respect to the UN Guiding Principles. The results reveal that human rights are included in the CSR strategies of the analysed businesses, at least at the rhetorical level; however, they do not embrace the right to a safe, clean and healthy environment as such.
Moreover, we conclude that despite the finding that corporations publicly express their commitment to respect human rights, some operational weaknesses that hamper the exercise of such responsibility persist; for example, the systematic lack of human rights impact assessments per mining unit, the denial of actual and publicly-known negative episodes on the environment linked directly to their operations, and the absence of effective mechanisms to remediate adverse impacts.

Keywords: corporate social responsibility, environmental impacts, human rights, right to a safe, clean and healthy environment, mining industry

Procedia PDF Downloads 319
930 A Recommender System for Job Seekers to Show up Companies Based on Their Psychometric Preferences and Company Sentiment Scores

Authors: A. Ashraff

Abstract:

The increasing importance of the web as a medium for electronic and business transactions has served as a catalyst, or rather a driving force, for the introduction and implementation of recommender systems. Recommender systems play a major role in processing and analyzing thousands of data rows or reviews and help people decide whether to purchase a product or service. They can also predict how a particular user would rate a product or service based on the behavioral pattern in the user’s profile. At present, recommender systems are used extensively in almost every domain and are said to be ubiquitous. In the field of recruitment, however, they are not yet widely utilized. Recent statistics show an increase in staff turnover, which has negatively impacted both organizations and employees. The reasons include company culture, working flexibility (work-from-home opportunities), a lack of learning advancement, and pay scale. Further investigation revealed a lack of guidance or support to help a job seeker find the company that will suit them best; although information about companies is available, job seekers cannot read all the reviews themselves and reach an analytical decision. In this paper, we propose an approach that studies the available review data on IT companies (scoring their reviews based on user review sentiments) and gathers information on job seekers, including their psychometric evaluations. The system then presents the job seeker with useful information on which company is most suitable for them. The theoretical approach, the algorithmic approach and the importance of such a system are discussed in this paper.

Keywords: psychometric tests, recommender systems, sentiment analysis, hybrid recommender systems

Procedia PDF Downloads 96
929 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam

Abstract:

In recent years, research on knowledge sources, decision support systems, data mining and the process of knowledge discovery in databases has grown in importance, and each of these aspects is considered to affect the others. In this article, we merge an information source and a knowledge source to propose a knowledge-based system for management, based on the storage and retrieval of knowledge, to manage information and improve decision making and resources. We use data mining and the Apriori algorithm in the knowledge discovery process. One of the problems of the Apriori algorithm is that the user must specify the minimum support threshold. Imagine that a user wants to apply the Apriori algorithm to a database with millions of transactions. The user certainly does not have the necessary knowledge of all the transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve the Apriori algorithm. To achieve this goal, we use fuzzy logic to put the data into different clusters before applying the Apriori algorithm to the data in the database, and we also try to suggest the most suitable threshold to the user automatically.
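
The sensitivity to the minimum support threshold described above can be illustrated with a minimal Apriori sketch. This is plain textbook Apriori, not the authors' fuzzy-logic variant; the function name `frequent_itemsets` and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    result = {}
    k = 1
    while current:
        for s in current:
            result[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return result


# Hypothetical toy data: the same database yields very different result sizes
# depending on the threshold the user picks.
tx = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
loose = frequent_itemsets(tx, 0.5)
strict = frequent_itemsets(tx, 0.75)
```

With these toy transactions a threshold of 0.5 keeps single items and some pairs, while 0.75 keeps only the most common single items, which is why a poorly chosen threshold can hide or flood the interesting patterns.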

Keywords: decision support system, data mining, knowledge discovery, data discovery, fuzzy logic

Procedia PDF Downloads 321
928 Mining in Peru and Local Governance: Assessing the Contribution of CRS Projects

Authors: Sandra Carrillo Hoyos

Abstract:

Mining activities in South America have grown significantly during the last decades, given the abundance of natural resources, the governmental policies implemented to incentivize foreign investment, and the boom in international prices for metals and oil between 2002 and 2008. While this context allowed the region to occupy a leading position among the top producers of minerals around the world, it has also meant an increase in socio-environmental conflicts, which have generated costs and negative impacts not only for the companies but especially for the governments and local communities. During the latest decade, the mining sector in Peru has faced social resistance from a large number of communities, which began organizing actions against the implementation of high-investment projects. The dissatisfaction has resulted in the prevalence of socio-environmental conflicts associated with mining activities, some of which were never resolved through an agreement. In order to prevent those socio-environmental conflicts and obtain the social license from local communities, most of the mining companies have developed diverse initiatives within the framework of policies and practices of corporate social responsibility (CSR). This paper assesses the mining sector’s contribution to local development management over the last decade, as part of CSR strategies as well as the policies promoted by the Peruvian State. The assessment found that, in the beginning, these initiatives were based on a philanthropic approach and reacted to pressure from local stakeholders, in order to maintain the consent to operate from the surrounding communities and, as a result, to create a harmonious atmosphere for operations.
Due to the weak State presence, such practices have increased the expectations of communities regarding the participation of mining companies in solving structural development problems, especially those related to primary needs, infrastructure, education and health, among others. In other words, this paper focused on analyzing to what extent these initiatives have promoted local empowerment for development planning and integrated management of natural resources from a territorial approach. From this perspective, the analysis demonstrates that, while the design and planning of social investment initiatives have improved due to the sector's sustainability approach, many companies have developed actions beyond their competence during this process. In some cases, these actions have created dependency among communities, even though this relationship has not exempted the companies from conflict situations with unfortunate consequences. Furthermore, the social programs developed have not necessarily had a significant impact on improving the quality of life of affected populations. In fact, regions with high mining resources and investment are facing a situation of poverty and high dependency on mining production. In spite of the revenues derived from the mining industry, local governments have not been able to translate the royalties into sustainable development opportunities. For this reason, the proposed paper suggests some challenges for the mining sector's contribution to local development, based on the best practices and lessons learnt from a benchmarking of the leading mining companies.

Keywords: corporate social responsibility, local development, mining, socio-environmental conflict

Procedia PDF Downloads 389
927 Lead and Cadmium Spatial Pattern and Risk Assessment around Coal Mine in Hyrcanian Forest, North Iran

Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch

Abstract:

In this study, the effect of coal mining activities on lead and cadmium concentrations and distribution in soil was investigated in the Hyrcanian forest, North Iran. Sixteen plots (20×20 m2) were established using a systematic-random design (60×60 m2) in an area of 4 ha (200×200 m2, with the mine entrance at the center). An adjacent area not affected by the mining activity was considered the control area. In order to investigate soil lead and cadmium concentrations, one sample was taken from the 0-10 cm layer in each plot. To study the spatial pattern of soil properties and lead and cadmium concentrations in the mining area, an area of 80×80 m2 (with the mine at the center) was considered and 80 soil samples were taken systematic-randomly (10 m intervals). Geostatistical analysis was performed via the kriging method and GS+ software (version 5.1). In order to estimate the impact of coal mining activities on soil quality, a pollution index was calculated. Lead and cadmium concentrations were significantly higher in the mine area (Pb: 10.97±0.30, Cd: 184.47±6.26 mg.kg-1) than in the control area (Pb: 9.42±0.17, Cd: 131.71±15.77 mg.kg-1). The mean values of the PI index indicate slight pollution for Pb (1.16) and Cd (1.77). Results of the NIPI index showed slight pollution for Pb (1.44) and moderate pollution for Cd (2.52). Results of variography and the kriging method showed that it is possible to prepare interpolation maps of lead and cadmium around the mining areas in the Hyrcanian forest. According to the results of the pollution and risk assessments, the forest soil was contaminated by heavy metals (lead and cadmium); therefore, reclamation and remediation techniques are necessary in these areas.
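
The abstract does not define its PI and NIPI indices. The sketch below assumes the definitions commonly used in soil studies: a single-factor pollution index (sample concentration divided by a background concentration) and the Nemerow integrated pollution index built from the mean and maximum PI. The function names and sample numbers are hypothetical, and the sketch makes no attempt to reproduce the paper's reported values.

```python
import math


def pollution_index(concentration, background):
    """Single-factor PI: measured concentration over the background value (assumed definition)."""
    return concentration / background


def nemerow_index(pi_values):
    """Nemerow integrated pollution index from per-sample PI values (assumed definition)."""
    mean_pi = sum(pi_values) / len(pi_values)
    max_pi = max(pi_values)
    # Root mean square of the average and the worst-case PI, so a single
    # heavily polluted sample raises the index more than a plain mean would.
    return math.sqrt((mean_pi ** 2 + max_pi ** 2) / 2)


# Hypothetical per-plot concentrations and background value (mg/kg).
samples = [5.0, 10.0]
background = 5.0
pi_per_plot = [pollution_index(c, background) for c in samples]
nipi = nemerow_index(pi_per_plot)
```

Because the Nemerow index weights the maximum PI, it is at least as large as the mean PI, which is consistent with the NIPI values in the abstract exceeding the mean PI values.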

Keywords: traditional coal mining, heavy metals, pollution indicators, geostatistics, Caspian forest

Procedia PDF Downloads 169
926 Study and Analysis of the Factors Affecting Road Safety Using Decision Tree Algorithms

Authors: Naina Mahajan, Bikram Pal Kaur

Abstract:

The purpose of traffic accident analysis is to find the possible causes of an accident. Road accidents cannot be totally prevented, but suitable traffic engineering and management can reduce the accident rate to a certain extent. This paper discusses the classification techniques C4.5 and ID3 using the WEKA data mining tool. These techniques are applied to the NH (National Highway) dataset. The C4.5 and ID3 techniques give the best results, with high accuracy, low computation time and a low error rate.
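
For readers unfamiliar with the two techniques, the sketch below shows the attribute-selection criteria they are generally known for: information gain (ID3) and gain ratio (C4.5). It is a standalone illustration rather than WEKA's implementation, and the attribute names and rows are made up, not taken from the NH dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_by(rows, labels, attr):
    """Group the class labels by the value each row takes for attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return groups


def information_gain(rows, labels, attr):
    """ID3 criterion: entropy reduction obtained by splitting on attr."""
    n = len(labels)
    groups = split_by(rows, labels, attr)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(rows, labels, attr):
    """C4.5 criterion: information gain normalized by the split information."""
    n = len(labels)
    groups = split_by(rows, labels, attr)
    split_info = -sum(len(g) / n * math.log2(len(g) / n) for g in groups.values())
    if split_info == 0:
        return 0.0  # attr takes a single value, so it cannot split the data
    return information_gain(rows, labels, attr) / split_info


# Hypothetical accident records.
rows = [
    {"weather": "rain", "light": "day"},
    {"weather": "rain", "light": "night"},
    {"weather": "clear", "light": "day"},
    {"weather": "clear", "light": "night"},
]
labels = ["accident", "accident", "none", "none"]
```

In this toy data `weather` separates the classes perfectly while `light` carries no information, so a tree learner using either criterion would split on `weather` first.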

Keywords: C4.5, ID3, NH(National highway), WEKA data mining tool

Procedia PDF Downloads 322
925 Phillips Curve Estimation in an Emerging Economy: Evidence from Sub-National Data of Indonesia

Authors: Harry Aginta

Abstract:

Using the Phillips curve framework, this paper seeks new empirical evidence on the relationship between inflation and output in a major emerging economy. By exploiting sub-national data, the contribution of this paper is threefold. First, it resolves the issue of using on-target national inflation rates, which potentially weakens the inflation-output nexus. This is very relevant for Indonesia, as its central bank has adopted an inflation targeting framework based on national consumer price index (CPI) inflation. Second, the study tests the relevance of the mining sector in output gap estimation. The test for the mining sector is important to control for the effects of mining regulation and the nominal effects of coal prices on real economic activities. Third, the paper applies a panel econometric method that incorporates regional variation, which helps to improve model estimation. The results confirm the strong presence of the Phillips curve in Indonesia. A positive output gap, which reflects excess demand, raises inflation rates. In addition, the elasticity of the output gap is higher if the mining sector is excluded from output gap estimation. In addition to inflation adaptation, the dynamics of the exchange rate and international commodity prices are also found to affect inflation significantly. The results are robust to alternative measurements of the output gap.

Keywords: Phillips curve, inflation, Indonesia, panel data

Procedia PDF Downloads 111
924 Research of the Three-Dimensional Visualization Geological Modeling of Mine Based on Surpac

Authors: Honggang Qu, Yong Xu, Rongmei Liu, Zhenji Gao, Bin Wang

Abstract:

Today's mining industry is advancing gradually in a digital and visual direction. Three-dimensional visualization geological modeling of a mine is the digital characterization of mineral deposits and is one of the key technologies of digital mining. Three-dimensional geological modeling is a technology that combines geological spatial information management, geological interpretation, geological spatial analysis and prediction, geostatistical analysis, entity content analysis and graphic visualization in a three-dimensional environment with computer technology, and it is used in geological analysis. In this paper, a three-dimensional geological model of an iron mine is constructed using Surpac, the difference in weights between the inverse distance power method and ordinary kriging is studied, and the ore body volume and reserves are simulated and calculated using these two methods. Compared with the actual mine reserves, the result is relatively accurate, so it provides a scientific basis for mine resource assessment, reserve calculation, mining design and so on.
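
As a rough illustration of the first of the two estimation methods, the sketch below implements inverse distance power weighting for a single block centroid. It is a generic sketch, not Surpac's implementation; the function name, the power of 2 and the sample grades are assumptions. Ordinary kriging would instead derive the weights from a fitted variogram model, which is not attempted here.

```python
import math


def idw_estimate(samples, target, power=2.0):
    """Estimate a grade at target from (point, grade) samples.

    Each sample is weighted by 1 / distance**power, so nearby drill-hole
    samples influence the estimate more than distant ones.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, grade in samples:
        d = math.dist(point, target)
        if d == 0.0:
            return grade  # target coincides with a sample location
        w = 1.0 / d ** power
        weighted_sum += w * grade
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical samples: (x, y, z) coordinates in metres and iron grade in %.
samples = [((0.0, 0.0, 0.0), 10.0), ((2.0, 0.0, 0.0), 30.0)]
midpoint = idw_estimate(samples, (1.0, 0.0, 0.0))
near_first = idw_estimate(samples, (0.5, 0.0, 0.0))
```

Raising `power` concentrates the weight on the closest samples, which is the main tuning choice of this method and one source of the weight differences relative to kriging.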

Keywords: three-dimensional geological modeling, geological database, geostatistics, block model

Procedia PDF Downloads 65
923 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarships. A tree-based data mining classification technique is used in order to determine the generic rule to be used to disburse the scholarship. Based on the rules defined from the tree, the system is able to determine the class (status) to which an applicant belongs, whether Granted or Not Granted. Applicants that fall into the Granted class have successfully acquired the scholarship, while those in the Not Granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from the tree-based classification was also developed. Tree-based classification is adopted because of its efficiency, effectiveness, and easy-to-comprehend features. The system was tested with the data of the National Information Technology Development Agency (NITDA) Abuja, a parastatal of the Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found to work according to the specification. It is therefore recommended for all scholarship disbursement organizations.

Keywords: classification, data mining, decision tree, scholarship

Procedia PDF Downloads 355
922 Assessing Carbon Stock and Sequestration of Reforestation Species on Old Mining Sites in Morocco Using the DNDC Model

Authors: Nabil Elkhatri, Mohamed Louay Metougui, Ngonidzashe Chirinda

Abstract:

Mining activities have left a legacy of degraded landscapes, prompting urgent efforts for ecological restoration. Reforestation holds promise as a potent tool to rehabilitate these old mining sites, with the potential to sequester carbon and contribute to climate change mitigation. This study focuses on evaluating the carbon stock and sequestration potential of reforestation species in the context of Morocco's mining areas, employing the DeNitrification-DeComposition (DNDC) model. The research is grounded in recognizing the need to connect theoretical models with practical implementation, ensuring that reforestation efforts are informed by accurate and context-specific data. Field data collection encompasses growth patterns, biomass accumulation, and carbon sequestration rates, establishing an empirical foundation for the study's analyses. By integrating the collected data with the DNDC model, the study aims to provide a comprehensive understanding of carbon dynamics within reforested ecosystems on old mining sites. The major findings reveal varying sequestration rates among different reforestation species, indicating the potential for species-specific optimization of reforestation strategies to enhance carbon capture. This research's significance lies in its potential to contribute to sustainable land management practices and climate change mitigation strategies. By quantifying the carbon stock and sequestration potential of reforestation species, the study serves as a valuable resource for policymakers, land managers, and practitioners involved in ecological restoration and carbon management. Ultimately, the study aligns with global objectives to rejuvenate degraded landscapes while addressing pressing climate challenges.

Keywords: carbon stock, carbon sequestration, DNDC model, ecological restoration, mining sites, Morocco, reforestation, sustainable land management.

Procedia PDF Downloads 64
921 Using Textual Pre-Processing and Text Mining to Create Semantic Links

Authors: Ricardo Avila, Gabriel Lopes, Vania Vidal, Jose Macedo

Abstract:

This article offers an approach to the automatic discovery of semantic concepts and links in the domain of Oil Exploration and Production (E&P). Machine learning methods combined with textual pre-processing techniques were used to detect local patterns in texts and, thus, generate new concepts and new semantic links. Even using more specific vocabularies within the oil domain, our approach has achieved satisfactory results, suggesting that the proposal can be applied in other domains and languages, requiring only minor adjustments.

Keywords: semantic links, data mining, linked data, SKOS

Procedia PDF Downloads 165
920 Application of Advanced Remote Sensing Data in Mineral Exploration in the Vicinity of Heavy Dense Forest Cover Area of Jharkhand and Odisha State Mining Area

Authors: Hemant Kumar, R. N. K. Sharma, A. P. Krishna

Abstract:

The study was carried out on Saranda in Jharkhand and a part of Odisha state. Geospatial data from Hyperion, a remote sensing satellite sensor, were used. The study used a wide variety of image processing techniques to enhance and extract the mining class of Fe and Mn ores. Landsat-8 OLI sensor data were also used to correctly explore related minerals. In this way, various processes were applied to enhance the mineralogy class, and a comparative evaluation with the related frequencies was carried out. The Hyperion dataset for hyperspectral remote sensing has been specifically verified as an effective tool for mineral or rock information extraction within the shortwave infrared band range used. The abundant spatial and spectral information contained in hyperspectral images enables different objects to be differentiated for targeted applications such as exploration detection and mining.

Keywords: Hyperion, hyperspectral, sensor, Landsat-8

Procedia PDF Downloads 107
919 Heritage Value and Industrial Tourism Potential of the Urals, Russia

Authors: Anatoly V. Stepanov, Maria Y. Ilyushkina, Alexander S. Burnasov

Abstract:

Expansion of tourism, especially after WWII, has led to significant improvements in the regional infrastructure. The present study has revealed a lot of progress in the advancement of industrial heritage narrative in the Central Urals. The evidence comes from the general public’s increased fascination with some of Europe’s oldest mining and industrial sites, and the agreement of many stakeholders that the Urals industrial heritage should be preserved. The development of tourist sites in Nizhny Tagil and Nevyansk, gold-digging in Beryosovsky, gemstone search in Murzinka, and the progress with the Urals Gemstone Ring project are the examples showing the immense opportunities of industrial heritage tourism development in the region that are still to be realized. Regardless of the economic future of the Central Urals, whether it will remain an industrial region or experience a deeper deindustrialization, the sprouts of the industrial heritage tourism should be advanced and amplified for the benefit of local communities and the tourist community at large as it is hard to imagine a more suitable site for the discovery of industrial and mining heritage than the Central Urals Region of Russia.

Keywords: industrial heritage, mining heritage, Central Urals, Russia

Procedia PDF Downloads 116
918 Using Data Mining Techniques to Evaluate the Different Factors Affecting the Academic Performance of Students at the Faculty of Information Technology in Hashemite University in Jordan

Authors: Feras Hanandeh, Majdi Shannag

Abstract:

This research studies the different factors that could affect the accumulative average of students at the Faculty of Information Technology in Hashemite University. The paper examines the students' information, background and academic records, and how this information affects a student's ability to get high grades. The student information used in the study is extracted from the students' academic records. Data mining tools and techniques are used to decide which attribute(s) affect the students' accumulative average. The results show that the most important factor affecting the students’ accumulative average is the student Acceptance Type. We built a decision tree model and rules to determine how a student can get high grades in their courses. The overall accuracy of the model is 44%, which is considered an acceptable rate.

Keywords: data mining, classification, extracting rules, decision tree

Procedia PDF Downloads 404
917 Relay Mining: Verifiable Multi-Tenant Distributed Rate Limiting

Authors: Daniel Olshansky, Ramiro Rodrıguez Colmeiro

Abstract:

Relay Mining presents a scalable solution employing probabilistic mechanisms and crypto-economic incentives to estimate RPC volume usage, facilitating decentralized multitenant rate limiting. Network traffic from individual applications can be concurrently serviced by multiple RPC service providers, with costs, rewards, and rate limiting governed by a native cryptocurrency on a distributed ledger. Building upon established research in token bucket algorithms and distributed rate-limiting penalty models, our approach harnesses a feedback loop control mechanism to adjust the difficulty of mining relay rewards, dynamically scaling with network usage growth. By leveraging crypto-economic incentives, we reduce coordination overhead costs and introduce a mechanism for providing RPC services that are both geopolitically and geographically distributed.
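
The token bucket algorithm that the abstract builds upon can be sketched as follows. This is the classic single-node algorithm only, not the Relay Mining protocol: it has no ledger, cryptography or difficulty adjustment. The class name, parameters and explicit `now` timestamps are illustrative assumptions.

```python
class TokenBucket:
    """Classic token bucket: allows bursts up to capacity, refilled at a fixed rate."""

    def __init__(self, capacity, refill_rate, now=0.0):
        self.capacity = capacity        # maximum number of stored tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start with a full bucket
        self.last = now                 # time of the last refill

    def allow(self, now, cost=1.0):
        """Return True and consume tokens if a request of this cost may proceed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical usage: a burst of three relays at t=0 against a bucket of two.
bucket = TokenBucket(capacity=2, refill_rate=1.0)
burst = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0)]
after_one_second = bucket.allow(1.0)
```

Passing the timestamp explicitly keeps the sketch deterministic; a real service would typically read a monotonic clock instead.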

Keywords: remote procedure call, crypto-economic, commit-reveal, decentralization, scalability, blockchain, rate limiting, token bucket

Procedia PDF Downloads 42
916 Data Mining Approach: Classification Model Evaluation

Authors: Lubabatu Sada Sodangi

Abstract:

The rapid growth in the exchange and accessibility of information via the internet leads many organisations to acquire data on their own operations. The aim of data mining is to analyse the different behaviours of a dataset using observation. However, the subset of the dataset being analysed may not display all the behaviours and relationships of the entire data and, therefore, may not represent other parts that exist in the dataset. There is a range of techniques used in data mining to determine the hidden or unknown information in datasets. In this paper, the performance of two algorithms, Chi-Square Automatic Interaction Detection (CHAID) and multilayer perceptron (MLP), is compared using the Adult dataset to find the percentage of adults that earn > 50k and those that earn <= 50k per year. The two algorithms were studied and compared using IBM SPSS Statistics software. The result for CHAID shows that the most important predictors are relationship and education. The algorithm shows that those who are married (husband), hold a Bachelor's, Master's, Doctorate or Prof-school qualification, and are aged > 41 and < 57 earn > 50k. The multilayer perceptron, in turn, identifies marital status and capital gain as the most important predictors of income. It also shows that individuals whose capital gain is less than 6,849 and who are single, separated or widowed earn <= 50k, whereas individuals whose capital gain is > 6,849, who work > 35 hrs/wk, and who are older than 27 years earn > 50k. Comparing the two algorithms, both are reliable, but CHAID shows stronger reliability, clearly indicating that relationship and education contribute to the prediction, as displayed in the data visualisation.

Keywords: data mining, CHAID, multi-layer perceptron, SPSS, Adult dataset

Procedia PDF Downloads 368
915 On Exploring Search Heuristics for improving the efficiency in Web Information Extraction

Authors: Patricia Jiménez, Rafael Corchuelo

Abstract:

Nowadays the World Wide Web is the most popular source of information, comprising billions of on-line documents. Web mining is used to crawl through these documents, collect the information of interest and process it by applying data mining tools in order to use the gathered information in the best interest of a business, which enables companies to promote theirs. Unfortunately, it is not easy to extract the information a web site provides automatically when it lacks an API that transforms the user-friendly data provided in web documents into a structured, machine-readable format. Rule-based information extractors are the tools intended to extract the information of interest automatically and offer it in a structured format that allows mining tools to process it. However, the performance of an information extractor strongly depends on the search heuristic employed, since bad choices regarding how to learn a rule may easily result in a loss of effectiveness and/or efficiency. Improving the efficiency of search heuristics is of utmost importance in the field of Web Information Extraction since typical datasets are very large. In this paper, we employ an information extractor based on a classical top-down algorithm that uses the so-called Information Gain heuristic introduced by Quinlan and Cameron-Jones. Unfortunately, Information Gain suffers from some well-known problems, so we analyse an intuitive alternative, Termini, that is clearly more efficient; we also analyse other proposals in the literature and conclude that none of them outperforms the previous alternative.
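
Assuming the heuristic meant above is the information gain used in Quinlan and Cameron-Jones's FOIL rule learner, it can be sketched as below in its propositional form. The function name and the example counts are illustrative, and Termini is not sketched because the abstract does not describe how it is computed.

```python
import math


def foil_gain(p0, n0, p1, n1):
    """FOIL-style information gain of refining a rule.

    p0, n0: positive and negative examples covered before adding a condition.
    p1, n1: positive and negative examples still covered after adding it.
    In this propositional form, the positives that remain covered number p1.
    """
    if p1 == 0:
        return 0.0  # convention used here: a refinement covering no positives gains nothing
    before = math.log2(p0 / (p0 + n0))
    after = math.log2(p1 / (p1 + n1))
    return p1 * (after - before)


# Hypothetical counts: a condition that keeps 8 of 10 positives and removes
# all 10 negatives, versus one that keeps everything.
good_refinement = foil_gain(10, 10, 8, 0)
useless_refinement = foil_gain(10, 10, 10, 10)
```

A top-down learner evaluates this score for every candidate condition at every refinement step, which is why the cost of the heuristic matters on large datasets.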

Keywords: information extraction, search heuristics, semi-structured documents, web mining.

Procedia PDF Downloads 324
914 A Method for Reduction of Association Rules in Data Mining

Authors: Diego De Castro Rodrigues, Marcelo Lisboa Rocha, Daniela M. De Q. Trevisan, Marcos Dias Da Conceicao, Gabriel Rosa, Rommel M. Barbosa

Abstract:

The use of association rule algorithms within data mining is recognized as being of great value for knowledge discovery in databases. Very often, the number of rules generated is high, sometimes even for small databases, so this quantity can hamper the analysis of the results. The purpose of this research is to present a method for reducing the number of rules generated by association algorithms. To this end, a computational algorithm was developed using the Weka Application Programming Interface, which allows the method to be executed on different types of databases. After development, tests were carried out on three types of databases: synthetic, model, and real. The method proved effective in reducing the number of rules: even the worst case presented a gain of more than 50%, considering the concepts of support, confidence, and lift as measures. This study concluded that the proposed model is feasible and quite interesting, contributing to the analysis of association rules generated by such algorithms.
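The three measures named above have standard definitions, sketched here in plain Python rather than through the Weka API used in the study. The transactions are a made-up toy example, and the functions assume that the antecedent and consequent each occur in at least one transaction.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in the itemset."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent) -> float:
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent) -> float:
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)


# Hypothetical market-basket transactions
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
rule_lift = lift(baskets, {"bread"}, {"milk"})
```

A lift below 1, as in this toy rule, means the antecedent makes the consequent less likely than its baseline, so thresholds on these measures are a natural basis for discarding rules.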

Keywords: data mining, association rules, rules reduction, artificial intelligence

Procedia PDF Downloads 151
913 The Significance of Picture Mining in the Fashion and Design as a New Research Method

Authors: Katsue Edo, Yu Hiroi

Abstract:

Increasing attention has been paid to using pictures and photographs in social science research since the beginning of the 21st century. Meanwhile, we have been studying the usefulness of Picture Mining, which is one of the new approaches to such picture-based research. Picture Mining is an explorative research analysis method that extracts useful information from pictures, photographs and static or moving images. It is often compared with the methods of text mining. The Picture Mining concept includes observational research in the broad sense, because it also aims to analyze moving images (Ochihara and Edo 2013). In the recent literature, studies and reports using pictures are increasing due to environmental changes, which are identified as technological and social changes (Edo et al. 2013). Low-priced digital cameras and iPhones, high information transmission speeds, low costs of information transfer, and the high performance and resolution of mobile phone cameras have changed people's photographing behavior. Consequently, most people in the developing countries feel less resistance to taking and processing photographs. In these studies, this method of collecting data from respondents is often called ‘participant-generated photography’ or ‘respondent-generated visual imagery’, which focuses on the collection of data and its analysis (Pauwels 2011, Snyder 2012). But there are few systematic and conceptual studies that support the significance of these methods. In recent years, we have worked to conceptualize these picture-based research methods and formalize theoretical findings (Edo et al. 2014). Inductively and through case studies, we have identified the following areas as the most effective fields for Picture Mining: 1) Research in Consumer and Customer Lifestyles. 2) New Product Development. 3) Research in Fashion and Design.
Though we have found that it will be useful in these fields and areas, we must verify these assumptions. In this study we focus on the field of fashion and design, to determine whether picture mining methods are really reliable in this area. To do so, we conducted empirical research on respondents’ attitudes and behavior concerning pictures and photographs. We compared attitudes and behavior toward pictures of fashion with those toward pictures of meals, and found that taking pictures of fashion is not as easy as taking pictures of meals and food. Because of this difficulty, respondents take pictures of fashion and upload them online, for example to Facebook and Instagram, less often than pictures of meals and food. We concluded that we should be more careful in analyzing pictures in the fashion area, for some kind of bias might still exist even though the environment surrounding pictures has changed drastically in recent years.

Keywords: empirical research, fashion and design, Picture Mining, qualitative research

Procedia PDF Downloads 352
912 The Affective Motivation of Women Miners in Ghana

Authors: Adesuwa Omorede, Rufai Haruna Kilu

Abstract:

Affective motivation (motivation that is emotionally laden, usually related to affect, passion, emotions, moods) in the workplace stimulates individuals to reinforce, persist and commit to their tasks, which contributes to individual and organizational performance. It leads individuals to reach goals, especially in situations where tasks are highly challenging and hostile. In such situations, individuals are more disposed to be creative and innovative and to see new opportunities in the loopholes in their workplace. However, when individuals feel displaced and less important, an adverse reaction may arise, which may be detrimental to the organization and its performance. One sector where affective motivation is eminently present and relevant is the mining industry, owing to its intense work environment, which is mostly dominated by men and masculinity cultures, and to the deliberate exclusion of women from this environment, which makes the women working there feel marginalized. In Ghana, the mining industry is mostly seen as a very physical environment, especially underground, and is mostly considered 'no place for a woman'. Despite the fact that these women feel less 'needed' or 'appreciated' in such environments, they still have to juggle intense work shifts with their families and face violence and other health risks, all of which puts a strain on their affective motivational reaction. Beyond these challenges, however, several mining companies in Ghana today are working towards providing a fair and equal working situation for both men and women miners, by recognizing them as key stakeholders and including them in all stages of mining projects, from the planning and designing phase to the evaluation and implementation stage.
Drawing from the psychology and gender literature, this study takes a narrative approach to identify and understand the shifting gender dynamics within mine work in Ghana, which are occasioning a change in the background disposition of miners and leading more women to take up mine jobs in the country. To this end, a qualitative study was conducted using semi-structured interviews in Ghana. Several women working within the mining industry in Ghana shared their experiences and how they felt and still feel in their workplace. In addition, archival documents were gathered to support the findings. The results suggest a change in enrolment regimes at a mining and technology university in Ghana, a renowned university that trains mine work professionals and feeds them into the industry, making room for more gender-equal enrolment. The results further acknowledge gender-equal and diversity recruitment policies and initiatives among the mining companies of Ghana. This study contributes to the psychology and gender literature by highlighting the hindrances women face in the mining industry as well as several of their affective reactions towards gender inequality. The study also provides several suggestions for decision makers in the mining industry on what can be done in the future to reduce the gender inequality gap within the industry.

Keywords: affective motivation, gender shape shifting, mining industry, women miners

Procedia PDF Downloads 284
911 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The aim of this research is to develop an algorithm capable of classifying news articles from the automobile industry according to the competitive actions that they entail, using Text Mining (TM) methods. This requires testing how to properly preprocess the data by preparing the pipelines that best fit each algorithm. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing to identify the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines: Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further by testing several parameters of each. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.
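The abstract does not state how the multi-class precision, recall and F1 were averaged. As an illustration only, the sketch below computes accuracy together with macro-averaged metrics (the unweighted mean of the per-class values), which is one common convention and an assumption here. Under macro averaging the F1 score can fall below both precision and recall, as it does for the hypothetical labels in the usage lines.

```python
from typing import Dict, Sequence


def macro_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall and F1 for a multi-class task."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Undefined ratios are reported as 0.0 for that class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n = len(labels)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Hypothetical labels for three classes of competitive action
true_labels = ["a", "a", "a", "b", "b", "c"]
predicted_labels = ["a", "a", "b", "b", "c", "c"]
scores = macro_metrics(true_labels, predicted_labels)
```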

Keywords: Artificial Neural network, Competitive dynamics, Logistic Regression, Text classification, Text mining

Procedia PDF Downloads 109
910 Distributed Perceptually Important Point Identification for Time Series Data Mining

Authors: Tak-Chung Fu, Ying-Kit Hung, Fu-Lai Chung

Abstract:

In the field of time series data mining, the concept of the Perceptually Important Point (PIP) identification process was first introduced in 2001. The process was originally designed for financial time series pattern matching and was later found suitable for time series dimensionality reduction and representation. Its strength lies in preserving the overall shape of the time series by identifying its salient points. With the rise of Big Data, time series data contributes a major proportion, especially data generated by sensors in the Internet of Things (IoT) environment. Given the nature of PIP identification and its successful applications, it is worth further exploring the opportunity to apply PIP to time series ‘Big Data’. However, the performance of PIP identification is always considered a limitation when dealing with ‘Big’ time series data. In this paper, two distributed versions of PIP identification based on the Specialized Binary (SB) Tree are proposed. The proposed approaches resolve the bottleneck of running the PIP identification process on a standalone computer. The distributed versions achieve an improvement in speed.
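To make the bottleneck concrete, the basic sequential PIP procedure can be sketched as follows. This is a simplified illustration, not the SB-Tree-based or distributed versions proposed in the paper. It assumes vertical distance to the line joining adjacent PIPs as the importance measure, which is one of several distance measures used with PIP, and the sample series is made up.

```python
from typing import List, Sequence


def vertical_distance(series: Sequence[float], left: int, right: int, i: int) -> float:
    """Vertical distance from point i to the line joining points left and right."""
    slope = (series[right] - series[left]) / (right - left)
    y_on_line = series[left] + slope * (i - left)
    return abs(series[i] - y_on_line)


def find_pips(series: Sequence[float], k: int) -> List[int]:
    """Return the sorted indices of up to k perceptually important points."""
    n = len(series)
    if n < 2 or k < 2:
        raise ValueError("need at least two points and k >= 2")
    pips = [0, n - 1]  # the first and last points are always PIPs
    while len(pips) < min(k, n):
        best_index, best_distance = None, -1.0
        # Scan every non-PIP point between each pair of adjacent PIPs
        for left, right in zip(pips, pips[1:]):
            for i in range(left + 1, right):
                d = vertical_distance(series, left, right, i)
                if d > best_distance:
                    best_index, best_distance = i, d
        if best_index is None:
            break
        pips.append(best_index)
        pips.sort()
    return pips


# Hypothetical series with one peak and one trough
sample = [0, 1, 5, 1, 0, -4, 0]
selected = find_pips(sample, 4)
```

Each new PIP requires a pass over the remaining points, so this naive version costs roughly O(k·n) distance evaluations on one machine, which illustrates why long sensor series motivate tree-based and distributed variants.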

Keywords: distributed computing, performance analysis, Perceptually Important Point identification, time series data mining

Procedia PDF Downloads 418
909 Develop a Conceptual Data Model of Geotechnical Risk Assessment in Underground Coal Mining Using a Cloud-Based Machine Learning Platform

Authors: Reza Mohammadzadeh

Abstract:

The major challenges in geotechnical engineering in underground spaces arise from uncertainties and different probabilities. The collection, collation, and collaboration of existing data, and their incorporation into analysis and design for a given prospect evaluation, would be a reliable, practical problem-solving method under uncertainty. Machine learning (ML) is a subfield of artificial intelligence in statistical science which applies different techniques (e.g., regression, neural networks, support vector machines, decision trees, random forests, genetic programming, etc.) to data to automatically learn and improve from them without being explicitly programmed, and to make decisions and predictions. In this paper, a conceptual database schema of geotechnical risks in underground coal mining based on a cloud system architecture has been designed. A new approach to risk assessment using a three-dimensional risk matrix supported by the level of knowledge (LoK) has been proposed in this model. Subsequently, the stages of the model workflow methodology have been described. An ML platform has been implemented to train the data and deploy the LoK models. IBM Watson Studio, as a leading data science tool and data-driven cloud integration ML platform, is employed in this study. As a use case, a data set of geotechnical hazards and risk assessment in underground coal mining was prepared to demonstrate the performance of the model, and the results have been outlined accordingly.

Keywords: data model, geotechnical risks, machine learning, underground coal mining

Procedia PDF Downloads 258
908 Data Mining in Healthcare for Predictive Analytics

Authors: Ruzanna Muradyan

Abstract:

Medical data mining is a crucial field in contemporary healthcare that offers cutting-edge tactics with enormous potential to transform patient care. This abstract examines how sophisticated data mining techniques could transform the healthcare industry, with a special focus on how they might improve patient outcomes. Healthcare data repositories have dynamically evolved, producing a rich tapestry of different, multi-dimensional information that includes genetic profiles, lifestyle markers, electronic health records, and more. By utilizing data mining techniques inside this vast library, a variety of prospects for precision medicine, predictive analytics, and insight production become visible. Predictive modeling for illness prediction, risk stratification, and therapy efficacy evaluations are important points of focus. Healthcare providers may use this abundance of data to tailor treatment plans, identify high-risk patient populations, and forecast disease trajectories by applying machine learning algorithms and predictive analytics. Better patient outcomes, more efficient use of resources, and early treatments are made possible by this proactive strategy. Furthermore, data mining techniques act as catalysts to reveal complex relationships between apparently unrelated data pieces, providing enhanced insights into the cause of disease, genetic susceptibilities, and environmental factors. Healthcare practitioners can get practical insights that guide disease prevention, customized patient counseling, and focused therapies by analyzing these associations. The abstract explores the problems and ethical issues that come with using data mining techniques in the healthcare industry. In order to properly use these approaches, it is essential to find a balance between data privacy, security issues, and the interpretability of complex models. 
Finally, this abstract demonstrates the revolutionary power of modern data mining methodologies in transforming the healthcare sector. Healthcare practitioners and researchers can uncover unique insights, enhance clinical decision-making, and ultimately elevate patient care to unprecedented levels of precision and efficacy by employing cutting-edge methodologies.

Keywords: data mining, healthcare, patient care, predictive analytics, precision medicine, electronic health records, machine learning, predictive modeling, disease prognosis, risk stratification, treatment efficacy, genetic profiles, precision health

Procedia PDF Downloads 43
907 Lessons from Farmers Performing Agroforestry for Reclamation of Gold Mine Spoils in Colombia

Authors: Bibiana Betancur-Corredor, Juan Carlos Loaiza, Manfred Denich, Christian Borgemeister

Abstract:

Alluvial gold mining generates a vast amount of deposits that cover the natural soil and negatively impact riverbeds and valleys, causing loss of livelihood opportunities for farmers in these regions. In Colombia, more than 79,000 ha are affected by alluvial gold mining; developing strategies to return this land to productivity is therefore of crucial importance for the country. A novel restoration strategy has been created by a mining company, where the land is restored through the establishment of agroforestry systems, in which agricultural crops and livestock are combined to complement reforestation in the area. The purpose of this study is to capture the knowledge of farmers who perform agroforestry in areas with deposits created by alluvial gold mining activities. Semi-structured interviews were conducted with farmers on the following topics: indicators of soil fertility, management practices, soil heterogeneity, pest outbreaks and weeds. In order to compare farmers’ perceptions of soil fertility with the physicochemical properties of the soils, the farmers were asked to identify spots within their farms that have exhibited good and poor yields, and soil samples were collected from these spots. The findings suggest that the main challenge that farmers face is the identification of fertile soil for crop establishment. They identify fertile soil by visually analyzing soil color and compaction and by using the spontaneous growth of specific plants as an indicator of soil fertility. For less fertile areas, nitrogen-fixing plants are used as green manure to restore soil fertility for crop establishment. The findings of this study imply that if gold mining is followed by reclamation practices that involve the successful establishment of productive farmlands, the agricultural productivity of these lands might improve, increasing the food security of the affected communities.

Keywords: agroforestry, knowledge, mining, restoration

Procedia PDF Downloads 222
906 Main Cause of Children's Deaths in Indigenous Wayuu Community from Department of La Guajira: A Research Developed through Data Mining Use

Authors: Isaura Esther Solano Núñez, David Suarez

Abstract:

The main purpose of this research is to discover what causes death in children of the Wayuu community, and to analyze those results in depth in order to take corrective measures to properly control infant mortality. We consider it important to determine the reasons for early death in this specific population, since its members are the most vulnerable to high-risk environmental conditions. In this way, the government, through the competent authorities, may develop prevention policies and the right measures to avoid an increase in these tragic deaths. The methodology used to develop this investigation is data mining, which consists of gathering and examining large amounts of data to produce new and valuable information. Through this technique it has been possible to determine that the child population is dying mostly from malnutrition. In short, this technique has been very useful for this study; it has allowed us to transform large amounts of information into a conclusive and important statement, which has made it easier to take appropriate steps to resolve a particular situation.

Keywords: malnutrition, data mining, analytical, descriptive, population, Wayuu, indigenous

Procedia PDF Downloads 150
905 Building an Integrated Relational Database from Swiss Nutrition National Survey and Swiss Health Datasets for Data Mining Purposes

Authors: Ilona Mewes, Helena Jenzer, Farshideh Einsele

Abstract:

Objective: The objective of the study was to integrate two large databases, from the Swiss national nutrition survey (menuCH) and the Swiss national health survey 2012, for data mining purposes. Each database contains demographic base data. An integrated Swiss database is built to later discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied to food consumption. Design: The Swiss national nutrition survey (menuCH), with approx. 2000 respondents from two different surveys, one by phone and the other by questionnaire, and the Swiss national health survey 2012, with 21500 respondents, were pre-processed, cleaned and finally integrated into a single relational database. Results: The result of this study is an integrated relational database built from the Swiss nutritional and health databases.

Keywords: health informatics, data mining, nutritional and health databases, nutritional and chronical databases

Procedia PDF Downloads 100