Search results for: Data Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24815

24245 A Framework for Event-Based Monitoring of Business Processes in the Supply Chain Management of Industry 4.0

Authors: Johannes Atug, Andreas Radke, Mitchell Tseng, Gunther Reinhart

Abstract:

In modern supply chains, large numbers of SKUs (stock-keeping units) must be managed in a timely manner, and any delay in noticing disruptions to items often limits the ability to mitigate the impact on customer order fulfillment. However, in supply chains of IoT-connected enterprises, the ERP (Enterprise-Resource-Planning), MES (Manufacturing-Execution-System) and SCADA (Supervisory-Control-and-Data-Acquisition) systems generate large amounts of data, which can generally give much earlier notice of deviations in the business process steps. Analyzing these streams of data with process mining techniques allows the supply chain business processes to be monitored, and thus items that deviate from the standard order fulfillment process to be identified. In this paper, a framework to enable event-based SCM (Supply-Chain-Management) processes, based on the RAMI (Reference-Architecture-Model for Industrie 4.0) architecture, is presented together with an overview of core enabling technologies. The application of this framework in industry is presented, and implications for SCM in Industry 4.0 and further research are outlined.
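
The idea of flagging items that deviate from the standard order fulfillment process can be illustrated with a minimal sketch. It is not the framework from the paper: the step names, the event tuple layout and the function names below are all assumptions made for illustration, and real process mining tools use far richer conformance checks.

```python
from collections import defaultdict

# Hypothetical standard order-fulfillment sequence (an assumption, not from the paper).
STANDARD_STEPS = ["order_received", "picked", "packed", "shipped"]


def group_traces(events):
    """Group (item_id, timestamp, step) events into time-ordered step lists per item."""
    traces = defaultdict(list)
    for item_id, _timestamp, step in sorted(events, key=lambda e: (e[0], e[1])):
        traces[item_id].append(step)
    return dict(traces)


def find_deviations(events, standard=STANDARD_STEPS):
    """Return the ids of items whose observed steps are not a prefix of the standard sequence."""
    deviating = []
    for item_id, steps in group_traces(events).items():
        if steps != standard[:len(steps)]:
            deviating.append(item_id)
    return sorted(deviating)


# Illustrative event stream: SKU-2 skips the "picked" step.
events = [
    ("SKU-1", 1, "order_received"), ("SKU-1", 2, "picked"), ("SKU-1", 3, "packed"),
    ("SKU-2", 1, "order_received"), ("SKU-2", 2, "packed"),
]
print(find_deviations(events))
```

An item still in progress is not flagged as long as its steps so far follow the standard order, which is why the check compares against a prefix rather than the full sequence.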

Keywords: cyber-physical production systems, event-based monitoring, supply chain management, RAMI (Reference-Architecture-Model for Industrie 4.0)

Procedia PDF Downloads 214
24244 Experience Modularization for New Value of Evanescent Cultural Communities: Developing Creative Tourism Services in Bangkok

Authors: Wuttigrai Ngamsirijit

Abstract:

Creative tourism is an ongoing development in many countries, pursued as an attempt to move away from the serial reproduction of culture and to revive the culture. However, in destinations with diverse and promising cultural resources, it can be unclear how to create new tourism services. This paper presents how tourism experiences are modularized and consolidated in order to form new creative tourism service offerings in evanescent cultural communities of Bangkok, Thailand. The benefits of data mining in accommodating value co-creation are discussed, and the implications of experience modularization for national creative tourism policy are addressed.

Keywords: co-creation, creative tourism, new service design, experience modularization

Procedia PDF Downloads 347
24243 Comparison of Different Methods of Microorganism's Identification from a Copper Mining in Pará, Brazil

Authors: Louise H. Gracioso, Marcela P.G. Baltazar, Ingrid R. Avanzi, Bruno Karolski, Luciana J. Gimenes, Claudio O. Nascimento, Elen A. Perpetuo

Abstract:

Introduction: Higher copper concentrations exert a selection pressure on organisms such as plants, fungi and bacteria, allowing only resistant organisms to survive at the contaminated site. This selective pressure retains only the organisms most resistant to a specific condition and subsequently increases their bioremediation potential. Despite the importance of bacteria for biosphere maintenance, it is estimated that only a small fraction of living microbial species has been described and characterized. With the development of molecular biology, tools based on the analysis of 16S ribosomal RNA or another specific gene are opening a new scenario for the characterization and identification of microorganisms in the environment. New methods for identifying microorganisms have also emerged, such as Biotyper (MALDI/TOF), a mass spectrometry method that relies on recognizing spectral patterns of conserved, characteristic proteins of different microbial species. In view of this, this study aimed to isolate copper-resistant bacteria present in a copper processing area (Sossego Mine, Canaan, PA) and identify them by two different methods: a recent one (mass spectrometry) and a conventional one (16S gene sequencing), with a view to using them in future bioremediation of this mine. Material and Methods: Samples were collected at fifteen different sites over five time periods. Microorganisms were isolated from mining wastes by a culture enrichment technique; this procedure was repeated 4 times. The isolates were inoculated into MJS medium containing different concentrations of copper chloride (1 mM, 2.5 mM, 5 mM, 7.5 mM and 10 mM) and incubated in plates for 72 h at 28 ºC. These isolates were subjected to mass spectrometry identification (Biotyper – MALDI/TOF) and 16S gene sequencing.
Results: A total of 105 strains were isolated in this area. Bacterial identification by the mass spectrometry method (MALDI/TOF) achieved 74% agreement with the conventional identification method (16S); 31% of identifications were unsuccessful with MALDI-TOF, and 2% did not yield a 16S identification sequence. These results show that Biotyper can be a very useful tool for identifying bacteria isolated from environmental samples, since it offers better value for money (sample preparation is cheap and simple, and MALDI plates are reusable). Furthermore, this technique is more cost-effective because it saves time and has high throughput (the mass spectra are compared against the database, which takes less than 2 minutes per sample).

Keywords: copper mining area, bioremediation, microorganisms, identification, MALDI/TOF, RNA 16S

Procedia PDF Downloads 360
24242 Developing Sustainable Tourism Practices in Communities Adjacent to Mines: An Exploratory Study in South Africa

Authors: Felicite Ann Fairer-Wessels

Abstract:

There has always been a disparity between mining and tourism, mainly due to the socio-economic and environmental impacts of mines on both the adjacent resident communities and the areas taken up by the mining operation. Although heritage mining tourism has been actively and successfully pursued and developed in the UK, largely in Wales, and in Scandinavian countries, the debate over whether active mining and tourism can have a mutually beneficial relationship remains open. This pilot study explores the relationship between the ‘to be developed’ future Nokeng Mine and its adjacent community, the rural community of Moloto, in terms of whether sustainable tourism and livelihood activities can potentially be developed with the support of the mine. Concepts such as social entrepreneur, corporate social responsibility, sustainable development and triple bottom line are discussed. Within the South African context as a mineral-rich developing country, the government has a statutory obligation to empower disenfranchised communities through social and labour plans and policies. All South African mines must have a Social and Labour Plan according to the Mineral and Petroleum Resources Development Act, No 28 of 2002. The ‘social’ component refers to the ‘social upliftment’ of communities within or adjacent to any mine, whereas the ‘labour’ component refers to the mine workers sourced from the specific community. A qualitative methodology is followed, using the case study as research instrument for the Nokeng Mine and Moloto community with interviews and focus group discussions. The target population comprised the Moloto Tribal Council members (8 in-depth interviews), the Moloto community members (17, in focus groups), and the Nokeng Mine representatives (4 in-depth interviews).
In this pilot study two disparate ‘worlds’ are potentially linked: on the one hand, the mine as social entrepreneur that is searching for feasible and sustainable ideas; and on the other hand, the community adjacent to the mine, with potentially sustainable tourism entrepreneurs who can tap into the resources of the mine to build their businesses, should their ideas be feasible. As this is an exploratory study, the findings are limited, but they indicate that the possible success of tourism and sustainable livelihood activities lies in the fact that both the mine and the community are keen to work together: the mine in terms of obtaining labour and profit, and the community in terms of improved and sustainable social and economic conditions, with both parties realizing the importance of mitigating negative environmental impacts. In conclusion, a relationship of trust is imperative between a mine and a community before a long-term liaison is possible. However, whether tourism is a viable solution for the community to engage in is debatable. The community could initially pursue the sustainable livelihoods approach instead and focus on life-supporting activities such as building and gardening that, once established, could feed into possible sustainable tourism activities.

Keywords: community development, mining tourism, sustainability, South Africa

Procedia PDF Downloads 280
24241 Assessment of Chromium Concentration and Human Health Risk in the Steelpoort River Sub-Catchment of the Olifants River Basin, South Africa

Authors: Abraham Addo-Bediako

Abstract:

Many freshwater ecosystems are facing immense pressure from anthropogenic activities such as agriculture, industry and mining. Trace metal pollution in freshwater ecosystems has become an issue of public health concern due to its toxicity and persistence in the environment. Trace elements pose a serious risk not only to the environment and aquatic biota but also to humans. Chromium is one such trace element, and its pollution of surface waters and groundwaters represents a serious environmental problem. In South Africa, agricultural, mining, industrial and domestic wastes are the main contributors to chromium discharge into rivers. The common forms of chromium are chromium (III) and chromium (VI). The latter is the more toxic because it can cause damage to human health. The aim of the study was to assess chromium contamination in the water and sediments of two rivers in the Steelpoort River sub-catchment of the Olifants River Basin, South Africa, and the associated human health risk. The concentration of Cr was analyzed using inductively coupled plasma–optical emission spectrometry (ICP-OES). The concentration of the metal was found to exceed the threshold limit, mainly in areas of high human activity. The hazard quotient for ingestion exposure did not exceed the threshold limit of 1 for adults or children, and the computed cancer risk for adults and children did not exceed the threshold limit of 10⁻⁴. Thus, there is currently no potential health risk from chromium through ingestion of drinking water. However, with increasing human activities, especially mining, the concentration could increase and become harmful to humans who depend on rivers for drinking water. It is recommended that proper management strategies be adopted to minimize the impact of chromium on the rivers and that water from the rivers be properly treated before domestic use.
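
The two screening quantities named in the abstract, the hazard quotient (threshold 1) and the cancer risk (threshold 10⁻⁴), are commonly computed from an average daily dose in the way sketched below. This follows the widely used ingestion formulation and is only a sketch: the abstract does not state which exposure parameters the study used, so every numeric input in the example is a placeholder assumption, as are the function and parameter names.

```python
def average_daily_dose(conc_mg_per_l, intake_l_per_day, exposure_days_per_year,
                       exposure_years, body_weight_kg, averaging_days):
    """Average daily dose by ingestion, in mg per kg body weight per day."""
    total_intake = conc_mg_per_l * intake_l_per_day * exposure_days_per_year * exposure_years
    return total_intake / (body_weight_kg * averaging_days)


def hazard_quotient(dose, reference_dose):
    """Non-carcinogenic screening value; values above 1 indicate potential concern."""
    return dose / reference_dose


def cancer_risk(dose, slope_factor):
    """Incremental lifetime cancer risk; compared with 1e-4 in the abstract."""
    return dose * slope_factor


# Placeholder inputs (assumptions for illustration, not values from the study).
dose = average_daily_dose(
    conc_mg_per_l=0.01, intake_l_per_day=2.0, exposure_days_per_year=365,
    exposure_years=30, body_weight_kg=70.0, averaging_days=365 * 30,
)
hq = hazard_quotient(dose, reference_dose=0.003)  # placeholder reference dose
print("hazard quotient below threshold:", hq < 1.0)
```

Children are usually assessed separately with a lower body weight and intake rate, which is why the abstract reports results for adults and children independently.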

Keywords: land use, health risk, metal pollution, water quality

Procedia PDF Downloads 64
24240 Three-Stage Mining Metals Supply Chain Coordination and Product Quality Improvement with Revenue Sharing Contract

Authors: Hamed Homaei, Iraj Mahdavi, Ali Tajdin

Abstract:

One of the main concerns of miners is to increase the quality level of their products, because the price of mined metals depends on their quality level; however, increasing the quality level of these products incurs different costs at different levels of the supply chain. These costs usually increase after the extractor level. This paper studies the coordination of a decentralized three-level supply chain with one supplier (extractor), one mineral processor and one manufacturer, in which the cost of increasing the product quality level is higher at the processor level than at the supplier level, and higher at the manufacturer level than at the processor level. We identify the optimal product quality level for each supply chain member by designing a revenue sharing contract. Finally, numerical examples show that the designed contract not only increases the final product quality level but also provides a win-win condition for all supply chain members and increases the profit of the whole supply chain.

Keywords: three-stage supply chain, product quality improvement, channel coordination, revenue sharing

Procedia PDF Downloads 169
24239 Automatic Lexicon Generation for Domain Specific Dataset for Mining Public Opinion on China Pakistan Economic Corridor

Authors: Tayyaba Azim, Bibi Amina

Abstract:

The growing popularity of opinion mining, together with the rapid growth in the availability of social networks, has opened up many opportunities for research in the various domains of Sentiment Analysis and Natural Language Processing (NLP) using Artificial Intelligence approaches. The latest trend allows the public to actively use the internet to analyze individuals’ opinions and explore the effectiveness of published facts. The main theme of this research is to gauge public opinion on one of the most crucial and extensively discussed development projects, the China Pakistan Economic Corridor (CPEC), considered a game changer due to its promise of bringing economic prosperity to the region. So far, to the best of our knowledge, the theme of CPEC has not been analyzed for sentiment determination through a machine learning (ML) approach. This research aims to demonstrate the use of ML approaches to automatically analyze public sentiment in Twitter tweets, particularly about CPEC. A Support Vector Machine (SVM) is used for the classification task, classifying tweets into positive, negative and neutral classes. Word2vec and TF-IDF features are used with the SVM model, and the model trained on manually labelled tweets is compared with the one trained on the automatically generated lexicon. The contributions of this work are: the development of a sentiment analysis system for public tweets on the subject of CPEC; the automatic generation of a lexicon from public tweets on CPEC; and the identification of different themes among the tweets, with sentiments assigned to each theme. It is worth noting that applications of web mining that empower e-democracy by improving political transparency and public participation in decision making via social media have not yet been explored or practised in the Pakistan region with respect to CPEC.
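
As a rough illustration of the TF-IDF features mentioned above, the following sketch computes one common TF-IDF variant in plain Python. It is not the authors' pipeline: the tokenized example tweets are invented, the function name is hypothetical, and the SVM classification step is omitted entirely.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / document frequency);
    other weighting variants exist and may differ from the one used in the paper.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Invented, already-tokenized example tweets (assumption for illustration).
tweets = [["cpec", "brings", "jobs"], ["cpec", "raises", "debt"]]
for vector in tf_idf(tweets):
    print(vector)
```

A term that occurs in every tweet, such as the project name itself, receives a weight of zero under this variant, so the features emphasize the words that distinguish one tweet from another.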

Keywords: machine learning, natural language processing, sentiment analysis, support vector machine, Word2vec

Procedia PDF Downloads 130
24238 Patterns in Fish Diversity and Abundance of an Abandoned Gold Mine Reservoirs

Authors: O. E. Obayemi, M. A. Ayoade, O. O. Komolafe

Abstract:

A fish survey was carried out over an annual cycle covering both rainy and dry seasons, using cast nets, gill nets and traps at two different reservoirs. The objective was to examine the fish assemblages of the reservoirs and provide additional information on the reservoirs. The fish fauna of the reservoirs comprised twelve species from six families. The results of the study also showed that five species of fish were caught in reservoir five, while ten fish species were captured in reservoir six. Species such as Malapterurus electricus, Ctenopoma kingsleyae, Mormyrus rume, Parachanna obscura, Sarotherodon galilaeus, Tilapia mariae, C. guntheri, Clarias macromystax, Coptodon zilii and Clarias gariepinus were caught during the sampling period. There was a significant difference (p=0.014, t = 1.711) in the abundance of fish species between the two reservoirs. Seasonally, neither reservoir five (p=0.221, t = 1.859) nor reservoir six (p=0.453, t = 1.734) showed a significant difference in its fish populations. Also, despite the impact of gold mining, the diversity indices were high when compared with less disturbed waterbodies. The study concluded that the environments recorded low fish abundance, which suggests an influence of mining on the abundance and diversity of fish species.
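
The diversity indices listed in the keywords of this entry (Shannon-Wiener, Simpson and Pielou) can be computed from species counts as in the sketch below. The catch numbers are invented for illustration, and the Simpson index is given in its 1 − Σp² form, which may differ from the variant the authors used.

```python
import math


def proportions(counts):
    """Relative abundance of each species, ignoring species with zero individuals."""
    present = [c for c in counts if c > 0]
    total = sum(present)
    return [c / total for c in present]


def shannon_wiener(counts):
    """H' = -sum(p * ln p) over the species present."""
    return -sum(p * math.log(p) for p in proportions(counts))


def simpson(counts):
    """Simpson diversity in the form 1 - sum(p^2)."""
    return 1.0 - sum(p * p for p in proportions(counts))


def pielou(counts):
    """Evenness J = H' / ln(S), where S is the number of species present."""
    richness = len(proportions(counts))
    if richness < 2:
        return 0.0  # evenness is not meaningful for fewer than two species
    return shannon_wiener(counts) / math.log(richness)


# Invented catch counts per species for one reservoir (assumption for illustration).
catch = [40, 25, 20, 10, 5]
print(shannon_wiener(catch), simpson(catch), pielou(catch))
```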

Keywords: Igun, fish, Shannon-Wiener Index, Simpson index, Pielou index

Procedia PDF Downloads 75
24237 The Structure and Function Investigation and Analysis of the Automatic Spin Regulator (ASR) in the Powertrain System of Construction and Mining Machines with the Focus on Dump Trucks

Authors: Amir Mirzaei

Abstract:

The powertrain system is one of the most basic and essential components of a machine; motion is practically impossible without it. When power is generated by the engine, it is transmitted by the powertrain system to the wheels, which are the last parts of the system. The powertrain system has different components depending on the type of use and design. When the force generated by the engine reaches the wheels, the frictional force between the tire and the ground determines the amount of traction, and thus whether and how much the wheels slip. On various surfaces, such as icy, muddy, and snow-covered ground, the friction coefficient between the tire and the ground decreases dramatically, which in turn increases force loss and drastically reduces vehicle traction. This condition is caused by slipping, which, in addition to wasting the energy produced, causes premature wear of the driving tires. It also causes the temperature of the transmission oil to rise excessively, which degrades and contaminates the oil and reduces the useful life of the clutch disks and plates inside the transmission. This issue is much more important in road construction and mining machinery than in passenger vehicles, and overcoming it is always one of the most significant issues in design. One method of doing so is the automatic spin regulator system, abbreviated as ASR. This method, through its structure and function, has solved one of the biggest challenges of the powertrain system in the field of construction and mining machinery, and it is the subject of this research.

Keywords: automatic spin regulator, ASR, methods of reducing slipping, methods of preventing the reduction of the useful life of clutches disk and plate, methods of preventing the premature dirtiness of transmission oil, method of preventing the reduction of the useful life of tires

Procedia PDF Downloads 64
24236 Geochemical Baseline and Origin of Trace Elements in Soils and Sediments around Selibe-Phikwe Cu-Ni Mining Town, Botswana

Authors: Fiona S. Motswaiso, Kengo Nakamura, Takeshi Komai

Abstract:

Heavy metals may occur naturally in rocks and soils, but elevated quantities of them are being gradually released into the environment by anthropogenic activities such as mining. In order to address issues of heavy metal water and soil pollution, a distinction needs to be made between natural and anthropogenic anomalies. The current study aims at characterizing the spatial distribution of trace elements, evaluating site-specific geochemical background concentrations of trace elements in the mine soils examined, and discriminating between lithogenic and anthropogenic sources of enrichment around a copper-nickel mining town in Selibe-Phikwe, Botswana. A total of 20 soil samples, 11 river sediment samples, and 9 river water samples were collected from an area of 625 m² within the precincts of the mine and the smelter. The concentrations of metals (Cu, Ni, Pb, Zn, Cr, Mn, As, and Co) were determined using ICP-MS after digestion with aqua regia. Major elements were also determined using ED-XRF. Water pH and EC were measured and recorded on site, while soil pH and EC were determined in the laboratory after water elution tests. The highest Cu and Ni concentrations in soil are 593 mg/kg and 453 mg/kg respectively, which is 3 times higher than the crustal composition values and 2 times higher than the South African minimum allowable levels of heavy metals in soils. The level of copper contamination was higher than that of nickel and other contaminants. Water pH levels ranged from basic (9) to very acidic (3) in areas closer to the mine/smelter. There is high variation in heavy metal concentrations, e.g., for Cu, suggesting that some sites reflect regional natural background concentrations while others reflect anthropogenic sources.

Keywords: contamination, geochemical baseline, heavy metals, soils

Procedia PDF Downloads 136
24235 Sentiment Analysis of Ensemble-Based Classifiers for E-Mail Data

Authors: Muthukumarasamy Govindarajan

Abstract:

Detecting unwanted, unsolicited mail, known as spam, in email is an interesting area of research. It is necessary to evaluate the performance of any new spam classifier using standard data sets. Recently, ensemble-based classifiers have gained popularity in this domain. In this research work, an efficient email filtering approach based on ensemble methods is proposed for developing an accurate and sensitive spam classifier. The proposed approach employs Naive Bayes (NB), Support Vector Machine (SVM) and Genetic Algorithm (GA) as base classifiers along with different ensemble methods. The experimental results show that the ensemble classifier achieved higher accuracy than the individual classifiers, and that the hybrid model results are better than those of the combined models for the e-mail dataset. The proposed ensemble-based classifiers perform well in terms of classification accuracy, which is considered an important criterion for building a robust spam classifier.
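
To make the idea of combining base classifiers concrete, here is a minimal majority-vote sketch. Plain majority voting is swapped in as the simplest combination rule; the abstract does not say which rule the authors used (its keywords mention bagging and arcing), and the per-classifier predictions below are invented.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers into one list by majority vote.

    `predictions` holds one list of labels per classifier, all of equal length.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


# Invented predictions from three hypothetical base classifiers for three emails.
nb_labels = ["spam", "ham", "ham"]
svm_labels = ["spam", "ham", "spam"]
ga_labels = ["ham", "spam", "ham"]

print(majority_vote([nb_labels, svm_labels, ga_labels]))
```

With an odd number of classifiers and two classes, a tie cannot occur, which keeps the combination rule unambiguous.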

Keywords: accuracy, arcing, bagging, genetic algorithm, Naive Bayes, sentiment mining, support vector machine

Procedia PDF Downloads 121
24234 Risk Assessment of Trace Metals in the Soil Surface of an Abandoned Mine, El-Abed Northwestern Algeria

Authors: Farida Mellah, Abdelhak Boutaleb, Bachir Henni, Dalila Berdous, Abdelhamid Mellah

Abstract:

Context/Purpose: El Abed, one of the largest lead and zinc mining operations in northwestern Algeria for more than thirty years, is now an abandoned mine that has been inactive since 2004, leaving large amounts of accumulated mining waste exposed to wind, erosion and rain, close to agricultural land. Materials & Methods: This study aims to verify the concentrations and sources of heavy metals in randomly collected surface soil samples. Chemical analyses were performed with an iCAP 7000 Series ICP optical emission spectrometer. A set of environmental quality indicators was applied: the enrichment factor, calculated with iron and aluminum as reference elements, and the geoaccumulation index, with the spatial distribution mapped in a geographic information system (GIS). Results: The average metal concentrations were As = 30.82, Pb = 1219.27, Zn = 2855.94 and Cu = 5.3 mg/kg; based on these results, all metals except Cu exceeded the GBV for the Earth's crust. Environmental quality indicators were calculated from the concentrations of trace metals such as lead, arsenic, zinc, copper, iron and aluminum. Interpretation: This study investigated the concentrations and sources of trace metals. Using quality indicators and statistical methods, lead, zinc and arsenic were attributed to anthropogenic sources, while copper was of natural origin. Spatial analysis in GIS identified many hot spots in the El-Abed region. Conclusion: These results could help in the development of future treatment strategies aimed primarily at eliminating materials from mining waste.
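
The two geochemical indices named in this abstract are usually defined as shown in the sketch below: the enrichment factor normalizes a metal against a reference element such as iron or aluminum, and the geoaccumulation index compares a concentration with 1.5 times its background. The Pb mean is taken from the abstract, but the background and reference-element values are placeholder assumptions, since the abstract does not report them.

```python
import math


def enrichment_factor(metal_sample, ref_sample, metal_background, ref_background):
    """EF = (metal / reference element) in the sample over the same ratio in the background."""
    return (metal_sample / ref_sample) / (metal_background / ref_background)


def geoaccumulation_index(metal_sample, metal_background):
    """Igeo = log2(C / (1.5 * B)); the factor 1.5 allows for natural background variation."""
    return math.log2(metal_sample / (1.5 * metal_background))


pb_sample = 1219.27      # mg/kg, mean Pb concentration reported in the abstract
pb_background = 20.0     # mg/kg, placeholder background value (assumption)
fe_sample = 30000.0      # mg/kg, placeholder reference-element value (assumption)
fe_background = 35000.0  # mg/kg, placeholder reference-element background (assumption)

print(enrichment_factor(pb_sample, fe_sample, pb_background, fe_background))
print(geoaccumulation_index(pb_sample, pb_background))
```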

Keywords: soil contamination, trace metals, geochemical indices, El Abed mine, Algeria

Procedia PDF Downloads 53
24233 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data is one of today’s cutting-edge technologies. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their rivals. Among the many areas of Big Data usage, logistics plays a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 625
24232 Quantum Statistical Machine Learning and Quantum Time Series

Authors: Omar Alzeley, Sergey Utev

Abstract:

Minimizing a constrained multivariate function is fundamental to machine learning, and such optimization algorithms are at the core of data mining and data visualization techniques. The decision function that maps input points to output points is based on the result of the optimization, which is central to learning theory. Time series analysis is one approach to complex systems in which the dynamics of the system are inferred from a statistical analysis of the fluctuations over time of some associated observable. The purpose of this paper is a mathematical transition from the autoregressive model of classical time series to the matrix formalization of quantum theory. Firstly, we propose a quantum time series model (QTS). Although the Hamiltonian technique has become an established tool for detecting deterministic chaos, other approaches are emerging. The quantum probabilistic technique is used to motivate the construction of our QTS model. The QTS model resembles the quantum dynamic model which was applied to financial data. Secondly, various statistical methods, including machine learning algorithms such as the Kalman filter algorithm, are applied to estimate and analyse the unknown parameters of the model. Finally, simulation techniques such as Markov chain Monte Carlo have been used to support our investigations. The proposed model has been examined using real and simulated data. We establish the relation between quantum statistical machine learning and quantum time series via random matrix theory. It is interesting to note that the primary focus of the application of QTS in the field of quantum chaos was to find a model that explains chaotic behaviour. This model may offer further insight into quantum chaos.

Keywords: machine learning, simulation techniques, quantum probability, tensor product, time series

Procedia PDF Downloads 448
24231 Visual Template Detection and Compositional Automatic Regular Expression Generation for Business Invoice Extraction

Authors: Anthony Proschka, Deepak Mishra, Merlyn Ramanan, Zurab Baratashvili

Abstract:

Small and medium-sized businesses receive over 160 billion invoices every year. Since these documents exhibit many subtle differences in layout and text, automatically extracting structured fields such as sender name, amount, and VAT rate from them is an open research question. In this paper, existing work in template-based document extraction is extended, and a system is devised that is able to reliably extract all required fields for up to 70% of all documents in the data set, more than any previously reported method. Approaches are described for 1) detecting through visual features which template a given document belongs to, 2) automatically generating extraction rules for a given new template by composing regular expressions from multiple components, and 3) computing confidence scores that indicate the accuracy of the automatic extractions. The system can generate templates from as few as one training sample and requires only the ground truth field values instead of detailed annotations such as bounding boxes, which are hard to obtain. The system is deployed and used inside a commercial accounting software package.
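
Step 2 above, composing an extraction rule from reusable regular-expression components, might look roughly like the following sketch. The component patterns, the `compose_rule` helper and the sample invoice text are all hypothetical; the paper's actual components and composition procedure are not described in the abstract.

```python
import re

# Hypothetical reusable components, each capturing one named field.
COMPONENTS = {
    "amount": r"(?P<amount>\d+[.,]\d{2})",
    "currency": r"(?P<currency>EUR|USD)",
}


def compose_rule(label, *component_names):
    """Build one compiled rule: a literal label followed by the chosen components."""
    parts = [COMPONENTS[name] for name in component_names]
    return re.compile(re.escape(label) + r"\s*:?\s*" + r"\s*".join(parts))


rule = compose_rule("Total", "amount", "currency")
match = rule.search("Invoice 17 Total: 119,00 EUR")
print(match.groupdict() if match else None)
```

Keeping each field in a named group means a rule generated for a new template returns its results in the same shape as every other rule, regardless of how many components were combined.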

Keywords: data mining, information retrieval, business, feature extraction, layout, business data processing, document handling, end-user trained information extraction, document archiving, scanned business documents, automated document processing, F1-measure, commercial accounting software

Procedia PDF Downloads 109
24230 From Text to Data: Sentiment Analysis of Presidential Election Political Forums

Authors: Sergio V Davalos, Alison L. Watkins

Abstract:

User generated content (UGC), such as a website post, has data associated with it: time of the post, gender, location, type of device, and number of words. The text entered in UGC can provide a valuable dimension for analysis. In this research, each user post is treated as a collection of terms (words). In addition to the number of words per post, the frequency of each term is determined per post and as the sum of occurrences across all posts. This research focuses on one specific aspect of UGC: sentiment. Sentiment analysis (SA) was applied to the content (user posts) of two sets of political forums related to the US presidential elections of 2012 and 2016. Sentiment analysis derives data from the text, which enables the subsequent application of data analytic methods. The SASA (SAIL/SAI Sentiment Analyzer) model was used for sentiment analysis. The application of SASA resulted in a sentiment score for each post. Based on the sentiment scores for the posts, there are significant differences between the content and sentiment of the two sets of forums for the 2012 and 2016 presidential elections. In the 2012 forums, 38% of the forums started with positive sentiment and 16% with negative sentiment. In the 2016 forums, 29% started with positive sentiment and 15% with negative sentiment. There were also changes in sentiment over time. For both elections, as the election got closer, the cumulative sentiment score became negative. The candidate who won each election appeared in more posts than the losing candidates. In the case of Trump, there were more negative posts than Clinton’s highest number of posts, which were positive. KNIME topic modeling was used to derive topics from the posts. There were also changes in topics and keyword emphasis over time. Initially, the political parties were the most referenced, and as the election got closer the emphasis changed to the candidates.
The SASA method predicted sentiment better than four other methods in Sentibench. The research resulted in deriving sentiment data from text. In combination with other data, the sentiment data provided insight and discovery about user sentiment in the US presidential elections of 2012 and 2016.
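
The term counting described in this abstract, with frequencies per post and summed across all posts, can be sketched in a few lines. The posts are invented, the whitespace tokenizer is a simplifying assumption, and the sentiment scoring with SASA is not reproduced here.

```python
from collections import Counter


def term_frequencies(posts):
    """Return (per-post term counts, total term counts across all posts)."""
    per_post = [Counter(post.lower().split()) for post in posts]
    total = sum(per_post, Counter())
    return per_post, total


# Invented forum posts (assumption for illustration).
posts = ["Vote early vote often", "Early results favor the incumbent"]
per_post, total = term_frequencies(posts)
print(len(posts[0].split()), per_post[0]["vote"], total["early"])
```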

Keywords: sentiment analysis, text mining, user generated content, US presidential elections

Procedia PDF Downloads 167
24229 Internet of Things, Edge and Cloud Computing in Rock Mechanical Investigation for Underground Surveys

Authors: Esmael Makarian, Ayub Elyasi, Fatemeh Saberi, Olusegun Stanley Tomomewo

Abstract:

Rock mechanical investigation is one of the most crucial activities in underground operations, especially in surveys related to hydrocarbon exploration and production, geothermal reservoirs, energy storage, mining, and geotechnics. There is a wide range of traditional methods for driving, collecting, and analyzing rock mechanics data. However, these approaches may not be suitable or work perfectly in some situations, such as fractured zones. Cutting-edge technologies have been provided to solve and optimize the mentioned issues. Internet of Things (IoT), Edge, and Cloud Computing technologies (ECt & CCt, respectively) are among the most widely used and new artificial intelligence methods employed for geomechanical studies. IoT devices act as sensors and cameras for real-time monitoring and mechanical-geological data collection of rocks, such as temperature, movement, pressure, or stress levels. Structural integrity, especially for cap rocks within hydrocarbon systems, and rock mass behavior assessment, to further activities such as enhanced oil recovery (EOR) and underground gas storage (UGS), or to improve safety risk management (SRM) and potential hazards identification (P.H.I), are other benefits from IoT technologies. EC techniques can process, aggregate, and analyze data immediately collected by IoT on a real-time scale, providing detailed insights into the behavior of rocks in various situations (e.g., stress, temperature, and pressure), establishing patterns quickly, and detecting trends. Therefore, this state-of-the-art and useful technology can adopt autonomous systems in rock mechanical surveys, such as drilling and production (in hydrocarbon wells) or excavation (in mining and geotechnics industries). Besides, ECt allows all rock-related operations to be controlled remotely and enables operators to apply changes or make adjustments. It must be mentioned that this feature is very important in environmental goals. 
More often than not, rock mechanical studies combine different kinds of data, such as laboratory tests, field operations, and indirect information like seismic or well-logging data. CCt provides a useful platform for storing and managing large volumes of heterogeneous information, which can be very useful in fractured zones. Additionally, CCt supplies powerful tools for predicting, modeling, and simulating rock mechanical information, especially in fractured zones within vast areas. It is also a suitable means of sharing extensive information on rock mechanics, such as the direction and size of fractures in a large oil field or mine. The comprehensive review findings demonstrate that digital transformation through integrated IoT, Edge, and Cloud solutions is revolutionizing traditional rock mechanical investigation. These advanced technologies have enabled real-time monitoring, predictive analysis, and data-driven decision-making, culminating in noteworthy enhancements in safety, efficiency, and sustainability. By employing IoT, CCt, and ECt, underground operations have improved significantly, allowing for timely and informed actions based on real-time data insights. The successful implementation of IoT, CCt, and ECt has led to safer operations, optimized processes, and environmentally conscious approaches in underground geological endeavors.

Keywords: rock mechanical studies, internet of things, edge computing, cloud computing, underground surveys, geological operations

Procedia PDF Downloads 38
24228 Evaluation of the CRISP-DM Business Understanding Step: An Approach for Assessing the Predictive Power of Regression versus Classification for the Quality Prediction of Hydraulic Test Results

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Digitalisation in production technology is a driver for the application of machine learning methods. Predictive quality, the data-based prediction of product quality and states, offers great potential for reducing the quality control effort otherwise required. However, the serial use of machine learning applications is often prevented by various problems. Fluctuations occur in real production data sets, which are reflected in trends and systematic shifts over time. To counteract these problems, data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets to extract stable features. Successful process control of the target variables aims to centre the measured values around a mean and minimise variance. Competitive leaders claim to have mastered their processes. As a result, much of the real data has a relatively low variance. Training prediction models requires the highest possible generalisability, which this low-variance data makes more difficult to achieve. The implementation of a machine learning application can be interpreted as a production process. The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that describes the life cycle of data science. As in any process, the costs to eliminate errors increase significantly with each advancing process phase. For the quality prediction of hydraulic test steps of directional control valves, the question arises in the initial phase whether a regression or a classification is more suitable. In this work, the initial phase of CRISP-DM, business understanding, is critically examined for the use case at Bosch Rexroth, comparing regression and classification.
The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. Suitable methods are applied for regression of the leakage volume flow and for classification of the inspection decision. In this use case, classification is clearly superior to regression and achieves promising accuracies.

Keywords: classification, CRISP-DM, machine learning, predictive quality, regression

Procedia PDF Downloads 128
24227 Document-level Sentiment Analysis: An Exploratory Case Study of Low-resource Language Urdu

Authors: Ammarah Irum, Muhammad Ali Tahir

Abstract:

Document-level sentiment analysis in Urdu is a challenging Natural Language Processing (NLP) task due to the difficulty of working with lengthy texts in a language with constrained resources. Deep learning models, which are complex neural network architectures, are well-suited to text-based applications in addition to data formats like audio, image, and video. To investigate the potential of deep learning for Urdu sentiment analysis, we implemented five different deep learning models, including Bidirectional Long Short-Term Memory (BiLSTM), Convolutional Neural Network (CNN), Convolutional Neural Network with Bidirectional Long Short-Term Memory (CNN-BiLSTM), and Bidirectional Encoder Representations from Transformers (BERT). In this study, we developed a hybrid deep learning model called BiLSTM-Single Layer Multi Filter Convolutional Neural Network (BiLSTM-SLMFCNN) by fusing the BiLSTM and CNN architectures. The proposed and baseline techniques are applied to the Urdu Customer Support data set and the IMDB Urdu movie review data set using pre-trained Urdu word embeddings that are suitable for sentiment analysis at the document level. BiLSTM-SLMFCNN outperformed the baseline deep learning models, achieving 83%, 79%, and 83% accuracy on the small, medium, and large IMDB Urdu movie review data sets, respectively, and 94% accuracy on the Urdu Customer Support data set.

Keywords: urdu sentiment analysis, deep learning, natural language processing, opinion mining, low-resource language

Procedia PDF Downloads 49
24226 Copyright Clearance for Artificial Intelligence Training Data: Challenges and Solutions

Authors: Erva Akin

Abstract:

The use of copyrighted material for machine learning purposes is a challenging issue in the field of artificial intelligence (AI). While machine learning algorithms require large amounts of data to train and improve their accuracy and creativity, the use of copyrighted material without permission from the authors may infringe on their intellectual property rights. To overcome the copyright hurdles to data sharing, access, and re-use, the use of copyrighted material for machine learning purposes may be considered permissible under certain circumstances. For example, if the copyright holder has given permission to use the data through a licensing agreement, then the use for machine learning purposes may be lawful. It is also argued that copying for non-expressive purposes that do not involve conveying expressive elements to the public, such as automated data extraction, should not be seen as infringing. The focus of such ‘copy-reliant technologies’ is on understanding language rules, styles, and syntax, and no creative ideas are being used. However, the non-expressive use defense sits within the framework of the fair use doctrine, which allows the use of copyrighted material for research or educational purposes. Questions arise because the fair use doctrine is not available in EU law; instead, the InfoSoc Directive provides for a rigid system of exclusive rights with a list of exceptions and limitations. One could only argue that non-expressive uses of copyrighted material for machine learning purposes do not constitute a ‘reproduction’ in the first place. Nevertheless, the use of machine learning with copyrighted material is difficult because EU copyright law applies to the mere use of the works. Two solutions can be proposed to address the problem of copyright clearance for AI training data.
The first is to introduce a broad exception for text and data mining, either mandatorily or for commercial and scientific purposes, or to permit the reproduction of works for non-expressive purposes. The second is that copyright laws should permit the reproduction of works for non-expressive purposes, which opens the door to discussions regarding the transposition of the fair use principle from the US into EU law. Both solutions aim to provide more space for AI developers to operate and encourage greater freedom, which could lead to more rapid innovation in the field. The Data Governance Act presents a significant opportunity to advance these debates. Finally, issues concerning the balance of general public interests and legitimate private interests in machine learning training data must be addressed. In my opinion, it is crucial that robot-creation output should fall into the public domain. Machines depend on human creativity, innovation, and expression. To encourage technological advancement and innovation, freedom of expression and business operation must be prioritised.

Keywords: artificial intelligence, copyright, data governance, machine learning

Procedia PDF Downloads 66
24225 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of entity relation discovery, a well-covered topic in the literature, consists of searching within unstructured sources (typically, text) in order to find connections among entities. The entities can be a whole dictionary or a specific collection of named items. In many cases, machine learning and/or text mining techniques are used for this goal. These approaches might be infeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists of collecting the cooccurrences of any two words (entities) in order to create a graph of relations, a cooccurrence graph. Each cooccurrence indicates some degree of semantic correlation between the words, because related words are more likely to appear close to each other than on opposite sides of the text. Some authors have used sliding windows for this problem: they count all the cooccurrences within a sliding window running over the whole text. In this paper we generalise this technique into a Weighted-Distance Sliding Window, where each cooccurrence of two named items within the window is counted with a weight depending on the distance between the items: a closer distance implies stronger evidence of a relationship. We develop an experiment to support this intuition by applying the technique to a data set consisting of the text of the Bible, split into verses.
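The weighted-distance idea described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual weighting function, so the inverse-distance weight 1/d, the function name weighted_cooccurrences, and the rule that each pair of entity mentions is counted once (rather than once per window position) are all assumptions made for illustration only.

```python
from collections import defaultdict


def weighted_cooccurrences(tokens, entities, window=5, weight=lambda d: 1.0 / d):
    """Build a weighted cooccurrence graph as a dict {(entity_a, entity_b): weight}.

    Assumptions (not taken from the paper): two mentions cooccur when they are
    fewer than `window` tokens apart, and a pair at token distance d contributes
    weight(d), here 1/d by default.
    """
    graph = defaultdict(float)
    # Keep only the positions where a named item occurs.
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in entities]
    for a in range(len(positions)):
        i, u = positions[a]
        for b in range(a + 1, len(positions)):
            j, v = positions[b]
            distance = j - i
            if distance >= window:
                break  # later mentions are even farther away
            if u != v:
                # Undirected edge: store the pair in sorted order.
                graph[tuple(sorted((u, v)))] += weight(distance)
    return dict(graph)


verse = "moses spoke to aaron and then aaron answered moses".split()
edges = weighted_cooccurrences(verse, {"moses", "aaron"}, window=4)
```

Passing weight=lambda d: 1.0 recovers a plain, unweighted cooccurrence count over the same window, which makes it easy to compare the two schemes on the same text.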

Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance

Procedia PDF Downloads 131
24224 A Dynamic Solution Approach for Heart Disease Prediction

Authors: Walid Moudani

Abstract:

The healthcare environment is generally perceived as being information rich yet knowledge poor, and there is a lack of effective analysis tools to discover hidden relationships and trends in data. Valuable knowledge can be discovered by applying data mining techniques in healthcare systems. In this study, we present a methodology for extracting significant patterns from coronary heart disease data warehouses for the prediction of heart attacks, which unfortunately remain a leading cause of mortality worldwide. For this purpose, we propose to enumerate dynamically the optimal subsets of the reduced features of high interest by using the rough sets technique combined with dynamic programming. We then propose to validate the classification using Random Forest (RF) decision trees to identify the risky heart disease cases. This work is based on a large amount of data collected from several clinical institutions and derived from patients' medical profiles. Moreover, the experts’ knowledge in this field has been taken into consideration in order to define the disease and its risk factors and to establish significant knowledge relationships among the medical factors. A computer-aided system is developed for this purpose based on a population of 525 adults. The performance of the proposed model is analyzed and evaluated against a set of benchmark techniques applied to this classification problem.

Keywords: multi-classifier decisions tree, features reduction, dynamic programming, rough sets

Procedia PDF Downloads 394
24223 The Concentration of Natural Alpha Emitters Radionuclides in Fish and Their Contribution to the Internal Dose

Authors: Wagner Pereira, Alphonse Kelecom

Abstract:

Mining can impact the environment, and the major impact of some mining activities is radiological. In human populations, such impact is well studied and regulated. For biota, this assessment has always focused on protecting the human food chain. The protection of biota itself is a new approach that is still developing. In order to contribute to this new approach, fish were collected in areas of naturally occurring radioactive materials (NORM), where a uranium mine is in its decommissioning phase. The activity concentrations were analyzed, in Bq/kg wet weight, for uranium (Unat), Th-232 and Ra-226 in the lambari fish Astyanax bimaculatus L. (omnivorous fish) and in the traíra fish Hoplias malabaricus Bloch, 1794 (carnivorous fish). Seven composite samples (that is, a sufficient number of individuals to reach at least 2 kg of fresh weight) were collected every six months between 2013 and 2015. The mean activity concentrations (AC) for uranium ranged from 1.12 (lambari) to 0.60 (traíra). For Th-232, the means ranged from 0.30 to 0.05 (lambari and traíra, respectively). Finally, the Ra-226 means ranged between 0.08 and 0.03. No temporal trends of accumulation could be identified. The AC values of the radionuclides were systematically higher in the omnivorous fish than in the carnivorous one.

Keywords: biota dose, NORM, fish, environmental protection

Procedia PDF Downloads 234
24222 Analysis of Scholarly Communication Patterns in Korean Studies

Authors: Erin Hea-Jin Kim

Abstract:

This study aims to investigate scholarly communication patterns in Korean studies, a field that covers all aspects of Korea, including history, culture, literature, politics, society, economics, religion, and so on. It is called a ‘national study’ or ‘home study’ when the subject of the study is one's own country, whereas it is called an ‘area study’ when the subject of the study is another country, i.e., when it is conducted outside Korea. Understanding the structure of scholarly communication in Korean studies is important since the motivations, procedures, results, or outcomes of individual studies may be affected by the cooperative relationships that appear in the communication structure. To this end, we collected 1,798 articles with the (author or index) keyword ‘Korean’ published in 2018 from the Scopus database and extracted the institution and country of the authors using a text mining technique. A total of 96 countries, including South Korea, were identified. We then constructed a co-authorship network based on the countries identified. The indicators of social network analysis (SNA), co-occurrences, and cluster analysis were used to measure the activity and connectivity of participation in collaboration in Korean studies. The highest frequencies of collaboration appear in the following order: S. Korea with the United States (603), S. Korea with Japan (146), S. Korea with China (131), S. Korea with the United Kingdom (83), and China with the United States (65). This means that the most active participants are S. Korea and the USA. The highest ranks in the role of mediator, measured by betweenness centrality, appear in the following order: United States (0.165), United Kingdom (0.045), China (0.043), Japan (0.037), Australia (0.026), and South Africa (0.023). These results show that these countries contribute to connecting researchers in Korean studies. We found two major communities in the co-authorship network.
Asian countries and America belong to one community, and the United Kingdom and European countries belong to the other. Korean studies have a long history, having emerged since Japanese colonization. However, Korean studies have never been investigated by digital content analysis. The contributions of this study are an analysis of co-authorship in Korean studies from a global perspective based on digital content, which to our knowledge has not been attempted so far, and suggestions on how to analyze humanities disciplines such as history, literature, or Korean studies by text mining. The limitation of this study is that the scholarly data we collected did not cover all domestic journals because we only gathered scholarly data from Scopus. There are thousands of domestic journals not indexed in Scopus that could be considered in terms of national studies, but these could not be collected.
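The co-authorship analysis described in this abstract can be sketched in a few lines of Python. The sketch below counts country pairs per article and computes normalized betweenness centrality with Brandes' algorithm on the unweighted network. The function names, the toy input, and the choice to ignore edge weights when computing centrality are assumptions for illustration; the abstract does not state which tool or settings the authors used.

```python
from collections import Counter, deque
from itertools import combinations


def coauthorship_edges(papers):
    """Count how often each pair of countries appears on the same paper."""
    edges = Counter()
    for countries in papers:
        for a, b in combinations(sorted(set(countries)), 2):
            edges[(a, b)] += 1
    return edges


def betweenness(edges):
    """Normalized betweenness centrality (Brandes) on an undirected, unweighted graph."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # accumulate dependencies back towards s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(adj)
    # Each unordered pair is visited twice, so 1/((n-1)(n-2)) normalizes to [0, 1].
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: c * scale for v, c in score.items()}


# Hypothetical toy input: one list of author countries per article.
papers = [
    ["South Korea", "United States"],
    ["South Korea", "Japan"],
    ["United States", "South Korea"],
]
edges = coauthorship_edges(papers)
centrality = betweenness(edges)
```

In this toy network South Korea lies on the only path between Japan and the United States, so it acts as the sole mediator, which mirrors the role the abstract assigns to high-betweenness countries.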

Keywords: co-authorship network, Korean studies, Koreanology, scholarly communication

Procedia PDF Downloads 136
24221 Biosorption of Gold from Chloride Media in a Simultaneous Adsorption-Reduction Process

Authors: Shafiq Alam, Yen Ning Lee

Abstract:

Conventional hydrometallurgical processing of metals involves the use of large quantities of toxic chemicals. Realizing a need to develop sustainable technologies, extensive research studies are being carried out to recover and recycle base, precious and rare earth metals from their pregnant leach solutions (PLS) using green chemicals/biomaterials prepared from biomass wastes derived from agriculture, marine and forest resources. Our innovative research showed that bio-adsorbents prepared from such biomass wastes can effectively adsorb precious metals, especially gold after conversion of their functional groups in a very simple process. The highly effective ‘Adsorption-coupled-Reduction’ phenomenon witnessed appears promising for the potential use of this gold biosorption process in the mining industry. Proper management and effective use of biomass wastes as value added green chemicals will not only reduce the volume of wastes being generated every day in our society, but will also have a high-end value to the mining and mineral processing industries as those biomaterials would be cheap, but very selective for gold recovery/recycling from low grade ore, leach residue or e-wastes.

Keywords: biosorption, hydrometallurgy, gold, adsorption, reduction, biomass, sustainability

Procedia PDF Downloads 360
24220 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components: a feature selection component and a classifier ensemble component. The feature selection component divides the features in the SEER database into four groups. It then tries to find the most important features among the four groups that maximize the weighted average F-score of a given classification algorithm. The ensemble component uses three different classifiers, each of which models a different set of SEER features obtained through the feature selection module. On top of them, another classifier gives the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found uses the decision tree, Bayesian network, and Naïve Bayes algorithms for the underlying classifiers and Naïve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same SEER data (period of 1973-2002). It gives an 87.39% weighted average F-score compared to 85.82% and 81.34% for the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score rises to 92.4% on the held-out unseen test set.
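The weighted average F-score that this abstract reports can be computed as the per-class F1 score averaged with weights equal to each class's share of the true labels. The Python sketch below follows that common definition; it is an assumption that the authors used exactly this variant, and the function name and toy labels are hypothetical.

```python
from collections import Counter


def weighted_f_score(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = len(y_true)
    result = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        # Weight each class by its share of the true labels.
        result += (count / total) * f1
    return result


# Hypothetical labels: 1 = survived, 0 = did not survive.
y_true = [1, 1, 1, 0]
y_pred = [1, 1, 0, 0]
score = weighted_f_score(y_true, y_pred)
```

Weighting by support keeps a rare class from dominating the score, which matters for survivability data where the two outcomes are typically imbalanced.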

Keywords: classifier ensemble, breast cancer survivability, data mining, SEER

Procedia PDF Downloads 304
24219 Quantifying User-Related, System-Related, and Context-Related Patterns of Smartphone Use

Authors: Andrew T. Hendrickson, Liven De Marez, Marijn Martens, Gytha Muller, Tudor Paisa, Koen Ponnet, Catherine Schweizer, Megan Van Meer, Mariek Vanden Abeele

Abstract:

Quantifying and understanding the myriad ways people use their phones and how that impacts their relationships, cognitive abilities, mental health, and well-being is increasingly important in our phone-centric society. However, most studies on the patterns of phone use have focused on theory-driven tests of specific usage hypotheses using self-report questionnaires or analyses of smaller datasets. In this work we present a series of analyses from a large corpus of over 3000 users that combine data-driven and theory-driven analyses to identify reliable smartphone usage patterns and clusters of similar users. Furthermore, we compare the stability of user clusters across user- and system-initiated sessions, as well as during the hypothesized ritualized behavior times directly before and after sleeping. Our results indicate support for some hypothesized usage patterns but present a more complete and nuanced view of how people use smartphones.

Keywords: data mining, experience sampling, smartphone usage, health and well being

Procedia PDF Downloads 143
24218 A New Approach – A Numerical Assessment of Ground Strata Failure Potentials in Underground Mines

Authors: Omer Yeni

Abstract:

Ground strata failure, or fall-of-ground, is one of the most prominent catastrophic risks in underground mines. Mining companies use various methods and techniques to prevent and critically control the associated risks. These include safety by design, excavation methods, ground support, training, and competency, all of which require quality control and assurance activities to confirm their efficiency and performance and to identify improvement opportunities through monitoring. However, many mining companies use quality control (QC) methods without quality assurance (QA) and, out of habit, refer to them together as QA/QC. Put simply, QC is a method of detecting defects, and QA is a method of preventing defects. Proper QA/QC means not only testing the final product at the end of the production line but also testing every component before assembly and the final product once completed. The installed ground support elements are among the final products mining companies use to prevent ground strata failure. Testing the final product (i.e., rock bolt pull testing, shotcrete strength testing, etc.) with QC methods only, while those areas are already accessible, is not unlike testing an airplane full of passengers right after the production line or testing a car after the sale. Can QC methods alone be called QA/QC? Can QA/QC activities be numerically scored for each critical control implemented to assess ground strata failure potential? Can numerical scores be used to derive a Geotechnical Risk Rating (GRR) to determine the ground strata failure risk and its probability? This paper sets out a specific QA/QC methodology to manage and confirm the efficiency and performance of the implemented critical controls, together with a numerical approach based on the Geotechnical Risk Rating (GRR) process, to assess ground strata failure risk, identify the gaps where proactive action is required, and evaluate the probability of ground strata failures in underground mines.

Keywords: fall of ground, ground strata failure, QA/QC, underground

Procedia PDF Downloads 65
24217 Finding the Association Rule between Nursing Interventions and Early Evaluation Results of In-Hospital Cardiac Arrest to Improve Patient Safety

Authors: Wei-Chih Huang, Pei-Lung Chung, Ching-Heng Lin, Hsuan-Chia Yang, Der-Ming Liou

Abstract:

Background: In-Hospital Cardiac Arrest (IHCA) threatens the lives of inpatients and seriously affects patient safety, the quality of inpatient care, and hospital service. Health providers must identify the signs of IHCA early to prevent its occurrence. This study considers the potential association between early signs of IHCA and the patient care provided by nurses and other professionals before an IHCA occurs. The aim of this study is to identify significant associations between nursing interventions and abnormal early evaluation results of IHCA that can assist health care providers in monitoring inpatients at risk of IHCA and increase opportunities for early detection and prevention of IHCA. Materials and Methods: This study used a data mining technique called association rule mining to compute associations between nursing interventions and abnormal early evaluation results of IHCA. A nursing intervention and an abnormal early evaluation result of IHCA were considered to be co-occurring if the nursing intervention was provided within 24 hours of the abnormal early evaluation result last being observed. The rule-based methods were applied to 23.6 million electronic medical records (EMR) from a medical center in Taipei, Taiwan. This dataset includes 733 concepts of nursing interventions coded with Clinical Care Classification (CCC) codes and 13 early evaluation results of IHCA with binary codes. The values of interestingness and lift were computed as Q values to measure the co-occurrence and strength of association between all in-hospital patient care measures and abnormal early evaluation results of IHCA. The associations were evaluated by comparing the Q values and were verified by medical experts. Results and Conclusions: The results show that there are 4195 pairs of associations between nursing interventions and abnormal early evaluation results of IHCA with their Q values.
Of these, 203 pairs with Q values greater than 5 indicate a positive association. Inpatients with a high blood sugar level (hyperglycemia) have a positive association with a heart rate lower than 50 beats per minute or higher than 120 beats per minute (Q value 6.636). Inpatients with a temporary pacemaker (TPM) have a significant association with a high risk of IHCA (Q value 47.403). There is a significant positive association between hypovolemia and the occurrence of abnormal heart rhythms (arrhythmias) (Q value 127.49). The results of this study can help prevent IHCA by enabling health care providers to recognize inpatients at risk of IHCA early, assist with monitoring patients to provide quality care, and improve IHCA surveillance and the quality of in-hospital care.
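For readers unfamiliar with lift, the measure named in this abstract, the following Python sketch shows the standard definition: the joint support of two items divided by the product of their individual supports. The abstract does not define how its Q values are derived from interestingness and lift, so this sketch covers lift only. The record layout, item labels, and function name are hypothetical, and each record is assumed to already group the interventions and evaluation results that fall within the same 24-hour window.

```python
def lift(records, a, b):
    """Lift of items a and b: P(a and b) / (P(a) * P(b)).

    Values above 1 suggest a and b co-occur more often than expected under
    independence; values below 1 suggest the opposite.
    """
    n = len(records)
    n_a = sum(1 for r in records if a in r)
    n_b = sum(1 for r in records if b in r)
    n_ab = sum(1 for r in records if a in r and b in r)
    if n == 0 or n_a == 0 or n_b == 0:
        return 0.0  # lift is undefined here; 0.0 is a convention of this sketch
    return (n_ab / n) / ((n_a / n) * (n_b / n))


# Hypothetical toy records: one set of observed items per patient window.
records = [
    {"hyperglycemia_care", "abnormal_heart_rate"},
    {"hyperglycemia_care", "abnormal_heart_rate"},
    {"hyperglycemia_care"},
    {"wound_care"},
]
value = lift(records, "hyperglycemia_care", "abnormal_heart_rate")
```

Lift is symmetric in its two arguments, so it measures co-occurrence strength rather than the direction of a rule.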

Keywords: in-hospital cardiac arrest, patient safety, nursing intervention, association rule mining

Procedia PDF Downloads 256
24216 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver integrated (big) data related services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services, such as data storage and hosting, to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 142