Search results for: data mining analytics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25146

Search results for: data mining analytics

24636 Advancing in Cricket Analytics: Novel Approaches for Pitch and Ball Detection Employing OpenCV and YOLOV8

Authors: Pratham Madnur, Prathamkumar Shetty, Sneha Varur, Gouri Parashetti

Abstract:

In order to overcome conventional obstacles, this research paper investigates novel approaches for cricket pitch and ball detection that make use of cutting-edge technologies. The research integrates OpenCV for pitch inspection and modifies the YOLOv8 model for cricket ball detection in order to overcome the shortcomings of manual pitch assessment and traditional ball detection techniques. To ensure flexibility in a range of pitch environments, the pitch detection method leverages OpenCV’s color space transformation, contour extraction, and accurate color range defining features. Regarding ball detection, the YOLOv8 model emphasizes the preservation of minor object details to improve accuracy and is specifically trained to the unique properties of cricket balls. The methods are more reliable because of the careful preparation of the datasets, which include novel ball and pitch information. These cutting-edge methods not only improve cricket analytics but also set the stage for flexible methods in more general sports technology applications.

Keywords: OpenCV, YOLOv8, cricket, custom dataset, computer vision, sports

Procedia PDF Downloads 62
24635 Lead and Cadmium Spatial Pattern and Risk Assessment around Coal Mine in Hyrcanian Forest, North Iran

Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch

Abstract:

In this study, the effect of coal mining activities on lead and cadmium concentrations and distribution in soil was investigated in Hyrcanian forest, North Iran. 16 plots (20×20 m2) were established by systematic-randomly (60×60 m2) in an area of 4 ha (200×200 m2-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity; considered as the controlled area. In order to investigate soil lead and cadmium concentration, one sample was taken from the 0-10 cm in each plot. To study the spatial pattern of soil properties and lead and cadmium concentrations in the mining area, an area of 80×80m2 (the mine as the center) was considered and 80 soil samples were systematic-randomly taken (10 m intervals). Geostatistical analysis was performed via Kriging method and GS+ software (version 5.1). In order to estimate the impact of coal mining activities on soil quality, pollution index was measured. Lead and cadmium concentrations were significantly higher in mine area (Pb: 10.97±0.30, Cd: 184.47±6.26 mg.kg-1) in comparison to control area (Pb: 9.42±0.17, Cd: 131.71±15.77 mg.kg-1). The mean values of the PI index indicate that Pb (1.16) and Cd (1.77) presented slightly polluted. Results of the NIPI index showed that Pb (1.44) and Cd (2.52) presented slight pollution and moderate pollution respectively. Results of variography and kriging method showed that it is possible to prepare interpolation maps of lead and cadmium around the mining areas in Hyrcanian forest. According to results of pollution and risk assessments, forest soil was contaminated by heavy metals (lead and cadmium); therefore, using reclamation and remediation techniques in these areas is necessary.

Keywords: traditional coal mining, heavy metals, pollution indicators, geostatistics, Caspian forest

Procedia PDF Downloads 173
24634 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are a unit of nice connection as a result of the supply a straightforward and fast management of all aspects relating to a patient, not essentially medical. What is more, there are unit additional and additional cases of pathologies during which diagnosing and treatment may be solely allotted by victimization medical imaging techniques. With associate ever-increasing prevalence, medical pictures area unit directly acquired in or regenerate into digital type, for his or her storage additionally as sequent retrieval and process. Data Mining is the process of extracting information from large data sets through using algorithms and Techniques drawn from the field of Statistics, Machine Learning and Data Base Management Systems. Forecasting may be a prediction of what's going to occur within the future, associated it's an unsure method. Owing to the uncertainty, the accuracy of a forecast is as vital because the outcome foretold by foretelling the freelance variables. A forecast management should be wont to establish if the accuracy of the forecast is within satisfactory limits. Fuzzy regression strategies have normally been wont to develop shopper preferences models that correlate the engineering characteristics with shopper preferences relating to a replacement product; the patron preference models offer a platform, wherever by product developers will decide the engineering characteristics so as to satisfy shopper preferences before developing the merchandise. Recent analysis shows that these fuzzy regression strategies area units normally will not to model client preferences. We tend to propose a Testing the strength of Exponential Regression Model over regression toward the mean Model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 272
24633 A New Approach for Improving Accuracy of Multi Label Stream Data

Authors: Kunal Shah, Swati Patel

Abstract:

Many real world problems involve data which can be considered as multi-label data streams. Efficient methods exist for multi-label classification in non streaming scenarios. However, learning in evolving streaming scenarios is more challenging, as the learners must be able to adapt to change using limited time and memory. Classification is used to predict class of unseen instance as accurate as possible. Multi label classification is a variant of single label classification where set of labels associated with single instance. Multi label classification is used by modern applications, such as text classification, functional genomics, image classification, music categorization etc. This paper introduces the task of multi-label classification, methods for multi-label classification and evolution measure for multi-label classification. Also, comparative analysis of multi label classification methods on the basis of theoretical study, and then on the basis of simulation was done on various data sets.

Keywords: binary relevance, concept drift, data stream mining, MLSC, multiple window with buffer

Procedia PDF Downloads 577
24632 RA-Apriori: An Efficient and Faster MapReduce-Based Algorithm for Frequent Itemset Mining on Apache Flink

Authors: Sanjay Rathee, Arti Kashyap

Abstract:

Extraction of useful information from large datasets is one of the most important research problems. Association rule mining is one of the best methods for this purpose. Finding possible associations between items in large transaction based datasets (finding frequent patterns) is most important part of the association rule mining. There exist many algorithms to find frequent patterns but Apriori algorithm always remains a preferred choice due to its ease of implementation and natural tendency to be parallelized. Many single-machine based Apriori variants exist but massive amount of data available these days is above capacity of a single machine. Therefore, to meet the demands of this ever-growing huge data, there is a need of multiple machines based Apriori algorithm. For these types of distributed applications, MapReduce is a popular fault-tolerant framework. Hadoop is one of the best open-source software frameworks with MapReduce approach for distributed storage and distributed processing of huge datasets using clusters built from commodity hardware. However, heavy disk I/O operation at each iteration of a highly iterative algorithm like Apriori makes Hadoop inefficient. A number of MapReduce-based platforms are being developed for parallel computing in recent years. Among them, two platforms, namely, Spark and Flink have attracted a lot of attention because of their inbuilt support to distributed computations. Earlier we proposed a reduced- Apriori algorithm on Spark platform which outperforms parallel Apriori, one because of use of Spark and secondly because of the improvement we proposed in standard Apriori. Therefore, this work is a natural sequel of our work and targets on implementing, testing and benchmarking Apriori and Reduced-Apriori and our new algorithm ReducedAll-Apriori on Apache Flink and compares it with Spark implementation. Flink, a streaming dataflow engine, overcomes disk I/O bottlenecks in MapReduce, providing an ideal platform for distributed Apriori. Flink's pipelining based structure allows starting a next iteration as soon as partial results of earlier iteration are available. Therefore, there is no need to wait for all reducers result to start a next iteration. We conduct in-depth experiments to gain insight into the effectiveness, efficiency and scalability of the Apriori and RA-Apriori algorithm on Flink.

Keywords: apriori, apache flink, Mapreduce, spark, Hadoop, R-Apriori, frequent itemset mining

Procedia PDF Downloads 286
24631 Strategies to Enhance Compliance of Health and Safety Standards at the Selected Mining Industries in Limpopo Province, South Africa: Occupational Health Nurse’s Perspective

Authors: Livhuwani Muthelo

Abstract:

The health and safety of the miners in the South African mining industry are guided by the regulations and standards which are anticipated to promote a healthy work environment and fatalities. It is of utmost importance for the miners to comply with these regulations/standards to protect themselves from potential occupational health and safety risks, accidents, and fatalities. The purpose of this study was to develop and validate strategies to enhance compliance with the Health and safety standards within the mining industries of Limpopo province in South Africa. A mixed-method exploratory sequential research design was adopted. The population consisted of 5350 miners. Purposive sampling was used to select the participants in the qualitative strand and stratified random sampling in the quantitative strand. Semi-structured interviews were conducted among the occupational health nurse practitioners and the health and safety team. Thematic analysis was used to generate an understanding of the interviews. In the quantitative strand, a survey was conducted using a self-administered questionnaire. Data were analysed using SPSS version 26.0. A descriptive statistical test was used in the analysis of data including frequencies, means, and standard deviation. Cronbach's alpha test was used to measure internal consistency. The integrated results revealed that there are diverse experiences related to health and safety standards compliance among the mineworkers. The main findings were challenges related to leadership compliance and also related to the cost of maintaining safety, Miner's behavior-related challenges; the impact of non-compliance on the overall health of the miners was also described, the conflict between production and safety. Health and safety compliance is not just mere compliance with regulations and standards but a culture that warrants the miners and organization to take responsibility for their behavior and actions towards health and safety. Thus taking responsibility for your well-being and other miners.

Keywords: perceptions, compliance, health and safety, legislation, standards, miners

Procedia PDF Downloads 94
24630 Directional Dust Deposition Measurements: The Influence of Seasonal Changes and the Meteorological Conditions Influencing in Witbank Area and Carletonville Area

Authors: Maphuti Georgina Kwata

Abstract:

Coal mining in Mpumalanga Province is known of contributing to the atmospheric pollution from various activities. Gold mining in North-West Province is known of also contributing to the atmospheric pollution especially with the production of radon gas. In this research directional dust deposition gauge was used to measure source of direction and meteorological data was used to determine the wind rose blowing and the influence of the seasonal changes. Fourteen months of dust collection was undertaken in Witbank Area and Carletonville Area. The results shows that the sources of direction for Ericson Dam its East in February 2010 and Tip Area shows that the source of direction its West in October 2010. In the East direction there were mining operations, power stations which contributed to the East to be the sources of direction. In the West direction there were smelters, power stations and agricultural activities which contributed for the source of direction to be the West direction for Driefontein Mine: East Recreational Village Club. The East of Leslie Williams hospital is the source of direction which also indicated that there dust generating activities such as mining operation, agricultural activities. The meteorological results for Emalahleni Area in summer and winter the wind rose blow with wind speed of 5-10 ms-1 from the East sector. Annual average for the wind rose blow its East South eastern sector with 20 ms-1 and day time the wind rose from northwestern sector with excess of 20 ms-1. The night time wind direction East-eastern direction with a maximum wind speed of 20 ms-1. The meteorogical results for Driefontein Mine show that North-western sector and north-eastern sector wind rose is blowing with 5-10 ms-1 win speed. Day time wind blows from the West sector and night time wind blows from the north sector. In summer the wind blows North-east sector with 5-10 ms-1 and winter wind blows from North-west and it’s also predominant. In spring wind blows from north-east. The conclusion is that not only mining operation where the directional dust deposit gauge were installed contributed to the source of direction also the power stations, smelters, and other activities nearby the mining operation contributed. The recommendations are the dust suppressant for unpaved roads should be used on a regular basis and there should be monitoring of the weather conditions (the wind speed and direction prior to blasting to ensure minimal emissions).

Keywords: directional dust deposition gauge, BS part 5 1747 dust deposit gauge, wind rose, wind blowing

Procedia PDF Downloads 499
24629 Research of the Three-Dimensional Visualization Geological Modeling of Mine Based on Surpac

Authors: Honggang Qu, Yong Xu, Rongmei Liu, Zhenji Gao, Bin Wang

Abstract:

Today's mining industry is advancing gradually toward digital and visual direction. The three-dimensional visualization geological modeling of mine is the digital characterization of mineral deposits and is one of the key technology of digital mining. Three-dimensional geological modeling is a technology that combines geological spatial information management, geological interpretation, geological spatial analysis and prediction, geostatistical analysis, entity content analysis and graphic visualization in a three-dimensional environment with computer technology and is used in geological analysis. In this paper, the three-dimensional geological modeling of an iron mine through the use of Surpac is constructed, and the weight difference of the estimation methods between the distance power inverse ratio method and ordinary kriging is studied, and the ore body volume and reserves are simulated and calculated by using these two methods. Compared with the actual mine reserves, its result is relatively accurate, so it provides scientific bases for mine resource assessment, reserve calculation, mining design and so on.

Keywords: three-dimensional geological modeling, geological database, geostatistics, block model

Procedia PDF Downloads 72
24628 Collision Theory Based Sentiment Detection Using Discourse Analysis in Hadoop

Authors: Anuta Mukherjee, Saswati Mukherjee

Abstract:

Data is growing everyday. Social networking sites such as Twitter are becoming an integral part of our daily lives, contributing a large increase in the growth of data. It is a rich source especially for sentiment detection or mining since people often express honest opinion through tweets. However, although sentiment analysis is a well-researched topic in text, this analysis using Twitter data poses additional challenges since these are unstructured data with abbreviations and without a strict grammatical correctness. We have employed collision theory to achieve sentiment analysis in Twitter data. We have also incorporated discourse analysis in the collision theory based model to detect accurate sentiment from tweets. We have also used the retweet field to assign weights to certain tweets and obtained the overall weightage of a topic provided in the form of a query. Hadoop has been exploited for speed. Our experiments show effective results.

Keywords: sentiment analysis, twitter, collision theory, discourse analysis

Procedia PDF Downloads 527
24627 Critically Analyzing the Application of Big Data for Smart Transportation: A Case Study of Mumbai

Authors: Tanuj Joshi

Abstract:

Smart transportation is fast emerging as a solution to modern cities’ approach mobility issues, delayed emergency response rate and high congestion on streets. Present day scenario with Google Maps, Waze, Yelp etc. demonstrates how information and communications technologies controls the intelligent transportation system. This intangible and invisible infrastructure is largely guided by the big data analytics. On the other side, the exponential increase in Indian urban population has intensified the demand for better services and infrastructure to satisfy the transportation needs of its citizens. No doubt, India’s huge internet usage is looked as an important resource to guide to achieve this. However, with a projected number of over 40 billion objects connected to the Internet by 2025, the need for systems to handle massive volume of data (big data) also arises. This research paper attempts to identify the ways of exploiting the big data variables which will aid commuters on Indian tracks. This study explores real life inputs by conducting survey and interviews to identify which gaps need to be targeted to better satisfy the customers. Several experts at Mumbai Metropolitan Region Development Authority (MMRDA), Mumbai Metro and Brihanmumbai Electric Supply and Transport (BEST) were interviewed regarding the Information Technology (IT) systems currently in use. The interviews give relevant insights and requirements into the workings of public transportation systems whereas the survey investigates the macro situation.

Keywords: smart transportation, mobility issue, Mumbai transportation, big data, data analysis

Procedia PDF Downloads 170
24626 Shark Detection and Classification with Deep Learning

Authors: Jeremy Jenrette, Z. Y. C. Liu, Pranav Chimote, Edward Fox, Trevor Hastie, Francesco Ferretti

Abstract:

Suitable shark conservation depends on well-informed population assessments. Direct methods such as scientific surveys and fisheries monitoring are adequate for defining population statuses, but species-specific indices of abundance and distribution coming from these sources are rare for most shark species. We can rapidly fill these information gaps by boosting media-based remote monitoring efforts with machine learning and automation. We created a database of shark images by sourcing 24,546 images covering 219 species of sharks from the web application spark pulse and the social network Instagram. We used object detection to extract shark features and inflate this database to 53,345 images. We packaged object-detection and image classification models into a Shark Detector bundle. We developed the Shark Detector to recognize and classify sharks from videos and images using transfer learning and convolutional neural networks (CNNs). We applied these models to common data-generation approaches of sharks: boosting training datasets, processing baited remote camera footage and online videos, and data-mining Instagram. We examined the accuracy of each model and tested genus and species prediction correctness as a result of training data quantity. The Shark Detector located sharks in baited remote footage and YouTube videos with an average accuracy of 89\%, and classified located subjects to the species level with 69\% accuracy (n =\ eight species). The Shark Detector sorted heterogeneous datasets of images sourced from Instagram with 91\% accuracy and classified species with 70\% accuracy (n =\ 17 species). Data-mining Instagram can inflate training datasets and increase the Shark Detector’s accuracy as well as facilitate archiving of historical and novel shark observations. Base accuracy of genus prediction was 68\% across 25 genera. The average base accuracy of species prediction within each genus class was 85\%. The Shark Detector can classify 45 species. All data-generation methods were processed without manual interaction. As media-based remote monitoring strives to dominate methods for observing sharks in nature, we developed an open-source Shark Detector to facilitate common identification applications. Prediction accuracy of the software pipeline increases as more images are added to the training dataset. We provide public access to the software on our GitHub page.

Keywords: classification, data mining, Instagram, remote monitoring, sharks

Procedia PDF Downloads 106
24625 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-

Authors: Nieto Bernal Wilson, Carmona Suarez Edgar

Abstract:

The organizations have structured and unstructured information in different formats, sources, and systems. Part of these come from ERP under OLTP processing that support the information system, however these organizations in OLAP processing level, presented some deficiencies, part of this problematic lies in that does not exist interesting into extract knowledge from their data sources, as also the absence of operational capabilities to tackle with these kind of projects.  Data Warehouse and its applications are considered as non-proprietary tools, which are of great interest to business intelligence, since they are repositories basis for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. The following paper present a structured methodology, simple, inspired from the agile development models as Scrum, XP and AUP. Also the models object relational, spatial data models, and the base line of data modeling under UML and Big data, from this way sought to deliver an agile methodology for the developing of data warehouses, simple and of easy application. The methodology naturally take into account the application of process for the respectively information analysis, visualization and data mining, particularly for patterns generation and derived models from the objects facts structured.

Keywords: data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse

Procedia PDF Downloads 397
24624 Recommender System Based on Mining Graph Databases for Data-Intensive Applications

Authors: Mostafa Gamal, Hoda K. Mohamed, Islam El-Maddah, Ali Hamdi

Abstract:

In recent years, many digital documents on the web have been created due to the rapid growth of ’social applications’ communities or ’Data-intensive applications’. The evolution of online-based multimedia data poses new challenges in storing and querying large amounts of data for online recommender systems. Graph data models have been shown to be more efficient than relational data models for processing complex data. This paper will explain the key differences between graph and relational databases, their strengths and weaknesses, and why using graph databases is the best technology for building a realtime recommendation system. Also, The paper will discuss several similarity metrics algorithms that can be used to compute a similarity score of pairs of nodes based on their neighbourhoods or their properties. Finally, the paper will discover how NLP strategies offer the premise to improve the accuracy and coverage of realtime recommendations by extracting the information from the stored unstructured knowledge, which makes up the bulk of the world’s data to enrich the graph database with this information. As the size and number of data items are increasing rapidly, the proposed system should meet current and future needs.

Keywords: graph databases, NLP, recommendation systems, similarity metrics

Procedia PDF Downloads 100
24623 Heritage Value and Industrial Tourism Potential of the Urals, Russia

Authors: Anatoly V. Stepanov, Maria Y. Ilyushkina, Alexander S. Burnasov

Abstract:

Expansion of tourism, especially after WWII, has led to significant improvements in the regional infrastructure. The present study has revealed a lot of progress in the advancement of industrial heritage narrative in the Central Urals. The evidence comes from the general public’s increased fascination with some of Europe’s oldest mining and industrial sites, and the agreement of many stakeholders that the Urals industrial heritage should be preserved. The development of tourist sites in Nizhny Tagil and Nevyansk, gold-digging in Beryosovsky, gemstone search in Murzinka, and the progress with the Urals Gemstone Ring project are the examples showing the immense opportunities of industrial heritage tourism development in the region that are still to be realized. Regardless of the economic future of the Central Urals, whether it will remain an industrial region or experience a deeper deindustrialization, the sprouts of the industrial heritage tourism should be advanced and amplified for the benefit of local communities and the tourist community at large as it is hard to imagine a more suitable site for the discovery of industrial and mining heritage than the Central Urals Region of Russia.

Keywords: industrial heritage, mining heritage, Central Urals, Russia

Procedia PDF Downloads 124
24622 From Two-Way to Multi-Way: A Comparative Study for Map-Reduce Join Algorithms

Authors: Marwa Hussien Mohamed, Mohamed Helmy Khafagy

Abstract:

Map-Reduce is a programming model which is widely used to extract valuable information from enormous volumes of data. Map-reduce designed to support heterogeneous datasets. Apache Hadoop map-reduce used extensively to uncover hidden pattern like data mining, SQL, etc. The most important operation for data analysis is joining operation. But, map-reduce framework does not directly support join algorithm. This paper explains and compares two-way and multi-way map-reduce join algorithms for map reduce also we implement MR join Algorithms and show the performance of each phase in MR join algorithms. Our experimental results show that map side join and map merge join in two-way join algorithms has the longest time according to preprocessing step sorting data and reduce side cascade join has the longest time at Multi-Way join algorithms.

Keywords: Hadoop, MapReduce, multi-way join, two-way join, Ubuntu

Procedia PDF Downloads 477
24621 Filtering Intrusion Detection Alarms Using Ant Clustering Approach

Authors: Ghodhbani Salah, Jemili Farah

Abstract:

With the growth of cyber attacks, information safety has become an important issue all over the world. Many firms rely on security technologies such as intrusion detection systems (IDSs) to manage information technology security risks. IDSs are considered to be the last line of defense to secure a network and play a very important role in detecting large number of attacks. However the main problem with today’s most popular commercial IDSs is generating high volume of alerts and huge number of false positives. This drawback has become the main motivation for many research papers in IDS area. Hence, in this paper we present a data mining technique to assist network administrators to analyze and reduce false positive alarms that are produced by an IDS and increase detection accuracy. Our data mining technique is unsupervised clustering method based on hybrid ANT algorithm. This algorithm discovers clusters of intruders’ behavior without prior knowledge of a possible number of classes, then we apply K-means algorithm to improve the convergence of the ANT clustering. Experimental results on real dataset show that our proposed approach is efficient with high detection rate and low false alarm rate.

Keywords: intrusion detection system, alarm filtering, ANT class, ant clustering, intruders’ behaviors, false alarms

Procedia PDF Downloads 399
24620 Relay Mining: Verifiable Multi-Tenant Distributed Rate Limiting

Authors: Daniel Olshansky, Ramiro Rodrıguez Colmeiro

Abstract:

Relay Mining presents a scalable solution employing probabilistic mechanisms and crypto-economic incentives to estimate RPC volume usage, facilitating decentralized multitenant rate limiting. Network traffic from individual applications can be concurrently serviced by multiple RPC service providers, with costs, rewards, and rate limiting governed by a native cryptocurrency on a distributed ledger. Building upon established research in token bucket algorithms and distributed rate-limiting penalty models, our approach harnesses a feedback loop control mechanism to adjust the difficulty of mining relay rewards, dynamically scaling with network usage growth. By leveraging crypto-economic incentives, we reduce coordination overhead costs and introduce a mechanism for providing RPC services that are both geopolitically and geographically distributed.

Keywords: remote procedure call, crypto-economic, commit-reveal, decentralization, scalability, blockchain, rate limiting, token bucket

Procedia PDF Downloads 50
24619 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: early warning system, knowledge management, market prediction, topic modeling.

Procedia PDF Downloads 332
24618 Application of Remote Sensing Technique on the Monitoring of Mine Eco-Environment

Authors: Haidong Li, Weishou Shen, Guoping Lv, Tao Wang

Abstract:

Aiming to overcome the limitation of the application of traditional remote sensing (RS) technique in the mine eco-environmental monitoring, in this paper, we first classified the eco-environmental damages caused by mining activities and then introduced the principle, classification and characteristics of the Light Detection and Ranging (LiDAR) technique. The potentiality of LiDAR technique in the mine eco-environmental monitoring was analyzed, particularly in extracting vertical structure parameters of vegetation, through comparing the feasibility and applicability of traditional RS method and LiDAR technique in monitoring different types of indicators. The application situation of LiDAR technique in extracting typical mine indicators, such as land destruction in mining areas, damage of ecological integrity and natural soil erosion. The result showed that the LiDAR technique has the ability to monitor most of the mine eco-environmental indicators, and exhibited higher accuracy comparing with traditional RS technique, specifically speaking, the applicability of LiDAR technique on each indicator depends on the accuracy requirement of mine eco-environmental monitoring. In the item of large mine, LiDAR three-dimensional point cloud data not only could be used as the complementary data source of optical RS, Airborne/Satellite LiDAR could also fulfill the demand of extracting vertical structure parameters of vegetation in large areas.

Keywords: LiDAR, mine, ecological damage, monitoring, traditional remote sensing technique

Procedia PDF Downloads 391
24617 Framework to Quantify Customer Experience

Authors: Anant Sharma, Ashwin Rajan

Abstract:

Customer experience is measured today based on defining a set of metrics and KPIs, setting up thresholds and defining triggers across those thresholds. While this is an effective way of measuring against a Key Performance Indicator ( referred to as KPI in the rest of the paper ), this approach cannot capture the various nuances that make up the overall customer experience. Customers consume a product or service at various levels, which is not reflected in metrics like Customer Satisfaction or Net Promoter Score, but also across other measurements like recurring revenue, frequency of service usage, e-learning and depth of usage. Here we explore an alternative method of measuring customer experience by flipping the traditional views. Rather than rolling customers up to a metric, we roll up metrics to hierarchies and then measure customer experience. This method allows any team to quantify customer experience across multiple touchpoints in a customer’s journey. We make use of various data sources which contain information for metrics like CXSAT, NPS, Renewals, and depths of service usage collected across a customer lifecycle. This data can be mined systematically to get linkages between different data points like geographies, business groups, products and time. Additional views can be generated by blending synthetic contexts into the data to show trends and top/bottom types of reports. We have created a framework that allows us to measure customer experience using the above logic.

Keywords: analytics, customers experience, BI, business operations, KPIs, metrics

Procedia PDF Downloads 66
24616 The Use of Rule-Based Cellular Automata to Track and Forecast the Dispersal of Classical Biocontrol Agents at Scale, with an Application to the Fopius arisanus Fruit Fly Parasitoid

Authors: Agboka Komi Mensah, John Odindi, Elfatih M. Abdel-Rahman, Onisimo Mutanga, Henri Ez Tonnang

Abstract:

Ecosystems are networks of organisms and populations that form a community of various species interacting within their habitats. Such habitats are defined by abiotic and biotic conditions that establish the initial limits to a population's growth, development, and reproduction. The habitat’s conditions explain the context in which species interact to access resources such as food, water, space, shelter, and mates, allowing for feeding, dispersal, and reproduction. Dispersal is an essential life-history strategy that affects gene flow, resource competition, population dynamics, and species distributions. Despite the importance of dispersal in population dynamics and survival, understanding the mechanism underpinning the dispersal of organisms remains challenging. For instance, when an organism moves into an ecosystem for survival and resource competition, its progression is highly influenced by extrinsic factors such as its physiological state, climatic variables and ability to evade predation. Therefore, greater spatial detail is necessary to understand organism dispersal dynamics. Understanding organisms dispersal can be addressed using empirical and mechanistic modelling approaches, with the adopted approach depending on the study's purpose Cellular automata (CA) is an example of these approaches that have been successfully used in biological studies to analyze the dispersal of living organisms. Cellular automata can be briefly described as occupied cells by an individual that evolves based on proper decisions based on a set of neighbours' rules. However, in the ambit of modelling individual organisms dispersal at the landscape scale, we lack user friendly tools that do not require expertise in mathematical models and computing ability; such as a visual analytics framework for tracking and forecasting the dispersal behaviour of organisms. The term "visual analytics" (VA) describes a semiautomated approach to electronic data processing that is guided by users who can interact with data via an interface. Essentially, VA converts large amounts of quantitative or qualitative data into graphical formats that can be customized based on the operator's needs. Additionally, this approach can be used to enhance the ability of users from various backgrounds to understand data, communicate results, and disseminate information across a wide range of disciplines. To support effective analysis of the dispersal of organisms at the landscape scale, we therefore designed Pydisp which is a free visual data analytics tool for spatiotemporal dispersal modeling built in Python. Its user interface allows users to perform a quick and interactive spatiotemporal analysis of species dispersal using bioecological and climatic data. Pydisp enables reuse and upgrade through the use of simple principles such as Fuzzy cellular automata algorithms. The potential of dispersal modeling is demonstrated in a case study by predicting the dispersal of Fopius arisanus (Sonan), endoparasitoids to control Bactrocera dorsalis (Hendel) (Diptera: Tephritidae) in Kenya. The results obtained from our example clearly illustrate the parasitoid's dispersal process at the landscape level and confirm that dynamic processes in an agroecosystem are better understood when designed using mechanistic modelling approaches. Furthermore, as demonstrated in the example, the built software is highly effective in portraying the dispersal of organisms despite the unavailability of detailed data on the species dispersal mechanisms.

Keywords: cellular automata, fuzzy logic, landscape, spatiotemporal

Procedia PDF Downloads 74
24615 Helping the Development of Public Policies with Knowledge of Criminal Data

Authors: Diego De Castro Rodrigues, Marcelo B. Nery, Sergio Adorno

Abstract:

The project aims to develop a framework for social data analysis, particularly by mobilizing criminal records and applying descriptive computational techniques, such as associative algorithms and extraction of tree decision rules, among others. The methods and instruments discussed in this work will enable the discovery of patterns, providing a guided means to identify similarities between recurring situations in the social sphere using descriptive techniques and data visualization. The study area has been defined as the city of São Paulo, with the structuring of social data as the central idea, with a particular focus on the quality of the information. Given this, a set of tools will be validated, including the use of a database and tools for visualizing the results. Among the main deliverables related to products and the development of articles are the discoveries made during the research phase. The effectiveness and utility of the results will depend on studies involving real data, validated both by domain experts and by identifying and comparing the patterns found in this study with other phenomena described in the literature. The intention is to contribute to evidence-based understanding and decision-making in the social field.

Keywords: social data analysis, criminal records, computational techniques, data mining, big data

Procedia PDF Downloads 75
24614 The Affective Motivation of Women Miners in Ghana

Authors: Adesuwa Omorede, Rufai Haruna Kilu

Abstract:

Affective motivation (motivation that is emotionally laden usually related to affect, passion, emotions, moods) in the workplace stimulates individuals to reinforce, persist and commit to their task, which leads to the individual and organizational performance. This leads individuals to reach goals especially in situations where task are highly challenging and hostile. In such situations, individuals are more disposed to be more creative, innovative and see new opportunities from the loopholes in their workplace. However, when individuals feel displaced and less important, an adverse reaction may suffice which may be detrimental to the organization and its performance. One sector where affective motivation is eminently present and relevant, is the mining industry. Due to its intense work environment; mostly dominated by men and masculinity cultures; and deliberate exclusion of women in this environment which, makes the women working in these environments to feel marginalized. In Ghana, the mining industry is mostly seen as a very physical environment especially underground and mostly considerd as 'no place for a woman'. Despite the fact that these women feel less 'needed' or 'appreciated' in such environments, they still have to juggle between intense work shifts; face violence and other health risks with their families, which put a strain on their affective motivational reaction. Beyond these challenges, however, several mining companies in Ghana today are working towards providing a fair and equal working situation for both men and women miners, by recognizing them as key stakeholders, as well as including them in the stages of mining projects from the planning and designing phase to the evaluation and implementation stage. Drawing from the psychology and gender literature, this study takes a narrative approach to identify and understand the shifting gender dynamics within the mine works in Ghana, occasioning a change in background disposition of miners, which leads to more women taking up mine jobs in the country. In doing so, a qualitative study was conducted using semi-structured interviews from Ghana. Several women working within the mining industries in Ghana shared their experiences and how they felt and still feel in their workplace. In addition, archival documents were gathered to support the findings. The results suggest a change in enrolment regimes in a mining and technology university in Ghana, making room for a more gender equal enrolments in the university. A renowned university that train and feed mine work professional into the industry. The results further acknowledge gender equal and diversity recruitment policies and initiatives among the mining companies of Ghana. This study contributes to the psychology and gender literature by highlighting the hindrances women face in the mining industry as well as highlighting several of their affective reactions towards gender inequality. The study also provides several suggestions for decision makers in the mining industry of what can be done in the future to reduce the gender inequality gap within the industry.

Keywords: affective motivation, gender shape shifting, mining industry, women miners

Procedia PDF Downloads 292
24613 Predictive Analytics in Traffic Flow Management: Integrating Temporal Dynamics and Traffic Characteristics to Estimate Travel Time

Authors: Maria Ezziani, Rabie Zine, Amine Amar, Ilhame Kissani

Abstract:

This paper introduces a predictive model for urban transportation engineering, which is vital for efficient traffic management. Utilizing comprehensive datasets and advanced statistical techniques, the model accurately forecasts travel times by considering temporal variations and traffic dynamics. Machine learning algorithms, including regression trees and neural networks, are employed to capture sequential dependencies. Results indicate significant improvements in predictive accuracy, particularly during peak hours and holidays, with the incorporation of traffic flow and speed variables. Future enhancements may integrate weather conditions and traffic incidents. The model's applications range from adaptive traffic management systems to route optimization algorithms, facilitating congestion reduction and enhancing journey reliability. Overall, this research extends beyond travel time estimation, offering insights into broader transportation planning and policy-making realms, empowering stakeholders to optimize infrastructure utilization and improve network efficiency.

Keywords: predictive analytics, traffic flow, travel time estimation, urban transportation, machine learning, traffic management

Procedia PDF Downloads 70
24612 A General Framework for Measuring the Internal Fraud Risk of an Enterprise Resource Planning System

Authors: Imran Dayan, Ashiqul Khan

Abstract:

Internal corporate fraud, which is fraud carried out by internal stakeholders of a company, affects the well-being of the organisation just like its external counterpart. Even if such an act is carried out for the short-term benefit of a corporation, the act is ultimately harmful to the entity in the long run. Internal fraud is often carried out by relying upon aberrations from usual business processes. Business processes are the lifeblood of a company in modern managerial context. Such processes are developed and fine-tuned over time as a corporation grows through its life stages. Modern corporations have embraced technological innovations into their business processes, and Enterprise Resource Planning (ERP) systems being at the heart of such business processes is a testimony to that. Since ERP systems record a huge amount of data in their event logs, the logs are a treasure trove for anyone trying to detect any sort of fraudulent activities hidden within the day-to-day business operations and processes. This research utilises the ERP systems in place within corporations to assess the likelihood of prospective internal fraud through developing a framework for measuring the risks of fraud through Process Mining techniques and hence finds risky designs and loose ends within these business processes. This framework helps not only in identifying existing cases of fraud in the records of the event log, but also signals the overall riskiness of certain business processes, and hence draws attention for carrying out a redesign of such processes to reduce the chance of future internal fraud while improving internal control within the organisation. The research adds value by applying the concepts of Process Mining into the analysis of data from modern day applications of business process records, which is the ERP event logs, and develops a framework that should be useful to internal stakeholders for strengthening internal control as well as provide external auditors with a tool of use in case of suspicion. The research proves its usefulness through a few case studies conducted with respect to big corporations with complex business processes and an ERP in place.

Keywords: enterprise resource planning, fraud risk framework, internal corporate fraud, process mining

Procedia PDF Downloads 326
24611 Integrating Data Mining with Case-Based Reasoning for Diagnosing Sorghum Anthracnose

Authors: Mariamawit T. Belete

Abstract:

Cereal production and marketing are the means of livelihood for millions of households in Ethiopia. However, cereal production is constrained by technical and socio-economic factors. Among the technical factors, cereal crop diseases are the major contributing factors to the low yield. The aim of this research is to develop an integration of data mining and knowledge based system for sorghum anthracnose disease diagnosis that assists agriculture experts and development agents to make timely decisions. Anthracnose diagnosing systems gather information from Melkassa agricultural research center and attempt to score anthracnose severity scale. Empirical research is designed for data exploration, modeling, and confirmatory procedures for testing hypothesis and prediction to draw a sound conclusion. WEKA (Waikato Environment for Knowledge Analysis) was employed for the modeling. Knowledge based system has come across a variety of approaches based on the knowledge representation method; case-based reasoning (CBR) is one of the popular approaches used in knowledge-based system. CBR is a problem solving strategy that uses previous cases to solve new problems. The system utilizes hidden knowledge extracted by employing clustering algorithms, specifically K-means clustering from sampled anthracnose dataset. Clustered cases with centroid value are mapped to jCOLIBRI, and then the integrator application is created using NetBeans with JDK 8.0.2. The important part of a case based reasoning model includes case retrieval; the similarity measuring stage, reuse; which allows domain expert to transfer retrieval case solution to suit for the current case, revise; to test the solution, and retain to store the confirmed solution to the case base for future use. Evaluation of the system was done for both system performance and user acceptance. For testing the prototype, seven test cases were used. Experimental result shows that the system achieves an average precision and recall values of 70% and 83%, respectively. User acceptance testing also performed by involving five domain experts, and an average of 83% acceptance is achieved. Although the result of this study is promising, however, further study should be done an investigation on hybrid approach such as rule based reasoning, and pictorial retrieval process are recommended.

Keywords: sorghum anthracnose, data mining, case based reasoning, integration

Procedia PDF Downloads 74
24610 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 145
24609 Lessons from Farmers Performing Agroforestry for Reclamation of Gold Mine Spoils in Colombia

Authors: Bibiana Betancur-Corredor, Juan Carlos Loaiza, Manfred Denich, Christian Borgemeister

Abstract:

Alluvial gold mining generates a vast amount of deposits that cover the natural soil and negatively impacts riverbeds and valleys, causing loss of livelihood opportunities for farmers of these regions. In Colombia, more than 79,000 ha are affected by alluvial gold mining, therefore developing strategies to return this land to productivity is of crucial importance for the country. A novel restoration strategy has been created by a mining company, where the land is restored through the establishment of agroforestry systems, in which agricultural crops and livestock are combined to complement reforestation in the area. The purpose of this study is to capture the knowledge of farmers who perform agroforestry in areas with deposits created by alluvial gold mining activities. Semi structured interviews were conducted with farmers with regard to the following: indicators of soil fertility, management practices, soil heterogeneity, pest outbreaks and weeds. In order to compare the perceptions of soil fertility of farmers with physicochemical properties of soils, the farmers were asked to identify spots within their farms that have exhibited good and poor yields. Soil samples were collected in order to correlate farmer’s perceptions with soil physicochemical properties. The findings suggest that the main challenge that farmers face is the identification of fertile soil for crop establishment. They identify the fertile soil through visually analyzing soil color and compaction as well as the use of spontaneous growth of specific plants as indicator of soil fertility. For less fertile areas, nitrogen fixing plants are used as green manure to restore soil fertility for crop establishment. The findings of this study imply that if gold mining is followed by reclamation practices that involve the successful establishment of productive farmlands, agricultural productivity of these lands might improve, increasing food security of the affected communities.

Keywords: agroforestry, knowledge, mining, restoration

Procedia PDF Downloads 228
24608 Ontology-Driven Knowledge Discovery and Validation from Admission Databases: A Structural Causal Model Approach for Polytechnic Education in Nigeria

Authors: Bernard Igoche Igoche, Olumuyiwa Matthew, Peter Bednar, Alexander Gegov

Abstract:

This study presents an ontology-driven approach for knowledge discovery and validation from admission databases in Nigerian polytechnic institutions. The research aims to address the challenges of extracting meaningful insights from vast amounts of admission data and utilizing them for decision-making and process improvement. The proposed methodology combines the knowledge discovery in databases (KDD) process with a structural causal model (SCM) ontological framework. The admission database of Benue State Polytechnic Ugbokolo (Benpoly) is used as a case study. The KDD process is employed to mine and distill knowledge from the database, while the SCM ontology is designed to identify and validate the important features of the admission process. The SCM validation is performed using the conditional independence test (CIT) criteria, and an algorithm is developed to implement the validation process. The identified features are then used for machine learning (ML) modeling and prediction of admission status. The results demonstrate the adequacy of the SCM ontological framework in representing the admission process and the high predictive accuracies achieved by the ML models, with k-nearest neighbors (KNN) and support vector machine (SVM) achieving 92% accuracy. The study concludes that the proposed ontology-driven approach contributes to the advancement of educational data mining and provides a foundation for future research in this domain.

Keywords: admission databases, educational data mining, machine learning, ontology-driven knowledge discovery, polytechnic education, structural causal model

Procedia PDF Downloads 54
24607 Opinion Mining to Extract Community Emotions on Covid-19 Immunization Possible Side Effects

Authors: Yahya Almurtadha, Mukhtar Ghaleb, Ahmed M. Shamsan Saleh

Abstract:

The world witnessed a fierce attack from the Covid-19 virus, which affected public life socially, economically, healthily and psychologically. The world's governments tried to confront the pandemic by imposing a number of precautionary measures such as general closure, curfews and social distancing. Scientists have also made strenuous efforts to develop an effective vaccine to train the immune system to develop antibodies to combat the virus, thus reducing its symptoms and limiting its spread. Artificial intelligence, along with researchers and medical authorities, has accelerated the vaccine development process through big data processing and simulation. On the other hand, one of the most important negatives of the impact of Covid 19 was the state of anxiety and fear due to the blowout of rumors through social media, which prompted governments to try to reassure the public with the available means. This study aims to proposed using Sentiment Analysis (AKA Opinion Mining) and deep learning as efficient artificial intelligence techniques to work on retrieving the tweets of the public from Twitter and then analyze it automatically to extract their opinions, expression and feelings, negatively or positively, about the symptoms they may feel after vaccination. Sentiment analysis is characterized by its ability to access what the public post in social media within a record time and at a lower cost than traditional means such as questionnaires and interviews, not to mention the accuracy of the information as it comes from what the public expresses voluntarily.

Keywords: deep learning, opinion mining, natural language processing, sentiment analysis

Procedia PDF Downloads 164