Search results for: predictive data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25287

24507 A Method to Evaluate and Compare Web Information Extractors

Authors: Patricia Jiménez, Rafael Corchuelo, Hassan A. Sleiman

Abstract:

Web mining is gaining importance at an increasing pace. Currently, there are many complementary research topics under this umbrella. Their common theme is that they all focus on applying knowledge discovery techniques to data gathered from the Web. Sometimes these data are relatively easy to gather, chiefly when they come from server logs. Unfortunately, there are cases in which the data to be mined are the data displayed on a web document. In such cases, it is necessary to apply a pre-processing step to first extract the information of interest from the web documents. Such pre-processing steps are performed using so-called information extractors, which are software components that are typically configured by means of rules tailored to extracting the information of interest from a web page and structuring it according to a pre-defined schema. Paramount to getting good mining results is that the technique used to extract the source information is exact, which requires evaluating and comparing the different proposals in the literature from an empirical point of view. According to Google Scholar, about 4,200 papers on information extraction have been published during the last decade. Unfortunately, they were not evaluated within a homogeneous framework, which makes it difficult to compare them empirically. In this paper, we report on an original information extraction evaluation method. Our contribution is three-fold: a) this is the first attempt to provide an evaluation method for proposals that work on semi-structured documents; the little existing work on this topic focuses on proposals that work on free text, which has little to do with extracting information from semi-structured documents; b) it provides a method that relies on statistically sound tests to support the conclusions drawn; the previous work does not provide clear guidelines or recommend statistically sound tests, but rather a survey that collects many features to take into account as well as related work; c) we provide a novel method to compute the performance measures regarding unsupervised proposals; otherwise they would require the intervention of a user to compute them by using the annotations on the evaluation sets and the information extracted. Our contributions will help researchers in this area make sure that they have advanced the state of the art not only conceptually, but also from an empirical point of view; they will also help practitioners make informed decisions on which proposal is the most adequate for a particular problem. This conference is a good forum to discuss our ideas so that we can spread them to help improve the evaluation of information extraction proposals and gather valuable feedback from other researchers.

Keywords: web information extractors, information extraction evaluation method, Google scholar, web

Procedia PDF Downloads 236
24506 Information Communication Technology Based Road Traffic Accidents’ Identification, and Related Smart Solution Utilizing Big Data

Authors: Ghulam Haider Haidaree, Nsenda Lukumwena

Abstract:

Today the world of research enjoys abundant data, available in virtually any field: technology, science, business, politics, etc. This is commonly referred to as big data. It offers a great deal of precision and accuracy, supporting an in-depth look at any decision-making process. When well used, big data affords its users the opportunity to produce well-supported results. This paper leans extensively on big data to investigate possible smart solutions to urban mobility and related issues, namely road traffic accidents and their casualties and fatalities, based on multiple factors including age, gender, and location of accidents. Multiple technologies were used in combination to produce an Information Communication Technology (ICT) based solution with embedded technology. Those technologies principally include Geographic Information System (GIS), Orange Data Mining Software, and Bayesian Statistics, to name a few. The study uses the Leeds accident 2016 data to illustrate the thinking process and extracts from it a model that can be tested, evaluated, and replicated. The authors believe that the proposed model will significantly help to flatten the curve of road traffic accidents in areas of fast-growing population density, where motor-based mobility increases considerably.

Keywords: accident factors, geographic information system, information communication technology, mobility

Procedia PDF Downloads 197
24505 Predicting the Diagnosis of Alzheimer’s Disease: Development and Validation of Machine Learning Models

Authors: Jay L. Fu

Abstract:

Patients with Alzheimer's disease progressively lose their memory and thinking skills and, eventually, the ability to carry out simple daily tasks. The disease is irreversible, but early detection and treatment can slow its progression. In this research, publicly available MRI data and demographic data from 373 MRI imaging sessions were used to build models to predict dementia. Various machine learning models, including logistic regression, k-nearest neighbor, support vector machine, random forest, and neural network, were developed. Data were divided into training and testing sets, where training sets were used to build the predictive model, and testing sets were used to assess the accuracy of prediction. Key risk factors were identified, and the models were compared to arrive at the best prediction model. Among these models, the random forest model appeared to be the best, with an accuracy of 90.34%. MMSE, nWBV, and gender were the three most important contributing factors to the detection of Alzheimer’s. Across all the models used, at least 4 of the 5 models shared the same diagnosis for 90.42% of the testing inputs. These machine learning models allow early detection of Alzheimer’s with good accuracy, which can ultimately lead to early treatment of these patients.
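
The agreement figure quoted in the abstract (the share of testing inputs on which at least 4 of the 5 models give the same diagnosis) can be illustrated with a minimal sketch. The function name `agreement_rate` and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def agreement_rate(predictions, min_agree=4):
    """Fraction of test cases where at least `min_agree` models give the same label.

    `predictions` is a list with one entry per test case; each entry is the
    list of labels produced by the individual models for that case.
    """
    if not predictions:
        raise ValueError("no predictions given")
    agreed = 0
    for labels in predictions:
        # Size of the largest group of models that share one label.
        top_count = Counter(labels).most_common(1)[0][1]
        if top_count >= min_agree:
            agreed += 1
    return agreed / len(predictions)


# Hypothetical labels from five models on four test cases (1 = demented, 0 = not).
example = [
    [1, 1, 1, 1, 0],  # 4 of 5 agree
    [0, 0, 0, 0, 0],  # 5 of 5 agree
    [1, 0, 1, 0, 1],  # only 3 of 5 agree
    [0, 0, 1, 0, 0],  # 4 of 5 agree
]
rate = agreement_rate(example)
print(f"agreement rate: {rate:.2%}")
```

With real model outputs, each inner list would hold the five predictions for one testing input.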

Keywords: Alzheimer's disease, clinical diagnosis, magnetic resonance imaging, machine learning prediction

Procedia PDF Downloads 125
24504 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Hankyu Lee, Seoung Bum Kim

Abstract:

Imbalance and overlap in the class distribution are important issues in various applications of data mining. An imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., the major class) heavily exceeds the number of observations of the other class (i.e., the minor class). An overlapped dataset is one in which many observations are shared between the two classes. Imbalanced and overlapped data are frequently found in real applications, including fraud and abuse in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is challenging because it degrades the performance of most standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting the data space into three parts (nonoverlapping, light overlapping, and severe overlapping) and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experimental study was conducted to examine the properties of the proposed method and to compare it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through an experiment with real data.
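
The abstract names the Hausdorff distance as one ingredient for splitting the data space but does not say how it is applied. The sketch below only shows the standard Hausdorff distance between two finite point sets; the function names and sample points are illustrative assumptions, not the authors' procedure.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest neighbour in `b`."""
    return max(min(math.dist(x, y) for y in b) for x in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical 2-D observations for a major and a minor class.
major_class = [(0.0, 0.0), (1.0, 0.0)]
minor_class = [(0.0, 1.0), (3.0, 0.0)]
distance = hausdorff(major_class, minor_class)
print(distance)
```

A small distance means every point of each class lies close to some point of the other class, which is one possible signal of overlap.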

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 293
24503 Predictive Value of ¹⁸F-FDG Accumulation in Visceral Fat Activity to Detect Colorectal Cancer Metastases

Authors: Amil Suleimanov, Aigul Saduakassova, Denis Vinnikov

Abstract:

Objective: To assess functional visceral fat (VAT) activity evaluated by ¹⁸F-fluorodeoxyglucose (¹⁸F-FDG) positron emission tomography/computed tomography (PET/CT) as a predictor of metastases in colorectal cancer (CRC). Materials and methods: We assessed 60 patients with histologically confirmed CRC who underwent ¹⁸F-FDG PET/CT after surgical treatment and courses of chemotherapy. Age, histology, stage, and tumor grade were recorded. Functional VAT activity was measured by maximum standardized uptake value (SUVmax) using ¹⁸F-FDG PET/CT and tested as a predictor of later metastases in eight abdominal locations (RE – Epigastric Region, RLH – Left Hypochondriac Region, RRL – Right Lumbar Region, RU – Umbilical Region, RLL – Left Lumbar Region, RRI – Right Inguinal Region, RP – Hypogastric (Pubic) Region, RLI – Left Inguinal Region) and the pelvic cavity (P) in the adjusted regression models. We also report the best areas under the curve (AUC) for SUVmax with the corresponding sensitivity (Se) and specificity (Sp). Results: In both age-adjusted regression models and ROC analysis, ¹⁸F-FDG accumulation in RLH (cutoff SUVmax 0.74; Se 75%; Sp 61%; AUC 0.668; p = 0.049), RU (cutoff SUVmax 0.78; Se 69%; Sp 61%; AUC 0.679; p = 0.035), RRL (cutoff SUVmax 1.05; Se 69%; Sp 77%; AUC 0.682; p = 0.032) and RRI (cutoff SUVmax 0.85; Se 63%; Sp 61%; AUC 0.672; p = 0.043) could predict later metastases in CRC patients, as opposed to age, sex, primary tumor location, tumor grade and histology. Conclusions: VAT SUVmax is significantly associated with later metastases in CRC patients and can be used as their predictor.
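
To show how a sensitivity and specificity pair follows from a single SUVmax cutoff, here is a minimal sketch. The patient values are invented for illustration, and treating values at or above the cutoff as a positive prediction is an assumption; the abstract does not state the direction of the test.

```python
def sensitivity_specificity(values, outcomes, cutoff):
    """Return (sensitivity, specificity) for the rule `value >= cutoff`.

    `outcomes` holds 1 for patients who later developed metastases, else 0.
    The direction of the rule (>= means positive) is an assumption.
    """
    tp = fn = tn = fp = 0
    for value, outcome in zip(values, outcomes):
        predicted_positive = value >= cutoff
        if outcome == 1 and predicted_positive:
            tp += 1
        elif outcome == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical SUVmax readings and outcomes for eight patients.
suv_max = [0.5, 0.8, 0.9, 0.6, 1.1, 0.7, 0.4, 1.0]
metastasis = [0, 1, 1, 1, 1, 0, 0, 0]
se, sp = sensitivity_specificity(suv_max, metastasis, cutoff=0.74)
print(f"Se = {se:.0%}, Sp = {sp:.0%}")
```

Sweeping the cutoff over all observed values and plotting sensitivity against 1 - specificity gives the ROC curve whose area is the reported AUC.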

Keywords: ¹⁸F-FDG, PET/CT, colorectal cancer, predictive value

Procedia PDF Downloads 103
24502 The Reduction of Post-Blast Fumes to Improve Productivity and Safety: A Review Paper

Authors: Nhleko Monique Chiloane

Abstract:

The gold mining industry has predominantly used ammonium nitrate fuel oil (ANFO) explosives for decades, although these are known to be “gassier” and their detonation results in toxic fumes, for example, carbon monoxide (CO), nitrogen oxides (NOx) and ammonia. Re-entry into underground workings too soon after blasting can lead to fatal exposure to toxic fumes. It is, therefore, required that the polluted air be removed from the affected areas within a reasonable period before employees' re-entry into the working area. Post-blast re-entry times have therefore been described as a productivity bottleneck. The known causes of post-blast fumes are water ingress, incorrect fuel to oxygen ratio, confinement, explosive additives etc. To prevent or minimize post-blast fumes, some researchers have used neutralization, re-burning technique and non-explosive products or different oxidizing agents. The use of commercial explosives without nitrate oxidizing agents can also minimize the production of blasting fumes and thereby reduce the time needed for the clearance of these fumes to allow workers to re-enter the underground workings safely. The reduction in non-production time directly contributes to an increase in the available time per shift for productive work, thus leading to continuous mining. However, owing to its low cost and ease of use, ANFO is still widely used in South African underground blasting operations.

Keywords: post-blast fumes, continuous mining, ammonium nitrate explosive, non-explosive blasting, re-entry period

Procedia PDF Downloads 167
24501 Semi-Automatic Method to Assist Expert for Association Rules Validation

Authors: Amdouni Hamida, Gammoudi Mohamed Mohsen

Abstract:

To help the expert validate association rules extracted from data, several quality measures have been proposed in the literature. We distinguish two categories: objective and subjective measures. The first depends on a fixed threshold and on the quality of the data from which the rules are extracted. The second consists of providing the expert with tools to explore and visualize rules during the evaluation step. However, the number of extracted rules to validate remains high, so validating the rules manually is very hard. To solve this problem, we propose in this paper a semi-automatic method to assist the expert during the validation of association rules. Our method uses rule-based classification as follows: (i) we transform association rules into classification rules (classifiers); (ii) we use the generated classifiers for data classification; (iii) we visualize association rules with their classification quality to give the expert an overview and to assist him during the validation process.
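
The abstract does not name the objective measures it has in mind. As an illustration only, the sketch below computes the two most common ones, support and confidence, which are typically compared against fixed thresholds. The transactions and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`."""
    ante = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante if ante else 0.0


# Hypothetical market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, round(rule_confidence, 3))
```

A rule such as bread → milk would be kept for the expert only if both values exceed the chosen thresholds, which is why the number of surviving rules depends so strongly on those thresholds.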

Keywords: association rules, rule-based classification, classification quality, validation

Procedia PDF Downloads 418
24500 Optimizing Energy Efficiency: Leveraging Big Data Analytics and AWS Services for Buildings and Industries

Authors: Gaurav Kumar Sinha

Abstract:

In an era marked by increasing concerns about energy sustainability, this research endeavors to address the pressing challenge of energy consumption in buildings and industries. This study delves into the transformative potential of AWS services in optimizing energy efficiency. The research is founded on the recognition that effective management of energy consumption is imperative for both environmental conservation and economic viability. Buildings and industries account for a substantial portion of global energy use, making it crucial to develop advanced techniques for analysis and reduction. This study sets out to explore the integration of AWS services with big data analytics to provide innovative solutions for energy consumption analysis. Leveraging AWS's cloud computing capabilities, scalable infrastructure, and data analytics tools, the research aims to develop efficient methods for collecting, processing, and analyzing energy data from diverse sources. The core focus is on creating predictive models and real-time monitoring systems that enable proactive energy management. By harnessing AWS's machine learning and data analytics capabilities, the research seeks to identify patterns, anomalies, and optimization opportunities within energy consumption data. Furthermore, this study aims to propose actionable recommendations for reducing energy consumption in buildings and industries. By combining AWS services with metrics-driven insights, the research strives to facilitate the implementation of energy-efficient practices, ultimately leading to reduced carbon emissions and cost savings. The integration of AWS services not only enhances the analytical capabilities but also offers scalable solutions that can be customized for different building and industrial contexts. The research also recognizes the potential for AWS-powered solutions to promote sustainable practices and support environmental stewardship.

Keywords: energy consumption analysis, big data analytics, AWS services, energy efficiency

Procedia PDF Downloads 47
24499 Developing and Evaluating Clinical Risk Prediction Models for Coronary Artery Bypass Graft Surgery

Authors: Mohammadreza Mohebbi, Masoumeh Sanagou

Abstract:

The ability to predict clinical outcomes is of great importance to physicians and clinicians. A number of different methods have been used in an effort to accurately predict these outcomes. These methods include the development of scoring systems based on multivariate statistical modelling, and models involving the use of classification and regression trees. The process usually consists of two consecutive phases, namely model development and external validation. The model development phase consists of building a multivariate model and evaluating its predictive performance by examining calibration and discrimination, and internal validation. External validation tests the predictive performance of a model by assessing its calibration and discrimination in different but plausibly related patients. A motivating example on prediction modelling, using a sample of patients who have undergone coronary artery bypass graft (CABG) surgery, is used for illustrative purposes. A set of primary considerations for evaluating prediction model studies is also proposed, using specific quality indicators as criteria to help stakeholders evaluate the quality of a prediction model study.
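
Discrimination and calibration, the two properties named above, can each be summarized by a simple statistic. The sketch below uses the c-statistic (equivalent to the area under the ROC curve) for discrimination and calibration-in-the-large for calibration. These are common choices, assumed here for illustration; the abstract does not say which statistics the authors used, and the example risks are invented.

```python
def c_statistic(probs, outcomes):
    """Share of (event, non-event) pairs where the event has the higher risk.

    Ties count as half a concordant pair.
    """
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def calibration_in_the_large(probs, outcomes):
    """Observed event rate minus mean predicted risk (0 means agreement on average)."""
    return sum(outcomes) / len(outcomes) - sum(probs) / len(probs)


# Hypothetical predicted risks and observed outcomes for four patients.
predicted_risk = [0.10, 0.40, 0.35, 0.80]
observed = [0, 0, 1, 1]
c_stat = c_statistic(predicted_risk, observed)
citl = calibration_in_the_large(predicted_risk, observed)
print(c_stat, round(citl, 4))
```

In external validation the same two functions would be applied to predictions made for the new patient sample, without refitting the model.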

Keywords: clinical prediction models, clinical decision rule, prognosis, external validation, model calibration, biostatistics

Procedia PDF Downloads 282
24498 Harnessing Artificial Intelligence for Early Detection and Management of Infectious Disease Outbreaks

Authors: Amarachukwu B. Isiaka, Vivian N. Anakwenze, Chinyere C. Ezemba, Chiamaka R. Ilodinso, Chikodili G. Anaukwu, Chukwuebuka M. Ezeokoli, Ugonna H. Uzoka

Abstract:

Infectious diseases continue to pose significant threats to global public health, necessitating advanced and timely detection methods for effective outbreak management. This study explores the integration of artificial intelligence (AI) in the early detection and management of infectious disease outbreaks. Leveraging vast datasets from diverse sources, including electronic health records, social media, and environmental monitoring, AI-driven algorithms are employed to analyze patterns and anomalies indicative of potential outbreaks. Machine learning models, trained on historical data and continuously updated with real-time information, contribute to the identification of emerging threats. The implementation of AI extends beyond detection, encompassing predictive analytics for disease spread and severity assessment. Furthermore, the paper discusses the role of AI in predictive modeling, enabling public health officials to anticipate the spread of infectious diseases and allocate resources proactively. Machine learning algorithms can analyze historical data, climatic conditions, and human mobility patterns to predict potential hotspots and optimize intervention strategies. The study evaluates the current landscape of AI applications in infectious disease surveillance and proposes a comprehensive framework for their integration into existing public health infrastructures. The implementation of an AI-driven early detection system requires collaboration between public health agencies, healthcare providers, and technology experts. Ethical considerations, privacy protection, and data security are paramount in developing a framework that balances the benefits of AI with the protection of individual rights. The synergistic collaboration between AI technologies and traditional epidemiological methods is emphasized, highlighting the potential to enhance a nation's ability to detect, respond to, and manage infectious disease outbreaks in a proactive and data-driven manner. 
The findings of this research underscore the transformative impact of harnessing AI for early detection and management, offering a promising avenue for strengthening the resilience of public health systems in the face of evolving infectious disease challenges. This paper advocates for the integration of artificial intelligence into the existing public health infrastructure for early detection and management of infectious disease outbreaks. The proposed AI-driven system has the potential to revolutionize the way we approach infectious disease surveillance, providing a more proactive and effective response to safeguard public health.

Keywords: artificial intelligence, early detection, disease surveillance, infectious diseases, outbreak management

Procedia PDF Downloads 47
24497 A Survey on Compression Methods for Table Constraints

Authors: N. Gharbi

Abstract:

Constraint satisfaction problems are mathematical problems that are often used to model real-world problems in which we ask whether there exists a solution satisfying all the constraints. Table constraints are important for modeling parts of many problems since they list all combinations of allowed or forbidden values. However, they have practical limitations because they are sometimes too large to be represented directly. In this paper, we present a survey of the different categories of approaches proposed to compress table constraints in order to reduce both space and time complexity.

Keywords: constraint programming, compression, data mining, table constraints

Procedia PDF Downloads 307
24496 Improvement of Microstructure, Wear and Mechanical Properties of Modified G38NiCrMo8-4-4 Steel Used in Mining Industry

Authors: Mustafa Col, Funda Gul Koc, Merve Yangaz, Eylem Subasi, Can Akbasoglu

Abstract:

G38NiCrMo8-4-4 steel is widely used in mining industries, machine parts, gears due to its high strength and toughness properties. In this study, microstructure, wear and mechanical properties of G38NiCrMo8-4-4 steel modified with boron used in the mining industry were investigated. For this purpose, cast materials were alloyed by melting in an induction furnace to include boron with the rates of 0 ppm, 15 ppm, and 50 ppm (wt.) and were formed in the dimensions of 150x200x150 mm by casting into the sand mould. Homogenization heat treatment was applied to the specimens at 1150˚C for 7 hours. Then all specimens were austenitized at 930˚C for 1 hour, quenched in the polymer solution and tempered at 650˚C for 1 hour. Microstructures of the specimens were investigated by using light microscope and SEM to determine the effect of boron and heat treatment conditions. Changes in microstructure properties and material hardness were obtained due to increasing boron content and heat treatment conditions after microstructure investigations and hardness tests. Wear tests were carried out using a pin-on-disc tribometer under dry sliding conditions. Charpy V notch impact test was performed to determine the toughness properties of the specimens. Fracture and worn surfaces were investigated with scanning electron microscope (SEM). The results show that boron element has a positive effect on the hardness and wear properties of G38NiCrMo8-4-4 steel.

Keywords: G38NiCrMo8-4-4 steel, boron, heat treatment, microstructure, wear, mechanical properties

Procedia PDF Downloads 180
24495 Impact of Coal Mining on River Sediment Quality in the Sydney Basin, Australia

Authors: A. Ali, V. Strezov, P. Davies, I. Wright, T. Kan

Abstract:

The environmental impacts arising from mining activities affect air, water, and soil quality. Impacts may result in unexpected and adverse environmental outcomes. This study reports on the impact of coal production on sediment in the Sydney region of Australia. Sediment samples were taken upstream and downstream of the discharge points of three mines, and 80 parameters were tested. The results were assessed against sediment quality guidelines based on the presence of metals. The study revealed an increase in the metal content of the sediment downstream of the reference locations. In many cases, the sediment exceeded the Australia and New Zealand Environment Conservation Council and international sediment quality guidelines value (SQGV). The major outliers to the guidelines were nickel (Ni) and zinc (Zn).

Keywords: coal mine, environmental impact, produced water, sediment quality guidelines value (SQGV)

Procedia PDF Downloads 288
24494 Analysis of Vocal Fold Vibrations from High-Speed Digital Images Based on Dynamic Time Warping

Authors: A. I. A. Rahman, Sh-Hussain Salleh, K. Ahmad, K. Anuar

Abstract:

Analysis of vocal fold vibration is essential for understanding the mechanism of voice production and for improving clinical assessment of voice disorders. This paper presents a Dynamic Time Warping (DTW) based approach to analyze and objectively classify vocal fold vibration patterns. The proposed technique was designed and implemented on a Glottal Area Waveform (GAW) extracted from high-speed laryngeal images by delineating the glottal edges for each image frame. Feature extraction from the GAW was performed using Linear Predictive Coding (LPC). Several types of voice reference templates from simulations of clear, breathy, fry, pressed and hyperfunctional voice productions were used. The patterns of the reference templates were first verified using the analytical signal generated through Hilbert transformation of the GAW. Samples from normal speakers’ voice recordings were then used to evaluate and test the effectiveness of this approach. Classification of the voice patterns using LPC and DTW achieved an accuracy of 81%.
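
Template matching with DTW, as described above, can be sketched as a nearest-template classifier. This is a textbook DTW on plain numeric sequences, not the authors' implementation: the LPC feature extraction step is omitted, and the template waveforms are made-up placeholders.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the reference template closest to `sample` under DTW."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Hypothetical reference templates (placeholders for glottal area waveforms).
reference_templates = {
    "clear": [0, 1, 0, 1],
    "breathy": [1, 1, 1, 1],
}
label = classify([0, 0, 1, 0, 1], reference_templates)
print(label)
```

Because DTW allows one sample to align with several template samples, the stretched input still matches the "clear" template with zero cost.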

Keywords: dynamic time warping, glottal area waveform, linear predictive coding, high-speed laryngeal images, Hilbert transform

Procedia PDF Downloads 227
24493 Destination Port Detection For Vessels: An Analytic Tool For Optimizing Port Authorities Resources

Authors: Lubna Eljabu, Mohammad Etemad, Stan Matwin

Abstract:

Port authorities in congested ports face many challenges in allocating their resources to provide a safe and secure loading/unloading procedure for cargo vessels. Selecting a destination port is the decision of a vessel master based on many factors such as weather, wavelength and changes of priorities. Having access to a tool that leverages AIS messages to monitor vessels’ movements and accurately predict their next destination port promotes an effective resource allocation process for port authorities. In this research, we propose a method, namely Reference Route of Trajectory (RRoT), to assist port authorities in predicting inflow and outflow traffic in their local environment by monitoring Automatic Identification System (AIS) messages. Our RRoT method creates a reference route based on historical AIS messages. It uses some of the best trajectory similarity measures to identify the destination of a vessel from its recent movement. We evaluated five similarity measures: Discrete Fréchet Distance (DFD), Dynamic Time Warping (DTW), Partial Curve Mapping (PCM), Area between two curves (Area) and Curve length (CL). Our experiments show that our method identifies the destination port with an accuracy of 98.97% and an F-measure of 99.08% using the Dynamic Time Warping (DTW) similarity measure.
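
The idea of comparing a vessel's recent track against one reference route per port can be sketched with one of the listed measures, the discrete Fréchet distance. This is a simplified illustration, not the RRoT method itself: the port names, routes, and planar (x, y) coordinates are hypothetical, and real AIS positions in latitude and longitude would need a geodesic distance.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines given as point lists."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]


def predict_destination(track, reference_routes):
    """Return the port whose reference route is most similar to `track`."""
    return min(reference_routes, key=lambda port: discrete_frechet(track, reference_routes[port]))


# Hypothetical reference routes in planar coordinates, one per destination port.
routes = {
    "PortA": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    "PortB": [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)],
}
recent_track = [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)]
predicted = predict_destination(recent_track, routes)
print(predicted)
```

Swapping `discrete_frechet` for another measure such as DTW only changes the function passed to `min`, which is how several similarity measures could be compared on the same routes.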

Keywords: spatial temporal data mining, trajectory mining, trajectory similarity, resource optimization

Procedia PDF Downloads 100
24492 Machine Learning Methods for Network Intrusion Detection

Authors: Mouhammad Alkasassbeh, Mohammad Almseidin

Abstract:

Network security engineers work to keep services available at all times by handling intruder attacks. An Intrusion Detection System (IDS) is one of the available mechanisms used to sense and classify abnormal actions. Therefore, the IDS must always be up to date with the latest intruder attack signatures to preserve the confidentiality, integrity, and availability of the services. The speed of the IDS is a very important issue, as is learning new attacks. This research work illustrates how the Knowledge Discovery and Data Mining (or Knowledge Discovery in Databases) KDD dataset is very handy for testing and evaluating different machine learning techniques. It mainly focuses on the KDD preprocessing part in order to prepare a decent and fair experimental data set. The J48, MLP, and Bayes Network classifiers were chosen for this study. The J48 classifier achieved the highest accuracy rate for detecting and classifying all KDD dataset attacks, which are of type DOS, R2L, U2R, and PROBE.

Keywords: IDS, DDoS, MLP, KDD

Procedia PDF Downloads 223
24491 The Impact of Mining Activities on the Surface Water Quality: A Case Study of the Kaap River in Barberton, Mpumalanga

Authors: M. F. Mamabolo

Abstract:

Mining activities are identified as the most significant source of heavy metal contamination in river basins, because inadequate disposal of mining waste results in acid mine drainage. Waste materials generated from gold mining and processing have severe and widespread impacts on water resources. Therefore, a total of 30 water samples were collected from Fig Tree Creek, the Kaap River, the Sheba mine stream and the Suid Kaap River to investigate the impact of gold mines on the Kaap River system. Physicochemical parameters (pH, EC and TDS) were measured using a BANTE 900P portable water quality meter. The concentrations of Fe, Cu, Co, and SO₄²⁻ in water samples were analysed using Inductively Coupled Plasma-Mass Spectrometry (ICP-MS) at 0.01 mg/L. The results were compared to the regulatory guidelines of the World Health Organization (WHO) and the South Africa National Standards (SANS). It was found that Fe, Cu and Co were below the guideline values, while SO₄²⁻ detected in the Sheba mine stream exceeded the 250 mg/L limit in both seasons, which is attributed to mine wastewater. SO₄²⁻ was higher in the wet season due to high evaporation rates and greater interaction between rocks and water. The pH of all the streams was within the limit (≥5 to ≤9.7); however, the EC of the Sheba mine stream, the Suid Kaap River and the point where the tributary connects with Fig Tree Creek exceeded 1700 µS/m, due to dissolved material. The TDS of the Sheba mine stream exceeded 1000 mg/L, which is attributed to high SO₄²⁻ concentration, while the tributary connecting to Fig Tree Creek exceeded the value due to pollution from household waste, agricultural runoff, etc. In conclusion, the water from all sampled streams was safe for consumption due to low concentrations of physicochemical parameters. However, elevated concentrations of SO₄²⁻ should be monitored and managed to avoid water quality deterioration in the Kaap River system.

Keywords: Kaap river system, mines, heavy metals, sulphate

Procedia PDF Downloads 61
24490 The Use of Piezocone Penetration Test Data for the Assessment of Iron Ore Tailings Liquefaction Susceptibility

Authors: Breno M. Castilho

Abstract:

The Iron Ore Quadrangle, located in the state of Minas Gerais, Brazil is responsible for most of the country’s iron ore production. As a result, some of the biggest tailings dams in the country are located in this area. In recent years, several major failure events have happened in Tailings Storage Facilities (TSF) located in the Iron Ore Quadrangle. Some of these failures were found to be caused by liquefaction flowslides. This paper presents Piezocone Penetration Test (CPTu) data that was used, by applying Olson and Peterson methods, for the liquefaction susceptibility assessment of the iron ore tailings that are typically found in most TSF in the area. Piezocone data was also used to determine the steady-state strength of the tailings so as to allow for comparison with its drained strength. Results have shown great susceptibility for liquefaction to occur in the studied tailings and, more importantly, a large reduction in its strength. These results are key to understanding the failures that took place over the last few years.

Keywords: Piezocone Penetration Test CPTu, iron ore tailings, mining, liquefaction susceptibility assessment

Procedia PDF Downloads 216
24489 Study of the Transport of ²²⁶Ra Colloidal in Mining Context Using a Multi-Disciplinary Approach

Authors: Marine Reymond, Michael Descostes, Marie Muguet, Clemence Besancon, Martine Leermakers, Catherine Beaucaire, Sophie Billon, Patricia Patrier

Abstract:

²²⁶Ra is one of the radionuclides resulting from the disintegration of ²³⁸U. Due to its half-life (1600 y) and its high specific activity (3.7 x 1010 Bq/g), ²²⁶Ra is found at the ultra-trace level in the natural environment (usually below 1 Bq/L, i.e. 10-13 mol/L). Because of its decay in ²²²Rn, a radioactive gas with a shorter half-life (3.8 days) which is difficult to control and dangerous for humans when inhaled, ²²⁶Ra is subject to a dedicated monitoring in surface waters especially in the context of uranium mining. In natural waters, radionuclides occur in dissolved, colloidal or particular forms. Due to the size of colloids, generally ranging between 1 nm and 1 µm and their high specific surface areas, the colloidal fraction could be involved in the transport of trace elements, including radionuclides in the environment. The colloidal fraction is not always easy to determine and few existing studies focus on ²²⁶Ra. In the present study, a complete multidisciplinary approach is proposed to assess the colloidal transport of ²²⁶Ra. It includes water sampling by conventional filtration (0.2µm) and the innovative Diffusive Gradient in Thin Films technique to measure the dissolved fraction (<10nm), from which the colloidal fraction could be estimated. Suspended matter in these waters were also sampled and characterized mineralogically by X-Ray Diffraction, infrared spectroscopy and scanning electron microscopy. All of these data, which were acquired on a rehabilitated former uranium mine, allowed to build a geochemical model using the geochemical calculation code PhreeqC to describe, as accurately as possible, the colloidal transport of ²²⁶Ra. Colloidal transport of ²²⁶Ra was found, for some of the sampling points, to account for up to 95% of the total ²²⁶Ra measured in water. Mineralogical characterization and associated geochemical modelling highlight the role of barite, a barium sulfate mineral well known to trap ²²⁶Ra into its structure. 
Barite was shown to be responsible for the colloidal ²²⁶Ra fraction despite the presence of kaolinite and ferrihydrite, which are also known to retain ²²⁶Ra by sorption.
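
The link the abstract draws between the half-life, the specific activity and the molar concentration of ²²⁶Ra can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the molar mass is rounded to 226 g/mol, and a Julian year of 365.25 days is assumed.

```python
import math

AVOGADRO = 6.02214076e23               # atoms per mole
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year

def specific_activity_bq_per_g(half_life_years, molar_mass_g_per_mol):
    """Specific activity A = lambda * N_A / M, with lambda = ln(2) / half-life."""
    decay_constant = math.log(2) / (half_life_years * SECONDS_PER_YEAR)
    return decay_constant * AVOGADRO / molar_mass_g_per_mol

def activity_to_molar(activity_bq_per_l, half_life_years, molar_mass_g_per_mol):
    """Convert an activity concentration (Bq/L) to a molar concentration (mol/L)."""
    grams_per_litre = activity_bq_per_l / specific_activity_bq_per_g(
        half_life_years, molar_mass_g_per_mol
    )
    return grams_per_litre / molar_mass_g_per_mol

# Values for 226Ra: half-life from the abstract, molar mass rounded (assumption).
ra226_specific_activity = specific_activity_bq_per_g(1600.0, 226.0)
ra226_molar_at_1_bq = activity_to_molar(1.0, 1600.0, 226.0)
print(f"{ra226_specific_activity:.2e} Bq/g, {ra226_molar_at_1_bq:.2e} mol/L")
```

With these inputs the specific activity comes out close to the 3.7 × 10¹⁰ Bq/g quoted above, and 1 Bq/L corresponds to roughly 10⁻¹³ mol/L.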

Keywords: colloids, mining context, radium, transport

Procedia PDF Downloads 139
24488 Statistical Analysis to Select Evacuation Route

Authors: Zaky Musyarof, Dwi Yono Sutarto, Dwima Rindy Atika, R. B. Fajriya Hakim

Abstract:

Each country should be responsible for the safety of its people, especially those living in disaster-prone areas. One such service is providing evacuation routes for them. So far, however, the selection of evacuation routes does not seem to be well organized: when a disaster happens, people accumulate at many points along the evacuation route. This condition is dangerous because it hampers the evacuation process. Using several methods of statistical analysis, the authors suggest how to prepare an evacuation route that is organized and based on people's habits. Those methods are association rules, sequential pattern mining, hierarchical cluster analysis and fuzzy logic.

Keywords: association rules, sequential pattern mining, cluster analysis, fuzzy logic, evacuation route

Procedia PDF Downloads 484
24487 Predictive Value of ¹⁸F-Fluorodeoxyglucose Accumulation in Visceral Fat Activity to Detect Epithelial Ovarian Cancer Metastases

Authors: A. F. Suleimanov, A. B. Saduakassova, V. S. Pokrovsky, D. V. Vinnikov

Abstract:

Relevance: Epithelial ovarian cancer (EOC) is the most lethal gynecological malignancy, with relapse occurring in about 70% of advanced cases with poor prognoses. The aim of the study was to evaluate functional visceral fat activity (VAT), assessed by ¹⁸F-fluorodeoxyglucose (¹⁸F-FDG) positron emission tomography/computed tomography (PET/CT), as a predictor of metastases in EOC. Materials and methods: We assessed 53 patients with histologically confirmed EOC who underwent ¹⁸F-FDG PET/CT after surgical treatment and courses of chemotherapy. Age, histology, stage, and tumor grade were recorded. Functional VAT activity was measured by maximum standardized uptake value (SUVₘₐₓ) using ¹⁸F-FDG PET/CT and tested as a predictor of later metastases in eight abdominal locations (RE – Epigastric Region, RLH – Left Hypochondriac Region, RRL – Right Lumbar Region, RU – Umbilical Region, RLL – Left Lumbar Region, RRI – Right Inguinal Region, RP – Hypogastric (Pubic) Region, RLI – Left Inguinal Region) and the pelvic cavity (P) in adjusted regression models. We also identified the best areas under the curve (AUC) for SUVₘₐₓ with the corresponding sensitivity (Se) and specificity (Sp). Results: In both adjusted regression models and ROC analysis, ¹⁸F-FDG accumulation in RE (cut-off SUVₘₐₓ 1.18; Se 64%; Sp 64%; AUC 0.669; p = 0.035) could predict later metastases in EOC patients, as opposed to age, sex, primary tumor location, tumor grade, and histology. Conclusions: VAT SUVₘₐₓ is significantly associated with later metastases in EOC patients and can be used as their predictor.
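
For readers unfamiliar with the reported metrics, the following sketch shows how sensitivity, specificity and AUC are commonly computed for a continuous marker with a cut-off. It is not the authors' analysis: the SUVₘₐₓ values and outcome labels are invented for illustration, and only the cut-off of 1.18 is taken from the abstract.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as positive and compare with true labels (1/0)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative
    (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical SUVmax values; label 1 = later metastasis, 0 = none.
suv_max = [0.9, 1.0, 1.1, 1.3, 1.2, 1.4, 1.5, 1.0]
metastasis = [0, 0, 0, 0, 1, 1, 1, 1]

se, sp = sensitivity_specificity(suv_max, metastasis, cutoff=1.18)
area = auc(suv_max, metastasis)
print(f"Se={se:.2f} Sp={sp:.2f} AUC={area:.3f}")
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve; it is quadratic in the sample size, which is acceptable for a cohort of this scale.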

Keywords: ¹⁸F-FDG, PET/CT, EOC, predictive value

Procedia PDF Downloads 54
24486 Cluster Analysis of Students’ Learning Satisfaction

Authors: Purevdolgor Luvsantseren, Ajnai Luvsan-Ish, Oyuntsetseg Sandag, Javzmaa Tsend, Akhit Tileubai, Baasandorj Chilhaasuren, Jargalbat Puntsagdash, Galbadrakh Chuluunbaatar

Abstract:

One of the indicators of the quality of university services is student satisfaction. Aim: We aimed to study the level of satisfaction of first-year premedical students with the Medical Physics course using the cluster method. Materials and Methods: Questionnaires were collected from a total of 324 first-year premedical students who took the medical physics course at the Mongolian National University of Medical Sciences. Satisfaction was recorded on five levels: "excellent", "good", "medium", "bad" and "very bad". A total of 39 questionnaire items were collected from students: 8 for course evaluation, 19 for teacher evaluation, and 12 for student evaluation. From the research, a database with 39 fields and 324 records was created. Results: Cluster analysis was performed on this database in MATLAB and R using the k-means method of data mining. The Hopkins statistic calculated for the database gave values of 0.88, 0.87, and 0.97, which shows that cluster analysis methods can be used. The course evaluation sub-base is divided into three clusters: cluster I has 150 objects with a "good" rating (46.2%), cluster II has 119 objects with a "medium" rating (36.7%), and cluster III has 54 objects with a "good" rating (16.6%). The teacher evaluation sub-base is divided into three clusters: cluster II has 179 objects with a "good" rating (55.2%), cluster III has 108 objects with an "average" rating (33.3%), and cluster I has 36 objects with an "excellent" rating (11.1%). The student evaluation sub-base is divided into two clusters: cluster II has 215 objects with an "excellent" rating (66.3%), and cluster I has 108 objects with an "excellent" rating (33.3%).
The silhouette coefficients of the resulting clusters were 0.32 for course evaluation, 0.31 for teacher evaluation, and 0.30 for student evaluation, which was taken to indicate statistical significance. Conclusion: Cluster analysis gave “good” 46.2%, “middle” 36.7% and “bad” 16.6% in the medical physics lesson model; “good” 55.2%, “middle” 33.3% and “bad” 11.1% in the teacher evaluation model; and “good” 66.3% and “bad” 33.3% in the student evaluation model.
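
The silhouette coefficient used above to evaluate the clusters can be sketched in a few lines. This is a generic illustration using Euclidean distance, not the authors' MATLAB or R code; the points and cluster labels are made up, and the function name is hypothetical.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points, using Euclidean distance.

    For each point: a = mean distance to the other members of its own cluster,
    b = smallest mean distance to the members of any other cluster,
    s = (b - a) / max(a, b). Points in singleton clusters contribute 0.
    Requires at least two clusters.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # silhouette of a singleton is defined as 0
        # The point itself contributes distance 0, so divide by len(own) - 1.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical 2-D data: two tight, well-separated groups.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(data, [0, 0, 1, 1])
bad = silhouette_score(data, [0, 1, 0, 1])
print(round(good, 3), round(bad, 3))
```

The coefficient ranges from -1 to 1; higher values mean that points sit closer to their own cluster than to the nearest other cluster.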

Keywords: questionnaire, data mining, k-means method, silhouette coefficient

Procedia PDF Downloads 28
24485 Improving Grade Control Turnaround Times with In-Pit Hyperspectral Assaying

Authors: Gary Pattemore, Michael Edgar, Andrew Job, Marina Auad, Kathryn Job

Abstract:

As critical commodities become scarcer, significant time and resources have been devoted to better understanding complicated ore bodies and extracting their full potential. These challenging ore bodies present several pain points for geologists and engineers to overcome; poor handling of these issues flows downstream to the processing plant, affecting throughput rates and recovery. Many open cut mines utilise blast hole drilling to extract additional information to feed back into the modelling process. This method requires samples to be collected during or after blast hole drilling. Samples are then sent for assay, with turnaround times varying from 1 to 12 days. This method is time-consuming and costly, requires human exposure on the bench and collects elemental data only. To address this challenge, research has been undertaken to utilise hyperspectral imaging across a broad spectrum to scan samples and collars or to take downhole measurements of mineral and moisture content and grade abundances. Automation of this process using unmanned vehicles and on-board processing reduces human in-pit exposure to ensure ongoing safety. On-board processing allows data to be integrated into modelling workflows immediately. The preliminary results demonstrate numerous direct and indirect benefits from this new technology, including rapid and accurate estimates of grade, moisture content and mineralogy. These benefits allow for faster geo-modelling updates, better informed mine scheduling and improved downstream blending and processing practices. The paper presents recommendations for implementation of the technology in open cut mining environments.

Keywords: grade control, hyperspectral scanning, artificial intelligence, autonomous mining, machine learning

Procedia PDF Downloads 92
24484 Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa

Authors: Joshua N. Edokpayi, John O. Odiyo, Patience P. Shikwambana

Abstract:

Water is a scarce natural resource in South Africa. The Ga-Selati River is used for both domestic and industrial purposes. This study was carried out to assess the quality of the Ga-Selati River in a mining area of Phalaborwa, Limpopo Province. The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter, while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river from eight different sampling sites was 8.00 and 9.38 in the wet and dry seasons, respectively. Higher EC values were determined in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in the dry season (929.29 mg/L) than in the wet season (640.72 mg/L). These values exceeded the recommended guideline of the South Africa Department of Water Affairs and Forestry (DWAF) for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mS/m), respectively. Turbidity varied between 1.78-5.20 NTU and 0.95-2.37 NTU in the wet and dry seasons, respectively. Total hardness values of 312.50 mg/L and 297.75 mg/L as CaCO₃ were computed for the river in the wet and dry seasons, respectively, and the river water was categorised as very hard. Mean concentrations of the metals studied in the wet and dry seasons, respectively, were: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L).
Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.

Keywords: contamination, mining activities, surface water, trace metals

Procedia PDF Downloads 300
24483 COVID_ICU_BERT: A Fine-Tuned Language Model for COVID-19 Intensive Care Unit Clinical Notes

Authors: Shahad Nagoor, Lucy Hederman, Kevin Koidl, Annalina Caputo

Abstract:

Doctors’ notes reflect their impressions, attitudes, clinical sense, and opinions about patients’ conditions and progress, and other information that is essential for doctors’ daily clinical decisions. Despite their value, clinical notes are insufficiently researched within the language processing community. Automatically extracting information from unstructured text data is known to be a difficult task compared with handling structured information such as vital physiological signs, images, and laboratory results. The aim of this research is to investigate how Natural Language Processing (NLP) and machine learning techniques applied to clinician notes can assist doctors’ decision-making in the Intensive Care Unit (ICU) for coronavirus disease 2019 (COVID-19) patients. The hypothesis is that clinical outcomes like survival or mortality can be useful in influencing the judgement of clinical sentiment in ICU clinical notes. This paper makes two contributions. First, we introduce COVID_ICU_BERT, a fine-tuned version of clinical transformer models that can reliably predict clinical sentiment for notes of COVID patients in the ICU. We train the model on clinical notes for COVID-19 patients, a type of note that was not previously seen by clinicalBERT and Bio_Discharge_Summary_BERT. The model, which was based on clinicalBERT, achieves higher predictive accuracy (Acc 93.33%, AUC 0.98, and precision 0.96). Second, we perform data augmentation using clinical contextual word embeddings based on a pre-trained clinical model to balance the samples in each class in the data (survived vs. deceased patients). Data augmentation improves the accuracy of prediction slightly (Acc 96.67%, AUC 0.98, and precision 0.92).

Keywords: BERT fine-tuning, clinical sentiment, COVID-19, data augmentation

Procedia PDF Downloads 184
24482 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms

Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao

Abstract:

Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data are then pre-processed, fed into the proposed algorithm, and the obtained result is analyzed. Algorithms used in satellite imagery classification include U-Net, Random Forest, DeepLabv3, CNN, ANN, and ResNet. In this project, we use the DeepLabv3 (atrous convolution) algorithm for land cover classification. The dataset used is the DeepGlobe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.
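
To illustrate what an atrous (dilated) convolution does, here is a minimal one-dimensional sketch. It is a toy stand-in for the 2-D operation inside DeepLabv3, not code from that system; the function name, signal and kernel are hypothetical. As is usual in deep learning, the kernel is applied without flipping.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous convolution over 'valid' positions only.

    Kernel taps are spaced `rate` samples apart, so a kernel of length k
    covers (k - 1) * rate + 1 input samples without adding any weights.
    """
    span = (len(kernel) - 1) * rate
    return [
        sum(kernel[k] * signal[i + k * rate] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

dense = atrous_conv1d(signal, kernel, rate=1)    # covers 3 samples per output
dilated = atrous_conv1d(signal, kernel, rate=2)  # covers 5 samples per output
print(dense, dilated)
```

Running the same kernel at several rates and combining the outputs is the idea behind using multiple atrous rates in parallel: each rate sees context at a different scale while the number of weights stays fixed.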

Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50

Procedia PDF Downloads 125
24481 Estimation of Morbidity Level of Industrial Labour Conditions at Zestafoni Ferroalloy Plant

Authors: M. Turmanauli, T. Todua, O. Gvaberidze, R. Javakhadze, N. Chkhaidze, N. Khatiashvili

Abstract:

Background: The mining process has a significant influence on human health and quality of life. In recent years, events in Georgia have affected industrial working processes; in particular, minimal requirements for labor safety, workplace hygiene standards and work and rest regimes are not observed. This situation is often caused by a lack of responsibility, awareness, and knowledge among both workers and employers. The control and protection of working conditions have worsened in many industries. Materials and Methods: To evaluate the current situation, a prospective epidemiological study using face-to-face interviews was conducted at the Georgian “Manganese Zestafoni Ferroalloy Plant” in 2011-2013. 65.7% of employees (1428 bulletins) were surveyed, and the incidence rates of temporary disability days were studied. Results: The average length of a single case of temporary disability was studied for each sex group as well as for the whole cohort. By class of harmfulness, the following results were obtained: Class 2.0: 10.3%; 3.1: 12.4%; 3.2: 35.1%; 3.3: 12.1%; 3.4: 17.6%; 4.0: 12.5%. Among the employees, 47.5% and 83.1% were tobacco and alcohol consumers, respectively. By age group and years of work, the ≥50 age group and the ≥21 years-of-work group were the most prevalent, respectively. The obtained data revealed a morbidity rate that increased with age and years of work. Diseases of the bone and joint system and connective tissue, aggravation of chronic respiratory diseases, ischemic heart disease, hypertension and cerebral blood discirculation were the leading diseases. A high prevalence of morbidity was observed in workplaces with unsatisfactory labor conditions from the hygienic point of view.
Conclusion: According to the data obtained, the causes of morbidity are the following: unsafe labor conditions; incomplete preventive medical examinations (preliminary and periodic); lack of access to appropriate health care services; and deficiencies in the gathering, recording, and analysis of morbidity data. This epidemiological study was conducted at the JSC “Manganese Ferro Alloy Plant” under the State program “Prevention of Occupational Diseases” (program code 35 03 02 05).

Keywords: occupational health, mining process, morbidity level, cerebral blood discirculation

Procedia PDF Downloads 414
24480 A Machine Learning Approach for Performance Prediction Based on User Behavioral Factors in E-Learning Environments

Authors: Naduni Ranasinghe

Abstract:

E-learning environments are becoming more popular than ever due to the impact of COVID-19. Even though e-learning is one of the best solutions for the teaching-learning process in academia, it is not without major challenges. Machine learning approaches are now used to analyse how behavioral factors lead to better adoption and how they relate to better student performance in e-learning environments. During the pandemic, it became clear that the e-learning approach to the academic process had a major issue, especially regarding student performance. Therefore, an approach that investigates student behavior in e-learning environments using data-intensive machine learning is valuable. A hybrid approach was used to understand how each of the previously mentioned variables is related to the others. A more quantitative approach, drawing on the literature, was used to understand the weight of each factor for adoption and in terms of performance. The data set was collected from previously conducted research to support the training and testing process in ML. Special attention was paid to incorporating different dimensions of the data to understand the dependency levels of each. Five independent variables out of twelve were chosen based on their impact on the dependent variable and on the descriptive statistics. Of the three models developed (random forest classifier, SVM, and decision tree classifier), the random forest classifier gave the highest accuracy (0.8542). Overall, this work met its goals of improving student performance by identifying students who are at risk or likely to drop out, emphasizing the necessity of using both static and dynamic data.

Keywords: academic performance prediction, e learning, learning analytics, machine learning, predictive model

Procedia PDF Downloads 133
24479 Molecular Topology and TLC Retention Behaviour of s-Triazines: QSRR Study

Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević

Abstract:

Quantitative structure-retention relationship (QSRR) analysis was used to predict the chromatographic behavior of s-triazine derivatives by using theoretical descriptors computed from the chemical structure. The basis of the reported investigation is to relate molecular topological descriptors to the chromatographic behavior of s-triazine derivatives obtained by reversed-phase (RP) thin layer chromatography (TLC) on silica gel impregnated with paraffin oil, with ethanol-water as the mobile phase (φ = 0.5-0.8; v/v). The retention parameter (RM0) of 14 investigated s-triazine derivatives was used as the dependent variable, while simple connectivity indices of different orders were used as independent variables. The best QSRR model for predicting the RM0 value was obtained with the simple third-order connectivity index (³χ) in a second-degree polynomial equation. The numerical values of the correlation coefficient (r = 0.915), Fisher's value (F = 28.34) and root mean square error (RMSE = 0.36) indicate that the model is statistically significant. To test the predictive power of the QSRR model, the leave-one-out cross-validation technique was applied. The parameters of the internal cross-validation analysis (r²CV = 0.79, r²adj = 0.81, PRESS = 1.89) reflect the high predictive ability of the generated model and confirm that it can be used to predict the RM0 value. A multivariate classification technique, hierarchical cluster analysis (HCA), was applied to group molecules according to their molecular connectivity indices. HCA is a descriptive statistical method that is most frequently used for important areas of data processing such as classification. The HCA performed on simple molecular connectivity indices obtained from the 2D structure of the investigated s-triazine compounds resulted in two main clusters in which molecules were grouped according to the number of atoms in the molecule.
This is in agreement with the fact that these descriptors were calculated on the basis of the number of atoms in the molecule of the investigated s-triazine derivatives.
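
The modelling steps described above (a second-degree polynomial in one descriptor, validated by leave-one-out cross-validation) can be sketched as follows. This is a generic illustration, not the authors' procedure: the data are synthetic, all function names are hypothetical, and the cross-validated r² is computed with one common definition, 1 − PRESS / total sum of squares.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via the normal equations."""
    powers = [sum(x ** k for x in xs) for k in range(5)]
    normal = [[powers[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve_linear(normal, rhs)

def predict(coeffs, x):
    return coeffs[0] + coeffs[1] * x + coeffs[2] * x * x

def loo_press(xs, ys):
    """PRESS: sum of squared errors when each point is predicted by a model
    fitted without it (leave-one-out)."""
    press = 0.0
    for i in range(len(xs)):
        coeffs = fit_quadratic(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - predict(coeffs, xs[i])) ** 2
    return press

def r2_cv(xs, ys):
    mean_y = sum(ys) / len(ys)
    total_ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - loo_press(xs, ys) / total_ss

# Synthetic descriptor/response pairs lying exactly on a quadratic (assumption).
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
response = [1.0 + 2.0 * x + 0.5 * x * x for x in descriptor]

coefficients = fit_quadratic(descriptor, response)
press_value = loo_press(descriptor, response)
cv_r2 = r2_cv(descriptor, response)
print(coefficients, press_value, cv_r2)
```

Because the synthetic data contain no noise, PRESS is essentially zero here; with real retention data PRESS grows with the leave-one-out prediction error, and the cross-validated r² falls below the fitted r².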

Keywords: s-triazines, QSRR, chemometrics, chromatography, molecular descriptors

Procedia PDF Downloads 375
24478 Developing a DNN Model for the Production of Biogas From a Hybrid BO-TPE System in an Anaerobic Wastewater Treatment Plant

Authors: Hadjer Sadoune, Liza Lamini, Scherazade Krim, Amel Djouadi, Rachida Rihani

Abstract:

Deep neural networks are highly regarded for their accuracy in predicting intricate fermentation processes. Their ability to learn from large amounts of data makes them particularly effective models. The primary obstacle to improving the performance of these models is the careful choice of suitable hyperparameters, including the neural network architecture (number of hidden layers and hidden units), activation function, optimizer, learning rate, and other relevant factors. This study predicts biogas production from real wastewater treatment plant data using hybrid Bayesian optimization with a tree-structured Parzen estimator (BO-TPE) to obtain an optimised deep neural network (DNN) model. The plant utilizes an Upflow Anaerobic Sludge Blanket (UASB) digester that treats industrial wastewater from soft drinks and breweries. The digester has a working volume of 1574 m³ and a total volume of 1914 m³. Its internal diameter and height are 19 m and 7.14 m, respectively. The data preprocessing was conducted with close attention to preserving data quality while avoiding data reduction. Three normalization techniques (MinMaxScaler, RobustScaler and StandardScaler) were applied to the pre-processed data and compared with the non-normalized data. The RobustScaler approach showed strong predictive ability for estimating the volume of biogas produced. The highest predicted biogas volume was 2236.105 Nm³/d, with coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE) values of 0.712, 164.610, and 223.429, respectively.
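
The three normalization techniques compared above differ only in the centre and scale they subtract and divide by. The sketch below reimplements their usual definitions with the Python standard library for a single feature. It is an illustration of the formulas, not the authors' pipeline; the function names and the sample values are hypothetical.

```python
import statistics

def min_max_scale(values):
    """Rescale to [0, 1] using the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standard_scale(values):
    """Centre on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def robust_scale(values):
    """Centre on the median and divide by the interquartile range,
    which makes the scaling less sensitive to outliers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]

# Hypothetical feature with one extreme value.
feature = [1.0, 2.0, 3.0, 4.0, 100.0]

scaled_min_max = min_max_scale(feature)
scaled_standard = standard_scale(feature)
scaled_robust = robust_scale(feature)
print(scaled_min_max, scaled_standard, scaled_robust)
```

In this toy example the single extreme value squeezes the other min-max scaled values close to zero, whereas the robust scaling keeps them evenly spread around the median. That behaviour is one plausible reason a robust scaler can help on plant data with occasional spikes, although the abstract does not state why it performed best.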

Keywords: anaerobic digestion, biogas production, deep neural network, hybrid bo-tpe, hyperparameters tuning

Procedia PDF Downloads 22