Search results for: graph mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1522

1072 A Note on Metallurgy at Khanak: An Indus Site in Tosham Mining Area, Haryana

Authors: Ravindra N. Singh, Dheerendra P. Singh

Abstract:

Recent discoveries of Bronze Age artefacts, tin slag, furnaces and crucibles, together with new geological evidence on tin deposits in the Tosham area of Bhiwani district in Haryana (India), provide the opportunity to survey the evidence for possible sources of tin and the use of bronze in the Harappan sites of north-western India. Earlier, Afghanistan emerged as the most promising eastern source of tin utilized by Indus Civilization copper-smiths. Our excavations conducted at Khanak near the Tosham mining area during 2014 and 2016 revealed ample evidence of metallurgical activities, as attested by the occurrence of slag, ores, ash and furnace fragments in addition to the bronze objects. We have conducted petrological, XRD, EDAX, TEM, SEM and metallographic analyses on the slag, ore, crucible fragment and bronze object samples recovered from the Khanak excavations. This has given a positive indication of mining and metallurgy of polymetallic tin at the site; however, it can only be ascertained after the detailed scientific examination of the materials, which is underway. In view of the importance of the site, we intend to excavate it horizontally in future so as to obtain more samples for scientific studies.

Keywords: archaeometallurgy, problem of tin, metallography, indus civilization

Procedia PDF Downloads 300
1071 Isolation Preserving Medical Conclusion Hold Structure via C5 Algorithm

Authors: Swati Kishor Zode, Rahul Ambekar

Abstract:

Data mining is the extraction of interesting patterns or knowledge from large amounts of data, and decisions are made according to the relevant information extracted. Recently, with the explosive development of the internet, data storage and processing techniques, privacy preservation has become one of the major concerns in data mining. Various techniques and methods have been developed for privacy-preserving data mining. In a Clinical Decision Support System, the decision is made on the basis of data retrieved from remote servers via the Internet in order to diagnose the patient. In this paper, the fundamental idea is to increase the accuracy of the Decision Support System for multiple diseases and also to protect patient information during communication between the clinician side (client side) and the server side. A privacy-preserving protocol for a clinical decision support network is proposed so that patient information always stays encrypted during the diagnosis process while accuracy is maintained. To improve the accuracy of the Decision Support System for multiple diseases, C5.0 classifiers are used, and to preserve privacy, a homomorphic encryption algorithm, the Paillier cryptosystem, is used.

Keywords: classification, homomorphic encryption, clinical decision support, privacy

Procedia PDF Downloads 330
1070 Secure Multiparty Computations for Privacy Preserving Classifiers

Authors: M. Sumana, K. S. Hareesha

Abstract:

Secure computations are essential while performing privacy preserving data mining. Distributed privacy preserving data mining involves two or more sites that cannot pool their data with a third party, because doing so would violate laws protecting the individual. Hence, in order to model the private data without compromising privacy or incurring information loss, secure multiparty computations are used. Secure computations of product, mean, variance, dot product and the sigmoid function using the additive and multiplicative homomorphic properties are discussed. The computations are performed on vertically partitioned data with a single site holding the class value.
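
The additive homomorphic property mentioned in this abstract can be illustrated with a toy Paillier cryptosystem. This is a minimal sketch, not the authors' protocol: the function names, the tiny primes and the choice g = n + 1 are assumptions made for illustration, and the key size is far too small for real use. Multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so a sum (and from it a mean) can be computed without any site revealing its value.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse.
    mu = pow(lam, -1, n)  # requires Python 3.8+
    return n, lam, mu


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public modulus n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, lam, mu, c):
    """Recover the plaintext from ciphertext c using the private values."""
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, lam, mu = keygen(1009, 1013)
    values = [20, 22, 18]  # hypothetical private values held by three sites
    total = encrypt(n, 0)
    for v in values:
        total = add_encrypted(n, total, encrypt(n, v))
    secure_sum = decrypt(n, lam, mu, total)
    print(secure_sum, secure_sum / len(values))
```

Only the key holder ever sees the aggregated sum; the mean is obtained by dividing the decrypted sum by the (public) number of contributions.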

Keywords: homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data

Procedia PDF Downloads 412
1069 In-situ Oxygen Enrichment for Underground Coal Gasification

Authors: Adesola O. Orimoloye, Edward Gobina

Abstract:

Membrane separation technology is still considered an emerging technology in the mining sector and does not yet have the widespread acceptance that it has in other industrial sectors. Underground Coal Gasification (UCG), wherein coal is converted to gas in situ, is a safer alternative to conventional mining that retains all pollutants underground, making the process environmentally friendly. In-situ combustion of coal for power generation allows access to more of the physical global coal resource than would be included in current economically recoverable reserve estimates. Where mining is no longer taking place, for economic or geological reasons, controlled gasification (again a reaction of coal to form a synthesis gas) permits exploitation of the coal seams in situ. The oxygen supply stage is one of the most expensive parts of any gasification project, but the use of membranes is a potentially attractive approach for producing oxygen-enriched air. In this study, a variety of cost-effective membrane materials that give optimal oxygen concentrations in the range of interest were designed and tested at diverse operating conditions. An oxygen-enriched atmosphere improves the combustion temperature, but a decline is observed if the oxygen concentration exceeds the optimum. The experimental results also reveal the preparatory method, apparatus and performance of the fabricated membrane.

Keywords: membranes, oxygen-enrichment, gasification, coal

Procedia PDF Downloads 459
1068 Accumulation of PM10 and Associated Metals Due to Opencast Coal Mining Activities and Their Impact on Human Health

Authors: Arundhuti Devi, Gitumani Devi, Krishna G. Bhattacharyya

Abstract:

The goal of this study was to assess the characteristics of the airborne dust created by opencast coal mining and its relation to population hospitalization risk for skin and lung diseases in Margherita Coalfield, Assam, India. Air samples were collected for 24 h in three 8-h periods. For the collection of particulate matter (PM10) and total suspended particulate matter (SPM) samples, respiratory dust samplers with glass microfiber filter papers were used. PM10 was analyzed for Cu, Cd, Cr, Mn, Zn, Ni, Fe and Pb with Flame Atomic Absorption Spectrophotometer (FAAS). SPM and PM10 concentrations were respectively found to be as high as 1,035 and 265.85 μg/m³ in work zone air. The concentration of metals associated with PM10 showed values higher than the permissible limits. It was observed that the average concentrations of the metals Fe, Pb, Ni, Zn, and Cu were very high during the winter month of December, those of Cd and Cr were high during the month of May and Mn was high during February. The morphology of the particles studied with scanning electron microscopy (SEM) gave significant results. Due to opencast coal mining, the air in the work zone, as well as the general ambient air, was found to be highly polluted with respect to dust. More than 8000 patient records maintained by the hospital authority were collected from three hospitals in the area. The highest percentage of people suffering from lung diseases are found in Margherita Civil Hospital (~26.77%) whereas most people suffering from skin diseases reported for treatment in the ESIC hospital (47.47%). Both PM10 and SPM were alarmingly high, and the results were in conformity with the high incidence of lung and other respiratory diseases in the study area.

Keywords: heavy metals, open cast coal mining, PM10, respiratory diseases

Procedia PDF Downloads 315
1067 Accessibility and Visibility through Space Syntax Analysis of the Linga Raj Temple in Odisha, India

Authors: S. Pramanik

Abstract:

Since the early ages, Hindu temples have been interpreted through various Vedic philosophies. These temples are visited by pilgrims, who demonstrate the rituals and religious beliefs of communities, reflecting a variety of actions and behaviors. Darsana, a direct seeing, is a part of the pilgrimage activity. During the process of Darsana, a devotee is prepared for entry into the temple to realize the cognizing Truth, culminating in visualizing the idol of God placed at the Garbhagriha (sanctum sanctorum). For this, the pilgrim must pass through a sequential arrangement of spaces. As they progress, the pilgrims visualize the spaces differently from various points of view. The viewpoints create a variety of spatial patterns in the minds of pilgrims, coherent with the Hindu philosophies. The space organization and its order are perceived by various techniques of spatial analysis. A temple, as an example of Kalinga stylistic variations, has been chosen for the study. This paper intends to demonstrate some visual patterns generated during the process of Darsana (visibility) and its accessibility by Point Isovist Studies and Visibility Graph Analysis from the entrance (Simha Dwara) to the sanctum sanctorum (Garbhagriha).

Keywords: Hindu temple architecture, point isovist, space syntax analysis, visibility graph analysis

Procedia PDF Downloads 120
1066 Optimizing the Location of Parking Areas Adapted for Dangerous Goods in the European Road Transport Network

Authors: María Dolores Caro, Eugenio M. Fedriani, Ángel F. Tenorio

Abstract:

The transportation of dangerous goods by lorries throughout Europe must be done by using the roads that make up the European Road Transport Network. In this network, there are several parking areas where lorry drivers can park to rest according to the regulations. According to the "European Agreement concerning the International Carriage of Dangerous Goods by Road", parking areas where lorries transporting dangerous goods can park to rest must follow several security stipulations to keep the rest of the road users safe. In this respect, these lorries must be parked in adapted areas with strict and permanent surveillance measures. Moreover, drivers must satisfy several restrictions about resting and driving time. Given these facts, one may expect that there exist enough parking areas for the transport of this type of goods in order to obey the regulations prescribed by the European Union and its member countries. However, the already-existing parking areas are not sufficient to cover all the stops required by drivers transporting dangerous goods. Our main goal is, starting from the already-existing parking areas and the loading-and-unloading locations, to provide an optimal answer to the following question: how many additional parking areas must be built and where must they be located to ensure that lorry drivers can transport dangerous goods following all the stipulations about security and safety for their stops? The word “optimal” refers to the fact that we give a global solution for the location of parking areas throughout the whole European Road Transport Network, keeping the number of additional areas as low as possible. To do so, we have modeled the problem using graph theory, since we are working with a road network. As nodes, we have considered the locations of each already-existing parking area, each loading-and-unloading area and each road bifurcation. Each road connecting two nodes is considered as an edge in the graph whose weight corresponds to the distance between the two nodes of the edge. By applying a new efficient algorithm, we have found the additional nodes for the network representing the new parking areas adapted for dangerous goods, under the constraint that the distance between two parking areas must be less than or equal to 400 km.
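
The 400 km constraint can be illustrated on a single route. The sketch below is a simplification, not the algorithm referred to in the abstract (which works on the whole network): it assumes one road corridor with existing parking areas given as kilometre positions, and the function name `extra_areas_on_route` is hypothetical. For each gap between consecutive areas it adds the fewest evenly spaced new areas needed so that no two neighbouring areas are more than 400 km apart.

```python
import math


def extra_areas_on_route(positions_km, max_gap_km=400.0):
    """Return positions of the new areas needed along one route.

    positions_km: kilometre marks of existing parking areas on the route.
    A gap of length g needs ceil(g / max_gap_km) - 1 additional areas,
    which is the minimum for a single line segment.
    """
    points = sorted(positions_km)
    new_points = []
    for start, end in zip(points, points[1:]):
        gap = end - start
        needed = math.ceil(gap / max_gap_km) - 1
        for i in range(1, needed + 1):
            # Spread the new areas evenly inside the gap.
            new_points.append(start + gap * i / (needed + 1))
    return new_points


if __name__ == "__main__":
    existing = [0, 400, 1300]  # hypothetical positions in km
    print(extra_areas_on_route(existing))
```

On a real network the gaps interact through shared junctions, which is why a global graph-based method is needed rather than this per-route rule.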

Keywords: trans-european transport network, dangerous goods, parking areas, graph-based modeling

Procedia PDF Downloads 280
1065 Optimization of Feeder Bus Routes at Urban Rail Transit Stations Based on Link Growth Probability

Authors: Yu Song, Yuefei Jin

Abstract:

Urban public transportation can be integrated when there is an efficient connection between urban rail lines; however, no effective or quick solutions for this connection are currently being investigated. This paper analyzes the space-time distribution and travel demand of passenger connection travel based on taxi track data and data from the road network, mines potential bus connection stations based on potential connection demand data, and introduces the link growth probability model from complex networks to solve for the basic connection bus lines, in order to ascertain the direction of the bus lines that are the most connected given the demand characteristics. Then, a constraint-based tree view exhaustive approach is suggested based on graph theory, which can hasten the convergence of findings while doing chain calculations. This study uses WEI QU NAN Station, the Xi'an Metro Line 2 terminal station in Shaanxi Province, as an illustration to evaluate the efficacy of the model and the solution method. According to the findings, 153 prospective stations have been identified in total, the feeder bus network for the entire line has been laid out, and the best route adjustment strategy has been found.

Keywords: feeder bus, route optimization, link growth probability, the graph theory

Procedia PDF Downloads 77
1064 Quantifying Multivariate Spatiotemporal Dynamics of Malaria Risk Using Graph-Based Optimization in Southern Ethiopia

Authors: Yonas Shuke Kitawa

Abstract:

Background: Although malaria incidence has fallen sharply over the past few years, the rate of decline varies by district, time, and malaria type. Despite this downturn, malaria remains a major public health threat in various districts of Ethiopia. Consequently, the present study is aimed at developing a predictive model that helps to identify the spatio-temporal variation in malaria risk by multiple Plasmodium species. Methods: We propose a multivariate spatio-temporal Bayesian model to obtain a more coherent picture of the temporally varying spatial variation in disease risk. The spatial autocorrelation in such a data set is typically modeled by a set of random effects that are assigned a conditional autoregressive prior distribution. However, the autocorrelation considered in such cases depends on a binary neighborhood matrix specified through the border-sharing rule. Here, we propose a graph-based optimization algorithm for estimating the neighborhood matrix that merely represents the spatial correlation by treating the areal units as the vertices of a graph and the neighbor relations as the edges. Furthermore, we used aggregated malaria counts in southern Ethiopia from August 2013 to May 2019. Results: We recognized that precipitation, temperature, and humidity are positively associated with the malaria threat in the area. On the other hand, enhanced vegetation index, nighttime light (NTL), and distance from coastal areas are negatively associated. Moreover, nonlinear relationships were observed between malaria incidence and precipitation, temperature, and NTL. Additionally, lagged effects of temperature and humidity have a significant effect on malaria risk for either species. A more elevated risk of P. falciparum was observed following the rainy season, and unstable transmission of P. vivax was observed in the area. Finally, P. vivax risks are less sensitive to environmental factors than those of P. falciparum.
Conclusion: Improved inference was gained by employing the proposed approach in comparison to the commonly used border-sharing rule. Additionally, different covariates are identified, including delayed effects, and elevated risks of either of the cases were observed in districts found in the central and western regions. As malaria transmission operates in a spatially continuous manner, a spatially continuous model should be employed when it is computationally feasible.
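
The binary neighborhood matrix that the abstract takes as its starting point can be sketched as follows. This shows only the conventional construction, with areal units as vertices and neighbor relations as edges, plus one common form of the matching conditional autoregressive precision structure (the intrinsic CAR form D - W). It is an assumption that this form is the relevant one, the function names are hypothetical, and the graph-based optimization proposed in the abstract is not reproduced here.

```python
def neighbourhood_matrix(n_units, edges):
    """Binary symmetric matrix W with W[i][j] = 1 when units i and j are neighbours."""
    W = [[0] * n_units for _ in range(n_units)]
    for i, j in edges:
        if i != j:
            W[i][j] = 1
            W[j][i] = 1
    return W


def icar_precision(W):
    """Intrinsic CAR precision structure Q = D - W, with D = diag(row sums of W)."""
    n = len(W)
    return [
        [sum(W[i]) if i == j else -W[i][j] for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Three hypothetical districts in a row: 0 borders 1, 1 borders 2.
    W = neighbourhood_matrix(3, [(0, 1), (1, 2)])
    for row in icar_precision(W):
        print(row)
```

Replacing the border-sharing edge list with an estimated one changes W, and through it the amount of smoothing applied between each pair of districts.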

Keywords: disease mapping, MSTCAR, graph-based optimization algorithm, P. falciparum, P. vivax, weighting matrix

Procedia PDF Downloads 77
1063 Visualizing the Commercial Activity of a City by Analyzing the Data Information in Layers

Authors: Taras Agryzkov, Jose L. Oliver, Leandro Tortosa, Jose Vicent

Abstract:

This paper aims to demonstrate how network models can be used to understand and to deal with some aspects of urban complexity. As is well known, the Theory of Architecture and Urbanism has for decades been using intellectual tools based on the ‘sciences of complexity’ as a strategy to propose theoretical approaches about cities and about architecture. In this sense, it is possible to find a vast literature in which, for instance, network theory is used as an instrument to understand very diverse questions about cities: from their commercial activity to their heritage condition. The contribution of this research consists in adding one step of complexity to this process: instead of working with one single primal graph, as is usually done, we will show how new network models arise from the consideration of two different primal graphs interacting in two layers. When we model an urban network through a mathematical structure like a graph, the city is usually represented by a set of nodes and edges that reproduce its topology, with the data generated or extracted from the city embedded in it. All this information is normally displayed in a single layer. Here, we propose to separate the information in two layers so that we can evaluate the interaction between them. Besides, both layers may be composed of structures that do not have to coincide: from this bi-layer system, groups of interactions emerge, suggesting reflections and, in consequence, possible actions.

Keywords: graphs, mathematics, networks, urban studies

Procedia PDF Downloads 180
1062 Developing Structured Sizing Systems for Manufacturing Ready-Made Garments of Indian Females Using Decision Tree-Based Data Mining

Authors: Hina Kausher, Sangita Srivastava

Abstract:

In India, there is a lack of a standard, systematic sizing approach for producing ready-made garments. Garment manufacturing companies use their own size tables, created by modifying international sizing charts of ready-made garments. The purpose of this study is to tabulate the anthropometric data which covers the variety of figure proportions in both height and girth. 3,000 data points were collected through an anthropometric survey of females between the ages of 16 and 80 years from some states of India to produce a sizing system suitable for clothing manufacture and retailing. This data is used for the statistical analysis of body measurements and the formulation of sizing systems and body measurement tables. The factor analysis technique is used to filter the control body dimensions from a large number of variables. Decision tree-based data mining is used to cluster the data. The standard and structured sizing system can facilitate pattern grading and garment production. Moreover, it can exceed buying ratios and upgrade size allocations to retail segments.

Keywords: anthropometric data, data mining, decision tree, garments manufacturing, sizing systems, ready-made garments

Procedia PDF Downloads 133
1061 Parameter Estimation for Contact Tracing in Graph-Based Models

Authors: Augustine Okolie, Johannes Müller, Mirjam Kretzchmar

Abstract:

We adopt a maximum-likelihood framework to estimate parameters of a stochastic susceptible-infected-recovered (SIR) model with contact tracing on a rooted random tree. Given the number of detectees per index case, our estimator allows us to determine the degree distribution of the random tree as well as the tracing probability. Since we do not discover all infectees via contact tracing, this estimation is non-trivial. To keep things simple and stable, we develop an approximation suited for realistic situations (contact tracing probability small, or the probability for the detection of index cases small). In this approximation, the only epidemiological parameter entering the estimator is the basic reproduction number R0. The estimator is tested in a simulation study and applied to COVID-19 contact tracing data from India. The simulation study underlines the efficiency of the method. For the empirical COVID-19 data, we are able to compare different degree distributions and perform a sensitivity analysis. We find that a power-law and a negative binomial degree distribution in particular fit the data well and that the tracing probability is rather large. The sensitivity analysis shows no strong dependency on the reproduction number.

Keywords: stochastic SIR model on graph, contact tracing, branching process, parameter inference

Procedia PDF Downloads 77
1060 Language Development and Growing Spanning Trees in Children Semantic Network

Authors: Somayeh Sadat Hashemi Kamangar, Fatemeh Bakouie, Shahriar Gharibzadeh

Abstract:

In this study, we aim to exploit Maximum Spanning Trees (MSTs) of children's semantic networks to investigate their language development. To do so, we examine the graph-theoretic properties of word-embedding networks. The nodes of the networks are words that children normatively learn prior to the age of 30 months, and the links are built from the cosine similarity of the word vectors. These networks are weighted graphs, and the strength of each link is determined by the numerical similarity of the two words (nodes) at the ends of the link. Constructing MSTs offers a way to avoid converting the weighted networks to binary ones by setting a threshold. An MST is a unique sub-graph that connects all the nodes in such a way that the sum of all the link weights is maximized without forming cycles. As the backbone of the semantic networks, MSTs are suitable for examining developmental changes in semantic network topology in children. From these trees, several parameters were calculated to characterize the developmental change in network organization. We showed that MSTs provide an elegant method that is sensitive enough to capture subtle developmental changes in semantic network organization.
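
The construction described in this abstract can be sketched with Kruskal's algorithm run on edges sorted by decreasing weight. This is a minimal sketch under stated assumptions: the toy two-dimensional "embeddings" and the function names are hypothetical, and the tree is only guaranteed to be unique when all pairwise similarities are distinct.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def maximum_spanning_tree(embeddings):
    """Kruskal's algorithm on the complete cosine-similarity graph.

    embeddings: dict mapping each word to its vector.
    Returns a list of (word_a, word_b, weight) tree edges.
    """
    words = list(embeddings)
    edges = sorted(
        ((cosine(embeddings[a], embeddings[b]), a, b)
         for a, b in combinations(words, 2)),
        reverse=True,  # heaviest links first gives a maximum spanning tree
    )
    parent = {w: w for w in words}

    def find(w):
        # Union-find lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    tree = []
    for weight, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # adding this link does not create a cycle
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree


if __name__ == "__main__":
    toy = {"dog": [1.0, 0.0], "cat": [1.0, 0.1], "milk": [0.0, 1.0]}
    for edge in maximum_spanning_tree(toy):
        print(edge)
```

No threshold is needed: every word stays connected, and only the strongest links that avoid cycles are kept.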

Keywords: maximum spanning trees, word-embedding, semantic networks, language development

Procedia PDF Downloads 145
1059 Game Structure and Spatio-Temporal Action Detection in Soccer Using Graphs and 3D Convolutional Networks

Authors: Jérémie Ochin

Abstract:

Soccer analytics are built on two data sources: the frame-by-frame position of each player on the terrain and the sequences of events, such as ball drive, pass, cross, shot, throw-in... With more than 2000 ball-events per soccer game, their precise and exhaustive annotation, based on a monocular video stream such as a TV broadcast, remains a tedious and costly manual task. State-of-the-art methods for spatio-temporal action detection from a monocular video stream, often based on 3D convolutional neural networks, are close to reaching levels of performance in mean Average Precision (mAP) compatible with the automation of such a task. Nevertheless, to meet the expectation of exhaustiveness in the context of data analytics, such methods must be applied in a regime of high recall and low precision, using low confidence score thresholds. This setting unavoidably leads to the detection of false positives that are the product of the well-documented overconfidence behaviour of neural networks and, in this case, their limited access to contextual information and understanding of the game: their predictions are highly unstructured. Based on the assumption that professional soccer players’ behaviour, pose, positions and velocity are highly interrelated and locally driven by the player performing a ball-action, it is hypothesized that adding information regarding surrounding players’ appearance, positions and velocity to the prediction methods can improve their metrics. Several methods are compared to build a proper representation of the game surrounding a player, from handcrafted features of the local graph, based on domain knowledge, to the use of Graph Neural Networks trained in an end-to-end fashion with existing state-of-the-art 3D convolutional neural networks. It is shown that the inclusion of information regarding surrounding players helps reach higher metrics.

Keywords: fine-grained action recognition, human action recognition, convolutional neural networks, graph neural networks, spatio-temporal action recognition

Procedia PDF Downloads 23
1058 High-Fidelity Materials Screening with a Multi-Fidelity Graph Neural Network and Semi-Supervised Learning

Authors: Akeel A. Shah, Tong Zhang

Abstract:

Computational approaches to learning the properties of materials are commonplace, motivated by the need to screen or design materials for a given application, e.g., semiconductors and energy storage. Experimental approaches can be both time consuming and costly. Unfortunately, computational approaches such as ab-initio electronic structure calculations and classical or ab-initio molecular dynamics can themselves be too slow for the rapid evaluation of materials, often involving thousands to hundreds of thousands of candidates. Machine learning assisted approaches have been developed to overcome the time limitations of purely physics-based approaches. These approaches, on the other hand, require large volumes of data for training (hundreds of thousands on many standard data sets such as QM7b). This means that they are limited by how quickly such a large data set of physics-based simulations can be established. At high fidelity, such as configuration interaction, composite methods such as G4, and coupled cluster theory, gathering such a large data set can become infeasible, which can compromise the accuracy of the predictions - many applications require high accuracy, for example band structures and energy levels in semiconductor materials and the energetics of charge transfer in energy storage materials. In order to circumvent this problem, multi-fidelity approaches can be adopted, for example the Δ-ML method, which learns a high-fidelity output from a low-fidelity result such as Hartree-Fock or density functional theory (DFT). The general strategy is to learn a map between the low and high fidelity outputs, so that the high-fidelity output is obtained as a simple sum of the physics-based low-fidelity result and a correction. Although this requires a low-fidelity calculation, it typically requires far fewer high-fidelity results to learn the correction map, and furthermore, the low-fidelity result, such as Hartree-Fock or semi-empirical ZINDO, is typically quick to obtain. For high-fidelity outputs, the result can be a speed-up of an order of magnitude or more. In this work, a new multi-fidelity approach is developed, based on a graph convolutional network (GCN) combined with semi-supervised learning. The GCN allows for the material or molecule to be represented as a graph, which is known to improve accuracy, for example SchNet and MEGNET. The graph incorporates information regarding the numbers, types and properties of atoms; the types of bonds; and bond angles. The key to the accuracy in multi-fidelity methods, however, is the incorporation of low-fidelity output to learn the high-fidelity equivalent, in this case by learning their difference. Semi-supervised learning is employed to allow for different numbers of low and high-fidelity training points, by using an additional GCN-based low-fidelity map to predict high fidelity outputs. It is shown on 4 different data sets that a significant (at least one order of magnitude) increase in accuracy is obtained, using one to two orders of magnitude fewer low and high fidelity training points. One of the data sets is developed in this work, pertaining to 1000 simulations of quinone molecules (up to 24 atoms) at 5 different levels of fidelity, furnishing the energy, dipole moment and HOMO/LUMO.
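
The Δ-ML idea of learning the difference between fidelities can be shown in a few lines. This is a deliberately simplified sketch: it swaps the graph convolutional network of the abstract for a one-variable least-squares line, and the scalar "descriptor", the function names and the numbers are all hypothetical. The point is only the structure of the prediction, which is the low-fidelity value plus a learned correction.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept for scalar x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_delta(descriptors, low_fidelity, high_fidelity):
    """Learn the correction (high - low) as a function of the descriptor."""
    deltas = [h - l for h, l in zip(high_fidelity, low_fidelity)]
    return fit_linear(descriptors, deltas)


def predict_high(descriptor, low_value, model):
    """High-fidelity estimate = cheap low-fidelity value + learned correction."""
    slope, intercept = model
    return low_value + slope * descriptor + intercept


if __name__ == "__main__":
    # Hypothetical training set: few points where both fidelities are known.
    descriptors = [1.0, 2.0, 3.0]
    low = [10.0, 20.0, 30.0]
    high = [10.5, 21.0, 31.5]
    model = fit_delta(descriptors, low, high)
    # For a new candidate only the cheap low-fidelity value is computed.
    print(predict_high(4.0, 40.0, model))
```

Because the correction is usually smoother and smaller than the target itself, it can often be learned from far fewer high-fidelity samples than a direct model would need.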

Keywords: materials screening, computational materials, machine learning, multi-fidelity, graph convolutional network, semi-supervised learning

Procedia PDF Downloads 39
1057 Modelling of Recovery and Application of Low-Grade Thermal Resources in the Mining and Mineral Processing Industry

Authors: S. McLean, J. A. Scott

Abstract:

This research focuses on improving sustainable operation through recovery and reuse of waste heat in process water streams, an area in the mining industry that is often overlooked. There are significant advantages to the application of this topic, including economic and environmental benefits. The smelting process in the mining industry presents an opportunity to recover waste heat and apply it to alternative uses, thereby enhancing the overall process. This applied research has been conducted at the Sudbury Integrated Nickel Operations smelter site, in particular on the water cooling towers. The aim was to determine and optimize methods for appropriate recovery and subsequent upgrading of thermally low-grade heat lost from the water cooling towers in a manner that makes it useful for repurposing in applications, such as within an acid plant. This would be valuable to mining companies as it would be an opportunity to reduce the cost of the process, as well as decrease environmental impact and primary fuel usage. The waste heat from the cooling towers needs to be upgraded before it can be beneficially applied, as lower temperatures reduce the number of potential applications. Temperature and flow rate data were collected from the water cooling towers at an acid plant over two years. The research includes process control strategies and the development of a model capable of determining whether the proposed heat recovery technique is economically viable, as well as assessing any environmental impact from the reduction in net energy consumption by the process. Therefore, comprehensive cost and impact analyses are carried out to determine the best area of application for the recovered waste heat. This method will allow engineers to easily identify the value of thermal resources available to them and determine if a full feasibility study should be carried out. The rapid scoping model developed will be applicable to any site that generates large amounts of waste heat. Results show that heat pumps are an economically viable solution for this application, allowing for reduced cost and CO₂ emissions.

Keywords: environment, heat recovery, mining engineering, sustainability

Procedia PDF Downloads 110
1056 Optimizing Communications Overhead in Heterogeneous Distributed Data Streams

Authors: Rashi Bhalla, Russel Pears, M. Asif Naeem

Abstract:

In this 'Information Explosion Era', analyzing data, 'a critical commodity', and mining knowledge from vertically distributed data streams incur huge communication costs. However, an effort to decrease the communication in the distributed environment has an adverse influence on the classification accuracy; therefore, a research challenge lies in maintaining a balance between transmission cost and accuracy. This paper proposes a method based on Bayesian inference to reduce the communication volume in a heterogeneous distributed environment while retaining prediction accuracy. Our experimental evaluation reveals that a significant reduction in communication can be achieved across a diverse range of dataset types.

Keywords: big data, bayesian inference, distributed data stream mining, heterogeneous-distributed data

Procedia PDF Downloads 161
1055 Topological Language for Classifying Linear Chord Diagrams via Intersection Graphs

Authors: Michela Quadrini

Abstract:

Chord diagrams occur in mathematics, from the study of RNA to knot theory. They are widely used in theory of knots and links for studying the finite type invariants, whereas in molecular biology one important motivation to study chord diagrams is to deal with the problem of RNA structure prediction. An RNA molecule is a linear polymer, referred to as the backbone, that consists of four types of nucleotides. Each nucleotide is represented by a point, whereas each chord of the diagram stands for one interaction for Watson-Crick base pairs between two nonconsecutive nucleotides. A chord diagram is an oriented circle with a set of n pairs of distinct points, considered up to orientation preserving diffeomorphisms of the circle. A linear chord diagram (LCD) is a special kind of graph obtained cutting the oriented circle of a chord diagram. It consists of a line segment, called its backbone, to which are attached a number of chords with distinct endpoints. There is a natural fattening on any linear chord diagram; the backbone lies on the real axis, while all the chords are in the upper half-plane. Each linear chord diagram has a natural genus of its associated surface. To each chord diagram and linear chord diagram, it is possible to associate the intersection graph. It consists of a graph whose vertices correspond to the chords of the diagram, whereas the chord intersections are represented by a connection between the vertices. Such intersection graph carries a lot of information about the diagram. Our goal is to define an LCD equivalence class in terms of identity of intersection graphs, from which many chord diagram invariants depend. For studying these invariants, we introduce a new representation of Linear Chord Diagrams based on a set of appropriate topological operators that permits to model LCD in terms of the relations among chords. Such set is composed of: crossing, nesting, and concatenations. 
The crossing operator is able to generate the whole space of linear chord diagrams, and we define a multiple context-free grammar that uniquely generates each LCD, starting from a linear chord diagram and adding one chord for each production of the grammar. In other words, the grammar associates a unique algebraic term with each linear chord diagram, while the remaining operators allow the term to be rewritten through a set of appropriate rewriting rules. These rules define an LCD equivalence class in terms of the identity of intersection graphs. Starting from a modelled RNA molecule and its linear chord diagram, some authors have proposed a topological classification and folding. Our LCD equivalence class could contribute to the RNA folding problem, leading to the definition of an algorithm that calculates the free energy of the molecule more accurately than existing ones. Such an LCD equivalence class could also be useful for obtaining a more accurate estimate of the link between the crossing number and the topological genus, and for studying the relations among other invariants.
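
The intersection graph described in this abstract can be illustrated with a minimal sketch. It assumes each chord is given as a pair of integer backbone positions, and it labels chords by the order of their left endpoints; both conventions, and the function names, are illustrative assumptions rather than the author's representation.

```python
from itertools import combinations


def chords_cross(a, b):
    """Two chords cross when exactly one endpoint of one lies between the endpoints of the other."""
    (a1, a2), (b1, b2) = sorted(a), sorted(b)
    return a1 < b1 < a2 < b2 or b1 < a1 < b2 < a2


def intersection_graph(chords):
    """Adjacency sets: vertex i stands for chord i, with an edge when two chords cross."""
    adj = {i: set() for i in range(len(chords))}
    for i, j in combinations(range(len(chords)), 2):
        if chords_cross(chords[i], chords[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def edge_set(chords):
    """Edges of the intersection graph after labeling chords by left endpoint (an assumed convention)."""
    ordered = sorted(tuple(sorted(c)) for c in chords)
    adj = intersection_graph(ordered)
    return {(i, j) for i in adj for j in adj[i] if i < j}


# Hypothetical two-chord diagrams on backbone positions 1..4.
crossing = [(1, 3), (2, 4)]
nesting = [(1, 4), (2, 3)]
concatenation = [(1, 2), (3, 4)]
print(edge_set(crossing), edge_set(nesting), edge_set(concatenation))
```

In this toy case the nested and concatenated diagrams have the same (empty) intersection graph, so they would fall into one class under an equivalence defined by identical intersection graphs, while the crossing diagram would not.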

Keywords: chord diagrams, linear chord diagram, equivalence class, topological language

Procedia PDF Downloads 201
1054 A Mixed Integer Programming Model for Optimizing the Layout of an Emergency Department

Authors: Farhood Rismanchian, Seong Hyeon Park, Young Hoon Lee

Abstract:

In recent years, demand for healthcare services has increased dramatically. As the demand for healthcare services increases, so does the necessity of constructing new healthcare buildings and of redesigning and renovating existing ones. Increasing demand necessitates the use of optimization techniques to improve overall service efficiency in healthcare settings. However, the high complexity of care processes remains the major obstacle to accomplishing this goal. This study proposes a method based on process mining results to address the high complexity of care processes and to find the optimal layout of the various medical centers in an emergency department (ED). The ProM framework is used to discover clinical pathway patterns and relationships between activities. The sequence clustering plug-in is used to remove infrequent events and to derive the process model in the form of a Markov chain. The process mining results serve as input for the next phase, which consists of the development of the optimization model. Comparison of the current ED design with the one obtained from the proposed method indicates that a carefully designed layout can significantly decrease the distances that patients must travel.

Keywords: Mixed Integer programming, Facility layout problem, Process Mining, Healthcare Operation Management

Procedia PDF Downloads 339
1053 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands

Authors: Julio Albuja, David Zaldumbide

Abstract:

Data analysis is an important step before making a decision about money. The aim of this work is to analyze the factors that influence the final price of houses using data mining algorithms. To the best of our knowledge, previous work was reviewed only to compare results. Before the data set was used, the Z-transformation was applied to standardize the data to the same range. The data was then classified into two groups so that it could be visualized in a readable format. A decision tree was built, and the data is displayed graphically so that the results and the influence of each factor are easy to see. The definitions of these methods are described, as well as the descriptions of the results. Finally, conclusions and recommendations are presented based on the results of our research, making it easier to apply these algorithms to a customized data set.
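
The Z-transformation mentioned above is commonly computed as z = (x - mean) / standard deviation. The sketch below assumes the population standard deviation and uses made-up land areas; the abstract does not say which variant or which data the authors used.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so there is no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical land areas in square meters (illustrative only).
areas = [100.0, 200.0, 300.0]
standardized = z_transform(areas)
print(standardized)
```

After this step, features measured in different units (area, distance, number of rooms) share a comparable scale before being passed to a classifier such as a decision tree.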

Keywords: algorithms, data, decision tree, transformation

Procedia PDF Downloads 374
1052 Comparison of Unit Hydrograph Models to Simulate Flood Events at the Field Scale

Authors: Imene Skhakhfa, Lahbaci Ouerdachi

Abstract:

To ensure the overall coherence of simulated results, it is necessary to develop a robust validation process. In many applications, it is no longer sufficient to calibrate and validate the model only against the hydrograph measured at the outlet; instead, we try to better simulate the functioning of the watershed in space. Calibration is therefore also performed against other variables, such as water level measurements at intermediate stations or groundwater levels. In this work, we limit ourselves to modeling floods of short duration, for which evapotranspiration is negligible. The main parameters to be identified in the models are related to the unit hydrograph (HU) method. Three different models were tested: SNYDER, CLARK and SCS. These models differ in their mathematical structure and in the parameters to be calibrated, while the hydrological data, namely the initial water content and precipitation, are the same. The models are compared on the basis of their performance in terms of six objective criteria: three global criteria and three criteria representing volume, peak flow, and the mean square error. The first type of criteria gives more weight to strong events, whereas the second considers all events to be of equal weight. The results show that the calibrated parameter values are dependent and also highlight the problems associated with the simulation of low flow events and intermittent precipitation.
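
All three unit hydrograph models share one idea: direct runoff at the outlet is the discrete convolution of excess rainfall with the unit hydrograph ordinates. The sketch below shows only that shared step, with made-up numbers; it does not implement the SNYDER, CLARK or SCS parameterizations, and the function name is an assumption.

```python
def direct_runoff(excess_rain, unit_hydrograph):
    """Discrete convolution of excess rainfall pulses with unit hydrograph ordinates."""
    n = len(excess_rain) + len(unit_hydrograph) - 1
    flow = [0.0] * n
    for i, pulse in enumerate(excess_rain):
        for j, ordinate in enumerate(unit_hydrograph):
            # Each rainfall pulse contributes a scaled, time-shifted copy of the unit hydrograph.
            flow[i + j] += pulse * ordinate
    return flow


# Hypothetical values: two rainfall pulses and a three-step unit hydrograph.
hydrograph = direct_runoff([1.0, 2.0], [0.0, 3.0, 1.0])
print(hydrograph)
```

Calibration then amounts to adjusting the parameters that shape the unit hydrograph until the simulated flow matches the observed one under criteria such as volume, peak flow, or mean square error.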

Keywords: model calibration, intensity, runoff, hydrograph

Procedia PDF Downloads 486
1051 An Insite to the Probabilistic Assessment of Reserves in Conventional Reservoirs

Authors: Sai Sudarshan, Harsh Vyas, Riddhiman Sherlekar

Abstract:

The oil and gas industry has been unwilling to adopt a stochastic definition of reserves. Nevertheless, Monte Carlo simulation methods have gained acceptance among engineers, geoscientists and other professionals who want to evaluate prospects or otherwise analyze problems that involve uncertainty. One of the common applications of Monte Carlo simulation is the estimation of recoverable hydrocarbon from a reservoir. Monte Carlo simulation makes use of random samples of parameters or inputs to explore the behavior of a complex system or process. It finds application whenever one needs to make an estimate, forecast or decision where there is significant uncertainty. First, the project focuses on performing Monte Carlo simulation on a given data set using the U.S. Department of Energy’s MonteCarlo Software, which is a freeware E&P tool. Further, an algorithm for simulation has been developed in MATLAB; the program performs the simulation by prompting the user for input distributions and the parameters associated with each distribution (i.e. mean, st.dev, min., max., most likely, etc.). It also prompts the user for the desired probability for which reserves are to be calculated. The algorithm developed and tested in MATLAB is then implemented in Python, where existing libraries for statistics and graph plotting have been imported to generate better output. With PyQt designer, code for a simple graphical user interface has also been written. The plotted graph is then validated against already available results from the U.S. DOE MonteCarlo Software.
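
A minimal sketch of the kind of simulation described above, under stated assumptions: it uses the standard volumetric oil-in-place formula (7758 barrels per acre-foot) with a recovery factor, and every input distribution and range is hypothetical. It is not the authors' MATLAB or Python program, nor the DOE tool.

```python
import random


def simulate_reserves(n_trials, seed=0):
    """Sample recoverable oil (barrels) from assumed input distributions."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        area = rng.triangular(800.0, 1200.0, 1000.0)   # acres (min, max, most likely)
        thickness = rng.triangular(20.0, 60.0, 40.0)   # feet
        porosity = rng.uniform(0.15, 0.25)             # fraction
        water_sat = rng.uniform(0.20, 0.40)            # fraction
        recovery = rng.triangular(0.20, 0.40, 0.30)    # fraction
        fvf = 1.2                                      # formation volume factor, held fixed
        oil = 7758.0 * area * thickness * porosity * (1.0 - water_sat) * recovery / fvf
        results.append(oil)
    return results


def exceeded_with_probability(samples, p):
    """Value met or exceeded by roughly a fraction p of the samples (P90 uses p=0.9)."""
    ordered = sorted(samples)
    index = round((1.0 - p) * (len(ordered) - 1))
    return ordered[index]


reserves = simulate_reserves(5000, seed=1)
p90 = exceeded_with_probability(reserves, 0.9)
p50 = exceeded_with_probability(reserves, 0.5)
p10 = exceeded_with_probability(reserves, 0.1)
print(f"P90={p90:.3e}  P50={p50:.3e}  P10={p10:.3e} barrels")
```

The P90 figure is the conservative estimate (90% of trials meet or exceed it), which is why it is the smallest of the three.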

Keywords: simulation, probability, confidence interval, sensitivity analysis

Procedia PDF Downloads 382
1050 Graph-Based Semantical Extractive Text Analysis

Authors: Mina Samizadeh

Abstract:

In the past few decades, there has been an explosion in the amount of available data produced from various sources on different topics. The availability of this enormous amount of data requires us to adopt effective computational tools to explore it. This has led to intense and growing interest in the research community in developing computational methods for processing text data. One line of study focuses on condensing text so that we are able to get a higher level of understanding in a shorter time. The two important tasks for doing this are keyword extraction and text summarization. In keyword extraction, we are interested in finding the most important words in a text, which familiarizes us with its general topic. In text summarization, we are interested in producing a short text that includes the important information in the document. The TextRank algorithm, an unsupervised learning method that extends PageRank (the base algorithm of the Google search engine for searching and ranking pages), has shown its efficacy in large-scale text mining, especially for text summarization and keyword extraction. This algorithm can automatically extract the important parts of a text (keywords or sentences) and return them as a result. However, it neglects the semantic similarity between the different parts. In this work, we improve the results of the TextRank algorithm by incorporating the semantic similarity between parts of the text. Aside from keyword extraction and text summarization, we develop a topic clustering algorithm based on our framework, which can be used on its own or as part of generating the summary to overcome coverage problems.
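
As a rough illustration of the baseline, the sketch below runs a PageRank-style iteration over a word co-occurrence graph, which is the core of plain TextRank keyword extraction. The window size, damping factor and sample sentence are assumptions, and the semantic-similarity weighting proposed in the paper is not included.

```python
def textrank_keywords(words, window=2, damping=0.85, iterations=50):
    """Rank words by PageRank on an undirected co-occurrence graph."""
    neighbors = {w: set() for w in words}
    for i, word in enumerate(words):
        # Link each word to the words that follow it within the window.
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        updated = {}
        for w in neighbors:
            incoming = sum(scores[u] / len(neighbors[u]) for u in neighbors[w])
            updated[w] = (1.0 - damping) + damping * incoming
        scores = updated
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical token list; a real pipeline would first tokenize and filter stop words.
tokens = "graph mining uses graph ranking and graph clustering".split()
ranking = textrank_keywords(tokens)
print(ranking[:3])
```

A semantic variant along the lines of the abstract would replace the unweighted edges with similarity-weighted ones, for example using embedding similarity between the linked units.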

Keywords: keyword extraction, n-gram extraction, text summarization, topic clustering, semantic analysis

Procedia PDF Downloads 70
1049 Lead Removal From Ex- Mining Pond Water by Electrocoagulation: Kinetics, Isotherm, and Dynamic Studies

Authors: Kalu Uka Orji, Nasiman Sapari, Khamaruzaman W. Yusof

Abstract:

Exposure of galena (PbS), teallite (PbSnS2), and other associated minerals during mining activities releases lead (Pb) and other heavy metals into the mining water through oxidation and dissolution. Heavy metal pollution has become an environmental challenge. Lead, for instance, can cause toxic effects to human health, including brain damage. Ex-mining pond water was reported to contain lead as high as 69.46 mg/L. Conventional treatment does not easily remove lead from water. A promising and emerging treatment technology for lead removal is the electrocoagulation (EC) process. However, some of the problems associated with EC are systematic reactor design, selection of maximum EC operating parameters, and scale-up, among others. This study investigated an EC process for the removal of lead from synthetic ex-mining pond water using a batch reactor and Fe electrodes. The effects of various operating parameters on lead removal efficiency were examined. The results indicated that the maximum removal efficiency of 98.6% was achieved at an initial pH of 9, a current density of 15 mA/cm2, an electrode spacing of 0.3 cm, a treatment time of 60 minutes, Liquid Motion of Magnetic Stirring (LM-MS), and the BP-S electrode arrangement. The experimental data were further modeled and optimized using a 2-Level 4-Factor Full Factorial design, a Response Surface Methodology (RSM). The four factors optimized were the current density, electrode spacing, electrode arrangement, and Liquid Motion Driving Mode (LM). Based on the regression model and the analysis of variance (ANOVA) at 0.01%, the results showed that an increase in current density and LM-MS increased the removal efficiency, while the reverse was the case for electrode spacing. The model predicted an optimal lead removal efficiency of 99.962% with an electrode spacing of 0.38 cm, alongside the other parameters. Applying the predicted parameters, a lead removal efficiency of 100% was achieved.
The electrode and energy consumptions were 0.192 kg/m3 and 2.56 kWh/m3, respectively. Meanwhile, the adsorption kinetic studies indicated that the overall lead adsorption system follows the pseudo-second-order kinetic model. The adsorption dynamics were also random, spontaneous, and endothermic. A higher process temperature enhances adsorption capacity. Furthermore, the adsorption isotherm fitted the Freundlich model better than the Langmuir model, which describes adsorption on a heterogeneous surface, and showed good adsorption efficiency by the Fe electrodes. Adsorption of Pb2+ onto the Fe electrodes was a complex reaction involving more than one mechanism. The overall results showed that EC is an efficient technique for lead removal from synthetic mining pond water. The findings of this study would have application in the scale-up of EC reactors and in the design of water treatment plants that use electrocoagulation for feed-water sources containing lead.
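
For reference, the models named above are usually written in the following textbook forms. The symbols are the conventional ones (q_t and q_e for the amount adsorbed at time t and at equilibrium, C_e for the equilibrium concentration, k_2, K_F, n, K_L and q_max for fitted constants) and are assumed here, since the abstract does not give the authors' notation.

```latex
% Pseudo-second-order kinetics and its linearized form
\frac{dq_t}{dt} = k_2\,(q_e - q_t)^2
\qquad\Longrightarrow\qquad
\frac{t}{q_t} = \frac{1}{k_2\,q_e^{2}} + \frac{t}{q_e}

% Freundlich isotherm (heterogeneous surface) and its linearized form
q_e = K_F\,C_e^{1/n}
\qquad\Longrightarrow\qquad
\ln q_e = \ln K_F + \frac{1}{n}\,\ln C_e

% Langmuir isotherm (monolayer on a homogeneous surface)
q_e = \frac{q_{\max}\,K_L\,C_e}{1 + K_L\,C_e}
```

Model selection of the kind reported here is typically made by comparing the goodness of fit of the linearized forms to the measured data.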

Keywords: ex-mining water, electrocoagulation, lead, adsorption kinetics

Procedia PDF Downloads 148
1048 Sustainable Mining Fulfilling Constitutional Responsibilities: A Case Study of NMDC Limited Bacheli in India

Authors: Bagam Venkateswarlu

Abstract:

NMDC Limited, an Indian multinational mining company, operates under the administrative control of the Ministry of Steel, Government of India. This study evaluates how the sustainable mining practiced by the company fulfils the provisions of the Indian Constitution to secure for its citizens justice and equality of status and opportunity, and to promote social, economic, political, and religious wellbeing. The Constitution of India lays down a road map for how the goal of being a “Welfare State” shall be achieved. The vision of sustainable mining being practiced is oriented along the constitutional responsibilities of Indian citizens and the corporate world. This qualitative study is backed by quantitative studies of National Mineral Development Corporation's performance in various domains of sustainable mining and ESG, that is, environmental, social and governance parameters. For example, the Five Star Rating of mines, a comprehensive evaluation system introduced by the Ministry of Mines, Govt. of India, is one of the methodologies. Corporate Social Responsibility is one of the thrust areas for securing social well-being. Green energy initiatives in and around the mines have earned NMDC Limited the title of “Eco-Friendly Miner”. While operating a fully mechanized large-scale iron ore mine (18.8 million tonnes per annum capacity) in Bacheli, Chhattisgarh, M/s NMDC Limited caters to the mineral security needs of the State of Chhattisgarh and the Indian Union. It preserves the forest, wildlife, and environmental heritage of the richly endowed State of Chhattisgarh. In the remote and far-flung interiors of Chhattisgarh, NMDC empowers the local population by providing world-class educational and medical facilities, a transportation network, drinking water facilities, irrigation and agricultural support, and employment opportunities, and by establishing religious harmony. All this ultimately results in an empowered, educated population with improved awareness.
Thus, the basic tenets of the Constitution of India (secularism, democracy, welfare for all, socialism, humanism, decentralization, liberalism, mixed economy, and non-violence) are fulfilled. The Constitution declares India a welfare state: for the people, of the people and by the people. The sustainable mining practices of NMDC are in line with this objective. Thus, the purpose of the study is fully met. The potential benefits of the study include replicating this model in existing or new establishments in various parts of the country, especially in the under-privileged interiors and far-flung areas that are yet to see the light of development.

Keywords: ESG values, Indian constitution, NMDC limited, sustainable mining, CSR, green energy

Procedia PDF Downloads 75
1047 Application of Data Mining for Aquifer Environmental Assessment

Authors: Saman Javadi, Mehdi Hashemy, Mohahammad Mahmoodi

Abstract:

Vulnerability maps are an important tool for managing the entry of pollution into aquifers. The common way to produce a vulnerability map is the DRASTIC method. However, the method is not easy to apply to every aquifer, because appropriate constant values of weights and ranks must be chosen. In this study, a new approach using k-means clustering is applied to produce vulnerability maps. Four features, namely depth to groundwater, hydraulic conductivity, recharge value and vadose zone, were considered simultaneously as clustering features. Five regions were identified in the case study, representing zones with different levels of vulnerability. The results show that clustering provides a realistic vulnerability map: the Pearson correlation coefficient between nitrate concentrations and the clustering-based vulnerability is 61%.
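
The clustering step can be sketched with a small k-means implementation over four-feature cells. This is an illustration under assumptions: the feature values are made up and already scaled to a common range, the centroids start at the first k cells for reproducibility, and only two clusters are used, whereas the study identified five regions.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic initialization from the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical grid cells: (depth to groundwater, hydraulic conductivity,
# recharge, vadose zone rating), each scaled to the range 0..1.
cells = [
    (0.1, 0.2, 0.1, 0.2),
    (0.9, 0.8, 0.9, 0.8),
    (0.2, 0.1, 0.2, 0.1),
    (0.8, 0.9, 0.8, 0.9),
]
labels, centroids = kmeans(cells, k=2)
print(labels)
```

Each cluster label can then be mapped back to its grid cell to draw the vulnerability zones, and compared with measured nitrate concentrations as a check.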

Keywords: clustering, data mining, groundwater, vulnerability assessment

Procedia PDF Downloads 603
1046 Attributes That Influence Respondents When Choosing a Mate in Internet Dating Sites: An Innovative Matching Algorithm

Authors: Moti Zwilling, Srečko Natek

Abstract:

This paper presents an innovative predictive analytics approach for finding the best match between two consumers who strive to find a partner on internet sites. The methodology is based on an analysis of consumer preferences and involves data mining and machine learning search techniques. The study is composed of two parts. The first part uses descriptive statistics to examine the correlations between a set of parameters collected from men and women who intend to meet each other through social media, usually the internet. In this part, several hypotheses were examined and statistical analyses were carried out. Results show that there is a strong correlation between the affiliated attributes of men and women with regard to how they present themselves on social media such as "Facebook". One interesting finding is the strong desire of most respondents to develop a serious relationship. In the second part, the authors used common data mining algorithms to search for and classify the most important and effective attributes that affect the response rate of the other side. Results show that personal presentation and educational background are the most effective attributes for achieving a positive attitude toward one's profile from the potential mate.

Keywords: dating sites, social networks, machine learning, decision trees, data mining

Procedia PDF Downloads 293
1045 Utilization of Process Mapping Tool to Enhance Production Drilling in Underground Metal Mining Operations

Authors: Sidharth Talan, Sanjay Kumar Sharma, Eoin Joseph Wallace, Nikita Agrawal

Abstract:

Underground mining is at the core of rapidly evolving metals and minerals sector due to the increasing mineral consumption globally. Even though the surface mines are still more abundant on earth, the scales of industry are slowly tipping towards underground mining due to rising depth and complexities of orebodies. Thus, the efficient and productive functioning of underground operations depends significantly on the synchronized performance of key elements such as operating site, mining equipment, manpower and mine services. Production drilling is the process of conducting long hole drilling for the purpose of charging and blasting these holes for the production of ore in underground metal mines. Thus, production drilling is the crucial segment in the underground metal mining value chain. This paper presents the process mapping tool to evaluate the production drilling process in the underground metal mining operation by dividing the given process into three segments namely Input, Process and Output. The three segments are further segregated into factors and sub-factors. As per the study, the major input factors crucial for the efficient functioning of production drilling process are power, drilling water, geotechnical support of the drilling site, skilled drilling operators, services installation crew, oils and drill accessories for drilling machine, survey markings at drill site, proper housekeeping, regular maintenance of drill machine, suitable transportation for reaching the drilling site and finally proper ventilation. The major outputs for the production drilling process are ore, waste as a result of dilution, timely reporting and investigation of unsafe practices, optimized process time and finally well fragmented blasted material within specifications set by the mining company. 
The paper also exhibits the drilling loss matrix, which is utilized to appraise the loss in planned production meters per day in a mine on account of availability loss in the machine due to breakdowns, underutilization of the machine and productivity loss in the machine measured in drilling meters per unit of percussion hour with respect to its planned productivity for the day. The given three losses would be essential to detect the bottlenecks in the process map of production drilling operation so as to instigate the action plan to suppress or prevent the causes leading to the operational performance deficiency. The given tool is beneficial to mine management to focus on the critical factors negatively impacting the production drilling operation and design necessary operational and maintenance strategies to mitigate them. 

Keywords: process map, drilling loss matrix, SIPOC, productivity, percussion rate

Procedia PDF Downloads 215
1044 Investigation of the Heavy Metal Pollution of the River Ecosystems in the Lake Sevan Basin, Armenia

Authors: G. Gevorgyan, S. Khudaverdyan, A. Vaseashta

Abstract:

The Lake Sevan basin is situated in the eastern part of the Republic of Armenia (Gegharquniq marz/district). The heavy metal pollution of some tributaries of Lake Sevan was investigated. Water sampling was performed in August and December 2014 at four observation sites: 1) Sotq river upstream (about 600 meters upstream from the Sotq gold mine); 2) Sotq river mouth; 3) Masrik river mouth; 4) Dzknaget river mouth. Heavy metal (V, Fe, Ni, Cu, As, Mo, Pb) concentrations in the water samples were determined by standard methods using an atomic absorption spectrophotometer. The results showed that heavy metal content mainly increased from the upstream of the Sotq river to the mouth of the Masrik river, which may have been caused by gold mining activity, as the Masrik river and its tributary, the Sotq river, pass through the gold mining area and were exposed to heavy metal pollution. The observation sites can be ranked by pollution degree as follows: №3 > №2 > №1 > №4. The highest degree of heavy metal pollution was observed in the Masrik river mouth, which may have been caused by the direct impact of gold mining activity and the pressure of its tributary, the Sotq river, which flows through the gold mining area. The lowest degree of heavy metal pollution was registered in the Dzknaget river mouth; this river flows through rural areas and was not subject to significant heavy metal pollution. At the observation sites of the Sotq and Masrik rivers, a high positive correlation was mainly observed between the concentrations of the investigated heavy metals (except nickel), which indicated that all the heavy metals except nickel had the same anthropogenic pollution source, namely the activity of the Sotq gold mine. In general, it is possible to state that the activity of the Sotq gold mine in the Lake Sevan basin caused the heavy metal pollution of the Sotq and Masrik rivers, which may have posed environmental hazards.
Heavy metals are nondegradable substances, and heavy metal pollution of freshwater systems may pose risks to the environment and human health through accumulation in the tissues of aquatic organisms, water-food chain as well as oral ingestion and dermal contact.

Keywords: Armenia, Lake Sevan basin, gold mining activity, river ecosystems, heavy metal pollution

Procedia PDF Downloads 584
1043 Domain specific Ontology-Based Knowledge Extraction Using R-GNN and Large Language Models

Authors: Andrey Khalov

Abstract:

The rapid proliferation of unstructured data in IT infrastructure management demands innovative approaches for extracting actionable knowledge. This paper presents a framework for ontology-based knowledge extraction that combines relational graph neural networks (R-GNN) with large language models (LLMs). The proposed method leverages the DOLCE framework as the foundational ontology, extending it with concepts from ITSMO for domain-specific applications in IT service management and outsourcing. A key component of this research is the use of transformer-based models, such as DeBERTa-v3-large, for automatic entity and relationship extraction from unstructured texts. Furthermore, the paper explores how transfer learning techniques can be applied to fine-tune large language models (LLaMA) to generate synthetic datasets that improve precision in BERT-based entity recognition and ontology alignment. The resulting IT Ontology (ITO) serves as a comprehensive knowledge base that integrates domain-specific insights from ITIL processes, enabling more efficient decision-making. Experimental results demonstrate significant improvements in knowledge extraction and relationship mapping, offering a cutting-edge solution for enhancing cognitive computing in IT service environments.

Keywords: ontology mapping, R-GNN, knowledge extraction, large language models, NER, knowledge graph

Procedia PDF Downloads 16